Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comparing raster::focal pad options to pfocal #26

Open
see24 opened this issue Oct 29, 2021 · 10 comments
Open

comparing raster::focal pad options to pfocal #26

see24 opened this issue Oct 29, 2021 · 10 comments
Labels
bug Something isn't working enhancement New feature or request help wanted Extra attention is needed

Comments

@see24
Copy link
Member

see24 commented Oct 29, 2021

There is one option in raster::focal that I can't replicate in pfocal where pad = FALSE and na.rm = TRUE. This option gives NA for cells near the edge (ie there is no padding) but allows NAs in the raster to be ignored.

Here are some plots showing the implications of the various options

library(raster)
library(pfocal)
library(tmap) # better default NA palette

mat <- matrix(c(rep(1,5000), rep(2, 5000)), nrow = 100, ncol = 100)
exps <- raster(mat, xmn = 0, xmx = 100, ymn = 0, ymx = 100)

# add an NA in the middle
exps[25,25] <- NA 

# raster options for dealing with edge effects #=============
rast_padF_narmF <- focal(exps, gaussian_kernel_radius(3), 
                   pad = FALSE, na.rm = FALSE)
qtm(rast_padF_narmF)

# this is the one I can't replicate with pfocal ***
rast_padF_narmT <- focal(exps, gaussian_kernel_radius(3), 
                         pad = FALSE, na.rm = TRUE)
qtm(rast_padF_narmT)
# the area around the NA value gets a lower value because the default function
# is sum
qtm(rast_padF_narmT < 0.999)

# this is the same as rast_padF_narmF
rast_padT_narmF <- focal(exps, gaussian_kernel_radius(3), 
                   pad = TRUE, na.rm = F, padValue = NA )

qtm(rast_padT_narmF)

# this has a larger area that is not NA
rast_padT_narmT <- focal(exps, gaussian_kernel_radius(3), 
                         pad = TRUE, na.rm = T, padValue = NA )

qtm(rast_padT_narmT)

# pfocal versions #======================

pfoc_evNA_narmNA <- pfocal(exps, gaussian_kernel_radius(3),
                           edge_value = NA, na.rm = NA)
qtm(pfoc_evNA_narmNA)

# seems to be the same as na.rm = NA for this case
pfoc_evNA_narmF <- pfocal(exps, gaussian_kernel_radius(3),
                           edge_value = NA, na.rm = FALSE)
qtm(pfoc_evNA_narmF)

# same as rast_padT_narmT
pfoc_evNA_narmT <- pfocal(exps, gaussian_kernel_radius(3),
                           edge_value = NA, na.rm = TRUE)
qtm(pfoc_evNA_narmT)
@VLucet
Copy link
Contributor

VLucet commented Oct 29, 2021

Yup, I see the issue. Thanks so much for the thorough reprex. I also agree with the fact that ideally pfocal should be able to cover the exact same cases than raster::focal.
It looks like this would require to dabble into the C code. Quick question: in which applications would we like a behavior such as only the NAs inside the rasters would be ignored?

@VLucet VLucet added bug Something isn't working enhancement New feature or request labels Oct 29, 2021
@see24
Copy link
Member Author

see24 commented Oct 30, 2021

It is the default in caribouMetrics when we use focal, but in that case I believe all the inputs are forced to be not NA any ways so it shouldn't matter. I think there are some cases where for example you might have photo interpreted data and the is an NA pixel because of cloud or something you probably wouldn't want the whole window to end up as NA but you might not want to include values outside the raster data

@VLucet
Copy link
Contributor

VLucet commented Dec 10, 2021

Coming back on this, it looks like we'd need a way for the C code to differentiate the edge values from the non-edge values. Unfortunately I do not have a good enough grasp of the C code to make that change.

@VLucet VLucet added the help wanted Extra attention is needed label Dec 10, 2021
@see24
Copy link
Member Author

see24 commented Dec 10, 2021

I think the solution would be to set all cells that are NA inside the raster to 0 before passing the raster to the C code. you can do that using raster::reclassify(rast, cbind(NA, 0))
For the caribouMetrics case I believe I already do that outside pfocal so it is fine but if you want to match the raster::focal option that is how I would do it.
But I suppose imputing NA values is it's own separate thing and it is probably best to make people sort that out explicitly on their own

@VLucet
Copy link
Contributor

VLucet commented Dec 10, 2021

I dont see how setting the cells to 0 would succeed. It would be counted as a cell and would not be ignored, which is the behavior we are looking for. For functions like mean, it wont match the behavior of raster focal.

@see24
Copy link
Member Author

see24 commented Dec 13, 2021

Hmm yes you are right. The challenge is with raster::focal it is way faster to use sum and then set the weights matrix to add to one. So in that context ignoring NAs doesn't take them out of the denominator. So that was what I was thinking of but you are right.

@xlirate
Copy link
Contributor

xlirate commented Jul 6, 2022

@see24 Are you still interested in having this feature? I am taking a crack at making the C++ in pfocal easier to work with, and will have the chance to make adjustments while I am in here. I am the programmer that wrote it in the first place, but I am not really a researcher in the field, so if you have time to educate me on your use-case I can see about how to make it work.

@see24
Copy link
Member Author

see24 commented Jul 6, 2022

@xlirate I think it would be somewhat helpful but only in fairly marginal cases so I would check with Josie when she is back to see if she thinks it is worth the effort. It depends how much we value being able to match the behaviour of raster::focal. I don't have a particular biological use case in mind only just that sometimes values that are NA within the raster should be treated differently than values that are outside the raster and raster::focal seems to make that possible.

If you do look into adding this I would also check out the behaviour of terra::focal which is an updated version of the raster package. It has an na.policy argument as well as the na.rm argument. I am working with a different researcher this week and next so I don't have time to look into it too much but if you do I think it could be helpful.

Also note this warning in the raster::focal documentation for when setting na.rm is recommended:
na.rm: logical. If TRUE, NA will be removed from focal computations. The result will only be NA if all focal cells are NA. Except for some special cases (weights of 1, functions like min, max, mean), using na.rm=TRUE may not be a good idea in this function because it can unbalance the effect of the weights

@xlirate
Copy link
Contributor

xlirate commented Jul 6, 2022

it can unbalance the effect of the weights

We have something to offset this issue implemented using mean_divider.

Using "KERNEL_SUM" will divide the value at every point by the sum of the kernel's weights, calculated globally.

Using "DYNAMIC_SUM" will divide the value at a point by the sum of the kernel values used at that point, calculated based only on the kernel values used at that point, specifically excluding any that are omitted due to NAs

All of the dynamic divisors will be slower because they recalculate at every point instead of calculating once at the start, but they happen in the C++ and in parallel so they are still worth considering.

@see24
Copy link
Member Author

see24 commented Jul 6, 2022

Ok cool! I think that would be great then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants