Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consider exactextractr::exact_extract for raster extraction #104

Open
brownag opened this issue Feb 15, 2020 · 1 comment
Open

consider exactextractr::exact_extract for raster extraction #104

brownag opened this issue Feb 15, 2020 · 1 comment

Comments

@brownag
Copy link
Member

brownag commented Feb 15, 2020

This is a note to consider use of exactextractr package for faster raster extraction. I know we are expecting terra to be available in the near future, not sure how the performance of these will compare. My preliminary tests show extraction with terra operating at similar speeds to raster, maybe a little faster.

It seems exactextractr::exact_extract is orders of magnitude faster than raster::extract. It is available on CRAN and relies on GEOS.

Here is a small example showing processing of a PRISM raster. Though, I think the performance enhancement really shows when using more detailed rasters.

library(raster)
library(exactextractr)
library(tigris)
library(terra)

# extraction of all 800m cells from a county
ca <- tigris::counties(state="ca", class="sf")
calaveras <- ca[ca$COUNTYFP == "009",]

ra1 <- raster::raster("/data/geodata/MUSum_PRISM/final_MAP_mm_800m.tif")

system.time(r1 <- raster::extract(ra1, calaveras))

system.time(t1 <- terra::extract(ra1, calaveras))

system.time(x1 <- exactextractr::exact_extract(ra1, calaveras))

system.time(r1 <- raster::extract(ra1, calaveras))
user system elapsed
3.493 0.173 3.792
system.time(t1 <- terra::extract(ra1, calaveras))
user system elapsed
2.281 0.251 2.606
system.time(x1 <- exactextractr::exact_extract(ra1, calaveras))
user system elapsed
0.586 0.000 0.734
Warning message:
In .local(x, y, ...) : Polygons transformed from EPSG:4269 to EPSG:4269

Extraction of a small (10m) buffer around several point locations:

# extraction at/aruond points (investigation)
library(aqp)
library(soilDB)
library(sf)

data("loafercreek")
coordinates(loafercreek) <- ~ x_std + y_std
proj4string(loafercreek) <- "+proj=longlat +datum=WGS84"
ra2 <- raster("/data/geodata/ssro2_ann_beam_rad_int.tif")
loaf.spdf <- spTransform(as(loafercreek, 'SpatialPointsDataFrame'), 
                         CRS(proj4string(ra2)))

# create 10m buffer around points, no point extraction method in exact_extract
losp10 <- sf::st_buffer(sf::st_as_sf(loaf.spdf), dist=10)

system.time(r2 <- raster::extract(ra2, loaf.spdf))
system.time(t2 <- terra::extract(ra2, loaf.spdf))

# these take a _long_ time... n=106 tiny 20m diameter polygons
# system.time(r2 <- raster::extract(ra2, loaf.spdf, buffer=10))
# system.time(t2 <- terra::extract(ra2, loaf.spdf, buffer=10))

# in comparable time to simple point extraction, extract 10m radius 
system.time(x2 <- exactextractr::exact_extract(ra2, losp10, progress = FALSE))
x2[1:3]

# auto-weighted mean by contributing proportion ~ "point" values
system.time(x3 <- exactextractr::exact_extract(ra2, losp10, fun="mean",
                                               progress = FALSE))
x3[1:3]

Ordinary point extraction faster with raster/terra

system.time(r2 <- raster::extract(ra2, loaf.spdf))
user system elapsed
0.891 0.015 0.954
system.time(t2 <- terra::extract(ra2, loaf.spdf))
user system elapsed
0.833 0.023 0.879

But, you can create tiny polygons buffered around points and extract then nearly as fast with exactextractr. Calculating the weighted mean composition is an easy extension

in comparable time to simple point extraction, extract 10m radius:

system.time(x2 <- exactextractr::exact_extract(ra2, losp10, progress = FALSE))
user system elapsed
1.037 0.015 1.098
x2[1:3]
[[1]]
value coverage_fraction
1 60041 0.2454129
2 60393 0.1034935

[[2]]
value coverage_fraction
1 64985 0.27749354
2 65777 0.07141285

[[3]]
value coverage_fraction
1 64985 0.28094107
2 65777 0.06796531

Auto-weighted mean by contributing proportion ~ "point" values

system.time(x3 <- exactextractr::exact_extract(ra2, losp10, fun="mean",

  •                                            progress = FALSE))
    
    user system elapsed
    0.982 0.007 1.036

x3[1:3]
[1] 60145.41 65147.10 65139.28

@dylanbeaudette
Copy link
Member

I like it. Anything that makes the report faster is a good thing. I would like to do some profiling on the sampling code to see if that or raster::extract is the bottle-neck. I'm not sure, but perhaps we can use this kind of sampling to assist with reasonable adjustments to an effective sampling size (e.g. #26).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants