Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

velox slower than raster for extracting points from .adf rasters? #22

Closed
pbaylis opened this issue Feb 13, 2018 · 1 comment
Closed

Comments

@pbaylis
Copy link

pbaylis commented Feb 13, 2018

Hello,

I was trying to use velox to extract points from a wildfire raster and noticed that velox was much slower than raster. I've reproduced this issue on both Linux Mint and OS X. Apologies for the long MWE:

library(raster)
library(velox)
library(sf)

# https://www.fs.usda.gov/rds/archive/Product/RDS-2015-0046
rast <- raster("w001001.adf")
vx <- velox("w001001.adf")

n <- 100
points.df <- data.frame(x=runif(n, extent(rast)@xmin, extent(rast)@xmax),
                        y=runif(n, extent(rast)@ymin, extent(rast)@ymax))
points.sf <- st_as_sf(points.df, coords = c("x", "y"), crs = proj4string(rast), agr = "constant")
points.sf$test <- 1
points.sp <- as(points.sf, "Spatial")

system.time(extract(rast, points.sp)) # 0.3s
system.time(vx$extract_points(points.sf)) # 145s
system.time(vx$extract_points(points.sp)) # 154s

# Velox is much faster if raster is resaved as .tif
writeRaster(rast, "w001001_fromRaster.tif")
vx3 <- velox("w001001_fromRaster.tif")
rast3 <- raster("w001001_fromRaster.tif")
system.time(vx3$extract_points(points.sf)) # ~0s
system.time(vx3$extract_points(points.sp)) # ~0s
system.time(extract(rast3, points.sp)) # 0.3s

As you can see at the bottom, my resolution at the moment is to resave as GeoTiff and reload, at which point velox is once again much faster on OSX and at least comparable to raster on Linux Mint. But I admit I'm a bit confused. Both systems have plenty of memory.

@hunzikp
Copy link
Owner

hunzikp commented Feb 13, 2018

Hi there,

This is a tricky one, thanks for posting! The problem seems to arise if the raster data are stored as integers, rather than doubles. If you simply change the storage mode to double precision before extracting, the issue goes away:

library(raster)
library(velox)
library(sf)

## Get data
vx <- velox("w001001.adf")
rast <- raster("w001001.adf")

## Generate points
n <- 100
set.seed(0)
points.df <- data.frame(x=runif(n, extent(rast)@xmin, extent(rast)@xmax),
                        y=runif(n, extent(rast)@ymin, extent(rast)@ymax))
points.sf <- st_as_sf(points.df, coords = c("x", "y"), crs = proj4string(rast), agr = "constant")
points.sf$test <- 1

## Extract points
system.time(res1 <- vx$extract_points(points.sf))  # painfully slow

## Store raster data as double precision first
vx2 <- vx$copy()
storage.mode(vx2$rasterbands[[1]]) <- "double"
system.time(res2 <- vx2$extract_points(points.sf))  # fast

I'm not exactly sure why storing the data as integers causes such a slow-down; the way velox handles data types at the moment, the integer matrix should be converted to a double precision matrix by Rcpp once it's passed to C++, but evidently this isn't happening. I'll try to put together a fix for this particular issue later today and push it to GitHub. On a more general note, the way velox currently handles data types is unsatisfactory (integers are cast as doubles), and should probably be addressed more fundamentally in the long term.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants