Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

terra::rasterize() returns error about file already existing when trying to run on very large template raster #1072

Closed
mikoontz-vp opened this issue Mar 22, 2023 · 3 comments

Comments

@mikoontz-vp
Copy link

I'm still trying to come up with a minimum reproducible example here, but I'm noticing that terra::rasterize() returns an error about a file already existing (and suggests that I can try overwrite = TRUE), but I'm including a filename= argument for a file that definitely doesn't already exist, and am even including an overwrite = TRUE argument specification.

I'm trying to count overlapping polygons over a large extent at 30 meter resolution. I believe I'm getting to this part of the code:

terra/R/rasterize.R

Lines 275 to 282 in 3ec9212

if (fun == "sum") {
xopt = spatOptions()
y@pnt <- y@pnt$rasterize(x@pnt, field, values, background, touches[1], fun, FALSE, update[1], TRUE, xopt)
messages(y, "rasterize")
yy <- rast(y)
yy@pnt <- y@pnt$rasterize(x@pnt, "", values, NA, touches[1], "" , FALSE, update[1], TRUE, xopt)
messages(yy, "rasterize")
return(mask(y, yy, updatevalue=background, filename=filename, overwrite=overwrite, wopt=wopt))
. I'll keep trying to come up with a minimal example that doesn't require the full 30 m LANDFIRE raster of the US, but that's the use case.

@mikoontz-vp
Copy link
Author

For a very much not small (but perhaps still minimal?) working example, I was using the Landfire biophysical settings layer for CONUS, which is the first download link here: https://landfire.gov/version_download.php

I downloaded that file and called it bps using bps <- terra::rast('path/to/file.tif').

And a vector file here:

poly <- 
structure(list(cover = 1, geom = structure(list(structure(list(
    structure(c(-1452231.77129677, -1452222.69288167, -1452240.60254609, 
    -1452270.35783937, -1452431.4231203, -1452607.35777952, -1452659.21510414, 
    -1452762.18122499, -1452829.02984966, -1452847.25336439, 
    -1453057.82626641, -1453188.40774648, -1453269.02433191, 
    -1453299.76010571, -1453250.90020481, -1453383.5984184, -1453850.7850045, 
    -1454325.15129785, -1454937.70679109, -1455068.47470585, 
    -1455069.45910177, -1454836.21027498, -1454555.45671046, 
    -1454243.11075379, -1453868.51362787, -1453497.84995708, 
    -1453133.21154406, -1452935.74625203, -1452734.03590986, 
    -1452588.24683751, -1452592.59874916, -1452416.73195894, 
    -1452384.74877961, -1452404.49022335, -1452383.66868321, 
    -1452352.27579461, -1452231.77129677, 3016893.96397174, 3016874.65482534, 
    3016791.51837831, 3016804.38301741, 3016874.54797905, 3017041.40129502, 
    3017107.98013304, 3017123.83307112, 3017134.49944085, 3017128.00281342, 
    3017051.74335447, 3017004.41718255, 3017045.0942008, 3017148.44029969, 
    3017374.61816091, 3017480.90822985, 3017418.98736974, 3017262.35108648, 
    3017232.51001991, 3017301.84502218, 3017496.78422646, 3018117.03921131, 
    3018542.90609553, 3018915.93314532, 3019224.0682051, 3019513.82286606, 
    3019607.20339498, 3019555.12109982, 3019369.22962815, 3019124.07713922, 
    3018843.15713973, 3018384.2725625, 3018010.89539428, 3017367.36994335, 
    3017279.97399418, 3017148.58934714, 3016893.96397174), dim = c(37L, 
    2L))), class = c("XY", "POLYGON", "sfg"))), class = c("sfc_POLYGON", 
"sfc"), precision = 0, bbox = structure(c(xmin = -1455069.45910177, 
ymin = 3016791.51837831, xmax = -1452222.69288167, ymax = 3019607.20339498
), class = "bbox"), crs = structure(list(input = "NAD83 / Conus Albers", 
    wkt = "PROJCRS[\"NAD83 / Conus Albers\",\n    BASEGEOGCRS[\"NAD83\",\n        DATUM[\"North American Datum 1983\",\n            ELLIPSOID[\"GRS 1980\",6378137,298.257222101,\n                LENGTHUNIT[\"metre\",1]]],\n        PRIMEM[\"Greenwich\",0,\n            ANGLEUNIT[\"degree\",0.0174532925199433]],\n        ID[\"EPSG\",4269]],\n    CONVERSION[\"Conus Albers\",\n        METHOD[\"Albers Equal Area\",\n            ID[\"EPSG\",9822]],\n        PARAMETER[\"Latitude of false origin\",23,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8821]],\n        PARAMETER[\"Longitude of false origin\",-96,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8822]],\n        PARAMETER[\"Latitude of 1st standard parallel\",29.5,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8823]],\n        PARAMETER[\"Latitude of 2nd standard parallel\",45.5,\n            ANGLEUNIT[\"degree\",0.0174532925199433],\n            ID[\"EPSG\",8824]],\n        PARAMETER[\"Easting at false origin\",0,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8826]],\n        PARAMETER[\"Northing at false origin\",0,\n            LENGTHUNIT[\"metre\",1],\n            ID[\"EPSG\",8827]]],\n    CS[Cartesian,2],\n        AXIS[\"easting (X)\",east,\n            ORDER[1],\n            LENGTHUNIT[\"metre\",1]],\n        AXIS[\"northing (Y)\",north,\n            ORDER[2],\n            LENGTHUNIT[\"metre\",1]],\n    USAGE[\n        SCOPE[\"Data analysis and small scale data presentation for contiguous lower 48 states.\"],\n        AREA[\"United States (USA) - CONUS onshore - Alabama; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming.\"],\n        BBOX[24.41,-124.79,49.38,-66.91]],\n    ID[\"EPSG\",5070]]"), class = "crs"), n_empty = 0L)), row.names = 1L, class = c("sf", 
"data.frame"), sf_column = "geom", agr = structure(c(cover = NA_integer_), levels = c("constant", 
"aggregate", "identity"), class = "factor"))

Then I run:

terra::rasterize(x = poly, y = bps, fun = "sum", filename = "test.tif", overwrite = TRUE)

Which results in the following error:

Error: [rasterize] file exists. You can use 'overwrite=TRUE' to overwrite it

My sessionInfo() is:

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] magrittr_2.0.3 terra_1.7-18  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10         pillar_1.8.1        compiler_4.2.2      class_7.3-21        tools_4.2.2        
 [6] lifecycle_1.0.3     tibble_3.2.0        lattice_0.20-45     pkgconfig_2.0.3     rlang_1.1.0        
[11] DBI_1.1.3           cli_3.6.0           rstudioapi_0.14     parallel_4.2.2      e1071_1.7-13       
[16] withr_2.5.0         exactextractr_0.9.1 dplyr_1.1.0         raster_3.6-20       generics_0.1.3     
[21] vctrs_0.5.2         rprojroot_2.0.3     classInt_0.4-9      grid_4.2.2          tidyselect_1.2.0   
[26] here_1.0.1          glue_1.6.2          data.table_1.14.8   sf_1.0-12           R6_2.5.1           
[31] fansi_1.0.4         sp_1.6-0            stars_0.6-0         codetools_0.2-19    units_0.8-1        
[36] abind_1.4-5         fasterize_1.0.4     utf8_1.2.3          KernSmooth_2.23-20  proxy_0.4-27       
[41] lwgeom_0.2-11 

@mikoontz-vp
Copy link
Author

Maybe not relevant, but some additional context is that I'm trying terra::rasterize() after fasterize::fasterize() ran into a memory limit. I started having some success with exactextractr::exact_extract() which runs plenty fast and so far seems memory safe...

library(data.table)

this_year_count_DT <- 
  exactextractr::exact_extract(x = bps, 
                               y = poly,
                               include_xy = TRUE,
                               include_cell = TRUE) |> 
  data.table::rbindlist()

this_year_count_DT[, value := NULL]
this_year_count_DT[, covered := as.numeric(coverage_fraction > 0)]
this_year_count_DT <- this_year_count_DT[, .(count = sum(covered)), 
                                         by = .(x, y, cell)]

...but ran into trouble (R session aborted; memory issue?) when trying to re-populate a SpatRaster with a 1 for just the cells that were covered by the polygon.

updating_rast <- terra::init(x = bps,  fun = 0)

updating_rast[this_year_count_DT$cell][,1] <- 
  updating_rast[this_year_count_DT$cell][,1] + this_year_count_DT$count

@rhijmans
Copy link
Member

I believe this works now. At least the below works (simulating the conditions encountered with a large file):

library(terra)
p1 <- cbind(1, 1, rbind(c(-180,-20), c(-140,55), c(10, 0), c(-140,-60), c(-180,-20)), 0)
hole <- cbind(1, 2, rbind(c(-150,-20), c(-100,-10), c(-110,20), c(-150,-20)), 1)
p2 <- cbind(2, 1, rbind(c(-10,0), c(140,60), c(160,0), c(140,-55), c(-10,0)), 0)
p3 <- cbind(3, 1, rbind(c(-125,0), c(0,60), c(40,5), c(15,-45), c(-125,0)), 0)

x <- rbind(p1, hole, p2, p3)
v <- vect(x, "polygons")
r <- rast()

terraOptions(steps=4, todisk=T)
r = rasterize(v, r, fun=sum)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants