
memfrac lower than 0.1 #476

Closed
mstrimas opened this issue Jan 9, 2022 · 4 comments


@mstrimas
Contributor

mstrimas commented Jan 9, 2022

I'm running some terra code on a cluster with SLURM, where each node has 256 GB RAM and 64 cores. For a given job, the amount of RAM allocated is proportional to the number of cores (e.g. 4 GB for 2 cores). However, terra seems to think it has access to all 256 GB, and my jobs are consistently being killed for using too much RAM. I tried terraOptions(memfrac = 0.015) (4/256 = 0.015), but this silently uses 0.1 instead, which is the lower threshold specified in the terraOptions() documentation. Is there a way to use a lower memfrac, or another way to control terra's memory usage?

@rhijmans
Member

rhijmans commented Jan 9, 2022

Thanks for reporting this. I ran into the very same issue myself a while ago, and worked around it; but never fixed it.

I have now changed terra so that you can set memfrac to zero. terra will still use some memory, of course; in most cases the very minimum is one row of raster values (and perhaps a copy or two of those values). I need to do some testing, as this may fail for some functions that need multiple rows, such as focal; but that can be fixed.

This will go to CRAN very soon, but it may take a while to hit your cluster. With older versions, a work-around is the steps argument, which you can provide as an additional argument, or as an element of wopt if the additional arguments are used for something else. The extreme case would be steps=nrow(x) (where x is the input SpatRaster), but you could also do something like steps=nrow(x)/10.
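A minimal sketch of that work-around, assuming a terra version where steps is accepted in the wopt write-options list (the function and option names follow the terra documentation; the chunk count of 10 is arbitrary):

```r
library(terra)

# Small demo raster standing in for a large input
x <- rast(nrows = 1000, ncols = 1000, vals = 1:1e6)

# Force processing in ~10 chunks instead of letting terra size chunks
# from the RAM it believes is available. 'steps' goes in 'wopt' because
# app() forwards its other additional arguments elsewhere.
f <- file.path(tempdir(), "out.tif")
y <- app(x, fun = sqrt, filename = f, overwrite = TRUE,
         wopt = list(steps = 10))
```

Setting steps = nrow(x) instead would process a single row at a time, trading speed for the smallest possible memory footprint.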

In my case, I opted for processing by tiles, which can be nice if you have relatively few, but very large files. It is a bit more involved, and it is on my to-do list to generalize that and make it available to all functions, as an option. It would allow you to specify the number of tiles, and the tile number so that you can parallelize. Tiles go to a folder with a .vrt file that virtually combines them. The benefit was that if a job got killed, none of the tiles already done were lost.
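The tiling pattern described above can be sketched with terra's makeTiles() and vrt() (available in recent terra versions; the 2 x 2 tiling template and file paths here are illustrative assumptions):

```r
library(terra)

# Demo input standing in for a large raster
x <- rast(nrows = 1000, ncols = 1000, vals = 1:1e6)

# Template raster whose cells define the tile extents (2 x 2 tiling)
template <- rast(ext(x), nrows = 2, ncols = 2)

# Write the tiles to a folder; each tile can be (re)processed independently,
# so a killed job does not lose tiles that were already finished
dir <- file.path(tempdir(), "tiles")
dir.create(dir, showWarnings = FALSE)
ff <- makeTiles(x, template, filename = file.path(dir, "tile_.tif"),
                overwrite = TRUE)

# Virtually recombine the tiles with a .vrt file
v <- vrt(ff, file.path(dir, "combined.vrt"))
```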

One reason why I did not fix this earlier is that I was wondering whether there would be a generic way to discover the number of cores that share the available RAM, so that the memory available could be adjusted automatically; but that may be rather complicated, or system-specific, like calling SLURM commands.
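For SLURM specifically, the memory actually granted to a job can be read from environment variables the scheduler sets (SLURM_MEM_PER_NODE and SLURM_MEM_PER_CPU, in MB, per the SLURM documentation). A minimal sketch; the helper name is hypothetical and nothing like it exists in terra:

```r
# Hypothetical helper: derive the RAM granted to a SLURM job, in GB.
# Returns NA when not running under SLURM.
slurm_mem_gb <- function() {
  per_node <- Sys.getenv("SLURM_MEM_PER_NODE", unset = NA)
  per_cpu  <- Sys.getenv("SLURM_MEM_PER_CPU",  unset = NA)
  cpus     <- Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1")
  if (!is.na(per_node)) return(as.numeric(per_node) / 1024)
  if (!is.na(per_cpu))  return(as.numeric(per_cpu) * as.numeric(cpus) / 1024)
  NA_real_
}
```

A value obtained this way could then be handed to whatever option caps terra's view of available RAM, rather than relying on the machine's physical total.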

However, another possibility would be to allow setting the available RAM as an option. I think I will add that.

@mstrimas
Contributor Author

mstrimas commented Jan 9, 2022

This should be perfect! Also, I wasn't aware of the steps argument; that should come in handy. Thanks!

rhijmans added a commit that referenced this issue Jan 9, 2022
@rhijmans
Member

rhijmans commented Jan 9, 2022

terra now has a new option, memmax, that allows you to cap the amount of RAM (expressed in GB) that is deemed available.

library(terra)
#terra 1.5.9
r <- rast(res=1/1200)
mem_info(r)

#------------------------
#Memory (GB) 
#------------------------
#available       : 50.31
#allowed (60%)   : 30.18
#needed (n=1)    : 695.23
#------------------------
#proc in memory  : FALSE
#nr chunks       : 39
#------------------------

## Cap available RAM to 10 GB

terraOptions(memmax=10)
terraOptions()
#memfrac   : 0.6
#tempdir   : C:/temp/RtmpMVMgRQ
#datatype  : FLT4S
#progress  : 3
#todisk    : FALSE
#verbose   : FALSE
#tolerance : 0.1
#memmax    : 10

mem_info(r)

#------------------------
#Memory (GB) 
#------------------------
#available       : 10  (memmax)
#allowed (60%)   : 6
#needed (n=1)    : 695.23
#------------------------
#proc in memory  : FALSE
#nr chunks       : 194
#------------------------

## Reset memmax with NA or any number <= 0

terraOptions(memmax=NA)
mem_info(r)

#------------------------
#Memory (GB) 
#------------------------
#available       : 50.77
#allowed (60%)   : 30.46
#needed (n=1)    : 695.23
#------------------------
#proc in memory  : FALSE
#nr chunks       : 39
#------------------------

@mstrimas
Contributor Author

Thanks! It's a huge help to be able to control memory usage either by a fraction of the total or by an absolute amount of RAM.
