Control of parallel computation using fasterFragmentation() #24

Closed
scrameri opened this issue Jan 7, 2021 · 7 comments

Comments

@scrameri

scrameri commented Jan 7, 2021

Hi Adam,

It's super nice to have fasterRaster for work involving large raster files! Projection and similar calculations always take so much time using the raster package, even when using beginCluster(), so thank you very much for this great tool!

I saw some unexpected behaviour when doing parallel computing with fasterFragmentation() using specific arguments for cores and forceMulti.

Typically, the program starts normally, using a single core for 1-2 minutes. Then parallel processing kicks in and runs at nearly 100% total CPU usage, but spread over many (>20) individual processes, each with low %CPU. After a while the computer needs to "cool down" for 1-2 minutes before it kicks back into parallel processing. On my best computer, the computation finished after 5 hours, but on my laptop (4 cores, 16 GB RAM) this behaviour caused an R session interruption.

  • Isn't this unexpected given that cores = 3 and forceMulti = FALSE?
  • Is there an issue in fasterRaster, or is the issue in the evoked snow or parallel packages?
  • Any advice on which parameter combination to use for such a large raster?

Here are two .png screenshots of my activity manager (CPU consumption) during the function execution.

I can send you the raster so you can try to reproduce the behaviour. I've seen similar behaviour on Mac, Windows and Linux.

Thanks again and best wishes,
Simon

##############################

Here is the code:

> cores
[1] 3
> forceMulti
[1] FALSE
> frag <- fasterFragmentation(rast = input, size = 5, pad = TRUE, padValue = NA, calcDensity = TRUE, calcConnect = TRUE, 
                              calcClass = TRUE, na.rm = TRUE, undet = "perforated", cores = cores, forceMulti = forceMulti)
  

This is what the input raster looks like (it has NA, 0, 1 values):

> input
class      : RasterLayer 
dimensions : 50884, 26746, 1360943464  (nrow, ncol, ncell)
resolution : 30, 30  (x, y)
extent     : 298440, 1100820, 7155900, 8682420  (xmin, xmax, ymin, ymax)
crs        : +proj=utm +zone=38 +south +datum=WGS84 +units=m +no_defs 
source     : /.../forest.tif 
names      : forest 
values     : 0, 1  (min, max)

Here is my computer info:

  • macOS Catalina (10.15.7)
  • 1 processor (4 GHz Quad-Core Intel Core i7)
  • 4 processor cores
  • 32 GB RAM

Here is my session info:

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] fasterRaster_0.6.0 rgrass7_0.2-3      XML_3.99-0.5       raster_3.4-5       sp_1.4-4           link2GI_0.4-5     

loaded via a namespace (and not attached):
 [1] xfun_0.19          tidyselect_1.1.0   remotes_2.2.0      purrr_0.3.4        sf_0.9-6           lattice_0.20-41   
 [7] vctrs_0.3.6        generics_0.1.0     testthat_3.0.1     htmltools_0.5.0    usethis_2.0.0      base64enc_0.1-3   
[13] rlang_0.4.10       pkgbuild_1.2.0     e1071_1.7-4        pillar_1.4.7       glue_1.4.2         withr_2.3.0       
[19] DBI_1.1.0          sessioninfo_1.1.1  lifecycle_0.2.0    stringr_1.4.0      devtools_2.3.2     htmlwidgets_1.5.3 
[25] codetools_0.2-18   memoise_1.1.0      knitr_1.30         callr_3.5.1        ps_1.5.0           crosstalk_1.1.0.1 
[31] class_7.3-17       fansi_0.4.1        leafem_0.1.3       Rcpp_1.0.5         KernSmooth_2.23-18 classInt_0.4-3    
[37] desc_1.2.0         pkgload_1.1.0      leaflet_2.0.3      fs_1.5.0           png_0.1-7          digest_0.6.27     
[43] stringi_1.5.3      processx_3.4.5     dplyr_1.0.2        grid_4.0.2         rprojroot_2.0.2    rgdal_1.5-18      
[49] cli_2.2.0          tools_4.0.2        magrittr_2.0.1     tibble_3.0.4       crayon_1.3.4       pkgconfig_2.0.3   
[55] ellipsis_0.3.1     xml2_1.3.2         prettyunits_1.1.1  assertthat_0.2.1   roxygen2_7.1.1     rstudioapi_0.13   
[61] R6_2.5.0           units_0.6-7        compiler_4.0.2 
@adamlilith
Owner

Hi Simon,

Thanks for your kind words and your message. Some of the behavior you noted is expected and some is not. The fasterFragmentation() function calls fasterFocal() twice to calculate density and connectivity. I think the short waiting period (with 1 core running) is probably caused by the padding the fasterFocal() function does. This adds a few rows and columns of NAs to the outside of the raster so the focal function can get a full "window" of cells when the window is near the edge. You can turn this off in fasterFragmentation() with the argument pad = FALSE, but I see you have it set to TRUE and I think you're probably right to keep it that way (unless for some reason you don't care about cells on the edge of the raster).
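
As a rough illustration of the padding idea described above (a base R sketch on a plain matrix, not fasterRaster's actual code; `padMatrix` and `halfWidth` are hypothetical names), a window of size 5 reaches 2 cells beyond its centre, so 2 rows and columns of NA are added on every side:

```r
# Hypothetical helper (not part of fasterRaster): surround a matrix with
# `halfWidth` rows and columns of `padValue` so a focal window fits at the edges.
padMatrix <- function(m, halfWidth, padValue = NA) {
  out <- matrix(padValue,
                nrow = nrow(m) + 2 * halfWidth,
                ncol = ncol(m) + 2 * halfWidth)
  rows <- halfWidth + seq_len(nrow(m))
  cols <- halfWidth + seq_len(ncol(m))
  out[rows, cols] <- m  # original values sit in the centre
  out
}

m <- matrix(c(1, 0, 1, 1, 0, 1), nrow = 2)  # toy 2 x 3 "raster"
# A 5 x 5 window (size = 5) extends 2 cells beyond its centre cell.
padded <- padMatrix(m, halfWidth = 2)
dim(padded)
```

For a raster with over a billion cells, building the padded copy is a single-threaded step, which would be consistent with the initial one-core phase.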

The other behavior, with multiple hidden R sessions running is very odd--I see you can get quite a lot! fasterFragmentation() and fasterFocal() both call a function named .getCores() (it's a hidden function but you can see it with fasterRaster:::.getCores). That function should limit the number of cores you can use to the maximum number on your system, regardless of how many you want it to be. I see, though, that you are requesting 3 cores, and as you said, your system has 4. So it's very odd that you would have so many R instances running, and I don't have a firm explanation for this.
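
The capping behaviour described above can be sketched as follows. This is only an assumption about the general idea, not the real `.getCores()` (which also takes `rast` and `forceMulti`, ignored here); `capCores` is a hypothetical name:

```r
library(parallel)  # ships with base R

# Hypothetical stand-in for the capping step; not the real .getCores().
capCores <- function(requested, available = parallel::detectCores()) {
  if (is.na(available)) available <- 1L  # detectCores() can return NA
  max(1L, min(as.integer(requested), as.integer(available)))
}

capCores(3, available = 4)   # a request below the limit is kept
capCores(16, available = 4)  # a request above the limit is capped
```

Under this logic a request for 3 cores on a 4-core machine stays at 3, so the cap alone does not explain more than 20 child processes.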

It's possible that I did not call raster::endCluster() correctly in fasterFocal() and fasterFragmentation(), which would end the child R processes when they're done. If this is so, fasterFragmentation() would start a set of cores, then when they're done with their job, start another set but keep the first ones running, and so on. I just added an additional raster::endCluster() to each of these functions and updated them on the branch called "cores". If you would be so kind, could you please re-install fasterRaster from that branch and see if you get the same behavior?
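
For illustration, one common way to make sure workers are always shut down is to register the cleanup with `on.exit()`. The sketch below uses the base `parallel` package rather than `raster::beginCluster()` / `raster::endCluster()`, so it shows the pattern only and is not the change made on the "cores" branch; `runOnCluster` is a hypothetical name:

```r
library(parallel)  # ships with base R

# Hypothetical wrapper: run a job on a PSOCK cluster and always stop the workers.
runOnCluster <- function(x, fun, cores = 2) {
  cl <- makeCluster(cores)
  # Registered immediately, so the workers are stopped even if parLapply() errors.
  on.exit(stopCluster(cl), add = TRUE)
  parLapply(cl, x, fun)
}

res <- runOnCluster(1:4, function(i) i^2)
unlist(res)
```

If the cleanup call is only placed at the end of the function body, an error or interrupt before that line leaves the child R processes running.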

remotes::install_github('adamlilith/fasterRaster', ref='cores')

It's also possible that if you close the R session you manually started, but it created some child R sessions, that the children are still running. On my machine they "die" soon after I close the parent, but they might keep running in vain. If that's the case, I suggest either stopping the processes manually or (sorry) restarting your computer.

Thanks again,
Adam

@scrameri
Author

scrameri commented Jan 7, 2021

Hi Adam,

Thanks for getting back to me so swiftly, and thanks for all these explanations!

Here is the output of .getCores():

> fasterRaster:::.getCores(rast = input, cores = cores, forceMulti = forceMulti)
[1] 3

Here are the suggested chunk sizes of my input raster:

> raster::blockSize(input, minblocks = cores)
$row
  [1]     1   468   935  1402  1869  2336  2803  3270  3737  4204  4671  5138  5605  6072  6539  7006
 [17]  7473  7940  8407  8874  9341  9808 10275 10742 11209 11676 12143 12610 13077 13544 14011 14478
 [33] 14945 15412 15879 16346 16813 17280 17747 18214 18681 19148 19615 20082 20549 21016 21483 21950
 [49] 22417 22884 23351 23818 24285 24752 25219 25686 26153 26620 27087 27554 28021 28488 28955 29422
 [65] 29889 30356 30823 31290 31757 32224 32691 33158 33625 34092 34559 35026 35493 35960 36427 36894
 [81] 37361 37828 38295 38762 39229 39696 40163 40630 41097 41564 42031 42498 42965 43432 43899 44366
 [97] 44833 45300 45767 46234 46701 47168 47635 48102 48569 49036 49503 49970 50437

$nrows
  [1] 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467
 [26] 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467
 [51] 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467
 [76] 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467 467
[101] 467 467 467 467 467 467 467 467 448

$n
[1] 109
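
The block layout above can be reproduced with simple arithmetic once the number of blocks is known. This sketch takes `n = 109` from the output as given; it does not model how `blockSize()` arrives at that number, and the variable names are illustrative:

```r
nRows <- 50884  # rows in the input raster
n <- 109        # number of blocks reported by blockSize()

size <- ceiling(nRows / n)                   # rows per full block
starts <- seq(1, by = size, length.out = n)  # first row of each block
nrows <- pmin(size, nRows - starts + 1)      # last block holds the remainder

size
range(starts)
tail(nrows, 1)
sum(nrows) == nRows
```

Note that `minblocks = cores` only sets a lower bound: 109 blocks were returned even though `cores` is 3, so the number of blocks is not the same as the number of worker processes.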

I installed the @cores branch fasterRaster version 0.6.1, restarted the computer and started another call of the fasterFragmentation() function as above, in a clean RStudio session.

As far as I can see, the odd behaviour did not change (see the two new .pngs), it's still running > 20 child R processes.

It's (very) late here now, I'll let it run over night and report back tomorrow, we'll see if it ran through on the laptop.

Thanks again,
Simon

@adamlilith
Owner

Hi Simon,

Again, sorry for the trouble! Frustratingly, I am unable to recreate the problem on the two machines (both Win 10) I have available to me. I did search StackExchange under all entries for "R snow" but the only thing I found that was relevant was someone not being able to stop child workers when the children were on a different computer from the one they were called from, which I don't think is related to your issue.

However, I did try de-anonymizing the child worker function in fasterFocal(). fasterFragmentation() calls fasterFocal() twice, once to calculate density and once to calculate connectivity. There was some mention on StackExchange that using private functions could cause problems with multi-coring. I don't know if that would help, but if you want to try it, I've put the changes into a new branch named cores3 (it was easier to start a new one).

If this doesn't work, I could redo fasterFocal() and fasterFragmentation() to use foreach(). Another alternative is to use the terra package, which is the successor to the raster package. terra is on CRAN but in beta form so in some cases I have found that it does not work as expected, which is why so far I've been relying on the older, slower, but stable raster package.

Adam

@scrameri
Author

scrameri commented Jan 8, 2021

Hi Adam,

No problem, it's great that you are so strongly committed to fixing this issue!

Update on the @cores branch (version 0.6.1)

  • RStudio session froze during connectivity calculation, after completing the density calculation step (but I could save the density raster from the tmpDir(), it looks complete)
  • tried to complete the connectivity calculation manually by calling fasterFocal() with fun = .fragConnect, as is done inside fasterFragmentation(), but it froze again after some time, and the activity manager shows that many processes started, each at very low %CPU usage.

Update on @cores3 branch (version 0.6.2)

  • running now but again, there are > 20 individual R child processes, each with low %CPU usage (total CPU usage is at nearly 100%).
  • again, the density calculation ran through, we'll see if the connectivity gets finished by tomorrow. Computer has already slowed down, I could barely send off this post.

The hardest part appears to be the connectivity calculation, but the excessive R child processes appear to be started during both the density and connectivity calculations.

This behaviour was on my MacBook (Catalina, 4 cores, 16 GB RAM). I could try both side branches on the desktop computer if it helps, just to make sure it's not a weird behaviour on a particular computer.

I'm confident that eventually I'll manage to get the calculations done for all the rasters; it just takes some patience and maybe some "manual" calculation of forest connectivity and fragmentation class.

I've also tried running fragmentation (single-core mode), but it didn't finish after 24 hours.

Thank you and best wishes,
Simon

@scrameri
Author

scrameri commented Jan 10, 2021

Hi Adam,

I let the computer work over the weekend and all fragmentation calculations have successfully run through for 7 forest cover maps (same extent, derived from 7 different years of satellite images). They ran at constantly high numbers of R child processes during the density and connectivity calculations. The many (>20) child processes did not disappear with any fasterRaster version / branch, though, using cores = 3 and forceMulti = FALSE on two different Mac computers.

On my end, I'm quite happy with the result at the moment, and I can now proceed with a sampling of training / test data and deforestation location modeling.

However, if fragmentation class turns out to be an important predictor of deforestation probability, then I'd need to run the fasterFragmentation() function for every predicted future forest cover, which would take dozens of days of computations on a desktop computer for e.g. annual predictions of 60 years.

At the moment, I don't dare to run the calculations on a shared high-performance cluster, given the unusual behaviour and the generation of so many R child processes. I begin to understand why these sorts of calculations have probably not yet been attempted for very large rasters, but mainly for defined smaller areas :)

Best wishes,
Simon

@adamlilith
Owner

Hi Simon,

Well, I'm glad it worked! I still don't know why so many processes spawned. If it wasn't stopping the child processes, from what you told me, it should have spawned at most 12 of them in total, but you were getting many more than that. I won't close this issue since it's not resolved, but since I can't reproduce it on my Windows machines I can't resolve it easily.

Thanks again for bringing this to my attention.

Adam

@adamlilith adamlilith reopened this Aug 5, 2022
@adamlilith
Owner

This function has been superseded by fragmentation() in fasterRaster 8.3.0 and above.
