Force garbage collection after running FUN. #124

Merged
merged 1 commit into from Oct 17, 2020

Conversation

LTLA
Contributor

@LTLA LTLA commented Oct 17, 2020

This provides one solution to the problem discussed today on Slack, to wit:

library(BiocParallel)
big <- runif(5e8)
BPPARAM <- MulticoreParam(workers=10)
bplapply(1:1000, function(i) { out <- sum(runif(1e7)); out }, BPPARAM=BPPARAM)

This should use 4 GB for big plus another 80 MB per worker, totalling just under 5 GB. Indeed, my laptop reports about 6 GB RAM used after big is constructed, consistent with a bit of OS overhead. However, running the bplapply causes my laptop to go into swap, despite having a total of 16 GB RAM, which should be more than enough to handle the 800 MB across all workers.
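The budget above can be checked with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes 8 bytes per double and ignores R's per-object header and any OS overhead, and the variable names are made up for illustration.

```r
# Rough memory budget for the example above (assumes 8-byte doubles and
# ignores the small per-vector header that R adds).
bytes_per_double <- 8
n_workers <- 10

big_bytes    <- 5e8 * bytes_per_double   # the `big` vector in the parent
worker_bytes <- 1e7 * bytes_per_double   # one runif(1e7) per worker

big_gb     <- big_bytes / 1e9                             # 4 GB
workers_mb <- n_workers * worker_bytes / 1e6              # 800 MB in total
total_gb   <- (big_bytes + n_workers * worker_bytes) / 1e9  # 4.8 GB
```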

I think the underlying problem is that each forked process believes that the entirety of the parent's heap is available. When allocations are made in one child, I assume that the affected space doesn't show up as being used in another child; rather, the pages are copied so that the second child can still use that space for its own allocations. This increases the overall memory usage as expected, but the real issue is that this slips past the garbage collector. Within each child, there is no reason to trigger the GC as - for all it knows - it has plenty of remaining memory in the heap to work with, so why bother? As a consequence, each worker uses a profligate amount of memory across repeated FUN calls, roughly equivalent to the size of the heap in the parent.

This PR works towards a solution by forcing garbage collection at the end of each evaluation of FUN, which allows the code above to proceed without entering swap: total RAM usage hovers around 7 GB, consistent with the budgeting above. There is probably a better place to put this than .composeTry; that was just for convenience. I could also imagine an additional generic to only perform garbage collection in certain parallelization contexts (e.g., forking only) and in a user-tunable manner.
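The idea can be sketched outside the package as a wrapper around FUN. This is not the code in .composeTry; `with_gc` is a hypothetical helper used only to illustrate forcing a collection after each call, and it is run here with plain lapply so that it does not depend on any parallel backend.

```r
# Hypothetical helper (not part of BiocParallel): wrap FUN so that a
# garbage collection is forced after every call, whether FUN returns
# normally or signals an error.
with_gc <- function(FUN, full = FALSE) {
    force(FUN)
    force(full)
    function(...) {
        # on.exit() runs once FUN(...) has been evaluated, so objects
        # that were local to FUN are no longer referenced when gc() runs.
        on.exit(gc(verbose = FALSE, full = full))
        FUN(...)
    }
}

work <- function(i) sum(runif(1e6))

# The wrapped function returns the same kind of value as the original.
res <- lapply(1:5, with_gc(work))
```

Passing `with_gc(work)` as FUN to bplapply with a MulticoreParam would presumably have a similar effect to this PR, though that is an assumption rather than something tested here.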

@mtmorgan mtmorgan closed this in 250d884 Oct 17, 2020
@mtmorgan mtmorgan merged commit 1ece9d0 into Bioconductor:master Oct 17, 2020
@mtmorgan
Collaborator

thanks that seems really helpful.

mtmorgan added a commit that referenced this pull request Oct 17, 2020
@DarwinAwardWinner

How much does this affect performance? I seem to recall garbage collection can be pretty slow sometimes. Maybe this should be an option when calling MulticoreParam?

@HenrikBengtsson
Contributor

HenrikBengtsson commented Oct 17, 2020

Also, something I have always wanted to investigate but never got around to: is there a risk that garbage collection and finalizers in a child process break the shared memory and trigger memory copies? Was the GC written with forked processing in mind? Can it make things worse? Should there be a way to disable the GC in forks?

@mtmorgan
Collaborator

@DarwinAwardWinner my feeling is that the coarse granularity of bplapply tasks makes the performance consequences of gc() moot.
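A rough way to check this for a particular workload is to time a task with and without a forced collection. The sketch below is only illustrative: the task size and repeat count are arbitrary assumptions, the function names are made up, and the result will vary with the machine and the size of the heap.

```r
# Compare elapsed time for a task with and without a forced collection.
# The task size (1e6) and repeat count (20) are arbitrary choices.
task <- function(i) sum(runif(1e6))

task_gc <- function(i) {
    # Collect after the task body has been evaluated.
    on.exit(gc(verbose = FALSE, full = FALSE))
    sum(runif(1e6))
}

n_rep <- 20
t_plain <- system.time(for (i in seq_len(n_rep)) task(i))[["elapsed"]]
t_gc    <- system.time(for (i in seq_len(n_rep)) task_gc(i))[["elapsed"]]

# Approximate extra seconds per call attributable to the forced gc().
overhead_per_call <- (t_gc - t_plain) / n_rep
```

If the per-call overhead is small relative to the run time of a typical FUN, the cost of forcing a collection is unlikely to matter; for very short tasks it may.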

@HenrikBengtsson this seems orthogonal to the PR -- if finalizers or gc() in forked processes are a problem, then it's probably better to have an explicit garbage collection and expose these problems, than to rely on intermittent garbage collection / errors.

@LTLA
Contributor Author

LTLA commented Oct 18, 2020

Yes, I set full=FALSE to try to only collect the objects that were allocated inside FUN, to solve the immediate problem while avoiding the potential costs of a full collection. That said, I'm not sure how safe/effective this is in general.

A safer solution would be to not second-guess the GC and just cap the heap for workers. This might be possible for SnowParam given that it starts a new R session anyway, but a runtime decrease in memory seems difficult for MulticoreParam.

mtmorgan added a commit that referenced this pull request Sep 17, 2021
- set whether to force R garbage collection (expensive!) on every call to
  FUN()
- change default behavior -- force only for MulticoreParam
- improves #124
mtmorgan added a commit that referenced this pull request Sep 19, 2021
- set whether to force R garbage collection (expensive!) on every call to
  FUN()
- change default behavior -- force only for MulticoreParam
- improves #124
- only TRUE or FALSE allowed, with appropriate defaults in constructor