usage with multiple threads? #29
Comments
There is currently no support for this. In particular, two processes could write to the same cache file simultaneously and produce a corrupted file, e.g. if both processes called a function with the same arguments at the same time. To avoid this you would need some sort of file locking, maybe with the flock package, although that package is not on CRAN, would need to be tested on Windows, and this would likely need its own cache, since file locking is only necessary for multi-process code.
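For illustration, a minimal sketch of that locking idea, assuming the flock package's lock()/unlock() interface and memoise::cache_filesystem(); the function names and paths here are made up:

```r
library(memoise)
library(flock)  # assumed API: lock(path) returns a handle, unlock(handle) releases it

# Hypothetical slow function, memoised to an on-disk cache shared by all processes.
slow <- function(x) { Sys.sleep(1); x^2 }
slow_memo <- memoise(slow, cache = cache_filesystem("shared_cache"))

# Serialise access to the shared cache with an exclusive lock file, so two
# processes cannot write the same cache file at the same time.
slow_locked <- function(x) {
  lk <- flock::lock("shared_cache.lock")
  on.exit(flock::unlock(lk), add = TRUE)
  slow_memo(x)
}
```

Calling slow_locked() instead of slow_memo() serialises cache access across processes, at the cost of holding the lock for the full duration of a cache miss.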
Thanks for the quick response!
A possible implementation could be to maintain a cache per worker, using the process ID as a prefix to the filename. This way repeated calls within each worker would indeed be sped up. I mostly have a workflow like this:

```r
library(parallel)
library(memoise)

fn <- function() {
  # 'objects' and 'my.slow.computation' stand for the user's data and function
  cl <- makeCluster(detectCores(), outfile = '')
  tryCatch({
    result <- parLapply(cl, objects, FUN = my.slow.computation)
  }, finally = stopCluster(cl))
}
fn <- memoise(fn)
fn()
```

Perhaps that might help in other workflows too.
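A rough sketch of that per-worker cache idea, assuming memoise::cache_filesystem() and using each worker's process ID to keep the cache directories separate (all names here are made up):

```r
library(parallel)
library(memoise)

# Shared parent directory for all per-worker caches.
cache_root <- normalizePath("memo_cache", mustWork = FALSE)
dir.create(cache_root, showWarnings = FALSE)

cl <- makeCluster(detectCores())
clusterExport(cl, "cache_root")

# Each worker memoises its own copy of the slow function, with a cache
# directory named after its process ID, so no two workers ever write to
# the same cache files.
clusterEvalQ(cl, {
  library(memoise)
  my.slow.computation <- memoise(
    function(x) { Sys.sleep(1); x^2 },  # placeholder for the real computation
    cache = cache_filesystem(file.path(cache_root, Sys.getpid()))
  )
  NULL
})

result <- parLapply(cl, 1:8, function(x) my.slow.computation(x))
stopCluster(cl)
```

As noted, this only speeds up repeated calls within the same worker and the same run, since process IDs change between sessions.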
We could use @gaborcsardi's new file locking package
The package flock is on CRAN and it works really well.
Just to add to the ideas already given: I acquired the lock on the cache file just before the return statement in my memoised function, and released the lock immediately after the call to the memoised function in the calling environment.
My understanding is that […]. Is there a way to create new subprocesses inside parallel processes that can be used by […]? Thank you!
This looks like a great package. It's saving me and my collaborators a lot of unnecessary computation time.
I was wondering how the package would perform if a memoised function were run in parallel across several threads, especially with caches stored on the filesystem. Given that the hashes are deterministic, it doesn't seem like there would be a problem, but I didn't see anything specifically about it in the documentation, so I thought it would be good to ask.
Thanks in advance!
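For concreteness, a small sketch of the setup being asked about, assuming memoise::cache_filesystem() and a local PSOCK cluster (the function names are made up):

```r
library(parallel)
library(memoise)

slow <- function(x) { Sys.sleep(1); x^2 }  # placeholder computation
slow_memo <- memoise(slow, cache = cache_filesystem("shared_cache"))

cl <- makeCluster(4)
clusterEvalQ(cl, library(memoise))
clusterExport(cl, "slow_memo")

# Every worker reads from and writes to the same cache directory. Cache hits
# are fine, but as the first reply notes, two workers computing the same key
# at the same moment could write the same cache file simultaneously.
result <- parLapply(cl, c(1:4, 1:4), slow_memo)

stopCluster(cl)
```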