Request for *_ply to take parallel? #60
Comments
Yes, definitely. Just need to find the time to work on it.
Glad to see this will be happening. I commonly operate on a large set of large files, performing batch GIS analysis via sp and raster on a large cluster. This is similar to the above case, where intermediate and final products are stored as files rather than as R objects.
While we're on this topic - would it be beneficial for the **ply() family of functions to also take an argument indicating whether side-effects should be propagated or not? If they knew that they only needed to transmit the return value back to the caller, and not transmit the entire set of changes that happened to the R environment, they could optimize the parallel execution environment. Or is that best handled by encapsulating the behavior in the
I think that should be the default - and if you want side effects propagated outside, you need to use the parallel tools directly.
Yeah, that seems reasonable, because it's not clear what automatic/default strategy should be used to merge all the effects back into the main process.
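To illustrate the point about return values versus side effects, here is a minimal sketch using base R's `parallel` package directly (not plyr's own parallel support, whose internals are not described in this thread). The function name `square_and_mark` and the variable `last_seen` are made up for the example. With a socket cluster, each worker is a separate R process, so only the return value travels back to the caller.

```r
library(parallel)

# Hypothetical worker function: returns a value and also has a side effect.
square_and_mark <- function(i) {
  # Side effect: writes into the global environment of whichever
  # process evaluates this call (here, a worker process).
  assign("last_seen", i, envir = globalenv())
  i^2
}

cl <- makeCluster(2)  # two local worker processes
results <- parLapply(cl, 1:4, square_and_mark)
stopCluster(cl)

# The return values are transmitted back to the calling process.
squares <- unlist(results)

# The side effect stayed in the workers: in a fresh session,
# `last_seen` was never created in the calling process.
side_effect_visible <- exists("last_seen", envir = globalenv())
```

In a fresh session `squares` holds the four squared values and `side_effect_visible` is `FALSE`, which is the "return value only" default discussed above.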
Sorry if this is the wrong place, but I wondered why d_ply doesn't have a parallel argument. I'm splitting up a data frame and drawing graphs with it, which seems a good use for parallelization. I did a quick search of the list, and Hadley said back in Feb that it isn't supported. Is this something that might be supported?
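For anyone with the same use case, one possible workaround is to split the data frame yourself and use the `parallel` package directly. This is only a sketch under assumptions: the data frame, its column names, the `plot_piece` helper, and the output file names are all hypothetical, and it uses base R rather than any plyr feature. Each plot is written to a file, so nothing but the file path needs to come back from the worker.

```r
library(parallel)

# Hypothetical example data: one plot per level of `group`.
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 10),
  x = rep(1:10, times = 3),
  y = rnorm(30)
)

pieces <- split(df, df$group)  # list of data frames, one per group
out_dir <- tempdir()           # assumed output location

# Hypothetical helper: draw one piece into its own PDF, return the path.
plot_piece <- function(name, pieces, out_dir) {
  piece <- pieces[[name]]
  path <- file.path(out_dir, paste0("plot_", name, ".pdf"))
  grDevices::pdf(path)
  on.exit(grDevices::dev.off())  # close the device even if plotting fails
  graphics::plot(piece$x, piece$y, main = name)
  path
}

cl <- makeCluster(2)  # two local worker processes
paths <- parLapply(cl, names(pieces), plot_piece,
                   pieces = pieces, out_dir = out_dir)
stopCluster(cl)
```

On Unix-like systems, `mclapply()` from the same package would be a shorter alternative to creating a cluster; the socket cluster is used here because it also works on Windows.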
Thanks,
James