added map_as_series #256
This sounds great, any estimate on the performance boost over |
@d-v-b from my anecdotal experience, it can be the difference between the job completely failing and being able to run to completion! We decided to go this route rather than implementing a stand-alone |
Would be really cool to report at least one benchmark alongside this change: just pick some fairly big representative workflow and time the two methods, though obviously not so big that the older method fails. |
Tests pass... merging. I'll get back with some stats on how this performs compared to |
@d-v-b @freeman-lab @sofroniewn Preliminary analysis using a 20 node cluster (19 workers): https://gist.github.com/jwittenbach/dca311743395d904c3d7 The last cell is still running...25 minutes later |
Also bumped number of nodes up to 40 (39 workers). The new |
Nice, that's great! |
Nice, I hope we can kiss those hanging stages goodbye |
Adds an `Image.map_as_series` method that uses `Blocks` to apply a function to each series in an `Images` object and then turn the data back into an `Images` object. This avoids needing to transform the data all the way to a `Series` representation, which can be quite expensive to turn back into `Images` due to the high level of fragmentation that can occur when the total size of the spatial dimensions greatly exceeds the size of the temporal dimension.
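As a rough illustration of the idea (not the code in this PR), here is a minimal NumPy-only sketch. It assumes the data is a single in-memory array of shape `(time, x, y)`; the standalone function `map_as_series`, its `block_shape` argument, and the `detrend` example are hypothetical stand-ins and do not reflect the actual `Blocks` API. Each spatial block keeps its full time axis, so the function can be applied to every pixel's series and the result written straight back in image layout, without ever building a one-record-per-pixel representation.

```python
import numpy as np


def map_as_series(images, func, block_shape):
    """Apply `func` to each pixel's time series, one spatial block at a time.

    images      : array of shape (time, x, y)
    func        : maps a 1-D array (one series) to a 1-D array
    block_shape : (bx, by) spatial size of each block (hypothetical parameter)
    """
    _, nx, ny = images.shape
    bx, by = block_shape
    out = None
    for x0 in range(0, nx, bx):
        for y0 in range(0, ny, by):
            # A block spans all time points for a small spatial patch.
            block = images[:, x0:x0 + bx, y0:y0 + by]
            # Apply func along the time axis of every pixel in the block.
            result = np.apply_along_axis(func, 0, block)
            if out is None:
                # Allocate the output once the new series length is known.
                out = np.empty((result.shape[0], nx, ny), dtype=result.dtype)
            # Write the block back in image layout.
            out[:, x0:x0 + bx, y0:y0 + by] = result
    return out


def detrend(series):
    # Example per-series function: subtract the temporal mean.
    return series - series.mean()


images = np.arange(4 * 6 * 5, dtype=float).reshape(4, 6, 5)
result = map_as_series(images, detrend, block_shape=(4, 2))
```

In a distributed setting the two loops would presumably be replaced by a map over blocks, but the per-block logic would look much the same.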