Improvements to parallelization and related issues #27

Merged

Conversation

DarwinAwardWinner
Contributor

With the changes in d36ff29, the number of chunks should now always be an exact multiple of the number of workers, if that number is available.

Note that I've chosen to adjust the number of chunks upward, to the smallest multiple of the number of workers that is at least 100. So, for example, if n_workers = 60, then the number of chunks will be 120. This also means that if n_workers is greater than 100, then n_chunks will simply be set to n_workers, no matter how large that may be. It might make more sense to instead adjust the number of chunks downward, to the largest multiple of n_workers less than or equal to 100. But in that case you need a special case for n_workers > 100, where you just set n_chunks to 100, since otherwise the largest such multiple would be 0.
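A minimal sketch of that rounding rule, assuming the existing baseline of 100 chunks; the helper name here is hypothetical, not part of the actual code:

```r
## Hypothetical helper: smallest multiple of n_workers that is >= the
## 100-chunk baseline.
round_up_to_multiple <- function(target, n_workers) {
    n_workers * ceiling(target / n_workers)
}

round_up_to_multiple(100, 60)   # 120 chunks for 60 workers
round_up_to_multiple(100, 150)  # 150: n_chunks == n_workers once n_workers > 100
```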

Another note: I'm not sure if you chose 100 chunks for reasons relating to memory usage or other concerns, which is why I implemented things as described above. However, it might make sense to simply always set n_chunks = n_workers whenever n_workers is available. Obviously, the code for that would be much simpler, and I'd be happy to implement it if you think that would be better.

Lastly, I've also adjusted the logic that avoids excessive splitting of small objects: instead of just setting n_chunks to 1 if the object is small enough, the code now enforces a minimum chunk size and adjusts n_chunks accordingly. For example, with min_chunk_size = 20 and nrow = 100, n_chunks will be capped at 5, regardless of all other options. I've set min_chunk_size to 20, but that's an arbitrary and untested choice, so if you have a better intuition for this, feel free to change it.
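A sketch of that minimum-chunk-size cap, again with hypothetical names:

```r
## Hypothetical helper: cap n_chunks so that each chunk gets at least
## min_chunk_size rows, instead of collapsing small objects to a single chunk.
cap_chunks <- function(n_chunks, n_rows, min_chunk_size = 20) {
    max_chunks <- max(1L, floor(n_rows / min_chunk_size))
    min(n_chunks, max_chunks)
}

cap_chunks(n_chunks = 120, n_rows = 100)  # 5: at most 100 / 20 chunks
```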

As for the other commits:

Ryan C. Thompson added 3 commits November 23, 2020 11:35
Instead of using REDUCE=c in bpiterate, use do.call(c, ...) on the result, to perform all the reductions at once. This saves a good bit of time in the aggregation step.
Now the number of chunks should always be an exact multiple of the number of workers, if that number is available.
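For reference, a toy sketch of the aggregation change from the first commit above, using BiocParallel::bpiterate(); the iterator and worker function are stand-ins, not the actual variancePartition code:

```r
library(BiocParallel)

## Toy iterator: returns successive chunks of a vector, then NULL when done
make_iter <- function(x, chunk_size = 3L) {
    i <- 0L
    function() {
        start <- i * chunk_size + 1L
        if (start > length(x)) return(NULL)
        i <<- i + 1L
        x[start:min(start + chunk_size - 1L, length(x))]
    }
}

## Before: reduce incrementally as each chunk finishes
## res <- bpiterate(make_iter(1:10), function(chunk) chunk^2, REDUCE = c)

## After: collect per-chunk results in a list, then concatenate once at the end
res_list <- bpiterate(make_iter(1:10), function(chunk) chunk^2,
                      BPPARAM = SnowParam(workers = 2))
res <- do.call(c, res_list)
```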
@DarwinAwardWinner
Contributor Author

And for what it's worth, dream with 200 workers is actually somewhat faster on the cluster than 100 workers for our dataset (although nowhere near 2x faster thanks to the overhead of job submission).

Previously, if a model fit took 10000 seconds or longer, the runtime message would print the elapsed time in scientific notation (e.g. "Total: 1e+04 s"), a glitch caused by trying to print scientific notation with digits=0. This change sets scientific=FALSE in each call to format, so it always prints an integer number of seconds with no scientific notation.
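The underlying base R behaviour, sketched here in isolation; the exact call inside dream's progress message differs:

```r
secs <- 100000  # a long elapsed time, in seconds

format(secs)                      # "1e+05": format() can fall back to scientific notation
format(secs, scientific = FALSE)  # "100000": always a plain integer number of seconds
```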
Successfully merging this pull request may close these issues.

Broken error check in .fitVarPartModel produces uninterpretable error message