equilibrium: dask scheduler default incorrectly set to synchronous #101
The default setting of the
It looks like that change slipped in about 9 months ago in an unrelated commit, which is one of many reasons why we now do code review.
Here are some examples using NIMS databases. I assumed that parallelization in dask is independent of the shape of the equilibrium calculation. That is, for a calculation of shape (P, T, X), I assumed that a (1, 1, N) calculation scales similarly to a (1, sqrt(N), sqrt(N)) calculation. Each point was sampled twice and the two timings were averaged.
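For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the setup described above: two (P, T, X) shapes with the same total number of points, and a timing helper that runs each case twice and averages. The names `equivalent_shapes` and `mean_runtime` are hypothetical and not part of pycalphad or dask; the function being timed is left to the caller.

```python
import math
import time


def equivalent_shapes(n):
    """Return the (1, 1, N) and (1, sqrt(N), sqrt(N)) shapes for N points.

    Hypothetical helper: N must be a perfect square so both shapes
    contain exactly the same number of points.
    """
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("n must be a perfect square")
    return (1, 1, n), (1, root, root)


def mean_runtime(func, repeats=2):
    """Average wall-clock time of func() over `repeats` runs.

    repeats=2 mirrors the "sampled twice and averaged" procedure.
    """
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / repeats


flat_shape, square_shape = equivalent_shapes(10_000)
```

In practice `func` would wrap one equilibrium calculation per shape and per scheduler, so the averaged timings can be compared directly.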
It appears that multiprocessing would be a poor choice for default.
What are your thoughts, then, on client vs. sync, given that with client we would have to implicitly start a distributed client? I think sync may be okay as the default, given that the biggest speedup was less than 2x even for 10k equilibrium calculations on 8 cores.
@bocklund Thanks for running this test. It appears that "sync" would be a sensible default up to, perhaps, 100 calculations, at which point we probably should switch to the distributed scheduler.
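To make the idea concrete, a size-based default could look roughly like the sketch below. This is only an illustration: `choose_scheduler` and `SYNC_THRESHOLD` are hypothetical names, the cutoff of 100 is just the rough estimate mentioned above, and the returned strings are labels for the two options discussed here rather than a confirmed API.

```python
import math

# Assumed cutoff, taken from the "perhaps, 100 calculations" estimate above.
SYNC_THRESHOLD = 100


def choose_scheduler(shape, threshold=SYNC_THRESHOLD):
    """Pick a scheduler label from the (P, T, X) shape of a calculation.

    Hypothetical helper: returns "sync" for small jobs and
    "distributed" once the number of points exceeds the threshold.
    """
    n_points = math.prod(shape)
    return "sync" if n_points <= threshold else "distributed"


small_job = choose_scheduler((1, 10, 10))     # 100 points
large_job = choose_scheduler((1, 100, 100))   # 10,000 points
```

The threshold is a keyword argument so it can be tuned once more benchmarks are available.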
What do you think about leaving our current default scheduler (distributed), and just increasing the threshold