Hi again,
In the iter_parallel_chains function of beat/sampler/base.py:476-482:
if chunksize is None:
    if draws < 10:
        chunksize = int(np.ceil(float(n_chains) / n_jobs))
    elif draws > 10 and tps < 0.5:
        chunksize = int(np.ceil(float(n_chains) / n_jobs))
    else:
        chunksize = n_jobs
the tps seems to depend on the hardware (I have installed libamdm), and if we set a bigger n_jobs, the chunksize will also be bigger in the case tps > 0.5, draws > 10 and stage > 0.
Referring to https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map, a bigger chunksize leads to a smaller chunk count. When n_jobs > chunk count, a bigger n_jobs decreases the degree of parallelism, which means the calculation takes longer, as in the sketch below.
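For illustration, a minimal standalone sketch of that chunk-count arithmetic (the numbers n_chains = 100 and n_jobs = 20 are hypothetical, not taken from beat):

import math

# Hypothetical numbers chosen only to illustrate the arithmetic.
n_chains = 100
n_jobs = 20

chunksize = n_jobs                          # the draws > 10, tps > 0.5 branch
n_chunks = math.ceil(n_chains / chunksize)  # ceil(100 / 20) = 5
print(n_chunks)  # 5 chunks -> only 5 of the 20 workers receive any work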
Is that correct? And can I set an arbitrary chunksize manually in a script?
Thank you!
Cool that you are still around ;).
You are right. The intention behind that is: if your forward model takes a long time, you rather want a small chunksize, i.e. have the work distributed in smaller chunks to more workers. Otherwise it often happens that a single worker is left with a big chunk of work that all the other workers have to wait on before entering the next stage.
Vice versa, if you have a fast forward model you want a big chunksize, because initialising the workers then takes longer than the sampling itself.
Is that understandable? I couldn't completely understand what your problem with that setup is. For now you cannot define chunksize in the config file, but if it would help you, we can surely add that; it is not a big deal.
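In the meantime, a minimal standalone timing sketch of that tradeoff, using plain multiprocessing rather than beat (the sleep duration and the numbers n_chains = 40, n_jobs = 8 are assumptions for demonstration):

import time
from multiprocessing import Pool


def forward_model(i):
    # Stand-in for an expensive forward calculation (hypothetical).
    time.sleep(0.5)
    return i


if __name__ == "__main__":
    n_chains, n_jobs = 40, 8
    with Pool(n_jobs) as pool:
        for chunksize in (1, n_jobs):
            t0 = time.time()
            pool.map(forward_model, range(n_chains), chunksize=chunksize)
            print(f"chunksize={chunksize}: {time.time() - t0:.1f} s")

# With chunksize=1 all 8 workers stay busy (~2.5 s for this slow model);
# with chunksize=8 there are only ceil(40 / 8) = 5 chunks, so 3 workers
# sit idle and the run takes ~4 s.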
Sorry for the late fix, but I apparently didn't get the point correctly until I tried it myself with a larger number of chains.
It is fixed in the current dev branch here: #121 and should be released to master soon.