Errors related to multi-core systems (e.g. > 64 cores on Threadripper) #264

straussmaximilian · 2021-07-08T12:17:25Z

The Python multiprocessing module has a core limit, which leads to problems when a large number of cores is set in Python.

Currently, AlphaPool has a process limit that caps at 50 processes. However, this limit is not applied everywhere (e.g. n_jobs in ML), which leads to errors when running on systems with a large number of CPUs.
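To illustrate the kind of cap described above, here is a minimal sketch. It is not the actual AlphaPool code: `PROCESS_LIMIT`, `capped_process_count`, and `make_capped_pool` are hypothetical names, and only the value 50 is taken from this issue.

```python
import multiprocessing

# Cap taken from the description above; the constant name is hypothetical.
PROCESS_LIMIT = 50


def capped_process_count(requested=None, limit=PROCESS_LIMIT):
    """Return a process count bounded by the available CPUs and by `limit`."""
    available = multiprocessing.cpu_count()
    # Treat a missing or non-positive request as "use everything available".
    if requested is None or requested <= 0:
        requested = available
    return max(1, min(requested, available, limit))


def make_capped_pool(requested=None):
    """Hypothetical helper: create a multiprocessing.Pool that respects the cap."""
    return multiprocessing.Pool(processes=capped_process_count(requested))
```

Routing every pool creation through one helper like this would keep the limit in a single place, which is the part that is currently missing for the ML code paths.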

swillems · 2021-07-08T13:16:40Z

Should be straightforward to set in the "set_worker_count" of the performance notebook, or are there some ML tasks where you really cannot update it?

straussmaximilian · 2021-07-08T13:56:32Z

I guess it boils down to three things here:

(1) Use the MAX_WORKER_COUNT from the performance notebook everywhere (AlphaPool) and set it when starting a workflow, according to the parameter in settings.
(2) To parallelize the ML tasks, we need to pass MAX_WORKER_COUNT to n_jobs (used in score and in alignment).
(3) There are some edge cases (e.g., a bug where pyinstaller only allows n_jobs = 1).
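A rough sketch of how these three points could fit together is below. It rests on assumptions: `set_worker_count` here is a simplified stand-in with the same name, not the actual function from the performance notebook; `resolve_n_jobs` and the `limit` parameter are hypothetical; and the pyinstaller case is detected through `sys.frozen`, which PyInstaller sets in bundled applications.

```python
import multiprocessing
import sys

# Module-level setting shared by all parallel code paths (stand-in for point 1).
MAX_WORKER_COUNT = 1


def set_worker_count(worker_count=1, limit=50):
    """Simplified stand-in: store a capped worker count in MAX_WORKER_COUNT."""
    global MAX_WORKER_COUNT
    available = multiprocessing.cpu_count()
    # Assumption: a non-positive value means "use all available cores".
    if worker_count <= 0:
        worker_count = available
    MAX_WORKER_COUNT = max(1, min(worker_count, available, limit))
    return MAX_WORKER_COUNT


def resolve_n_jobs(frozen=None):
    """Return the value to pass as n_jobs (points 2 and 3)."""
    if frozen is None:
        # PyInstaller bundles set sys.frozen; fall back to False otherwise.
        frozen = getattr(sys, "frozen", False)
    # Edge case from point 3: only a single job inside a pyinstaller bundle.
    return 1 if frozen else MAX_WORKER_COUNT
```

A workflow would call `set_worker_count(...)` once at startup with the value from settings, and the score and alignment steps would then pass `n_jobs=resolve_n_jobs()` to whichever estimator they construct.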

straussmaximilian mentioned this issue May 10, 2022

v0.4.6 #447

Merged

straussmaximilian closed this as completed in 7e989bc May 10, 2022
