Errors related to multi-core systems (e.g. > 64 cores on Threadripper) #264

Closed
straussmaximilian opened this issue Jul 8, 2021 · 2 comments

Comments

@straussmaximilian
Member

The Python multiprocessing module has a core limit, which leads to problems when a large number of cores is set in Python.

Currently, AlphaPool has a process limit that caps the pool at 50 processes. However, this limit is not applied everywhere (e.g., n_jobs in ML), which leads to errors on systems with a large number of CPUs.
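
A minimal sketch of such a cap, for illustration only. The helper name `capped_worker_count` and the constant `PROCESS_LIMIT` are hypothetical and not taken from the codebase; only the value 50 comes from the description above.

```python
import os

# Assumption: 50 mirrors the AlphaPool cap mentioned in this issue.
PROCESS_LIMIT = 50


def capped_worker_count(requested=None, limit=PROCESS_LIMIT):
    """Return a worker count between 1 and `limit` (hypothetical helper)."""
    # os.cpu_count() can return None, so fall back to a single worker.
    available = os.cpu_count() or 1
    # Treat a missing or non-positive request as "use everything available".
    if requested is None or requested <= 0:
        requested = available
    # Never exceed the request, the machine, or the hard cap.
    return max(1, min(requested, available, limit))
```

Any code path that starts processes could call a helper like this instead of reading the CPU count directly, so a 128-core machine would never ask for more than 50 workers.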

@swillems
Collaborator

swillems commented Jul 8, 2021

This should be straightforward to set in the "set_worker_count" of the performance notebook, or are there some ML tasks where you really cannot update it?

@straussmaximilian
Member Author

I guess it boils down to three things here:

(1) Use MAX_WORKER_COUNT from the performance notebook everywhere (including AlphaPool), and set it when starting a workflow according to the parameter in settings.
(2) To parallelize the ML tasks, pass MAX_WORKER_COUNT to n_jobs (used in score and in alignment) so that the same limit applies there.
(3) There are some edge cases (e.g., a bug where pyinstaller only allows n_jobs = 1).
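
The three points could be sketched roughly as follows. This is an assumption-laden illustration, not the actual notebook code: the signature of `set_worker_count`, the helper `ml_n_jobs`, and the constant `PROCESS_LIMIT` are hypothetical, and detecting a pyinstaller build via `sys.frozen` is one possible way to handle the edge case in (3).

```python
import os
import sys

PROCESS_LIMIT = 50    # assumption: the AlphaPool cap mentioned in this issue
MAX_WORKER_COUNT = 1  # shared setting, updated once when a workflow starts


def set_worker_count(worker_count=1, limit=PROCESS_LIMIT):
    """Set and return the shared worker count (hypothetical signature)."""
    global MAX_WORKER_COUNT
    available = os.cpu_count() or 1
    # Assumption: a non-positive value means "use all available CPUs".
    if worker_count <= 0:
        worker_count = available
    MAX_WORKER_COUNT = max(1, min(worker_count, available, limit))
    return MAX_WORKER_COUNT


def ml_n_jobs():
    """Value to pass as n_jobs in the ML steps (hypothetical helper)."""
    # PyInstaller sets sys.frozen on bundled executables; fall back to a
    # single job there because of the n_jobs edge case in point (3).
    if getattr(sys, "frozen", False):
        return 1
    return MAX_WORKER_COUNT
```

With something like this in place, a workflow would call `set_worker_count(...)` once with the value from settings, and both the pool and the ML code (score, alignment) would read the same number, e.g. `n_jobs=ml_n_jobs()`.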
