Setup of DIRACOS does not restrict cpu cores #189

Closed
maxfischer2781 opened this issue May 31, 2023 · 0 comments · Fixed by #190

Comments

@maxfischer2781
Contributor

maxfischer2781 commented May 31, 2023

When the Pilot creates its DIRACOS environment, it directly calls DIRACOS-Linux-machine.sh, which eventually invokes mamba. Mamba assumes that it can use all cores of the machine (see mamba-org/mamba#2463), which isn't realistic for Pilot environments. This leads to excessive process creation, which can negatively affect the pilot, the user, or even the entire compute resource.

As far as I can tell, the templates from which DIRACOS is generated do not provide a feasible way to limit this internally. The Pilot thus seems like the best place, seeing how it is aware of resource restrictions.
A solution would be to set MAMBA_EXTRACT_THREADS when installing DIRACOS, either to a conservative 1 or pp.maxNumberOfProcessors.
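
A minimal sketch of what that could look like on the Pilot side, assuming the installer is launched through a subprocess. The helper names `build_install_env` and `run_capped` are hypothetical and not part of the Pilot code; only the MAMBA_EXTRACT_THREADS variable comes from the proposal above.

```python
import os
import subprocess


def build_install_env(max_processors=1, base_env=None):
    """Return a copy of the environment with MAMBA_EXTRACT_THREADS capped.

    Hypothetical helper: max_processors stands in for a value such as
    pp.maxNumberOfProcessors; anything missing or invalid falls back to 1.
    """
    env = dict(os.environ if base_env is None else base_env)
    try:
        threads = max(1, int(max_processors))
    except (TypeError, ValueError):
        threads = 1  # conservative default
    env["MAMBA_EXTRACT_THREADS"] = str(threads)
    return env


def run_capped(cmd, max_processors=1):
    """Run an installer command (given as a list) with the capped environment."""
    return subprocess.run(cmd, env=build_install_env(max_processors), check=True)
```

The Pilot would then pass its usual installer command line to `run_capped`, so the limit applies only to the installation step and leaves the rest of the pilot's environment untouched.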


For a sense of scale, we caught this on a WLCG Tier 1 WN with 256 cores that got allocated mostly to one VO. Each of the single-core pilots tried to use 256 child processes; each pilot quickly ground to a halt due to resource and fork bomb protection, which caused each new pilot to also immediately get stuck on nproc limits and similar safeguards.
