Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add env variable to limit number of threads used by ROOT #9805

Closed
EdwardMoyse opened this issue Feb 3, 2022 · 12 comments · Fixed by #13057
Closed

Add env variable to limit number of threads used by ROOT #9805

EdwardMoyse opened this issue Feb 3, 2022 · 12 comments · Fixed by #13057
Assignees
Labels
experiment Affects an experiment / reported by its software & computimng experts

Comments

@EdwardMoyse
Copy link
Collaborator

Hi,

ATLAS would like to request the addition of a mechanism to control the number of threads used by ROOT with implicit multi-threading enabled, ideally via an environment variable (e.g. ROOT_MAX_THREADS). The primary motivation for this is to give GRID sites an easier way to limit the number of threads to match that of a queue.

Thanks!

Ed

@eguiraud eguiraud added the experiment Affects an experiment / reported by its software & computimng experts label Feb 4, 2022
@Axel-Naumann Axel-Naumann removed the experiment Affects an experiment / reported by its software & computimng experts label Feb 4, 2022
@Axel-Naumann
Copy link
Member

Removed "experiment" label to cross-check that Ed can add it himself :-)

@Axel-Naumann Axel-Naumann self-assigned this Feb 4, 2022
@Axel-Naumann
Copy link
Member

Should this not be some generic grid variable that we honor? Does that exist, or if not, should it be defined? What does the GDB say - could you talk to them?

@EdwardMoyse EdwardMoyse added the experiment Affects an experiment / reported by its software & computimng experts label Feb 4, 2022
@elmsheus
Copy link
Collaborator

elmsheus commented Feb 4, 2022

This idea is to have an optional ROOT variable we can set in the PanDA pilot or the PanDA worker node wrapper environment that allows to limit the number of threads spawned on the worker node by ROOT. We did this in the past e.g. with ProofLite.Workers to prevent prooflite processes to grab all available CPU cores on a worker node instead of the actual usual 8 cores on a WLCG multi-core slot.

We steer AthenaMP and MT jobs via the environment variables ATHENA_PROC_NUMBER and ATHENA_CORE_NUMBER and set them to the PanDA queue core count.

Asking the GDB might be a bit too high level committee, but if needed this could be brought up in some WLCG operations meetings if others would be interested to steer ROOT in a similar way.

@lukasheinrich
Copy link

Hi, hard coding anything related to Athena into ROOT seems weird to me. Maybe the wrapper should take some generic value and then set application specific variables like ROOT_MAX_THREADS?

@EdwardMoyse
Copy link
Collaborator Author

Yes, that’s what we’re proposing.

@lukasheinrich
Copy link

Ok - maybe I misunderstood @Axel-Naumann's "generic grid variable that we honor" .. sorry!

@Axel-Naumann
Copy link
Member

Asking the GDB might be a bit too high level committee, but if needed this could be brought up in some WLCG operations meetings if others would be interested to steer ROOT in a similar way.

What's wrong with grid sites agreeing on an env var (whatever the mechanism) that sets the max threads to be started by any process for any grid job? Why would we want to only steer ROOT, only through PanDA - what happens if there's an analysis job that an ATLAS physicist starts outside ATHENA, outside PanDA? What if it's a CMS physicist? How isn't this a generic issue to be addressed, outside ATLAS or ROOT?

Maybe such an env var already exists - I sort of remember that the grid sites agreed to define one, actually... Do you want to drive this (my preference!) or do you prefer that I talk to WLCG folks?

@EdwardMoyse
Copy link
Collaborator Author

Hi @Axel-Naumann - we discussed this again internally.

From our side, we're happy with whatever suits you best (it's not really our place to decide what the variable is called or how this is implemented).

However it is at least conceivable that we might want to limit ROOT to less threads than provided by the queue, in which scenario we would need ROOT_MAX_THREADS (or whatever) anyway.

@zlmarshall
Copy link

Just a reminder of this ticket. It'd be nice to give users a way to handle this with an environment variable. We can (and often do) set this for virtually all other programs we run on the grid individually.

@jblomer
Copy link
Contributor

jblomer commented Jun 20, 2023

Just for the record: ROOT currently already respects the cgroups #CPU limits (see ROOT::Internal::LogicalCPUBandwithControl()).

Axel-Naumann added a commit to Axel-Naumann/root that referenced this issue Jun 21, 2023
This overrides the number of threads that ROOT will be using in the thread pool.
This can be set to be higher or lower than the number of physical cores; it can
be specified as decimal, octal, or hex.

Fixes issue root-project#9805.
@Axel-Naumann
Copy link
Member

We completely agree and the above PR implements what you propose. Apologies for the long wait for such a simple thing!

Axel-Naumann added a commit to Axel-Naumann/root that referenced this issue Jun 22, 2023
This overrides the number of threads that ROOT will be using in the thread pool.
This can be set to be higher or lower than the number of physical cores; it can
be specified as decimal, octal, or hex.

Fixes issue root-project#9805.

Mention the env var in `ROOT::EnableImplicitMT()`'s doc.
Axel-Naumann added a commit that referenced this issue Jun 28, 2023
This overrides the number of threads that ROOT will be using in the thread pool.
This can be set to be higher or lower than the number of physical cores; it can
be specified as decimal, octal, or hex.

Fixes issue #9805.

Mention the env var in `ROOT::EnableImplicitMT()`'s doc.
maksgraczyk pushed a commit to maksgraczyk/root that referenced this issue Jun 28, 2023
This overrides the number of threads that ROOT will be using in the thread pool.
This can be set to be higher or lower than the number of physical cores; it can
be specified as decimal, octal, or hex.

Fixes issue root-project#9805.

Mention the env var in `ROOT::EnableImplicitMT()`'s doc.
@github-actions
Copy link

Hi @Axel-Naumann,

It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.

Sincerely,
🤖

@Axel-Naumann Axel-Naumann added this to Issues in Fixed in 6.30/00 via automation Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experiment Affects an experiment / reported by its software & computimng experts
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

7 participants