Fix SLURM RAM monitoring #473

Merged on Nov 28, 2023 (11 commits)

Conversation

benoit-cty
Contributor

Fix SLURM RAM monitoring:

  • Fix the error raised when calling scontrol
  • Call scontrol only once instead of at every measurement
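A minimal sketch of the "call once, then reuse" idea from the list above. It is not the code merged in this PR: `SlurmMemory`, `parse_memory_gb`, and `run_scontrol` are hypothetical names, and the `mem=16G` field it parses is an assumed example of `scontrol show job` output, which can differ between Slurm versions.

```python
import re
import subprocess
from typing import Callable, Optional

# Binary unit multipliers, expressed in GB (assumption: Slurm suffixes K/M/G/T).
_UNIT_TO_GB = {"K": 1 / 1024**2, "M": 1 / 1024, "G": 1.0, "T": 1024.0}


def parse_memory_gb(scontrol_output: str) -> float:
    """Extract the first `mem=<number><unit>` field and convert it to GB."""
    match = re.search(r"\bmem=(\d+(?:\.\d+)?)([KMGT])", scontrol_output)
    if match is None:
        raise ValueError("no 'mem=' field found in scontrol output")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_GB[unit]


def run_scontrol(job_id: str) -> str:
    """Run `scontrol show job <job_id>` and return its text output."""
    return subprocess.check_output(["scontrol", "show", "job", job_id], text=True)


class SlurmMemory:
    """Hypothetical helper that queries the Slurm controller at most once."""

    def __init__(self, job_id: str, runner: Callable[[str], str] = run_scontrol):
        self._job_id = job_id
        self._runner = runner  # injectable so the class can be tested without Slurm
        self._memory_gb: Optional[float] = None

    @property
    def memory_gb(self) -> float:
        # Only the first access triggers the scontrol call; later reads hit the cache.
        if self._memory_gb is None:
            self._memory_gb = parse_memory_gb(self._runner(self._job_id))
        return self._memory_gb
```

Because the allocated memory of a job does not change while it runs, caching the value on first access is enough to avoid querying the controller at every measurement.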

Will close #447

@benoit-cty
Contributor Author

Tested in a SLURM environment: there are now only two calls per session and no more error messages.

Collaborator

@SaboniAmine left a comment

Thanks Benoît! Small comment about the notebook: I'm not sure we want to merge it into master.

codecarbon/external/test.ipynb (outdated, resolved)
@vict0rsch
Contributor

I'll have a look next week 😊

@vict0rsch
Contributor

There was an issue with $SLURM_JOBID, which was renamed to $SLURM_JOB_ID, making some queries such as "scontrol show job $SLURM_JOBID" fail.
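A small sketch of reading the job ID from the current variable name only, as discussed in this comment. `get_slurm_job_id` is a hypothetical helper, not a function from the codecarbon code base; the environment is passed in as a mapping so the behaviour can be checked outside a Slurm job.

```python
import os
from typing import Mapping, Optional


def get_slurm_job_id(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Slurm job ID from SLURM_JOB_ID, or None when it is not set.

    The legacy SLURM_JOBID name is deliberately ignored in this sketch.
    """
    job_id = env.get("SLURM_JOB_ID", "").strip()
    return job_id or None
```

Returning `None` rather than an empty string lets the caller skip the scontrol query entirely when the process is not running inside a Slurm job.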

@vict0rsch
Contributor

vict0rsch commented Nov 21, 2023

This works as expected:

from codecarbon.external.hardware import RAM

ram = RAM(tracking_mode="process")
ram.slurm_memory_GB

@benoit-cty
Contributor Author

> There was an issue with $SLURM_JOBID, which was renamed to $SLURM_JOB_ID, making some queries such as "scontrol show job $SLURM_JOBID" fail.

Thanks for your review. The old name seems to have been deprecated since 2017, so it is safe to drop support for it!

Development

Successfully merging this pull request may close these issues.

CodeCarbon spams Slurm controler with scontrol
3 participants