Add cpuset size information to the FJR. #15760

Merged
merged 1 commit into from Sep 8, 2016


bbockelm
Contributor

@bbockelm bbockelm commented Sep 7, 2016

We have found that it's difficult to detect when the cpuset (set by
some batch systems) is limiting the number of cores we can run on.

In one extreme case, the batch system arguments were set incorrectly,
and 32 cores' worth of CMS jobs were forced onto a single core.

This PR records the current cpuset size in the FJR when CPU information
reporting is enabled. The resulting line looks like:

    <Metric Name="cpusetCount" Value="1"/>

With this, we hope that experts are more likely to notice when the
cpuset differs from expectations.
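
For readers unfamiliar with how a cpuset size can be queried, here is a minimal, Linux-only sketch. It assumes the size is taken from the process affinity mask via `sched_getaffinity`; the PR text does not say which call the change actually uses, and the helper names `cpusetCount` and `cpusetMetric` are hypothetical, chosen only to mirror the metric line above.

```cpp
// Linux-only sketch: count the CPUs this process is allowed to run on.
// Assumption: the affinity mask reflects the cpuset imposed by the batch system.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for cpu_set_t helpers such as CPU_COUNT
#endif
#include <sched.h>

#include <string>

// Hypothetical helper (not taken from the PR): returns the number of CPUs in
// the calling process's affinity mask, or -1 if the query fails.
// A fixed-size cpu_set_t covers up to CPU_SETSIZE (typically 1024) CPUs;
// on larger machines the call can fail and this sketch then returns -1.
int cpusetCount() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
    return -1;
  }
  return CPU_COUNT(&mask);
}

// Hypothetical formatter that reproduces the shape of the FJR metric line.
std::string cpusetMetric(int count) {
  return "<Metric Name=\"cpusetCount\" Value=\"" + std::to_string(count) + "\"/>";
}
```

Comparing the returned count with the number of cores the job requested would flag the case described above, where a multi-core job ends up confined to a single core.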

@Dr15Jones
Contributor

@smuzaffar Why didn't the bot see this pull request?

@cmsbuild
Contributor

cmsbuild commented Sep 7, 2016

A new Pull Request was created by @bbockelm (Brian Bockelman) for CMSSW_8_1_X.

It involves the following packages:

FWCore/Services

@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @wddgit, @wmtan this is something you requested to watch as well.
@slava77, @smuzaffar you are the release manager for this.

cms-bot commands are listed here: #13028

@Dr15Jones
Contributor

please test

@Dr15Jones
Contributor

+1

@cmsbuild
Contributor

cmsbuild commented Sep 7, 2016

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar

@davidlange6
Contributor

+1

@cmsbuild cmsbuild merged commit e611e01 into cms-sw:CMSSW_8_1_X Sep 8, 2016