Add cpuset size information to the FJR.
We have found that it's difficult to detect when the cpuset (set by
some batch systems) is limiting the number of cores we can run on.

In one extreme case, the batch system arguments were set
incorrectly and 32 cores' worth of CMS jobs were forced onto a
single core.

This records the current cpuset size to the FJR when CPU information
reporting is enabled.  The resulting line looks like:

    <Metric Name="cpusetCount" Value="1"/>

With this, we hope that experts are more likely to notice when the
cpuset differs from expectations.
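
As a rough illustration of how the recorded value might be used, the sketch below compares a `cpusetCount` value against the number of cores a job asked for. The helper names `cpusetLimitsJob` and `threadsPerAllowedCpu`, and the notion of a "requested cores" figure, are assumptions made for this example; they are not part of the commit or of the FJR tooling.

```cpp
#include <cassert>

// Hypothetical helper (not part of the commit): true when the cpuset
// grants fewer CPUs than the job requested.
bool cpusetLimitsJob(unsigned requested_cores, unsigned cpuset_count) {
  return cpuset_count < requested_cores;
}

// Hypothetical helper: how many requested cores share each allowed CPU.
// A value well above 1 suggests the job is being squeezed by its cpuset.
double threadsPerAllowedCpu(unsigned requested_cores, unsigned cpuset_count) {
  assert(cpuset_count > 0);  // a process always has at least one allowed CPU
  return static_cast<double>(requested_cores) / cpuset_count;
}
```

For the extreme case described above, `cpusetLimitsJob(32, 1)` is true and `threadsPerAllowedCpu(32, 1)` is 32.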
bbockelm committed Sep 7, 2016
1 parent 8ad6418 commit 840918a
Showing 1 changed file with 39 additions and 0 deletions.
39 changes: 39 additions & 0 deletions FWCore/Services/plugins/CPU.cc
@@ -27,6 +27,11 @@
#include <sstream>
#include <set>

#ifdef __linux__
#include <sched.h>
#include <errno.h>
#endif

namespace edm {

namespace service {
@@ -102,6 +107,36 @@ namespace edm {
}
return aux;
}

// Determine the CPU set size; if this can be successfully determined, then this
// returns true.
bool getCpuSetSize(unsigned &set_size) {
#ifdef __linux__
  cpu_set_t *cpusetp;
  unsigned current_size = 128;
  unsigned cpu_count = 0;
  while (current_size*2 > current_size) {
    cpusetp = CPU_ALLOC(current_size);
    CPU_ZERO_S(CPU_ALLOC_SIZE(current_size), cpusetp);

    if (sched_getaffinity(0, CPU_ALLOC_SIZE(current_size), cpusetp)) {
      CPU_FREE(cpusetp);
      if (errno == EINVAL) {
        current_size *= 2;
        continue;
      }
      return false;
    }
    cpu_count = CPU_COUNT_S(CPU_ALLOC_SIZE(current_size), cpusetp);
    CPU_FREE(cpusetp);
    break;
  }
  set_size = cpu_count;
  return true;
#else
  return false;
#endif
}
} // namespace {}


@@ -225,6 +260,10 @@ namespace edm {
}
reportCPUProperties.insert(std::make_pair("CPUModels", CPUModels));

unsigned set_size = -1;
if (getCpuSetSize(set_size)) {
reportCPUProperties.insert(std::make_pair("cpusetCount", i2str(set_size)));
}

reportSvc->reportPerformanceSummary("SystemCPU", reportCPUProperties);
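
For readers who want to try the affinity query outside CMSSW, here is a minimal standalone sketch of the same grow-and-retry technique used by `getCpuSetSize` in this diff. It is not the committed code: the function name `allowedCpuCount`, the `max_cpus` cap, and the `std::optional` return type (C++17) are assumptions for illustration. The sketch also checks the `CPU_ALLOC` result, saves `errno` before freeing the mask, and reports failure rather than a count of zero if the mask never becomes large enough.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

#ifdef __linux__
#include <sched.h>
#include <cerrno>
#endif

// Hypothetical standalone variant; the name and max_cpus are assumptions.
// Returns the number of CPUs the calling process may run on, or
// std::nullopt if that cannot be determined (including on non-Linux).
std::optional<unsigned> allowedCpuCount(unsigned max_cpus = 1u << 20) {
  // Keep the doubling below from overflowing an unsigned.
  assert(max_cpus >= 128 && max_cpus <= (1u << 30));
#ifdef __linux__
  for (unsigned n = 128; n <= max_cpus; n *= 2) {
    cpu_set_t* set = CPU_ALLOC(n);
    if (set == nullptr) return std::nullopt;  // allocation failed
    const std::size_t bytes = CPU_ALLOC_SIZE(n);
    CPU_ZERO_S(bytes, set);
    if (sched_getaffinity(0, bytes, set) == 0) {
      const unsigned count = static_cast<unsigned>(CPU_COUNT_S(bytes, set));
      CPU_FREE(set);
      return count;
    }
    const int err = errno;  // save before CPU_FREE can touch errno
    CPU_FREE(set);
    if (err != EINVAL) return std::nullopt;
    // EINVAL: the mask is smaller than the kernel's; retry with a larger one.
  }
#endif
  return std::nullopt;
}
```

Launching a small test program under `taskset -c 0` should make this return 1, which mirrors the `cpusetCount` value of 1 shown in the commit message.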
