Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shell: add builtin core affinity plugin #2397

Merged
merged 4 commits into from Sep 27, 2019

Conversation

@grondo
Copy link
Contributor

commented Sep 25, 2019

This PR adds a default affinity plugin for the shell. By default the affinity of the shell is set to the specified cores list in R. If the cpu-affinity shell option is set to "per-task", then the tasks managed by the shell are spread across the allocated cores using hwloc_distrib(3). Setting the cpu-affinity option to "off" disables the affinity plugin.

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 25, 2019

Oops, looks like hwloc-calc not installed in CentOS. Will look into this tonight

@grondo grondo force-pushed the grondo:shell-affinity branch from 20970b6 to cd20f86 Sep 26, 2019
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

There's two problems with coverage here:

  1. Travis only gives us one core -- hard to do "real" cpu affinity testing there
  2. I think coverage output between fork and exec is not captured, so task.exec code doesn't appear covered though it is.
@grondo grondo force-pushed the grondo:shell-affinity branch from cd20f86 to c4edd2e Sep 26, 2019
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

I'm attempting some tweaks to the shell-affinity test script to get a bit more coverage when only a single core is available (basically make sure the thing doesn't fall down).

Additionally, I added a call to __gcov_flush() before exec in libsubprocess/local.c. We'll see that might get us a small coverage bump. ;)

grondo added 3 commits Sep 25, 2019
Add a builtin 'affinity' plugin for the shell which sets affinity
for the job as a whole to the allocated list of cores by default,
using hwloc to find the cpuset that includes all assigned cores
and their associated PUs (if any).

The plugin looks for the shell option 'cpu-affinity' and if set to
"per-task", then hwloc_distrib(3) is used to distribute tasks
across the assigned reosurces.

If the cpu-affinity shell option is set to "off" then the plugin is
disabled.

If the cores list presented to the affinity plugin does not intersect
with the real cores available on the machine, then the affinity
plugin logs an informational message and gives up.

Fixes #2243
Add a set of tests for basic functionality of shell builtin
affinity plugin.
Add the hwloc package to CentOS 7 docker image in order to get
the hwloc-calc and hwloc-bind utilities.
@grondo grondo force-pushed the grondo:shell-affinity branch from c4edd2e to 7526881 Sep 26, 2019
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

Calling __gcov_flush() in libsubprocess did nothing. I guess each loaded shared library has its own gcov info so you have to call __gcov_flush() within the .so or executable for which you want to dump info. I moved the __gcov_flush() into the affinity plugin's task.exec handler, and it seems to have helped.

In libsubprocess, call __gcov_flush() before exec() in child process
if coverage is enabled to avoid losing coverage counts between fork
and exec.
@codecov-io

This comment has been minimized.

Copy link

commented Sep 26, 2019

Codecov Report

Merging #2397 into master will increase coverage by 0.01%.
The diff coverage is 81.11%.

@@            Coverage Diff             @@
##           master    #2397      +/-   ##
==========================================
+ Coverage   81.07%   81.09%   +0.01%     
==========================================
  Files         223      224       +1     
  Lines       35627    35770     +143     
==========================================
+ Hits        28886    29006     +120     
- Misses       6741     6764      +23
Impacted Files Coverage Δ
src/shell/builtins.c 95.23% <ø> (ø) ⬆️
src/common/libsubprocess/local.c 80.2% <100%> (+0.41%) ⬆️
src/shell/affinity.c 80.98% <80.98%> (ø)
src/modules/connector-local/local.c 73.11% <0%> (-1.17%) ⬇️
src/cmd/flux-job.c 85.03% <0%> (-0.29%) ⬇️
src/modules/job-exec/job-exec.c 75.71% <0%> (+0.21%) ⬆️
src/broker/module.c 75.17% <0%> (+0.23%) ⬆️
src/shell/shell.c 85.98% <0%> (+1.07%) ⬆️
... and 2 more
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2019

Looks like adding __gcov_flush() in both libsubprocess and the afifnity plugin was required to get coverage counted (or perhaps it succeeds intermittently 🤷‍♂)

Copy link
Member

left a comment

This looks great. I'll set merge-when-passing.

@mergify mergify bot merged commit 62535cd into flux-framework:master Sep 27, 2019
4 checks passed
4 checks passed
Summary 1 rule matches
Details
codecov/patch 81.11% of diff hit (target 81.07%)
Details
codecov/project 81.09% (+0.01%) compared to 317f80d
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2019

Thanks!

@grondo grondo deleted the grondo:shell-affinity branch Sep 27, 2019
@garlick garlick referenced this pull request Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.