shell: add hwloc.restrict option to restrict HWLOC_XMLFILE to assigned resources #5944

Merged: 3 commits, May 7, 2024

grondo (Contributor) commented May 6, 2024

This PR adds the hwloc.restrict job shell option as described in #5934, along with a trivial test and docs.
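
As a rough illustration of the option described above, the sketch below submits a single task with both `hwloc.xmlfile` and `hwloc.restrict` set and prints the `HWLOC_XMLFILE` path seen by the task. The helper names `hwloc_restrict_opts` and `show_restricted_xmlfile` are hypothetical and not part of this PR. The sketch assumes it runs inside a Flux instance and skips the submission otherwise.

```shell
#!/bin/sh
# Sketch only: the helper names below are illustrative, not part of flux-core.

# Print the two job shell options discussed in this PR, one per line.
# hwloc.restrict is only meaningful together with hwloc.xmlfile.
hwloc_restrict_opts() {
    printf '%s\n' "-ohwloc.xmlfile" "-ohwloc.restrict"
}

# Hypothetical helper: run one task that prints the path of the
# topology XML file exported by the job shell.
show_restricted_xmlfile() {
    # Word splitting of the option list is intended here.
    flux run -n1 $(hwloc_restrict_opts) sh -c 'echo "$HWLOC_XMLFILE"'
}

if command -v flux >/dev/null 2>&1; then
    # Assumption: a Flux instance is reachable from this shell.
    show_restricted_xmlfile || echo "flux run failed (no Flux instance?)" >&2
else
    echo "flux not found; skipping submission" >&2
fi
```

With the option enabled, a tool that reads the exported XML should see only the resources in the task's binding rather than the whole node.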

@@ -457,6 +457,12 @@ plugins include:
Note that this option will also unset ``HWLOC_COMPONENTS`` since presence
of this environment variable may cause hwloc to ignore ``HWLOC_XMLFILE``.

.. option:: hwloc.xmlfile

Review comment (Member):

option:: hwloc.restrict (?)

Reply (Contributor, Author):

Good catch, thanks!

wihobbs (Member) commented May 6, 2024

(s=2,d=2)  corona171 ~ $ seq 1 8 | flux bulksubmit --watch -n12 -opmi=pmix -ohwloc.xmlfile -ohwloc.restrict ./flux-test-collective/mpi/hello 1> /dev/null
[corona173:1567010] PMIX ERROR: PMIX_ERROR in file client/pmix_client_topology.c at line 352
[corona173:1567004] PMIX ERROR: PMIX_ERROR in file client/pmix_client_topology.c at line 352
[corona173:1567006] PMIX ERROR: PMIX_ERROR in file client/pmix_client_topology.c at line 352
[corona173:1567005] PMIX ERROR: PMIX_ERROR in file client/pmix_client_topology.c at line 352

I get a bunch of warnings like this, but no impact on functionality as far as I can tell. They show up with /usr/bin/flux too, so it may be a PMI or OpenMPI5 issue. Either way, my testing works with this change, so I'm good to approve. Thanks @grondo!

grondo (Contributor, Author) commented May 7, 2024

I'll go ahead and add MWP (merge-when-passing) here as well. Thanks @wihobbs!

codecov bot commented May 7, 2024

Codecov Report

Attention: Patch coverage is 70.00%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 83.31%. Comparing base (cc277a2) to head (753a7f1).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5944      +/-   ##
==========================================
- Coverage   83.31%   83.31%   -0.01%     
==========================================
  Files         515      515              
  Lines       83344    83352       +8     
==========================================
+ Hits        69441    69442       +1     
- Misses      13903    13910       +7     

Files               Coverage Δ
src/shell/hwloc.c   53.19% <70.00%> (+1.90%) ⬆️

... and 10 files with indirect coverage changes

grondo added 3 commits May 7, 2024 02:02
Problem: The HWLOC_XMLFILE provided by the hwloc.xmlfile job shell
option is not restricted to the resources available to the job,
but in some circumstances this would be useful.

Add a hwloc.restrict option that, when used with hwloc.xmlfile,
restricts the HWLOC_XMLFILE to the current process binding.

Problem: No test in the testsuite checks that the -o hwloc.restrict
job shell option works.

Add a test to t2619-job-shell-hwloc.t.

Problem: The `hwloc.restrict` job shell option is not documented.

Document it in flux-shell(1) and the submission cli manpages.

mergify bot merged commit 6733a10 into flux-framework:master on May 7, 2024 (33 checks passed)
grondo deleted the hwloc.xmfile-restrict branch on May 7, 2024 02:35