flux-job: add flux job taskmap --to=hosts #5941

Merged 7 commits on May 4, 2024


grondo commented May 4, 2024

This PR adds a `--to=hosts` option to `flux job taskmap`, which prints the `hostname: tasks` mapping for a job when given a jobid. For example, for a 4-node, 16-task "cyclic" job:

$ src/cmd/flux job taskmap --to=hosts $(flux job last)
fluke87: 0,4,8,12
fluke88: 1,5,9,13
fluke89: 2,6,10,14
fluke90: 3,7,11,15
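
The output above follows from a round-robin ("cyclic") distribution of task ids over the job's hosts. A minimal Python sketch of that arithmetic, not the code in this PR; the function names `cyclic_hosts_map` and `format_hosts_map` are hypothetical and exist only for illustration:

```python
def cyclic_hosts_map(hosts, ntasks):
    """Assign task ids 0..ntasks-1 to hosts round-robin; return {host: [taskids]}."""
    mapping = {host: [] for host in hosts}
    for taskid in range(ntasks):
        # Task i lands on host i mod nhosts under a cyclic distribution.
        mapping[hosts[taskid % len(hosts)]].append(taskid)
    return mapping


def format_hosts_map(mapping):
    """Render one 'hostname: id,id,...' line per host, in host order."""
    return [f"{host}: {','.join(str(i) for i in ids)}" for host, ids in mapping.items()]


hosts = ["fluke87", "fluke88", "fluke89", "fluke90"]
lines = format_hosts_map(cyclic_hosts_map(hosts, 16))
print("\n".join(lines))
```

With the four hosts and 16 tasks from the example, this prints the same four lines as the command output above.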


garlick left a comment

LGTM!

grondo commented May 4, 2024

Thanks! I'll set MWP.

grondo added 7 commits May 4, 2024 21:44
Problem: A couple static functions in job/taskmap.c use a flux_job*
prefix, but this can be confusing because they are not part of the
libflux API.

Rename the functions to drop the flux_ prefix.

Problem: There are two functions in `flux job taskmap` that each open
a separate flux handle. This is unnecessary.

Remove the duplication by managing a global flux handle that is opened
as needed.

Problem: It would be convenient to have a function to get the job
hostlist in `flux job taskmap`, but currently that functionality is
embedded in a function that translates nodeid to hostname.

Split job_hostlist() out of job_nodeid_to_hostname().
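
The two refactors described above (one handle opened on demand, and job_hostlist() split out of job_nodeid_to_hostname()) can be sketched as follows. This is an illustrative Python model only: the real code is C in src/cmd/job/taskmap.c, and `get_handle`, the `_FAKE_HOSTLISTS` table, and jobid 1234 are stand-ins invented for the example.

```python
# Stand-in data: pretend this is what the job's R would yield for a jobid.
_FAKE_HOSTLISTS = {1234: ["fluke87", "fluke88", "fluke89", "fluke90"]}

_handle = None      # shared handle, opened at most once
open_count = 0      # counts how many times the handle was "opened"


def get_handle():
    """Open the shared handle on first use, then reuse it."""
    global _handle, open_count
    if _handle is None:
        open_count += 1
        _handle = {"hostlists": _FAKE_HOSTLISTS}  # hypothetical handle object
    return _handle


def job_hostlist(jobid):
    """Return the list of hostnames for a job (the split-out helper)."""
    try:
        return get_handle()["hostlists"][jobid]
    except KeyError:
        raise LookupError("failed to get hostlist for job") from None


def job_nodeid_to_hostname(jobid, nodeid):
    """Translate a nodeid to a hostname by indexing the job hostlist."""
    return job_hostlist(jobid)[nodeid]


print(job_nodeid_to_hostname(1234, 2))
print(job_hostlist(1234))
```

Both callers go through `get_handle()`, so the handle is opened once no matter how many lookups are made.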

Problem: It would be convenient to see the mapping of hostnames to
taskids for a job all in one go, but `flux job taskmap` doesn't
currently support that.

Add a new output format `--to=hosts` to display a list of taskids for
every host in a job.
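
A rough sketch of what a hosts-style output has to compute, assuming the raw taskmap uses the RFC 34 block form `[nodeid, nnodes, ppn, repeat]` (an assumption about the input format, not something stated in this PR). The function name `taskmap_to_hosts` is hypothetical, and this is a Python model rather than the C implementation:

```python
def taskmap_to_hosts(taskmap, hosts):
    """Expand taskmap blocks into {hostname: [taskids]}.

    Each block is assumed to be [nodeid, nnodes, ppn, repeat]: starting at
    nodeid, nnodes consecutive nodes each get ppn tasks, and the whole
    pattern is repeated `repeat` times. Task ids are assigned in order.
    """
    per_host = {host: [] for host in hosts}
    taskid = 0
    for nodeid, nnodes, ppn, repeat in taskmap:
        for _rep in range(repeat):
            for node in range(nodeid, nodeid + nnodes):
                for _slot in range(ppn):
                    per_host[hosts[node]].append(taskid)
                    taskid += 1
    return per_host


hosts = ["fluke87", "fluke88", "fluke89", "fluke90"]
cyclic = taskmap_to_hosts([[0, 4, 1, 4]], hosts)  # 1 task per node, 4 rounds
block = taskmap_to_hosts([[0, 4, 4, 1]], hosts)   # 4 tasks per node, 1 round
for host, ids in cyclic.items():
    print(f"{host}: {','.join(str(i) for i in ids)}")
```

Under that assumption, the single block `[0, 4, 1, 4]` reproduces the cyclic layout from the PR description, while `[0, 4, 4, 1]` gives each host four consecutive task ids.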

Problem: When `flux job taskmap` options are used that require
nodeid->hostname mapping, such as `--hostname` and `--to=hosts`,
the R for a random jobid (or jobid 0 if id is initialized to 0)
is fetched and a generic error is emitted:

 failed to get hostlist for job: No such file or directory

Initialize `id` to FLUX_JOBID_ANY and issue an improved error
message in these conditions, e.g.

 flux-job: taskmap: can't use --hostname without a jobid
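
The fix described above is a sentinel check: start with an "any jobid" value and refuse hostname-dependent options while the id is still the sentinel. A minimal Python sketch of that pattern; `JOBID_ANY` is a stand-in for FLUX_JOBID_ANY (its all-ones 64-bit value here is an assumption), `check_hostname_options` is a hypothetical name, and the `--to=hosts` message text is assumed by analogy with the `--hostname` message quoted in the commit:

```python
JOBID_ANY = 2**64 - 1  # stand-in for FLUX_JOBID_ANY (assumed value)


class UsageError(Exception):
    """Raised when an option needs a jobid but none was given."""


def check_hostname_options(jobid=JOBID_ANY, hostname=False, to=None):
    """Fail early if a hostname-dependent option is used without a jobid."""
    if jobid != JOBID_ANY:
        return  # a real jobid was supplied, so hostnames can be looked up
    if hostname:
        raise UsageError("flux-job: taskmap: can't use --hostname without a jobid")
    if to == "hosts":
        # Message wording assumed; the commit only quotes the --hostname case.
        raise UsageError("flux-job: taskmap: can't use --to=hosts without a jobid")


try:
    check_hostname_options(hostname=True)
except UsageError as exc:
    message = str(exc)
    print(message)
```

Using a sentinel rather than 0 means "no jobid given" can never be confused with a real id, so no lookup of R is attempted for an unrelated job.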

Problem: There are no tests that exercise recent updates in the
flux-job(1) taskmap subcommand, including the `--to=hosts` option
and error handling when hostnames are required but not available.

Update the t2616-job-shell-taskmap.t sharness test.

Problem: The --to=hosts option of the taskmap subcommand is not
documented in flux-job(1).

Add it to the documentation.

codecov bot commented May 4, 2024

Codecov Report

Attention: Patch coverage is 93.18182%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 83.38%. Comparing base (ab6a49c) to head (bbce0e1).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5941      +/-   ##
==========================================
+ Coverage   83.34%   83.38%   +0.03%     
==========================================
  Files         514      514              
  Lines       83105    83134      +29     
==========================================
+ Hits        69264    69318      +54     
+ Misses      13841    13816      -25     
Files Coverage Δ
src/cmd/job/taskmap.c 90.96% <93.18%> (+1.28%) ⬆️

... and 10 files with indirect coverage changes

@mergify mergify bot merged commit 999d9bb into flux-framework:master May 4, 2024
33 of 34 checks passed