Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add --taskmap=hostfile:FILE support #5844

Merged
merged 5 commits into from Mar 31, 2024

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Mar 29, 2024

This PR adds a new taskmap job shell plugin with the traditional "hostfile" mechanism for assigning task ids based on a list of host names. This was requested as a means to do very specific arbitrary task mapping in an easy to understand manner. It was easy to implement and will probably be nice to have around, even if not useful a majority of the time.

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks good, just one comment and that codeql shadowed variable should probably be fixed.

Comment on lines 81 to 82
if (!(fp = fopen (path, "r")))
shell_die_errno (1, "failed to open hostfile: %s", path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason why this function calls shell_die_errno() on error while hostlist_from_R() does not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that was left over from a prototype and should probably be fixed.

Problem: The active_shells idset is leaked in the shell output plugin.

Add the missing idset_destroy() in shell_output_destroy().
Problem: Builtin shell taskmap plugins are currently stored at the
top level in src/shell, but it would be better to keep these in a
subdirectory to avoid naming confusion.

Move the cyclic taskmap plugin into src/shell/taskmap.
Problem: There is currently no simple way to map tasks to specific
hosts using a list of hostnames.

Add a hostfile taskmap plugin to the job shell. This taskmap scheme
assigns tasks to hosts in order as they appear in a file provided by
the scheme value. The hostfile is read as a list of newline separated
hostnames or RFC 29 Hostlist strings. If there are not as many hosts
as tasks in the hostfile, then the list is reused, and if more hosts
than tasks appear in the file, then the list is truncated.

Like other taskmap schemes, it is an error if the same task count is
not assigned to each host as in the default task map.

Fixes flux-framework#5794
Problem: There are no tests of the job shell hostfile taskmap support.

Since the hostfile support requires different hostnames for each "node"
in an instance, create a new test specific to this taskmap type using
the "system" test personality, which assigns fake hostnames to brokers.
Problem: The hostfile taskmap scheme is not documented.

Mention this scheme in flux-shell(1) along with the other builtin
taskmap schemes, and document it in the cli submission commands.
@grondo
Copy link
Contributor Author

grondo commented Mar 31, 2024

Ok, addressed comments from @garlick and I'll set MWP.

Copy link

codecov bot commented Mar 31, 2024

Codecov Report

Merging #5844 (f31c26b) into master (57ac53a) will increase coverage by 0.00%.
The diff coverage is 74.00%.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #5844    +/-   ##
========================================
  Coverage   83.28%   83.29%            
========================================
  Files         510      511     +1     
  Lines       82594    82694   +100     
========================================
+ Hits        68785    68876    +91     
- Misses      13809    13818     +9     
Files Coverage Δ
src/shell/builtins.c 92.00% <ø> (ø)
src/shell/output.c 71.86% <100.00%> (+0.04%) ⬆️
src/shell/taskmap/cyclic.c 78.43% <ø> (ø)
src/shell/taskmap/hostfile.c 73.73% <73.73%> (ø)

... and 15 files with indirect coverage changes

@mergify mergify bot merged commit ed07466 into flux-framework:master Mar 31, 2024
34 of 35 checks passed
@grondo grondo deleted the taskmap-hostfile branch March 31, 2024 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants