Support for giving slurm jobs of workers different names #708

Closed
StHagel opened this issue May 22, 2024 · 7 comments · Fixed by #710

Comments

StHagel commented May 22, 2024

Currently, all workers are simply named hq-alloc in Slurm when viewed in squeue.
It would be nice to be able to give the workers custom names. In slurm this can be done via the --job-name='My job name' flag.
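For illustration, here is a minimal sketch of how a submission command carrying a custom Slurm job name could be assembled. `--job-name` is the standard `sbatch` option mentioned above; the helper name `sbatch_command` and the script name `worker.sh` are hypothetical placeholders, and nothing here reflects how HyperQueue itself submits allocations.

```python
import shlex


def sbatch_command(script, job_name):
    """Build an sbatch argument list that sets the Slurm job name.

    `script` is a placeholder path; the command is only built, not executed.
    """
    return ["sbatch", f"--job-name={job_name}", script]


cmd = sbatch_command("worker.sh", "My job name")
# shlex.join quotes arguments containing spaces so the line is safe to paste into a shell.
print(shlex.join(cmd))
```

Passing the arguments as a list (rather than a single string) avoids quoting problems when the job name contains spaces.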

Collaborator

Kobzol commented May 22, 2024

If you executed hq alloc add --name <foo> and all allocations from this queue were then named <foo>, would that be OK for you? Or should they be named e.g. <foo>-1, <foo>-2, etc.?

Author

StHagel commented May 22, 2024

Would it also be possible to add a flag to hq submit instead of hq alloc?

Collaborator

Kobzol commented May 22, 2024

You can already set the name of a job (hq submit --name <foo>), but this has nothing to do with allocations. Note that HQ jobs are completely separate from allocations, so no attribute of a job can affect the attributes of Slurm/PBS allocations.

Author

StHagel commented May 22, 2024

Right, makes sense.

Then having a --name flag available for hq alloc seems reasonable. Would the index in the name indicate a worker?

Collaborator

Kobzol commented May 22, 2024

The index would indicate the order in which the allocation was created in the given allocation queue. So the first allocation created by HQ would get <foo>-1, the second one would get <foo>-2, etc. In theory, we could also let the allocator name the workers; currently they get their name from the hostname of the node on which they are spawned.

(The flag is already available btw, it just isn't propagated to the Slurm allocation name, which is what we could change).
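The indexed naming scheme described above can be sketched as follows. This is only an illustration of the `<foo>-1`, `<foo>-2` pattern, not HyperQueue's actual implementation; the function name `allocation_names` is hypothetical.

```python
from itertools import count


def allocation_names(prefix):
    """Yield prefix-1, prefix-2, ... in the order allocations are created."""
    # The index starts at 1 and grows with each allocation in the queue.
    for index in count(1):
        yield f"{prefix}-{index}"


names = allocation_names("foo")
first_three = [next(names) for _ in range(3)]
print(first_three)
```

A generator keeps the counter tied to a single queue, which mirrors the idea that the index reflects creation order within that queue rather than a worker identity.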

Author

StHagel commented May 22, 2024

I guess the solution with the index is better than without.

Collaborator

Kobzol commented May 22, 2024

Oops, according to our documentation, the --name parameter should already have been used to name the allocations, so this was actually a bug. Fixed by #710.
