flux-uri slurm:jobid does not work for slurm batch jobs #5482

Closed · garlick opened this issue Oct 3, 2023 · 9 comments
@garlick (Member) commented Oct 3, 2023

Note to self: while working on https://flux-framework.readthedocs.io/projects/flux-core/en/latest/guide/start.html#starting-with-slurm and running flux instances in the LLNL quartz debug queue, I was unable to get flux uri slurm:jobid to work. I didn't run it down. This needs to be revisited to see if it really works and I'm just doing something dumb, or if something's gone sour in that code.

@grondo (Contributor) commented Oct 3, 2023

A simple test worked for me, but this is the simplest case. Any hints on what you might have been doing differently?

$ squeue -u grondo
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           1521171    pdebug interact   grondo  R       0:44      1 quartz3
$ flux uri slurm:1521171
ssh://quartz3/var/tmp/grondo/flux-ptC8HX/local-0
$ flux proxy slurm:1521171
f(s=1,d=0) $ flux resource list
     STATE NNODES   NCORES    NGPUS NODELIST
      free      1       36        0 quartz3
 allocated      0        0        0 
      down      0        0        0 

Also, this reminds me that this would be a good test case to add to our extra tests for the GitLab CI. (cc @wihobbs)
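
A rough sketch of what that test could look like (a hypothetical sharness-style test, in the style of flux-core's testsuite; the readiness polling is simplified and has no timeout):

test_expect_success 'flux uri resolves a slurm: jobid' '
    jobid=$(sbatch --parsable -N1 --wrap "srun flux start sleep 60") &&
    # wait until the batch job reaches the running state
    until squeue -h -j $jobid -t R | grep -q .; do sleep 1; done &&
    flux uri slurm:$jobid
'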

@wihobbs (Member) commented Oct 3, 2023

Good idea @grondo. @garlick I tried the same thing as Mark, but varied the number of nodes, put some nested instances in there, and tried using flux uri slurm:jobid both in and out of the session. This reminds me of when the LSF resolver broke: I got an allocation on lassen9-11, and the LSF command sorted lassen10 as the rank 0 node because "lassen10" sorts lexicographically before "lassen9". It could be a really weird one-off case like that. In any event, good idea to add that to our testing on real clusters.

@garlick (Member, Author) commented Oct 6, 2023

It worked for me just now. I was probably doing something dumb before! Sorry for the noise.

@garlick (Member, Author) commented Oct 6, 2023

Ah, this is what I was doing. But perhaps this isn't intended to work:

[garlick@quartz386:~]$ sbatch -p pbatch -N2 --job-name flux --wrap "flux start sleep 120"
Submitted batch job 1533848
[garlick@quartz386:~]$ squeue|grep 1533848
           1533848    pbatch     flux  garlick  R       0:17      2 quartz[161-162]
[garlick@quartz386:~]$ flux uri slurm:1533848
flux-uri: ERROR: Unable to resolve Flux URI for Slurm job 1533848

@garlick (Member, Author) commented Oct 6, 2023

Reopening since it would be nice if this worked for slurm batch jobs. I think the only problem is that the batch script is the first child of the slurmstepd, and we need to look one level deeper if the first LOCALID=0 process does not work out. Perhaps we could just try the pids in sorted order (see the sketch after the listpids output below)?

On the first node of a job submitted like above:

[garlick@quartz161:~]$ scontrol listpids
PID      JOBID    STEPID   LOCALID GLOBALID
3188397  1533855  batch    0       0
3188401  1533855  batch    -       -
3188489  1533855  batch    -       -
-1       1533855  extern   0       0
3188390  1533855  extern   -       -

and those pids are:

UID          PID    PPID  C STIME TTY          TIME CMD
garlick  3188397 3188392  0 05:41 ?        00:00:00 /bin/sh /var/spool/slurmd/job1533855/slurm_script
garlick  3188401 3188397  0 05:41 ?        00:00:00 /usr/libexec/flux/cmd/flux-broker sleep 360
garlick  3188489 3188401  0 05:41 ?        00:00:00 sleep 360
root     3188390 3188385  0 05:41 ?        00:00:00 sleep 100000000

        ├─slurmstepd─┬─slurm_script───flux-broker-0─┬─sleep
        │            │                              └─17*[{flux-broker-0}]
        │            └─2*[{slurmstepd}]

Confirmed that flux uri pid:3188401 works.
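
A minimal sketch of that fallback as a shell loop (illustration only, not the resolver's actual code; it reuses the scontrol listpids and flux uri pid: commands shown above):

jobid=1533855
# skip the header line and the -1 placeholder entries, sort pids numerically
pids=$(scontrol listpids $jobid | awk 'NR > 1 && $1 > 0 {print $1}' | sort -n)
for pid in $pids; do
    # stop at the first pid that resolves to a broker URI
    flux uri pid:$pid 2>/dev/null && break
done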

garlick reopened this Oct 6, 2023
garlick changed the title from "flux-uri slurm:jobid needs a retest" to "flux-uri slurm:jobid does not work for slurm batch jobs" Oct 6, 2023
@grondo (Contributor) commented Oct 6, 2023

The slurm resolver doesn't walk the process tree of slurmstepd, but uses scontrol listpids to list the pids for the job (and, I think, for all job steps of a batch/alloc job). In this case flux start is not run under srun, so the PID of the broker won't be available.

I think it would work if you did srun flux start because then Flux would actually be running under Slurm.

Not to say we couldn't fix this particular case, but searching for the first flux-broker that happens to be running under a Slurm batch job might give surprising results. For example, I could get a random test instance returned if running make -j 16 check in flux-core under a batch job...
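
To summarize the two launch patterns (commands taken from the examples in this thread):

# not resolvable: the broker is a child of the batch script, not a task
# that scontrol listpids reports with a LOCALID
sbatch -N2 --wrap "flux start sleep 120"

# resolvable: srun launches flux start as job step tasks, so the broker
# pid shows up in scontrol listpids
sbatch -N2 --wrap "srun flux start sleep 120"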

@grondo (Contributor) commented Oct 6, 2023

> I think it would work if you did srun flux start because then Flux would actually be running under Slurm.

Should test this one though...

@grondo (Contributor) commented Oct 6, 2023

Just another thought: we'd have a similar issue with Flux itself if you ran flux batch -N1 --wrap flux start. Since the flux start is a singleton not run under flux run, you couldn't get the URI with flux uri jobid1/jobid2 (see the sketch below)...
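
A sketch of that analogous case (jobid1 and jobid2 are placeholders for the outer batch job ID and an inner job ID):

# the wrapped flux start is a singleton, not a job launched via flux run,
# so there is no inner jobid for hierarchical resolution to follow
flux batch -N1 --wrap flux start
flux uri jobid1/jobid2    # would fail to resolve the inner instance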

@garlick (Member, Author) commented Oct 6, 2023

Oh duh! My example was not doing what I thought it was: I was just starting a size=1 flux instance on the first node of the batch allocation, wasn't I? Yeah, this works:

[garlick@quartz386:~]$ sbatch -p pdebug -N2 --job-name flux --wrap "srun flux start sleep 360"
Submitted batch job 1533904
[garlick@quartz386:~]$ flux uri slurm:1533904
ssh://quartz3/var/tmp/garlick/flux-H2g2KI/local-0
[garlick@quartz386:~]$ flux proxy slurm:1533904
[garlick@quartz386:~]$ flux resource list
     STATE NNODES   NCORES    NGPUS NODELIST
      free      2       72        0 quartz[3-4]
 allocated      0        0        0 
      down      0        0        0 

Sorry for the noise!

garlick closed this as completed Oct 6, 2023