job stuck in SCHED state after queue reconfiguration (and sched modules reload) #1204

Open
grondo opened this issue May 14, 2024 · 0 comments

grondo commented May 14, 2024

I was testing reconfiguration without a restart, by reloading the resource and Fluxion modules, and ran into an issue.

Here's a test that somewhat randomly redistributes resources among queues while jobs are running. The queues are reconfigured and no running jobs are lost. In the last step, however, all the resources are moved to a single queue (one) and a full-system job is submitted. Once the resources are free, the 100-node job should start, but instead it stays in the SCHED state indefinitely. If I then submit any other job to that queue, the pending job is scheduled and things move along.

#!/bin/bash

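# Reload the resource and Fluxion modules with a new configuration.
# Additional [[resource.config]] entries (the host-to-property mapping)
# are read from stdin and appended to the base config below.
reconfig() {
	read -rd '' CONFIG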
	flux queue stop --all
	flux module remove sched-fluxion-qmanager
	flux module remove sched-fluxion-resource
	flux module remove resource
	cat <<-EOF | flux config load
	[resource]
	noverify = true
	norestrict = true

	[queues.one]
	requires = ["one"]

	[queues.two]
	requires = ["two"]

	[queues.three]
	requires = ["three"]

	[queues.four]
	requires = ["four"]

	[[resource.config]]
	hosts = "test[1-100]"
	cores = "0-95"
	gpus = "0-3"

	$CONFIG
	EOF
	flux module load resource noverify monitor-force-up
	flux module load sched-fluxion-resource
	flux module load sched-fluxion-qmanager
	flux queue start --all
}

reconfig <<EOF
[[resource.config]]
hosts = "test[1-25]"
properties = ["one"]

[[resource.config]]
hosts = "test[26-50]"
properties = ["two"]

[[resource.config]]
hosts = "test[51-75]"
properties = ["three"]

[[resource.config]]
hosts = "test[76-100]"
properties = ["four"]
EOF

flux resource list

flux bulksubmit \
	--wait-event=start --cc=1-10 -q {} \
	--setattr=exec.test.run_duration=10s \
	-N2 sleep 10 \
	::: one two three four

flux jobs 

reconfig <<EOF
[[resource.config]]
hosts = "test[1-10]"
properties = ["one"]

[[resource.config]]
hosts = "test[26-50]"
properties = ["two"]

[[resource.config]]
hosts = "test[51-75]"
properties = ["three"]

[[resource.config]]
hosts = "test[11-25,76-100]"
properties = ["four"]
EOF

flux resource list
flux jobs


reconfig <<EOF
[[resource.config]]
hosts = "test[1-10]"
properties = ["one"]

[[resource.config]]
hosts = "test[26-50]"
properties = ["two"]

[[resource.config]]
hosts = "test[11,51-75]"
properties = ["three"]

[[resource.config]]
hosts = "test[12-25,76-100]"
properties = ["four"]
EOF

flux resource list

reconfig <<EOF
[[resource.config]]
hosts = "test[1-100]"
properties = ["one"]
EOF

flux resource list

jobid=$(flux submit -q one --setattr=exec.test.run_duration=1s -N100 sleep 1)

flux jobs
sleep 5

flux resource list
flux jobs

flux job wait-event -vvv "$jobid" clean

Output:

Scheduling is stopped
flux-module: remove sched-fluxion-qmanager: No such file or directory
flux-module: remove sched-fluxion-resource: No such file or directory
May 14 16:16:38.141986 sched-simple.err[0]: exiting due to resource update failure: the resource module was unloaded
three: Scheduling is started
two: Scheduling is started
four: Scheduling is started
one: Scheduling is started
     STATE QUEUE      NNODES   NCORES    NGPUS NODELIST
      free one            25     2400      100 test[1-25]
      free two            25     2400      100 test[26-50]
      free three          25     2400      100 test[51-75]
      free four           25     2400      100 test[76-100]
 allocated                 0        0        0 
      down                 0        0        0 
f2HvLhyh
f2HvLhyi
f2HwphG3
f2HyJgYP
f2HyJgYQ
f2Hznfpj
f2Hznfpk
f2J2Gf75
f2J2Gf76
f2J3kePR
f2J3kePS
f2J5Edfm
f2J5Edfn
f2J6icx7
f2J6icx8
f2J8CcET
f2J9gbWo
f2J9gbWp
f2JBAao9
f2JMYVmZ
f2JMYVma
f2JMYVmb
f2JMYVmc
f2JMYVmd
f2JMYVme
f2JMYVmf
f2JP2V3u
f2JP2V3v
f2JP2V3w
f2JP2V3x
f2JP2V3y
f2JP2V3z
f2Ki7qhq
f2Ki7qhr
f2Ki7qhs
f2Ki7qht
f2KjbpzB
f2KjbpzC
f2Km5pGX
f2Km5pGY
       JOBID QUEUE    USER     NAME       ST NTASKS NNODES     TIME INFO
    f2Km5pGY four     grondo   sleep       R      2      2   0.272s test[81-82]
    f2KjbpzC three    grondo   sleep       R      2      2   0.273s test[56-57]
    f2KjbpzB three    grondo   sleep       R      2      2   0.273s test[58-59]
    f2Ki7qht three    grondo   sleep       R      2      2   0.274s test[60-61]
    f2Ki7qhs three    grondo   sleep       R      2      2   0.274s test[62-63]
    f2Ki7qhr three    grondo   sleep       R      2      2   0.274s test[64-65]
    f2Km5pGX four     grondo   sleep       R      2      2   0.279s test[83-84]
    f2Ki7qhq three    grondo   sleep       R      2      2   0.280s test[66-67]
    f2JP2V3z three    grondo   sleep       R      2      2   0.293s test[68-69]
    f2JP2V3y three    grondo   sleep       R      2      2   0.297s test[70-71]
    f2JP2V3w three    grondo   sleep       R      2      2   0.297s test[72-73]
    f2JP2V3x four     grondo   sleep       R      2      2   0.297s test[85-86]
    f2JP2V3v four     grondo   sleep       R      2      2   0.301s test[87-88]
    f2JMYVmf four     grondo   sleep       R      2      2   0.301s test[89-90]
    f2JMYVmd four     grondo   sleep       R      2      2   0.302s test[91-92]
    f2JMYVmb four     grondo   sleep       R      2      2   0.302s test[93-94]
    f2JMYVmZ four     grondo   sleep       R      2      2   0.302s test[95-96]
    f2J9gbWp four     grondo   sleep       R      2      2   0.304s test[97-98]
    f2JMYVme two      grondo   sleep       R      2      2   0.305s test[31-32]
    f2JMYVmc two      grondo   sleep       R      2      2   0.305s test[33-34]
    f2JMYVma two      grondo   sleep       R      2      2   0.306s test[35-36]
    f2JBAao9 two      grondo   sleep       R      2      2   0.306s test[37-38]
    f2J9gbWo two      grondo   sleep       R      2      2   0.306s test[39-40]
    f2J6icx8 two      grondo   sleep       R      2      2   0.307s test[41-42]
    f2J6icx7 two      grondo   sleep       R      2      2   0.311s test[43-44]
    f2J5Edfn two      grondo   sleep       R      2      2   0.316s test[45-46]
    f2J5Edfm two      grondo   sleep       R      2      2   0.316s test[47-48]
    f2J3kePR one      grondo   sleep       R      2      2   0.316s test[6-7]
    f2J2Gf76 one      grondo   sleep       R      2      2   0.326s test[8-9]
    f2J2Gf75 one      grondo   sleep       R      2      2   0.326s test[10-11]
    f2Hznfpk one      grondo   sleep       R      2      2   0.326s test[12-13]
    f2Hznfpj one      grondo   sleep       R      2      2   0.327s test[14-15]
    f2HyJgYQ one      grondo   sleep       R      2      2   0.327s test[16-17]
    f2HyJgYP one      grondo   sleep       R      2      2   0.327s test[18-19]
    f2HwphG3 one      grondo   sleep       R      2      2   0.327s test[20-21]
    f2HvLhyi one      grondo   sleep       R      2      2   0.328s test[22-23]
    f2JP2V3u three    grondo   sleep       R      2      2   0.328s test[74-75]
    f2J8CcET four     grondo   sleep       R      2      2   0.328s test[99-100]
    f2J3kePS two      grondo   sleep       R      2      2   0.328s test[49-50]
    f2HvLhyh one      grondo   sleep       R      2      2   0.329s test[24-25]
three: Scheduling is stopped
two: Scheduling is stopped
four: Scheduling is stopped
one: Scheduling is stopped
three: Scheduling is started
two: Scheduling is started
four: Scheduling is started
one: Scheduling is started
     STATE QUEUE      NNODES   NCORES    NGPUS NODELIST
      free one             5      480       20 test[1-5]
      free two             5      480       20 test[26-30]
      free three           5      480       20 test[51-55]
      free four            5      480       20 test[76-80]
 allocated one             5      480       20 test[6-10]
 allocated one,four       15     1440       60 test[11-25]
 allocated two            20     1920       80 test[31-50]
 allocated three          20     1920       80 test[56-75]
 allocated four           20     1920       80 test[81-100]
      down                 0        0        0 
       JOBID QUEUE    USER     NAME       ST NTASKS NNODES     TIME INFO
    f2Km5pGY four     grondo   sleep       R      2      2   2.080s test[81-82]
    f2KjbpzC three    grondo   sleep       R      2      2   2.081s test[56-57]
    f2KjbpzB three    grondo   sleep       R      2      2   2.081s test[58-59]
    f2Ki7qht three    grondo   sleep       R      2      2   2.081s test[60-61]
    f2Ki7qhs three    grondo   sleep       R      2      2   2.082s test[62-63]
    f2Ki7qhr three    grondo   sleep       R      2      2   2.082s test[64-65]
    f2Km5pGX four     grondo   sleep       R      2      2   2.087s test[83-84]
    f2Ki7qhq three    grondo   sleep       R      2      2   2.087s test[66-67]
    f2JP2V3z three    grondo   sleep       R      2      2   2.101s test[68-69]
    f2JP2V3y three    grondo   sleep       R      2      2   2.104s test[70-71]
    f2JP2V3w three    grondo   sleep       R      2      2   2.104s test[72-73]
    f2JP2V3x four     grondo   sleep       R      2      2   2.105s test[85-86]
    f2JP2V3v four     grondo   sleep       R      2      2   2.109s test[87-88]
    f2JMYVmf four     grondo   sleep       R      2      2   2.109s test[89-90]
    f2JMYVmd four     grondo   sleep       R      2      2   2.109s test[91-92]
    f2JMYVmb four     grondo   sleep       R      2      2   2.109s test[93-94]
    f2JMYVmZ four     grondo   sleep       R      2      2   2.110s test[95-96]
    f2J9gbWp four     grondo   sleep       R      2      2   2.112s test[97-98]
    f2JMYVme two      grondo   sleep       R      2      2   2.113s test[31-32]
    f2JMYVmc two      grondo   sleep       R      2      2   2.113s test[33-34]
    f2JMYVma two      grondo   sleep       R      2      2   2.113s test[35-36]
    f2JBAao9 two      grondo   sleep       R      2      2   2.114s test[37-38]
    f2J9gbWo two      grondo   sleep       R      2      2   2.114s test[39-40]
    f2J6icx8 two      grondo   sleep       R      2      2   2.114s test[41-42]
    f2J6icx7 two      grondo   sleep       R      2      2   2.118s test[43-44]
    f2J5Edfn two      grondo   sleep       R      2      2   2.123s test[45-46]
    f2J5Edfm two      grondo   sleep       R      2      2   2.123s test[47-48]
    f2J3kePR one      grondo   sleep       R      2      2   2.124s test[6-7]
    f2J2Gf76 one      grondo   sleep       R      2      2   2.134s test[8-9]
    f2J2Gf75 one      grondo   sleep       R      2      2   2.134s test[10-11]
    f2Hznfpk one      grondo   sleep       R      2      2   2.134s test[12-13]
    f2Hznfpj one      grondo   sleep       R      2      2   2.134s test[14-15]
    f2HyJgYQ one      grondo   sleep       R      2      2   2.135s test[16-17]
    f2HyJgYP one      grondo   sleep       R      2      2   2.135s test[18-19]
    f2HwphG3 one      grondo   sleep       R      2      2   2.135s test[20-21]
    f2HvLhyi one      grondo   sleep       R      2      2   2.135s test[22-23]
    f2JP2V3u three    grondo   sleep       R      2      2   2.135s test[74-75]
    f2J8CcET four     grondo   sleep       R      2      2   2.136s test[99-100]
    f2J3kePS two      grondo   sleep       R      2      2   2.136s test[49-50]
    f2HvLhyh one      grondo   sleep       R      2      2   2.136s test[24-25]
three: Scheduling is stopped
two: Scheduling is stopped
four: Scheduling is stopped
one: Scheduling is stopped
three: Scheduling is started
two: Scheduling is started
four: Scheduling is started
one: Scheduling is started
     STATE QUEUE      NNODES   NCORES    NGPUS NODELIST
      free one             5      480       20 test[1-5]
      free two             5      480       20 test[26-30]
      free three           5      480       20 test[51-55]
      free four            5      480       20 test[76-80]
 allocated one             5      480       20 test[6-10]
 allocated one,three       1       96        4 test11
 allocated one,four       14     1344       56 test[12-25]
 allocated two            20     1920       80 test[31-50]
 allocated three          20     1920       80 test[56-75]
 allocated four           20     1920       80 test[81-100]
      down                 0        0        0 
three: Scheduling is stopped
two: Scheduling is stopped
four: Scheduling is stopped
one: Scheduling is stopped
three: Scheduling is started
two: Scheduling is started
four: Scheduling is started
one: Scheduling is started
     STATE QUEUE      NNODES   NCORES    NGPUS NODELIST
      free one            20     1920       80 test[1-5,26-30,51-55,76-80]
 allocated one            20     1920       80 test[6-25]
 allocated one,two        20     1920       80 test[31-50]
 allocated one,three      20     1920       80 test[56-75]
 allocated one,four       20     1920       80 test[81-100]
      down                 0        0        0 
       JOBID QUEUE    USER     NAME       ST NTASKS NNODES     TIME INFO
    f4xFwwXm one      grondo   sleep       S    100    100        - 
    f2Km5pGY four     grondo   sleep       R      2      2   6.195s test[81-82]
    f2KjbpzC three    grondo   sleep       R      2      2   6.196s test[56-57]
    f2KjbpzB three    grondo   sleep       R      2      2   6.196s test[58-59]
    f2Ki7qht three    grondo   sleep       R      2      2   6.196s test[60-61]
    f2Ki7qhs three    grondo   sleep       R      2      2   6.196s test[62-63]
    f2Ki7qhr three    grondo   sleep       R      2      2   6.197s test[64-65]
    f2Km5pGX four     grondo   sleep       R      2      2   6.202s test[83-84]
    f2Ki7qhq three    grondo   sleep       R      2      2   6.202s test[66-67]
    f2JP2V3z three    grondo   sleep       R      2      2   6.216s test[68-69]
    f2JP2V3y three    grondo   sleep       R      2      2   6.219s test[70-71]
    f2JP2V3w three    grondo   sleep       R      2      2   6.219s test[72-73]
    f2JP2V3x four     grondo   sleep       R      2      2   6.219s test[85-86]
    f2JP2V3v four     grondo   sleep       R      2      2   6.224s test[87-88]
    f2JMYVmf four     grondo   sleep       R      2      2   6.224s test[89-90]
    f2JMYVmd four     grondo   sleep       R      2      2   6.224s test[91-92]
    f2JMYVmb four     grondo   sleep       R      2      2   6.224s test[93-94]
    f2JMYVmZ four     grondo   sleep       R      2      2   6.225s test[95-96]
    f2J9gbWp four     grondo   sleep       R      2      2   6.227s test[97-98]
    f2JMYVme two      grondo   sleep       R      2      2   6.227s test[31-32]
    f2JMYVmc two      grondo   sleep       R      2      2   6.228s test[33-34]
    f2JMYVma two      grondo   sleep       R      2      2   6.228s test[35-36]
    f2JBAao9 two      grondo   sleep       R      2      2   6.228s test[37-38]
    f2J9gbWo two      grondo   sleep       R      2      2   6.229s test[39-40]
    f2J6icx8 two      grondo   sleep       R      2      2   6.229s test[41-42]
    f2J6icx7 two      grondo   sleep       R      2      2   6.233s test[43-44]
    f2J5Edfn two      grondo   sleep       R      2      2   6.238s test[45-46]
    f2J5Edfm two      grondo   sleep       R      2      2   6.238s test[47-48]
    f2J3kePR one      grondo   sleep       R      2      2   6.239s test[6-7]
    f2J2Gf76 one      grondo   sleep       R      2      2   6.248s test[8-9]
    f2J2Gf75 one      grondo   sleep       R      2      2   6.249s test[10-11]
    f2Hznfpk one      grondo   sleep       R      2      2   6.249s test[12-13]
    f2Hznfpj one      grondo   sleep       R      2      2   6.249s test[14-15]
    f2HyJgYQ one      grondo   sleep       R      2      2   6.249s test[16-17]
    f2HyJgYP one      grondo   sleep       R      2      2   6.250s test[18-19]
    f2HwphG3 one      grondo   sleep       R      2      2   6.250s test[20-21]
    f2HvLhyi one      grondo   sleep       R      2      2   6.250s test[22-23]
    f2JP2V3u three    grondo   sleep       R      2      2   6.250s test[74-75]
    f2J8CcET four     grondo   sleep       R      2      2   6.251s test[99-100]
    f2J3kePS two      grondo   sleep       R      2      2   6.251s test[49-50]
    f2HvLhyh one      grondo   sleep       R      2      2   6.251s test[24-25]
     STATE QUEUE      NNODES   NCORES    NGPUS NODELIST
      free one           100     9600      400 test[1-100]
 allocated                 0        0        0 
      down                 0        0        0 
       JOBID QUEUE    USER     NAME       ST NTASKS NNODES     TIME INFO
    f4xFwwXm one      grondo   sleep       S    100    100        - 
1715728605.926426 submit userid=6885 urgency=16 flags=0 version=1
1715728605.938349 validate
1715728605.949083 depend
1715728605.949115 priority priority=16

I can then attach to this instance with flux proxy pid:BROKER_PID and submit one job to queue one, which causes the stuck job to be scheduled.
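
For anyone trying to reproduce this, the check and the workaround described above could be scripted roughly as follows. This is only a sketch, not part of the original test: the helper names (job_state, wait_for_start, kick_queue) are made up for illustration, and it assumes that flux jobs -n -o '{state}' JOBID prints the bare job state (SCHED, RUN, and so on) and that the functions run inside the affected instance, for example under flux proxy.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative, not from the reproducer).
# Assumption: `flux jobs -n -o '{state}' JOBID` prints only the job state.

# job_state JOBID: print the current state of JOBID.
job_state() {
	flux jobs -n -o '{state}' "$1"
}

# wait_for_start JOBID TRIES [INTERVAL]
# Poll the job state up to TRIES times, sleeping INTERVAL seconds
# (default 1) between polls. Return 0 once the job has been scheduled
# (RUN, CLEANUP, or INACTIVE), or 1 if it never got that far.
wait_for_start() {
	local jobid=$1 tries=$2 interval=${3:-1} i=0
	while [ "$i" -lt "$tries" ]; do
		case "$(job_state "$jobid")" in
			RUN|CLEANUP|INACTIVE) return 0 ;;
		esac
		sleep "$interval"
		i=$((i + 1))
	done
	return 1
}

# kick_queue QUEUE: submit a trivial one-node job to QUEUE, which is the
# workaround observed above for getting the pending job scheduled.
kick_queue() {
	flux submit -q "$1" -N1 true
}

# Possible use at the end of the reproducer:
#   if ! wait_for_start "$jobid" 10; then
#       echo "job $jobid appears stuck, nudging queue one" >&2
#       kick_queue one
#   fi
```

The poll count and interval are arbitrary; the point is only to tell a job that is still waiting on resources apart from one that stays pending after the resources are already free.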
