
bug for a particular user #3

Open
paciorek opened this issue Oct 13, 2021 · 2 comments

Comments

@paciorek (Collaborator)

I'm seeing this error when running sq right now:

[paciorek@ln001 ~]$ sq -u sridharan 
Showing results for user sridharan
Traceback (most recent call last):
  File "sq.py", line 524, in <module>
  File "sq.py", line 463, in display_queued_jobs
  File "pandas/core/frame.py", line 7547, in apply
  File "pandas/core/apply.py", line 180, in get_result
  File "pandas/core/apply.py", line 255, in apply_standard
  File "pandas/core/apply.py", line 284, in apply_series_generator
  File "sq.py", line 420, in inner
  File "sq.py", line 185, in available_qos
IndexError: index 0 is out of bounds for axis 0 with size 0
[8359] Failed to execute script sq

Here are the squeue entries for this user at the moment:

[paciorek@ln001 ~]$ squeue -o "%.7i %.12P %.20j %.8u %.2t %.9M %.5C %.8r %.3D %.20R %.8p %.20q %b" | grep sridhara
9734427 savio3_bigme                 test sridhara PD      0:00     1 QOSGrpNo   1    (QOSGrpNodeLimit) 0.000254 genomicdata_bigmem3_ N/A
9734428 savio3_bigme                 test sridhara PD      0:00     1 QOSGrpNo   1    (QOSGrpNodeLimit) 0.000254 genomicdata_bigmem3_ N/A
9738740 savio3_bigme                 bash sridhara PD      0:00     1 QOSGrpNo   1    (QOSGrpNodeLimit) 0.000254 genomicdata_bigmem3_ N/A
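
For context on the traceback: an "IndexError: index 0 is out of bounds for axis 0 with size 0" at sq.py line 185 is what you get when code takes element 0 of an empty selection. A minimal, hypothetical sketch of that failure mode, assuming available_qos looks a value up in a table keyed by something parsed from the squeue output (the DataFrame, column names, and lookup below are illustrative, not sq's actual code):

import pandas as pd

# Toy lookup table; in sq this would presumably be built from Slurm output.
qos_info = pd.DataFrame({
    "partition": ["savio2", "savio3"],
    "qos": ["normal", "normal"],
})

def available_qos_unsafe(partition):
    # If `partition` never matches a row (e.g. because the Slurm output was
    # truncated or formatted unexpectedly), the selection is empty and
    # .values[0] raises: IndexError: index 0 is out of bounds for axis 0 with size 0
    return qos_info.loc[qos_info["partition"] == partition, "qos"].values[0]

def available_qos_guarded(partition):
    # Guarded variant: return None (or some sentinel) instead of raising.
    matches = qos_info.loc[qos_info["partition"] == partition, "qos"].values
    return matches[0] if len(matches) else None

print(available_qos_guarded("savio3_bigmem"))  # -> None rather than a crash

If that guess is right, handling the empty-match case explicitly, rather than indexing [0] unconditionally, would turn the crash into a readable message.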
@paciorek (Collaborator, Author)

But it seems fine now:

[paciorek@ln001 ~]$ sq -u sridharan 
Showing results for user sridharan
Currently 1 running job and 0 pending jobs (most recent job first):
+---------+------+----------------+------------------+---------------+---------+---------+--------+
| Job ID  | Name |    Account     |      Nodes       |      QOS      |  Time   |  State  | Reason |
+---------+------+----------------+------------------+---------------+---------+---------+--------+
| 9796950 | bash | co_genomicdata | 1x savio3_bigmem | savio_lowprio | 1:19:04 | RUNNING |        |
+---------+------+----------------+------------------+---------------+---------+---------+--------+

[paciorek@ln001 ~]$ squeue -u sridharan -o "%.7i %.12P %.20j %.8u %.2t %.9M %.5C %.8r %.3D %.20R %.8p %.20q %b"
  JOBID    PARTITION                 NAME     USER ST      TIME  CPUS   REASON NOD     NODELIST(REASON) PRIORITY                  QOS TRES_PER_NODE
9796950 savio3_bigme                 bash sridhara  R   1:19:39    32     None   1         n0009.savio3 0.000011        savio_lowprio N/A

@nicolaschan (Collaborator)

I suspect something in the output of one of the Slurm commands sq runs threw off the parsing. Next time this happens, if you add --freeze $DIRNAME, sq will create a new directory at $DIRNAME containing the outputs of the Slurm commands it ran. We can then replay those saved files with --load $DIRNAME to reproduce and debug the failure.
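
In case it helps whoever hits this next, here is a rough sketch of how that capture/replay idea works; the helper name and file layout are assumptions, not sq's actual implementation, only the --freeze/--load flags themselves come from the comment above:

import os
import subprocess

def run_slurm(cmd, freeze_dir=None, load_dir=None):
    """Run a Slurm command; optionally save its output (freeze) or replay it (load)."""
    # One cache file per command name; a real implementation would disambiguate
    # repeated or differently parameterized calls.
    fname = cmd[0] + ".out"
    if load_dir is not None:
        # Replay mode: read the previously captured output instead of talking to Slurm.
        with open(os.path.join(load_dir, fname)) as f:
            return f.read()
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    if freeze_dir is not None:
        # Freeze mode: keep the raw output so the exact failing state can be re-examined.
        os.makedirs(freeze_dir, exist_ok=True)
        with open(os.path.join(freeze_dir, fname), "w") as f:
            f.write(out)
    return out

# e.g. run_slurm(["squeue", "--all"], freeze_dir="sq-debug"), then later
#      run_slurm(["squeue", "--all"], load_dir="sq-debug") to replay offline.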
