Problem with pyslurm on example jobs_list.py #20

Closed
laparn opened this issue Dec 21, 2012 · 7 comments

@laparn

laparn commented Dec 21, 2012

import pyslurm now works (Ubuntu 12.04, Slurm 2.5.0, pyslurm trunk, Cython installed with pip), so I am going through the examples.

I can see the partition, and blocks_list.py returns "No Blocks".

However:
python jobs_list.py
Traceback (most recent call last):
File "jobs_list.py", line 47, in <module>
jobs = a.get()
File "pyslurm.pyx", line 1573, in pyslurm.pyslurm.job.get (pyslurm/pyslurm.c:20204)
File "pyslurm.pyx", line 1669, in pyslurm.pyslurm.job.__get (pyslurm/pyslurm.c:21206)
File "pyslurm.pyx", line 4119, in pyslurm.pyslurm.get_conn_type_string (pyslurm/pyslurm.c:45363)
TypeError: an integer is required

sjobs.py fails in the same way (as expected, it is exactly the same error).

And with node_list.py:
arnaud@D3550:~/src/pyslurm/examples$ python node_list.py
python: error: Unsupported option 4 for get_nodeinfo.

python: error: Unsupported option 4 for get_nodeinfo.

D3550 :
alloc_cpus : 0
arch : x86_64
boards : 1355999036
boot_time : Thu Dec 20 10:23:56 2012
cores : 1
cpu_load : 116
cpus : 1
err_cpus : 0
features : []
gres : []
name : D3550
node_addr : 127.0.0.1
node_hostname : D3550
Traceback (most recent call last):
File "node_list.py", line 41, in <module>
display(node_dict)
File "node_list.py", line 25, in display
print "\t%-17s : %s" % (part_key, pyslurm.get_node_state(value[part_key]))
File "pyslurm.pyx", line 3944, in pyslurm.pyslurm.get_node_state (pyslurm/pyslurm.c:43705)
TypeError: an integer is required

Regards,

Arnaud Laprévote

@phantez
Contributor

phantez commented Dec 21, 2012

We are lagging a bit behind in terms of API support for this new release; we will try to have a look at it soon.

@gingergeeks
Member

Hi, we have made a quick modification to the 2.5 branch, so please do another pull. As phantez stated, we are trying to catch up with the API changes and correct our code.

@laparn
Author

laparn commented Dec 26, 2012

It still does not work:

arnaud@D3550:~/src/pyslurm/examples$ python jobs_list.py
Traceback (most recent call last):
File "jobs_list.py", line 43, in <module>
jobs = a.get()
File "pyslurm.pyx", line 1581, in pyslurm.pyslurm.job.get (pyslurm/pyslurm.c:20212)
File "pyslurm.pyx", line 1677, in pyslurm.pyslurm.job.__get (pyslurm/pyslurm.c:21214)
File "pyslurm.pyx", line 4146, in pyslurm.pyslurm.get_conn_type_string (pyslurm/pyslurm.c:45448)
TypeError: an integer is required

Thanks for trying. Keep me posted.

@gingergeeks
Member

Looking at it now :)

@gingergeeks
Member

Pushed a small change. I had tested the previous code on an emulated BG/Q, so I have now built Slurm 2.5.0 for a standard cluster. There the connection type is NULL (None in Python), but the decode routine was expecting an integer. We will also have to modify any other routines where this may be the case.
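
To illustrate the kind of guard described above, here is a minimal sketch of a decode routine that accepts None as well as an integer. It is not the actual pyslurm code: the function name conn_type_string and the CONN_TYPE_NAMES table (including its values) are hypothetical placeholders. Only the (None, 'None') result for a missing connection type is taken from the output reported later in this thread.

```python
# Hypothetical lookup table for illustration only; the real codes and
# labels are defined by Slurm, not by this sketch.
CONN_TYPE_NAMES = {0: "MESH", 1: "TORUS", 2: "NAV", 3: "SMALL"}


def conn_type_string(conn_type):
    """Return a (raw value, label) pair for a connection type.

    On a non-BlueGene cluster the value may be missing (None), so it is
    handled explicitly instead of being passed to integer-only code.
    """
    if conn_type is None:
        return (None, "None")
    # Unknown integers get a generic label rather than raising.
    return (conn_type, CONN_TYPE_NAMES.get(conn_type, "UNKNOWN"))


print(conn_type_string(None))
```

The same "check for None before decoding" pattern would presumably apply to the other decode routines mentioned above, such as the one behind the node_list.py traceback, although that is an assumption rather than something confirmed in this thread.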

Please let me know if this works for you.

@laparn
Author

laparn commented Dec 27, 2012

This time it seems to work:
JobID 128 :
account : None
alloc_node : D3550
alloc_sid : 14351
altered : None
assoc_id : 0
batch_flag : 0
batch_host : D3550
batch_script : None
block_id : None
blrts_image : None
boards_per_node : 0
cnode_cnt : None
command : wait-arg.sh
comment : None
conn_type : (None, 'None')
contiguous : False
cores_per_socket : 65534
cpus_per_task : 1
dependency : None
derived_ec : 0
eligible_time : Thu Dec 27 10:12:14 2012
end_time : Fri Dec 27 10:12:14 2013
exc_nodes : []
exit_code : 0
features : []
gres : []
group_id : 1001
ionodes : None
job_state : (1, 'RUNNING')
licenses : {}
linux_image : None
max_cpus : 0
max_nodes : 0
mloader_image : None
name : wait-arg.sh
network : None
nice : 10000
nodes : None
ntasks_per_core : 65535
ntasks_per_node : 0
ntasks_per_socket : 65535
num_cpus : 1
num_nodes : 1
partition : debug
pn_min_cpus : 1
pn_min_memory : 0
pn_min_tmp_disk : 0
pre_sus_time : 0
preempt_time : 0
priority : 4294901727
qos : None
ramdisk_image : None
reboot : None
req_nodes : []
req_switch : 0
requeue : True
resize_time : N/A
restart_cnt : 0
resv_id : None
resv_name : None
rotate : False
shared : 0
show_flags : 0
sockets_per_board : 0
sockets_per_node : 65534
start_time : Thu Dec 27 10:12:14 2012
state_desc : None
state_reason : (0, 'None')
submit_time : Thu Dec 27 10:12:14 2012
suspend_time : 0
threads_per_core : 65534
time_limit : Infinite
time_min : 0
user_id : 1001
wait4switch : 0
wckey : None
work_dir : /home/arnaud/src/slurmjob

Number of Jobs - 1

[]
Number of pending jobs - 0
Number of running jobs - 0

JobIDs in Running state - []

I am a little surprised by the summary at the end: there is a running job, but the list of running JobIDs is empty. The example code contains:
running = a.find('job_state', pyslurm.JOB_RUNNING)

So I suppose that either something is wrong in find, or the representation has changed (here job_state is the tuple (1, 'RUNNING')). Do you want me to open a new issue for this? You can close this one.
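
One way the second hypothesis could produce an empty list is sketched below. This is not pyslurm's find implementation: find_ids and the sample jobs dictionary are hypothetical, and JOB_RUNNING = 1 is assumed from the (1, 'RUNNING') value in the output above. Comparing the tuple (1, 'RUNNING') directly with the integer 1 is always false, so a lookup has to compare against the numeric code inside the tuple.

```python
JOB_RUNNING = 1  # assumption: matches the 1 in (1, 'RUNNING') above


def find_ids(jobs, key, wanted):
    """Return the sorted IDs of jobs whose `key` matches `wanted`.

    Works whether the stored value is a plain code (1) or a
    (code, label) tuple such as (1, 'RUNNING').
    """
    matches = []
    for job_id, info in jobs.items():
        value = info.get(key)
        # Unwrap (code, label) tuples so the numeric code is compared.
        code = value[0] if isinstance(value, tuple) and value else value
        if code == wanted:
            matches.append(job_id)
    return sorted(matches)


# Hypothetical data shaped like the job dictionary printed above.
jobs = {128: {"job_state": (1, "RUNNING")}, 129: {"job_state": (0, "PENDING")}}

print((1, "RUNNING") == JOB_RUNNING)              # False: tuple vs. integer
print(find_ids(jobs, "job_state", JOB_RUNNING))   # [128]
```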

Thanks for the help,

Regards,

Arnaud

@gingergeeks
Member

Yes, please open a new issue and we will take a look as soon as possible. Closing this one; thank you for your patience.
