Job hangs - “waiting for job to start” on a PBS Cluster #70
Comments
The same was happening to me. I still can't get PBS to work with my cluster, but I was able to solve this problem. In On line 92 of
@affans could you please confirm this is fixed in master? Should this issue be closed?
using file systems for interprocess communication is generally a bad idea. try
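To illustrate the file-system point, here is a minimal sketch, assuming the manager waits for each worker by polling for an output file (this is an assumption about the mechanism, not a copy of the package's code). If the file is written somewhere the head node cannot see, an unbounded loop like that prints dots forever. `wait_for_file` and the path below are hypothetical names used only for illustration; the sketch adds a timeout so the wait fails visibly instead of hanging.

```julia
# Hypothetical helper, not part of ClusterManagers.jl: poll for a file,
# but give up after `timeout` seconds instead of looping forever.
function wait_for_file(path::AbstractString; timeout::Real=60, interval::Real=1.0)
    deadline = time() + timeout
    while !isfile(path)
        if time() >= deadline
            return false              # the file never appeared in time
        end
        sleep(interval)               # yield and wait before checking again
    end
    return true
end

# Usage: the path is a placeholder for wherever a worker's output is expected.
path = joinpath(mktempdir(), "worker-1.out")
@async (sleep(0.5); touch(path))      # simulate a worker creating its file later
found = wait_for_file(path; timeout=5, interval=0.1)
println(found ? "worker output found" : "timed out; check that the output directory is shared")
```

A timeout does not fix the underlying problem, but it turns a silent hang into an error that says which file was expected and where.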
Hi @bjarthur, I have just tried
Do you have an idea of what may be happening?
have you tried cutting and pasting the command in the error message directly into a bash terminal? breaking it down into parts would help isolate the problem:
i don't have access to a PBS cluster anymore, so am not able to provide much more help.
Hi @bjarthur, sorry for the delay in replying. I tried copying the commands on the root node, but the first command hangs indefinitely. Do you have any other suggestions on how to debug this?
Any progress on this? I tried again on a different PBS cluster. Still the same issue.
I think the best hope in this case is MPI.jl; it now defines an MPIManager that should work in theory. I will try it as soon as I find some time.
We are trying to revive the package. Please review the latest stable release (released today) and report any issues. PRs are more than welcome!
I'm trying to use ClusterManagers on a PBS cluster (e.g. interactively):
```
julia> using ClusterManagers
julia> addprocs_pbs(2, queue="default")
job id is 135963, waiting for job to start ................................................................
```
The job seems to hang even though qstat shows it as running:
Job id            Name         User     Time Use  S  Queue
135963[].pippen   julia-26303  snirgaz  0         R  default
Any thoughts?