Contributor

vkarak commented Apr 22, 2021

If `qstat` returns 153, we set the job status to completed but did not set the `_completed` variable. This is the variable that is queried to check whether the job has actually completed. We don't check only the job state because, once a test has finished, we must still wait until its stdout and stderr are copied back to the test's working directory.
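
To illustrate the mismatch described above, here is a minimal sketch, not the actual ReFrame code. Only the `_completed` name and the exit code 153 come from this PR; `SketchJob`, `poll`, `finished` and the `apply_fix` switch are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

# Exit code of `qstat` discussed in this PR.
QSTAT_EXIT_CODE = 153


@dataclass
class SketchJob:
    """Hypothetical stand-in for a scheduler job object."""
    jobid: str
    state: str = 'QUEUED'
    _completed: bool = False  # the flag that the completion check reads


def poll(job, qstat_returncode, apply_fix=True):
    """Update `job` from the return code of one (simulated) qstat call."""
    if qstat_returncode == QSTAT_EXIT_CODE:
        job.state = 'COMPLETED'
        if apply_fix:
            # Without this assignment the state says COMPLETED, but the
            # completion check below keeps returning False.
            job._completed = True


def finished(job):
    """The completion check looks at the flag, not at the state."""
    return job._completed


fixed = SketchJob('1001')
poll(fixed, QSTAT_EXIT_CODE)

buggy = SketchJob('1002')
poll(buggy, QSTAT_EXIT_CODE, apply_fix=False)
```

In this sketch both jobs end up with `state == 'COMPLETED'`, but only `fixed` is reported as finished. A caller that loops until `finished(job)` is true would never leave the loop for `buggy`.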

Addresses #1930.

@cblackworth Could you check if this fix works for you?

In case `qstat` returns 153, although we did set the job status as completed, we
did not set the `_completed` variable.

@vkarak added this to the ReFrame sprint 21.04.2 milestone Apr 22, 2021
@vkarak requested a review from ekouts April 22, 2021 11:45
@vkarak self-assigned this Apr 22, 2021

Contributor Author

vkarak commented Apr 22, 2021

@cblackworth Hmm, looking again at the issue, this will likely solve the hang that you get if you disable the job history. It will not solve the previous error.

Contributor Author

vkarak commented Apr 22, 2021

@cblackworth I've pushed another patch that should fix the original issue. Let me know if this works for you. I don't have access to a PBS scheduler that supports the `-x` option, but I understand what the problem is.

Contributor Author

vkarak commented May 17, 2021

I will merge this one, since it indeed fixes a bug, but I will keep the linked issue open to follow up on the discussions.

@codecov-commenter

Codecov Report

Merging #1944 (eb1e7c1) into master (f389456) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #1944      +/-   ##
==========================================
- Coverage   87.51%   87.50%   -0.02%     
==========================================
  Files          50       50              
  Lines        8713     8714       +1     
==========================================
  Hits         7625     7625              
- Misses       1088     1089       +1     

| Impacted Files | Coverage Δ | |
|---|---|---|
| reframe/core/schedulers/pbs.py | `47.26% <0.00%> (-0.33%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@vkarak merged commit d5bb8b0 into reframe-hpc:master May 17, 2021
@vkarak deleted the bugfix/openbs-hang branch May 17, 2021 17:47