Conversation

@giordano
Contributor

@giordano giordano commented Sep 11, 2023

This is a start to address #2936. It doesn't quite work as desired at the moment, because I don't know exactly what to do with the exit status: how do I use it to make ReFrame error out if the job failed?

Closes #2936.

@jenkins-cscs
Collaborator

Can I test this patch?

@codecov

codecov bot commented Sep 11, 2023

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (6cdf0bb) 87.28% compared to head (519b70f) 87.23%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2993      +/-   ##
==========================================
- Coverage   87.28%   87.23%   -0.05%     
==========================================
  Files          60       60              
  Lines       11707    11716       +9     
==========================================
+ Hits        10219    10221       +2     
- Misses       1488     1495       +7     
Files                            Coverage Δ
reframe/core/schedulers/pbs.py   44.72% <22.22%> (-1.34%) ⬇️


@vkarak
Contributor

vkarak commented Sep 11, 2023

I would suggest re-opening this as a bugfix against master.

@giordano giordano changed the base branch from develop to master September 12, 2023 13:36
@giordano
Contributor Author

Changed target branch to master.

@giordano giordano marked this pull request as ready for review September 12, 2023 14:31
@vkarak vkarak requested review from ekouts and vkarak September 12, 2023 14:45
@vkarak vkarak changed the title Check exit status of PBS Pro jobs [bugfix] Check exit status of PBS Pro jobs Sep 12, 2023
@vkarak vkarak added this to the ReFrame 4.4 milestone Sep 12, 2023
@giordano
Contributor Author

Ok, I think this is ready for review. Before, it wasn't working because I made a mistake in the regexp pattern: I didn't use the re.MULTILINE flag, so ^ wasn't matching the beginning of a line within the whole string. After fixing that, the build job is failing as expected. It looks like setting job._exitcode is enough for ReFrame to handle this correctly.
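
For readers unfamiliar with the re.MULTILINE issue described above, here is a minimal sketch of the behaviour. The sample scheduler output and the pattern are assumptions made up for illustration; they are not taken from the code in this PR.

```python
import re

# Hypothetical excerpt of scheduler status output; the exact fields and
# layout are an assumption for this example only.
sample_output = """Job Id: 1234.pbs
    Job_Name = rfm_build
    job_state = F
    Exit_status = 2
"""

# Illustrative pattern (not necessarily the one used in the PR)
pattern = r'^\s*Exit_status = (?P<code>\d+)'

# Without re.MULTILINE, ^ matches only at the very start of the string,
# so a field that appears on a later line is never found.
no_flag = re.search(pattern, sample_output)

# With re.MULTILINE, ^ also matches immediately after each newline.
with_flag = re.search(pattern, sample_output, re.MULTILINE)

print(no_flag)                  # no match object
print(with_flag.group('code'))  # the captured digits, as a string
```

Note that the captured group is a string, not an integer.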

@giordano
Contributor Author

giordano commented Sep 12, 2023

Uhm, I think this now also fails when jobs are successful, but I don't understand why 😕

Edit: fixed (had to force the type of job._exitcode to int), now it should be all good.
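
A plausible reason for this, which the thread itself does not spell out, is that a regexp capture group is always a string, and the string '0' is not equal to the integer 0. The sketch below shows the effect with a hypothetical `job_failed` helper; it is an assumption for illustration and not ReFrame's actual check.

```python
import re

def job_failed(exitcode):
    # Hypothetical check: any exit code other than integer 0 is a failure
    return exitcode != 0

# Illustrative status line for a successful job (assumed format)
match = re.search(r'^\s*Exit_status = (?P<code>\d+)',
                  '    Exit_status = 0', re.MULTILINE)

raw_code = match.group('code')  # capture groups are str: '0'
int_code = int(raw_code)        # forced to int: 0

print(job_failed(raw_code))  # True: '0' != 0, so a successful job looks failed
print(job_failed(int_code))  # False: 0 == 0
```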

@vkarak
Contributor

vkarak commented Sep 16, 2023

ok to test

@vkarak
Contributor

vkarak commented Sep 22, 2023

@giordano I updated the PR based on the last comments. Can you give it a try and check if anything is broken so that we can merge it?

@vkarak vkarak changed the title [bugfix] Check exit status of PBS Pro jobs [bugfix] Retrieve exit status of PBS Pro jobs Sep 29, 2023
@vkarak vkarak changed the title [bugfix] Retrieve exit status of PBS Pro jobs [bugfix] Properly retrieve exit status for PBS Pro jobs Sep 29, 2023
@vkarak vkarak merged commit df18164 into reframe-hpc:master Sep 29, 2023
@giordano giordano deleted the mg/pbs-job-status branch October 5, 2023 12:11
@giordano
Contributor Author

giordano commented Oct 5, 2023

Sorry, I just managed to run the test with the latest version of develop (which includes the latest changes in this PR), and I can confirm that the exit status of the build job is handled correctly in PBS Pro. Thanks!


Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

When using pbs scheduler, failures during non-local build jobs aren't fatal

4 participants