[bugfix] Fix early abort when compile step fails with async policy #1331
|
Hello @sleak-lbl, Thank you for updating! Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide. Comment last updated at 2020-05-28 05:12:54 UTC |
|
Can I test this patch? |
Codecov Report
@@            Coverage Diff             @@
##           master    #1331      +/-   ##
==========================================
+ Coverage   91.75%   91.77%   +0.01%
==========================================
  Files          83       83
  Lines       12493    12523      +30
==========================================
+ Hits        11463    11493      +30
  Misses       1030     1030
|
…frame into bugfix/async-compile-aborts
|
I've just had an early abort with this patch in place, so it's clearly not the solution (or at least, not the full solution). I've added a WIP marker to the PR and am digging a bit more now. |
|
Check my comment in the issue for a possible cause of this bug. |
As a result of another task's exit.
|
@sleak-lbl Can you test this PR now with a real workload? |
ekouts
left a comment
lgtm
vkarak
left a comment
The current fix lgtm. @sleak-lbl did you have time to try it? My plan is to merge this PR for ReFrame 3.0 since it indeed fixes a bug, which I hope is your case.
|
Tested with a real workload and verified it worked (and, as far as I can tell, worked for the right reasons), so I think it is good to go. |
|
Thanks for testing and confirming @sleak-lbl ! |
Fixes #1330
Exceptions raised from build tasks within compile_wait with the async policy are not caught by reschedule_task, and so bubble up and stop the reframe process. Any queued jobs then get abandoned.
This PR catches exceptions that should cause the test, but not the whole reframe run, to fail.
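The general pattern described above can be illustrated with a minimal, self-contained sketch. This is not ReFrame's actual code: the `Task` class, the `TaskError` exception, and the `run_all` loop are hypothetical names invented for illustration, and only the idea (a failure in one task's build wait marks that task as failed instead of aborting the whole run) is taken from the description.

```python
class TaskError(Exception):
    """Hypothetical error type that should fail one test, not the whole run."""


class Task:
    """Hypothetical stand-in for a scheduled test task."""

    def __init__(self, name, fail_on_compile_wait=False):
        self.name = name
        self.fail_on_compile_wait = fail_on_compile_wait
        self.failed = False
        self.finished = False

    def compile_wait(self):
        # Simulate a build that fails while the policy waits on it.
        if self.fail_on_compile_wait:
            raise TaskError(f'{self.name}: build failed')

    def run(self):
        self.finished = True


def run_all(tasks):
    """Advance every task; a failing build marks only that task as failed."""
    failures = []
    for task in tasks:
        try:
            task.compile_wait()
            task.run()
        except TaskError as err:
            # Record the failure and move on to the remaining queued tasks.
            task.failed = True
            failures.append((task.name, str(err)))
        # Any other exception type still propagates and aborts the run.
    return failures


tasks = [Task('t1'), Task('t2', fail_on_compile_wait=True), Task('t3')]
failures = run_all(tasks)
```

In this sketch the task queued after the failing one (`t3`) still runs, which is the behaviour the PR aims for; without the `try`/`except`, the exception from `t2` would leave `t3` abandoned.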