
Conversation

@rsarm rsarm commented Dec 10, 2019

Closes #881

pep8speaks commented Dec 10, 2019

Hello @rsarm, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request! Do see the ReFrame Coding Style Guide

Comment last updated at 2020-03-12 11:03:42 UTC

@vkarak vkarak left a comment

Please (a) fix the unit tests and (b) implement this feature for all the scheduler backends except the local one.
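
For readers following along, here is a minimal standalone sketch of the behaviour being requested: while a job is still pending, compare the elapsed time since submission against a maximum pending time and cancel the job once it is exceeded. All names below (`PendingTimeoutError`, `wait_while_pending`, `is_pending`, `cancel`) are illustrative assumptions, not the actual ReFrame scheduler API.

```python
import time


class PendingTimeoutError(Exception):
    '''Raised when a job stays in the queue longer than allowed.'''


def wait_while_pending(job, max_pending_time, poll_interval=1):
    '''Poll a pending job and cancel it if it queues for too long.

    The `job` object is assumed to expose `is_pending()` and `cancel()`;
    each backend (Slurm, PBS, Torque) would supply its own implementation.
    '''
    submit_time = time.time()
    while job.is_pending():
        if time.time() - submit_time > max_pending_time:
            job.cancel()
            raise PendingTimeoutError(
                f'maximum pending time of {max_pending_time}s exceeded'
            )

        time.sleep(poll_interval)
```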

@rsarm rsarm changed the title [wip][feat] Add maximum queuing time to jobs [feat] Add maximum queuing time to jobs Dec 16, 2019
@vkarak vkarak removed this from the ReFrame sprint 2019w50 milestone Dec 17, 2019
@vkarak vkarak added this to the ReFrame sprint 20.01 milestone Jan 21, 2020
@vkarak vkarak changed the title [feat] Add maximum queuing time to jobs [wip] [feat] Add maximum queuing time to jobs Jan 22, 2020
@rsarm rsarm changed the title [wip] [feat] Add maximum queuing time to jobs [feat] Add maximum queuing time to jobs Feb 25, 2020

codecov-io commented Feb 25, 2020

Codecov Report

Merging #1099 into master will decrease coverage by 0.15%.
The diff coverage is 44.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1099      +/-   ##
==========================================
- Coverage   92.07%   91.92%   -0.16%     
==========================================
  Files          84       84              
  Lines       12171    12209      +38     
==========================================
+ Hits        11207    11223      +16     
- Misses        964      986      +22
Impacted Files                        Coverage Δ
reframe/core/schedulers/slurm.py      53.51% <0%> (-1.32%) ⬇️
reframe/core/schedulers/pbs.py        67.85% <0%> (-0.82%) ⬇️
reframe/core/pipeline.py              92.39% <100%> (+0.02%) ⬆️
reframe/core/schedulers/torque.py     33.33% <20%> (-1.81%) ⬇️
unittests/test_schedulers.py          95.37% <57.14%> (-1.49%) ⬇️
reframe/core/schedulers/__init__.py   95.48% <75%> (-0.48%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a32e677...99188bf. Read the comment docs.

@vkarak vkarak requested a review from ekouts February 26, 2020 14:08

ekouts commented Feb 26, 2020

You don't have unit tests that test this for Slurm/PBS, but I am not sure how you could test it. Maybe you could set a very low max_pending_time to make sure it cancels the job. I don't know whether @vkarak has better suggestions.
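
Along the lines of this suggestion, a sketch of such a test, reusing the illustrative `wait_while_pending`/`PendingTimeoutError` helpers from the sketch earlier in this thread; the fake job below only stands in for a real Slurm/PBS job, and none of these names come from the actual test suite.

```python
import pytest


class FakePendingJob:
    '''A job that never leaves the pending state unless cancelled.'''

    def __init__(self):
        self.cancelled = False

    def is_pending(self):
        return not self.cancelled

    def cancel(self):
        self.cancelled = True


def test_max_pending_time_cancels_job():
    job = FakePendingJob()

    # A very low max_pending_time guarantees the timeout triggers quickly.
    with pytest.raises(PendingTimeoutError):
        wait_while_pending(job, max_pending_time=0.1, poll_interval=0.01)

    assert job.cancelled
```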

@vkarak vkarak left a comment

We should definitely have some sort of unit tests as well. The PBS implementation is broken and this is not caught.

@ekouts ekouts left a comment

lgtm

@vkarak vkarak left a comment

I have just a minor comment.

vkarak commented Mar 11, 2020

@rsarm I am going to fix this PR now.

Also:

- Enable max_pending_time unit test for the Torque backend.

@vkarak vkarak left a comment

LGTM now. The only thing to be determined is whether to treat this timeout as an error or not.

@vkarak vkarak changed the title [feat] Add maximum queuing time to jobs [feat] Allow tests to timeout if their associated job is pending for too long Mar 12, 2020
@vkarak vkarak merged commit db805c2 into reframe-hpc:master Mar 12, 2020
@rsarm rsarm deleted the feat/queuing-timeout branch March 10, 2021 08:38


Development

Successfully merging this pull request may close these issues.

Cancel job after a certain queueing time
