Skip to content

Conversation

@vkarak
Copy link
Contributor

@vkarak vkarak commented Oct 27, 2019

This PR replaces #944. It is a much simpler implementation and provides the basis for the support of dependencies in the asynchronous execution policy.

Some highlights of the implementation:

  • The initial value of the reference counter for cleaning up a test is determined by the in-degree of each node in the dependency graph and is calculated during the test graph construction.
  • The serial execution policy is now also event-based. The cleanup phase is deferred and tasks are appended to a list for finalization. As soon as a test succeeds, the reference count of all its dependencies is decreased and then all pending tasks are tried for cleanup. A new finalize pseudo pipeline phase was added for this purpose, that if reached the test is marked as "success." This phase lies between the performance and the cleanup phase. We could not mark as "success" the end of the performance phase, because it will not be executed if --skip-performance-check option is passed.
  • A new unit test is added that exercises the execution of dependent tests. A set of interdependent tests is created, such that each tests needs the output of its "parents". Namely, each test writes a number in a file in its stage directory, which is the sum of the numbers it had read from its "parents" and its own id. A sanity check, the expected sum is checked. This arrangement of tests allows checking at once that the dependency order is respected during execution and that the resources of the tests are not cleaned up prematurely.
  • Failures of dependent tests due to their parents failing are handled cleanly by simply raising a TaskDependencyError if any of the current test's parents has failed.

Other notes:

A side effect of the reference counting for cleaning up is that a successful test will not be cleaned up (meaning also that its resources will not be copied to the output directory) if any of its immediately dependent tests fail. This is useful for debugging the failing tests, but it might be a bit confusing if you try to look it up in the output directory.

Fixes UES-285.

@vkarak vkarak requested a review from ekouts October 28, 2019 00:06
Order of tests is important for some of the async policy tests.
Sort the cases topologically only if needed.
@vkarak
Copy link
Contributor Author

vkarak commented Oct 28, 2019

@jenkins-cscs retry daint

@codecov-io
Copy link

codecov-io commented Oct 28, 2019

Codecov Report

Merging #1015 into master will increase coverage by 0.12%.
The diff coverage is 98.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1015      +/-   ##
==========================================
+ Coverage   91.84%   91.97%   +0.12%     
==========================================
  Files          80       81       +1     
  Lines       10734    10937     +203     
==========================================
+ Hits         9859    10059     +200     
- Misses        875      878       +3
Impacted Files Coverage Δ
unittests/resources/checks_unlisted/deps_simple.py 100% <ø> (ø)
reframe/frontend/executors/__init__.py 97.87% <100%> (-0.36%) ⬇️
reframe/core/exceptions.py 82.44% <100%> (+0.13%) ⬆️
reframe/frontend/cli.py 80.63% <100%> (+0.24%) ⬆️
unittests/test_policies.py 98.86% <100%> (+0.13%) ⬆️
reframe/frontend/executors/policies.py 97.76% <96.55%> (+0.72%) ⬆️
...nittests/resources/checks_unlisted/deps_complex.py 97.67% <97.67%> (ø)
reframe/frontend/dependency.py 96.77% <97.91%> (-0.17%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8840718...5b915be. Read the comment docs.

@vkarak vkarak changed the title [feat] Add support for test dependencies in serial execution policy [wip] [feat] Add support for test dependencies in serial execution policy Oct 28, 2019
@vkarak
Copy link
Contributor Author

vkarak commented Oct 28, 2019

I still need to disable this feature with the retries.

Retries are now supported ;-)

@pep8speaks
Copy link

pep8speaks commented Oct 29, 2019

Hello @vkarak, Thank you for updating!

Cheers! There are no PEP8 issues in this Pull Request!Do see the ReFrame Coding Style Guide

Comment last updated at 2019-10-31 10:50:40 UTC

@vkarak vkarak requested a review from ekouts October 29, 2019 23:38
@vkarak vkarak changed the title [wip] [feat] Add support for test dependencies in serial execution policy [feat] Add support for test dependencies in serial execution policy Oct 29, 2019
Copy link
Contributor

@ChristopherBignamini ChristopherBignamini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vkarak vkarak added this to the ReFrame sprint 2019w44 milestone Oct 31, 2019
@vkarak vkarak merged commit 55dd37f into reframe-hpc:master Oct 31, 2019
@vkarak vkarak deleted the feat/deps-serial-execution branch October 31, 2019 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants