Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally split regression tests into test groups #207

Merged
merged 2 commits into from Aug 8, 2017
Merged

Conversation

4tXJ7f
Copy link
Member

@4tXJ7f 4tXJ7f commented Jul 26, 2017

To prevent timing out on Travis, this commit adds the option to split
the regression tests into groups and run each group in a separate job.
The group of a test is determined by computing a checksum of its name.

To prevent timing out on Travis, this commit adds the option to split
the regression tests into groups and run each group in a separate job.
The group of a test is determined by computing a checksum of its name.
@@ -49,6 +49,16 @@ cvc4=$1
benchmark_orig=$2
benchmark="$benchmark_orig"

if [ ! -z "$TEST_GROUP" ]; then

This comment was marked as spam.

Copy link
Contributor

@timothy-king timothy-king left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we should go with this suggestion or to use the existing explicit mechanism of breaking up the regressions into groups. regress0, regress1, etc. This keeps the labeling obvious at the expense of needing to maintain it. This suggested system of sharding gives up a lot of control in exchange for easily breaking into N random shards in the future. It is not obvious to me which is better.

@4tXJ7f
Copy link
Member Author

4tXJ7f commented Jul 27, 2017

I definitely hear your concerns about giving up control. Also, the proposed solution is not a very elegant one. To an extent, however, the regressX folders and TEST_GROUPs are orthogonal. The idea of this commit was to make it possible to split up one job into an arbitrary number of jobs (the slowest job is run as two jobs with the changes in .travis.yml). As far as I know, we are not using regressX for this purpose (if I am not mistaken, we only run regress0 on Travis). Either way, we should probably come up with a clear purpose for each of the regressX folders and a protocol for adding new tests, keeping in mind our limited resources.

@4tXJ7f
Copy link
Member Author

4tXJ7f commented Jul 27, 2017

@mpreiner counted the test cases in each regressX directory:

regress0: 1538
regress1: 30
regress2: 36
regress3: 12
regress4: 6

(just copying it here because it might be relevant to the discussion)

@timothy-king
Copy link
Contributor

As far as I know, we are not using regressX for this purpose (if I am not mistaken, we only run regress0 on Travis).
Historical context: Originally the regressX directories were a scheme for running more expensive tests less frequently but still having them checked in and available. So regressX should be faster and run more often than regressX+1. But along the way regress0 became synonym with make check, which itself became the minimum for checking in code. Over time, all of the tests were just added to regress0. The incentives encourage this as adding a test makes a no big difference in personal running time and it is the natural home for always having the test run, which at the moment you definitely want. Tests that take so long can then just not listed in the Makefile.am files. So the original idea got out of sync with the maintenance of the tests.

So nothing really stops us from going back and cleaning these up and having cleaner regress groups with regress0 being tied to make check. We can also additionally have optional sharding. This is largely about preferences and making a benefits vs. time call. I suggest we talk about this at next week's meeting as it is a good candidate for folks expressing preferences.

@kbansal
Copy link
Contributor

kbansal commented Jul 28, 2017

Uncomment ulimit line in this to detect long tests:

https://github.com/CVC4/CVC4/blob/master/test/regress/run_regression

I think the requirement was 1 sec in debug mode (on Dejan's machine? ) for it to be in regress0, but with increasing number of tests maybe that is not sufficient :)

Copy link
Contributor

@timothy-king timothy-king left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the meeting, I'm convinced this is a reasonable way forward.

Let's update the docs for test/regress/run_regression to describe the effects of the flags.

@@ -49,6 +49,16 @@ cvc4=$1
benchmark_orig=$2
benchmark="$benchmark_orig"

if [ ! -z "$TEST_GROUP" ]; then

This comment was marked as spam.

This comment was marked as spam.

@4tXJ7f 4tXJ7f merged commit e73457b into master Aug 8, 2017
@4tXJ7f 4tXJ7f deleted the test_groups branch August 8, 2017 01:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants