Optionally split regression tests into test groups #207

4tXJ7f · 2017-07-26T19:03:01Z

To prevent timing out on Travis, this commit adds the option to split
the regression tests into groups and run each group in a separate job.
The group of a test is determined by computing a checksum of its name.

To prevent timing out on Travis, this commit adds the option to split the regression tests into groups and run each group in a separate job. The group of a test is determined by computing a checksum of its name.

test/regress/run_regression

@@ -49,6 +49,16 @@ cvc4=$1
 benchmark_orig=$2
 benchmark="$benchmark_orig"

+if [ ! -z "$TEST_GROUP" ]; then


timothy-king

I am not sure if we should go with this suggestion or to use the existing explicit mechanism of breaking up the regressions into groups. regress0, regress1, etc. This keeps the labeling obvious at the expense of needing to maintain it. This suggested system of sharding gives up a lot of control in exchange for easily breaking into N random shards in the future. It is not obvious to me which is better.

4tXJ7f · 2017-07-27T09:37:06Z

I definitely hear your concerns about giving up control. Also, the proposed solution is not a very elegant one. To an extent, however, the regressX folders and TEST_GROUPs are orthogonal. The idea of this commit was to make it possible to split up one job into an arbitrary number of jobs (the slowest job is run as two jobs with the changes in .travis.yml). As far as I know, we are not using regressX for this purpose (if I am not mistaken, we only run regress0 on Travis). Either way, we should probably come up with a clear purpose for each of the regressX folders and a protocol for adding new tests, keeping in mind our limited resources.

4tXJ7f · 2017-07-27T09:39:20Z

@mpreiner counted the test cases in each regressX directory:

regress0: 1538
regress1: 30
regress2: 36
regress3: 12
regress4: 6

(just copying it here because it might be relevant to the discussion)

timothy-king · 2017-07-28T03:03:51Z

As far as I know, we are not using regressX for this purpose (if I am not mistaken, we only run regress0 on Travis).
Historical context: Originally the regressX directories were a scheme for running more expensive tests less frequently but still having them checked in and available. So regressX should be faster and run more often than regressX+1. But along the way regress0 became synonym with make check, which itself became the minimum for checking in code. Over time, all of the tests were just added to regress0. The incentives encourage this as adding a test makes a no big difference in personal running time and it is the natural home for always having the test run, which at the moment you definitely want. Tests that take so long can then just not listed in the Makefile.am files. So the original idea got out of sync with the maintenance of the tests.

So nothing really stops us from going back and cleaning these up and having cleaner regress groups with regress0 being tied to make check. We can also additionally have optional sharding. This is largely about preferences and making a benefits vs. time call. I suggest we talk about this at next week's meeting as it is a good candidate for folks expressing preferences.

kbansal · 2017-07-28T17:15:18Z

Uncomment ulimit line in this to detect long tests:

https://github.com/CVC4/CVC4/blob/master/test/regress/run_regression

I think the requirement was 1 sec in debug mode (on Dejan's machine? ) for it to be in regress0, but with increasing number of tests maybe that is not sufficient :)

timothy-king

After the meeting, I'm convinced this is a reasonable way forward.

Let's update the docs for test/regress/run_regression to describe the effects of the flags.

test/regress/run_regression

@@ -49,6 +49,16 @@ cvc4=$1
 benchmark_orig=$2
 benchmark="$benchmark_orig"

+if [ ! -z "$TEST_GROUP" ]; then


Optionally split regression tests into test groups

8d9eab9

To prevent timing out on Travis, this commit adds the option to split the regression tests into groups and run each group in a separate job. The group of a test is determined by computing a checksum of its name.

4tXJ7f requested review from timothy-king and mpreiner July 26, 2017 19:03

timothy-king reviewed Jul 27, 2017

View reviewed changes

test/regress/run_regression Outdated

@@ -49,6 +49,16 @@ cvc4=$1

benchmark_orig=$2

benchmark="$benchmark_orig"

if [ ! -z "$TEST_GROUP" ]; then

This comment was marked as spam.

Sign in to view

timothy-king reviewed Jul 27, 2017

View reviewed changes

timothy-king approved these changes Aug 3, 2017

View reviewed changes

mpreiner reviewed Aug 3, 2017

View reviewed changes

test/regress/run_regression Outdated

@@ -49,6 +49,16 @@ cvc4=$1

benchmark_orig=$2

benchmark="$benchmark_orig"

if [ ! -z "$TEST_GROUP" ]; then

This comment was marked as spam.

Sign in to view

This comment was marked as spam.

Sign in to view

mpreiner approved these changes Aug 3, 2017

View reviewed changes

Add documentation, minor fix

f15e474

4tXJ7f merged commit e73457b into master Aug 8, 2017

4tXJ7f deleted the test_groups branch August 8, 2017 01:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optionally split regression tests into test groups #207

Optionally split regression tests into test groups #207

4tXJ7f commented Jul 26, 2017

This comment was marked as spam.

timothy-king left a comment

4tXJ7f commented Jul 27, 2017

4tXJ7f commented Jul 27, 2017

timothy-king commented Jul 28, 2017

kbansal commented Jul 28, 2017

timothy-king left a comment

This comment was marked as spam.

This comment was marked as spam.

Optionally split regression tests into test groups #207

Optionally split regression tests into test groups #207

Conversation

4tXJ7f commented Jul 26, 2017

This comment was marked as spam.

timothy-king left a comment

Choose a reason for hiding this comment

4tXJ7f commented Jul 27, 2017

4tXJ7f commented Jul 27, 2017

timothy-king commented Jul 28, 2017

kbansal commented Jul 28, 2017

timothy-king left a comment

Choose a reason for hiding this comment

This comment was marked as spam.

This comment was marked as spam.