
Refactor system test scripts to make it easy to add new tests #146

Closed

billsacks opened this issue Aug 31, 2015 · 7 comments

Comments

@billsacks (Member)

There is currently a lot of duplication between the different system test scripts (the scripts in Testcases). I know there are plans to rewrite these in perl or python, and my thoughts can (and should!) wait until that is done. But I wanted to create this issue now, as a place to record thoughts about how these should be refactored.

At a broad level, I can see two possible approaches; I give examples referencing my new LII test (#145):

(1) Have a library of shared test routines. Your test would then call the appropriate routines. My new LII test, for example, could then look something like this:

initial_test_setup()
cp user_nl_nointerp/* .
do_run(suffix='base')
cp user_nl_interp/* .
do_run(suffix='interp')
compare_runs(suffix1='base', suffix2='interp')
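To make that concrete, here is a minimal sketch of what approach (1) could look like in Python. Everything here is hypothetical: the routine names simply mirror the pseudocode above, and the bodies of the shared routines are elided.

import glob
import shutil

# Shared library routines (bodies elided in this sketch):

def initial_test_setup():
    """Create and configure the case."""
    ...

def do_run(suffix):
    """Build and run the case, archiving output under `suffix`."""
    ...

def compare_runs(suffix1, suffix2):
    """Compare the archived output of two runs bit-for-bit."""
    ...

def copy_user_nl(src_dir):
    """Stand-in for the `cp user_nl_*/* .` steps above."""
    for path in glob.glob(src_dir + "/*"):
        shutil.copy(path, ".")

# The LII test is then just a linear script over the shared routines:
initial_test_setup()
copy_user_nl("user_nl_nointerp")
do_run(suffix="base")
copy_user_nl("user_nl_interp")
do_run(suffix="interp")
compare_runs(suffix1="base", suffix2="interp")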

(2) Use a template method approach. There would be an abstract base class like two_run_test (for tests where you do two different runs and compare them to ensure they are bit-for-bit identical). This base class would have a template method for running the test, which would look like:

initial_test_setup()
setup_run1()
do_run1()
setup_run2()
do_run2()
compare_runs()

setup_run1 and setup_run2 would need to be implemented by each specific test; the other routines would be common to all tests (but could be overridden / extended by specific tests). My new LII test would have:

def setup_run1():
    cp user_nl_nointerp/* .

def setup_run2():
    cp user_nl_interp/* .
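
Again as a hypothetical sketch (the class and method names are illustrative, mirroring the pseudocode above), the template method approach could look like:

import glob
import shutil

class TwoRunTest:
    """Abstract base class for tests that do two runs and compare them."""

    def run_test(self):
        # The template method: a fixed skeleton that calls the hooks below.
        self.initial_test_setup()
        self.setup_run1()
        self.do_run(suffix="run1")
        self.setup_run2()
        self.do_run(suffix="run2")
        self.compare_runs(suffix1="run1", suffix2="run2")

    # Hooks each concrete test must implement:
    def setup_run1(self):
        raise NotImplementedError

    def setup_run2(self):
        raise NotImplementedError

    # Behavior shared by all tests (overridable; bodies elided here):
    def initial_test_setup(self): ...
    def do_run(self, suffix): ...
    def compare_runs(self, suffix1, suffix2): ...

class LIITest(TwoRunTest):
    """The LII test only has to say how each run is set up."""

    def setup_run1(self):
        for path in glob.glob("user_nl_nointerp/*"):
            shutil.copy(path, ".")

    def setup_run2(self):
        for path in glob.glob("user_nl_interp/*"):
            shutil.copy(path, ".")

LIITest().run_test()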
@billsacks (Member Author)

cc @sholly @mvertens

@billsacks (Member Author)

Another strategy to accomplish this would be to allow tests to specify two testmods directories: the test would run a case with each testmods directory (probably as a ref case and then the actual case) and compare the two.

@billsacks (Member Author)

Moving bugzilla bug 1852 here.... This partly duplicates what I have above, but not entirely.

Here was the original bugzilla enhancement request (from me):

This is an enhancement that has been on my mental wish list for a little while... it's probably a big project, although maybe with the testmods directories an initial version could be implemented without too much trouble....

I'd love to have the ability to create a test that compares two cases with an arbitrary set of differences. For example, compare one case that doesn't use a testmods directory with a similar case that uses a particular testmods directory. We already have quite a few special cases of this, so this would allow for the general case. That way you could ensure that a particular namelist option doesn't change answers (see, for example, comment 1 in bug 1851).

This addition to the test system is reminiscent of the NOAA system presented by Tom Henderson / Paul Madden, in which they use large groups of tests that are all supposed to give identical answers – e.g., they might include exact restart, threading, and different processor counts all in the same compare group. The first test in the compare group is taken to be the correct version, and each other test is compared against that 'master'.
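
As a hedged illustration of that idea: the grouping format below is invented, but ERS, PET, and PEM are standard CESM test types for exact restart, threading, and changed processor counts, respectively.

# Hypothetical compare group, expressed as Python data. The first entry is
# taken to be the correct answer; every other member is compared against it.
compare_group = {
    "master": "SMS.f19_g16.I",      # plain smoke test, treated as 'correct'
    "members": [
        "ERS.f19_g16.I",            # exact restart
        "PET.f19_g16.I",            # threading
        "PEM.f19_g16.I",            # different processor counts
    ],
}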

Some extra credit items that could be considered:

(1) Allow for simple expressions specifying the expected results of the test. Thus, rather than just testing for equality, you could have a test that says something like, "if I do two runs that are the same, except irrigation is turned on in run B, then run B should have a higher global average latent heat flux" (to confirm, to first order, that irrigation is working; a sketch of this appears after this list)

(2) Allow a test to point to a script that does some arbitrary set of commands in order to set up one or both runs. This would be useful, for example, in testing CLM's interpinic (see bug 1703, which proposes doing a test that involves multiple CESM runs as well as running CLM's interpinic).
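
A minimal sketch of what the expression in extra credit item (1) might look like, assuming a Python test framework. Here global_mean is an assumed helper, and EFLX_LH_TOT is CLM's total latent heat flux history field.

def global_mean(run_suffix, varname):
    """Assumed helper: area-weighted global mean of a history field."""
    ...

def check_irrigation_effect():
    # Rather than requiring bit-for-bit equality, assert a qualitative
    # relationship between the two runs:
    lh_base = global_mean("base", "EFLX_LH_TOT")
    lh_irrig = global_mean("irrig", "EFLX_LH_TOT")
    assert lh_irrig > lh_base, \
        "irrigation should increase global average latent heat flux"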

And here is comment 1, from Erik Kluzek:

We talked about this at our testing meeting and batted around a few different ideas. In the end, Jay's suggestion was that, with his refactoring, testing will be object-oriented with a base class. So the hope is that one of the class levels for a test could be extended to meet the requirements stated here.

One way to implement it might be to have an additional "test_mods directory" as part of the test name, and then another component giving the comparison operation to perform (which might be a script that could do something fairly complicated, like checking that some fields match exactly while others do NOT):

{test_mods_dir}.{2nd test_mods_dir to compare to}.{compare script to use}
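
For instance (a purely hypothetical instantiation of that scheme), a test name ending in

clm-default.clm-ciso.compare_clm_fields

would run the case once with the clm-default testmods, once with clm-ciso (carbon isotopes on), and hand both sets of output to a compare_clm_fields script that decides which fields must match and which must differ.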

@billsacks (Member Author)

Another example of a test that would be good to have: turning on carbon isotopes in CLM should not (I think) affect any of the standard outputs. This would have caught http://bugs.cgd.ucar.edu/show_bug.cgi?id=2207

@mnlevy1981

I'm just catching up on this issue, and noticed your comment from September:

Another strategy to accomplish this would be to allow tests to specify two testmods directories: the test would run a case with each testmods directory (probably as a ref case and then the actual case) and compare the two.

This is a feature that @klindsay28 has been asking about for a while, and it seems to also solve your issue from yesterday: add a testmods dir to turn on carbon isotopes, and compare to a run without carbon isotopes. (I think that's what you were implying with your comment, but I figured I'd come out and say it explicitly.)

So the 4th floor would be interested in seeing this feature implemented.

@billsacks billsacks self-assigned this May 4, 2016
@billsacks billsacks modified the milestones: cesm2, post cesm2 May 4, 2016
@billsacks (Member Author)

I'll check to what extent this is addressed in the new python-based tests.

@billsacks (Member Author)

This is somewhat addressed in the new python-based tests, though not completely. I am closing this one, and have opened a new issue in ESMCI: ESMCI#292
