
Refactor system test scripts to make it easy to add new tests #146

Closed

billsacks opened this issue Aug 31, 2015 · 7 comments

Comments

@billsacks (Member)

There is currently a lot of duplication between the different system test scripts (the scripts in Testcases). I know there are plans to rewrite these in perl or python, and my thoughts can (and should!) wait until that is done. But I wanted to create this issue now, as a place to record thoughts about how these should be refactored.

At a broad level, I can see two possible approaches; I give examples referencing my new LII test (#145):

(1) Have a library of shared test routines. Your test would then call the appropriate routines. My new LII test, for example, could then look something like this:

initial_test_setup()
cp user_nl_nointerp/* .
do_run(suffix='base')
cp user_nl_interp/* .
do_run(suffix='interp')
compare_runs(suffix1='base', suffix2='interp')
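To make that concrete, here is a minimal sketch of what approach (1) could look like in Python. Everything here is hypothetical: the routine names simply mirror the pseudocode above, and the bodies of the shared routines are elided.

import glob
import shutil

# Shared library routines (bodies elided in this sketch):

def initial_test_setup():
    """Create and configure the case."""
    ...

def do_run(suffix):
    """Build and run the case, archiving output under `suffix`."""
    ...

def compare_runs(suffix1, suffix2):
    """Compare the archived output of two runs bit-for-bit."""
    ...

def copy_user_nl(src_dir):
    """Stand-in for the `cp user_nl_*/* .` steps above."""
    for path in glob.glob(src_dir + "/*"):
        shutil.copy(path, ".")

# The LII test is then just a linear script over the shared routines:
initial_test_setup()
copy_user_nl("user_nl_nointerp")
do_run(suffix="base")
copy_user_nl("user_nl_interp")
do_run(suffix="interp")
compare_runs(suffix1="base", suffix2="interp")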

(2) Use a template method approach. There would be an abstract base class like two_run_test (for tests where you do two different runs and compare them to ensure they are bit-for-bit identical). This base class would have a template method for running the test, which would look like:

initial_test_setup()
setup_run1()
do_run1()
setup_run2()
do_run2()
compare_runs()

setup_run1 and setup_run2 would need to be implemented by each specific test; the other routines would be common to all tests (but could be overridden / extended by specific tests). My new LII test would have:

def setup_run1():
    cp user_nl_nointerp/* .

def setup_run2():
    cp user_nl_interp/* .
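
Again as a hypothetical sketch (the class and method names are illustrative, mirroring the pseudocode above), the template method approach could look like:

import glob
import shutil

class TwoRunTest:
    """Abstract base class for tests that do two runs and compare them."""

    def run_test(self):
        # The template method: a fixed skeleton that calls the hooks below.
        self.initial_test_setup()
        self.setup_run1()
        self.do_run(suffix="run1")
        self.setup_run2()
        self.do_run(suffix="run2")
        self.compare_runs(suffix1="run1", suffix2="run2")

    # Hooks each concrete test must implement:
    def setup_run1(self):
        raise NotImplementedError

    def setup_run2(self):
        raise NotImplementedError

    # Behavior shared by all tests (overridable; bodies elided here):
    def initial_test_setup(self): ...
    def do_run(self, suffix): ...
    def compare_runs(self, suffix1, suffix2): ...

class LIITest(TwoRunTest):
    """The LII test only has to say how each run is set up."""

    def setup_run1(self):
        for path in glob.glob("user_nl_nointerp/*"):
            shutil.copy(path, ".")

    def setup_run2(self):
        for path in glob.glob("user_nl_interp/*"):
            shutil.copy(path, ".")

LIITest().run_test()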
@billsacks (Member Author)

cc @sholly @mvertens

@billsacks (Member Author)

Another strategy to accomplish this would be to allow tests to specify two testmods directories: the test would run a case with each testmods directory (probably as a ref case and then the actual case) and compare the two.

@billsacks (Member Author)

Moving bugzilla bug 1852 here.... This partly duplicates what I have above, but not entirely.

Here was the original bugzilla enhancement request (from me):

This is an enhancement that has been on my mental wish list for a little while... it's probably a big project, although maybe with the testmods directories an initial version could be implemented without too much trouble....

I'd love to have the ability to create a test that compares two cases with an arbitrary set of differences. For example, compare one case that doesn't use a testmods directory with a similar case that uses a particular testmods directory. We already have quite a few special cases of this, so this would allow for the general case. That way you could ensure that a particular namelist option doesn't change answers (see, for example, comment 1 in bug 1851).

This addition to the test system is reminiscent of the NOAA system presented by Tom Henderson / Paul Madden, in which they use large groups of tests that are all supposed to give identical answers – e.g., they might include exact restart, threading, and different processor counts all in the same compare group. The first test in the compare group is taken to be the correct version, and each other test is compared against that 'master'.
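
As a hedged illustration of that idea: the grouping format below is invented, but ERS, PET, and PEM are standard CESM test types for exact restart, threading, and changed processor counts, respectively.

# Hypothetical compare group, expressed as Python data. The first entry is
# taken to be the correct answer; every other member is compared against it.
compare_group = {
    "master": "SMS.f19_g16.I",      # plain smoke test, treated as 'correct'
    "members": [
        "ERS.f19_g16.I",            # exact restart
        "PET.f19_g16.I",            # threading
        "PEM.f19_g16.I",            # different processor counts
    ],
}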

Some extra credit items that could be considered:

(1) Allow for simple expressions specifying the expected results of the test. Thus, rather than just testing for equality, you could have a test that says something like, "if I do two runs that are the same, except irrigation is turned on in run B, then run B should have a higher global average latent heat flux" (to confirm, to first order, that irrigation is working; a sketch of this appears after this list)

(2) Allow a test to point to a script that does some arbitrary set of commands in order to set up one or both runs. This would be useful, for example, in testing CLM's interpinic (see bug 1703, which proposes doing a test that involves multiple CESM runs as well as running CLM's interpinic).
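
A minimal sketch of what the expression in extra credit item (1) might look like, assuming a Python test framework. Here global_mean is an assumed helper, and EFLX_LH_TOT is CLM's total latent heat flux history field.

def global_mean(run_suffix, varname):
    """Assumed helper: area-weighted global mean of a history field."""
    ...

def check_irrigation_effect():
    # Rather than requiring bit-for-bit equality, assert a qualitative
    # relationship between the two runs:
    lh_base = global_mean("base", "EFLX_LH_TOT")
    lh_irrig = global_mean("irrig", "EFLX_LH_TOT")
    assert lh_irrig > lh_base, \
        "irrigation should increase global average latent heat flux"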

And here is comment 1, from Erik Kluzek:

We talked about this at our testing meeting and batted around a few different ideas. In the end, Jay's suggestion was that, with his refactoring, testing will be object-oriented with a base class. So the hope is that one of the class levels for a test could be extended to meet the requirements stated here.

One way to implement it might be to have an additional "test_mods directory" as part of the test name, and then another component giving the comparison operation to perform (which might be a script that could do something fairly complicated, like checking that some fields match exactly while others do NOT):

{test_mods_dir}.{2nd test_mods_dir to compare to}.{compare script to use}
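
For instance (a purely hypothetical instantiation of that scheme), a test name ending in

clm-default.clm-ciso.compare_clm_fields

would run the case once with the clm-default testmods, once with clm-ciso (carbon isotopes on), and hand both sets of output to a compare_clm_fields script that decides which fields must match and which must differ.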

@billsacks (Member Author)

Another example of a test that would be good to have: turning on carbon isotopes in CLM should not (I think) affect any of the standard outputs. This would have caught http://bugs.cgd.ucar.edu/show_bug.cgi?id=2207

@mnlevy1981

I'm just catching up on this issue, and noticed your comment from September:

Another strategy to accomplish this would be to allow tests to specify two testmods directories: the test would run a case with each testmods directory (probably as a ref case and then the actual case) and compare the two.

This is a feature that @klindsay28 has been asking about for a while, and it seems to also solve your issue from yesterday: add a testmods dir to turn on carbon isotopes, and compare to a run without carbon isotopes. (I think that's what you were implying with your comment, but I figured I'd come out and say it explicitly.)

So the 4th floor would be interested in seeing this feature implemented.

@billsacks billsacks self-assigned this May 4, 2016
@billsacks billsacks modified the milestones: cesm2, post cesm2 May 4, 2016
@billsacks (Member Author)

I'll check to what extent this is addressed in the new python-based tests.

@billsacks (Member Author)

This is somewhat addressed in the new python-based tests, though not completely. I am closing this one, and have opened a new issue in ESMCI: ESMCI#292
