Maintaining Regression Tests
The following instructions are intended to assist in maintaining regression test data for the JWST pipeline, via command-line and interactive tools.
You need the JFrog CLI to talk to Artifactory. Download it via
curl -fL https://getcli.jfrog.io | sh
Install it in $HOME/bin, making sure to chmod u+x it and making sure $HOME/bin is in your $PATH in your .bash_profile. The first time you run it, it will ask for configuration. Our Artifactory server is https://bytesalad.stsci.edu/artifactory/ and it accepts your AD (Active Directory) credentials. It will save these to a profile in $HOME/.jfrog/ so you don't have to enter them every time you use it. To configure:
$ jfrog config add https://bytesalad.stsci.edu --artifactory-url=https://bytesalad.stsci.edu/artifactory/
JFrog platform URL [https://bytesalad.stsci.edu/]:
JFrog Distribution URL (Optional):
Access token (Leave blank for username and password/API key):
User [<YOUR_USERNAME>]:
Password/API key: <YOUR_AD_PASSWORD>
Is the Artifactory reverse proxy configured to accept a client certificate? (y/n) [n]?
[Info] Encrypting password...
When you update your AD password, you will need to run
jfrog config edit https://bytesalad.stsci.edu
and go through the same menu options above to update the password, which it will cache (encrypted). Be sure to leave "Access Token" blank at the prompt, so it instead gets your AD username and password.
If you experience issues with your JFrog configuration, ensure that the output of jfrog config show looks like:
Server ID: https://bytesalad.stsci.edu
JFrog platform URL: https://bytesalad.stsci.edu/
Artifactory URL: https://bytesalad.stsci.edu/artifactory/
Distribution URL: https://bytesalad.stsci.edu/distribution/
Xray URL: https://bytesalad.stsci.edu/xray/
Mission Control URL: https://bytesalad.stsci.edu/mc/
Pipelines URL: https://bytesalad.stsci.edu/pipelines/
User: <your-username>
Password: ***
Default: true
If it doesn't, fix it either with jfrog config edit, or by running jfrog config rm and retrying jfrog config add.
If your config shows a different Server ID (say, you set it up a long time ago and forgot about it), run the edit on that ID instead and enter your new credentials to fix the authentication problem.
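The configuration check above can be sketched in Python. This is only an illustration: the function names (`parse_config_show`, `config_problems`) are hypothetical, and it assumes the `jfrog config show` output uses the `Key: value` layout shown above.

```python
# Hypothetical helper: compare `jfrog config show` output with the expected values.
# Assumes the "Key: value" layout shown above; other CLI versions may differ.
EXPECTED = {
    "Server ID": "https://bytesalad.stsci.edu",
    "Artifactory URL": "https://bytesalad.stsci.edu/artifactory/",
}


def parse_config_show(text):
    """Split each 'Key: value' line at its first colon and collect the pairs."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def config_problems(text, expected=EXPECTED):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    config = parse_config_show(text)
    problems = []
    for key, want in expected.items():
        got = config.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    return problems
```

To use it, capture the real output with `subprocess.run(["jfrog", "config", "show"], capture_output=True, text=True).stdout` and pass that string to `config_problems`.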
It's good practice to run regression tests with your local repository before running them on GitHub Actions, but running the whole suite requires downloading ~80GB of data, and even a single test can spend more time downloading its data files than running. To make iterating quicker, you can cache the whole test suite's input and truth data locally. Here's how:
Make sure you have the JFrog CLI installed and in your PATH, say in ~/bin/. See above. If it is not in your PATH, you can still call it, but you have to provide the full path to the executable in every command line call.
All interactions with https://bytesalad.stsci.edu/artifactory need to be done on VPN. It is only available on the internal network.
Important note for Linux users: do not use $HOME; instead, set up the test_bigdata path on a disk close to the CPU, e.g., /internal/1/. Otherwise, if you want to store the test suite in $HOME, for example, make a directory and then do the sync:
$ mkdir ~/test_bigdata
$ mkdir ~/test_bigdata/jwst-pipeline
$ cd ~/test_bigdata/jwst-pipeline
$ jfrog rt dl "jwst-pipeline/dev/*" ./
That will keep your local cache updated with what is on bytesalad in much the same way as rsync -av. The first time you do it you'll be downloading ~50GB of data, but every subsequent update will just get any diffs.
Then add
export TEST_BIGDATA=$HOME/test_bigdata/
to your .bash_profile (or whatever shell script you source on login) and you're good to go. Some of the files can change day-to-day, so just remember to do
cd ~/test_bigdata/jwst-pipeline
jfrog rt dl "jwst-pipeline/dev/*" ./
or, if you want to make sure deletions are also synced,
jfrog rt dl "jwst-pipeline/dev/*" ./ --sync-deletes dev
before you run the tests.
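If you prefer to script the sync step, a minimal Python sketch might look like the following. The helper names (`find_jfrog`, `build_sync_command`, `cache_dir`) are hypothetical; the command arguments are the ones shown above, and the fallback location `~/bin/jfrog` is an assumption based on the suggested install directory.

```python
import os
import shutil
from pathlib import Path

REMOTE_PATTERN = "jwst-pipeline/dev/*"  # same pattern as the commands above


def find_jfrog(fallback="~/bin/jfrog"):
    """Return the jfrog executable from PATH, else an assumed fallback location."""
    found = shutil.which("jfrog")
    return found if found else str(Path(fallback).expanduser())


def build_sync_command(jfrog="jfrog", sync_deletes=False):
    """Build the argument list for the download; optionally sync deletions too."""
    cmd = [jfrog, "rt", "dl", REMOTE_PATTERN, "./"]
    if sync_deletes:
        cmd += ["--sync-deletes", "dev"]
    return cmd


def cache_dir(env=None):
    """Return $TEST_BIGDATA/jwst-pipeline, the directory the sync is run from."""
    env = os.environ if env is None else env
    root = env.get("TEST_BIGDATA")
    if not root:
        raise RuntimeError("TEST_BIGDATA is not set")
    return Path(root).expanduser() / "jwst-pipeline"
```

A call such as `subprocess.run(build_sync_command(find_jfrog()), cwd=cache_dir(), check=True)` would then run the sync from the cache directory. Because the arguments are passed as a list, the pattern needs no shell quoting.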
Usually you do not want to run the full suite during development because it takes a very long time. To target a particular test function, run this from the root directory of your source checkout:
pytest jwst/regtest/test_module.py -k test_function_name --bigdata --slow --run-slow --basetemp=/path/to/pytest_tmp
where you replace test_module.py with the actual test module filename, test_function_name with the actual test function name, and /path/to/pytest_tmp with a directory pytest should use to dump test output files. If --basetemp is not given, /tmp will be used by default and, on Linux, you might get an OSError when the disk space quota runs out. For why you need both --slow and --run-slow to run a test marked as "slow", see https://github.com/spacetelescope/ci_watson/issues/83
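As a rough sketch, the same invocation can be assembled in Python, together with a free-space check on the directory you intend to pass as --basetemp. The helper names (`build_pytest_args`, `free_gib`) are hypothetical; the flags are exactly those in the command above.

```python
import shutil


def build_pytest_args(module, function, basetemp):
    """Assemble the pytest arguments for one regression test function."""
    return [
        f"jwst/regtest/{module}",
        "-k", function,
        "--bigdata",
        "--slow",
        "--run-slow",
        f"--basetemp={basetemp}",
    ]


def free_gib(path):
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30
```

For example, `build_pytest_args("test_nircam_tsgrism.py", "test_nircam_tsgrism_stage2", "/path/to/pytest_tmp")` yields arguments that can be run from the root of the checkout with `subprocess.run(["pytest", *args], check=True)`. Calling `free_gib` on the parent of the chosen directory beforehand gives an early warning if the disk is nearly full.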
This is how we are currently updating truth files on our regression tests.
The okify_regtests script prompts the user to okify or skip failed tests. The script relies on the JFrog CLI (see above for instructions on installing and configuring it).
To OKify test(s) from a GitHub Actions process, run the script like so:
$ okify_regtests jwst <build number>
where the build number is the number designation of the action shown at the top of the run summary page.
The script will provide the assertion error, traceback, and request a decision:
Downloading test okify artifacts to local directory /var/folders/jg/by5st33j7ps356dgb4kn8w900001n5/T/tmpd5rxjrx0
24 failed tests to okify
----------------------------------------------------------------- test_name -----------------------------------------------------------------
run_pipelines = {'input': '.../test_outputs/popen-gw28/test_nircam_tsgrism_run_pipelines0/jw0072101...h_remote': 'jwst-pipeline/dev/truth/test_nircam_tsgrism_stages/jw00721012001_03103_00001-seg001_nrcalong_calints.fits'}
fitsdiff_default_kwargs = {'atol': 1e-07, 'ignore_fields': ['DATE', 'CAL_VER', 'CAL_VCS', 'CRDS_VER', 'CRDS_CTX', 'NAXIS1', ...], 'ignore_hdus': ['ASDF'], 'ignore_keywords': ['DATE', 'CAL_VER', 'CAL_VCS', 'CRDS_VER', 'CRDS_CTX', 'NAXIS1', ...], ...}
suffix = 'calints'
    @pytest.mark.bigdata
    @pytest.mark.parametrize("suffix", ["calints", "extract_2d", "flat_field",
        "o012_crfints", "srctype", "x1dints"])
    def test_nircam_tsgrism_stage2(run_pipelines, fitsdiff_default_kwargs, suffix):
        """Regression test of tso-spec2 pipeline performed on NIRCam TSO grism data."""
        rtdata = run_pipelines
        rtdata.input = "jw00721012001_03103_00001-seg001_nrcalong_rateints.fits"
        output = "jw00721012001_03103_00001-seg001_nrcalong_" + suffix + ".fits"
        rtdata.output = output
        rtdata.get_truth("truth/test_nircam_tsgrism_stages/" + output)
        diff = FITSDiff(rtdata.output, rtdata.truth, **fitsdiff_default_kwargs)
>       assert diff.identical, diff.report()
E AssertionError:
E fitsdiff: 4.0
E a: .../jw00721012001_03103_00001-seg001_nrcalong_calints.fits
E b: .../truth/jw00721012001_03103_00001-seg001_nrcalong_calints.fits
E HDU(s) not to be compared:
E ASDF
E Keyword(s) not to be compared:
E CAL_VCS CAL_VER CRDS_CTX CRDS_VER DATE NAXIS1 TFORM*
E Table column(s) not to be compared:
E CAL_VCS CAL_VER CRDS_CTX CRDS_VER DATE NAXIS1 TFORM*
E Maximum number of different data values to be reported: 10
E Relative tolerance: 1e-05, Absolute tolerance: 1e-07
E
E Extension HDU 1:
E
E Headers contain differences:
E Headers have different number of cards:
E a: 49
E b: 48
E Extra keyword 'SRCTYPE' in a: 'POINT'
E
E assert False
E + where False = <astropy.io.fits.diff.FITSDiff object at 0x7fb1d9ed2950>.identical
jwst/regtest/test_nircam_tsgrism.py:48: AssertionError
---------------------------------------------------------------------------------------------------------------------------------------------
OK: jwst-pipeline-results/.../test_nircam_tsgrism_stage2/jw00721012001_03103_00001-seg001_nrcalong_calints.fits
--> jwst-pipeline/dev/truth/test_nircam_tsgrism_stages/jw00721012001_03103_00001-seg001_nrcalong_calints.fits
---------------------------------------------------------------------------------------------------------------------------------------------
Enter 'o' to okify, 's' to skip:
Choosing 'o' causes the script to overwrite the Artifactory truth file with the result file from the failed test run, while 's' ignores this diff output and moves on to the next failed test.
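The decision prompt can be pictured with a small Python sketch. This is not the okify_regtests implementation; `ask_decision` is a hypothetical function that only illustrates the okify-or-skip choice, and re-asking on any other input is an assumption of this sketch.

```python
def ask_decision(read=input, prompt="Enter 'o' to okify, 's' to skip: "):
    """Ask until the answer is 'o' or 's'; return 'okify' or 'skip'."""
    choices = {"o": "okify", "s": "skip"}
    while True:
        answer = read(prompt).strip().lower()
        if answer in choices:
            return choices[answer]
```

The `read` parameter defaults to the built-in `input`, and accepts any callable that takes the prompt and returns a string, which makes the loop easy to exercise without a terminal.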
If the OKify script above does not work for you, then you may have to use the following method:
- You need the JFrog CLI to talk to Artifactory. See above.
- Find out what truth files need updating by looking at the test results.
- Find the results for the failed tests at https://bytesalad.stsci.edu/artifactory/jwst-pipeline-results/ , where there is a directory with test results for each build ordered by date. Old builds are retained until they are purged manually, usually before a new JWST delivery (see below).
- Copy these files from jwst-pipeline-results to jwst-pipeline/dev/truth, using the dropdown menu in the upper right, making sure the correct path and file names are used.
Following a release, delete any testing artifacts that are more than one build old; e.g.:
jfrog rt del "jwst-pipeline-results/2021-02*"
or whatever pattern deletes artifacts generated prior to the previous build release date. This can be done interactively via the web interface too, but it is tedious.
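When several months of results need purging, the patterns can be generated rather than typed by hand. The sketch below assumes that the result directory names begin with a `YYYY-MM` prefix, as the example pattern above suggests; `month_patterns` and `build_delete_command` are hypothetical helpers.

```python
from datetime import date


def month_patterns(first, last, repo="jwst-pipeline-results"):
    """Return one 'repo/YYYY-MM*' pattern per month from first to last, inclusive."""
    patterns = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        patterns.append(f"{repo}/{year:04d}-{month:02d}*")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return patterns


def build_delete_command(pattern, jfrog="jfrog"):
    """Build the argument list for deleting everything that matches `pattern`."""
    return [jfrog, "rt", "del", pattern]
```

Print and review the generated patterns before passing any of them to `subprocess.run`, since a pattern that is too broad would also remove results you still need.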
For new regression tests, you may need to upload new data to the Artifactory server. This can be done via the command line or the web interface. New input data for instrument modes should be uploaded to the jwst-pipeline/dev/<instrument>/<mode> directories; the corresponding output truth data should be uploaded to jwst-pipeline/dev/truth/<test-name>.
To use the web interface, navigate to the jwst-pipeline project, under the dev folder:
https://bytesalad.stsci.edu/ui/repos/tree/General/jwst-pipeline/dev
Make sure you are logged in with your AD credentials.
Click on the folder to upload to, then click the "Deploy" button at the top-right. Select "Multiple Deploy" to add multiple files at once. Drag and drop the files you need, or use the "Select File" interface to browse for them.
Make sure the "Target Path" is set to the directory to upload to. If you need to make a new directory, edit the path to enter the new name.
Click "Deploy" to upload the files. Note that there is sometimes no obvious visual indicator of progress when uploading multiple files.
To upload a single file:
jfrog rt u <filename> jwst-pipeline/dev/<path on Artifactory>
To recursively upload multiple files to the corresponding folder on Artifactory:
jfrog rt u '<top-level folder>/(**)' 'jwst-pipeline/dev/<top-level folder>/{1}'
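The upload commands and the target layout described above can be combined in a short Python sketch. All function names here are hypothetical, and the instrument, mode, test, and file names you pass in are your own; only the `jfrog rt u` argument shapes come from the commands above.

```python
def input_target(instrument, mode, filename):
    """Target path for new input data: jwst-pipeline/dev/<instrument>/<mode>/<file>."""
    return f"jwst-pipeline/dev/{instrument}/{mode}/{filename}"


def truth_target(test_name, filename):
    """Target path for truth data: jwst-pipeline/dev/truth/<test-name>/<file>."""
    return f"jwst-pipeline/dev/truth/{test_name}/{filename}"


def build_upload_command(local_file, target, jfrog="jfrog"):
    """Argument list for uploading a single file to an explicit target path."""
    return [jfrog, "rt", "u", local_file, target]


def build_recursive_upload_command(folder, jfrog="jfrog"):
    """Argument list for uploading a folder tree, mirroring it under dev/."""
    # '(**)' captures each path below the folder; '{1}' reuses it in the target.
    return [jfrog, "rt", "u", f"{folder}/(**)", f"jwst-pipeline/dev/{folder}/{{1}}"]
```

Passing these lists to `subprocess.run(..., check=True)` avoids shell quoting of the parentheses and braces, which the single quotes handle in the command-line form.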