compass python package (compass v1.0) #28

Merged
merged 306 commits into from
Apr 23, 2021

xylar
Collaborator

@xylar xylar commented Nov 16, 2020

This merge creates a compass python package that can be used to list, set up, run, validate, and clean up test cases and set up, run, and clean up test suites.

This prototype contains landice and ocean MPAS cores. The landice core has all test cases in the sia_integration test suite and a few other related test cases in the test groups that are in that test suite. The ocean core has all of the test cases for the baroclinic_channel test group, all test cases from the nightly regression suite, and the QU240, QUwISC240, EC30to60, ECwISC30to60 and SOwISC12to60 meshes from "legacy" COMPASS (COMPASS as it currently exists).

A significant number of python modules (python files) and packages (subdirectories) have been added within compass that break up the tasks related to listing, setting up, and cleaning up test cases that were previously in list_testcases.py, setup_testcase.py and clean_testcase.py as well as functionality related to test suites that was in manage_regression_suite.py before. New functionality has also been added to make setting up and running test cases easier, and to promote more code reuse than the current COMPASS framework permits. See the design document in #29 for more details.

Each test case has an associated configuration file (.cfg) that is constructed by combining config options from various sources: a core-specific config file, one associated with the machine (if running on a "supported" machine), one for the test group, one for the test case, and a user-defined config file. Typically, these contain different types of options with different purposes. For example, the core's config file points to namelist and streams templates that are core specific. The machine config file contains information about the batch queuing system, the number of cores per node, etc. The test group and test case config files contain options that can be treated as the python equivalents of the namelist options used in the Fortran MPAS-Model code. The user-defined config file can override any config options from these other sources, and can provide paths to initial conditions, meshes and other data that the core or test case may require. An effort has been made to automate most of this so a user config file should only be needed on a non-supported machine. Users can also edit the local config file within a test case (and symlinked within each step) that collects all of the config options for that test case before running.
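The layering described above can be sketched with the standard-library configparser. This is only an illustration of the precedence idea, not the compass implementation: the source names, sections and values below are assumptions chosen for the example.

```python
import configparser

# Hypothetical config sources, listed from lowest to highest precedence.
# Section and option values are illustrative, not taken from compass itself.
sources = {
    "core": "[paths]\nmpas_model = MPAS-Model/ocean/develop\n",
    "machine": "[parallel]\nsystem = slurm\ncores_per_node = 36\n",
    "test_case": "[parallel]\nthreads = 4\n",
    "user": "[parallel]\nsystem = single_node\ncores_per_node = 8\n",
}

config = configparser.ConfigParser()
for name, text in sources.items():
    # Each later read_string() overrides options set by earlier sources.
    config.read_string(text, source=name)

print(config["parallel"]["system"])    # the user value takes precedence
print(config["parallel"]["threads"])   # the test-case value is untouched
```

Reading the user-defined source last is what lets it override every other source while leaving unrelated options in place.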

closes #11
closes #9
closes #20

@xylar xylar added the enhancement New feature or request label Nov 16, 2020
@xylar xylar self-assigned this Nov 16, 2020
@xylar xylar added this to In progress in compass 1.0 via automation Nov 16, 2020
@xylar
Collaborator Author

xylar commented Nov 25, 2020

Instructions for setting up and running the Baroclinic Channels test case

Setting up and activating the conda environment

Install Miniconda3 if you haven't already. Activate your base environment if you haven't already, e.g.:

source ~/miniconda3/etc/profile.d/conda.sh
conda activate base
conda config --add channels conda-forge
conda config --set channel_priority strict

Don't try to add a new environment to the shared conda environments (where E3SM-Unified is found). Hopefully, no one but me has permission to do this anyway.

Create a conda environment:

conda create -y -n test_compass_1.0 python=3.8 geometric_features=0.1.12 mpas_tools=0.0.14 \
    jigsaw=0.9.12 jigsawpy=0.2.1 metis cartopy_offlinedata ffmpeg mpich "esmf=*=mpi_mpich_*" \
    "netcdf4=*=nompi_*" nco  "pyremap>=0.0.7,<0.1.0" rasterio affine ipython jupyter lxml \
    matplotlib cmocean

Each time you want to run, you should first do:

conda activate test_compass_1.0

Then, load whatever modules you need to build and run MPAS-Ocean.

Setting up the test cases

First, build MPAS-Ocean as normal. You can use the submodule at MPAS-Model/ocean/develop or an absolute path to your own check-out.

Once the model is built, set up a config file similar to this one (ocean.cfg):

[paths]
  
mpas_model = MPAS-Model/ocean/develop
mesh_database = /home/xylar/data/mpas/meshes
initial_condition_database = /home/xylar/data/mpas/initial_conditions
bathymetry_database = /home/xylar/data/mpas/bathymetry_database


# The parallel section describes options related to running tests in parallel
[parallel]

# parallel system of execution: slurm or single_node
system = single_node

# whether to use mpirun or srun to run the model
parallel_executable = mpirun

# cores per node on the machine
cores_per_node = 8

# the number of multiprocessing or dask threads to use
threads = 8

If you're on one of the supported machines, you can specify the machine name and you won't need to specify the mesh database, etc. or any of the parallel options because that's all known. The example above is more for testing on a personal machine.
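As a rough sketch of how a config file like the one above might be consumed, the snippet below parses a trimmed copy with configparser and ExtendedInterpolation (the same interpolation class used in the run.py script shown later in this comment). The `data_root` option is a hypothetical addition, included only to show how `${...}` references resolve.

```python
import configparser

# A trimmed copy of the example config, plus a hypothetical `data_root`
# option to show how ExtendedInterpolation resolves ${...} references.
text = """
[paths]
data_root = /home/xylar/data/mpas
mesh_database = ${data_root}/meshes

[parallel]
system = single_node
parallel_executable = mpirun
cores_per_node = 8
threads = 8
"""

config = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation())
config.read_string(text)

# Values are stored as strings; getint() converts on access.
mesh_database = config.get("paths", "mesh_database")
cores_per_node = config.getint("parallel", "cores_per_node")
print(mesh_database, cores_per_node)
```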

To list test cases:

$ python -m compass list
Testcases:
   0: examples/example_compact/1km/test1
   1: examples/example_compact/1km/test2
   2: examples/example_compact/2km/test1
   3: examples/example_compact/2km/test2
   4: examples/example_expanded/1km/test1
   5: examples/example_expanded/1km/test2
   6: examples/example_expanded/2km/test1
   7: examples/example_expanded/2km/test2
   8: ocean/baroclinic_channel/1km/rpe_test
   9: ocean/baroclinic_channel/4km/rpe_test
  10: ocean/baroclinic_channel/10km/rpe_test
  11: ocean/baroclinic_channel/10km/decomp_test
  12: ocean/baroclinic_channel/10km/default
  13: ocean/baroclinic_channel/10km/restart_test
  14: ocean/baroclinic_channel/10km/threads_test

To list the available machines, do:

$ python -m compass list --machines
Machines:
   anvil
   badger
   default
   cori-haswell
   cori-knl
   compy
   grizzly

To set up test cases:

$ python -m compass setup --help
usage: __main__.py [-h] [-t PATH] [-n NUM [NUM ...]] [-f FILE] [-m MACH]
                   [-w PATH] [-b PATH] [-q] [--no_download]

Set up one or more test cases

optional arguments:
  -h, --help            show this help message and exit
  -t PATH, --test PATH  Relative path for a test case to set up
  -n NUM [NUM ...], --case_number NUM [NUM ...]
                        Case number(s) to setup, as listed from 'compass
                        list'. Can be a space-separatedlist of case numbers.
  -f FILE, --config_file FILE
                        Configuration file for test case setup
  -m MACH, --machine MACH
                        The name of the machine for loading machine-related
                        config options
  -w PATH, --work_dir PATH
                        If set, case directories are created in work_dir
                        rather than the current directory.
  -b PATH, --baseline_dir PATH
                        Location of baselines that can be compared to
  -q, --quiet           If set, the command_history file will not be written
  --no_download         If set, files will not be auto-downloaded during setup
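For readers curious how a usage string like `-n NUM [NUM ...]` arises, here is a minimal argparse sketch that declares a few of the options listed above. It is an assumption about how such a command line could be declared, not the actual compass source; in particular, `type=int` is a guess.

```python
import argparse


def build_parser():
    """Build a parser resembling part of the `setup` help text (a sketch)."""
    parser = argparse.ArgumentParser(
        description="Set up one or more test cases")
    parser.add_argument("-t", "--test", metavar="PATH",
                        help="Relative path for a test case to set up")
    # nargs="+" is what produces the "NUM [NUM ...]" usage string.
    # type=int is an assumption made for this sketch.
    parser.add_argument("-n", "--case_number", nargs="+", type=int,
                        metavar="NUM", help="Case number(s) to set up")
    parser.add_argument("-f", "--config_file", metavar="FILE",
                        help="Configuration file for test case setup")
    parser.add_argument("-w", "--work_dir", metavar="PATH",
                        help="Directory in which case directories are created")
    parser.add_argument("--no_download", action="store_true",
                        help="Do not auto-download files during setup")
    return parser


args = build_parser().parse_args(["-n", "10", "11", "-w", "work"])
print(args.case_number, args.work_dir, args.no_download)
```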

To set up the 10-km test cases:

$ python -m compass setup -n 10 11 12 13 14 -w ~/data/mpas/test_baroclinic_channel -f ocean.cfg 
Setting up testcases:
  ocean/baroclinic_channel/10km/rpe_test
  ocean/baroclinic_channel/10km/decomp_test
  ocean/baroclinic_channel/10km/default
  ocean/baroclinic_channel/10km/restart_test
  ocean/baroclinic_channel/10km/threads_test

Running the test cases

Change to the work directory, where you will see the directory structure indicated above.

$ cd ~/data/mpas/test_baroclinic_channel/ocean/baroclinic_channel/10km
$ ls
decomp_test  default  restart_test  rpe_test  threads_test
$ cd default
$ ./run.py

This should run the full test case. You can also cd into individual steps and run them as usual. You can take a look at the run.py scripts to see what they are running:

#!/usr/bin/env python
import pickle
import configparser

from compass.ocean.tests.baroclinic_channel.initial_state import run as run


def main():
    with open('initial_state.pickle', 'rb') as handle:
        test = pickle.load(handle)

    config = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation())
    config.read('default.cfg')

    run(test, config)


if __name__ == '__main__':
    main()

In this case, we're running the function run() from compass.ocean.tests.baroclinic_channel.initial_state. You can edit the code for this function by going back to your compass branch and editing compass/ocean/tests/baroclinic_channel/initial_state.py. Any changes you make take effect in the working directory, since there's a local symlink to the compass package (the compass directory in the root of the branch).

Don't worry about the pickle stuff -- that's just loading a dictionary with a few pieces of information about the test case that the user shouldn't modify. The user can modify the config file (in this case default.cfg) to update config options before running the test.
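To make the pickle step less mysterious, here is a small round-trip sketch. The dictionary keys are hypothetical, since the real contents of the pickle file are not shown in this thread; the point is only that the dictionary written at setup time is restored unchanged at run time.

```python
import pickle

# Hypothetical test-case metadata; the real keys are not shown in this thread.
test = {"name": "default", "core": "ocean", "resolution": "10km"}

# At setup time: serialize the dictionary to bytes (a file opened in 'wb'
# mode with pickle.dump() would work the same way).
payload = pickle.dumps(test)

# At run time: restore an equal dictionary before calling run(test, config).
restored = pickle.loads(payload)
print(restored == test)
```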

All the 10km tests work for me (though they produce some IEEE underflows that are probably worth looking into).

@mark-petersen
Collaborator

I'm trying to test this, but getting errors on the conda environment above. @xylar could you check that it works for you?

details

gr-fe1:bin$ source /usr/projects/climate/mpeterse/software/miniconda3/etc/profile.d/conda.sh
gr-fe1:bin$ conda activate base
(base) gr-fe1:bin$ conda create -y -n test_compass_1.0 python=3.8 geometric_features=0.1.12 mpas_tools=0.0.14 \
>     jigsaw=0.9.12 jigsawpy=0.2.1 metis cartopy_offlinedata ffmpeg mpich "esmf=*=mpi_mpich_*" \
>     "netcdf4=*=nompi_*" nco  "pyremap>=0.0.7,<0.1.0" rasterio affine ipython jupyter lxml \
>     matplotlib cmocean
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - geometric_features=0.1.12
  - esmf[build=mpi_mpich_*]
  - netcdf4[build=nompi_*]
  - pyremap[version='>=0.0.7,<0.1.0']
  - cmocean
  - jigsawpy=0.2.1
  - cartopy_offlinedata
  - nco
  - jigsaw=0.9.12
  - mpas_tools=0.0.14

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.


@mark-petersen
Collaborator

The ones causing problems are our own packages:

geometric_features=0.1.12 mpas_tools=0.0.14 jigsaw=0.9.12 jigsawpy=0.2.1

I'm guessing I need to give conda create a -c CHANNEL or something similar?

OK, I tried -c conda-forge -c e3sm and it looks like it is working.

@xylar
Collaborator Author

xylar commented Dec 1, 2020

@mark-petersen, the first steps you need to do when you install Miniconda3 are:

conda config --add channels conda-forge
conda config --set channel_priority strict

This should become a habit as part of installing Miniconda3 but I will try to be explicit about it. conda-forge will then always be the highest priority channel to get packages from.

I added these two steps above.

We get all of the packages we need from conda-forge, so you don't need the e3sm channel. The compass metapackage goes there, but you don't need it for this test because the environment you're creating is the equivalent of the metapackage for the prototype.

@mark-petersen
Collaborator

@xylar I tested on grizzly, and it worked beautifully! I ran the test cases, and then within the run directory ocean/baroclinic_channel/10km/default/initial_state I edited compass/ocean/tests/baroclinic_channel/initial_state.py and added 1000 to the temperature, reran, and it worked! That is great! I thought I would need to remake a conda environment to do simple edits. That takes care of item 2 on my design comment.

A few comments:

  1. I like the compass link in the run directory. Could we also put a case link that automatically links to the case, i.e. to compass/ocean/tests/baroclinic_channel for the baroclinic channel? Then when we edit initial_state.py locally, it's always vim case/initial_state.py.
  2. The tests (restart, thread) do not report success or fail for the comparison.

@xylar
Collaborator Author

xylar commented Dec 2, 2020

Could we also put a case link that automatically links to the case, i.e. to compass/ocean/tests/baroclinic_channel for the baroclinic channel? Then when we edit initial_state.py locally, it's always vim case/initial_state.py.

No, unfortunately, that would be a dangerous thing to do because it could be imported incorrectly as a module with import initial_state. And it would be pretty confusing. The link to compass is not just for convenience, it's fundamental to how the test case works. Linking to other things for convenience seems very risky to me.
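One general Python behavior that may be behind this concern: any module file reachable from a directory on sys.path (which normally starts with the running script's own directory) can be imported by its bare name, bypassing the package it belongs to. The sketch below emulates that with a temporary directory; the file contents and the `ORIGIN` marker are made up for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as work_dir:
    # Stand-in for a convenience copy or link sitting next to run.py
    # (hypothetical; not something compass creates).
    Path(work_dir, "initial_state.py").write_text(
        "ORIGIN = 'work directory'\n")

    # A script's own directory is the first entry on sys.path; emulate that.
    sys.path.insert(0, work_dir)
    try:
        # The bare name resolves to the local file, not to
        # compass.ocean.tests.baroclinic_channel.initial_state.
        module = importlib.import_module("initial_state")
        origin = module.ORIGIN
    finally:
        # Undo the changes so the sketch has no lasting side effects.
        sys.path.remove(work_dir)
        sys.modules.pop("initial_state", None)

print(origin)
```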

The tests (restart, thread) do not report success or fail for the comparison.

Yes, I haven't yet put in support for validation checks in the prototype. It wasn't my top priority. But it won't be hard. Similarly, the baseline flag doesn't actually do anything right now, nor the flag for not downloading. This is beyond the level of review that the prototype is up for at this point but I'm glad it runs for you!

@pep8speaks

pep8speaks commented Dec 25, 2020

Hello @xylar! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 165:80: E501 line too long (92 > 79 characters)

Line 20:80: E501 line too long (91 > 79 characters)
Line 115:80: E501 line too long (87 > 79 characters)
Line 141:80: E501 line too long (92 > 79 characters)
Line 218:80: E501 line too long (80 > 79 characters)
Line 242:80: E501 line too long (82 > 79 characters)

Line 70:80: E501 line too long (80 > 79 characters)
Line 125:80: E501 line too long (81 > 79 characters)

Line 3:80: E501 line too long (80 > 79 characters)

Line 44:80: E501 line too long (80 > 79 characters)

Line 94:80: E501 line too long (87 > 79 characters)

Line 80:80: E501 line too long (83 > 79 characters)
Line 112:80: E501 line too long (80 > 79 characters)
Line 123:80: E501 line too long (80 > 79 characters)
Line 124:80: E501 line too long (81 > 79 characters)
Line 134:80: E501 line too long (81 > 79 characters)

Line 33:80: E501 line too long (94 > 79 characters)
Line 56:80: E501 line too long (98 > 79 characters)

Line 32:80: E501 line too long (80 > 79 characters)

Line 23:80: E501 line too long (80 > 79 characters)
Line 84:80: E501 line too long (80 > 79 characters)

Line 21:80: E501 line too long (80 > 79 characters)

Line 23:80: E501 line too long (80 > 79 characters)

Line 24:80: E501 line too long (80 > 79 characters)
Line 61:80: E501 line too long (81 > 79 characters)
Line 88:80: E501 line too long (83 > 79 characters)

Line 45:80: E501 line too long (81 > 79 characters)
Line 47:80: E501 line too long (80 > 79 characters)

Line 97:80: E501 line too long (81 > 79 characters)
Line 112:80: E501 line too long (82 > 79 characters)

Line 29:80: E501 line too long (80 > 79 characters)
Line 109:80: E501 line too long (80 > 79 characters)

Line 56:80: E501 line too long (105 > 79 characters)

Comment last updated at 2021-04-23 20:49:09 UTC

@mark-petersen
Collaborator

@xylar, I was excited to see your progress here, as I'll be working on test cases this month, and would like to add to this system if possible. I gave this PR a try again, but ran into an error. I see above you moved to mpas_tools 0.1.0, so I used:

source /usr/projects/climate/mpeterse/software/miniconda3/etc/profile.d/conda.sh
conda activate base
conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -y -n test_compass_1.0 python=3.8 geometric_features=0.1.12 mpas_tools=0.1.0 \
    jigsaw=0.9.12 jigsawpy=0.2.1 metis cartopy_offlinedata ffmpeg mpich "esmf=*=mpi_mpich_*" \
    "netcdf4=*=nompi_*" nco  "pyremap>=0.0.7,<0.1.0" rasterio affine ipython jupyter lxml \
    matplotlib cmocean
conda activate test_compass_1.0

but I always get an error:

python -m compass list --machines
Traceback (most recent call last):
...
  File "/turquoise/usr/projects/climate/mpeterse/repos/compass/pr28_xylar_compass1/compass/testcase/__init__.py", line 7, in <module>
    from mpas_tools.logging import LoggingContext
ModuleNotFoundError: No module named 'mpas_tools.logging'

Are there other library versions that should be updated in the conda create? Or does your mpas_tools.logging come from a branch?

@xylar
Collaborator Author

xylar commented Jan 5, 2021

Yes, I made changes in MPAS-Tools that are required here. There is not a release of that package yet because I didn't think anyone else needed it. So this branch is not available for use by anyone but me at the moment. I'll let you know when you can try it out again, with updated instructions.
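In the meantime, a quick way to check whether an installed package provides a given submodule (for example `mpas_tools.logging`) is a small helper around importlib. The helper name `has_module` is hypothetical, and the demonstration uses the standard-library `json` package so that it runs anywhere.

```python
import importlib.util


def has_module(name):
    """Return True if `name` (possibly dotted) can be located for import."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is itself missing.
        return False


print(has_module("json.decoder"))
print(has_module("json.no_such_submodule"))
```

Calling `has_module("mpas_tools.logging")` in the activated environment would show whether the installed mpas_tools includes the module this branch imports.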

@xylar xylar force-pushed the compass_1.0 branch 4 times, most recently from 65d1988 to 9746b41 Compare January 13, 2021 14:37
@xylar
Collaborator Author

xylar commented Jan 13, 2021

@mark-petersen, @vanroekel, @matthewhoffman and anyone else that's following along, just a quick note to say that I think I'm done with developing the prototype, other than minor changes and any bug fixes that emerge from testing and review.

I have successfully:

  • run all test cases ported over from the ocean nightly suite (except for the periodic planar test for particles, which @mark-petersen and I decided to drop since it cannot easily be ported)
  • tested 4 global-ocean meshes: QU240, QUwISC240, EC30to60 and ECwISC30to60

The additional flexibility of the new framework is paying off beyond what I had envisioned! It is easy, for example, to set up a full initialization and spin-up of any supported mesh with or without ice-shelf cavities, with or without BGC, with the PHC or EN4_1900 initial condition, and with either split-explicit or RK4 time stepping. The same applies to all the shorter test cases (performance, restart, threads, cores, analysis, etc.). I've currently gone a bit overboard with that to prove the possibilities but it's really something!
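To give a feel for the size of that option space, the sketch below enumerates combinations with itertools.product. The option values come from the paragraph above, but the variable names are illustrative and nothing here implies that compass defines or supports every combination.

```python
from itertools import product

# Option values named in the comment above; the identifiers are illustrative.
meshes = ["QU240", "EC30to60"]
with_ice_shelf_cavities = [False, True]
with_bgc = [False, True]
initial_conditions = ["PHC", "EN4_1900"]
time_integrators = ["split_explicit", "RK4"]

# Every combination of one choice from each list.
variants = list(product(meshes, with_ice_shelf_cavities, with_bgc,
                        initial_conditions, time_integrators))
print(len(variants))
```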

In the next two weeks, I will work on:

  • updating the instructions here for running the test cases and test suites
  • filling out the remainder of the design document
  • putting together as much documentation as I can

For now, I don't need anything further from any of you, just wanted to let you know where things stand.

@vanroekel
Collaborator

Thanks @xylar, this sounds like really great progress and some really promising capabilities! If you don't mind, I'd like to log a summary of this under the NGD ocean 6 month report so you can be credited for this work.

xylar and others added 22 commits April 23, 2021 22:48
Correct documentation of 60layerPHC grid
This merge includes a large number of updates to the User's Guide
following recent moves to classes and a new naming convention.
Machine usage has also been updated to be consistent with recent
testing with E3SM modules.

The Developer's Guide has also been updated with the change
from "configuration" to "test group".
@xylar
Collaborator Author

xylar commented Apr 23, 2021

Thanks @vanroekel! And thanks @mark-petersen and @matthewhoffman as well!

I just added the documentation and will merge as soon as CI passes (if it does).

Labels
enhancement New feature or request python package DEPRECATED: PRs and Issues involving the python package (master branch)
Projects
No open projects
5 participants