-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Namelist updates and runtime fixes #502
Conversation
9224dcb
to
f008ae0
Compare
@dustinswales @scrasmussen @mkavulich All RTs are running successfully on the GitHub servers, although there are failures associated with nvfortran running without GPU acceleration and within the Docker container. Are these worth debugging before the release? Are we sure that the Docker container is grabbing the latest commit in the PR branch (and doing so recursively)? |
@dustinswales @scrasmussen @mkavulich @ligiabernardet I think that all of the CI tests should be passing, but the Nvidia RTs and docker tests are timing out due to the large aerosol climo file handling, I think. If I create a container with the code in this PR, I'm able to run and pass all tests in rt_test_cases.py, so there are no failing RTs on Docker anymore. We may need to streamline the nvidia and docker tests to get them to reliably pass after the release, but I would like to merge this as-is. |
@grantfirl I think it's fine to proceed w/o the NVIDIA CI working. Post-release we should look at "containerizing" the NVIDIA CI tests (And add Intel) |
OK, thanks. The previous nvidia CI failures were failures when running RTs due to lack of the climatology data, so what I did should fix them, but GitHub servers keep on hanging when running the tests, so I'm not sure and I don't have an independent way to check with the nvidia compilers. |
The Docker container does not grab the PR branch code, it builds from
I didn't realize Soren's PR still included the aerosol climo download. We should look into the potential of having a subset of these files for use in RTs, because I don't think this is ever going to work on Github runners. |
Soren added some code to pull from the PR branch (recursively) in the Dockerfile when invoked through the CI with a PR number as an argument. I added back the aerosol climo download for the Docker container testing only (it shouldn't be in the image, only when the image starts up and runs). Otherwise, a bunch of RTs fail due to lack of the aerosol data. This must fill up the space alotted for containers though. There's just no good way to shuffle around GBs of data for the GitHub runners, really. |
Was typing these answer up but you beat me to it, lol I was thinking maybe we need to add another file like Or maybe there is only a subset of the aerosol data that is needed, we can create a new tarball that is smaller |
@grantfirl I'm not surprised by this issue. |
OK, excellent. Since this is a known issue, I feel confident that the RTs actually run on every platform/compiler combo at this point. The only remaining issue is to streamline the data component of the RTs, which can happen after the release. |
Note: This PR is stacked on the dephy_conversion branch, so #496 should be merged first
Requires NCAR/ccpp-physics#1082