[production/RRFS.v1] physics updates for RRFS.v1 code freeze #2147

@haiqinli
Contributor

haiqinli commented Feb 23, 2024

Commit Queue Requirements:

  • Fill out all sections of this template.
  • All sub component pull requests have been reviewed by their code managers.
  • Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • Commit 'test_changes.list' from previous step

Description:

This PR includes the following updates:
1). MYNN updates;
2). RUC LSM updates (ufs-community/ccpp-physics#163);
3). GF updates for humidity DA; restore scale-awareness in the 1st hour; suppress weak radar reflectivity over water;
4). smoke plume rise updates for stability on WCOSS2.

Commit Message:

* UFSWM - 
  * FV3 - 
    * ccpp-physics - 

Priority:

  • Critical Bugfix: For the RRFS.v1 code freeze

Git Tracking

UFSWM:

  • Closes #
  • None

Sub component Pull Requests:

UFSWM Blocking Dependencies:

  • None

Changes

Regression Test Changes (Please commit test_changes.list):

RegressionTests_hera.log

  • PR Updates/Changes Baselines.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Derecho
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
  • opnReqTest (complete task if unnecessary)

@BrianCurtis-NOAA BrianCurtis-NOAA changed the title from "physics updates for RRFS.v1 code freeze" to "[production/RRFS.v1] physics updates for RRFS.v1 code freeze" Feb 26, 2024

@grantfirl
Collaborator

@haiqinli If you ran RTs, could you commit the test_changes.list file?

@jkbk2004
Collaborator

@haiqinli I am setting up a new RRFS.v1 baseline location across RDHPCS first. @MatthewPyle-NOAA The RRFS team may need to catch up and set a wcoss2 RRFS.v1 baseline location.

@haiqinli
Contributor Author

@grantfirl Do you mean the fail_test file? I committed logs/RegressionTests_hera.log, which lists the test cases that changed. Thanks.

@grantfirl
Collaborator

You can disregard that. I forgot that the test_changes.list file was just added at the top of develop and the release branch doesn't include this yet.

@MatthewPyle-NOAA
Collaborator

@jkbk2004 Not sure I completely understand. What does the RRFS team need to do? Make a change to point our regression tests at this new RRFS.v1 baseline?

@jkbk2004
Collaborator

@MatthewPyle-NOAA https://github.com/haiqinli/ufs-weather-model/blob/production/RRFS.v1-codefreeze/tests/rt.sh#L279 and https://github.com/haiqinli/ufs-weather-model/blob/production/RRFS.v1-codefreeze/tests/rt.sh#L299 point to the develop branch baseline. On the RDHPCS side, I created a separate directory for RRFS.v1, e.g. on hera: /scratch2/NAGAPE/epic/UFS-WM_RT/RRFS.v1. That way we can continue to test as PRs come into the production/RRFS.v1 branch. On acorn and wcoss2, the RRFS team may need to do the same thing: maintain the RRFS.v1 baseline in separate folders and run regression tests when needed. If NCO/code freeze requires some level of code cleanup, we can follow up separately, e.g. https://github.com/ufs-community/ufs-weather-model/tree/production/AQM.v7. But I notice production/hafs.v1 goes without much cleanup.
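
The arrangement described above, one baseline tree for develop and a separate one for production/RRFS.v1, can be sketched as a small lookup. This is only an illustration in Python, not the logic in rt.sh: the function name `baseline_root`, the `BRANCH_SUBDIRS` table, and the idea of keying on the branch name are all assumptions. Only the hera RRFS.v1 path comes from this thread; the hera root used in the example is inferred from that path.

```python
from pathlib import PurePosixPath

# Hypothetical table of production branches that keep their baselines in a
# dedicated subdirectory. Only the RRFS.v1 entry is taken from this thread.
BRANCH_SUBDIRS = {
    "production/RRFS.v1": "RRFS.v1",
}


def baseline_root(rt_root: str, branch: str) -> str:
    """Return the directory assumed to hold baselines for `branch`.

    `rt_root` is the platform's regression-test root. Branches without an
    entry in BRANCH_SUBDIRS (for example develop) use it unchanged.
    """
    root = PurePosixPath(rt_root)
    subdir = BRANCH_SUBDIRS.get(branch)
    return str(root / subdir) if subdir else str(root)


if __name__ == "__main__":
    # Assumption: the hera root is the parent of the RRFS.v1 path quoted above.
    hera_root = "/scratch2/NAGAPE/epic/UFS-WM_RT"
    print(baseline_root(hera_root, "production/RRFS.v1"))
    print(baseline_root(hera_root, "develop"))
```

Keeping the branch-to-directory mapping in one table means a new production branch only needs one new entry, which matches the suggestion that each team maintain its own baseline folder.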

@jkbk2004
Collaborator

The gaea c5 and Rocky8 OS migrations kicked in after the production branch hash, and the ecflow update on derecho also came after the hash, so we may need to skip gaea/jet/derecho.

@jkbk2004
Collaborator

orion and hercules are down for maintenance. @MatthewPyle-NOAA @haiqinli Is it OK to merge this PR by tomorrow?

@jkbk2004
Collaborator

@BrianCurtis-NOAA It sounds like it would be better for the RRFS team to maintain the new RRFS.v1 baselines on acorn and wcoss2. FYI, in case you want to check on your side.

@jkbk2004
Collaborator

@grantfirl On the RDHPCS side, the new RRFS.v1 baseline locations have been created. You may go ahead and test, starting with #2158.

@BrianCurtis-NOAA
Collaborator

@jkbk2004 I don't know what you mean?
If they wish to do something similar to what I have done with AQM in maintaining baselines, I have baselines stored in /lfs/h2/emc/nems/noscrub/emc.nems/RT/NEMSfv3gfs/prod/AQM.v7 on WCOSS2 dev machine. If they have someone who has emc.nems access, then they are welcome to store their production baselines at prod/RRFS.v1.

@MatthewPyle-NOAA
Collaborator

MatthewPyle-NOAA commented Feb 29, 2024

@jkbk2004 I'm creating baselines on WCOSS for an RRFS subset of regression tests (using the PR 2147 code). Can get them on acorn once that platform returns. I don't have emc.nems access, so will put them under /lfs/h2/emc/lam/noscrub/emc.lam/RRFS.v1_RT.

Updates list of tests to run for RRFSv1
Adds wcoss2 regression test log

@MatthewPyle-NOAA
Collaborator

@jkbk2004 I think this one might be ready to go now. Let me know if more is needed on the WCOSS side - thanks!

@jkbk2004
Collaborator

jkbk2004 commented Mar 1, 2024

I agree. We can start the merging process. I will move on to ufs-community/ccpp-physics#176.

@jkbk2004
Collaborator

jkbk2004 commented Mar 1, 2024

@MatthewPyle-NOAA This PR is ready to merge. When you squash and merge it, you may update the commit message for bookkeeping as follows:

* Bugfix for the RRFS.v1 code freeze
* MYNN updates
* RUC LSM updates (https://github.com/ufs-community/ccpp-physics/pull/163)
* GF updates for humidity DA; restore scale-awareness in the 1st hour; suppress weak radar reflectivity over water
* smoke plume rise updates for stability on WCOSS2

@MatthewPyle-NOAA
Collaborator

MatthewPyle-NOAA left a comment

Thanks for all of the help @jkbk2004 and others...will merge now.

@MatthewPyle-NOAA MatthewPyle-NOAA merged commit 7c52456 into ufs-community:production/RRFS.v1 Mar 1, 2024

@grantfirl
Collaborator

@jkbk2004 @MatthewPyle-NOAA Could you remind me what the status of the RRFS baselines is? For example, what is in /scratch2/NAGAPE/epic/UFS-WM_RT/RRFS.v1/NEMSfv3gfs/develop-20240227? Are the baselines in that directory valid for this PR? What is the purpose of the new rt.conf_rrfs? Are we only supposed to be running those tests? Sorry for the questions. I'm running into unexpected RT failures when comparing against /scratch2/NAGAPE/epic/UFS-WM_RT/RRFS.v1/NEMSfv3gfs/develop-20240227, and I'm trying to understand them. I previously had no RT failures with the same PR going into the develop branch, so I'm confused.

@jkbk2004
Collaborator

jkbk2004 commented Mar 1, 2024

@grantfirl For this PR, on the RDHPCS side, I recreated the full rt.conf baselines. I agree we should use only rt.conf_rrfs for this production branch. I need to clean up /scratch2/NAGAPE/epic/UFS-WM_RT/RRFS.v1/NEMSfv3gfs/develop-20240227, maybe as part of testing #2158. Do you see any problems using develop-20240227 to test rt.conf_rrfs?

@MatthewPyle-NOAA
Collaborator

@grantfirl The new rt.conf_rrfs is intended to be a subset of tests more relevant to RRFS. Those were the tests I ran on WCOSS. I'm not sure whether everyone wants to use it to limit the number of regression tests run elsewhere. The develop-20240227 baselines should be valid for this PR, but I will defer to @jkbk2004.
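
As a rough illustration of the "subset of tests" idea, the sketch below checks that every test named in a reduced configuration also appears in the full one. It is written in Python and assumes a simplified pipe-delimited layout in which run lines look like `RUN | test_name | ...`; the real rt.conf and rt.conf_rrfs may be laid out differently, and the function names and test names in the sample text are hypothetical.

```python
def run_tests(conf_text: str) -> list[str]:
    """Collect test names from lines of the assumed form 'RUN | name | ...'."""
    names = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [field.strip() for field in line.split("|")]
        if fields[0] == "RUN" and len(fields) > 1 and fields[1]:
            names.append(fields[1])
    return names


def missing_from_full(subset_text: str, full_text: str) -> list[str]:
    """Return tests listed in the subset that the full configuration lacks."""
    full = set(run_tests(full_text))
    return [name for name in run_tests(subset_text) if name not in full]


# Hypothetical sample content, not copied from the repository.
FULL_CONF = """\
# full suite (illustrative)
COMPILE | example_build | intel |
RUN | example_global_test | | baseline |
RUN | example_regional_test | | baseline |
RUN | example_smoke_test | | baseline |
"""

SUBSET_CONF = """\
COMPILE | example_build | intel |
RUN | example_regional_test | | baseline |
RUN | example_smoke_test | | baseline |
"""

if __name__ == "__main__":
    print(run_tests(SUBSET_CONF))
    print(missing_from_full(SUBSET_CONF, FULL_CONF))
```

A check like this only confirms that the reduced list is a true subset; whether a given baseline directory contains results for those tests still has to be verified on each platform.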

@haiqinli haiqinli deleted the production/RRFS.v1-codefreeze branch March 5, 2024 03:41