Change to use CRTM 2.4 #530

Closed
wants to merge 2 commits into from

CoryMartin-NOAA
Contributor

As the title says!

@CoryMartin-NOAA CoryMartin-NOAA added hera-GW-RT Queue for automated testing with global-workflow on Hera orion-GW-RT Queue for automated testing with global-workflow on Orion labels Jul 12, 2023
@emcbot emcbot added hera-GW-RT-Running Automated testing with global-workflow running on Hera orion-GW-RT-Running Automated testing with global-workflow running on Orion and removed hera-GW-RT Queue for automated testing with global-workflow on Hera orion-GW-RT Queue for automated testing with global-workflow on Orion labels Jul 12, 2023
@emcbot

emcbot commented Jul 12, 2023

Automated Global-Workflow GDASApp Testing Results:
Machine: orion

Start: Wed Jul 12 14:32:39 CDT 2023 on Orion-login-1.HPC.MsState.Edu
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Wed Jul 12 15:47:18 CDT 2023
---------------------------------------------------
Tests:                                  *Failed*
Tests: Failed at Wed Jul 12 16:05:39 CDT 2023
Tests: 92% tests passed, 4 tests failed out of 49
	1530 - test_gdasapp_atm_jjob_var_run (Failed)
	1531 - test_gdasapp_atm_jjob_var_final (Failed)
	1533 - test_gdasapp_atm_jjob_ens_run (Failed)
	1534 - test_gdasapp_atm_jjob_ens_final (Failed)
Tests: see output at /work2/noaa/stmp/cmartin/CI/GDASApp/workflow/PR/530/global-workflow/sorc/gdas.cd/build/log.ctest

@emcbot emcbot added orion-GW-RT-Failed Automated testing with global-workflow failed on Orion and removed orion-GW-RT-Running Automated testing with global-workflow running on Orion labels Jul 12, 2023
@emcbot

emcbot commented Jul 12, 2023

Automated Global-Workflow GDASApp Testing Results:
Machine: hera

Start: Wed Jul 12 19:36:51 UTC 2023 on hfe07
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Wed Jul 12 20:53:04 UTC 2023
---------------------------------------------------
Tests:                                  *Failed*
Tests: Failed at Wed Jul 12 21:09:58 UTC 2023
Tests: 92% tests passed, 4 tests failed out of 49
	1530 - test_gdasapp_atm_jjob_var_run (Failed)
	1531 - test_gdasapp_atm_jjob_var_final (Failed)
	1533 - test_gdasapp_atm_jjob_ens_run (Failed)
	1534 - test_gdasapp_atm_jjob_ens_final (Failed)
Tests: see output at /scratch1/NCEPDEV/da/Cory.R.Martin/CI/GDASApp/workflow/PR/530/global-workflow/sorc/gdas.cd/build/log.ctest

@emcbot emcbot added hera-GW-RT-Failed Automated testing with global-workflow failed on Hera and removed hera-GW-RT-Running Automated testing with global-workflow running on Hera labels Jul 12, 2023
@emilyhcliu
Collaborator

@CoryMartin-NOAA Does this mean we have already decided to switch to CRTM 2.4 even though this change would break the current ctests?

@RussTreadon-NOAA
Contributor

Hera test

Install g-w develop at 84842f4 and GDASApp feature/jack-bauer. Run ctests. Four tests fail.

The following tests FAILED:
        1530 - test_gdasapp_atm_jjob_var_run (Failed)
        1531 - test_gdasapp_atm_jjob_var_final (Failed)
        1533 - test_gdasapp_atm_jjob_ens_run (Failed)
        1534 - test_gdasapp_atm_jjob_ens_final (Failed)

The var_run and ens_run tests failed with CRTM coefficient errors.

1:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
3:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
5:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
0:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
2:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
4:  AerosolCoeff_ValidRelease(INFORMATION) : A AerosolCoeff data update is needed. AerosolCoeff release is 3. Valid release is 4.
5:  AerosolCoeff_ReadFile(Binary)(FAILURE) : AerosolCoeff Release check failed.
5:  CRTM_AerosolCoeff_Load(FAILURE) : Error reading AerosolCoeff file /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/develop/sorc/gdas.cd/build/test/atm/global-workflow/testrun/RUNDIRS/gdas_test/gdasatmanl_18/crtm/AerosolCoeff.bin
3:  AerosolCoeff_ReadFile(Binary)(FAILURE) : AerosolCoeff Release check failed.
3:  CRTM_AerosolCoeff_Load(FAILURE) : Error reading AerosolCoeff file /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/develop/sorc/gdas.cd/build/test/atm/global-workflow/testrun/RUNDIRS/gdas_test/gdasatmanl_18/crtm/AerosolCoeff.bin
3:  CRTM_Init(FAILURE) : Error loading AerosolCoeff data from AerosolCoeff.bin
3:  ufo_radiancecrtm_simobs(FAILURE) :  Error initializing CRTM on rank            3
5:  CRTM_Init(FAILURE) : Error loading AerosolCoeff data from AerosolCoeff.bin
5:  ufo_radiancecrtm_simobs(FAILURE) :  Error initializing CRTM on rank            5
1:  AerosolCoeff_ReadFile(Binary)(FAILURE) : AerosolCoeff Release check failed.
1:  CRTM_AerosolCoeff_Load(FAILURE) : Error reading AerosolCoeff file /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/develop/sorc/gdas.cd/build/test/atm/global-workflow/testrun/RUNDIRS/gdas_test/gdasatmanl_18/crtm/AerosolCoeff.bin
1:  CRTM_Init(FAILURE) : Error loading AerosolCoeff data from AerosolCoeff.bin
3: Abort from ufo_radiancecrtm_simobs
5: Abort from ufo_radiancecrtm_simobs
1:  ufo_radiancecrtm_simobs(FAILURE) :  Error initializing CRTM on rank            1
1: Abort from ufo_radiancecrtm_simobs

The var_final and ens_final tests failed because expected output from the run jobs was not available.

The var_init and ens_init jobs copy CRTM coefficients from g-w directory $HOMEgfs/fix/gdas/crtm/2.3.0. The g-w crtm directory points at /scratch1/NCEPDEV/global/glopara/fix/gdas/crtm/20220805. These are v2.3.0 coefficients. Looks like we need to copy and use v2.4.0 coefficients.

@RussTreadon-NOAA
Contributor

Tried the following as a test:

  1. create g-w fix/gdas/crtm/2.4.0 as a link to /scratch2/NCEPDEV/nwprod/hpc-stack/libs/hpc-stack/intel-18.0.5.274/crtm/2.4.0/fix
  2. change crtm_VERSION=2.3.0 to crtm_VERSION=2.4.0 in GDASApp test/atm/global-workflow/config.atmanl

This failed because I forgot that gsi.x reads big-endian CRTM coefficient files, while fv3jedi_var.x reads little-endian CRTM files.
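
A quick way to check which byte order a given coefficient file uses, before pointing an executable at it, might look like the sketch below. It assumes the file is a Fortran sequential unformatted file with 4-byte record-length markers before and after each record; that is an assumption about the file layout, not something confirmed in this thread. The function name `detect_record_endianness` is hypothetical.

```python
import struct
from typing import Optional


def detect_record_endianness(path: str) -> Optional[str]:
    """Guess the byte order of a Fortran sequential unformatted file.

    Assumption: each record is framed by a 4-byte length marker before
    and after the payload. The first record's leading marker is read
    under both byte orders; the order whose trailing marker matches
    is reported.
    """
    with open(path, "rb") as f:
        head = f.read(4)
        if len(head) < 4:
            return None  # too short to contain a record marker
        for name, fmt in (("little", "<i"), ("big", ">i")):
            (length,) = struct.unpack(fmt, head)
            if length <= 0:
                continue
            # Jump past the payload to where the trailing marker should be.
            f.seek(4 + length)
            tail = f.read(4)
            if len(tail) == 4 and struct.unpack(fmt, tail)[0] == length:
                return name
    return None  # neither byte order gave a consistent first record
```

For example, `detect_record_endianness("AerosolCoeff.bin")` run inside the staged crtm directory would return `"little"`, `"big"`, or `None` if the file does not look like a framed record file at all.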

Running ufo_get_crtm_test_data pulls a CRTM tarball from the server with coefficients in a distributed layout (i.e., separate directories AerosolCoeff, CloudCoeff, EmisCoeff, SpcCoeff, TauCoeff). fv3jedi_var.x and gsi.x expect a single flat directory containing all coefficient types.

Do we have a script to remap the distributed directory structure to the flat directory?

@CoryMartin-NOAA
Contributor Author

@emilyhcliu The develop branch of UFO requires CRTM 2.4 to build. We have a few options here, none of them ideal in my opinion:

  1. Switch to CRTM 2.4. This is the easiest, but it would require us to update the "Darth Vader" tests ASAP, as they would likely fail.
  2. Keep a branch of UFO that we keep up to date with develop, but manually edit it to support CRTM 2.3. This is somewhat labor intensive.
  3. Open a PR to UFO to make the minimum version 2.3. It was 3.0, and Dan changed it to 2.4. We can go to 2.3 (that is my fault for thinking 2.4 would be fine going forward).

Thoughts? As @RussTreadon-NOAA showed, we need 2.4 coefficients for CRTM 2.4. I forget how the JEDI fix directory was created... GSI had a script to stage the CRTM coefficients, right?

@RussTreadon-NOAA
Contributor

With the splitting of our single GSI repo into several functional GSI repos, it looks like the script to stage crtm fix files got lost ... at least I can't find it yet. Perhaps the fix file script resides with the NCEPLIBS crtm install scripts.

@RussTreadon-NOAA
Contributor

Found link_crtm_coeffs.sh in an old GSI hash. The script didn't work out of the box: I needed to comment out the special links for airs, cris, and iasi at the end of the script. Used the revised script to populate the flat directory $HOMEgfs/fix/gdas/crtm/2.4.0 from the distributed directory $HOMEgfs/fix/gdas/crtm/2.4.1_skylab_4.0.
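
For reference, the general idea of mapping a distributed tree onto a flat directory can be sketched as below. This is not link_crtm_coeffs.sh and does not reproduce its special cases (including the airs, cris, and iasi links mentioned above). It assumes, without confirmation from the tarball, that files of the wanted byte order sit under a subdirectory named `Little_Endian` somewhere below each coefficient type; the function name `flatten_coeffs` and the `endian_dir` parameter are hypothetical.

```python
from pathlib import Path


def flatten_coeffs(src_root, dest, endian_dir="Little_Endian"):
    """Symlink coefficient files from a distributed tree into one flat directory.

    Assumption: wanted files live under a directory named `endian_dir`
    somewhere below `src_root`. If two files share a name, the first one
    found (in sorted path order) is linked and the rest are reported.
    """
    src_root = Path(src_root)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    linked, skipped = [], []
    for path in sorted(src_root.rglob("*")):
        if not path.is_file():
            continue
        # Only consider files below a directory with the requested name.
        if endian_dir not in path.relative_to(src_root).parts:
            continue
        target = dest / path.name
        if target.exists() or target.is_symlink():
            skipped.append(path)  # name collision: keep the earlier link
            continue
        target.symlink_to(path.resolve())
        linked.append(target)
    return linked, skipped
```

The returned `skipped` list is worth reviewing by hand, since a name collision means two source files competed for the same flat name and only one was linked.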

Set crtm_VERSION=2.4.0 in GDASApp test/atm/global-workflow/config.atmanl and g-w parm/config/gfs/config.atmensanl. (Yes, changing crtm_VERSION in two different locations is problematic.)
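
Until the version is set in one place, a small guard could catch the two files drifting apart. The sketch below only looks for lines of the form `crtm_VERSION=...` (optionally prefixed with `export`), which matches how the setting is written in this thread; the helper names `crtm_versions` and `check_consistent` are hypothetical, and whether the real config files compute the value some other way is not checked here.

```python
import re
from pathlib import Path

# Matches `crtm_VERSION=2.4.0`, `export crtm_VERSION="2.4.0"`, etc.
_PATTERN = re.compile(
    r"^[ \t]*(?:export[ \t]+)?crtm_VERSION=[\"']?([^\"'\s#]+)",
    re.MULTILINE,
)


def crtm_versions(config_paths):
    """Return {path: version or None} for each config file.

    If a file assigns crtm_VERSION more than once, the last assignment
    is reported, mirroring what sourcing the file would leave set.
    """
    found = {}
    for p in config_paths:
        matches = _PATTERN.findall(Path(p).read_text())
        found[str(p)] = matches[-1] if matches else None
    return found


def check_consistent(config_paths):
    """Return (ok, versions); ok is True only if every file sets the same version."""
    versions = crtm_versions(config_paths)
    distinct = set(versions.values())
    ok = len(distinct) == 1 and None not in distinct
    return ok, versions
```

Calling `check_consistent([...])` with the paths to config.atmanl and config.atmensanl returns `False` plus the per-file values whenever the two disagree or one of them does not set the variable.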

Reran `test_gdasapp_atm_jjob`. All 6 jobs completed with _Passed_.

(gdasapp) Hera(hfe11):/scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/develop/sorc/gdas.cd/build$ ctest -R test_gdasapp_atm_jjob
Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/develop/sorc/gdas.cd/build
    Start 1529: test_gdasapp_atm_jjob_var_init
1/6 Test #1529: test_gdasapp_atm_jjob_var_init ....   Passed   43.16 sec
    Start 1530: test_gdasapp_atm_jjob_var_run
2/6 Test #1530: test_gdasapp_atm_jjob_var_run .....   Passed  140.42 sec
    Start 1531: test_gdasapp_atm_jjob_var_final
3/6 Test #1531: test_gdasapp_atm_jjob_var_final ...   Passed   42.17 sec
    Start 1532: test_gdasapp_atm_jjob_ens_init
4/6 Test #1532: test_gdasapp_atm_jjob_ens_init ....   Passed   44.32 sec
    Start 1533: test_gdasapp_atm_jjob_ens_run
5/6 Test #1533: test_gdasapp_atm_jjob_ens_run .....   Passed  300.45 sec
    Start 1534: test_gdasapp_atm_jjob_ens_final
6/6 Test #1534: test_gdasapp_atm_jjob_ens_final ...   Passed   42.10 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 613.03 sec

I advise against using my $HOMEgfs/fix/gdas/crtm/2.4.0. Someone more knowledgeable about the CRTM should set up a little-endian crtm/2.4.0 flat directory.

@CoryMartin-NOAA
Contributor Author

I'm wondering whether we should instead stick with CRTM 2.3 a bit longer, using the UFO feature/gdasapp branch, so that things can continue to run as intended?

@CoryMartin-NOAA CoryMartin-NOAA marked this pull request as draft July 13, 2023 19:43
@CoryMartin-NOAA CoryMartin-NOAA removed hera-GW-RT-Failed Automated testing with global-workflow failed on Hera orion-GW-RT-Failed Automated testing with global-workflow failed on Orion labels Jul 13, 2023
@CoryMartin-NOAA CoryMartin-NOAA deleted the feature/jack-bauer branch August 10, 2023 18:28
4 participants