
Variables missing from ozone diagnostic files #618

Closed
RussTreadon-NOAA opened this issue Sep 5, 2023 · 15 comments · Fixed by #619

@RussTreadon-NOAA
Contributor

GSI PR #591 modified the information written to GSI ozone diagnostic files. Some variables are now missing and this breaks the EnKF update step, eupd, in the global-workflow. This issue is opened to document the problem and develop a fix.

@RussTreadon-NOAA
Contributor Author

Tag @CoryMartin-NOAA for awareness. g-w PR #1835 likely depends on resolution of the ozone diagnostic file problem documented in this issue.

@CoryMartin-NOAA
Contributor

CoryMartin-NOAA commented Sep 5, 2023

For extra information:

Previous versions of GSI had ozone diagnostic files that had these fields:

	int Observation_Operator_Jacobian_stind(nobs, Observation_Operator_Jacobian_stind_arr_dim) ;
	int Observation_Operator_Jacobian_endind(nobs, Observation_Operator_Jacobian_endind_arr_dim) ;
	float Observation_Operator_Jacobian_val(nobs, Observation_Operator_Jacobian_val_arr_dim) ;

But in develop, they have only this field:

	float Observation_Operator_Jacobian(nobs, Observation_Operator_Jacobian_arr_dim) ;

I'm not sure if this is an intended change or not. If so, then the EnKF code needs to be updated appropriately to handle reading these ozone diagnostic files.

Conventional and radiance diagnostic files still have the stind, endind, and val fields in them.
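For context, the stind/endind/val triple is a sparse encoding of each observation's Jacobian row: st_ind/end_ind mark contiguous blocks of nonzero state-space indices and val packs the nonzero values. A minimal sketch of how one such row would expand into the single dense array develop now writes (function name invented; 1-based indices and contiguous blocks assumed, as the sparr2 usage suggests):

```python
def expand_sparse_row(stind, endind, val, nstate):
    """Expand one observation's sparse Jacobian row into a dense
    state-space row of length nstate (indices assumed 1-based)."""
    dense = [0.0] * nstate
    pos = 0  # cursor into the packed nonzero values
    for s, e in zip(stind, endind):
        length = e - s + 1              # contiguous block of nonzeros
        dense[s - 1:e] = val[pos:pos + length]
        pos += length
    return dense
```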

@RussTreadon-NOAA
Contributor Author

Jim Jung reports the same problem from his global parallels on S4.

Here are the errors I sent to Dave Huber about the enkf failures.

I updated my version of the gsi last week and now the enkf portion of my workflow fails. Here are parts of the errors.

3: enkf.x 00000000008F1DCE ncdr_vars_fetch_m 393 ncdr_vars_fetch.f90
3: enkf.x 000000000047C2BD readconvobs_mp_ge 625 readconvobs.f90
3: enkf.x 000000000045F867 mpi_readobs_mp_mp 184 mpi_readobs.f90
3: enkf.x 000000000042B8FC enkf_obsmod_mp_re 197 enkf_obsmod.f90
3: enkf.x 0000000000410CDD MAIN__ 157 enkf_main.f90

The other portion of the error is this:

I found another error with the enkf from the RUNDIR directory.
41: ** ERROR: The specified variable 'Observation_Operator_Jacobian_stind' does not exist!
41: ** Failed to read NetCDF4.
40: ** ERROR: The specified variable 'Observation_Operator_Jacobian_stind' does not exist!
40: ** Failed to read NetCDF4.

@RussTreadon-NOAA
Contributor Author

Orion test

I reproduced @CoryMartin-NOAA's and @wx20jjung's eupd failures using NOAA-EMC/GSI develop at be4a3d9 and @CoryMartin-NOAA's g-w in PR #1835.

GSI PR #591 added a call to fullarray in the jacobian section of setupoz.f90. A grep on fullarray shows that this routine is called only in src/gsi/setupoz.f90. It is not referenced in src/enkf.

./gsi/setupq.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setuprw.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupps.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/sparsearr.f90:public writearray, readarray, fullarray
./gsi/sparsearr.f90:interface fullarray
./gsi/sparsearr.f90:  module procedure fullarray_sparr2
./gsi/sparsearr.f90:subroutine fullarray_sparr2(this, array, ierr)
./gsi/sparsearr.f90:end subroutine fullarray_sparr2
./gsi/setuplwcp.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupoz.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupoz.f90:                    call fullarray(dhx_dx, dhx_dx_array)
./gsi/setupspd.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setuppw.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setuprad.f90:  use sparsearr, only: sparr2, new, writearray, size, fullarray
./gsi/setupw.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupdw.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupt.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/genstats_gps.f90:  use sparsearr, only: sparr2, readarray, fullarray
./gsi/setuptcp.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupswcp.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray
./gsi/setupdbz.f90:  use sparsearr, only: sparr2, new, size, writearray, fullarray

The src/gsi/setupoz.f90 usage is

                 if (save_jacobian) then
                    call fullarray(dhx_dx, dhx_dx_array)
                    call nc_diag_data2d("Observation_Operator_Jacobian", dhx_dx_array)
                 endif

However, src/enkf/readozobs.f90 expects

         if (lobsdiag_forenkf) then
            call nc_diag_read_get_global_attr(iunit, "jac_nnz", nnz)
            call nc_diag_read_get_global_attr(iunit, "jac_nind", nind)
            allocate(Observation_Operator_Jacobian_stind(nind, nobs_curr))
            allocate(Observation_Operator_Jacobian_endind(nind, nobs_curr))
            allocate(Observation_Operator_Jacobian_val(nnz, nobs_curr))
            call nc_diag_read_get_var(iunit,'Observation_Operator_Jacobian_stind', Observation_Operator_Jacobian_stind)
            call nc_diag_read_get_var(iunit,'Observation_Operator_Jacobian_endind', Observation_Operator_Jacobian_endind)
            call nc_diag_read_get_var(iunit,'Observation_Operator_Jacobian_val', Observation_Operator_Jacobian_val)
         endif
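One quick way to see this mismatch is to compare the variable names a diag file declares against the names readozobs.f90 requests. A hedged sketch (helper name invented) that scans the text of an `ncdump -h` listing for the three EnKF-required Jacobian variables:

```python
import re

REQUIRED = {
    "Observation_Operator_Jacobian_stind",
    "Observation_Operator_Jacobian_endind",
    "Observation_Operator_Jacobian_val",
}

def missing_jacobian_vars(ncdump_header):
    """Return the EnKF-required Jacobian variables absent from an
    `ncdump -h` listing of an ozone diag file."""
    declared = set(re.findall(r"\b(?:int|float)\s+(\w+)\(", ncdump_header))
    return sorted(REQUIRED - declared)
```

Running this against the develop-era header above would report all three variables missing, matching the `enkf.x` error messages.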

@HaixiaLiu-NOAA made the following comment in issue #564

HaixiaLiu-NOAA commented on Jun 15
I believe that some ozone-related changes would cause the enkf job to fail. Has the enkf job been run to test the changes?

This echoes @CoryMartin-NOAA 's question.

There is a mismatch between how gsi.x writes the ozone jacobian and how enkf.x reads it. As a test, I reverted the jacobian block in setupoz.f90 back to

                 if (save_jacobian) then
                   call nc_diag_data2d("Observation_Operator_Jacobian_stind", dhx_dx%st_ind)
                   call nc_diag_data2d("Observation_Operator_Jacobian_endind", dhx_dx%end_ind)
                   call nc_diag_data2d("Observation_Operator_Jacobian_val", real(dhx_dx%val,r_single))
                 endif

eobs, ediag, and eupd were rewound and rebooted. This time eupd ran to completion.

@jack-woollen's reply indicates that the jacobian question was raised by @jswhit and that the issue has been fixed:

jack-woollen commented on Jun 15
@HaixiaLiu-NOAA I tested it without enkf. Are you referring to the issue @jswhit mentioned regarding missing Jacobian writes to the diag files in setupozlev? If so that has been fixed. If not what other changes are problematic?

@jack-woollen and @jswhit , did we lose a change to src/enkf/readozobs.f90 in PR #591?

We need a fix. We cannot run global parallels using NOAA-EMC/GSI develop at 9e5aa09c or hashes after PR #591.

@jack-woollen
Contributor

@RussTreadon-NOAA @CoryMartin-NOAA Sorry, I hadn't a clue before reading the latest notes. Now I see, comparing to one version back, that three lines were removed from the setupoz routine setupozlay. This was probably done at Goddard. The three calls removed were presumably replaced by the commented-out lines. If simply reverting these back in fixes the problem, that would be nice. Thanks for the pointers.

             if (save_jacobian) then
                !call fullarray(dhx_dx, dhx_dx_array)
                !call nc_diag_data2d("Observation_Operator_Jacobian", dhx_dx_array)
                call nc_diag_data2d("Observation_Operator_Jacobian_stind", dhx_dx%st_ind)
                call nc_diag_data2d("Observation_Operator_Jacobian_endind", dhx_dx%end_ind)
                call nc_diag_data2d("Observation_Operator_Jacobian_val", real(dhx_dx%val,r_single))
             endif

@RussTreadon-NOAA
Contributor Author

Thank you @jack-woollen for taking a look. Let me create a branch off develop, update the code, and get back to @CoryMartin-NOAA and you.

@RussTreadon-NOAA
Contributor Author

Created branch RussTreadon-NOAA/feature/oznstat as a copy of develop at d7ac706. Work to add the missing variables back into the ozone diagnostic file will be done in this branch.

@RussTreadon-NOAA
Contributor Author

Orion ctests
Install develop at d7ac706 and RussTreadon-NOAA:feature/oznstat at 8d33344. Run ctests with the following results:

Orion-login-2:/work2/noaa/da/rtreadon/git/gsi/oznstat/build$ ctest -j 9
Test project /work2/noaa/da/rtreadon/git/gsi/oznstat/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #9: global_enkf ......................   Passed  488.46 sec
2/9 Test #8: netcdf_fv3_regional ..............   Passed  602.93 sec
3/9 Test #4: hwrf_nmm_d2 ......................   Passed  607.38 sec
4/9 Test #7: rrfs_3denvar_glbens ..............   Passed  665.64 sec
5/9 Test #5: hwrf_nmm_d3 ......................   Passed  1034.79 sec
6/9 Test #6: rtma .............................***Failed  1389.65 sec
7/9 Test #3: global_4denvar ...................   Passed  1621.73 sec
8/9 Test #2: global_4dvar .....................   Passed  2042.12 sec
9/9 Test #1: global_3dvar .....................   Passed  2042.39 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 2042.41 sec

The following tests FAILED:
          6 - rtma (Failed)
Errors while running CTest
Output from these tests are in: /work2/noaa/da/rtreadon/git/gsi/oznstat/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

The rtma failure is due to

The runtime for rtma_loproc_updat is 304.258784 seconds.  This has exceeded maximum allowable threshold time of 297.173804 seconds,
resulting in Failure time-thresh of the regression test.

A check of the gsi.x wall times shows that the loproc_updat did run longer than the contrl. Wall time variability is known behavior on Orion. This is not a fatal fail.

rtma_hiproc_contrl/stdout:The total amount of wall time                        = 266.088256
rtma_hiproc_updat/stdout:The total amount of wall time                        = 272.302351
rtma_loproc_contrl/stdout:The total amount of wall time                        = 270.158004
rtma_loproc_updat/stdout:The total amount of wall time                        = 304.258784
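The quoted threshold (297.173804 s) equals the loproc_contrl wall time (270.158004 s) plus 10%, so the regression check appears to allow a 10% margin over the control. A minimal sketch of that logic (function name hypothetical; the actual regression script may differ):

```python
def time_threshold_check(updat_time, contrl_time, tolerance=0.10):
    """Return (passed, threshold): the updat run fails the time-thresh
    test if it exceeds the contrl wall time by more than `tolerance`."""
    threshold = contrl_time * (1.0 + tolerance)
    return updat_time <= threshold, threshold
```

With the wall times above, 304.258784 s against a 297.173804 s threshold reproduces the reported failure.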

Rerun rtma ctest with following result

Orion-login-2:/work2/noaa/da/rtreadon/git/gsi/oznstat/build$ ctest -R rtma
Test project /work2/noaa/da/rtreadon/git/gsi/oznstat/build
    Start 6: rtma
1/1 Test #6: rtma .............................***Failed  1208.84 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 1209.03 sec

The following tests FAILED:
          6 - rtma (Failed)
Errors while running CTest
Output from these tests are in: /work2/noaa/da/rtreadon/git/gsi/oznstat/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

The failure is due to

The case has Failed the scalability test.
The slope for the update (1.115670 seconds per node) is less than that for the control (36.860682 seconds per node).

The gsi.x wall times for this run are

rtma_hiproc_contrl/stdout:The total amount of wall time                        = 247.356460
rtma_hiproc_updat/stdout:The total amount of wall time                        = 267.385722
rtma_loproc_contrl/stdout:The total amount of wall time                        = 284.217142
rtma_loproc_updat/stdout:The total amount of wall time                        = 265.154383

The updat gsi.x wall times are comparable between the low and high task count jobs. The wall time difference between low and high task counts is greater for the contrl. This is not a fatal fail.
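The scalability check compares a seconds-per-node slope between updat and contrl. One plausible form of that slope (the actual regression-script formula may differ; the node counts here are assumptions for illustration):

```python
def scalability_slope(t_loproc, t_hiproc, nodes_lo, nodes_hi):
    """Wall-time seconds recovered per node added when moving from the
    low-task-count to the high-task-count configuration."""
    return (t_loproc - t_hiproc) / (nodes_hi - nodes_lo)
```

A run that speeds up strongly with added nodes yields a large positive slope; the failure above fires when the updat slope falls below the contrl slope.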

Rerun the rtma test one more time. This time the test passed.

Orion-login-2:/work2/noaa/da/rtreadon/git/gsi/oznstat/build$ ctest -R rtma
Test project /work2/noaa/da/rtreadon/git/gsi/oznstat/build
    Start 6: rtma
1/1 Test #6: rtma .............................   Passed  1271.30 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 1271.32 sec

Hera ctests
Repeat the above on Hera with the following results

Hera(hfe09):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/oznstat/build$ ctest -j 9
Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/oznstat/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #4: hwrf_nmm_d2 ......................   Passed  606.84 sec
2/9 Test #8: netcdf_fv3_regional ..............   Passed  663.44 sec
3/9 Test #7: rrfs_3denvar_glbens ..............   Passed  666.47 sec
4/9 Test #5: hwrf_nmm_d3 ......................   Passed  733.48 sec
5/9 Test #9: global_enkf ......................   Passed  1083.75 sec
6/9 Test #3: global_4denvar ...................   Passed  1726.39 sec
7/9 Test #2: global_4dvar .....................   Passed  1802.79 sec
8/9 Test #6: rtma .............................   Passed  1811.03 sec
9/9 Test #1: global_3dvar .....................   Passed  1908.18 sec

100% tests passed, 0 tests failed out of 9

Total Test time (real) = 1908.19 sec

This is an expected result. The change in RussTreadon-NOAA:feature/oznstat only impacts the ozone netcdf diagnostic file. None of the ctest checks examine diagnostic file output. The enkf test uses canned ozone netcdf diagnostic files. The error triggering this issue was only detected when running the eupd (enkf update) step in a global parallel.

WCOSS2 ctests
Repeat the above on Dogwood with the following results.

russ.treadon@dlogin05:/lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/oznstat/build> ctest -j 9
Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/oznstat/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #9: global_enkf ......................   Passed  758.17 sec
2/9 Test #8: netcdf_fv3_regional ..............***Failed  791.63 sec
3/9 Test #7: rrfs_3denvar_glbens ..............   Passed  793.19 sec
4/9 Test #5: hwrf_nmm_d3 ......................   Passed  1683.42 sec
5/9 Test #4: hwrf_nmm_d2 ......................   Passed  1717.86 sec
6/9 Test #6: rtma .............................   Passed  1896.28 sec
7/9 Test #2: global_4dvar .....................   Passed  2029.86 sec
8/9 Test #1: global_3dvar .....................   Passed  2030.62 sec
9/9 Test #3: global_4denvar ...................   Passed  2134.99 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 2135.05 sec

The following tests FAILED:
          8 - netcdf_fv3_regional (Failed)
Errors while running CTest

The netcdf_fv3_regional test failure is due to

The case has Failed the scalability test.
The slope for the update (2.132118 seconds per node) is less than that for the control (5.284001 seconds per node).

A check of the gsi.x wall times does not show anomalous behavior

netcdf_fv3_regional_hiproc_contrl/stdout:The total amount of wall time                        = 62.371075
netcdf_fv3_regional_hiproc_updat/stdout:The total amount of wall time                        = 63.579778
netcdf_fv3_regional_loproc_contrl/stdout:The total amount of wall time                        = 67.655076
netcdf_fv3_regional_loproc_updat/stdout:The total amount of wall time                        = 65.285473

Summary
The ctests results on Orion, Hera, and WCOSS2 are acceptable.

@RussTreadon-NOAA
Contributor Author

RussTreadon-NOAA commented Sep 10, 2023

20210814 18 gdas C192L127 test on Orion

Run g-w eobs, ediag, and eupd using RussTreadon-NOAA:feature/oznstat gsi.x. While enkf.x ran to completion, the final analysis increment statistics indicate problems.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -8279.20898438        5397.09667969
  0: ens. mean anal. increment min/max  v   -4427.13867188        4246.75292969
  0: ens. mean anal. increment min/max  tv   -786.219848633        1734.14416504
  0: ens. mean anal. increment min/max  q   -1.00134491920        1.11288321018
  0: ens. mean anal. increment min/max  oz  -0.254847807810E-03   0.293574848911E-03
  0: ens. mean anal. increment min/max  ps   -1803.67932129        1230.38708496

The minimum and maximum increments for u, v, tv, ps are non-physical.

As a test, rerun with the oznstat file removed. This time the increment ranges look reasonable.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -28.3665580750        29.3495750427
  0: ens. mean anal. increment min/max  v   -37.3808593750        37.9873161316
  0: ens. mean anal. increment min/max  tv   -11.8124332428        14.6227235794
  0: ens. mean anal. increment min/max  q  -0.791083090007E-02   0.571870524436E-02
  0: ens. mean anal. increment min/max  oz  -0.153843575390E-05   0.176900198312E-05
  0: ens. mean anal. increment min/max  ps   -7.21317768097        11.7183351517

Run enkf.x with oznstat only containing omi ozone data. Analysis increments are reasonable.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -28.3665523529        49.6735534668
  0: ens. mean anal. increment min/max  v   -37.3808403015        43.9031410217
  0: ens. mean anal. increment min/max  tv   -12.1193714142        14.6227235794
  0: ens. mean anal. increment min/max  q  -0.791083090007E-02   0.571870524436E-02
  0: ens. mean anal. increment min/max  oz  -0.253697317021E-05   0.261416380454E-05
  0: ens. mean anal. increment min/max  ps   -7.21317768097        11.7183475494

Run enkf.x with oznstat only containing ompstc8_n20. Analysis increments are larger for u but are generally comparable with the run with omi.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -53.8940238953        60.4213790894
  0: ens. mean anal. increment min/max  v   -39.7494506836        37.9872932434
  0: ens. mean anal. increment min/max  tv   -15.6424007416        14.6227235794
  0: ens. mean anal. increment min/max  q  -0.791082624346E-02   0.571870664135E-02
  0: ens. mean anal. increment min/max  oz  -0.285660780719E-05   0.484466681883E-05
  0: ens. mean anal. increment min/max  ps   -7.21316528320        11.7183475494

Run enkf.x with oznstat only containing ompstc8_npp. Analysis increments are larger for u, v, t, and ps but within the same order of magnitude as the run with omi.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -71.5207977295        58.9594726562
  0: ens. mean anal. increment min/max  v   -41.9071540833        51.1254043579
  0: ens. mean anal. increment min/max  tv   -20.0045509338        16.4163913727
  0: ens. mean anal. increment min/max  q  -0.791083090007E-02   0.571870571002E-02
  0: ens. mean anal. increment min/max  oz  -0.346155388797E-05   0.468888401883E-05
  0: ens. mean anal. increment min/max  ps   -10.4861326218        14.0718746185

Run enkf.x with oznstat file containing ompstc8_n20 and ompstc8_npp. Analysis increments are larger for u, v, t, and ps but within the same order of magnitude as the run with omi.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -75.8891601562        66.7193756104
  0: ens. mean anal. increment min/max  v   -47.9710388184        50.9767074585
  0: ens. mean anal. increment min/max  tv   -19.3403663635        17.6053104401
  0: ens. mean anal. increment min/max  q  -0.791083090007E-02   0.571870664135E-02
  0: ens. mean anal. increment min/max  oz  -0.331128057951E-05   0.519078048455E-05
  0: ens. mean anal. increment min/max  ps   -12.6932802200        15.6209535599

Run enkf.x with oznstat file containing omi and ompstc8_n20. Analysis increments for u, v, tv, q, and ps are non-physical, orders of magnitude larger than in the run with omi alone.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -4566.12792969        5120.05273438
  0: ens. mean anal. increment min/max  v   -5155.38671875        4625.09472656
  0: ens. mean anal. increment min/max  tv   -1120.18688965        1617.76428223
  0: ens. mean anal. increment min/max  q  -0.980346858501        1.35617148876
  0: ens. mean anal. increment min/max  oz  -0.201713948627E-03   0.297629332636E-03
  0: ens. mean anal. increment min/max  ps   -1031.86730957        1240.99975586

Run enkf.x with oznstat file containing omi and ompstc8_npp. Analysis increments are larger for u, v, t, and ps but within the same order of magnitude as the run with omi.

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -71.5208358765        58.9595031738
  0: ens. mean anal. increment min/max  v   -41.9071731567        51.1253318787
  0: ens. mean anal. increment min/max  tv   -20.0045547485        16.4163780212
  0: ens. mean anal. increment min/max  q  -0.791083183140E-02   0.571870431304E-02
  0: ens. mean anal. increment min/max  oz  -0.346155002262E-05   0.487613124278E-05
  0: ens. mean anal. increment min/max  ps   -10.4861879349        14.0718746185

Why does the combination of omi and ompstc8_n20 yield such large analysis increments? Each by itself does not result in non-physical increments.

A closer look at setupoz.f90 is necessary.
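Runs like the one above can be screened automatically by comparing each field's increment range against rough physical bounds. A hypothetical sketch (the bounds are illustrative only, not operational limits):

```python
# Rough, illustrative magnitude bounds for ensemble-mean analysis increments.
PHYSICAL_BOUNDS = {"u": 150.0, "v": 150.0, "tv": 50.0, "ps": 100.0}

def nonphysical_fields(increments):
    """Flag fields whose |min| or |max| increment exceeds its rough bound.
    `increments` maps field name -> (min, max)."""
    return sorted(
        f for f, (lo, hi) in increments.items()
        if f in PHYSICAL_BOUNDS and max(abs(lo), abs(hi)) > PHYSICAL_BOUNDS[f]
    )
```

Applied to the omi + ompstc8_n20 run it would flag u, v, tv, and ps, while the single-sensor runs pass.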

@RussTreadon-NOAA
Contributor Author

Tagging @jack-woollen , @HaixiaLiu-NOAA , @jswhit , and @CoryMartin-NOAA for awareness. I'll repeat the omi & ompstc8_n20 test tomorrow on Hera to confirm similar behavior.

@jack-woollen
Contributor

It's possible I made a mistake in commenting out the first two lines of the if block below. Maybe they should go together with the other three.

   if (save_jacobian) then
            !call fullarray(dhx_dx, dhx_dx_array)
            !call nc_diag_data2d("Observation_Operator_Jacobian", dhx_dx_array)
            call nc_diag_data2d("Observation_Operator_Jacobian_stind", dhx_dx%st_ind)
            call nc_diag_data2d("Observation_Operator_Jacobian_endind", dhx_dx%end_ind)
            call nc_diag_data2d("Observation_Operator_Jacobian_val", real(dhx_dx%val,r_single))
         endif

@RussTreadon-NOAA
Contributor Author

Hera test
Install g-w develop with feature/oznstat on Hera. Rerun the 20210814 18 gdas case. The oznstat files read in the test include omi_aura, ompstc8_npp, and ompstc8_n20. The {min,max} range of analysis increments is large for u, v, t, and ps

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -81.1819305420        75.4771728516
  0: ens. mean anal. increment min/max  v   -103.093261719        82.0465011597
  0: ens. mean anal. increment min/max  tv   -46.2415657043        27.5214729309
  0: ens. mean anal. increment min/max  q  -0.767223024741E-02   0.103665869683E-01
  0: ens. mean anal. increment min/max  oz  -0.523680591868E-05   0.577472474106E-05
  0: ens. mean anal. increment min/max  ps   -29.9680366516        21.8856506348

However, it is not the orders-of-magnitude-larger range found on Orion.

Additional investigation is necessary.

@RussTreadon-NOAA
Contributor Author

Change GSI hash in working copy of Hera g-w to install GSI develop at accb07e2. This is the hash prior to the ozone changes from PR #591. Rerun 20210814 18 gdas eobs, ediag, and eupd. The enkf.x analysis increments are

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -81.1819305420        75.4771728516
  0: ens. mean anal. increment min/max  v   -103.093261719        82.0465011597
  0: ens. mean anal. increment min/max  tv   -46.2415657043        27.5214729309
  0: ens. mean anal. increment min/max  q  -0.767223024741E-02   0.103665869683E-01
  0: ens. mean anal. increment min/max  oz  -0.523680591868E-05   0.577472474106E-05
  0: ens. mean anal. increment min/max  ps   -29.9680366516        21.8856506348

These are exactly the same {min,max} range values generated from the run using feature/oznstat. This is the expected result. The changes committed in PR #591 were not supposed to change gsi.x or enkf.x results.

This calls into question the 9/10 tests run on Orion. The builds and setup need to be re-examined to ensure everything was done correctly.

@RussTreadon-NOAA
Contributor Author

Orion tests - rerun

Install fresh copy of g-w develop at 06c2e284.

Repeat Hera test by first installing GSI develop at accb07e2. Run 20210814 18 gdas eobs, ediag, and eupd. The following range of analysis increments were generated by enkf.x

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -69.7386932373        46.7033233643
  0: ens. mean anal. increment min/max  v   -87.8862152100        48.1129302979
  0: ens. mean anal. increment min/max  tv   -19.2770118713        15.6233501434
  0: ens. mean anal. increment min/max  q  -0.808076281101E-02   0.816340185702E-02
  0: ens. mean anal. increment min/max  oz  -0.248382752943E-05   0.388220405512E-05
  0: ens. mean anal. increment min/max  ps   -6.85783100128        9.24414062500

Install RussTreadon-NOAA:feature/oznstat at 8d33344. Rewind and rerun 20210814 18 gdas eobs, ediag, and eupd. feature/oznstat enkf.x generated following range of analysis increments

  0:  time level            1
  0:  --------------
  0: ens. mean anal. increment min/max  u   -69.7386932373        46.7033233643
  0: ens. mean anal. increment min/max  v   -87.8862152100        48.1129302979
  0: ens. mean anal. increment min/max  tv   -19.2770118713        15.6233501434
  0: ens. mean anal. increment min/max  q  -0.808076281101E-02   0.816340185702E-02
  0: ens. mean anal. increment min/max  oz  -0.248382752943E-05   0.388220405512E-05
  0: ens. mean anal. increment min/max  ps   -6.85783100128        9.24414062500

The two sets of analysis increment ranges are identical. This agrees with the Hera test results and expectation. The changes in PR #591 should not change analysis results.

It is not clear why the 9/10 (September 10) Orion test generated unexpected, non-physical results. Since the Hera and Orion tests are now consistent and ctests pass, I'll stop here.

@RussTreadon-NOAA
Contributor Author

It's possible I made a mistake in commenting out the first two lines of the if block below. Maybe they should go together with the other three.

   if (save_jacobian) then
            !call fullarray(dhx_dx, dhx_dx_array)
            !call nc_diag_data2d("Observation_Operator_Jacobian", dhx_dx_array)
            call nc_diag_data2d("Observation_Operator_Jacobian_stind", dhx_dx%st_ind)
            call nc_diag_data2d("Observation_Operator_Jacobian_endind", dhx_dx%end_ind)
            call nc_diag_data2d("Observation_Operator_Jacobian_val", real(dhx_dx%val,r_single))
         endif

@jack-woollen , I cannot reproduce the 9/10 (September 10) Orion cycled test results. The 9/11 (September 11) Orion tests behave as expected, as do the Hera tests. Yesterday's results must be due to operator error (mine). There must be a mistake in my 9/10 experiment, g-w, or GSI setup.

CoryMartin-NOAA pushed a commit that referenced this issue Sep 12, 2023

**Description**
PR #591 removed jacobian information from the netcdf ozone diagnostic
file. This caused `enkf.x` to crash. This PR adds the removed ozone
jacobian arrays back to the netcdf ozone diagnostic file.

Fixes #618

**Type of change**
- [x] Bug fix (non-breaking change which fixes an issue)


**How Has This Been Tested?**
The revised code was tested in the 20210814 18 gdas cycle of a C192L127
enkf parallel. The updated `gsi.x` created an oznstat file which was
successfully processed by `enkf.x`.
  
**Checklist**
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] New and existing tests pass with my changes