changes for reanalysis runs #591

Merged
merged 8 commits on Aug 8, 2023

Contributor

@jack-woollen jack-woollen commented Jul 20, 2023

Description

Fixes #565

NOTE: close PR #564 when this PR is merged into develop

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

DUE DATE for this PR is 8/31/2023. If this PR is not merged into develop by this date, the PR will be closed and returned to the developer.

@jack-woollen
Contributor Author

Since I couldn't edit Jeff's PR fork, I made my own fork and installed the changes mentioned by reviewers of the first PR. I probably should've had my own fork for this to begin with because I had made all the initial code changes, and tested them as well. So I think this PR should supersede the first one from this point on. In any case I would request final reviews for this PR from @BrettHoover-NOAA , @HaixiaLiu-NOAA , @ilianagenkova , @jswhit , and @RussTreadon-NOAA. Thanks all.

Contributor Author

@jack-woollen jack-woollen left a comment

Sorry, I haven't figured out how to add reviewers to the list at the top.

obsdat(4) > 100000000.0_r_kind) cycle loop_readsb
if(ppb >r10000) ppb=ppb/r100
if (ppb>rmiss .or. hdrdat(3)>rmiss .or. obsdat(4)>rmiss) cycle loop_readsb
ppb=ppb/r100
Contributor

A pressure value range check was removed here: if(ppb >r10000)
I think it's a good idea to keep it, in case a data provider changes the pressure units.

@jack-woollen I agree with Iliana, I think this is the only remaining issue for satwnd and I can approve once this check is reinstated (or if it was included elsewhere and I missed it, let me know)

Contributor Author

@ilianagenkova @BrettHoover-NOAA How about if I add a comment such as

if(ppb >r10000) ppb=ppb/r100 ! ppb<10000 may indicate data reported in hPa

Although it seems like a realistic cutoff might be ppb>r1000

I think this would work. The original limit is probably better for keeping the check completely unambiguous about realistic Pa/hPa values.

Contributor Author

@BrettHoover-NOAA Okay! I'll leave it at that. Though I have to say, I haven't seen much 10K hPa data! Let's say the data can be Pa/daPa/hPa.
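
As an illustration of the range check agreed on above, here is a minimal standalone sketch. It assumes that values above 10000 are in Pa and that smaller values are already in hPa; `r_kind`, `r100`, and `r10000` mimic the GSI names but are defined locally for this example, and the test values are hypothetical.

```fortran
program pressure_units_demo
  ! Sketch only: reproduces the "if(ppb >r10000) ppb=ppb/r100" check in isolation.
  ! r_kind and the constants are local stand-ins for the GSI definitions.
  implicit none
  integer, parameter :: r_kind = selected_real_kind(15)
  real(r_kind), parameter :: r100   = 100.0_r_kind
  real(r_kind), parameter :: r10000 = 10000.0_r_kind
  real(r_kind) :: ppb

  ! 850 hPa reported in Pa: above the threshold, so it is scaled to hPa
  ppb = 85000.0_r_kind
  if (ppb > r10000) ppb = ppb/r100
  if (abs(ppb - 850.0_r_kind) > 1.0e-9_r_kind) error stop 'Pa case failed'

  ! 850 hPa reported in hPa: below the threshold, so it is left untouched
  ppb = 850.0_r_kind
  if (ppb > r10000) ppb = ppb/r100
  if (abs(ppb - 850.0_r_kind) > 1.0e-9_r_kind) error stop 'hPa case failed'

  print *, 'pressure unit check behaves as expected'
end program pressure_units_demo
```

With this threshold, a value such as 8500 (850 hPa expressed in daPa) would pass through unchanged, which is the ambiguity raised in the comment above.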

Contributor

@jack-woollen , adding that comment is very helpful, especially for users who don't have experience with retrievals and what goes on at the data provider's end. Thanks!

Contributor

@ilianagenkova ilianagenkova left a comment

The code changes are minimal.
A few spaces were removed, and some lines were added - it helps the readability of the code.
The wording of a few comment lines was improved.

Contributor

@HaixiaLiu-NOAA HaixiaLiu-NOAA left a comment

all changes look good to me except for the two comments I added in the code. Thanks.

@@ -538,11 +554,12 @@ subroutine setupozlay(obsLL,odiagLL,lunin,mype,stats_oz,nlevs,nreal,nobs,&
if (ozone_diagsave .and. luse(i)) then
rdiagbuf(1,k,ii) = ozobs(k)
rdiagbuf(2,k,ii) = ozone_inv(k) ! obs-ges
- errorinv = sqrt(varinv4diag(k)*rat_err4diag)
+ errorinv = sqrt(varinv4diag(k)*rat_err2)
Contributor

The variable rat_err2 should be reverted back to rat_err4diag. A few lines before this, rat_err2 was set to zero for any monitored data. However, in the diagnostic data files, we want to save real error inverse values rather than zeros. Therefore, the new variable rat_err4diag was introduced to correctly store the values in the diagnostic data file.

Contributor Author

@HaixiaLiu-NOAA I'm looking again at subroutine setupozlev where the above change is.

Grepping for rat_err4diag (and adding definitions of rat_err2) gets:

real(r_kind),dimension(nlevs):: ratio_errors,error,rat_err4diag
real(r_kind) omg,rat_err2
rat_err2 = ratio_errors(k)**2
rat_err4diag=rat_err2
errorinv = sqrt(varinv4diag(k)*rat_err4diag(k))

So really, rat_err4diag(k)=ratio_errors(k)**2 , is that right? If so I can simplify.

Contributor

both ratio_errors and rat_err2 are set to 0s in the following section of the code:
! If not assimilating this observation, reset inverse variance to zero
if (iouse(k)<1) then
varinv3(k)=zero
ratio_errors(k)=zero
rat_err2 = zero
end if

Then a few lines down, we have the following:
! Optionally save data for diagnostics
if (ozone_diagsave .and. luse(i)) then
rdiagbuf(1,k,ii) = ozobs(k)
rdiagbuf(2,k,ii) = ozone_inv(k) ! obs-ges
errorinv = sqrt(varinv4diag(k)*rat_err4diag)

Contributor Author

@HaixiaLiu-NOAA Got it. I see what you're saying. Thx!
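
The point made in this exchange can be sketched in isolation: a copy of the squared error ratio is taken before the reset for monitored data, so the diagnostic value stays nonzero. The sketch below uses scalars and hypothetical input values; only the variable names come from the discussion, and the surrounding GSI logic is not reproduced.

```fortran
program diag_error_demo
  ! Sketch only: shows why the diagnostic uses rat_err4diag rather than rat_err2.
  ! All numeric values are hypothetical; r_kind and zero are local stand-ins.
  implicit none
  integer, parameter :: r_kind = selected_real_kind(15)
  real(r_kind), parameter :: zero = 0.0_r_kind
  real(r_kind) :: ratio_errors, rat_err2, rat_err4diag, varinv4diag, errorinv
  integer :: iouse

  ratio_errors = 0.5_r_kind
  varinv4diag  = 4.0_r_kind
  iouse        = -1                 ! monitored, not assimilated

  rat_err2     = ratio_errors**2
  rat_err4diag = rat_err2           ! copy taken before the reset below

  ! If not assimilating this observation, reset to zero
  if (iouse < 1) then
     ratio_errors = zero
     rat_err2     = zero
  end if

  ! Using rat_err2 would write a zero error inverse to the diagnostic file
  errorinv = sqrt(varinv4diag*rat_err2)
  if (errorinv /= zero) error stop 'expected zero from rat_err2'

  ! Using rat_err4diag keeps the real value: sqrt(4.0*0.25) = 1.0
  errorinv = sqrt(varinv4diag*rat_err4diag)
  if (abs(errorinv - 1.0_r_kind) > 1.0e-12_r_kind) error stop 'expected 1.0'

  print *, 'diagnostic error inverse preserved for monitored data'
end program diag_error_demo
```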

@@ -1021,7 +1039,7 @@ subroutine ozlev_ncread_(dfile,dtype,ozout,nmrecs,ndata,nodata, gstime,twind)
if (ozone(ilev, iprof) < -900.0_r_kind) cycle ! undefined
if (err(ilev, iprof) < -900.0_r_kind) cycle ! undefined
if (iuse_oz(ipos(ilev)) < 0) then
- usage = 100._r_kind
+ usage = 10000._r_kind
Contributor

What is the reason for changing the usage from 100 to 10000? This section is in the subroutine ozlev_ncread_. There is a similar subroutine ozlev_bufrread_ where the usage is still set to 100. I am unsure of the impact of changing the usage from 100 to 10000, but it is important to ensure consistency for the same variables in both subroutines.

Contributor Author

@HaixiaLiu-NOAA Thanks again for checking. The first one is my blooper, I made half the required change. The second one is from NASA I guess. There's a couple other places there where usage is set to 10000 but I agree the one you pointed out should probably be 100. Pushing those changes now.

@jack-woollen
Contributor Author

jack-woollen commented Jul 25, 2023

@BrettHoover-NOAA I agree with you to merge the PR with minimal changes to read_satwnd.f90, to get things moving. I've tested a more complete refactoring to address some of the issues we discussed and got identical gsistat output from observer runs. That version is in cactus:/lfs/h2/emc/global/noscrub/Jack.Woollen/gfsda.gsidev/src/gsi. If you want to go through it when you have time and put it in another PR later on, that's good with me. Cheers.

Contributor

@HaixiaLiu-NOAA HaixiaLiu-NOAA left a comment

all my comments have been addressed. approved now.

@BrettHoover-NOAA

Thanks @jack-woollen, I'm fine with @RussTreadon-NOAA approving these changes.

@BrettHoover-NOAA BrettHoover-NOAA left a comment

read_satwnd.f90 changes look good to me

@RussTreadon-NOAA
Contributor

Hera ctests

NOAA-EMC/GSI develop and jack-woollen:develop built on Hera and 9 ctests run with the following results

Hera(hfe12):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build$ ctest -j 9
Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #8: netcdf_fv3_regional ..............   Passed  544.71 sec
2/9 Test #7: rrfs_3denvar_glbens ..............   Passed  550.27 sec
3/9 Test #5: hwrf_nmm_d3 ......................   Passed  557.11 sec
4/9 Test #4: hwrf_nmm_d2 ......................***Failed  608.14 sec
5/9 Test #9: global_enkf ......................   Passed  977.82 sec
6/9 Test #6: rtma .............................   Passed  1333.84 sec
7/9 Test #3: global_4denvar ...................   Passed  1670.29 sec
8/9 Test #2: global_4dvar .....................   Passed  1685.18 sec
9/9 Test #1: global_3dvar .....................   Passed  1909.86 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 1909.88 sec

The following tests FAILED:
          4 - hwrf_nmm_d2 (Failed)
Errors while running CTest
Output from these tests are in: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

The hwrf_nmm_d2 failure is a true fail.

Comparison of the initial Jo terms between the update (jack-woollen:develop) and control (develop) shows differences in the initial penalties for wind observations.

  • update: wind 2.9998004912255288E+04
  • control: wind 3.0027332995400833E+04

Comparison of the initial wind fit-to-obs statistics from the update and control shows differences for subtype 257 of prepbufr report types 245, 246, and 247. Below are the assimilated observation counts for these obs:

update

o-g 01      uv asm 245 0257 count          0         46        131          0          0        120         38         75         64          2          0        476
o-g 01      uv asm 246 0257 count          0          0          0          0          0         35         16         50         49          2          0        152
o-g 01      uv asm 247 0257 count          0          0          0          0         59         50          9          2          0          0          0        120

control

o-g 01      uv asm 245 0257 count          0         29         91          0          0         59         13         17         24          2          0        235
o-g 01      uv asm 246 0257 count          0          0          0          0          0         15          5         20         21          2          0         63
o-g 01      uv asm 247 0257 count          0          0          0          0         59         33          3          1          0          0          0         96

The update code is allowing more observations to pass quality control and be assimilated. Why?

FYI, the above values were taken from fort.202 in the following run directories on Hera

  • update: /scratch1/NCEPDEV/stmp2/Russ.Treadon/pr591/tmpreg_hwrf_nmm_d2/hwrf_nmm_d2_loproc_updat/fort.202
  • control: /scratch1/NCEPDEV/stmp2/Russ.Treadon/pr591/tmpreg_hwrf_nmm_d2/hwrf_nmm_d2_loproc_contrl/fort.202

The hwrf_nmm_d2 test assimilates data from the 2012102812 gfs satwnd file - specifically, /scratch1/NCEPDEV/da/Russ.Treadon/CASES/regtest/regional/hwrf_nmm/2012102812/gfs.t12z.satwnd.tm00.bufr_d.

This PR cannot be merged into develop until the reason for the above differences is understood and an appropriate resolution identified.

@RussTreadon-NOAA
Contributor

Tagging @jack-woollen , @BrettHoover-NOAA , and @ilianagenkova for awareness.

For some reason the update code processes more satwnd obs for report types 245, 246, and 247 from subtype 0257. We can't merge PR #591 into develop until the reason for this difference is understood and code modifications, if warranted, are made.

@@ -1609,6 +1666,11 @@ subroutine read_satwnd(nread,ndata,nodata,infile,obstype,lunout,gstime,twind,sis
deallocate(lmsg,tab,nrep)
! Close unit to bufr file
call closbf(lunin)


do i=1,1000
Contributor

Is this do block with a print a leftover section of debug code? Can we remove it?

@jack-woollen
Contributor Author

jack-woollen commented Aug 7, 2023 via email

do_qc = subset(1:7)=='NC00503'.and.nint(hdrdat(1))>=270
do_qc = do_qc.or.subset=='NC005081'.or.subset=='NC005091'
do_qc = do_qc.or.qcdat(1,1)<rmiss
if(.not.do_qc) goto 99
Contributor

@RussTreadon-NOAA RussTreadon-NOAA Aug 7, 2023

WCOSS implementation standards state that we should not use goto statements. NCO has a long-standing bugzilla (#216) requiring that GFS applications, including GSI, reduce the number of goto statements in every implementation. Can we handle the .not.do_qc condition without using a goto?

Contributor Author

@RussTreadon-NOAA I could add a do_qc check in every extra block .or. make a gigantic if(do_qc) block.

Contributor

Which approach yields more readable & logical code?

Contributor Author

I like adding do_qc as a condition in each extra block.
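
A minimal sketch of the option preferred here, with each extra block guarded by `do_qc` instead of jumping to a label. The two blocks are hypothetical placeholders (a counter stands in for the real QC work), so this only illustrates the control flow, not the actual read_satwnd.f90 logic.

```fortran
program do_qc_guard_demo
  ! Sketch only: two placeholder "extra" blocks, each guarded by do_qc,
  ! so no goto is needed to skip them when QC is not required.
  implicit none
  logical :: do_qc
  integer :: nrun, icase

  do icase = 1, 2
     do_qc = (icase == 1)      ! first pass needs QC, second does not
     nrun  = 0

     if (do_qc) then
        nrun = nrun + 1        ! placeholder for the first extra QC block
     end if

     if (do_qc) then
        nrun = nrun + 1        ! placeholder for the second extra QC block
     end if

     ! Processing that used to follow the goto target continues here
     if (do_qc .and. nrun /= 2) error stop 'guarded blocks should have run'
     if (.not. do_qc .and. nrun /= 0) error stop 'guarded blocks should be skipped'
  end do

  print *, 'guarded blocks behave as expected'
end program do_qc_guard_demo
```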

Contributor

Works for me, but sorry for the extra work for you.

Contributor Author

I found the bug. It has to do with not doing QC on NC00501*, where goes13 can be found. It'll take a couple days to get the changes worked out. Thanks for finding the problem!

Contributor

Right, the NC00501* part of the code is the culprit.

@ilianagenkova
Contributor

The difference between "update" and "control" -
update

o-g 01 uv asm 245 0257 count 0 46 131 0 0 120 38 75 64 2 0 476
o-g 01 uv asm 246 0257 count 0 0 0 0 0 35 16 50 49 2 0 152
o-g 01 uv asm 247 0257 count 0 0 0 0 59 50 9 2 0 0 0 120
control

o-g 01 uv asm 245 0257 count 0 29 91 0 0 59 13 17 24 2 0 235
o-g 01 uv asm 246 0257 count 0 0 0 0 0 15 5 20 21 2 0 63
o-g 01 uv asm 247 0257 count 0 0 0 0 59 33 3 1 0 0 0 96

appears to be coming from an older GOES instrument (SAID=257 is GOES-13). Maybe a SAID value (or range) check in the code needs to be revisited to include these earlier GOES instruments.

@RussTreadon-NOAA
Contributor

@jack-woollen , appreciate your quickly identifying the problem. If you want me to test changes, just let me know. It's easy to drop in updated code, recompile, and rerun.

@jack-woollen
Contributor Author

@RussTreadon-NOAA @ilianagenkova Here's the bulk of the new bits. I think it's ready to test again. Thanks!

       ! test for QCSTR or MANDATORY QC - if not skip over the extra blocks
       call ufbrep(lunin,qcdat,3,12,qcret,qcstr)
       do_qc = subset(1:7)=='NC00503'.and.nint(hdrdat(1))>=270
       do_qc = do_qc.or.subset(1:7)=='NC00501'
       do_qc = do_qc.or.subset=='NC005081'.or.subset=='NC005091'
       do_qc = do_qc.or.qcret==0
       
       ! assign types and get quality info: start
       
       if(.not.do_qc) then
          continue
       else if(trim(subset) == 'NC005064' .or. trim(subset) == 'NC005065' .or. &
          trim(subset) == 'NC005066') then
      etc...

@RussTreadon-NOAA
Contributor

@jack-woollen , your changes work for hwrf_nmm_d2.

Hera(hfe12):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build$ ctest -R hwrf_nmm_d2
Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build
    Start 4: hwrf_nmm_d2
1/1 Test #4: hwrf_nmm_d2 ......................   Passed  909.58 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 909.59 sec

I'll rerun all 9 ctests later today. Thank you for the quick fix!

@RussTreadon-NOAA
Contributor

Rerun Hera ctests

Update working copy of jack-woollen:develop to 441754d and rerun ctests. The rtma ctest has not yet completed. Here are results from completed tests.

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build
    Start 4: hwrf_nmm_d2
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #9: global_enkf ......................   Passed  1952.74 sec
2/9 Test #7: rrfs_3denvar_glbens ..............   Passed  3371.39 sec
3/9 Test #8: netcdf_fv3_regional ..............   Passed  3728.79 sec
4/9 Test #1: global_3dvar .....................***Failed  3774.28 sec
5/9 Test #3: global_4denvar ...................***Failed  3832.00 sec
6/9 Test #2: global_4dvar .....................***Failed  5103.84 sec
7/9 Test #5: hwrf_nmm_d3 ......................   Passed  5477.02 sec
8/9 Test #4: hwrf_nmm_d2 ......................   Passed  5709.27 sec

The global var failures are true failures.

Examination of stdout shows differences in the initial penalties for wind observations. Comparison of the update and control fort.202 files shows differences for many satwnd observations. Listed below are the initial data counts of assimilated satwnd obs for the update code:

<  o-g 01      uv asm 242 0173 count          0       1028       1575        616          0          0          0          0          0          0          0       3219
<  o-g 01      uv asm 243 0056 count          0        465       1100        546          0          0          0          0          0          0          0       2111
<  o-g 01      uv asm 243 0070 count          0        102        146         79          0          0          0          0          0          0          0        327
<  o-g 01      uv asm 250 0173 count          0          0          0          0          0       2309       1418       1614       1676        543          0       7560
<  o-g 01      uv asm 252 0173 count          0        882       1313          0       1134       1561        945       1212       1223        329          0       8599
<  o-g 01      uv asm 253 0056 count          0       1014       2813          0         22       2060       1120       1302       1250        235          0       9816
<  o-g 01      uv asm 253 0070 count          0       1575       2814          0         34       2628       1356       1370       1109        160          0      11046
<  o-g 01      uv asm 254 0056 count          0          0          0          0          0      23194       9062      11713      12481       2503          0      58953
<  o-g 01      uv asm 254 0070 count          0          0          0          0          0       6030       1669       1592       1361        195          0      10847
<  o-g 01      uv asm 255 0854 count          0       2476      23469      14747      22744       7806        176          8          0          0          0      71426
<  o-g 01         asm all      count      28386      57893      61624      64936      87637      89419      40905      71237      46301       8302       6287     575516

The counts from the control code are

>  o-g 01      uv asm 242 0173 count          0       1016       1468        559          0          0          0          0          0          0          0       3043
>  o-g 01      uv asm 243 0056 count          0        289        664        235          0          0          0          0          0          0          0       1188
>  o-g 01      uv asm 243 0070 count          0         36         39         10          0          0          0          0          0          0          0         85
>  o-g 01      uv asm 250 0173 count          0          0          0          0          0       2351       1426       1628       1639        548          0       7592
>  o-g 01      uv asm 252 0173 count          0        833       1189          0       1007       1451        868       1154       1164        306          0       7972
>  o-g 01      uv asm 253 0056 count          0        703       1609          0          9        919        609        930        886        113          0       5778
>  o-g 01      uv asm 253 0070 count          0       1070       1960          0         16       1207        787        917        680         64          0       6701
>  o-g 01      uv asm 254 0056 count          0          0          0          0          0       7167       6119       8159       8307       1055          0      30807
>  o-g 01      uv asm 254 0070 count          0          0          0          0          0       1904       1131       1280       1005         73          0       5393
>  o-g 01      uv asm 255 0854 count          0       2568      24094      16992      31567      10876        291         18          0          0          0      86406
>  o-g 01         asm all      count      28386      56866      59417      66744      96302      69706      36390      66512      40882       6496       6287     546577

The update is assimilating more satwnd observations.

The above are taken from fort.202 in

  • /scratch1/NCEPDEV/stmp2/Russ.Treadon/pr591/tmpreg_global_3dvar/global_3dvar_loproc_updat
  • /scratch1/NCEPDEV/stmp2/Russ.Treadon/pr591/tmpreg_global_3dvar/global_3dvar_loproc_contrl

@jack-woollen
Contributor Author

@RussTreadon-NOAA @ilianagenkova Here's the bulk of the new bits.
The typo which failed the second test was do_qc = do_qc.or.qcret==0. It should've been qcret>0, as shown below.
I verified the changes are now correct by comparing gsistats from the develop code and this fork in wcoss2 runs.
I think it's ready to test again, and pass this time. Thanks!

       ! test for QCSTR or MANDATORY QC - if not skip over the extra blocks
       call ufbrep(lunin,qcdat,3,12,qcret,qcstr)
       do_qc = subset(1:7)=='NC00503'.and.nint(hdrdat(1))>=270
       do_qc = do_qc.or.subset(1:7)=='NC00501'
       do_qc = do_qc.or.subset=='NC005081'.or.subset=='NC005091'
       do_qc = do_qc.or.qcret>0
       
       ! assign types and get quality info: start
       
       if(.not.do_qc) then
          continue
       else if(trim(subset) == 'NC005064' .or. trim(subset) == 'NC005065' .or. &
          trim(subset) == 'NC005066') then
      etc...
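
The corrected predicate can be exercised on its own. The sketch below wraps the four `do_qc` lines in a hypothetical function `need_qc`; it assumes `hdrdat(1)` carries the satellite ID (passed here as `said`) and that `qcret` is the count of replications returned by `ufbrep`, so `qcret>0` means QC information was found. The test inputs are made up for illustration.

```fortran
program do_qc_predicate_demo
  ! Sketch only: need_qc is a hypothetical wrapper around the do_qc logic above.
  implicit none

  ! NC00503* with satellite ID >= 270: mandatory QC
  if (.not. need_qc('NC005030', 270, 0)) error stop 'case 1'
  ! NC00501* (where GOES-13, SAID 257, was found): mandatory QC
  if (.not. need_qc('NC005010', 257, 0)) error stop 'case 2'
  ! NC005081 and NC005091: mandatory QC
  if (.not. need_qc('NC005081', 0, 0)) error stop 'case 3'
  ! Any other subset with QC replications returned
  if (.not. need_qc('NC005064', 0, 3)) error stop 'case 4'
  ! Any other subset with nothing returned: skip the extra blocks
  if (need_qc('NC005064', 0, 0)) error stop 'case 5'

  print *, 'do_qc predicate behaves as expected'

contains

  logical function need_qc(subset, said, qcret)
    character(len=8), intent(in) :: subset
    integer, intent(in) :: said    ! assumed to correspond to nint(hdrdat(1))
    integer, intent(in) :: qcret   ! assumed to be the replication count from ufbrep

    need_qc = subset(1:7) == 'NC00503' .and. said >= 270
    need_qc = need_qc .or. subset(1:7) == 'NC00501'
    need_qc = need_qc .or. subset == 'NC005081' .or. subset == 'NC005091'
    need_qc = need_qc .or. qcret > 0
  end function need_qc

end program do_qc_predicate_demo
```

With the earlier `qcret==0` form, the last case would have evaluated to true, so the extra QC blocks would have been entered for records with no QC information.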

@RussTreadon-NOAA
Contributor

Hera ctest using 2e0df54

Recompile jack-woollen:develop at 2e0df54 and rerun ctests with following results

Hera(hfe04):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build$ ctest -j 9
Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build
    Start 1: global_3dvar
    Start 2: global_4dvar
    Start 3: global_4denvar
    Start 4: hwrf_nmm_d2
    Start 5: hwrf_nmm_d3
    Start 6: rtma
    Start 7: rrfs_3denvar_glbens
    Start 8: netcdf_fv3_regional
    Start 9: global_enkf
1/9 Test #7: rrfs_3denvar_glbens ..............   Passed  1515.92 sec
2/9 Test #8: netcdf_fv3_regional ..............   Passed  1628.23 sec
3/9 Test #4: hwrf_nmm_d2 ......................   Passed  1629.16 sec
4/9 Test #5: hwrf_nmm_d3 ......................***Failed  1637.75 sec
5/9 Test #9: global_enkf ......................   Passed  2099.90 sec
6/9 Test #6: rtma .............................   Passed  2352.48 sec
7/9 Test #2: global_4dvar .....................   Passed  2645.34 sec
8/9 Test #3: global_4denvar ...................   Passed  2869.59 sec
9/9 Test #1: global_3dvar .....................   Passed  2990.26 sec

89% tests passed, 1 tests failed out of 9

Total Test time (real) = 2990.26 sec

The following tests FAILED:
          5 - hwrf_nmm_d3 (Failed)
Errors while running CTest
Output from these tests are in: /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr591/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

The hwrf_nmm_d3 failure is due to the scalability test.

The case has Failed the scalability test.
The slope for the update (16.886161 seconds per node) is less than that for the control (17.028182 seconds per node).

A check of the update and control gsi.x wall times does not find any anomalous behavior. The hiproc update and control wall times are comparable. The update loproc job ran 5 seconds faster than the control.

hwrf_nmm_d3_hiproc_contrl/stdout:The total amount of wall time                        = 67.074503
hwrf_nmm_d3_hiproc_updat/stdout:The total amount of wall time                        = 67.784582
hwrf_nmm_d3_loproc_contrl/stdout:The total amount of wall time                        = 84.102685
hwrf_nmm_d3_loproc_updat/stdout:The total amount of wall time                        = 79.042023

The hwrf_nmm_d3 failure is not a fatal fail.

nodata = 0
read_loop1: do iprof = 1, nprofs
do ilev = 1, levs
if (ozone(ilev, iprof) .lt. -900.0) cycle ! undefined
Contributor

Add _r_kind suffix to -900.0 (GSI coding standard).

ozout(4,ndata)=dlat ! grid relative latitude
ozout(5,ndata)=dlon_earth_deg ! earth relative longitude (degrees)
ozout(6,ndata)=dlat_earth_deg ! earth relative latitude (degrees)
ozout(7,ndata)=0. ! total ozone error flag
Contributor

Add _r_kind suffix to 0.0 or use zero from constants module.
(GSI coding standard).

ozout(5,ndata)=dlon_earth_deg ! earth relative longitude (degrees)
ozout(6,ndata)=dlat_earth_deg ! earth relative latitude (degrees)
ozout(7,ndata)=0. ! total ozone error flag
ozout(8,ndata)=0. ! profile ozone error flag
Contributor

Add _r_kind suffix to 0.0 or use zero from constants module.
(GSI coding standard).
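
As background for the `_r_kind` requests above, here is a small sketch of how a literal without a kind suffix can differ from one with it. It assumes the default real kind is single precision and that no compiler flag promotes default reals; `r_kind` is defined locally as a double-precision stand-in for the GSI kind.

```fortran
program r_kind_literal_demo
  ! Sketch only: compares unsuffixed and _r_kind-suffixed literals.
  ! Assumes default real is single precision (no real-promotion flags).
  implicit none
  integer, parameter :: r_kind = selected_real_kind(15)
  real(r_kind) :: a, b

  ! 0.1 is not exactly representable, so the single-precision literal
  ! differs from the double-precision one after conversion
  a = 0.1
  b = 0.1_r_kind
  if (a == b) error stop 'expected the two 0.1 literals to differ'

  ! -900.0 is exactly representable in both kinds, so here the suffix
  ! does not change the value
  a = -900.0
  b = -900.0_r_kind
  if (a /= b) error stop 'expected the two -900.0 literals to match'

  print *, 'literal kind comparison behaves as expected'
end program r_kind_literal_demo
```

For the values flagged in this review (-900.0 and 0.) the suffix does not change the numeric result under these assumptions; it keeps the literals consistent with the declared kind.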



!--- Extract brightness temperature data. Apply gross check to data.
! If obs fails gross check, reset to missing obs value.

call ufbrep(lnbufr,mirad,1,nchanl*nscan,iret,str2)

!print*,mirad
Contributor

Since this print and abort are commented out, I guess we don't need them. Is there any reason to keep them in the code as inactive lines?

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Three approvals from peer reviewers. Standard suite of ctests pass. Changes look good to me - only minor comments. Approve upon addressing minor comments.

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Thank you @jack-woollen for making these changes and also catching ones I missed. Recompile and rerun Hera ctests. All tests pass except netcdf_fv3_regional. This failure was due to the scalability test. Nothing anomalous found in gsi.x wall times for netcdf_fv3_regional test.

Approve.

@RussTreadon-NOAA RussTreadon-NOAA merged commit 9e5aa09 into NOAA-EMC:develop Aug 8, 2023
4 checks passed
@ilianagenkova
Contributor

@jack-woollen , thanks for handling the satwnd code with grace! I wish I had been of more help.

@guoqing-noaa guoqing-noaa mentioned this pull request Aug 23, 2023
9 tasks
@CoryMartin-NOAA
Contributor

Did anyone try to cycle in the workflow with this PR? In the process of updating the hash for global-workflow, I'm running into an issue that might be related. See NOAA-EMC/global-workflow#1835 for more information.

CoryMartin-NOAA pushed a commit that referenced this pull request Sep 12, 2023
)

**Description**
PR #591 removed jacobian information from the netcdf ozone diagnostic
file. This caused `enkf.x` to crash. This PR adds the removed ozone
jacobian arrays back to the netcdf ozone diagnostic file.

Fixes #618

**Type of change**
- [x] Bug fix (non-breaking change which fixes an issue)


**How Has This Been Tested?**
The revised code was tested in the 20210814 18 gdas cycle of a C192L127
enkf parallel. The updated `gsi.x` created an oznstat file which was
successfully processed by `enkf.x`.
  
**Checklist**
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] New and existing tests pass with my changes
Successfully merging this pull request may close these issues.

GSI needs updating to handle reprocessed ozone, ssmi and satwnd observations for reanalysis
6 participants