Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub Issue NOAA-EMC/GSI#459 Add Scale/Variable/Time-Dependent Localization for EnVar #460

Merged

Conversation

shoyokota
Copy link
Collaborator

This PR is to add Scale/Variable/Time-Dependent Localization (SDL/VDL/TDL) for GSI EnVar. #459.
I checked that it is possible to run regional SDL and VDL and to obtain reasonable results in single observation tests and RRFS cycle tests on Orion. I haven't run global DA tests and GSI's regression tests. (The regression tests are not probably setup on Orion.)

@MichaelLueken
Copy link
Contributor

Hi @shoyokota. There are currently three commits associated with this work. To reduce this to a single commit, please use the following commands:

  • git clone --recursive git@github.com:shoyokota/GSI.git update
  • cd update
  • git checkout feature/PR_NOAA-EMC_EnVar-SDL
  • git reset --soft HEAD~1
  • git reset --soft HEAD~1
  • git commit --amend
  • Please add GitHub Issue NOAA-EMC/GSI#459. to the front of your commit message, then save and close the commit message window.
  • git push origin feature/PR_NOAA-EMC_EnVar-SDL --force

These steps will reduce the number of commit messages from three to one and add GitHub Issue NOAA-EMC/GSI#459. to your commit message.

I will run the regression tests on Hera and let you know if any fail.

My last question is, who would you like to have review your PR? A third party reviewer needs to complete a review of the changes before I go through the changes.

If you have any questions or encounter any issues, please let me know.

@shoyokota shoyokota force-pushed the feature/PR_NOAA-EMC_EnVar-SDL branch from 41de6c1 to 368349d Compare August 25, 2022 15:00
@shoyokota
Copy link
Collaborator Author

@MichaelLueken-NOAA Thank you. I unified three commits to a single commit.

I created this PR referring to the code of global SDL made by @TingLei-daprediction. I heard @dtkleist was also interested in the SDL work. Although it depends on their convenience, I think they can review this PR in detail. Because I haven't run global DA tests, someone who can test global DA may be better.

Copy link
Contributor

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have completed my review of your changes.

The code was successfully compiled using Intel and GNU compilers. The code was also compiled in debug mode, and three unused variables were identified and marked in this review (in hybrid_ensemble_isotropic.F90).

The regression tests have been run and the only issue is with the RTMA configuration. The RTMA fails to run. While running the RTMA debug test, the code fails immediately with the following error:

forrtl: severe (408): fort: (3): Subscript #1 of the array IDAEN2D has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
gsi.x              0000000007602A70  Unknown               Unknown  Unknown
gsi.x              0000000000692136  control_vectors_m         400  control_vectors.f90
gsi.x              0000000000412C6B  gsimod_mp_gsimain        1625  gsimod.F90
gsi.x              00000000004121C5  MAIN__                    618  gsimain.f90
gsi.x              000000000041209E  Unknown               Unknown  Unknown
libc-2.17.so       00002B0E900BA555  __libc_start_main     Unknown  Unknown
gsi.x              0000000000411FA9  Unknown               Unknown  Unknown

I'm still attempting to see what might be happening here in the RTMA. If you have any ideas, please feel free to make modifications.

I have also identified several coding standard issues that will need to be addressed before this work can move forward. Once these issues have been corrected, another reviewer has given approval, and the RTMA regression and debug tests are working, then this work can be submitted to the review committee.

To ensure that a single commit is maintained following these code changes:

  1. Please use git add on the updated routines.
  2. git commit --amend
  3. Please save and close out of the commit message window without changing the current commit message.
  4. git push origin feature/PR_NOAA-EMC_EnVar-SDL --force

Please let me know if you encounter any issues or have any questions, comments, or concerns.

src/gsi/control_vectors.f90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/en_perts_io.f90 Outdated Show resolved Hide resolved
src/gsi/gsimod.F90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/hybrid_ensemble_isotropic.F90 Outdated Show resolved Hide resolved
src/gsi/control_vectors.f90 Outdated Show resolved Hide resolved
@MichaelLueken
Copy link
Contributor

Hi @shoyokota. I have just finished pushing an update to the authoritative repository that caused four conflicts with your changes. Please update your fork's branch with the following:

You will then need to go into the four routines with conflicts:

  • src/gsi/cplr_get_fv3_regional_ensperts.f90
  • src/gsi/gsimod.F90
  • src/gsi/hybrid_ensemble_isotropic.F90
  • src/gsi/hybrid_ensemble_parameter.f90

and correct the conflicts (you will want to look for strings of <<<<<, ========, and >>>>>>> within these four files).

Once the conflicts have been corrected, please use git add on the four files, then git rebase --continue. Once finished, please push back to the repository using git push origin feature/PR_NOAA-EMC_EnVar-SDL --force.

Please let me know if you have any questions.

@shoyokota shoyokota force-pushed the feature/PR_NOAA-EMC_EnVar-SDL branch 2 times, most recently from 1d56e52 to 7c8033b Compare August 26, 2022 20:10
@shoyokota
Copy link
Collaborator Author

Hi @MichaelLueken-NOAA. Thank you for pointing out issues I should correct. I pushed the changes you requested to the repository. I also finished "git rebase."

The failure of the RTMA test was probably because the previous code incorrectly regarded "ilev<1" as 2D variables (ilev is input from anavinfo). I also corrected this bug together. Could you try the test again?

Copy link
Contributor

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @shoyokota. Thank you very much for addressing my concerns! I have resubmitted the RTMA GNU, Intel, and debug tests and all have successfully passed. With this, I approve your changes. Please note that I will still need an approval from either @TingLei-daprediction / @TingLei-NOAA or @dtkleist before I can move forward with submitting this work to the review committee.

@MichaelLueken MichaelLueken marked this pull request as ready for review August 29, 2022 14:45
@MichaelLueken
Copy link
Contributor

The code compiles with Intel and GNU compilers. The code successfully compiled in debug mode without issues. The regression tests were run for GNU, Intel, and debug and all passed successfully. All of my concerns have been addressed, so I have officially opened this PR for review.

@TingLei-NOAA
Copy link
Contributor

I have done a few tests using a set of setup ( mix ens, global ens and regional ens ) for one fv3-lam hybrid analysis of 13 km conus domain. The PR give the identical results ( from GSI standard output). it convinces me this PR don't change existing functions of GSI for fv3-lam. That is great. I will go on to review the changes in this PR which I hope will be finished in one week.

@JacobCarley-NOAA
Copy link
Contributor

@TingLei-NOAA just checking to see if you've had a chance to complete your review.

@TingLei-NOAA
Copy link
Contributor

@JacobCarley-NOAA I will finish my reviewing part as soon as possible.

@TingLei-NOAA
Copy link
Contributor

TingLei-NOAA commented Sep 9, 2022

@shoyokota I have gone through your codes . The SDL implementation is efficiently done in the GSI ' style . That is great!
Now, a question is about its' function for global app. In the global app using SDL, would you tell me how the program would generate multiple groups of ensembles from the raw ensemble? Why didn't you use the method/codes using apply_scaledepwgts in get_gefs_ensperts_dualres.f90 in our original GSI SDL implementation for global application?(branch EXP-sdl_emc_TL_v1 at ttps://github.com/TingLei-NOAA/GSI.git) . Hopefully @dtkleist would take care of test/evaluation of the global application with SDL, and, hence I bring this discussion to his attention. Thanks.

@shoyokota
Copy link
Collaborator Author

@TingLei-NOAA Here, the scale separation is done with bkgcov_a_en_new_factorization in hybens_localization_setup for both regional and global applications. In this subroutine, recursive filter is used for horizontal smoothing in regional app while spectral filter is used in global app. This implementation is for unifying the coding of regional SDL and global SDL and for using pre-existing subroutine (recursive filter and spectral filter) for the scale separation as much as possible.

In my understanding, the pre-existing spectral filter is also based on Gaussian. So, I expect it is worked for smoothing in a similar way to the recursive filter. However, I haven't checked if it is properly worked or not. If it is not properly worked in the global test, we may have to normalize spectral_filter used for the weight of each wave band.

If we will additionally introduce apply_scaledepwgts in this coding, it should probably be located not in get_gefs_ensperts_dualres.f90 but in hybens_localization_setup. It is probably possible, but is the next step. Thanks.

@TingLei-NOAA
Copy link
Contributor

@shoyokota Thanks for this explanation. The spectral separation method/codes had been implemented at EMC and tests for global app earlier and works as expected. So, I suggest as a staring point, it should be added in this SDL for both global and regional . Of course, as you mentioned, you can limit this PR only for regional app and we can see further development in the future PR. I will leave this decision to @dtkleist . I would mark my review conclusion as "approval pending conditions"

@shoyokota
Copy link
Collaborator Author

@TingLei-NOAA @dtkleist Thank you for your suggestion. The scale separation using apply_scaledepwgts is included in another PR. So I expect it will be implemented there. On the other hand, the scale separation using the pre-existing spectral filter implemented here is also useful because the setting of SDL is the same as the regional app using the recursive filter. So, I think the global test should be done here in any case. I wait for the decision.

@shoyokota
Copy link
Collaborator Author

How should we go on the review of the global application? If @dtkleist have no time to review, could you recommend another reviewer who can execute the global test? Or, could you let me know how to execute and confirm the global test?

@JacobCarley-NOAA
Copy link
Contributor

@shoyokota Good question! Let me query a few others and see who else we might be able to find to assist.

@JacobCarley-NOAA
Copy link
Contributor

@shoyokota I just chatted with @CatherineThomas-NOAA and she has agreed to perform a review. Thanks @CatherineThomas-NOAA !

@shoyokota
Copy link
Collaborator Author

Sorry, I forgot to update "program history log" in the code. I added only the history log.

Copy link
Contributor

@TingLei-NOAA TingLei-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the discussion during this review process, I approve this PR.

@CatherineThomas-NOAA
Copy link
Collaborator

I ran a standalone analysis test for global with the code in this PR compared to the current develop branch, with the new features turned off as in the defaults. The PR code reproduced the current develop as expected.

A couple of questions:

  • For both time and variable localization, it is an all or nothing approach, correct? Either time/variable localization is off and all time levels/variables are fully correlated OR time/variable localization is on and all time levels/variables are not correlated at all. For instance, there is no way to have groups of variables, like have the hydrometeor variables be correlated with one another but not with other variables, right? Or to have hour and hour+1 be correlated but hour and hour+3 not correlated?

  • How does this PR connect with PR#473?

@shoyokota
Copy link
Collaborator Author

@CatherineThomas-NOAA Thank you for the standalone test. Could you also confirm if the SDL work in the global application?

Yes, the cross-variable/cross-time localization of this PR is an all or nothing approach. All {variables, time-levels} are fully correlated in i_ensloccov4{var,tim}=0, and all {variables, time-levels} are not correlated at all in i_ensloccov4{var,tim}=1. We can adjust these correlations more flexibly by changing ensloccov4var and ensloccov4tim in the subroutine "setup_ensgrp2aensgrp", but I did not apply such options for simplification. Do you think such options should be added here?

PR#473 is the code tested in OU/MAP. Both of these PRs are developed based on the same motivation of implementing SDL/VDL, but PR#473 has following differences. So, it is also planned to merge PR#473 after resolving conflicts with this PR#460.

  • Scale separation in global application (using newly implemented hyperbolic tangent-type spectral filter)
  • Scale separation in regional application (using newly implemented Gaussian filter)
  • How to implement VDL
  • Radar reflectivity assimilation

@CatherineThomas-NOAA
Copy link
Collaborator

I ran some tests with SDL for a global single analysis using 2 scales. It ran successfully to completion. Then I tried changing the length and got the exact same answer, which I did not expect. I added some debug statements and found that the read in scales are being overwritten:
l_read_success F

The second set of scales are supposed to appended to the global_hybens file, correct? I have some examples of what I tried on Orion (the tests themselves are on Hera): /work2/noaa/da/cthomas/gsi/sdl Could you take a look @shoyokota ?

Continuing the thread about PR#473... So that PR will use much of the same code that is already written here, except performing the scale separation by another method? Will it use the same logicals and other parameters as this code?

@shoyokota
Copy link
Collaborator Author

shoyokota commented Sep 26, 2022

@CatherineThomas-NOAA To run 2-scale SDL tests, nsclgrp=2 and naensloc=3 should be set. Here, the third scale (used in the spectral/recursive filter for scale separation) is also required. I modified your global_hybens file and put it here: /work/noaa/da/syokota/gsi/cthomas/sdl/global_hybens_info.l127.txt.sdl_rev. Could you try the test with this file? (Please note that the first scale should be larger than the second scale.)

The basic logic of coding is probably the same between this PR and PR#473, but currently, there are a lot of conflicts and the names of the parameters for SDL/VDL have not been unified. Once this PR is merged, we should resolve the conflicts and unify the parameters as much as possible before starting the review of PR#473.

@shoyokota shoyokota force-pushed the feature/PR_NOAA-EMC_EnVar-SDL branch from f2959b3 to a337c4d Compare October 27, 2022 18:43
@shoyokota
Copy link
Collaborator Author

@CatherineThomas-NOAA Thanks to your single-observation DA tests, I found the following bugs in the scale separation with the spectral filter. These bugs are not related to the regional application. I fixed them. If you accept this modified code, could you approve this PR?

  • unnecessary zero clear of spectral_filter in "init_sf_xy" -> The zero clear should be done in each spectral_filter(ig,:,:)
  • using "sfilter" in "sf_xy" -> For scale separation, "general_g2s0", "general_s2g0", and the spectral filter of max(spectral_filter)=1 should be used not to multiply unnecessary factors.

@shoyokota shoyokota force-pushed the feature/PR_NOAA-EMC_EnVar-SDL branch from a337c4d to 3b97057 Compare November 4, 2022 15:51
@shoyokota
Copy link
Collaborator Author

According to the advice from @CatherineThomas-NOAA, I revised comments about how to set options of SDL/VDL/TDL in more detail.

@CatherineThomas-NOAA
Copy link
Collaborator

I’ve performed a series of global single observation tests with this branch which unveiled a few issues that @shoyokota has addressed. I tested the following capabilities:

  • Horizontal scale dependent localization
  • Vertical scale dependent localization
  • Time localization
  • Variable localization

These tests are not to examine the skill of adding these capabilities to the global system, just a sanity check to see if the increments respond as expected. Here are a set of slides summarizing the final set of tests.

After Sho’s last set of changes, I reran the full obs global test with the new localization turned off and the results reproduced the develop branch. I also reran the ctests on Hera. One test failed:

19/19 Test  #6: [=[global_fv3_4denvar_C192]=] ....***Failed  12152.83 sec

The memory for global_fv3_4denvar_C192_loproc_updat is 3216976 KBs.  This has exceeded maximum allowable hardware memory limit of 2516582 KBs,
resulting in Failure maxmem of the regression test.

I repeated the ctests with the develop branch and the same test failed in the same way. This failure is not an issue with this PR.

Thank you for your patience Sho and helping debug. Approved.

@dtkleist dtkleist removed their request for review November 14, 2022 21:50
@RussTreadon-NOAA
Copy link
Contributor

@TingLei-NOAA , You already reviewed this PR but changes were made after your review. Would you please review again? If you approve, I'll merge this PR into develop. Thanks.

@TingLei-NOAA
Copy link
Contributor

@RussTreadon-NOAA Got it! I will re-run a few regression tests of my own and have a quick look at the codes as soon as possible. Thank you!

@RussTreadon-NOAA
Copy link
Contributor

Thank you, @TingLei-NOAA

@RussTreadon-NOAA
Copy link
Contributor

Thank you, @TingLei-daprediction , for re-reviewing these changes.

@TingLei-daprediction
Copy link
Contributor

The current PR passed my tests : 3 FV3-LAM hybrid GSI using fv3-lam ensembles, global ensembles and both.
Congrats to @shoyokota putting this PR which add significant functions which GSI developers have been expecting.
Also a shoutout to @CatherineThomas-NOAA for helping verifying and finalizing a few "advanced" options in the current PR.
Approved.

Copy link
Contributor

@TingLei-NOAA TingLei-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an "duplicated" approval :) to avoid possible confusions. Because I had given the approval while using another github account.
@RussTreadon-NOAA Thanks a lot for your coordination to finally finish this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Scale/Variable/Time-Dependent Localization for EnVar
7 participants