Add GSI monitor scripts #969

@WalterKolczynski-NOAA WalterKolczynski-NOAA commented Aug 12, 2022

Description
As part of the overarching goal of moving global workflow scripts out of component repositories and under global workflow, all of the scripts currently linked in GSI monitor are added to global workflow. Scripts have been updated to reflect standardization already made to other workflow scripts (preamble, no backticks, etc.). See 7aa637 for those modifications specifically.

Related GSI-Monitor issue: NOAA-EMC/GSI-Monitor/pull/23

Fixes #967

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Cycled test on Hera

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes

Copies scripts out of the GSI monitor repo into global workflow.

Refs: NOAA-EMC#967

Updates the new GSI monitor scripts to incorporate some of the improvements
already added to other workflow scripts:
- Let preamble handle script entry/exit messages and bash set
- Replace backticks with $( ) for subshells
- Remove env prints

Refs: NOAA-EMC#967
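
For illustration only, here is a minimal sketch of the backtick replacement listed above. The path and variable names are hypothetical and are not taken from the GSI monitor scripts.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical path; basename/dirname only manipulate the string,
# so the file does not need to exist.
example_path="/tmp/diag/radstat.tar"

# Old style: backticks are hard to nest and easy to misread.
old_style=`basename "${example_path}"`

# New style: $( ) nests without escaping.
new_style=$(basename "$(dirname "${example_path}")")

echo "${old_style} ${new_style}"
```

Both forms run a command substitution; the `$( )` form is preferred mainly because it nests cleanly.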
@WalterKolczynski-NOAA
Contributor Author

Still waiting for another cycle or two to complete on Hera.

.gitignore Outdated Show resolved Hide resolved
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/GSI-Monitor that referenced this pull request Aug 12, 2022
Global workflow is absorbing all scripts that are used only by global workflow
but are currently held in other repos.

Related workflow issue NOAA-EMC/global-workflow/pull/967
Related workflow PR NOAA-EMC/global-workflow/pull/969
@WalterKolczynski-NOAA
Contributor Author

WalterKolczynski-NOAA commented Aug 12, 2022

Added a companion issue to GSI-Monitor that removes the scripts added here.

@WalterKolczynski-NOAA
Contributor Author

So, after running this I realize we don't test these jobs in any of our current development layouts. @EdwardSafford-NOAA @aerorahul can this be tested (or at least mostly tested) some other way, or should we just fix it later once we have it set up to run?

@EdwardSafford-NOAA
Contributor

@WalterKolczynski-NOAA I can run the test driver scripts. It's not a robust test but it will be something. I'll try to make that happen this morning and will advise.

@aerorahul
Contributor

aerorahul commented Aug 15, 2022

From config.vrfy:

[Screenshot: excerpt from config.vrfy]

Do these not trigger the GSI monitoring jobs/scripts?

@EdwardSafford-NOAA
Contributor

Unfortunately the test-driver scripts will require some work to run with these changes. It won't be difficult but it's more than I can do quickly today.

I don't know a great deal (yet) about the global-workflow, but from what I see @aerorahul is right -- the settings in config.vrfy should set the 3 monitor jobs to run.

@WalterKolczynski-NOAA
Contributor Author

> From config.vrfy:
>
> [Screenshot: excerpt from config.vrfy]
>
> Do these not trigger the GSI monitoring jobs/scripts?

Oh, they might. gempak and awips get turned on in config.base, so I didn't think to look anywhere else. Will check.

@WalterKolczynski-NOAA
Contributor Author

> > From config.vrfy:
> > [Screenshot: excerpt from config.vrfy]
> > Do these not trigger the GSI monitoring jobs/scripts?
>
> Oh, they might. gempak and awips get turned on in config.base, so I didn't think to look anywhere else. Will check.

They don't appear to. I actually already had them on for my run and I don't see any monitoring jobs, unless they are run as part of another job name (which they shouldn't be, since they have j-jobs).

`ush/radmon_ck_stdout.sh` was being ignored, but it was not even being
linked by the link script anymore, so the ignore is unnecessary.

Refs: NOAA-EMC#967
@RussTreadon-NOAA
Contributor

config.vrfy defines TANKverf variables under the VRFYRAD, VRFYMINMON, and VRFYOZN sections. I believe monitoring output is written to the TANKverf directories. The rocoto workflow executes the monitoring jobs from vrfy.sh. This differs from operations. A check of gdasvrfy.log will show us if (a) the monitoring jobs were run and (b) where the output is saved.

We (developers) run some jobs which operations does not run. For other jobs which both ops and dev run, we sometimes run the jobs from different places in the workflow and write results to different directory structures.

@aerorahul
Contributor

> config.vrfy defines TANKverf variables under the VRFYRAD, VRFYMINMON, and VRFYOZN sections. I believe monitoring output is written to the TANKverf directories. The rocoto workflow executes the monitoring jobs from vrfy.sh. This differs from operations. A check of gdasvrfy.log will show us if (a) the monitoring jobs were run and (b) where the output is saved.
>
> We (developers) run some jobs which operations does not run. For other jobs which both ops and dev run, we sometimes run the jobs from different places in the workflow and write results to different directory structures.

Thanks @RussTreadon-NOAA. Your feedback and in-depth knowledge of the system are greatly appreciated.
This part of the workflow is not exercised by all developers, so it remains somewhat unknown to most who do not work with the data assimilation components.
We hope to do this better, perhaps starting by not bundling so many activities into the same vrfy job. A lot gets swept under this one job.

@WalterKolczynski-NOAA
When we run the next "cycled" test, see if we can run vrfy and capture some of the DA verification and monitoring jobs output. We might need to run for a few cycles.
If you are running a test with this branch, I can take a look. I think I remember enough to tell whether this is being exercised or not.

@WalterKolczynski-NOAA
Contributor Author

Okay, at least some of these scripts are running. Currently exgdas_atmos_verfrad.sh is dying on a failed move for abi_g16. I can add checks that the gzipped file exists before trying to rename and expand it, but are there any SATYPEs that should cause a failure if missing?
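
A rough sketch of the kind of check described here, for illustration only. The file naming (`diag_<type>.gz`), the `hypothetical_sensor` type, and the `missing` array are assumptions and do not come from exgdas_atmos_verfrad.sh; only `abi_g16` appears in this thread.

```bash
#!/usr/bin/env bash
set -eu

# Scratch directory with one gzipped file present and one absent.
workdir=$(mktemp -d)
printf 'data\n' | gzip > "${workdir}/diag_hypothetical_sensor.gz"

missing=()
for type in hypothetical_sensor abi_g16; do
  src="${workdir}/diag_${type}.gz"
  if [[ -f "${src}" ]]; then
    # Rename and expand only when the gzipped file is really there.
    mv "${src}" "${workdir}/${type}.gz"
    gunzip "${workdir}/${type}.gz"
  else
    # Record the missing type instead of letting mv fail under set -e.
    missing+=("${type}")
  fi
done

echo "missing types: ${missing[*]}"
```

Collecting the missing types in a list, rather than exiting, leaves room to report them later instead of aborting the job.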

@EdwardSafford-NOAA
Contributor

Yikes! Missing diag files are common and shouldn't cause the script to die, but rather be reported as missing in the warning report. Can you point me to your log file? I'd like to take a look.

@NOAA-EMC NOAA-EMC deleted a comment from bubba0077 Aug 15, 2022
@WalterKolczynski-NOAA
Contributor Author

> Yikes! Missing diag files are common and shouldn't cause the script to die, but rather be reported as missing in the warning report. Can you point me to your log file? I'd like to take a look.

This is just a side-effect of having set -e on now. I just wanted to confirm none of these should cause a failure.
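
To make the `set -e` side effect concrete, here is a small self-contained sketch. The file names are hypothetical; each case runs in a child `bash -c` so the failure does not stop the demonstration itself.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)

# Unguarded: with set -e, the failed mv ends the child shell before echo runs.
unguarded=$(bash -c 'set -e
  mv "$1/no_such_file" "$1/renamed" 2>/dev/null
  echo "reached"' _ "${workdir}" || true)

# Guarded: testing for the file first means mv is never attempted.
guarded=$(bash -c 'set -e
  if [[ -f "$1/no_such_file" ]]; then
    mv "$1/no_such_file" "$1/renamed"
  fi
  echo "reached"' _ "${workdir}")

echo "unguarded='${unguarded}' guarded='${guarded}'"
```

Without `set -e`, the unguarded case would print an error from `mv` and carry on, which is presumably how the scripts behaved before the preamble was adopted.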

Contributor

@EdwardSafford-NOAA EdwardSafford-NOAA left a comment

Looks ok to me.

One radmon script is needed but was not previously linked, so it hadn't
been added to the repository.

Refs: NOAA-EMC#967
@WalterKolczynski-NOAA
Contributor Author

WalterKolczynski-NOAA commented Aug 19, 2022

Finally worked through all the issues due to undefined variables or operating on missing files.

@EdwardSafford-NOAA Please carefully review specifically those changes in the most recent commit (cb29351) to make sure I haven't caused any unwanted behavior with my changes.

After moving the GSI monitor scripts to workflow, several of the scripts
would fail because set -eu is on, either due to undefined variables or
trying to operate on non-existent files. This corrects some of those issues.

Some of the non-existent files were globs; for these, the bash built-in
`compgen -G` is used to check whether the pattern resolves to any files.

Refs: NOAA-EMC#967
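
As a hedged illustration of the two techniques mentioned in this commit message: the helper `has_match`, the file names, and `HYPOTHETICAL_VERBOSE` are invented for the example and are not taken from the actual scripts.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
touch "${workdir}/diag_a.txt" "${workdir}/diag_b.txt"

# compgen -G prints the files matching a glob pattern and returns
# non-zero when nothing matches, so it is safe to use as a test.
has_match() {
  compgen -G "$1" > /dev/null
}

if has_match "${workdir}/diag_*.txt"; then found_txt="yes"; else found_txt="no"; fi
if has_match "${workdir}/diag_*.gz"; then found_gz="yes"; else found_gz="no"; fi

# Under set -u, give possibly-undefined variables an explicit default.
verbose=${HYPOTHETICAL_VERBOSE:-NO}

echo "txt=${found_txt} gz=${found_gz} verbose=${verbose}"
```

A plain `[[ -f pattern* ]]` test does not expand the glob, which is one reason a built-in like `compgen -G` is convenient here.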
Contributor

@EdwardSafford-NOAA EdwardSafford-NOAA left a comment

This looks good and thank you for making many improvements. I noted some minor things that can be removed, either now or as a later step.

Removes unused mail options and an outdated comment from the radmon scripts.

Refs: NOAA-EMC#967
@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit da164f8 into NOAA-EMC:develop Aug 23, 2022
@WalterKolczynski-NOAA WalterKolczynski-NOAA deleted the feature/gsi_mon_scripts branch August 23, 2022 15:05
KateFriedman-NOAA added a commit that referenced this pull request Aug 26, 2022
* develop:
  Update UFS_UTILS tag to `ufs_utils_1_8_0` (#1001)
  Fix preamble id (#996)
  Add missing "atmos" into job dependencies (#998)
  Bugfix in arch.sh to remove hardwired "htar" (#992)
  Add in stubs for aerosol DA tasks + bugfix for setup_expt where cycled and ATMA are used (#990)
  Add GSI monitor scripts (#969)
  Fix product generation at some fcst hrs (#988)
KateFriedman-NOAA added a commit to KateFriedman-NOAA/global-workflow that referenced this pull request Jan 30, 2023
* develop:
  Correct issue in linking final restart files (NOAA-EMC#1285)
  Remove execute permissions from config files (NOAA-EMC#1281)
  Make needed updates to run forecast from GEFS (NOAA-EMC#1203)
  Remove unnecessary variables which reference to nemsio (NOAA-EMC#1259)
  Create analysis files for early-cycle EnKF by default (NOAA-EMC#1237)
  Don't wipe $DATA before running ocean bmat (NOAA-EMC#1280)
  More marine DA j-jobs (NOAA-EMC#1270)
  Update UFS-DA atmospheric prep script to be consistent with GDASApp update (NOAA-EMC#1265)
  Add new jjob for ocean analysis bmat (NOAA-EMC#1239)
  Retire ecf/versions in develop (NOAA-EMC#1267)
  Deploy documentation to RTD (NOAA-EMC#1264)
  Temporarily disable failing pytest (NOAA-EMC#1263)
  Remove incorrect/misleading comments in config.base (NOAA-EMC#1261)
  Add initial Sphinx documentation (NOAA-EMC#1258)
  Remove nemsio support (NOAA-EMC#1255)
  Increase wallclock for diag jobs (NOAA-EMC#1216)
  Use correct resources for GFS gempak (NOAA-EMC#1214)
  Abstract common j-job tasks (NOAA-EMC#1230)
  Add missing mkgfsawps.x link (NOAA-EMC#1218)
  Fix post sounding job (NOAA-EMC#1212)
  Revert "Use fracoro data for all new UFS applications (NOAA-EMC#1182)" (NOAA-EMC#1240)
  Use fracoro data for all new UFS applications (NOAA-EMC#1182)
  Revert "Merge GFS v16.3 operational GSI changes into develop branch. (NOAA-EMC#1158)" (NOAA-EMC#1238)
  Add more user defined parameters for the marine DA (NOAA-EMC#1235)
  Update pytests action version and run sequentially (NOAA-EMC#1236)
  Add utility to compare Fortran namelists (NOAA-EMC#1234)
  Updates for pygw (NOAA-EMC#1231)
  Merge GFS v16.3 operational GSI changes into develop branch. (NOAA-EMC#1158)
  Move member up in directory hierarchy (NOAA-EMC#1201)
  Enable staging ics for cycled experiments. (NOAA-EMC#1199)
  Add tests for configuration.py (NOAA-EMC#1192)
  Replace ocnanal_${CDATE}} with ${RUN}ocnanal_${cyc} (NOAA-EMC#1191)
  define NET and RUN in the Rocoto XML to accurately mimic the ecf in ecflow (NOAA-EMC#1193)
  Fix checking for restart files (NOAA-EMC#1186)
  Fix 'DEBUG' option in build_ufs.sh (NOAA-EMC#1188)
  Update archive job memory request value for R&Ds (NOAA-EMC#1183)
  Reorder post so all flux files are generated when running offline (NOAA-EMC#1181)
  Stop checking for restarts on non-GFS CDUMPs (NOAA-EMC#1179)
  Add missing jobids in some pre-job scripts (NOAA-EMC#1176)
  Remove existing directory if it exists when getic runs (NOAA-EMC#1165)
  Add logging decorator, test and test for yaml_file (NOAA-EMC#1178)
  fix coding norm check in `hosts.py` (NOAA-EMC#1174)
  Fix some bugs and make other changes so ctest in GDASApp works (NOAA-EMC#1172)
  Support for the GDASApp testing in containers (NOAA-EMC#1151)
  ATM 3DVAR with and without IAU (NOAA-EMC#1113)
  Enable checking for python norms and fix violating code (NOAA-EMC#1168)
  Enforce decimal math in atmos post (NOAA-EMC#1171)
  Update marine DA j-jobs to new format (NOAA-EMC#1149)
  Add utility to manipulate files en masse  (NOAA-EMC#1166)
  add action to run pytests (NOAA-EMC#1167)
  Pin `differential-shellcheck` to `v3` tag (NOAA-EMC#1162)
  Add a task base class and basic logger (NOAA-EMC#1160)
  Recursively convert dict to AttrDict when making an AttrDict (NOAA-EMC#1154)
  move configuration.py to pygw. Use it from there.  return AttrDict after sourcing configs (NOAA-EMC#1153)
  JEDI based Marine DA tasks (NOAA-EMC#1134)
  Allow customizations based on user/configuration (NOAA-EMC#1146)
  First step towards making j-jobs consistent in use from ecflow and rocoto (NOAA-EMC#1120)
  enable APP=S2SWA on WCOSS2 (NOAA-EMC#1142)
  Fix typo in .shellcheckrc
  Remove prod_envir module load from WCOSS2 (NOAA-EMC#1138)
  Link staged GSI fix files instead of cloning them from gerrit (NOAA-EMC#1132)
  Address shellcheck warnings in env files (NOAA-EMC#1136)
  Adds group size and nmem for GEFS (NOAA-EMC#1127)
  Remove unnecessary sCDATE assignment in forecast_predet.sh (NOAA-EMC#1133)
  Convert archive jobs to proper j-jobs (NOAA-EMC#1115)
  Update C48 forecast to run with one thread (NOAA-EMC#1131)
  Improved error messages from atmos analysis (NOAA-EMC#1125)
  Update MODULEPATH for Orion (NOAA-EMC#1126)
  MPMD variable updates and fix (NOAA-EMC#1124)
  Introduce FHMAX_ENKF_GFS to extending ensemble forecast capabilities (NOAA-EMC#1122)
  Update R&D launcher commands for tasks and multi-prog (NOAA-EMC#1112)
  Correct crtm path in UFS DA atmospheric analysis scripts (NOAA-EMC#1111)
  Correct syntax in remaining sorc scripts (NOAA-EMC#1105)
  Add GSI background error covariance as an option for UFS DA variational assimilation (NOAA-EMC#1104)
  Add Early Cycle EnKF workflow (NOAA-EMC#1022)
  Correct errors with gdas and monitoring symlinks (NOAA-EMC#1101)
  Fixed gfs-utils links (NOAA-EMC#1099)
  Fix build scripts and bring into compliance (NOAA-EMC#1096)
  Feature/updates for gdas app (NOAA-EMC#1091)
  Change GLDAS USE_CFP to NO on Hera (NOAA-EMC#1094)
  Resource updates to support WCOSS2 (NOAA-EMC#1070)
  Set COMPILER in link for detect machine (NOAA-EMC#1092)
  gfs utils update (NOAA-EMC#1088)
  GFS-UTILS update for build and ush scripts (NOAA-EMC#1082)
  Update UFS version to 2022 Oct 19 (NOAA-EMC#1083)
  Use more cycledefs for task control (NOAA-EMC#1078)
  removing superfluous EFSOI-specific files from develop (NOAA-EMC#1079)
  Update UFS to Sept 9 version (NOAA-EMC#1073)
  Modify default file location for monitor data when using rocoto (NOAA-EMC#1065)
  Fix companion ocean resolution for C48 (NOAA-EMC#1066)
  Add trailing slash for gldas topo path (NOAA-EMC#1064)
  Limit number of CPU for post (NOAA-EMC#1061)
  Fix eupd trace (NOAA-EMC#1057)
  Port to S4 (NOAA-EMC#1023)
  Update to obsproc.v1.0.2 and prepobs.v1.0.1 (NOAA-EMC#1049)
  Add GDAS to the partial build list (NOAA-EMC#1050)
  Fix group number being treated as octal in gdas arch (NOAA-EMC#1053)
  Remove trace from link script (NOAA-EMC#1046)
  Update gfs-utils hash to 3a609ea (NOAA-EMC#1048)
  Fix link script usage statement (NOAA-EMC#1045)
  Replace preamble variable commands with functions (NOAA-EMC#1012)
  Implement fix reorg and remove gfs-utils code (NOAA-EMC#1009)
  Rename post scripts (NOAA-EMC#1038)
  Fix missing @ symbol with COMINsyn in config.base (NOAA-EMC#1039)
  WCOSS2 run support and script/config updates (NOAA-EMC#1030)
  Remove base_svn from Hera and Orion hosts files (NOAA-EMC#1036)
  initial commit for incoming yaml work (NOAA-EMC#1029)
  Fix radiance verification failing to find diag files (NOAA-EMC#1031)
  Supported resolutions on platforms and defaults for mode (NOAA-EMC#1026)
  Add GLDAS scripts & fix GLDAS job (NOAA-EMC#1018)
  Update GSI Monitor for radmon fix
  Correct shell linter config (NOAA-EMC#1013)
  Correct diagnostic file handling in ush/ozn_xtrct.sh (NOAA-EMC#1016)
  Add shell linter Github action for pull requests (NOAA-EMC#1007)
  Build updates for WCOSS2 (NOAA-EMC#1002)
  Update UFS_UTILS tag to `ufs_utils_1_8_0` (NOAA-EMC#1001)
  Fix preamble id (NOAA-EMC#996)
  Add missing "atmos" into job dependencies (NOAA-EMC#998)
  Bugfix in arch.sh to remove hardwired "htar" (NOAA-EMC#992)
  Add in stubs for aerosol DA tasks + bugfix for setup_expt where cycled and ATMA are used (NOAA-EMC#990)
  Add GSI monitor scripts (NOAA-EMC#969)
  Fix product generation at some fcst hrs (NOAA-EMC#988)
  Add initial config files for global aerosol DA (NOAA-EMC#986)
  Update diag table to remove wav-ocn coupling fields (NOAA-EMC#979)
  use a robust Findwgrib2.cmake to find wgrib2 built w/ native wgrib2 build (NOAA-EMC#970)
  Externals.cfg was stale and had drifted off (NOAA-EMC#965)
  Fix post comparison with zero-padded numbers (NOAA-EMC#964)
Labels
maintenance Regular updates and maintenance work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Absorb GSI monitor scripts
5 participants