Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HRRR Availability and Related Metadata #593

Merged
merged 28 commits into from
Sep 21, 2023
Merged

HRRR Availability and Related Metadata #593

merged 28 commits into from
Sep 21, 2023

Conversation

cmarshak
Copy link
Collaborator

@cmarshak cmarshak commented Sep 15, 2023

Fixes

  • Issue #584: failed Raider step function in hyp3 job submission when HRRR model times are not available (even within the valid model range) - to resolve, we check availability of files when delay workflow called with a) azimuth_grid_interpolation and b) input to workflow is GUNW. If weather model files are unavailable and the GUNW is on s3, do nothing to GUNW (i.e. do not add tropo delay) and exit successfully. If weather model files are unavailable and the GUNW is on local disk, raise ValueError
  • Issue #587: similar to 584 except added here to the mix is control flow in RAiDER.py passes over numerous exceptions in workflow. This is fixed identically as above.

Removed

  • Removes update option (either True or False) from calcGUNW workflow which asks whether the GUNW should be updated or not. In existing code, it was not being used/applied, i.e. previous workflow always updated GUNW. Removed input arguments related from respective functions so that it can be updated later.

Added

  • Allow for Hyp3 GUNW workflow with HRRR (i.e. specifying a gunw path in s3) to successfully exit if any of the HRRR model times required for azimuth-time-grid interpolation (which is default interpolatin method) are not available when using bucket inputs (i.e. only on the cloud)
  • For GUNW workflow, when model is HRRR, azimuth_time_grid interpolation used, and using a local GUNW, if requisite weather model files are not available for raise a ValueError (before processing)
  • Raise a value error if non-unique dates are given in the function get_inverse_weights_for_dates in s1_azimuth_timing.py
  • Added metadata provenance for each delay layer that is included in GUNW and the cube workflow generally in calcDelays including:
    • model_times_used - the weather models used and interpolated for the delay calculation
    • interpolation_method - whether none, center_time, or azimuth_time_grid methods were used - see description in calcDelayGUNW
    • scene_center_time - the center time in which the associated SAR image was acquired
  • Stages GMAO data for GUNW testing of correct dataset update i.e. in the test test_GUNW_dataset_update.
  • Stages HRRR data for test_HRRR_ztd test.

Changed

  • Get only 2 or 3 model times required for azimuth-time-interpolation (previously obtained all 3 as it was easier to implement) - this ensures slightly less failures associated with HRRR availability. Importantly, if a acquisition time occurs during a model time, then we order by distance to the reference time and how early it occurs (so earlier times come first if two times are equidistant to the aquisition time).
  • Made test names in test_GUNW.py more descriptive
  • Numpy docstrings and general linting to modified function including removing variables that were not being accessed
  • hrrr_download to ensure that hybrid coordinate is obtained regardless how herbie returns datacubes and ensures test_HRRR_ztd passes consistently
    • Remove the command line call in test_HRRR_ztd.py and call using the python mock up of CLI for better error handling and data mocking.
  • Return xarray.Dataset types for RAiDER.calcGUNW.tropo_gunw_slc and RAiDER.raider.calcDelayGUNW for easier inspection and testing

@cmarshak cmarshak marked this pull request as draft September 15, 2023 01:12
@cmarshak
Copy link
Collaborator Author

cmarshak commented Sep 19, 2023

For relevant GUNWs where weather models are not available here is the new output from the PR (the GUNW is from #584) and the same will be true for #587.

raider.py ++process calcDelaysGUNW -f S1-GUNW-A-R-106-tops-20170210_20170129-225948-00078W_00042N-PP-cde3-v3_0_0.nc -m HRRR -interp azimuth_time_grid -m HRRR -interp azimuth_time_grid
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-10 23:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-10 22:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-11 00:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
💔 Did not find ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-29 23:00 UTC F00
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-29 22:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-30 00:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
Traceback (most recent call last):
  File "/Users/cmarshak/mambaforge/envs/RAiDER/bin/raider.py", line 8, in <module>
    sys.exit(main())
  File "/Users/cmarshak/bekaert-team/RAiDER/tools/RAiDER/cli/__main__.py", line 42, in main
    process_entry_point.load()()
  File "/Users/cmarshak/bekaert-team/RAiDER/tools/RAiDER/cli/raider.py", line 609, in calcDelaysGUNW
    raise ValueError('The required HRRR data for time-grid interpolation is not available')
ValueError: The required HRRR data for time-grid interpolation is not available

and for S3 buckets

raider.py ++process calcDelaysGUNW --bucket s3://dummy/ --bucket-prefix S1-GUNW-A-R-106-tops-20170210_20170129-225948-00078W_00042N-PP-cde3-v3_0_0.nc -m HRRR -interp azimuth_time_grid -m HRRR -interp azimuth_time_grid
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-10 23:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-10 22:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Feb-11 00:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
💔 Did not find ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-29 23:00 UTC F00
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-29 22:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
✅ Found ┊ model=hrrr ┊ product=nat ┊ 2017-Jan-30 00:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws
The required HRRR data for time-grid interpolation is not available; returning None and not modifying GUNW dataset

@cmarshak cmarshak marked this pull request as ready for review September 19, 2023 17:27
@@ -76,7 +77,7 @@ def read_template_file(fname):
params[key] = {}

# Parse the user-provided arguments
template = DEFAULT_DICT
template = DEFAULT_DICT.copy()
Copy link
Collaborator Author

@cmarshak cmarshak Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extremely important piece that @jhkennedy help me add. The change ensures tests pass with updates to this PR. The DEFAULT_DICT is mutable and if its changed during runtime (i.e. during tests) there are mind boggling consequences (might be worth changing default dict to a constructor e.g. a frozen dataclass so the dict is constant). For now, this resolves this issue with random tests not passing after above changes made. There is a fixture in test/test_checkArgs.py that loads this dictionary. After adding this copy, the few tests that were failing that had nothing to do with modified files, passed. It's unclear except in raider.py how this dictionary gets modified during tests. Might also have to do with how pytest and pytest-mock work. Again, for another time.

Remove the decimal reduction of almost equal in HRRR_ztd
Copy link
Collaborator

@jlmaurer jlmaurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmarshak I think this looks good! I actually added a bunch of your updates in #585 before I read this PR. I just had a few minor comments (on your comments actually!); I think it will be best to merge this first then I will rebase #585 on top of it.

if ds_list_filt_0:
ds_out = ds_list_filt_0[0]
coord = 'hybrid'
# I do not think that this coord name will result in successful processing nominally as variables are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this comment to "TODO: check whether isobaricInhPa workflow will result in successfull processing"

@@ -410,12 +416,18 @@ def calcDelays(iargs=None):
else:
out_filename = w

# A dataset was returned by the above
# Dataset returned: Cube e.g. GUNW workflow
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these comments needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tropo_delay has the two different return types and I was trying to figure out which of these two options the GUNW was using so I could add a breakpoint - would be nice to have a sign post.

Can change language so long as going back it's clear that GUNW uses a particular branch.

@jlmaurer
Copy link
Collaborator

@cmarshak I'm going to go ahead and merge this and will modify the comments in the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants