Refine TC-Diag logic for handling missing data #2609

Closed
8 of 20 tasks
JohnHalleyGotway opened this issue Jul 12, 2023 · 4 comments · Fixed by #2680
Assignees
Labels
MET: Tropical Cyclone Tools priority: high High Priority requestor: NCAR National Center for Atmospheric Research type: enhancement Improve something that it is currently doing
Milestone

Comments

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Jul 12, 2023

Describe the Enhancement

These refinements to the handling of missing data were discussed on July 12, 2023 at the TC-Diag project-wide meeting (see meeting notes). As of MET version 11.1.0, if TC-Diag tries to read data from a GRIB file that does not exist, or can't find an individual field of data in a file that does exist, it exits with bad status:

ERROR  : 
ERROR  : get_series_entry() -> Could not find data for TMP/P1000, ValidTime = 20220924_060000 in file list:
ERROR  : :/Volumes/d1/projects/MET/MET_unit_test/MET_test_input/model_data/grib2/gfs/gfs.0p25.2022092400.f000.grib2,/Volumes/d1/projects/MET/MET_unit_test/MET_test_input/model_data/grib2/gfs/gfs.0p25.2022092400.f006_BAD.grib2,/Volumes/d1/projects/MET/MET_unit_test/MET_test_input/model_data/grib2/gfs/gfs.0p25.2022092400.f012.grib2,/Volumes/d1/projects/MET/MET_unit_test/MET_test_input/model_data/grib2/gfs/gfs.0p25.2022092400.f018.grib2,/Volumes/d1/projects/MET/MET_unit_test/MET_test_input/model_data/grib2/gfs/gfs.0p25.2022092400.f024.grib2
ERROR  : 

This task is to handle missing data that falls into 3 categories:

  1. The input file exists but the requested field does not. In this case, MET should print a warning message and populate the cylindrical coordinate field with bad data.
  2. The input file does not exist. In this case, MET should print a warning message about the missing file and populate the cylindrical coordinates with bad data for ALL fields.
  3. The track ends prior to the expected duration (e.g. 0 to 126 hours every 6 hours, by default). In this case, MET should print a warning message about lead times that are not present in the track and populate the cylindrical coordinates with bad data for ALL fields.
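The three categories above can be sketched in a few lines of Python. This is only an illustration of the decision order, not the TC-Diag C++ code: the names `Missing`, `classify`, and `fill_bad` are hypothetical, and the fill value of -9999 is assumed here to stand in for MET's bad data value.

```python
from enum import Enum

# Assumed missing-data fill value for this sketch.
BAD_DATA = -9999.0


class Missing(Enum):
    NONE = "none"          # data is present
    FIELD = "field"        # category 1: file exists, field does not
    FILE = "file"          # category 2: file does not exist
    TRACK_POINT = "track"  # category 3: track ends before this lead time


def classify(lead_hr, track_leads, file_fields, field):
    """Return the missing-data category for one field at one lead time.

    track_leads: set of lead hours present in the track.
    file_fields: set of field names in the input file, or None if the
                 file does not exist.
    """
    if lead_hr not in track_leads:
        return Missing.TRACK_POINT
    if file_fields is None:
        return Missing.FILE
    if field not in file_fields:
        return Missing.FIELD
    return Missing.NONE


def fill_bad(n_range, n_azimuth):
    """Build a cylindrical-coordinate grid filled entirely with bad data."""
    return [[BAD_DATA] * n_azimuth for _ in range(n_range)]


if __name__ == "__main__":
    # A missing file at a lead time that the track does cover.
    print(classify(6, {0, 6, 12}, None, "TMP/P1000"))
```

In every category other than `Missing.NONE`, the caller would log a warning and write the `fill_bad` grid rather than exiting.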

Some details:

  • Update the default TC-Diag config file to explicitly list the lead times to be processed, from 0 to 126 hours, every 6 hours.
  • For missing fields (1), missing input files (2), and missing track points (3), still write missing data values to the cylindrical coordinates files.
  • For (2) and (3), the temporary NetCDF files will contain ALL missing data. In this case, recommend calling the python diagnostics script WITHOUT specifying the temporary NetCDF file. That'll trigger the diagnostics script to just return all missing data values.
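As a rough sketch of the details above: the default lead times (0 to 126 hours, every 6 hours) are easy to enumerate, and the call to the diagnostics script can drop the temporary NetCDF file when everything is missing. The function names and the argument layout below are hypothetical and chosen only for illustration; the real script interface may differ.

```python
def default_lead_hours(start=0, stop=126, step=6):
    """List the lead times to process, inclusive of the final hour."""
    return list(range(start, stop + step, step))


def diag_script_args(script, tmp_nc_file, all_missing):
    """Build the argument list for a (hypothetical) diagnostics script call.

    When all data are missing (missing file or missing track point),
    omit the temporary NetCDF file so the script can return all
    missing data values.
    """
    if all_missing:
        return [script]
    return [script, tmp_nc_file]


if __name__ == "__main__":
    leads = default_lead_hours()
    print(len(leads), leads[0], leads[-1])
    print(diag_script_args("diag_script.py", "tmp_file.nc", all_missing=True))
```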

Time Estimate

2 or 3 days?

Sub-Issues

Consider breaking the enhancement down into sub-issues.
Hopefully none needed

Relevant Deadlines

Ideally complete this prior to Aug 1, 2023 so we can start #2550 at that point.

Funding Source

2700043

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED CYCLE ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added type: enhancement Improve something that it is currently doing requestor: NCAR National Center for Atmospheric Research MET: Tropical Cyclone Tools priority: high High Priority labels Jul 12, 2023
@JohnHalleyGotway JohnHalleyGotway added this to the MET 12.0.0 milestone Jul 12, 2023
JohnHalleyGotway added a commit that referenced this issue Jul 25, 2023
…args to control searching all files lists, erroring out, and printing warnings. The default settings maintain existing functionality. Update tc_diag code to suppress errors and warnings due to missing data. Still do need to suppress additional warnings from the vx_data libraries.
JohnHalleyGotway added a commit that referenced this issue Jul 25, 2023
…, and use it in the vx_series_data library to control to limit the missing data warning messages from tc-diag.
@JohnHalleyGotway
Collaborator Author

In emails on 7/26/23, we noted that a significant change is needed in the file processing logic. It is often the case that we process input files that contain all data for each valid time in a single file.

As of MET-11.1.0, we're still reading one field at a time from each input file. For the default set of 62 fields, that means we're opening/closing the same file at least 62 times. We definitely need to change that and read all 62 fields in the same pass for efficiency.

The decision is to add a boolean for the user to specify whether all input data for a single time can be read from the same input file. It should have a default value of true:

one_time_per_file = true;

If false, call the existing logic to search the list of input files for each field separately.
If true, implement new logic to read all 62 input fields from a single input file.
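To make the cost difference concrete, here is a toy Python model of the two code paths. It is not the MET implementation: files are modeled as plain dictionaries, and `read_per_field` and `read_single_pass` are hypothetical names. It only counts how many times a file would be opened under each approach.

```python
def read_per_field(files, wanted):
    """Existing-style logic: search the file list separately for each field.

    files:  list of dicts mapping field name -> data (one dict per file).
    Returns the fields found and the number of file opens performed.
    """
    opens = 0
    found = {}
    for name in wanted:
        for fields in files:
            opens += 1  # every lookup opens (and closes) a file
            if name in fields:
                found[name] = fields[name]
                break
    return found, opens


def read_single_pass(file_fields, wanted):
    """New-style logic: open one file once and read every requested field."""
    opens = 1
    found = {name: file_fields[name] for name in wanted if name in file_fields}
    return found, opens


if __name__ == "__main__":
    # One file holding all 62 fields for a single valid time.
    one_file = {f"FIELD_{i}": float(i) for i in range(62)}
    wanted = list(one_file)
    print(read_per_field([one_file], wanted)[1])   # opens, per-field path
    print(read_single_pass(one_file, wanted)[1])   # opens, single-pass path
```

With longer file lists the per-field path gets worse, since each field may scan several files before it is found.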

JohnHalleyGotway added a commit that referenced this issue Jul 26, 2023
…of the logger and reset it back to that state when done reading data. Get rid of the search_all_files flag since it'll always be true, even in tc_diag's use of it.
JohnHalleyGotway added a commit that referenced this issue Jul 26, 2023
…l need to actually implement logic to handle this in tc_diag.
@JohnHalleyGotway
Collaborator Author

Made progress updating logic for the one_time_per_file config option. Need to add support for a new read_data_planes() function to the class hierarchy to read multiple fields from the same input file while opening it only once.
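One way to picture the class-hierarchy change is the Python sketch below. It is only an analogy for the C++ design: the `DataFile` base class, the `_load` hook, the `open_count` counter, and the `InMemoryFile` subclass are all hypothetical names invented for this example. The point is that the multi-field read lives in the base class and opens the file once, while each derived class only supplies the format-specific loading.

```python
from abc import ABC, abstractmethod


class DataFile(ABC):
    """Base class: derived classes only implement the format-specific load."""

    def __init__(self, path):
        self.path = path
        self.open_count = 0  # counts simulated file opens

    @abstractmethod
    def _load(self):
        """Open the file and return a dict of field name -> data."""

    def read_data_plane(self, name):
        """Read one field; costs one file open per call."""
        self.open_count += 1
        return self._load().get(name)

    def read_data_planes(self, names):
        """Read many fields with a single file open.

        Fields that are not present map to None so the caller can
        substitute bad data and log a warning.
        """
        self.open_count += 1
        contents = self._load()
        return {name: contents.get(name) for name in names}


class InMemoryFile(DataFile):
    """Stand-in for a real format reader, backed by a dictionary."""

    def __init__(self, path, contents):
        super().__init__(path)
        self._contents = contents

    def _load(self):
        return dict(self._contents)


if __name__ == "__main__":
    f = InMemoryFile("example.grib2", {"TMP/P1000": 1.0, "UGRD/P850": 2.0})
    print(f.read_data_planes(["TMP/P1000", "MISSING"]), f.open_count)
```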

JohnHalleyGotway added a commit that referenced this issue Sep 6, 2023
@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented Sep 7, 2023

Running a very simple test with GRIB2 input files using the one_time_per_file config option, I see runtimes of:

  • one_time_per_file = TRUE;: 01:30
  • one_time_per_file = FALSE;: very, very long time (killed after 10 minutes)

So supporting and using this option is definitely recommended!

JohnHalleyGotway added a commit that referenced this issue Sep 7, 2023
…DataFile::data_planes() since the logic is the same for all derived classes.
@jvigh
Contributor

jvigh commented Sep 7, 2023

@sethlinden Please obtain test data from the following:

TC diagnostics: seneca:/d1/projects/TCDiag/data_output/DIAGNOSTICS/TCDIAG

adecks: seneca:/d1/projects/TCDiag/data_input/ATCF/adecks/NHC_JTWC

bdecks: seneca:/d1/projects/TCDiag/data_input/ATCF/bdecks/NHC_JTWC

@JohnHalleyGotway JohnHalleyGotway linked a pull request Sep 11, 2023 that will close this issue
15 tasks
JohnHalleyGotway added a commit that referenced this issue Sep 13, 2023
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: Seth Linden <linden@seneca.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: John and Cindy <halleygotway@Halleys-Mac-mini.local>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: Lisa Goodrich <lisag@ucar.edu>
Co-authored-by: metplus-bot <97135045+metplus-bot@users.noreply.github.com>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <jvigh@ucar.edu>
Co-authored-by: Tracy Hertneky <39317287+hertneky@users.noreply.github.com>
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
fix 2596 main v11.1 rpath compilation (#2614)
fix #2514 main_v11.1 clang (#2628)
Projects
Status: ✅ Done
2 participants