TC-Gen: Create a new MET-TC tool to handle tropical cyclone genesis. #1127

Closed
JohnHalleyGotway opened this issue May 16, 2019 · 9 comments

@JohnHalleyGotway
Collaborator

Write new tool to handle tropical cyclone genesis (tc-gen, tc_gen, tcgen).

@JohnHalleyGotway JohnHalleyGotway added this to the MET 9.0 milestone May 16, 2019
@JohnHalleyGotway JohnHalleyGotway self-assigned this May 16, 2019
@JohnHalleyGotway JohnHalleyGotway changed the title from "Create a new MET-TC tools to handle tropical cyclone genesis." to "Create a new MET-TC tool to handle tropical cyclone genesis." May 16, 2019
@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented May 17, 2019

Questions To/Answers From Kathryn:

  • No need to distinguish between ADECK/BDECK. Just read from "-atcf" command line option and let config file define the comparisons to be made.
  • No need to derive tracks on the fly (interp12, consensus, lag times, reference tracks like BCLIP)
  • No need to make check_dup configurable... but make code smart enough to ignore duplicate genesis events.
  • No need to filter by watch/warning status or distance to land.
  • The match_points option does not apply.
  • No valid_mask is needed but init_mask is needed. Instead, just name it vx_mask.
  • Probably don't need init_beg/init_end/init_inc/init_exc/init_hour/valid_beg/valid_end. Instead just define time_beg and time_end... which can be fairly applied to the fcst/obs genesis events.
  • Do we need the ATCF ID "suffix" option like we have for tc_pairs?

From Dan:

  • Would want to filter by model initialization time. May also want to filter by initialization hour or lead time.
  • For example, let's say there's a disturbance at forecast hour 48. We decide if it's a hit or false alarm by searching the best track. However, this will likely result in several misses.
    Sounds like we should dump out MPR information as well so that we can filter them down more later. Make it configurable which lead times should be included.
    ... do need to put back in the time filters.

In the past, he has defined hits, misses, and false alarms, but not correct negatives.

For a genesis event in the best track, the model has 20 opportunities to forecast that event if it's initialized every 6 hours. If it gets it right 1/2 the time, then that's 10 hits and 10 misses.
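The figure of 20 opportunities is consistent with a 6-hourly model and a maximum lead time of 5 days (120 hours). The 5-day window is an assumption here, since this comment does not state it. A minimal Python sketch of the arithmetic, using hypothetical function names:

```python
def forecast_opportunities(max_lead_hours, init_interval_hours):
    """Number of model cycles that could forecast one genesis event."""
    return max_lead_hours // init_interval_hours


def hits_and_misses(opportunities, hit_rate):
    """Split the opportunities into hits and misses for a given hit rate."""
    hits = round(opportunities * hit_rate)
    return hits, opportunities - hits


# Assumed 120-hour maximum lead time, model initialized every 6 hours.
n = forecast_opportunities(120, 6)
print(n, hits_and_misses(n, 0.5))
```

Each best-track genesis event therefore contributes many forecast/observation pairs, one per initialization cycle, rather than a single pair.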

Some questions...
(1) Need to update the wwpts file for MET-TC since the last date is 2014 - Ask Mark.
(2) We'd like to include the watches/warnings in
(3) Can have him run via a Docker container... need to place Dockerfile in MET repo.

Do we need to support the JTWC format? Supporting that would reach a wider audience, but there is no direct requirement for it in this project.

3 input formats to support:

  • ATCF output from Tim Marchok.
  • Tropical weather outlook disturbance files containing 24/48 hour probabilities (do this in tc-gen or tc-pairs).
  • Experimental probabilistic forecasts... need a standard format such as an ADECK format.

Need to compare versus the BEST track and CARQ combined.

//
// Model initialization time windows to include or exclude
//
init_beg = "";
init_end = "";
init_inc = [];
init_exc = [];

init_hour = [];

//
// Valid model time window
//
valid_beg = "";
valid_end = "";

//
// Required lead time in hours
//
lead_req = [];
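As a rough illustration of how the time filters above might be applied to a single forecast genesis event, here is a Python sketch. It is not MET code. The `keep_event` function, the `YYYYMMDD_HH` time-string format, and the two-digit `init_hour` strings are assumptions made for this example. `lead_req` is left out because its meaning for genesis events is not settled in this thread.

```python
from datetime import datetime

TIME_FMT = "%Y%m%d_%H"  # assumed time-string format for this sketch


def _parse(s):
    """Convert a config time string to a datetime; empty means no limit."""
    return datetime.strptime(s, TIME_FMT) if s else None


def keep_event(init, valid, cfg):
    """Return True if a forecast genesis event passes the time filters."""
    init_beg = _parse(cfg.get("init_beg", ""))
    init_end = _parse(cfg.get("init_end", ""))
    valid_beg = _parse(cfg.get("valid_beg", ""))
    valid_end = _parse(cfg.get("valid_end", ""))
    init_inc = [_parse(s) for s in cfg.get("init_inc", [])]
    init_exc = [_parse(s) for s in cfg.get("init_exc", [])]
    init_hour = cfg.get("init_hour", [])  # assumed to hold "HH" strings

    if init_beg and init < init_beg:
        return False
    if init_end and init > init_end:
        return False
    if init_inc and init not in init_inc:
        return False
    if init in init_exc:
        return False
    if init_hour and f"{init.hour:02d}" not in init_hour:
        return False
    if valid_beg and valid < valid_beg:
        return False
    if valid_end and valid > valid_end:
        return False
    return True


cfg = {"init_beg": "20190812_00", "init_hour": ["00", "12"],
       "init_exc": ["20190813_00"]}
print(keep_event(datetime(2019, 8, 12, 0), datetime(2019, 8, 15, 0), cfg))
```

Empty strings and empty lists disable the corresponding filter, which mirrors the defaults shown in the config snippet.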

@KathrynNewman
Contributor

Location of Dan's Python function: https://github.com/NCAR/gmtb-utilities/tree/master/tcgen/scripts

@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented May 23, 2019

Hi all,

I wanted to provide a better explanation of the reason we use the a-deck CARQ entries for verification. Consider the following scenarios:
  1. We are verifying GFDL tracker output from the GFS. We define a temporal tolerance of +/-24 hr and a spatial tolerance of +/-5 degrees lat/lon to define a hit. The GFS 08-12-00Z model initialization cycle forecasts a TC genesis event valid at 08-15-00Z at 20N, 70W. The first line in the Best-Track for storm AL 03 shows genesis occurring at 08-16-00Z at 20N, 76W. If we consider only this information, the forecast would be outside of our spatial tolerance and therefore a false alarm. There is also a CARQ entry in the AL 03 a-deck file that shows that the disturbance that eventually became AL 03 was located at 20N, 72W on 08-15-00Z. So at the valid time of the GFS forecast, the model was only off by 2 degrees lon. The timing error of genesis, while within our tolerance, caused a spatial error outside of our tolerance. Arguably, this is not quite a hit, but not quite a false alarm. Davis et al. (2016) define a 3 x 3 contingency table to deal with this type of error. The scenario described above would correspond to an "early genesis" event, which they define as YM in their contingency table. The opposite scenario (i.e., the model predicts genesis too late) is defined as MY. They used MET-TC (TC-Pairs?) to generate their statistics, so the logic to account for this scenario may already exist. See their Section 2 and Appendix, and their Fig. A1 in particular: https://journals.ametsoc.org/doi/pdf/10.1175/MWR-D-16-0021.1

  2. The 08-12-00Z issuance time of one of the experimental probabilistic genesis guidance products shows a 70% probability of genesis occurring at 20N, 70W on 08-15-00Z. The spatial tolerance to define a hit is 5 degrees lat/lon. Temporally, these forecasts are verified the same way that NHC verifies their Tropical Weather Outlooks: Did genesis occur within 5 days of the initialization/issuance time? If yes, it's a hit. Let's say we have the same Best-Track and a-deck CARQ info as scenario 1. We would not be able to match the forecast to an entry in the Best-Track because the forecast valid time is before the Best-Track genesis time of 08-16-00Z. However, we would be able to match the forecast with the a-deck CARQ entry for the valid time, 08-15-00Z. We then say that because the Best-Track genesis time of 08-16-00Z is within 5 days of the 08-12-00Z issuance time, the forecast is a hit.

When verifying forecasts in either scenario above, my logic is:

  • For each forecast, search the b-decks for an entry with the same time as the forecast valid time.
    • If a match is found, compare the lat/lon difference between the forecast and the b-deck entry.
      • If the lat/lon difference is within the spatial tolerance, consider the forecast to correspond to storm XX.
    • If no match is found, search the a-decks for a CARQ entry (forecast hour 0) with the same time as the forecast valid time.
      • If a match is found, compare the lat/lon difference between the forecast and the a-deck entry.
        • If the lat/lon difference is within the spatial tolerance, consider the forecast to correspond to storm XX.
    • If a match was found in either the a- or b-decks, compare the forecast to the b-deck genesis information to determine whether the forecast is a hit.
    • If no match is found in either the a- or b-decks, the forecast is a false alarm.
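The search order above can be sketched in Python as follows. This is only an illustration under stated assumptions, not Dan's script or MET code. The record layout (dicts with `storm_id`, `time`, `lat`, `lon`), the function names, the year 2019, and the `"unresolved"` label are all hypothetical. The list does not say how to label a forecast that matches a storm but misses the temporal tolerance, so the sketch leaves that case unresolved. Longitudes are degrees east (west is negative) and date-line wrapping is ignored.

```python
from datetime import datetime, timedelta


def find_storm(fcst, entries, tol_deg):
    """Return the storm_id of the first entry at the forecast valid time
    whose lat/lon differences are both within tol_deg, else None."""
    for e in entries:
        if (e["time"] == fcst["valid"]
                and abs(e["lat"] - fcst["lat"]) <= tol_deg
                and abs(e["lon"] - fcst["lon"]) <= tol_deg):
            return e["storm_id"]
    return None


def classify(fcst, bdeck, carq, genesis_times, tol_deg=5.0, tol_hours=24):
    """Search the b-deck first, then the a-deck CARQ entries, then compare
    the forecast valid time with the b-deck genesis time of the storm."""
    storm = find_storm(fcst, bdeck, tol_deg) or find_storm(fcst, carq, tol_deg)
    if storm is None:
        return "false_alarm"
    dt = abs(fcst["valid"] - genesis_times[storm])
    return "hit" if dt <= timedelta(hours=tol_hours) else "unresolved"


# Scenario 1, with an arbitrary year chosen for the example.
fcst = {"valid": datetime(2019, 8, 15, 0), "lat": 20.0, "lon": -70.0}
bdeck = [{"storm_id": "AL03", "time": datetime(2019, 8, 16, 0),
          "lat": 20.0, "lon": -76.0}]
carq = [{"storm_id": "AL03", "time": datetime(2019, 8, 15, 0),
         "lat": 20.0, "lon": -72.0}]
genesis_times = {"AL03": datetime(2019, 8, 16, 0)}

print(classify(fcst, bdeck, carq, genesis_times))  # hit
print(classify(fcst, bdeck, [], genesis_times))    # false_alarm
```

With the CARQ entry available, the forecast is assigned to AL 03 and falls within the 24-hour tolerance of the genesis time. Without it, the same forecast finds no match and is counted as a false alarm, which is the motivation for reading the a-deck CARQ entries.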

Happy to chat more about this by phone or email.

Dan

@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented Jun 5, 2019

Email from Wei-Wei with sample data.

The EMC’s TC track files can be found on cheyenne in /glade/u/home/weiweili/my_work/MET_diag/fv3q2fy19retro_TCYC

  1. atcfunixp.gfs.*: TC track info for the TC cases that already existed in the observations when the model (fv3gfs) was initialized.
    For my personal use, I grouped the vortices in these files into storms based on the storm ID (first 7 characters). You can find the ASCII files that I generated at /glade/u/home/weiweili/my_work/MET_diag/TC_diag/tctrk_realTCs
    The filename in this directory is the model initialization date. You can see that the 1st record of each storm has a lead time < 24 h, suggesting that this is something that already existed in nature (not model generated) and was brought in by the model initialization.

  2. trak.gfso.atcfunix.altg.*: TC track info for the TC cases that either "already existed in the observations when the model (fv3gfs) was initialized" or were "purely generated by the model".
    Similar to item 1, you can find the grouped storms at /glade/u/home/weiweili/my_work/MET_diag/TC_diag/tctrk_allTCs
    You can see that the 1st record of a storm may not be < 24 h, which means that the storm is generated by the model. However, like I said, there is no warm-core info in this type of file. In other words, it's hard for us to know if a storm is a tropical cyclone or something else, such as an artificial disturbance or an extratropical cyclone.

Also, I forgot to mention: in atcfunixp.gfs.*, character #242 (Y or N) indicates whether a vortex has a warm core. The trak.gfso.atcfunix.altg.* files don't have this information.

This was one of the criteria that I used to determine whether a storm was a tropical cyclone, in addition to storm lifetime, i.e., the lifetime of a storm is >= 72 h and the warm core lasts at least 48 h cumulatively. Perhaps some documentation in my paper is helpful? (https://doi.org/10.1175/WAF-D-15-0176.1).
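The two criteria can be sketched in Python as shown below. This is a guess at one reasonable reading, not Wei-Wei's actual code. It assumes 6-hourly track records, counts each warm-core record as one full time step, measures lifetime as the span between the first and last lead times, and treats "character #242" as a 1-based position. All function names are hypothetical.

```python
def warm_core_flag(line):
    """Return character #242 of a track line (1-based, so index 241),
    or None if the line is too short to contain it."""
    return line[241] if len(line) >= 242 else None


def is_tropical_cyclone(records, step_hours=6,
                        min_life_hours=72, min_warm_hours=48):
    """records: list of (lead_hour, flag) tuples for one storm, where
    flag is 'Y' for a warm core and 'N' otherwise."""
    if not records:
        return False
    leads = [lead for lead, _ in records]
    lifetime = max(leads) - min(leads)
    # Cumulative warm-core time; it need not be consecutive.
    warm_hours = sum(step_hours for _, flag in records if flag == "Y")
    return lifetime >= min_life_hours and warm_hours >= min_warm_hours


# A storm tracked from lead hour 24 to 96 with 8 warm-core records.
records = [(24 + 6 * i, "Y" if i < 8 else "N") for i in range(13)]
print(is_tropical_cyclone(records))
```

In this example the lifetime is 72 h and the cumulative warm-core time is 48 h, so both thresholds are just met.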

@TaraJensen
Copy link
Contributor

Charge 7790901

@JohnHalleyGotway
Collaborator Author

Talked with Kathryn on 8/30/2019 about genesis data that can be found here:
ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/ens_tracker/prod/

Look in the cmce, fens, and gefs subdirectories for "genesis" subdirectories. They contain files with "atcf_gen" in their names, which hold genesis information. These are ATCF files, but they have an extra 3rd column for the storm ID.

The tasks are:
(1) Enhance the MET library code to recognize and parse these line types on the fly. Update the ATCFLine class to support "boolean is_atcf_gen()". Store the contents of the 3rd column as the STORM_ID. Otherwise, compute it from the other columns.
(2) Also, find the location of this genesis data in tar files on the WCOSS HPSS and pull several months of them for testing.
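One way the detection in task (1) could work is sketched below in Python rather than in the MET C++ library. It relies on the standard ATCF layout, where the 3rd column is the YYYYMMDDHH time, and assumes that in an "atcf_gen" line the extra storm ID pushes that time into the 4th column. The sample lines, the storm ID value, and the basin + cyclone + year fallback are made up for illustration.

```python
import re

DATE_RE = re.compile(r"^\d{10}$")  # YYYYMMDDHH


def split_atcf(line):
    """Split a comma-delimited ATCF line into stripped columns."""
    return [c.strip() for c in line.split(",")]


def is_atcf_gen(cols):
    """True if the 3rd column is not a time string but the 4th one is,
    i.e. an extra storm ID column appears to be present."""
    return (len(cols) > 3
            and DATE_RE.match(cols[2]) is None
            and DATE_RE.match(cols[3]) is not None)


def storm_id(cols):
    """Use the 3rd column for genesis lines; otherwise build an ID from
    basin, cyclone number, and year (assumed convention)."""
    if is_atcf_gen(cols):
        return cols[2]
    basin, cyclone, ymdh = cols[0], cols[1], cols[2]
    return f"{basin}{cyclone}{ymdh[:4]}"


# Hypothetical sample lines, truncated to the leading columns.
standard = split_atcf("AL, 03, 2019081500, 03, CARQ, 0, 200N, 720W")
genesis = split_atcf("TG, 0001, HYPOTHETICAL_ID_0001, 2019081200, 03, GFSO, 72")

print(is_atcf_gen(standard), storm_id(standard))
print(is_atcf_gen(genesis), storm_id(genesis))
```

Checking the column contents instead of the file name lets a reader handle both line types in the same file without a separate parsing mode.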

@JohnHalleyGotway
Collaborator Author

Merged existing functionality into the develop branch for inclusion in the met-9.0_beta2 release. Will continue further development and testing there.

@JohnHalleyGotway JohnHalleyGotway changed the title from "Create a new MET-TC tool to handle tropical cyclone genesis." to "TC-Gen: Create a new MET-TC tool to handle tropical cyclone genesis." Jan 3, 2020
@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented Feb 28, 2020

Remaining tasks as of 02/28/2020:

  • Remove fcst_genesis.category config file option.
  • Define unit_tc_gen.xml and add to unit_test.sh.
  • Work on debug 3/4/5/6 log messages.
  • Fix bug in the units for min_duration.
  • Remove config file options for basin and cyclone. These do not apply for forecast genesis events and are confusing. Keep the storm_id and storm_name options to make it easy to run this for a single storm... even though the bulk stats wouldn't be all that meaningful.
  • Remove the init_inc and init_exc options which don't make sense for the bulk statistics of tc_gen.
  • Update init_beg/init_end and valid_beg/valid_end logic as follows:
    If init_beg is set and valid_beg is not... set valid_beg =
  • Create global 0.1 degree NetCDF file defining the basins over water. Get basin defs from Kathryn. Include this in the MET distribution. Get basin defs from Jonathan Vigh.
  • Check why 1970 is showing up in the OBS_VALID_BEG and OBS_VALID_END columns.
  • Add tc_gen config file options to README_TC.
  • Populate tc_gen chapter of the MET User's Guide.

@JohnHalleyGotway
Collaborator Author

Split off the definition of a basin mask file into a separate issue:
#1274

Closing this issue.


5 participants