Allow tc_gen to match fcst/obs when forecast genesis times are prior to observed genesis times #1714

Closed
19 of 21 tasks
DanielAdriaansen opened this issue Mar 12, 2021 · 13 comments · Fixed by #1750 or #1751
Labels
requestor: DTC/MRW (DTC Medium Range Weather T&E)
requestor: University/UIUC (University of Illinois, Urbana-Champaign)
type: enhancement (Improve something that it is currently doing)

Comments

@DanielAdriaansen
Contributor

DanielAdriaansen commented Mar 12, 2021

Describe the Enhancement

During recent improvements to tc_gen (i.e. #1448, #1447, etc.), I failed to recognize that the code changes would prevent tc_gen from creating matched pairs of forecast and observed genesis events when the forecast valid time is before the observed genesis time. Computing subseasonal-to-seasonal (S2S) TC metrics requires matching forecasts of early genesis (i.e. forecast genesis events that occur prior to the observed genesis event). I imagine this could be useful in other NWP evaluation as well, for example when a model is biased early. Currently, if your model always forecasts genesis 6 hours early, tc_gen will create no matches and thus no hits.

I propose adding a new parameter that adds a time dimension to the existing distance dimension used to create matched pairs of genesis events. The parameter could be called genesis_match_window and would have a beg and an end. In theory, setting beg = 0 and end = 0 would reproduce the current behavior, as would beg = 0 with end > 0, while setting beg < 0 would allow early genesis forecasts to be matched with observed genesis events and so would change the results. These are my hypotheses; we should test these conditions.

The new genesis_match_window would complement genesis_match_radius, so that matched pairs are defined by both a time criterion and a distance criterion. Note that dev_hit_window and dev_hit_radius will still exist and will still affect scoring within tc_gen.

Time Estimate

1 day

Sub-Issues

Consider breaking the enhancement down into sub-issues.
No sub-issues needed.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

2791541

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to next major version milestone or "Future Versions"

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@DanielAdriaansen DanielAdriaansen added type: enhancement Improve something that it is currently doing alert: NEED ACCOUNT KEY Need to assign an account key to this issue labels Mar 12, 2021
@DanielAdriaansen DanielAdriaansen added this to the MET 10.0.0 milestone Mar 12, 2021
@DanielAdriaansen DanielAdriaansen added this to To do in MET-10.0.0-beta5 (4/26/21) via automation Mar 12, 2021
@DanielAdriaansen DanielAdriaansen changed the title Allow tc_gen to match fcst/obs when forecasts times of genesis prior to observed genesis times Allow tc_gen to match fcst/obs when forecast genesis times are prior to observed genesis times Mar 12, 2021
@DanielAdriaansen DanielAdriaansen added the requestor: University/UIUC University of Illinois, Urbana-Champaign label Mar 13, 2021
@halperin-erau

halperin-erau commented Mar 15, 2021 via email

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Mar 15, 2021

@DanielAdriaansen is testing the application of tc_gen to S2S and there is no CARQ data available in that context. Dan A, please correct me if I’m wrong.

So the intended method for matching “early” forecasts of genesis won’t work in this context. @halperin-erau can you think of any solution that’s preferable to adding this new configuration option?

@halperin-erau

halperin-erau commented Mar 15, 2021 via email

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Mar 15, 2021 via email

@halperin-erau

halperin-erau commented Mar 15, 2021 via email

JohnHalleyGotway added a commit that referenced this issue Mar 15, 2021
…fine a search window relative to the forecast genesis time.
JohnHalleyGotway added a commit that referenced this issue Mar 15, 2021
…lows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.
@JohnHalleyGotway JohnHalleyGotway moved this from To do to In progress in MET-10.0.0-beta5 (4/26/21) Mar 15, 2021
@JohnHalleyGotway
Collaborator

I realize we have more to talk about. But since I had cycles available today, I made changes on a new feature_1714_tc_gen branch adding this config option:

//
// Time window in hours, relative to the model genesis time, to search for a
// matching Best track point
//
genesis_match_window = {
   beg = 0;
   end = 0;
}

And it's compiled locally for testing on kiowa in:
/d1/projects/MET/MET_pull_requests/met-10.0.0_beta5/feature_1714/MET-feature_1714_tc_gen/met/bin

This wasn't intuitive to me at first, but to allow early forecasts to find a match in the Best track, you'd set the end > 0, like:

genesis_match_window = { beg = 0; end = 24; }

For each forecast genesis event, this searches 24 hours AFTER that genesis time when looking for a matching Best/Operational track. @DanielAdriaansen will test this out to see if it has the desired result with his data.
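
The window semantics described above can be sketched in Python. This is only an illustration, not the tc_gen implementation: the helper name in_match_window and its arguments are hypothetical, the bounds are assumed to be inclusive, and only the time test is shown (the genesis_match_radius distance check is left out). The dates are made up.

```python
from datetime import datetime, timedelta


def in_match_window(fcst_genesis, best_time, beg_hr=0, end_hr=0):
    """Hypothetical helper: True if best_time lies in the window
    [fcst_genesis + beg_hr, fcst_genesis + end_hr], in hours.

    The window is relative to the forecast genesis time, as in the
    config comment above. Inclusive bounds are an assumption.
    """
    lo = fcst_genesis + timedelta(hours=beg_hr)
    hi = fcst_genesis + timedelta(hours=end_hr)
    return lo <= best_time <= hi


# Made-up example: the forecast genesis is 12 hours EARLIER than the
# Best track point, i.e. an "early" forecast.
fcst = datetime(2021, 1, 1, 0)
best = datetime(2021, 1, 1, 12)

print(in_match_window(fcst, best))         # False with beg = 0, end = 0
print(in_match_window(fcst, best, 0, 24))  # True with beg = 0, end = 24
```

Because the window is anchored on the forecast time, an early forecast needs end > 0 so that the search reaches forward to the later Best track time.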

@KathrynNewman
Contributor

KathrynNewman commented Mar 16, 2021 via email

@halperin-erau

halperin-erau commented Mar 16, 2021 via email

@DanielAdriaansen
Contributor Author

I will test out John's latest changes prior to our meeting. Thursday at 10 or 2 MT works for me.

@DanielAdriaansen
Contributor Author

DanielAdriaansen commented Mar 17, 2021

I did some testing, and my numbers changed more than John's. I had some additional questions, and after discussing them with John we ran a test to see whether the order in which events are provided to tc_gen changes the output. For 10.0.0-beta3 (which does not include John's latest genesis_match_window change), the results differ depending on the order in which the fcst and obs data are provided to tc_gen.

Test 1- provide tc_gen a file list for -track and -genesis data created like this:

ls /home/dadriaan/projects/s2s/gdf/data/obs/bdecks/all/* > bdecks.out
ls /home/dadriaan/projects/s2s/gdf/data/fcst/tracker/reformat/* > tracker.out

Run test 1:

/d1/projects/MET/MET_releases/met-10.0.0-beta3/bin/tc_gen -genesis /home/dadriaan/projects/s2s/testing/tracker.out -track /home/dadriaan/projects/s2s/testing/bdecks.out -out ./met_out -log ./met_log -v 10 -config /home/dadriaan/projects/s2s/testing/TCGenConfig_lead48v10.0.0-beta3

Test 1 output:

/home/dadriaan/projects/s2s/testing/test10.0.0-beta3_fwd

Test 2- reverse the ordering of the file lists created for test 1:

cat bdecks.out | sort -r > bdecks_rev.out
cat tracker.out | sort -r > tracker_rev.out

Run test 2:

/d1/projects/MET/MET_releases/met-10.0.0-beta3/bin/tc_gen -genesis /home/dadriaan/projects/s2s/testing/tracker_rev.out -track /home/dadriaan/projects/s2s/testing/bdecks_rev.out -out ./met_out -log ./met_log -v 10 -config /home/dadriaan/projects/s2s/testing/TCGenConfig_lead48v10.0.0-beta3

Test 2 output:

/home/dadriaan/projects/s2s/testing/test10.0.0-beta3_rev

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Mar 17, 2021

Thanks @DanielAdriaansen for doing this testing. Here's a little more analysis of the output to isolate the actual differences. Note that I removed the INDEX output column since the reverse order will obviously cause differences there!

awk '{for(i=1;i<=NF-1;i++) if(i != 26) printf $i" "; print ""}' test10.0.0-beta3_fwd/met_out_genmpr.txt | sort > test10.0.0-beta3_fwd/met_out_genmpr_sort.txt 
awk '{for(i=1;i<=NF-1;i++) if(i != 26) printf $i" "; print ""}' test10.0.0-beta3_rev/met_out_genmpr.txt | sort > test10.0.0-beta3_rev/met_out_genmpr_sort.txt 
diff test10.0.0-beta3_fwd/met_out_genmpr_sort.txt	test10.0.0-beta3_rev/met_out_genmpr_sort.txt
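
For readers who prefer Python, the same comparison can be sketched as follows. The function names strip_and_sort and diff_outputs are hypothetical. The sketch mirrors the awk loop above, which runs to NF-1 and therefore drops the last column as well as column 26. Python's sorted() orders by code point, which may differ from sort under some locales; that does not matter as long as both files are sorted the same way.

```python
import difflib


def strip_and_sort(lines, drop_col=26):
    """Drop the 1-based column drop_col and the last column from each
    whitespace-delimited line, then return the lines sorted."""
    out = []
    for line in lines:
        cols = line.split()
        # cols[:-1] mirrors the awk loop bound of NF-1.
        kept = [c for i, c in enumerate(cols[:-1], start=1) if i != drop_col]
        out.append(" ".join(kept))
    return sorted(out)


def diff_outputs(fwd_lines, rev_lines, drop_col=26):
    """Return unified-diff lines between the two cleaned, sorted outputs.
    An empty list means the outputs agree once order is ignored."""
    a = strip_and_sort(fwd_lines, drop_col)
    b = strip_and_sort(rev_lines, drop_col)
    return list(difflib.unified_diff(a, b, "fwd", "rev", lineterm=""))


# Tiny made-up demo: column 2 plays the role of the INDEX column.
fwd = ["a 1 x z", "b 2 y z"]
rev = ["b 9 y z", "a 7 x z"]
print(diff_outputs(fwd, rev, drop_col=2))
```

To apply this to real output, read each met_out_genmpr.txt file with open(path).read().splitlines() and pass the two lists to diff_outputs.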

opendiff flags 9 differences. Here is the last one listed. It shows the same genesis forecast matching different Best tracks: EP182017 (fwd) vs. EP982017 (rev).

VERSION MODEL DESC FCST_LEAD FCST_VALID_BEG FCST_VALID_END OBS_LEAD OBS_VALID_BEG OBS_VALID_END FCST_VAR FCST_UNITS FCST_LEV OBS_VAR OBS_UNITS OBS_LEV OBTYPE VX_MASK INTERP_MTHD INTERP_PNTS FCST_THRESH OBS_THRESH COV_THRESH ALPHA LINE_TYPE TOTAL STORM_ID AGEN_INIT AGEN_FHR AGEN_LAT AGEN_LON AGEN_DLAND BGEN_LAT BGEN_LON BGEN_DLAND GEN_DIST GEN_TDIFF INIT_TDIFF DEV_CAT

< V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_120000 20170923_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP182017 20170921_000000 48 17.6 -106.3 129.43196 17.9 -105 77.01662 141.81415 -120000 600000 FYOY 
---
> V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_180000 20170923_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP982017 20170921_000000 48 17.6 -106.3 129.43196 18.2 -105.2 66.36826 134.30865 -180000 660000 FYOY 

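The thread does not say how tc_gen chooses among several candidate Best tracks. As a hedged illustration only, the sketch below assumes a first-match-in-input-order rule, which would be consistent with the EP182017 vs. EP982017 example above. The function first_match and the radius value are hypothetical; the distances are the GEN_DIST values from the two rows shown.

```python
def first_match(candidates, radius):
    """Return the storm ID of the first candidate within radius, or None.

    ASSUMPTION: candidates are tested in the order supplied and the
    first one that passes wins. This is not confirmed tc_gen behavior.
    """
    for storm_id, dist in candidates:
        if dist <= radius:
            return storm_id
    return None


# (STORM_ID, GEN_DIST) taken from the two rows above.
cands = [("EP182017", 141.81415), ("EP982017", 134.30865)]
radius = 500.0  # hypothetical search radius, same units as GEN_DIST

fwd = first_match(cands, radius)
rev = first_match(list(reversed(cands)), radius)
print(fwd, rev)  # EP182017 EP982017
```

Under that assumption, whenever two Best tracks both satisfy the matching criteria for one forecast, the result depends on which track is read first.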
Here's the full set of diff output:

musial6:dadriaan_data_20200317 johnhg$ diff test10.0.0-beta3_fwd/met_out_genmpr_sort.txt test10.0.0-beta3_rev/met_out_genmpr_sort.txt

331c331
< V10.0.0 GFSO GDF 480000 20150925_060000 20150925_060000 NA 20150925_060000 20150925_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 NA 20150923_060000 48 8.7 -99.4 445.60834 NA NA NA NA NA NA FYON 
---
> V10.0.0 GFSO GDF 480000 20150925_060000 20150925_060000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP932015 20150923_060000 48 8.7 -99.4 445.60834 13.1 -103 277.32053 628.237 -360000 840000 FYON 

333c333
< V10.0.0 GFSO GDF 480000 20150926_180000 20150926_180000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP172015 NA NA NA NA NA 13.1 -103 277.32053 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20150926_180000 20150926_180000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP932015 NA NA NA NA NA 13.1 -103 277.32053 NA NA NA FNOY 

372c372
< V10.0.0 GFSO GDF 480000 20151012_120000 20151012_120000 NA 20151012_120000 20151012_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 NA 20151010_120000 48 6.6 -104 658.70215 NA NA NA NA NA NA FYON 
---
> V10.0.0 GFSO GDF 480000 20151012_120000 20151012_120000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP962015 20151010_120000 48 6.6 -104 658.70215 10.2 -117 861.32074 1486.35665 -600000 1080000 FYON 

377d376
< V10.0.0 GFSO GDF 480000 20151013_120000 20151013_120000 NA 20151013_120000 20151013_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 NA 20151011_120000 48 8.9 -108.1 627.31934 NA NA NA NA NA NA FYON 

378a378
> V10.0.0 GFSO GDF 480000 20151013_120000 20151013_120000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP962015 20151011_120000 48 8.9 -108.1 627.31934 10.2 -117 861.32074 987.62358 -360000 840000 FYON 

383c383
< V10.0.0 GFSO GDF 480000 20151015_000000 20151015_000000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP192015 NA NA NA NA NA 10.2 -117 861.32074 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20151015_000000 20151015_000000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP962015 NA NA NA NA NA 10.2 -117 861.32074 NA NA NA FNOY 

867c867
< V10.0.0 GFSO GDF 480000 20160911_120000 20160911_120000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 AL502016 NA NA NA NA NA 15.8 -36.5 1100.86023 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20160911_120000 20160911_120000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 AL522016 NA NA NA NA NA 15.8 -36.5 1100.86023 NA NA NA FNOY 

873,874c873,874
< V10.0.0 GFSO GDF 480000 20160913_000000 20160913_000000 NA 20160913_000000 20160913_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 NA 20160911_000000 48 26.7 -78.8 70.80664 NA NA NA NA NA NA FYON 
< V10.0.0 GFSO GDF 480000 20160913_060000 20160913_060000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 AL112016 NA NA NA NA NA 27.3 -80.2 1.1964 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20160913_000000 20160913_000000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 AL932016 20160911_000000 48 26.7 -78.8 70.80664 27.3 -80.2 1.1964 154.08765 -060000 540000 FYOY 
> V10.0.0 GFSO GDF 480000 20160913_060000 20160913_060000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 AL932016 NA NA NA NA NA 27.3 -80.2 1.1964 NA NA NA FNOY 

1476c1476
< V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_120000 20170923_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP182017 20170921_000000 48 17.6 -106.3 129.43196 17.9 -105 77.01662 141.81415 -120000 600000 FYOY 
---
> V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_180000 20170923_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1925 EP982017 20170921_000000 48 17.6 -106.3 129.43196 18.2 -105.2 66.36826 134.30865 -180000 660000 FYOY 

@JohnHalleyGotway
Collaborator

@DanielAdriaansen ran the same test on the feature_1714 branch, and I ran the same commands on the genmpr output lines:

awk '{for(i=1;i<=NF-1;i++) if(i != 26) printf $i" "; print ""}' test10.0.0-beta5_fwd/met_out_genmpr.txt | sort > test10.0.0-beta5_fwd/met_out_genmpr_sort.txt 
awk '{for(i=1;i<=NF-1;i++) if(i != 26) printf $i" "; print ""}' test10.0.0-beta5_rev/met_out_genmpr.txt | sort > test10.0.0-beta5_rev/met_out_genmpr_sort.txt 
diff test10.0.0-beta5_fwd/met_out_genmpr_sort.txt	test10.0.0-beta5_rev/met_out_genmpr_sort.txt

There are now 12 differences in the pairs instead of 9:

diff test10.0.0-beta5_fwd/met_out_genmpr_sort.txt test10.0.0-beta5_rev/met_out_genmpr_sort.txt

331c331
< V10.0.0 GFSO GDF 480000 20150925_060000 20150925_060000 NA 20150925_060000 20150925_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 NA 20150923_060000 48 8.7 -99.4 445.60834 NA NA NA NA NA NA FYON 
---
> V10.0.0 GFSO GDF 480000 20150925_060000 20150925_060000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP932015 20150923_060000 48 8.7 -99.4 445.60834 13.1 -103 277.32053 628.237 -360000 840000 FYON 

333c333
< V10.0.0 GFSO GDF 480000 20150926_180000 20150926_180000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP172015 NA NA NA NA NA 13.1 -103 277.32053 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20150926_180000 20150926_180000 NA 20150926_180000 20150926_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP932015 NA NA NA NA NA 13.1 -103 277.32053 NA NA NA FNOY 

372c372
< V10.0.0 GFSO GDF 480000 20151012_120000 20151012_120000 NA 20151012_120000 20151012_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 NA 20151010_120000 48 6.6 -104 658.70215 NA NA NA NA NA NA FYON 
---
> V10.0.0 GFSO GDF 480000 20151012_120000 20151012_120000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP962015 20151010_120000 48 6.6 -104 658.70215 10.2 -117 861.32074 1486.35665 -600000 1080000 FYON 

377d376
< V10.0.0 GFSO GDF 480000 20151013_120000 20151013_120000 NA 20151013_120000 20151013_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 NA 20151011_120000 48 8.9 -108.1 627.31934 NA NA NA NA NA NA FYON 

378a378
> V10.0.0 GFSO GDF 480000 20151013_120000 20151013_120000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP962015 20151011_120000 48 8.9 -108.1 627.31934 10.2 -117 861.32074 987.62358 -360000 840000 FYON 

383c383
< V10.0.0 GFSO GDF 480000 20151015_000000 20151015_000000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP192015 NA NA NA NA NA 10.2 -117 861.32074 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20151015_000000 20151015_000000 NA 20151015_000000 20151015_000000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP962015 NA NA NA NA NA 10.2 -117 861.32074 NA NA NA FNOY 

859c859
< V10.0.0 GFSO GDF 480000 20160909_180000 20160909_180000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL502016 20160907_180000 48 13.1 -33.6 942.82153 15.8 -36.5 1100.86023 433.63889 -420000 900000 FYON 
---
> V10.0.0 GFSO GDF 480000 20160909_180000 20160909_180000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL522016 20160907_180000 48 13.1 -33.6 942.82153 15.8 -36.5 1100.86023 433.63889 -420000 900000 FYON 

867c867
< V10.0.0 GFSO GDF 480000 20160911_120000 20160911_120000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL502016 NA NA NA NA NA 15.8 -36.5 1100.86023 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20160911_120000 20160911_120000 NA 20160911_120000 20160911_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL522016 NA NA NA NA NA 15.8 -36.5 1100.86023 NA NA NA FNOY 

873,874c873,874
< V10.0.0 GFSO GDF 480000 20160913_000000 20160913_000000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL112016 20160911_000000 48 26.7 -78.8 70.80664 27.3 -80.2 1.1964 154.08765 -060000 540000 FYOY 
< V10.0.0 GFSO GDF 480000 20160913_060000 20160913_060000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL112016 NA NA NA NA NA 27.3 -80.2 1.1964 NA NA NA FNOY 
---
> V10.0.0 GFSO GDF 480000 20160913_000000 20160913_000000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL932016 20160911_000000 48 26.7 -78.8 70.80664 27.3 -80.2 1.1964 154.08765 -060000 540000 FYOY 
> V10.0.0 GFSO GDF 480000 20160913_060000 20160913_060000 NA 20160913_060000 20160913_060000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 AL932016 NA NA NA NA NA 27.3 -80.2 1.1964 NA NA NA FNOY 

1476c1476
< V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_120000 20170923_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP182017 20170921_000000 48 17.6 -106.3 129.43196 17.9 -105 77.01662 141.81415 -120000 600000 FYOY 
---
> V10.0.0 GFSO GDF 480000 20170923_000000 20170923_000000 NA 20170923_180000 20170923_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP982017 20170921_000000 48 17.6 -106.3 129.43196 18.2 -105.2 66.36826 134.30865 -180000 660000 FYOY 

1841d1840
< V10.0.0 GFSO GDF 480000 20180805_060000 20180805_060000 NA 20180804_180000 20180804_180000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP112018 20180803_060000 48 12.4 -102.4 301.81152 12.3 -94.5 187.68414 859.11471 120000 360000 FYON 

1843a1843
> V10.0.0 GFSO GDF 480000 20180805_060000 20180805_060000 NA 20180805_120000 20180805_120000 GENESIS NA NA GENESIS NA NA BEST NA NA NA NA NA NA NA GENMPR 1924 EP122018 20180803_060000 48 12.4 -102.4 301.81152 13.7 -105 287.53906 316.91791 -060000 540000 FYOY 

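The time-difference column in these rows can be checked by hand. The sketch below assumes, based only on the rows shown in this thread, that GEN_TDIFF is the forecast genesis time minus the Best track time in HHMMSS form (so a negative value means the forecast was early). The helper tdiff_hours is hypothetical.

```python
from datetime import datetime

FMT = "%Y%m%d_%H%M%S"  # timestamp layout used in the GENMPR rows


def tdiff_hours(fcst_valid, obs_valid):
    """Hypothetical helper: forecast time minus Best track time, in hours.

    The sign convention is inferred from the rows in this thread.
    """
    f = datetime.strptime(fcst_valid, FMT)
    o = datetime.strptime(obs_valid, FMT)
    return (f - o).total_seconds() / 3600.0


# Last two hunks above: the same forecast paired with EP112018 (fwd)
# and with EP122018 (rev).
print(tdiff_hours("20180805_060000", "20180804_180000"))  # 12.0
print(tdiff_hours("20180805_060000", "20180805_120000"))  # -6.0
```

These agree with the GEN_TDIFF values of 120000 and -060000 in the two rows: in the fwd run the forecast is paired with a Best track time 12 hours before it, and in the rev run with one 6 hours after it.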
@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Mar 18, 2021

Telecon on 3/18/21 to discuss these issues.

  • Problem with different matches when the order of the input files is reversed:

    • Recommend excluding Best track data with a cyclone number >= 50.
    • Cyclone numbers of 50 are used by NHC for pre-season testing and 90's are used for INVESTS.
    • Alternatively, could exclude based on storm name "INVEST".
    • Will hard-code this threshold of 50 rather than adding a config file option. Consider adding a high debug level log message to note when 50+ tracks are being skipped.
    • NOTE: Be sure to update the tc_gen documentation to note this!
    • @DanielAdriaansen will re-test on this feature branch once this change has been added
  • Problem with TC-Gen logic for the application to S2S:

    • Recommend that we add a new config option to control whether we're matching the fcst genesis point to the entire Best track or only to the Best genesis point. Default should be point to track matching.
  • Per @halperin-erau, NHC would like us to change ops_hit_tdiff to ops_hit_window.
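
The cyclone-number rule from the first bullet can be sketched as follows. This is an illustration, not the tc_gen code: the helper names are hypothetical, the storm ID layout (two-character basin, two-digit cyclone number, four-digit year) is assumed from the STORM_ID values in this thread, and the cutoff follows the ">= 50" wording of these notes.

```python
def cyclone_number(storm_id):
    """Return the two-digit cyclone number from an ID such as 'EP982017'.

    Assumes basin (2 chars) + cyclone number (2 digits) + year (4 digits).
    """
    return int(storm_id[2:4])


def keep_best_track(storm_id, threshold=50):
    """Hypothetical filter: keep a Best track only if its cyclone number
    is below the threshold (i.e. exclude numbers >= 50)."""
    return cyclone_number(storm_id) < threshold


# Storm IDs that appear in the diff output earlier in this thread.
storm_ids = ["EP182017", "EP982017", "EP932015", "EP172015",
             "AL502016", "AL522016", "AL112016", "AL932016"]

kept = [sid for sid in storm_ids if keep_best_track(sid)]
print(kept)  # ['EP182017', 'EP172015', 'AL112016']
```

With the 50+ tracks removed, each forecast in the examples above has at most one of the listed Best tracks left to match.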

@TaraJensen TaraJensen added requestor: DTC/MRW DTC Medium Range Weather T&E and removed alert: NEED ACCOUNT KEY Need to assign an account key to this issue labels Mar 31, 2021
JohnHalleyGotway added a commit that referenced this issue Apr 2, 2021
JohnHalleyGotway added a commit that referenced this issue Apr 7, 2021
… TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?
JohnHalleyGotway added a commit that referenced this issue Apr 7, 2021
…mple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.
JohnHalleyGotway added a commit that referenced this issue Apr 7, 2021
JohnHalleyGotway added a commit that referenced this issue Apr 7, 2021
JohnHalleyGotway added a commit that referenced this issue Apr 8, 2021
…ated to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!
@JohnHalleyGotway JohnHalleyGotway linked a pull request Apr 8, 2021 that will close this issue
11 tasks
@JohnHalleyGotway JohnHalleyGotway moved this from In progress to Pull request review in MET-10.0.0-beta5 (4/26/21) Apr 8, 2021
JohnHalleyGotway pushed a commit that referenced this issue Apr 8, 2021
…enesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.
JohnHalleyGotway added a commit that referenced this issue Apr 9, 2021
… columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!
JohnHalleyGotway added a commit that referenced this issue Apr 9, 2021
* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_point_to_track option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
MET-10.0.0-beta5 (4/26/21) automation moved this from Pull request review to Done Apr 9, 2021
@JohnHalleyGotway JohnHalleyGotway linked a pull request Apr 12, 2021 that will close this issue
11 tasks
JohnHalleyGotway added a commit that referenced this issue Apr 12, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change bact to use diff instead of absolute value of diff

* #1454 Deleted instead of commenting out

* #1454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to the write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_point_to_track option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding a GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
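
The genesis_match_window commits above describe a search window defined relative to the forecast genesis time, where a positive end (such as 12) lets an early forecast match a later observed genesis. A minimal sketch of that time check follows, assuming the window is given in hours. The function name `in_match_window` and its arguments are hypothetical, and this is not the tc_gen source, which also applies a distance criterion.

```python
from datetime import datetime, timedelta


def in_match_window(fcst_genesis, best_genesis, beg_hours, end_hours):
    """Hypothetical helper: return True if the Best track genesis time falls
    inside [fcst_genesis + beg_hours, fcst_genesis + end_hours]."""
    # Positive offset means the forecast genesis is earlier than observed.
    offset = best_genesis - fcst_genesis
    return timedelta(hours=beg_hours) <= offset <= timedelta(hours=end_hours)


# Forecast genesis 6 hours before the observed (Best track) genesis.
fcst = datetime(2021, 3, 12, 0)
best = datetime(2021, 3, 12, 6)

early_no_window = in_match_window(fcst, best, 0, 0)    # zero-width window
early_with_end = in_match_window(fcst, best, 0, 12)    # end = 12 hours

# Forecast genesis 6 hours after the observed genesis.
late_fcst = datetime(2021, 3, 12, 12)
late_no_beg = in_match_window(late_fcst, best, 0, 12)
late_with_beg = in_match_window(late_fcst, best, -6, 12)

print(early_no_window, early_with_end, late_no_beg, late_with_beg)
```

Under these assumptions, the 6-hour-early forecast from the issue description is rejected by a zero-width window and accepted once end is at least 6, while a late forecast needs a negative beg.
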
JohnHalleyGotway added a commit that referenced this issue Apr 17, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to the write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
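
The tc_pairs hotfix described in the commit message above (falling back to a constant when the bdeck valid increment is 0 or bad data) can be sketched roughly as follows. This is an illustration in Python, not the MET C++ source: the function names and the 6-hour fallback value are assumptions made for the example, and only the -9999 bad data value comes from the commit message.

```python
# Illustrative sketch only; not the MET C++ implementation.
BAD_DATA = -9999                 # bad data value quoted in the commit message
BEST_TRACK_TIME_STEP = 6 * 3600  # hypothetical fallback in seconds; the real constant's value is not stated


def safe_valid_increment(valid_inc):
    """Return a divisor that is never zero and never the bad data flag."""
    if valid_inc == 0 or valid_inc == BAD_DATA:
        return BEST_TRACK_TIME_STEP
    return valid_inc


def rate_of_change(delta, valid_inc):
    """Divide a difference by the guarded time increment."""
    return delta / safe_valid_increment(valid_inc)
```

Guarding the divisor in one helper keeps the check in a single place, so any caller that divides by the increment gets the same fallback.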
JohnHalleyGotway added a commit that referenced this issue Apr 27, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
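
The #1694 note in the commit message above, about why zero-based i, j indices make a poor joint histogram integer, can be illustrated with a small sketch. This is Python for illustration only, not the MET code: both function names and the exact string key format are hypothetical, and only the idea of using VAR_i and VAR_j strings comes from the commit message.

```python
def concat_int_key(i, j):
    """Naive joint key: concatenate the two indices and parse the digits as an integer."""
    return int(f"{i}{j}")


def string_key(i, j):
    """String-based joint key; this particular format is an assumption."""
    return f"VAR_{i}_VAR_{j}"


# With zero-based indices the leading zero is lost, so (0, 1) collapses to 1,
# and distinct pairs such as (0, 11) and (1, 1) map to the same integer.
collision = concat_int_key(0, 11) == concat_int_key(1, 1)

# The string keys keep the two indices separate, so the same pairs stay distinct.
distinct = string_key(0, 11) != string_key(1, 1)
```

String keys cost a little more to build and compare than integers, but they avoid the ambiguity without any index shifting.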
JohnHalleyGotway added a commit that referenced this issue May 24, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as the integer 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.
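
To illustrate the key collision described in the note above, here is a minimal Python sketch. It is not MET's C++ code; the function names and the exact `VAR_i_VAR_j` key format are assumptions made for illustration only.

```python
def int_key(i, j):
    """Naive joint key: concatenate the two indices and convert to an integer."""
    return int(f"{i}{j}")


def str_key(i, j):
    """String joint key (hypothetical format); keeps the i/j boundary explicit."""
    return f"VAR_{i}_VAR_{j}"


# With 0-based indices the leading zero is lost: (0, 1) -> "01" -> 1.
print(int_key(0, 1))

# Once an index has more than one digit, distinct pairs can share an integer key.
print(int_key(1, 10) == int_key(11, 0))

# String keys keep the same pairs distinct.
print(str_key(1, 10) == str_key(11, 0))
```

String keys avoid both problems at the cost of slightly larger keys, which is consistent with the choice described in the commit message.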

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1733, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.
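
The guard described in this hotfix can be sketched in Python as follows. This is an illustration only, not the C++ code in tc_pairs.cc: the function names, the 6-hour fallback value, and the integer division used as the example computation are all assumptions; only the -9999 bad-data value and the "check for 0 and bad data, else fall back to a constant" logic come from the commit message.

```python
BAD_DATA = -9999                 # bad-data flag value quoted in the commit message
BEST_TRACK_TIME_STEP = 6 * 3600  # assumed fallback step in seconds (hypothetical value)


def valid_increment(bdeck_valid_inc, fallback=BEST_TRACK_TIME_STEP):
    """Return a usable time increment, replacing 0 or bad data with the fallback."""
    if bdeck_valid_inc == 0 or bdeck_valid_inc == BAD_DATA:
        return fallback
    return bdeck_valid_inc


def n_steps(duration_sec, bdeck_valid_inc):
    """Hypothetical computation that divides by the increment without risking a zero divisor."""
    return duration_sec // valid_increment(bdeck_valid_inc)


print(n_steps(43200, 0))       # falls back to the constant step
print(n_steps(43200, 10800))   # uses the increment as given
```

Centralizing the check in one helper keeps every division site safe, rather than repeating the comparison wherever the increment is used.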

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.
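
As a rough illustration of the time-window test described in the two notes above, here is a minimal Python sketch. It is not the tc_gen implementation: the function name, the hour units, and the inclusive endpoints are assumptions, and the real matching also applies a distance criterion that is omitted here.

```python
from datetime import datetime, timedelta


def in_match_window(fcst_genesis, best_genesis, beg_hours, end_hours):
    """True if best_genesis lies within [fcst_genesis + beg, fcst_genesis + end]."""
    offset = best_genesis - fcst_genesis
    return timedelta(hours=beg_hours) <= offset <= timedelta(hours=end_hours)


fcst = datetime(2021, 3, 12, 0)   # forecast genesis time
best = datetime(2021, 3, 12, 6)   # observed genesis 6 hours later (early forecast)

# With a zero-width window the early forecast finds no match.
print(in_match_window(fcst, best, 0, 0))

# Extending the end of the window to 12 hours lets the early forecast match.
print(in_match_window(fcst, best, 0, 12))
```

Because the window is defined relative to the forecast genesis time, a positive end value is what admits forecasts that are early relative to the observed genesis.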

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding a GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_legnth (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URL's in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
JohnHalleyGotway added a commit that referenced this issue Jun 1, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.
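
The two cases in the log messages above (a climatological ensemble versus a single climatological mean) can be sketched as below. This is a minimal illustration under stated assumptions, not the MET implementation: it assumes a normal climatological distribution and evenly spaced interior quantiles, and the name `sketch_climo_vals` is hypothetical.

```python
from statistics import NormalDist


def sketch_climo_vals(mean, stdev, n_vals):
    """Return n_vals reference values drawn from a climatological distribution.

    Assumptions (not confirmed by the source): the distribution is normal
    and the values sit at evenly spaced interior quantiles.
    stdev must be positive when n_vals > 1.
    """
    if n_vals == 1:
        # Single deterministic reference: just the climatological mean.
        return [mean]
    dist = NormalDist(mean, stdev)
    # Quantiles 1/(n+1), 2/(n+1), ..., n/(n+1) avoid the 0 and 1 endpoints.
    return [dist.inv_cdf(i / (n_vals + 1)) for i in range(1, n_vals + 1)]
```

With `n_vals=9` this yields values at the 0.1 through 0.9 quantiles, which is one way to arrive at a 9-member climatological ensemble like the one named in the log message.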

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.
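
The key ambiguity described in the #1694 note above can be illustrated with a short sketch. The function names are hypothetical and the tuple form of the string key is an assumption; the source only says that VAR_i and VAR_j strings are used.

```python
def concat_key(i, j):
    """Fuse two 0-based indices into one integer by digit concatenation."""
    return int(f"{i}{j}")


def string_key(i, j):
    """Label each variable separately so no leading zero can be lost.

    The tuple layout is an assumption made for this sketch.
    """
    return (f"VAR_{i}", f"VAR_{j}")


def one_based_key(i, j):
    """Integer alternative: shift to 1-based so the leading digit is never 0.

    Still assumes single-digit indices.
    """
    return int(f"{i + 1}{j + 1}")
```

Here `concat_key(0, 1)` collapses to the integer 1, while `string_key(0, 1)` and `one_based_key(0, 1)` both keep the two indices recoverable.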

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1733, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.
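
The include/exclude string filtering described in the #1733 notes above can be sketched as a pair of maps from column name to a set of strings. This is a rough illustration, not the MET code: `keep_line`, `inc_map`, and `exc_map` are hypothetical names (loosely echoing IncMap and ExcMap), and representing a line as a dictionary is an assumption.

```python
def keep_line(line, inc_map, exc_map):
    """Decide whether a line passes the string filters.

    line    : dict of column name -> string value (assumed representation)
    inc_map : column name -> set of values the column must match
    exc_map : column name -> set of values the column must not match
    """
    for col, allowed in inc_map.items():
        if line.get(col) not in allowed:
            return False  # fails an include filter
    for col, excluded in exc_map.items():
        if line.get(col) in excluded:
            return False  # hit by an exclude filter
    return True
```

Keeping the two maps separate mirrors the reason given for the rename: an option that excludes by string value stays distinct from one that excludes by time.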

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.
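
The mpr_column and mpr_thresh filtering described in the #1575 notes above can be sketched as parallel lists of column names and thresholds. This is a minimal illustration under assumptions, not the MET logic: the `(operator, value)` tuple encoding, the `filter_pairs` name, and the rule that a pair is kept only when every threshold passes are all assumed here.

```python
import operator

# Assumed threshold encoding for this sketch: (operator name, value).
OPS = {
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
    "eq": operator.eq, "ne": operator.ne,
}


def filter_pairs(pairs, columns, thresholds):
    """Keep matched pairs whose named columns satisfy all thresholds.

    pairs      : list of dicts, column name -> numeric value
    columns    : list of column names (stands in for mpr_column)
    thresholds : list of (op, value) tuples (stands in for mpr_thresh)
    """
    # The two lists must line up one-to-one.
    if len(columns) != len(thresholds):
        raise ValueError("columns and thresholds must have the same length")
    kept = []
    for pair in pairs:
        if all(OPS[op](pair[col], val)
               for col, (op, val) in zip(columns, thresholds)):
            kept.append(pair)
    return kept
```

The length check corresponds to the note about making sure the StringArray and ThreshArray lengths are the same.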

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.
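
The guard described in the hotfix above can be sketched as a small helper that falls back to a default whenever the computed increment is unusable. This is an illustration only: `track_time_step` is a hypothetical name, and the 21600-second default is an assumed stand-in for the best_track_time_step constant, whose actual value is not given in the source.

```python
BAD_DATA = -9999           # bad data flag, as quoted in the note above
DEFAULT_TIME_STEP = 21600  # assumed placeholder (6 hours in seconds)


def track_time_step(valid_inc, default_step=DEFAULT_TIME_STEP):
    """Return a time step that is safe to divide by.

    Falls back to default_step when the increment is 0 or bad data.
    """
    if valid_inc == 0 or valid_inc == BAD_DATA:
        return default_step
    return valid_inc
```

A caller would divide by `track_time_step(inc)` rather than by `inc` directly, which removes the divide-by-zero path.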

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state that genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1714, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!
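
The time dimension that genesis_match_window adds to matching, as described in the #1714 notes above, can be sketched as a simple window test. This is a hedged illustration, not the tc_gen code: the function name is hypothetical, the window is assumed to be given in hours, and the sign convention (offset measured from the forecast genesis time to the observed genesis time) is inferred from the notes that the window is relative to the forecast genesis time and that end = 12 admits early forecasts.

```python
from datetime import datetime, timedelta


def in_genesis_match_window(fcst_time, obs_time, beg_hours, end_hours):
    """Return True if obs_time lies within [beg, end] hours of fcst_time.

    Assumed convention: a positive offset means the observed genesis
    happened after the forecast genesis (the forecast was early).
    """
    offset = obs_time - fcst_time
    return timedelta(hours=beg_hours) <= offset <= timedelta(hours=end_hours)
```

For a forecast that places genesis 6 hours before the observed event, a window of beg = 0, end = 0 rejects the pair, while beg = 0, end = 12 accepts it. The real matching also applies a distance criterion, which this sketch leaves out.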

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_length (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Updated MET to compile using GNU version 10 and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581 Renamed met_nc_point_obs to nc_point_obs

* #1581 API changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* #1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunately, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* #1581 Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
JohnHalleyGotway added a commit that referenced this issue Jun 13, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance
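
The percentage-of-delta check described above can be sketched roughly as follows. This is a minimal illustration, not the MET implementation: the function name `is_evenly_spaced` and the `rel_tol` default are hypothetical, chosen only to show why a tolerance that scales with the grid spacing behaves better than a fixed one on fine grids.

```python
def is_evenly_spaced(values, rel_tol=0.01):
    """Return True if every spacing stays within rel_tol of the first spacing.

    rel_tol is a hypothetical fraction of the grid delta, not MET's setting.
    """
    if len(values) < 3:
        return True
    delta = values[1] - values[0]
    if delta == 0:
        return False
    # The allowed deviation scales with the spacing itself.
    allowed = abs(delta) * rel_tol
    return all(
        abs((b - a) - delta) <= allowed
        for a, b in zip(values[1:], values[2:])
    )
```

With a fixed tolerance of, say, 0.001 degrees, a grid with a 0.001 degree spacing could pass even if one step were 50% too large; the relative check rejects it.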

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.
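
The branching described above (one bin returns the climo mean, otherwise sample the requested number of values) might look like the sketch below. It is an assumption-laden illustration: the name `derive_climo_vals_sketch`, the normal-distribution model, and the evenly spaced CDF probabilities are all hypothetical and may differ from what MET's derive_climo_vals() actually does.

```python
from statistics import NormalDist


def derive_climo_vals_sketch(mean, stdev, n_bin):
    """Return n_bin climatological values for one point (illustrative only)."""
    if n_bin == 1:
        # Single bin requested: just use the climo mean.
        return [mean]
    dist = NormalDist(mu=mean, sigma=stdev)  # stdev must be > 0 for inv_cdf
    # Assumed sampling: evenly spaced interior CDF probabilities.
    return [dist.inv_cdf((i + 1) / (n_bin + 1)) for i in range(n_bin)]
```

Calling it with `n_bin=9` yields nine values, which lines up with the "9-member climatological ensemble" wording in the log message quoted further down.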

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.
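
The key collision mentioned above is easy to reproduce. The sketch below is only an illustration of the reasoning; the helper names and the exact string format for the joint key are hypothetical, not taken from the MET source.

```python
def joint_key_int(i, j, offset=0):
    """Concatenate the (optionally shifted) indices into one integer key."""
    return int(f"{i + offset}{j + offset}")


def joint_key_str(i, j):
    """String key; the VAR_i / VAR_j joining format here is an assumption."""
    return f"VAR_{i}_VAR_{j}"


# Zero-based indices lose the leading zero: "01" collapses to 1.
collapsed = joint_key_int(0, 1)
# One-based indices keep both digits: "12".
shifted = joint_key_int(0, 1, offset=1)
```

String keys sidestep the problem entirely, at the cost of string comparisons instead of integer ones.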

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.
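
The include/exclude string filtering added in this group of commits can be pictured with a small sketch. It assumes each line is a dict of column name to string value and that the two maps hold sets of strings per column; `keep_line`, `inc_map`, and `exc_map` are hypothetical names that only echo the IncMap and ExcMap wording above.

```python
def keep_line(line, inc_map, exc_map):
    """Return True if a line passes the include and exclude string filters."""
    # Every include entry must match one of its allowed values.
    for column, allowed in inc_map.items():
        if line.get(column) not in allowed:
            return False
    # No exclude entry may match any of its listed values.
    for column, excluded in exc_map.items():
        if line.get(column) in excluded:
            return False
    return True
```

For instance, an include map of `{"BASIN": {"AL"}}` together with an exclude map of `{"AMODEL": {"BAD1"}}` would keep Atlantic lines from every model except `BAD1`; the column values here are made up for illustration.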

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.
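
The guard described in this hotfix amounts to something like the sketch below. It is a Python paraphrase of logic that lives in C++; the function name is hypothetical, and the 6-hour value used for the fallback constant is an assumption rather than something stated in the commit message.

```python
BAD_DATA = -9999                 # bad data flag quoted in the commit message
BEST_TRACK_TIME_STEP = 6 * 3600  # assumed fallback, in seconds


def safe_valid_increment(bdeck_valid_inc):
    """Return a time increment that is safe to use as a divisor."""
    if bdeck_valid_inc == 0 or bdeck_valid_inc == BAD_DATA:
        # Fall back to the constant step instead of dividing by 0 or -9999.
        return BEST_TRACK_TIME_STEP
    return bdeck_valid_inc
```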

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state that genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.
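
To make the window semantics concrete, here is a minimal sketch of a time-window test, assuming the window is expressed in hours relative to the forecast genesis time as the commit messages above describe. The function name and argument names are hypothetical, and the real tc_gen matching also applies a distance criterion that is left out here.

```python
from datetime import datetime, timedelta


def in_genesis_match_window(fcst_genesis, best_genesis, beg_hours, end_hours):
    """Check whether the Best track genesis time falls inside the window
    [fcst_genesis + beg_hours, fcst_genesis + end_hours]."""
    offset = best_genesis - fcst_genesis
    return timedelta(hours=beg_hours) <= offset <= timedelta(hours=end_hours)


# A forecast that puts genesis 6 hours before the observed genesis time.
fcst = datetime(2021, 3, 12, 0)
best = datetime(2021, 3, 12, 6)
```

With `beg_hours=0` and `end_hours=0` this early forecast finds no match, while `end_hours=12` admits it, which is the behavior the issue asks for.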

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_point_to_track option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.
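
The try block mentioned above follows a common pattern, sketched here without importing numpy so the example stays self-contained. NumPy masked arrays provide a get_fill_value() method while plain ndarrays do not; the helper name and the -9999.0 default are hypothetical and are not taken from write_tmp_dataplane.py.

```python
def fill_value_or_default(data, default=-9999.0):
    """Return the fill value of a masked array, or a default otherwise."""
    try:
        # Masked arrays expose get_fill_value(); plain arrays do not.
        return data.get_fill_value()
    except AttributeError:
        return default
```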

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_length (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581 Renamed met_nc_point_obs to nc_point_obs

* #1581 API ica changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* 1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunately, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* 1581 cleanup

* 1581 cleanup

* 1581 cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Feature 1792 gen_vx_mask (#1816)

* Per issue #1792, Change -type from optional to required. Set default_mask_type to MaskType_None. Added a check on mask_type to see if it's set and print error message accordingly.

* Update test_gen_vx_mask.sh

* For the first test, added -type poly, since the masking type is now required. SL

* For all of the Poly unit tests added -type poly to the command line. The mask type is now required. SL

* Modified document to indicate that -type string (masking type) is now required on the command line for gen_vx_mask. SL

* Update met/docs/Users_Guide/masking.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Update met/src/tools/other/gen_vx_mask/gen_vx_mask.cc

Co-authored-by: johnhg <johnhg@ucar.edu>

Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: johnhg <johnhg@ucar.edu>

* Fix 2 minor formatting errors in the release notes.

* PR #1816 for issue #1792 unexpectedly caused the NB to fail on 20210605. We changed -type from optional to required, but missed adding the -type option in unit_met_test_scripts.xml and unit_ref_config.xml. This is a hotfix to resolve that.

* Feature 1811 anchor links (#1822)

* testing new anchoring link idea.

* testing without the bold asterisk

* tinkering with the look

* another attempt

* another attempt #2

* another attempt #3

* another attempt #4

* making sure new anchor works as expected.

* seeing if link will save with spaces instead of dashes

* need underscores to link

* is it fixed?

* testing

* testing 2

* testing 4

* testing 5

* testing 6

* testing 7

* testing 8

* going back to test original problem

* able to link with spaces instead of underscores. Testing if a return is possible to keep under 79 character limit.

* double checking everything is still working.

* DO NOT break ref lines apart, it won't work.

* trying a shorter name.

* continuing to add anchors

* updating lines 1946 thru 2214 with anchors

* updating lines 2214 thru 3371 with anchors

* updating lines 3371 to the end with anchors

* testing anchor

* testing anchor

* testing anchor 3

* testing anchor 4

* testing anchor 45 percent

* testing anchor final half

* fixing typo

* numbering fcst, obs_1 and 2 to create different links.

* finding more anchors that need numbers to keep them separate.

* fixing warnings

* fixing warnings

* fixing typo

* Feature 1749 hss (#1825)

* Per #1749, updating the MET version number from 10.0 to 10.1 prior to adding new columns of output to existing line types.

* Per #1749, adding 10.1 columns to the Makefile.am

* Per #1749, changes for the mechanics of adding the HSS_EC statistic to the MCTS line type. Still need to actually compute it and make the expected correct value configurable.

* Per #1749, add hss_ec_value as a configurable option for Point-Stat and Grid-Stat. Still need to actually compute it correctly, add it to other test config files, add support to series_analysis/stat_analysis, update the docs, and write up corresponding issues for other METplus components.

* Per #1749, fix the column offsets for the HSS_EC columns.

* Per #1749, add correct definition of HSS_EC.

* Per #1749, pass hss_ec_value from the config file into the computation of the MCTS statistics.

* Per #1749, add hss_ec_value entry to all the Grid-Stat config files.

* Per #1749, update the documentation about the HSS_EC statistic.

* Per #1749, add the -hss_ec_value job command option to Stat-Analysis.

* Per #1749, no real code changes here. Just changing to consistent ordering with hss_ec_value preceding rank_corr_flag.

* Per #1749, update docs for stat_analysis supporting hss_ec_value.

* Per #1749, add HSS_EC to Series-Analysis, but only with a constant hss_ec_value for now.

* Per #1749, add EC_VALUE to the MCTC line type definition.

* Per #1749, move ECvalue from the MCTSInfo class into the ContingencyTable class so that it's available to be included in the MCTC output line type.

* Per #1749, update point_stat, grid_stat, and series_analysis to accommodate the move of ECvalue from the MCTSInfo class to the ContingencyTable class.

* Per #1749, update library code to write EC_VALUE to the MCTC line type and update the User's Guide docs.

* Per #1749, update stat_analysis code for the addition of EC_VALUE in the MCTC line type.

* Per #1749, write EC_VALUE to the MCTC output line type.

* Per #1749, store the ec_value that was actually used to compute the stats.

* Per #1749, parsing EC_VALUE from the MCTC line type.

* Per #1749, move the MCTC EC_VALUE column to the end of the line, as requested by METdatadb.

* Per #932, need to write MCTS HSS_EC value to temp file during the bootstrapping process.

* Added new reference for Ou 2016

* Layout correction

* Added generalized HSS, removed word from HSS_EC

* Per #1749, change the hss_ec_value config entry to match new conventions.

Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>

* Feature 1826 v10.1.0_beta1 (#1828)

* Per #1826, update the version in the docs to 10.1.0-beta1 and add release notes for this development version.

* Per #1826, change the beta1 release date to 6/11 so that I can do it today.

* Removing Randy and David from the email notification list for nightly run scripts.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>