Correction to crps_climo calculation, CRPSS #1451

Closed
21 of 22 tasks
j-opatz opened this issue Aug 10, 2020 · 6 comments · Fixed by #1676 or #1683
Assignees
Labels
reporting: DTC NOAA R2O NOAA Research to Operations DTC Project requestor: NOAA/EMC NOAA Environmental Modeling Center type: enhancement Improve something that it is currently doing
Milestone

Comments

@j-opatz
Contributor

j-opatz commented Aug 10, 2020

Describe the Enhancement

During the use of Ensemble-Stat, a user observed that the CRPSS from MET did not match the CRPSS from VSDB. Further investigation showed that the CRPS itself matched between the two calculations. Since the climatological CRPS is the only other input to the CRPSS, it is the most likely source of the discrepancy.

Investigation has revealed that MET's handling of climo data for ensemble verification differs from how it's done at EMC. Ensemble-Stat's use of climo data was made consistent with Point-Stat and Grid-Stat, where individual obs are subset into climo CDF bins, stats are computed within each bin, and the mean of those stats is reported. However, that is NOT how NOAA/EMC uses climo data for ensemble verification.

For met-10.0.0, we should remove the climo CDF subsetting logic. Instead, use the climo mean and standard deviation, assuming a normal distribution, to sample from the climo distribution, and process those climo values as an ensemble. By default, NOAA/EMC uses a climo ensemble of size 10, but this should be configurable.
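A minimal sketch of what "sampling" a climo ensemble from the mean and standard deviation could look like. It assumes the members are taken at the midpoints of equal-probability bins of the normal distribution; the function name `climo_ensemble` and that choice of quantiles are illustrative assumptions, not MET's actual implementation.

```python
from statistics import NormalDist


def climo_ensemble(climo_mean, climo_stdev, n_members=10):
    """Return n_members equally likely values from N(climo_mean, climo_stdev).

    Assumption: one value per equal-probability bin, taken at the bin's
    midpoint quantile (i + 0.5) / n_members. MET may pick quantiles differently.
    """
    dist = NormalDist(mu=climo_mean, sigma=climo_stdev)
    return [dist.inv_cdf((i + 0.5) / n_members) for i in range(n_members)]


# Made-up climo mean/stdev (e.g. a temperature in K), default size of 10.
members = climo_ensemble(climo_mean=280.0, climo_stdev=5.0, n_members=10)
print([round(m, 2) for m in members])
```

The resulting list can then be scored exactly like a forecast ensemble. Note that `inv_cdf` needs a strictly positive standard deviation, so a zero climo stdev would have to be handled separately.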

Since the climo CRPS and CRPS skill score now depend on both the climo mean and standard deviation, we need to add the climo standard deviation to the ORANK line type. That's necessary for Stat-Analysis to be able to aggregate ORANK lines together when computing the ECNT output line type.

Time Estimate

Not established; dependent on ID of problem

Sub-Issues

  • Discover where in the crps_climo calculation the error is made

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

2791541

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required: John HG
  • Select scientist(s) or no scientist required: John O

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones
  • Select milestone

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@j-opatz j-opatz added type: enhancement Improve something that it is currently doing component: application code requestor: NOAA/EMC NOAA Environmental Modeling Center priority: medium Medium Priority alert: NEED ACCOUNT KEY Need to assign an account key to this issue labels Aug 10, 2020
@j-opatz j-opatz added this to the MET 10.0 milestone Aug 10, 2020
@j-opatz j-opatz self-assigned this Aug 10, 2020
@j-opatz j-opatz added this to To do in MET-10.0.0-beta1 (10/22/20) via automation Aug 10, 2020
@JohnHalleyGotway
Collaborator

Please find Binbin's sample test data attached.
crpss_met_vsdb.tar.gz

@JohnHalleyGotway
Collaborator

Met with EMC folks on 9/8/2020, and Binbin requested that the climo CRPS be added to the ECNT line type.

@JohnHalleyGotway JohnHalleyGotway added alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle and removed alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle labels Sep 10, 2020
@TaraJensen TaraJensen added priority: high and removed alert: NEED ACCOUNT KEY Need to assign an account key to this issue priority: medium Medium Priority labels Sep 22, 2020
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-beta2 (12/7/20) via automation Oct 20, 2020
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-beta3 (1/27/21) via automation Dec 4, 2020
@JohnHalleyGotway JohnHalleyGotway added this to To do in MET-10.0.0-beta4 (3/2/21) via automation Jan 25, 2021
@JohnHalleyGotway JohnHalleyGotway self-assigned this Feb 1, 2021
JohnHalleyGotway added a commit that referenced this issue Feb 12, 2021
@JohnHalleyGotway
Collaborator

Currently the CRPS climo is computed using only the climo mean. To make the logic consistent with NCEP (i.e. Binbin's example) we'll need both the climo mean and standard deviation. Use those to extract an ensemble of equally-likely climo values.

One implication is that we'll need to add CLIMO_STDEV to the ORANK line type. That's needed to be able to aggregate ORANK line types into ECNT and recompute the Gaussian CRPSS.
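To illustrate why the climo mean alone is not enough, here is a sketch of the standard closed-form CRPS for a normal distribution scored against a single observation. Whether MET evaluates the Gaussian climo CRPS with exactly this expression is an assumption, and the name `crps_normal` is hypothetical.

```python
import math


def crps_normal(mu, sigma, obs):
    """Closed-form CRPS of a normal distribution N(mu, sigma) for one observation.

    Uses CRPS = sigma * [ z * (2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ],
    where z = (obs - mu) / sigma, phi is the standard normal PDF and
    Phi is the standard normal CDF. Requires sigma > 0.
    """
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Same climo mean and observation, different climo stdev (made-up values):
print(crps_normal(mu=280.0, sigma=2.0, obs=283.0))
print(crps_normal(mu=280.0, sigma=6.0, obs=283.0))
```

The two calls share the same mean and observation but return different scores, so an ORANK line that stores only the climo mean does not carry enough information to recompute the climo CRPS during aggregation.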

JohnHalleyGotway added a commit that referenced this issue Feb 15, 2021
JohnHalleyGotway added a commit that referenced this issue Feb 17, 2021
* Per #1450, add new ECNT columns for Hersbach CRPS. Still need to actually compute the stats though.

* Per #1450, update NumArray functions to only sort if the data is not yet sorted. And check for bad data when computing the standard deviation.

* Per #1450, add code to compute the empirical CRPS value.

* Per #1450, large change to the new output for the empirical CRPS. In order to aggregate decomposed empirical CRPS reliability and potential correctly, we'd need to write (n+1)*2 additional columns. While the empirical CRPS can be aggregated as a weighted mean, the decomposition cannot. It just isn't feasible to do this in the ECNT line type. If the reliability and potential really are required, recommend that we add an entirely new CRPS line type instead of tacking onto ECNT. These changes simply remove reliability and potential from the output.

* Per #1450 and #1451, replacing single CRPS_CLIMO column with CRPSCL and CRPSCL_EMP which will be needed for #1451.

* Per #1450, delete temp files I'd accidentally committed.

* Per #1450, update the user's guide with CRPS updates.

* Fix bug replacing crpss_emp with crpss_gaus.
JohnHalleyGotway added a commit that referenced this issue Feb 18, 2021
* Per #1646, one line fix for cut-and-paste error. (#1647)

* Per #1644, no actual code changes here. Just formatting and spacing. For example, replace double ;; with single ;

* Per #1644, FOUND THE BUG! It's a copy/paste error. We had var_name_map.end() that should be def_var_name_map.end(). Fixing that gets rid of the runtime hang.

* Per #1643, redefine the contents of the existing AREA_RATIO output column from MODE. Define it as FCST/OBS object area instead of min/max. Update the User's Guide to note the change and also clarify that the MTD VOLUME_RATIO output really is FCST/OBS. (#1650)

* Feature 1644 ps_log (#1651)

* Per #1644, write rejection reason codes at verbosity 2 when there are 0 matched pairs.

* Per #1644, add a few sentences to Point-Stat, Practical Information chapter about debugging 0 matched pairs.

* The mode_conv.pl logic was slightly broken. MET PR #1650 should have broken the NB but it did not. Turns out the diffing logic is NOT properly distinguishing between single and pair object lines. It does this by looking for an underscore in the OBJECT_ID column. When we added FCST_UNITS and OBS_UNITS, that shifted OBJECT_ID up 2 spots, but the code was still checking the (0-based) 20th column instead of the 22nd. Fixing this now and will rerun NB20210202 to confirm it works again.

* The diffing logic for MODE pair lines still was not correct. We'd added the ASPECT_DIFF and CURVATURE_RATIO columns a while ago, but they were missing from the diff logic. This logic really is not good. We need to make it more robust, reading the version-specific header columns from a table file instead of hard-coding them!

* Feature 1653 rscripts (#1654)

* Per #1653, update plot_cnt.R and plot_mpr.R to remove the version-specific header columns.

* Per #1653, nice enhancements to these Rscripts to make them more independent of the MET version number.

* Per #1653, more tweaks

* Per #1653, if no input files are provided, error out with a useful message.

* Per #1653, while the scripts ran fine using R 4.0.2 on my Mac, they fail on eyewall using R 3.4.0. Adding as.character() to get past that error.

* Feature 1655 nc_log (#1656)

* #1630 Display a warning instead of error message with invalid variable if the input data is empty

* Feature 1658 grib_tables (#1659)

* Per #1658, update MXUPHL entries.

* Per #1658, updating long name for MAXREF, MAXUVV, and MAXDVV.

* Modified format of release notes

* Feature 1450 hersbach (#1662)

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
JohnHalleyGotway added a commit that referenced this issue Feb 18, 2021
…r each CDF bin. Removing the bin-related arguments from the write_ecnt functions.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
…nsemble-Stat will no longer compute stats separately for each climo bin, I'm removing the reference to write_bins from the Ensemble-Stat config files.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
…ak apart the normal climo computation into separate functions for crps, ign, and pit.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
…climo CDF bins. We had done this to be consistent with the use of climo data in point and grid-stat. But this change to the handling of climo data is consistent with the NOAA/EMC approach.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
…ate function so that it can also be called by stat-analysis.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
…opy to make the logic of doing this in Stat-Analysis a little easier.
JohnHalleyGotway added a commit that referenced this issue Feb 19, 2021
… type. Needed to call set_climo_cdf() there so that we know how many climo values to use when computing the empirical climo CRPS.
@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Feb 19, 2021

I realize that HiRA in Point-Stat also writes the ECNT line type. Here's a selection of columns taken from:

cat MET_test_output/climatology/point_stat_WMO_CLIMO_1.5DEG_120000L_20120409_120000V_ecnt.txt | awk '{print $10, $12, $18, $19, $24, $27, $28, $37, $39, $40}'

FCST_VAR FCST_LEV INTERP_MTHD  INTERP_PNTS LINE_TYPE    CRPS   CRPSS  CRPSCL CRPSCL_EMP CRPSS_EMP
TMP      P850     NBRHD_SQUARE           4 ECNT      0.80461 0.61819 2.10735    2.05663   0.57563
TMP      P850     NBRHD_SQUARE           9 ECNT      0.85146 0.59596 2.10735    2.05663    0.5682
TMP      P850     NBRHD_SQUARE          16 ECNT      0.96009 0.54441 2.10735    2.05663   0.51426
TMP      P850     NBRHD_SQUARE          25 ECNT      1.04051 0.50625 2.10735    2.05663   0.47649

Note that the CRPSCL and CRPSCL_EMP columns remain constant. That is because we're using the climo_mean/climo_stdev at the obs location to compute them. So while the neighborhood size of the forecast ensemble members changes, the reference climatology remains fixed across all neighborhood sizes.

This is probably just fine, but is interesting to note.
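As a quick consistency check on the rows above, the Gaussian skill score can be recomputed from the CRPS and CRPSCL columns. This sketch assumes the usual skill-score relation CRPSS = 1 - CRPS / CRPSCL, which is not spelled out in this thread, and it reads columns by header name rather than by position so it does not depend on the awk column numbers.

```python
# The selected ECNT columns quoted above, pasted as whitespace-delimited text.
rows_text = """\
FCST_VAR FCST_LEV INTERP_MTHD INTERP_PNTS LINE_TYPE CRPS CRPSS CRPSCL CRPSCL_EMP CRPSS_EMP
TMP P850 NBRHD_SQUARE 4 ECNT 0.80461 0.61819 2.10735 2.05663 0.57563
TMP P850 NBRHD_SQUARE 9 ECNT 0.85146 0.59596 2.10735 2.05663 0.5682
TMP P850 NBRHD_SQUARE 16 ECNT 0.96009 0.54441 2.10735 2.05663 0.51426
TMP P850 NBRHD_SQUARE 25 ECNT 1.04051 0.50625 2.10735 2.05663 0.47649
"""

lines = rows_text.splitlines()
header = lines[0].split()
records = [dict(zip(header, line.split())) for line in lines[1:]]

# Assumed relation: CRPSS = 1 - CRPS / CRPSCL.
recomputed = [1.0 - float(r["CRPS"]) / float(r["CRPSCL"]) for r in records]

for rec, value in zip(records, recomputed):
    print(rec["INTERP_PNTS"], "reported:", rec["CRPSS"], "recomputed:", round(value, 5))
```

Because CRPSCL is the same in every row, the decrease in CRPSS with neighborhood size comes entirely from the growing forecast CRPS.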

JohnHalleyGotway added a commit that referenced this issue Feb 20, 2021
…empty OBS_UNIT string and write NA instead.
JohnHalleyGotway added a commit that referenced this issue Feb 20, 2021
…emble-Stat to also include point verification. Tweak the Ensemble-Stat configuration for that and also add a call to pb2nc to prepare the point observations for use.
@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Feb 20, 2021

I compared the result of the updated code to the sample data from Binbin as follows.

(1) Wrote to_orank.py to convert CRPSS.data to 14760 ORANK lines as if they were written by Ensemble-Stat.
to_orank.py.txt
That produces this result:
CRPSS_orank.txt

(2) Ran that through Stat-Analysis:

stat_analysis -lookin CRPSS_orank.txt -job aggregate_stat -line_type ORANK -out_line_type ECNT -v 4 -out_bin_size 0.1 -out orank_to_ecnt.txt
COL_NAME: TOTAL N_ENS    CRPS   CRPSS     IGN      ME   RMSE  SPREAD ME_OERR RMSE_OERR SPREAD_OERR SPREAD_PLUS_OERR  CRPSCL CRPS_EMP CRPSCL_EMP CRPSS_EMP
    ECNT: 14760    20 4.94721 0.49744 3.64504 1.19287 9.2985 10.0277      NA        NA          NA               NA 9.84398  5.10959   10.24482   0.50125

orank_to_ecnt.txt
(3) For easier comparison, I tweaked CRPSS.f to use a constant weight of 1.0. I also wrote out CRPSS two ways: as the mean of the CRPSS for each point, and as the CRPSS of the avg CRPS scores:

 CRPSF =   5.10960197      CRPSC =   9.77567101    
 Avg of CRPSS =  0.234634370      CRPSS from Avg =  0.477314472  

Results:

  • EMC is using the empirical Hersbach CRPS method.
  • MET ensemble CRPS_EMP of 5.10959 is ALMOST IDENTICAL to the VSDB ensemble CRPSF of 5.10960197.
  • MET climo CRPSCL_EMP of 10.24482 is SIMILAR TO the VSDB climo CRPSC of 9.77567101. MET computes this using the mean and stdev of the 9 climo vals while VSDB uses the 9 climo vals directly. And that will definitely cause a difference.
  • MET CRPSS_EMP of 0.50125 is SIMILAR TO VSDB CRPSS from the average of 0.477314472.
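For readers unfamiliar with the empirical method, the sketch below scores an ensemble's step-function CDF against one observation using the kernel form of the CRPS, which gives the same total as integrating the squared difference between the ensemble CDF and the observation step function. The arrangement used inside MET and CRPSS.f is not shown in this thread, so treat the function name `crps_empirical` and this formulation as assumptions.

```python
def crps_empirical(members, obs):
    """CRPS of an ensemble's empirical (step-function) CDF for one observation.

    Kernel form: mean|x_i - obs| - (1 / (2 n^2)) * sum_ij |x_i - x_j|.
    """
    n = len(members)
    # Average absolute distance between each member and the observation.
    term_obs = sum(abs(x - obs) for x in members) / n
    # Half the average absolute distance between all member pairs.
    term_spread = sum(abs(xi - xj) for xi in members for xj in members) / (2.0 * n * n)
    return term_obs - term_spread


# Made-up 5-member ensemble and observation.
ens = [1.0, 2.5, 3.0, 4.5, 6.0]
print(crps_empirical(ens, obs=3.2))
```

The same function can be applied to a set of climo values to get an empirical climo score. Feeding it the raw climo values, versus values regenerated from their mean and stdev, will generally give different results, which matches the CRPSCL_EMP versus CRPSC difference noted above.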

Notes:

  • I assume the CRPSS.f code has a bug. Instead of computing the CRPSS for each point and then averaging them, it should really be averaging the CRPSF and CRPSC across all those points and computing CRPSS once using the average scores. But I need Binbin to confirm.
  • For some reason this simple Stat-Analysis job takes almost 2.5 minutes to run. I should figure out what's slowing it down so much and fix it!
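The difference between the two ways of forming CRPSS can be seen with a few made-up per-point scores. The helper names below are hypothetical, and the skill score is assumed to be 1 - CRPS_fcst / CRPS_climo; the last line recomputes the "CRPSS from Avg" value from the CRPSF and CRPSC averages printed in step (3).

```python
def crpss_of_means(crps_fcst, crps_climo):
    """Average the forecast and climo CRPS first, then compute one skill score."""
    mean_fcst = sum(crps_fcst) / len(crps_fcst)
    mean_climo = sum(crps_climo) / len(crps_climo)
    return 1.0 - mean_fcst / mean_climo


def mean_of_crpss(crps_fcst, crps_climo):
    """Compute a skill score at every point, then average the skill scores."""
    scores = [1.0 - f / c for f, c in zip(crps_fcst, crps_climo)]
    return sum(scores) / len(scores)


# Made-up per-point scores, for illustration only.
fcst = [1.0, 4.0, 2.0]
climo = [2.0, 5.0, 8.0]
print(crpss_of_means(fcst, climo))
print(mean_of_crpss(fcst, climo))

# Skill score from the averaged VSDB values quoted in step (3).
print(crpss_of_means([5.10960197], [9.77567101]))
```

The two functions only agree when the per-point climo CRPS is constant, so the gap between 0.2346 and 0.4773 in the CRPSS.f output is not surprising.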

@JohnHalleyGotway JohnHalleyGotway linked a pull request Feb 22, 2021 that will close this issue
10 tasks
@JohnHalleyGotway JohnHalleyGotway moved this from To do to Pull request review in MET-10.0.0-beta4 (3/2/21) Feb 22, 2021
JohnHalleyGotway added a commit that referenced this issue Feb 24, 2021
* Per #1451, instead of computing the climo crps on the fly, compute and store it separately for each point.

* Per #1451, the ECNT line type will no longer be written separately for each CDF bin. Removing the bin-related arguments from the write_ecnt functions.

* Per #1451, the climo_cdf.write_bins option no longer applies. Since Ensemble-Stat will no longer compute stats separately for each climo bin, I'm removing the reference to write_bins from the Ensemble-Stat config files.

* Per #1451, compute and store the climo CRPS for each point. Also, break apart the normal climo computation into separate functions for crps, ign, and pit.

* Per #1451, update Ensemble-Stat logic to no longer subset pairs into climo CDF bins. We had done this to be consistent with the use of climo data in point and grid-stat. But this change to the handling of climo data is consistent with the NOAA/EMC approach.

* Per #1451, split out the setting of climo CDF thresholds into a separate function so that it can also be called by stat-analysis.

* Per #1451, in the Ensemble-Stat ORANK line type, rename CLIMO to CLIMO_MEAN and add a CLIMO_STDEV column.

* Per #1451, also need to update gsidens2orank to write a climo_stdev column.

* Per #1451, switch from constant pointer to ClimoCDFInfo object to a copy to make the logic of doing this in Stat-Analysis a little easier.

* Per #1451, the HiRA method in Point-Stat computes an ECNT output line type. Needed to call set_climo_cdf() there so that we know how many climo values to use when computing the empirical climo CRPS.

* Per #1451, need to store climo_cdf for both grid and point verification.

* Per #1451, update to write the CLIMO_STDEV header column for the ORANK line type.

* Per #1451, in Ensemble-Stat when doing point verification, check for empty OBS_UNIT string and write NA instead.

* Per #1451, update unit tests by enhancing the climatology call to Ensemble-Stat to also include point verification. Tweak the Ensemble-Stat configuration for that and also add a call to pb2nc to prepare the point observations for use.

* Per #1450, added a new section to the Ensemble-Stat chapter describing how climo mean/stdev are used in the computation of the skill scores.

* Update ensemble-stat.rst

Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
MET-10.0.0-beta4 (3/2/21) automation moved this from Pull request review to Done Feb 24, 2021
@JohnHalleyGotway JohnHalleyGotway linked a pull request Feb 25, 2021 that will close this issue
10 tasks
JohnHalleyGotway added a commit that referenced this issue Feb 25, 2021
* #1657 Added TIME_EPSILON

* #1657 Corrected 1 second offset by the precision error

* #1657 Added AccumTime

* #1657 Read the time from "bounds" attribute and set the max value from the bounds time variable

* #1657 Corrected 1 second offset by the precision error

* Per #1439, add check_mask_names() utility function which errors out if the list of masking region names is non-unique. Update Point-Stat and Grid-Stat to call it. (#1679)

* Feature 1451 crpss (#1676)

* #1677 Update the reference time (from time_bnds variable) (#1680)

Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>

* Feature 1135 stat_analysis (#1681)

* Per #1135, add fcst/obs_init/valid_inc/exc options for STAT-Analysis jobs.

* Per #1135, update all the STATAnalysis config files to include entries for the new fcst/obs_init/valid_inc/exc options.

* Per #1135, add documentation for fcst/obs_init/valid_inc/exc options to the STAT-Analysis chapter. Also, clarify the description for the existing options.

* Per #1135, adding another call to stat_analysis to check the time filtering options.

* Per #1135, just renaming stat_analysis output file.

* Apply suggestions from code review

Co-authored-by: jprestop <jpresto@ucar.edu>

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
@JohnHalleyGotway
Collaborator

From NOAA/EMC on 3/16/21:
We just completed the retrospective verification of GEFS with MET_10.0.0_beta4 over past
4 weeks, and obtained the averaged CRPSS for different forecast hours. The comparison
shows that CRPSS-MET and CRPSS-VSDB are very close. See attached plot for Z500
over N. Hemisphere. This is very nice. So the new CRPSS code in MET_10.0.beta4 is
successful.
[Attached plot: CRPSS-GEFS_Z500-Compare]

3 participants