New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metrics: make CRPS handle obs outside the fx support #781
Conversation
Revise the CRPS calculation to handle the case where the observation is outside the forecast support. For example, if Prob(Power <= 10 MW) = 0% but then the observation is 9 MW. Or if Prob(Power <= 30 MW) = 100% but ten the observation is 31 MW. This change required some subtle changes to how the vectorized calculation is performed. Also, this commit switches the integration from the rectangular rule to a quadrature rule, which seems result in more accurate CRPS calculations when the number of forecast CDF intervals is low (e.g., 10 or less). This commit also updates the docstring of the CRPS function and the tests, including comparisons against examples where the forecast is defined by a continuous parametric distribution that allows calculating the CRPS analytically. Note: this branch still needs to validate the CRPS skill score calculation and related tests. Also, it would be good to include some "simpler" CRPS calculation examples (e.g., with 3 or 5 CDF intervals, but that may not be practical.
CI failed because I forgot about the |
Simplify the integration using the numpy trapezoidal function. The result is identical to the prior code (since underneath it's the same operations), but using np.trapz() makes it clearer what method is being used and allows users to read up on the numpy documentation regarding the integration method.
Add additional tests for the CRPS and CRPSS (CRPS skill score) functions, including "simple" examples with 3 CDF intervals to help show the logical of the trapezoidal integration.
Modify how the forecast CDF is expanded and the observation indicator function is calculated to support numpy broadcasting when the number of samples is greater than or equal to two (i.e., n >= 2).
Revert to assuming observations are provided as numpy arrays, including in the case where the observation is for a single sample. This matches the previous logic and helps prevent issues in other parts of the code base.
Note that since the CRPS calculation now uses a more generalized numerical integration setup, some of the example CRPS values had to be adjusted as the "simple" CRPS examples in the calculator tests are not very realistic (e.g., only 3 forecast CDF intervals). Rather than complicate the examples in these tests, I instead corrected the CRPS values for the given examples. Also, this commit corrects a mispelling of the Brier score name in a code comment.
@lboeman this PR (for revising the CRPS metric calculation) is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# (n, d) ==> (n, d + 1) | ||
fx_min = np.minimum(obs, fx[:, 0]) | ||
fx = np.hstack([fx_min[:, np.newaxis], fx]) | ||
fx_prob = np.hstack([np.zeros([n, 1]), fx_prob]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.zeros
or np.full(fx[:, 0])
? Just double checking since I don't remember discussing this case in email.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.zeros
because we're assuming the left-hand side of the CDF goes to zero, i.e., that whether the fx_min
is obs
or fx[:, 0]
the corresponding probability is zero. Also, if fx_min = fx[:, 0]
but fx_prob[:, 0] != 0
, this operation means the integration is done with a zero-width area and therefore doesn't change the CRPS result (fx[:, 1] - fx[:, 0] = 0
since now fx[:, 1] = fx[:, 0]
).
4: ('season', 'crps', 'JJA', 21.41819), | ||
6: ('month', 'crps', 'Aug', 21.41819), | ||
8: ('hour', 'crps', '0', 28.103), | ||
9: ('hour', 'crps', '1', 26.634375), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are big changes! Perhaps that should not be surprising given that the percentiles are 10, 20, 30. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly.
Sounds good. Everything looks good to me. From what I can follow tests align well with the discussion of expected behavior. Thanks for taking care of this so quickly @dplarson. |
Thanks @wholmgren for your fast review. I've revised per your comments. |
@lboeman FYI I'm available the rest of the day if any other changes are needed before merging this PR. |
I'll go ahead and merge and poke it in dev a bit! Thanks David. |
docs/source/api.rst
for API changes.docs/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).This PR addresses the issue identified in #779 regarding how the current implementation of the CRPS metric handles cases where the observation is outside the forecast support. For example, if the probabilistic forecasts predicts power values from 1 to 8 MW but the actual power is 10 MW. The core idea is to extend the forecast CDF so covers the observation, whether the observation is less than the minimum forecast (
obs < min fx
) or greater than the maximum forecast (obs > max fx
).This PR also updates the CRPS function docstring with improved equation rendering and notes on practical considerations related to calculating CRPS from discrete forecast CDFs.
Lastly, the CRPS calculation has been updated to use trapezoidal numerical integration (instead of rectangular integration). This change does not increase the computational cost of the CRPS calculation, but has lower error (
O(\delta x)
for rectangular vsO(\delta x^2)
for trapezoidal, where\delta x
is the spacing between points). In practice, this means the CRPS calculation is now more accurate in cases with lower resolution forecast CDFs (e.g., 0%, 10%, ..., 100%).