Metrics: make CRPS handle obs outside the fx support #781

dplarson · 2022-02-02T01:03:25Z

Closes CRPS Bug #779.
- I am familiar with the contributing guidelines.
- Tests added.
- Updates entries to docs/source/api.rst for API changes.
- Adds descriptions to appropriate "what's new" file in docs/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
- New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
- Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

This PR addresses the issue identified in #779 regarding how the current implementation of the CRPS metric handles cases where the observation is outside the forecast support. For example, the probabilistic forecast may predict power values from 1 to 8 MW while the actual power is 10 MW. The core idea is to extend the forecast CDF so that it covers the observation, whether the observation is less than the minimum forecast (obs < min fx) or greater than the maximum forecast (obs > max fx).
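For readers who want to see the idea end to end, here is a minimal sketch of a CRPS calculation with the CDF extended on both sides. It is not the code in this PR: the function name `crps_extended` is hypothetical, and it assumes `fx` has shape (n, d) with each row sorted in increasing order, `fx_prob` holds the matching CDF probabilities in percent, and `obs` has shape (n,). Setting the indicator to zero at the appended right-hand node is a choice made in this sketch, so that the integrand stays at 1 over the whole range between max fx and obs.

```python
import numpy as np


def crps_extended(fx, fx_prob, obs):
    """Sketch of CRPS for discrete forecast CDFs (hypothetical helper)."""
    fx = np.asarray(fx, dtype=float)            # (n, d) forecast values
    fx_prob = np.asarray(fx_prob, dtype=float)  # (n, d) probabilities [%]
    obs = np.asarray(obs, dtype=float)          # (n,) observations
    n = fx.shape[0]

    # Samples where the observation lies above the forecast support.
    above = fx[:, -1] < obs

    # Extend the CDF on both sides: (n, d) ==> (n, d + 2).
    # Assumes the CDF is 0% at the new left node and 100% at the new right node.
    fx_min = np.minimum(obs, fx[:, 0])
    fx_max = np.maximum(obs, fx[:, -1])
    fx = np.hstack([fx_min[:, np.newaxis], fx, fx_max[:, np.newaxis]])
    fx_prob = np.hstack([np.zeros([n, 1]), fx_prob, np.full([n, 1], 100.0)])

    # Indicator function: 0 left of the observation, 1 at and right of it.
    indicator = np.where(fx >= obs[:, np.newaxis], 1.0, 0.0)
    # Sketch-specific choice: keep the indicator at 0 on (max fx, obs).
    indicator[above, -1] = 0.0

    # Trapezoidal integration along each sample, then average over samples.
    integrand = (fx_prob / 100.0 - indicator) ** 2
    widths = np.diff(fx, axis=1)
    area = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * widths, axis=1)
    return float(np.mean(area))


fx = [[10.0, 20.0, 30.0]]
fx_prob = [[0.0, 50.0, 100.0]]
crps_above = crps_extended(fx, fx_prob, [40.0])   # obs > max fx
crps_below = crps_extended(fx, fx_prob, [0.0])    # obs < min fx
crps_inside = crps_extended(fx, fx_prob, [20.0])  # obs inside the support
```

With this symmetric example, the two out-of-support cases give the same score, and both are larger than the in-support case.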

This PR also updates the CRPS function docstring with improved equation rendering and notes on practical considerations related to calculating CRPS from discrete forecast CDFs.

Lastly, the CRPS calculation has been updated to use trapezoidal numerical integration (instead of rectangular integration). This change does not increase the computational cost of the CRPS calculation, but has lower error (O(\delta x) for rectangular vs O(\delta x^2) for trapezoidal, where \delta x is the spacing between points). In practice, this means the CRPS calculation is now more accurate in cases with lower resolution forecast CDFs (e.g., 0%, 10%, ..., 100%).
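The error orders quoted above can be checked with a small experiment. The sketch below is illustrative only: it integrates x^2 on [0, 1] (exact value 1/3) rather than a CRPS integrand, and it assumes a left-endpoint rectangular rule, since the exact rectangular variant used previously is not stated here. The helper names are hypothetical.

```python
import numpy as np


def left_rectangle(y, x):
    """Left-endpoint rectangular rule on a (possibly non-uniform) grid."""
    return float(np.sum(y[:-1] * np.diff(x)))


def trapezoid_rule(y, x):
    """Trapezoidal rule on the same grid."""
    return float(np.sum(0.5 * (y[:-1] + y[1:]) * np.diff(x)))


def integration_errors(num_intervals):
    """Absolute errors of both rules for the integral of x^2 on [0, 1]."""
    x = np.linspace(0.0, 1.0, num_intervals + 1)
    y = x ** 2
    exact = 1.0 / 3.0
    return (abs(left_rectangle(y, x) - exact),
            abs(trapezoid_rule(y, x) - exact))


rect_10, trap_10 = integration_errors(10)
rect_20, trap_20 = integration_errors(20)

# Halving the spacing roughly halves the rectangular error
# and roughly quarters the trapezoidal error.
print(rect_10 / rect_20, trap_10 / trap_20)
```

Both rules touch each grid point once, which is consistent with the statement that the change does not increase the computational cost.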

Revise the CRPS calculation to handle the case where the observation is outside the forecast support. For example, if Prob(Power <= 10 MW) = 0% but then the observation is 9 MW. Or if Prob(Power <= 30 MW) = 100% but then the observation is 31 MW. This change required some subtle changes to how the vectorized calculation is performed. Also, this commit switches the integration from the rectangular rule to the trapezoidal rule, which seems to result in more accurate CRPS calculations when the number of forecast CDF intervals is low (e.g., 10 or less). This commit also updates the docstring of the CRPS function and the tests, including comparisons against examples where the forecast is defined by a continuous parametric distribution that allows calculating the CRPS analytically. Note: this branch still needs to validate the CRPS skill score calculation and related tests. Also, it would be good to include some "simpler" CRPS calculation examples (e.g., with 3 or 5 CDF intervals), but that may not be practical.
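As an illustration of comparing a numerical CRPS against an analytic one, the sketch below uses a normal forecast distribution, for which the CRPS has a well-known closed form. The choice of the normal distribution, the grid limits of plus or minus 8 standard deviations, and the function names are all assumptions made for this example; they are not taken from the tests in this PR.

```python
import math

import numpy as np


def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a normal forecast N(mu, sigma^2) for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def crps_numeric(y, mu, sigma, num_points):
    """Trapezoidal CRPS from the normal CDF sampled on a uniform grid."""
    # Assumes the observation lies within the sampled range.
    x = np.linspace(mu - 8.0 * sigma, mu + 8.0 * sigma, num_points)
    erf = np.vectorize(math.erf)
    cdf = 0.5 * (1.0 + erf((x - mu) / (sigma * math.sqrt(2.0))))
    indicator = np.where(x >= y, 1.0, 0.0)
    integrand = (cdf - indicator) ** 2
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))


analytic = crps_normal(0.5, 0.0, 1.0)
numeric_fine = crps_numeric(0.5, 0.0, 1.0, 2001)
numeric_coarse = crps_numeric(0.5, 0.0, 1.0, 11)
print(analytic, numeric_fine, numeric_coarse)
```

The fine grid should agree closely with the closed form, while the coarse grid shows how much accuracy is lost when only a handful of CDF points are available.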

dplarson · 2022-02-11T02:59:56Z

CI failed because I forgot about the metrics.calculator tests... Working to fix those.

Simplify the integration using the numpy trapezoidal function. The result is identical to the prior code (since underneath it's the same operations), but using np.trapz() makes it clearer what method is being used and allows users to read up on the numpy documentation regarding the integration method.
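A quick way to see that the library call and the hand-written sum agree is sketched below. The arrays are made-up illustration values. Because newer NumPy releases provide `np.trapezoid` and deprecate `np.trapz`, the sketch picks whichever is available.

```python
import numpy as np

# Use np.trapezoid when present (NumPy >= 2.0), otherwise fall back to np.trapz.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Two samples (rows) with non-uniform spacing in the second row.
fx = np.array([[10.0, 20.0, 30.0],
               [0.0, 5.0, 20.0]])
integrand = np.array([[0.0, 0.25, 1.0],
                      [1.0, 0.25, 0.0]])

# Hand-written trapezoidal rule along each row.
manual = np.sum(
    0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(fx, axis=1),
    axis=1,
)

# Same integral through the NumPy function.
library = trapezoid(integrand, x=fx, axis=1)

print(manual, library)
```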

Add additional tests for the CRPS and CRPSS (CRPS skill score) functions, including "simple" examples with 3 CDF intervals to help show the logic of the trapezoidal integration.

Modify how the forecast CDF is expanded and the observation indicator function is calculated to support numpy broadcasting when the number of samples is greater than or equal to two (i.e., n >= 2).
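The broadcasting pattern can be sketched as follows. The values are made up, and the sketch only shows the indicator step: `obs` with shape (n,) is reshaped to (n, 1) so that each observation is compared against its own row of `fx`, whatever the values of n and d. Without the extra axis, NumPy aligns `obs` with the last axis of `fx`, which only works by accident when n equals d and then compares against the wrong axis.

```python
import numpy as np

fx = np.array([[10.0, 20.0, 30.0],
               [10.0, 20.0, 30.0]])   # (n, d) = (2, 3)
obs = np.array([5.0, 25.0])           # (n,) = (2,)

# Add a trailing axis so the comparison broadcasts row by row:
# (n, d) against (n, 1) ==> (n, d).
obs_col = obs[:, np.newaxis]
indicator = np.where(fx >= obs_col, 1.0, 0.0)

print(indicator.shape)
print(indicator)
```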

Revert to assuming observations are provided as numpy arrays, including in the case where the observation is for a single sample. This matches the previous logic and helps prevent issues in other parts of the code base.

Note that since the CRPS calculation now uses a more generalized numerical integration setup, some of the example CRPS values had to be adjusted as the "simple" CRPS examples in the calculator tests are not very realistic (e.g., only 3 forecast CDF intervals). Rather than complicate the examples in these tests, I instead corrected the CRPS values for the given examples. Also, this commit corrects a misspelling of the Brier score name in a code comment.

dplarson · 2022-02-11T09:01:36Z

@lboeman this PR (for revising the CRPS metric calculation) is ready for review.

wholmgren

Thanks @dplarson! Great tests. I didn't double check all of them but let me know if there's anything I should pay more attention to.

@lboeman let's give @dplarson a chance to address the comments and unless David says otherwise then go ahead and merge - I don't feel a need to look again.

docs/source/whatsnew/1.0.13.rst

solarforecastarbiter/metrics/probabilistic.py

wholmgren · 2022-02-11T15:34:32Z

solarforecastarbiter/metrics/probabilistic.py

+    # (n, d) ==> (n, d + 1)
+    fx_min = np.minimum(obs, fx[:, 0])
+    fx = np.hstack([fx_min[:, np.newaxis], fx])
+    fx_prob = np.hstack([np.zeros([n, 1]), fx_prob])


np.zeros or np.full(fx[:, 0])? Just double checking since I don't remember discussing this case in email.

np.zeros because we're assuming the left-hand side of the CDF goes to zero, i.e., that whether the fx_min is obs or fx[:, 0] the corresponding probability is zero. Also, if fx_min = fx[:, 0] but fx_prob[:, 0] != 0, this operation means the integration is done with a zero-width area and therefore doesn't change the CRPS result (fx[:, 1] - fx[:, 0] = 0 since now fx[:, 1] = fx[:, 0]).
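The zero-width argument can be checked directly with the lines from the diff. In the sketch below the input values are made up: the first sample has an observation inside the support and a non-zero first probability, and the second has an observation below the minimum forecast.

```python
import numpy as np

fx = np.array([[10.0, 20.0, 30.0],
               [10.0, 20.0, 30.0]])
fx_prob = np.array([[5.0, 50.0, 100.0],
                    [5.0, 50.0, 100.0]])
obs = np.array([25.0, 4.0])
n = fx.shape[0]

# Left-hand extension as in the diff: (n, d) ==> (n, d + 1).
fx_min = np.minimum(obs, fx[:, 0])
fx_ext = np.hstack([fx_min[:, np.newaxis], fx])
fx_prob_ext = np.hstack([np.zeros([n, 1]), fx_prob])

# Interval widths used by the integration.
widths = np.diff(fx_ext, axis=1)

# First sample: the new node duplicates fx[:, 0], so its interval has zero
# width and contributes nothing, whatever fx_prob[:, 0] is.
# Second sample: the new interval spans from obs up to fx[:, 0].
print(fx_ext)
print(widths)
```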

solarforecastarbiter/metrics/probabilistic.py

wholmgren · 2022-02-11T15:43:20Z

solarforecastarbiter/metrics/tests/test_calculator.py

+        4: ('season', 'crps', 'JJA', 21.41819),
+        6: ('month', 'crps', 'Aug', 21.41819),
+        8: ('hour', 'crps', '0', 28.103),
+        9: ('hour', 'crps', '1', 26.634375),


These are big changes! Perhaps that should not be surprising given that the percentiles are 10, 20, 30. Right?

solarforecastarbiter/metrics/tests/test_probabilistic.py

lboeman · 2022-02-11T16:11:54Z

> @lboeman let's give @dplarson a chance to address the comments and unless David says otherwise then go ahead and merge - I don't feel a need to look again.

Sounds good. Everything looks good to me. From what I can follow tests align well with the discussion of expected behavior. Thanks for taking care of this so quickly @dplarson.

dplarson · 2022-02-11T17:28:56Z

Thanks @wholmgren for your fast review. I've revised per your comments.

dplarson · 2022-02-11T19:03:17Z

@lboeman FYI I'm available the rest of the day if any other changes are needed before merging this PR.

lboeman · 2022-02-11T19:06:56Z

> @lboeman FYI I'm available the rest of the day if any other changes are needed before merging this PR.

I'll go ahead and merge and poke it in dev a bit! Thanks David.

dplarson added 2 commits February 1, 2022 16:59

tests for CPRS with obs outside forecast support (WIP)

3ce1750

dplarson added 6 commits February 10, 2022 22:45

Add tests for CRPS and CRPSS (skill score)

6ade7e5

Add additional tests for the CRPS and CRPSS (CRPS skill score) functions, including "simple" examples with 3 CDF intervals to help show the logic of the trapezoidal integration.

CRPS: allow numpy broadcasting with n >= 2

b37791d

Modify how the forecast CDF is expanded and the observation indicator function is calculated to support numpy broadcasting when the number of samples is greater than or equal to two (i.e., n >= 2).

tests/probabilistic: use np.array for obs

b94cbb2

Revert to assuming observations are provided as numpy arrays, including in the case where the observation is for a single sample. This matches the previous logic and helps prevent issues in other parts of the code base.

Add whatsnew/1.0.13.rst with CRPS revision

ab03c70

dplarson changed the title ~~WIP: make CRPS handle obs outside the fx support~~ Metrics: make CRPS handle obs outside the fx support Feb 11, 2022

dplarson marked this pull request as ready for review February 11, 2022 09:00

wholmgren approved these changes Feb 11, 2022

dplarson added 2 commits February 11, 2022 09:14

whatsnew: correct name

fa850ae

Address reviewer comments and suggested revisions.

3c10fe6

lboeman merged commit d4d85a3 into SolarArbiter:master Feb 11, 2022
