Implement additional methods for ensemble_percentile #1694

SarahG-579462 · 2024-03-27T21:41:09Z

Addressing a Problem?

At the moment, only the linear interpolation method (method 7 of Hyndman and Fan) is implemented for ensemble_percentiles.

Supposedly, method 8 is better for distribution-free variables, and method 9 is better for normally distributed variables. I would be interested in seeing the effects of choosing other interpolation methods on low-n variables (such as the freezing rain indices for portraits climatiques, which only include four models at the moment).

To do so, I would need other methods (at least methods 8 and 9) implemented for ensemble_percentiles.

Potential Solution

Implement methods 8 and 9 for ensemble_percentiles in xclim, and expose them through the function's API so that a method can be selected.

Additional context

Willing to contribute, but might take a while to get around to :)

Contribution

I would be willing/able to open a Pull Request to contribute this feature.

Code of Conduct

I agree to follow this project's Code of Conduct

aulemahal · 2024-03-27T21:46:29Z

AFAIU, those three methods differ on what they use as the $\alpha$ and $\beta$ coefficients. xclim's nan_quantile exposes those two, so the simplest PR would be to expose that through the layers up until ensemble_percentiles.
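As a minimal sketch of the point above, assuming the usual Hyndman and Fan plotting-position parameterization (methods 7, 8 and 9 use $\alpha = \beta = 1$, $1/3$ and $3/8$ respectively): the helper `hf_quantile` below is hypothetical and is not xclim's nan_quantile, and the four-member ensemble is made-up data. It is cross-checked against NumPy's `method` names (`"linear"`, `"median_unbiased"`, `"normal_unbiased"`, available in NumPy 1.22 and later).

```python
import numpy as np

# (alpha, beta) pairs for Hyndman & Fan methods 7, 8 and 9,
# keyed by the corresponding NumPy method name.
HF_PARAMS = {
    "linear": (1.0, 1.0),                   # method 7
    "median_unbiased": (1.0 / 3, 1.0 / 3),  # method 8
    "normal_unbiased": (3.0 / 8, 3.0 / 8),  # method 9
}


def hf_quantile(sample, q, alpha, beta):
    """Hypothetical helper: quantile of a 1-D sample without NaNs."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Zero-based "virtual index" into the sorted sample.
    h = n * q + alpha + q * (1.0 - alpha - beta) - 1.0
    # Clamp to the valid range, so extreme quantiles return the min or max.
    h = min(max(h, 0.0), n - 1.0)
    lo = int(np.floor(h))
    hi = min(lo + 1, n - 1)
    g = h - lo  # interpolation weight between the two neighbours
    return (1.0 - g) * x[lo] + g * x[hi]


# Made-up four-member ensemble, to mimic a low-n case.
ensemble = np.array([1.0, 2.0, 4.0, 8.0])

for name, (alpha, beta) in HF_PARAMS.items():
    values = [hf_quantile(ensemble, q, alpha, beta) for q in (0.1, 0.5, 0.9)]
    print(name, np.round(values, 3))
```

With so few members, the 10th and 90th percentiles are where the three methods differ most, since methods 8 and 9 reach the sample minimum and maximum sooner than method 7 does.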

(We discussed this on Slack, but I'm putting it here for everyone.) I believe, however, that xclim deserves a small performance check to see if our homemade nan_quantile really is still more performant than xarray's wrapping of numpy.

Considering recent additions to flox and xarray (pydata/xarray#8720), maybe ensemble_percentiles also deserves a look at its rechunking behaviour (i.e. maybe the usage of flox makes it unnecessary?).

SarahG-579462 · 2024-03-27T21:49:53Z

There are plans to eventually set up benchmarking for xclim (#1510). Your second point could be included in that, or it could serve as a small testbed for the benchmarks.

bzah · 2024-03-28T12:46:52Z

In case you don't want to start from scratch, I had a script to compare xclim's nan_quantile and numpy's nan_quantile on random samples, also testing how the performance changes with the number of NaNs in the sample: https://gist.github.com/bzah/2a84d050b8a1aed1b40a2ed1526e1f12
Also, numbagg added a performant nan_quantile not too long ago (numbagg/numbagg#166); it might be interesting to add it to the comparison.
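For readers who want a starting point without opening the gist, here is a rough sketch of that kind of comparison. It is not the script linked above: it only times NumPy's `np.nanquantile` for a few NaN fractions, and the function name `time_nanquantile`, the array shape and the fractions are arbitrary assumptions. Timing other implementations would mean swapping the call inside the lambda.

```python
import timeit

import numpy as np


def time_nanquantile(nan_fraction, shape=(200, 1000), q=0.9, repeats=3, seed=0):
    """Best-of-`repeats` wall time (seconds) of np.nanquantile along axis 0."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=shape)
    # Replace a random subset of the values with NaN.
    data[rng.random(shape) < nan_fraction] = np.nan
    timer = timeit.Timer(lambda: np.nanquantile(data, q, axis=0))
    return min(timer.repeat(repeat=repeats, number=1))


for frac in (0.0, 0.1, 0.5):
    print(f"NaN fraction {frac:.1f}: {time_nanquantile(frac):.4f} s")
```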

SarahG-579462 added the enhancement New feature or request label Mar 27, 2024

Zeitsperre mentioned this issue Apr 16, 2024

Investigate numbagg for performance with Quantiles #1707

Open

Zeitsperre assigned SarahG-579462 Apr 16, 2024

Zeitsperre added this to the v0.49.0 (PyCon LT) milestone Apr 16, 2024

Zeitsperre modified the milestones: v0.49.0 (PyCon LT), v0.50.0 Apr 25, 2024

huard mentioned this issue Apr 25, 2024

Add support for method argument in ensemble_percentile #1732

Merged

5 tasks

aulemahal closed this as completed in #1732 May 1, 2024

aulemahal closed this as completed in aed2058 May 1, 2024
