Implement additional methods for ensemble_percentile #1694

Closed
2 tasks done
SarahG-579462 opened this issue Mar 27, 2024 · 3 comments · Fixed by #1732

@SarahG-579462
Contributor

Addressing a Problem?

At the moment, only method 7 of Hyndman and Fan (linear interpolation) is implemented for ensemble_percentiles.

Supposedly, method 8 is better for distribution-free variables, and method 9 is better for normally distributed variables. I would be interested in seeing the effects of choosing other interpolation methods on low-n variables (such as the freezing rain indices for Portraits climatiques, which only include four models at the moment).

To do so, I would need other methods (at least methods 8 and 9) implemented for ensemble_percentiles.
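To give a rough idea of how large the effect can be at low n, here is a minimal sketch using NumPy directly rather than xclim. It assumes NumPy 1.22 or later, where `np.percentile` accepts `method="linear"`, `"median_unbiased"` and `"normal_unbiased"` for Hyndman and Fan methods 7, 8 and 9. The four member values are made up for illustration.

```python
import numpy as np

# Hypothetical four-member ensemble of some index (values invented for illustration).
members = np.array([2.0, 5.0, 9.0, 14.0])
percentiles = [10, 50, 90]

# NumPy's names for Hyndman and Fan methods 7, 8 and 9.
methods = {7: "linear", 8: "median_unbiased", 9: "normal_unbiased"}

# One array of percentile values per method.
results = {
    number: np.percentile(members, percentiles, method=name)
    for number, name in methods.items()
}

for number, values in results.items():
    print(f"method {number}: {np.round(values, 3)}")
```

In this example the three methods agree on the median, while methods 8 and 9 return the ensemble minimum and maximum for the 10th and 90th percentiles, because their interpolation position falls outside the range of the four sorted members.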

Potential Solution

Implement methods 8 and 9 for ensemble_percentiles in xclim, and expose a parameter in this function's API to allow selecting the method.

Additional context

Willing to contribute, but might take a while to get around to :)

Contribution

  • I would be willing/able to open a Pull Request to contribute this feature.

Code of Conduct

  • I agree to follow this project's Code of Conduct
@SarahG-579462 SarahG-579462 added the enhancement New feature or request label Mar 27, 2024
@aulemahal
Collaborator

AFAIU, those three methods differ only in the values they use for the $\alpha$ and $\beta$ coefficients. xclim's nan_quantile already exposes those two, so the simplest PR would be to pass them through the intermediate layers up to ensemble_percentiles.
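For reference, the role of the two coefficients can be sketched in plain NumPy. This is not xclim's nan_quantile; `hf_quantile` and `HF_COEFFICIENTS` are hypothetical names used only for illustration. The sketch assumes the usual Hyndman and Fan parametrisation, in which methods 7, 8 and 9 correspond to $\alpha = \beta = 1$, $1/3$ and $3/8$ respectively.

```python
import numpy as np

# (alpha, beta) pairs for Hyndman and Fan methods 7, 8 and 9.
HF_COEFFICIENTS = {7: (1.0, 1.0), 8: (1 / 3, 1 / 3), 9: (3 / 8, 3 / 8)}


def hf_quantile(sample, q, alpha, beta):
    """Quantile q (between 0 and 1) of a 1-D sample, ignoring NaNs.

    Hypothetical helper: it only illustrates how alpha and beta enter
    the computation, and is not optimised or vectorised.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    x = x[~np.isnan(x)]  # drop the NaNs, which np.sort places last
    n = x.size
    if n == 0:
        return np.nan
    # Zero-based position in the sorted sample, possibly fractional.
    virtual = n * q + alpha + q * (1.0 - alpha - beta) - 1.0
    # Positions outside the sample fall back to the minimum or maximum.
    virtual = min(max(virtual, 0.0), n - 1.0)
    lower = int(np.floor(virtual))
    upper = min(lower + 1, n - 1)
    gamma = virtual - lower
    # Linear interpolation between the two neighbouring order statistics.
    return (1.0 - gamma) * x[lower] + gamma * x[upper]


sample = [3.0, np.nan, 1.0, 7.0, 4.0]
for number, (alpha, beta) in HF_COEFFICIENTS.items():
    print(number, hf_quantile(sample, 0.9, alpha, beta))
```

With $\alpha = \beta = 1$ the position reduces to $(n - 1) q$, which is the linear interpolation of method 7, so only the two coefficients need to change to obtain methods 8 and 9.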

(We discussed this on Slack, but I am putting it here for everyone.) I believe, however, that xclim deserves a small performance check to see whether our homemade nan_quantile really is still more performant than xarray's wrapping of numpy.

Considering recent additions to flox and xarray (pydata/xarray#8720), maybe ensemble_percentiles also deserves a look at its rechunking behaviour (i.e. maybe the usage of flox makes it unnecessary?).

@SarahG-579462
Contributor Author

There are plans to eventually have a benchmarking setup for xclim (#1510). Your second point could be included in that, or it could serve as a small testbed for the benchmarks.

@bzah
Contributor

bzah commented Mar 28, 2024

In case you don't want to start from scratch, I had a script to compare xclim's nan_quantile and numpy's nan_quantile on random samples, which also tests how the performance changes with the number of NaNs in the sample: https://gist.github.com/bzah/2a84d050b8a1aed1b40a2ed1526e1f12
Also, numbagg added a performant nan_quantile not too long ago (numbagg/numbagg#166); it might be interesting to add it to the comparison.
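As a rough idea of what such a comparison could look like (this is not the script from the gist), here is a minimal timing harness. `make_sample`, `time_quantile` and the `candidates` dictionary are hypothetical names. Only NumPy's `np.nanquantile` is registered here; the xclim and numbagg implementations would be added to `candidates` as further callables, which is left out because their exact signatures are not shown in this thread.

```python
import timeit

import numpy as np


def make_sample(shape, nan_fraction, seed=0):
    """Normally distributed random data with roughly `nan_fraction` of NaNs."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=shape)
    data[rng.random(shape) < nan_fraction] = np.nan
    return data


def time_quantile(func, data, repeats=3):
    """Best wall-clock time in seconds of one call to `func(data)`."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))


def numpy_nanquantile(data):
    # Quantiles along the last axis, which plays the role of the sample axis.
    return np.nanquantile(data, [0.1, 0.5, 0.9], axis=-1)


# Other implementations can be registered here under their own names.
candidates = {"numpy": numpy_nanquantile}

timings = {}
for nan_fraction in (0.0, 0.1, 0.5):
    data = make_sample((200, 50), nan_fraction)
    for name, func in candidates.items():
        timings[(name, nan_fraction)] = time_quantile(func, data)
        print(f"{name:>8} nan_fraction={nan_fraction:.1f}: "
              f"{timings[(name, nan_fraction)]:.5f} s")
```

The array shape and NaN fractions are arbitrary; a realistic benchmark would use sizes closer to actual ensemble datasets.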
