Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate the work needed to finally implement the xarray data model throughout pysteps. #373

Closed
ladc opened this issue Jul 12, 2024 · 11 comments
Assignees
Labels
help wanted Extra attention is needed v2 Breaking changes wrt v1

Comments

@ladc
Copy link
Contributor

ladc commented Jul 12, 2024

Already since 2018 (see #12, #216 and https://gist.github.com/kmuehlbauer/645e42a53b30752230c08c20a9c964f9 ) there has been talk of moving to xarray. At the time, it was not yet mature, but now it would be very beneficial for the code performance, readability, maintainability, and usability.
This requires adaptations throughout the code, rigorous testing, potentially requiring the creation of new unit tests to ensure that all functions still function as expected.
More importantly, this would introduce many breaking changes so it's not a minor update; it would instead constitute a significant part of the much-anticipated pysteps v2 (#216). Your insights welcome! @dnerini @RubenImhoff

@ladc
Copy link
Contributor Author

ladc commented Jul 12, 2024

When we tackle this topic, we can create new issues to divide the work in more digestible bits.

@RubenImhoff
Copy link
Contributor

RubenImhoff commented Jul 12, 2024

And see also PR #240 and #254 for some inspiration and examples to get started.

@ladc
Copy link
Contributor Author

ladc commented Jul 16, 2024

@mats-knmi and Anshul @gjm174 will tackle it - Mats will start by rewriting all the importers which will break all the unit tests 👏 . He will then fix all the tests. The rewritten functions will break the rest of the code like dominoes and this will be addressed bit by bit.
Anshul focuses on profiling.

The current metadata properties are described here: https://pysteps.readthedocs.io/en/stable/pysteps_reference/io.html#pysteps-io-importers

@ladc
Copy link
Contributor Author

ladc commented Jul 17, 2024

@dnerini We are discussing how we should approach this. These changes will break a lot. @mats-knmi proposes to create a v2 branch after all, so these changes can still be reviewed and included incrementally into v2 without totally breaking the master banch. What do you think?

@dnerini
Copy link
Member

dnerini commented Jul 17, 2024

I'm happy with it, but maybe it's also worth looking at what was already done in the older pysteps-v2 branch?
I created a draft PR to have a closer look: #383

@ladc
Copy link
Contributor Author

ladc commented Jul 17, 2024

@dnerini Mats had a look at it and it has diverged significantly from master. He has done some similar work already but the main difference is that there's no legacy support in his current revised version. This would result in much cleaner code but would break compatibility with v1.

The current methods are rewritten to support xarray but only by exporting the required metadata from the dataset, processing and inserting it back into the data array.

Some room for improvement probably remains (using the xarray functionality directly) but most of the time it's unavoidable to simply extract the np arrays to do the processing in the various functions.

@dnerini
Copy link
Member

dnerini commented Jul 18, 2024

also, some work done in #383 is probably still relevant.

@ladc
Copy link
Contributor Author

ladc commented Jul 18, 2024

The gallery / examples also needs to be adapted, they serve as integration tests.
We will maintain a v2 branch in parallel to the master, but add the other improvements to the master branch.
CI should be set up for the v2 branch.
Any help is welcome! E.g. getting the nowcasting working with xarray.
In a later step we can add fancy features such as the accessors that have been proposed earlier.

@dnerini dnerini added the help wanted Extra attention is needed label Jul 18, 2024
@mats-knmi
Copy link
Contributor

I finshed the conversion of the transformation.py and conversion.py to xarray in the original branch I created (https://github.com/pySTEPS/pysteps/tree/use-xarray-as-data-model).

From now on I will consider that branch to be the "main" branch of the xarray conversion and open separate pull requests to this branch for every next module that I convert, the first of these pull requests can be found here: #391.

I invite everone who wants to help to follow this same way of working.

@ladc
Copy link
Contributor Author

ladc commented Jul 19, 2024

The evaluation has been done and the work has started!

@ladc ladc closed this as completed Jul 19, 2024
@ladc ladc added the v2 Breaking changes wrt v1 label Jul 19, 2024
@ladc
Copy link
Contributor Author

ladc commented Jul 19, 2024

Outcome: a new project was created by @mats-knmi
https://github.com/orgs/pySTEPS/projects/4
with a list of issues to be picked up - welcome to create any new issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed v2 Breaking changes wrt v1
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants