This repository has been archived by the owner on Dec 31, 2022. It is now read-only.

Code lifecycle #6

Closed
grantmcdermott opened this issue Dec 23, 2022 · 4 comments

Comments

@grantmcdermott

grantmcdermott commented Dec 23, 2022

Hi @sorhawell. This looks awesome. I’m a big fan of the Python Polars implementation and I’m excited to see it ported to R. Thanks for all the work so far!

My question is about the code lifecycle of the project. If I look at your contribution example, the core translation work seems to involve a simple case of copying and pasting the relevant source code at that point in time (with some documentation and scaffolding adaptations). This manual approach is great because it facilitates easy contributions. But I'm not sure how it reconciles or tracks changes on the source code side. For example, say the underlying (Rust) source code for the cosine expression function changes, or the Python front-end gains another argument. Is there an automated way to recognize that the rpolars equivalent needs to be updated?

@sorhawell
Member

This is a very good question.

An excessive maintenance burden and an overestimation of user interest are definitely the biggest project risks, I think. My plan was to go full Leeroy Jenkins and hope that the balance between polars maturity and maintainer recruitment will make rpolars viable.

nodejs-polars was first developed within the main polars repo, but the update frequency was too high for a single maintainer to keep up with the roughly 20 active polars maintainers. Now nodejs-polars has its own update cycle and can afford to release only monthly or less often. See this discussion:
pola-rs/nodejs-polars#6

rpolars is developed in an external repo, so it depends on rust-polars releases, and I try to copy the relationship nodejs-polars has to polars. Py-polars sometimes uses parts of the rust-polars core that are not exposed in the public API, which works out of the box because it lives in the same repo. Like nodejs-polars, I also import a few functions from core. Long story short, I try to follow the example of nodejs-polars and see where it leads.

Rust-polars seems to get updated every second month or so, and rpolars will point to the newer version in a subsequent release. I have updated twice and it is actually the easier part: it breaks some tests and I fix them. I strive to independently reimplement any polars functionality in R to catch any bugs. In fact, I have also found 2-3 bugs in rust/py-polars this way.

Avoiding behaviour drift from py-polars, which is an aim of rpolars, is the part I don't have a full solution to yet. Unfortunately rust-polars does not contain all features, and many features are implemented as a mix of Python and Rust. There will be breaking changes and new features I might miss, and rpolars would carry the old behaviours for many versions until someone raises issues.

In my head I have the following ideas. I'd love to get more suggestions for how this could be addressed.

Solution A: The maintainers will actively read the py-polars release notes and write an update plan every time rpolars points to a newer version of rust-polars.

Solution B: Maintainers will write a behaviour-driven test suite that compares py-polars and rpolars output: DataFrames, logical plans and expression syntax trees. This could catch breaking changes.

Solution C: Maintainers will use fuzz tests and search through py-polars for new functions and new function parameters to find new behaviour. This could also catch newly added features.
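
As a rough illustration of the "search for new functions and new function parameters" part of Solution C, here is a minimal Python sketch that snapshots the public signatures of a namespace and diffs two snapshots. It is only a sketch under stated assumptions, not something rpolars does: `snapshot_signatures`, `diff_snapshots`, `ExprV1` and `ExprV2` are hypothetical names, and the two toy classes merely stand in for the same class in two py-polars releases (the `offset` parameter is invented for the demo).

```python
import inspect


def snapshot_signatures(namespace):
    """Map each public callable in `namespace` to its list of parameter names."""
    snapshot = {}
    for name, obj in vars(namespace).items():
        if name.startswith("_") or not callable(obj):
            continue
        try:
            params = list(inspect.signature(obj).parameters)
        except (TypeError, ValueError):
            continue  # some builtins expose no introspectable signature
        snapshot[name] = params
    return snapshot


def diff_snapshots(old, new):
    """Return (new function names, {function: parameters added since `old`})."""
    added_functions = sorted(set(new) - set(old))
    added_params = {}
    for name in set(old) & set(new):
        extra = [p for p in new[name] if p not in old[name]]
        if extra:
            added_params[name] = extra
    return added_functions, added_params


# Hypothetical stand-ins for one class in an older and a newer release.
class ExprV1:
    def cos(self): ...
    def head(self, n=10): ...


class ExprV2:
    def cos(self): ...
    def head(self, n=10, offset=0): ...  # invented extra argument
    def tail(self, n=10): ...            # invented new method


added_functions, added_params = diff_snapshots(
    snapshot_signatures(ExprV1), snapshot_signatures(ExprV2)
)
print("new functions:", added_functions)
print("new parameters:", added_params)
```

In practice one would presumably store a snapshot per py-polars release (for example as JSON) and review the diff whenever rpolars points to a newer rust-polars. Note that this only flags API surface changes; behaviour changes behind an unchanged signature would still need something like Solution B.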

@sorhawell
Member

@grantmcdermott

I have had a brief look at your tidy-polars project. I think it makes a lot of sense to build a dtplyr-style tidy R polars implementation, and it could potentially be the fastest way to reach R users. I'd happily collaborate on that in some form.

What are your thoughts on keeping up with the polars development?
Do you think the rpolars implementation has some issues which could be addressed better?

@sorhawell
Member

Your issue subsequently made ritchie46 invite rpolars to migrate to pola-rs to get some more attention. Hopefully that will one day help with the maintenance burden. However, rpolars/rpolars will soon be archived and this issue will be locked.

You're welcome to reopen the issue here if relevant, or join the pola-rs discord channel for casual discussions.

Many Thanks
Soren

@grantmcdermott
Author

Excellent!

(I've just logged on again after the Christmas break, and think this seems to be the best outcome for all concerned. Thanks again for all the hard work to date. I'll try to contribute as time allows in the coming months.)


2 participants