Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy for handling backwards incompatible API changes in upstream libraries #9420

Open
pavithraes opened this issue Aug 23, 2022 · 2 comments
Labels
discussion Discussing a topic with no specific actions yet needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. upstream

Comments

@pavithraes
Copy link
Member

Originally noted by @ian-r-rose and @jsignell in #9366 (review):

I think we should reconsider some of the policies around all of these backwards incompatibility checks, and probably phase some of them out.

It would be good to create a policy for how Dask should handle this kind of API change.

@pavithraes pavithraes added discussion Discussing a topic with no specific actions yet upstream labels Aug 23, 2022
@TomAugspurger
Copy link
Member

I think the things to consider here are

  1. The maintenance burden and code complexity of trying to support multiple versions of pandas. It can be quite complicated to correctly forward the appropriate keywords depending on the pandas version available at runtime. Supporting more versions means more maintenance.
  2. The benefit to the user of supporting multiple versions of pandas (Dask might be one part of a larger system, which might not be able to easily support newer versions of pandas). Supporting more versions means the latest version of dask can be used by more users.

In general, my goal when I was doing pandas compatibility was to match latest version of pandas (matching the method signatures and behavior, warnings and all). If a user calls a deprecated method in pandas, they should see a warning (ideally at before graph computation). And user shouldn't see warnings from Dask's internal usage of pandas.

We didn't have an official policy on when to bump the minimum required version of pandas. Roughly, it happened when there was a particularly tricky bit of compatibility code that didn't seem worth doing.

I've been out of the loop for a bit, so I'd be curious to hear from @jsignell about how things are going.

@jsignell
Copy link
Member

Thanks for writing this up Tom! I agree with the two contradictory goals.

In general, my goal when I was doing pandas compatibility was to match latest version of pandas (matching the method signatures and behavior, warnings and all).

This is really helpful context. I think we have (kind of inadvertently) moved to matching the user's version of pandas particularly when it comes little differences like changes to kwarg defaults. At the time this felt like the best thing to do, the principle being: "dask matches pandas", but the code branching can get a little wacky. So I think you are right that matching latest pandas is the way forward.

If a user calls a deprecated method in pandas, they should see a warning (ideally at before graph computation). And user shouldn't see warnings from Dask's internal usage of pandas.

These both align with what I've been trying to do, with the slight note that the warning probably depends on the user's pandas version. Based on the comment above it seems like we should feel free to warn regardless of pandas version.

We didn't have an official policy on when to bump the minimum required version of pandas. Roughly, it happened when there was a particularly tricky bit of compatibility code that didn't seem worth doing.

I think we are doing a lazy NEP29 on this now.

@github-actions github-actions bot added the needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. label Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussing a topic with no specific actions yet needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. upstream
Projects
None yet
Development

No branches or pull requests

3 participants