Pandas 1.3.0 (but not 1.3.1) support on purpose? #77

Closed
Anthchirp opened this issue Aug 18, 2021 · 7 comments
Labels: MTZDtypes (Issues related to custom dtypes), question (Further information is requested), version (Version issues)

Comments

@Anthchirp
Contributor

f803979 added explicit support for version 1.3.0, but excluded patch releases such as 1.3.1.

Was that on purpose or should the expression read <1.4?
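To make the difference concrete, here is a minimal sketch of how the two upper bounds treat patch releases. It is a simplification rather than a full PEP 440 implementation: it compares numeric release segments only, and the helper names `parse` and `allowed` are hypothetical. The bounds mirror the requirement string quoted later in this thread.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.3.1' into (1, 3, 1), padding short versions such as '1.4' with zeros."""
    parts: List[int] = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def allowed(version: str, lower: str, upper: str, upper_inclusive: bool) -> bool:
    """Check lower <= version <= upper (or < upper when upper_inclusive is False)."""
    v = parse(version)
    if v < parse(lower):
        return False
    return v <= parse(upper) if upper_inclusive else v < parse(upper)


candidates = ["1.2.0", "1.3.0", "1.3.1", "1.3.2", "1.4.0"]

# Equivalent of "pandas >= 1.2.0, <= 1.3.0": patch releases after 1.3.0 are rejected.
pinned = [v for v in candidates if allowed(v, "1.2.0", "1.3.0", upper_inclusive=True)]

# Equivalent of "pandas >= 1.2.0, < 1.4": every 1.3.x patch release is accepted.
relaxed = [v for v in candidates if allowed(v, "1.2.0", "1.4", upper_inclusive=False)]

print(pinned)   # ['1.2.0', '1.3.0']
print(relaxed)  # ['1.2.0', '1.3.0', '1.3.1', '1.3.2']
```

With the inclusive `<= 1.3.0` bound, an environment that already has pandas 1.3.1 cannot satisfy the requirement, which is the situation this issue asks about.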

@kmdalton
Member

This is @JBGreisman's department, but I recall that pandas has been mucking around with custom dtypes lately and breaking everything. I would not be surprised if the very specific versioning requirement was just self-defense.

@kmdalton kmdalton added MTZDtypes Issues related to custom dtypes question Further information is requested labels Aug 18, 2021
@JBGreisman JBGreisman added the version Version issues label Aug 18, 2021
@JBGreisman
Member

This was mostly done in self defense as @kmdalton suggested. There are two places that rs has historically run into issues during pandas updates:

The first is with attributes for our subclassed DataFrame (rs.DataSet). In the olde days of this project, there were a few pandas operations that would lead to loss of the subclass, and therefore loss of the cell and spacegroup attributes. That has largely stabilized in the last couple of years and hasn't caused trouble lately.

The second place is with our ExtensionDtypes that are used to implement all the MTZ datatypes. Pandas adjusts that API a bit more regularly, and also sometimes changes behavior in small ways during patch releases. I had set a maximal version to protect against short-term issues, but I think that was the wrong approach to this problem.

I think it makes the most sense to have a GitHub Action that tests the build on a set schedule to detect possible issues with any latest pandas version. This sort of "detect early" strategy is a bit more flexible, and should allow us to keep up to date with pandas in a more seamless manner.

@JBGreisman
Member

JBGreisman commented Aug 18, 2021

To do list:

  • Extend supported version to latest pandas (1.3.2)
  • Add renovate to repo to automatically test against dependency updates

@Anthchirp
Contributor Author

We use renovate to track and test against dependency updates, e.g. DiamondLightSource/python-workflows#65

@JBGreisman
Member

Awesome, good to hear. I had been looking into that as a possible solution for this.

@JBGreisman
Member

I tested renovate in my personal fork and it seems to accomplish this task exactly as I hoped. I tested it by reverting the pandas version to "pandas >= 1.2.0, <= 1.3.0", and renovate filed a PR to change the line to read "pandas >= 1.2.0, <= 1.3.2".
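For illustration, the edit in that PR amounts to rewriting the upper bound of the requirement string. The sketch below reproduces that one-line change with a regular expression. It is not how renovate works internally, and the function name `bump_upper_bound` is hypothetical.

```python
import re


def bump_upper_bound(requirement: str, latest: str) -> str:
    """Replace the version that follows '<=' with `latest`, leaving the rest untouched."""
    pattern = r"(<=\s*)[0-9]+(?:\.[0-9]+)*"
    # A callable replacement avoids ambiguity between group references and version digits.
    return re.sub(pattern, lambda m: m.group(1) + latest, requirement, count=1)


old = "pandas >= 1.2.0, <= 1.3.0"
new = bump_upper_bound(old, "1.3.2")
print(new)  # pandas >= 1.2.0, <= 1.3.2
```

The lower bound is left alone because the pattern only matches the `<=` clause.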

@JBGreisman
Member

I'm closing this issue because renovate has now been added to the repo and appears to be working as intended.
