BUG: Fix unpickling of string dtypes of legacy pandas versions #61770

Merged: 9 commits, Jul 7, 2025

Liam3851
Contributor

Liam3851 commented Jul 3, 2025

Liam3851 marked this pull request as ready for review July 3, 2025 23:09
Liam3851 changed the title from "Fix unpickling of string dtypes of legacy pandas versions" to "BUG: Fix unpickling of string dtypes of legacy pandas versions" Jul 4, 2025
jorisvandenbossche added the Bug, Strings (String extension data type and string data), and IO Pickle (read_pickle, to_pickle) labels Jul 4, 2025
jorisvandenbossche added this to the 2.3.1 milestone Jul 4, 2025
Member

jorisvandenbossche left a comment

@Liam3851 thanks a lot for the bug report and the fix!

Looks perfect, and thanks for adding legacy data (we should probably also add some data for 2.0-2.2).

Can you add a note in the doc/source/whatsnew/v2.3.1.rst file? We will want to backport this fix.

@Liam3851
Contributor Author

Liam3851 commented Jul 5, 2025

Thanks very much for the review @jorisvandenbossche, I've added pickles for 2.0-2.2 as extra checks and a whatsnew entry.

Member

jorisvandenbossche left a comment

Thanks!

jorisvandenbossche merged commit e5a1c10 into pandas-dev:main on Jul 7, 2025
43 of 44 checks passed
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jul 7, 2025
@@ -59,6 +59,7 @@ Bug fixes
- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
- Fixed bug in unpickling objects pickled in pandas versions pre-2.3.0 that used :class:`StringDtype` (:issue:`61763`).
Member

Thanks @Liam3851 for the PR.

For future reference, by convention the trailing period is normally omitted. There is no need for a follow-up, though, as it will probably be changed when the release notes are tidied just prior to release.
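
For readers unfamiliar with this kind of bug, here is a minimal sketch of one general way a class can stay loadable from pickles written by an older version of itself: a `__setstate__` method that supplies defaults for attributes the old pickle does not contain. This is an illustration only, not the change made in this PR. `DemoDtype`, its `_na_value` attribute, and the `"<NA>"` default are hypothetical stand-ins, and the "legacy" pickle is simulated rather than produced by an older pandas.

```python
import pickle

DEFAULT_NA = "<NA>"  # hypothetical default for the newer attribute


class DemoDtype:
    """Hypothetical dtype-like class; NOT the pandas implementation."""

    def __init__(self, storage="python", na_value=DEFAULT_NA):
        self.storage = storage
        self._na_value = na_value  # attribute assumed to be added in a newer version

    def __setstate__(self, state):
        # Pickles written by the older version carry no "_na_value" key,
        # so fall back to a default instead of assuming every key exists.
        self.storage = state.get("storage", "python")
        self._na_value = state.get("_na_value", DEFAULT_NA)


def make_legacy_pickle():
    # Simulate the old on-disk format: an instance whose state
    # holds only the attribute that existed in the older version.
    obj = DemoDtype.__new__(DemoDtype)
    obj.__dict__ = {"storage": "pyarrow"}
    return pickle.dumps(obj)


restored = pickle.loads(make_legacy_pickle())
print(restored.storage, restored._na_value)
```

Without the `__setstate__` fallback, the restored object would simply lack `_na_value`, and any later access to it would raise `AttributeError`. That is the usual symptom when a newer class reads an older pickle.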

Labels
Bug; IO Pickle (read_pickle, to_pickle); Strings (String extension data type and string data)
Development

Successfully merging this pull request may close these issues.

BUG: StringDtype objects from pandas <2.3.0 cannot be reliably unpickled in 2.3.0.
3 participants