doc/source/whatsnew/v2.3.0.rst
35 lines changed: 0 additions & 35 deletions
@@ -31,39 +31,6 @@ Other enhancements
 - The :meth:`~Series.cumsum`, :meth:`~Series.cummin`, and :meth:`~Series.cummax` reductions are now implemented for :class:`StringDtype` columns (:issue:`60633`)
 - The :meth:`~Series.sum` reduction is now implemented for :class:`StringDtype` columns (:issue:`59853`)
-In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
-in determining the result dtype when there are different string dtypes compared. Some examples:
-
-- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
-
-.. _whatsnew_230.api_changes:
-
-API changes
-~~~~~~~~~~~
-
-- When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
-  union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
-  empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
-- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
-- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
 - Bug in :meth:`Series.__pos__` and :meth:`DataFrame.__pos__` where an ``Exception`` was not raised for :class:`StringDtype` with ``storage="pyarrow"`` (:issue:`60710`)
 - Bug in :meth:`Series.rank` for :class:`StringDtype` with ``storage="pyarrow"`` that incorrectly returned integer results with ``method="average"`` and raised an error if it would truncate results (:issue:`59768`)
 - Bug in :meth:`Series.replace` with :class:`StringDtype` when replacing with a non-string value was not upcasting to ``object`` dtype (:issue:`60282`)
+In previous versions, comparing :class:`Series` of different string dtypes (e.g. ``pd.StringDtype("pyarrow", na_value=pd.NA)`` against ``pd.StringDtype("python", na_value=np.nan)``) would result in inconsistent resulting dtype or incorrectly raise. pandas will now use the hierarchy
+in determining the result dtype when there are different string dtypes compared. Some examples:
+
+- When ``pd.StringDtype("pyarrow", na_value=pd.NA)`` is compared against any other string dtype, the result will always be ``boolean[pyarrow]``.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("pyarrow", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+- When ``pd.StringDtype("python", na_value=pd.NA)`` is compared against ``pd.StringDtype("python", na_value=np.nan)``, the result will be ``boolean``, the NumPy-backed nullable extension array.
+
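Reviewer note: the three bullet points above can be read as a small lookup rule. The following is a minimal pure-Python sketch of that rule, not pandas' implementation. ``StringDtypeSpec`` and ``comparison_result_dtype`` are hypothetical names introduced only for illustration, and ``"NA"`` / ``"NaN"`` are stand-in labels for ``pd.NA`` and ``np.nan``. Pairs that the notes do not describe (for example two NaN-based dtypes) deliberately return ``None`` rather than guess.

```python
from typing import NamedTuple, Optional


class StringDtypeSpec(NamedTuple):
    """Hypothetical stand-in for pd.StringDtype(storage, na_value=...)."""

    storage: str   # "python" or "pyarrow"
    na_value: str  # "NA" stands for pd.NA, "NaN" stands for np.nan


PYARROW_NA = StringDtypeSpec("pyarrow", "NA")
PYTHON_NA = StringDtypeSpec("python", "NA")


def comparison_result_dtype(
    left: StringDtypeSpec, right: StringDtypeSpec
) -> Optional[str]:
    """Return the result dtype name stated in the release notes.

    Returns None when the notes do not cover the pair.
    """
    if left == right:
        # Same dtype on both sides: not a mixed-dtype comparison.
        return None
    pair = {left, right}
    # Bullet 1: (pyarrow, NA) against any other string dtype.
    if PYARROW_NA in pair:
        return "boolean[pyarrow]"
    # Bullets 2 and 3: (python, NA) against either NaN-based dtype.
    if PYTHON_NA in pair:
        return "boolean"
    # Both sides are NaN-based: not described in the bullets above.
    return None


print(comparison_result_dtype(PYARROW_NA, StringDtypeSpec("python", "NaN")))
print(comparison_result_dtype(PYTHON_NA, StringDtypeSpec("pyarrow", "NaN")))
```

The order of the two checks mirrors the precedence implied by the bullets: the ``(pyarrow, NA)`` variant wins over everything else, and the ``(python, NA)`` variant wins over both NaN-based variants.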
+.. _whatsnew_231.string_fixes.ignore_empty:
+
+Index set operations ignore empty RangeIndex and object dtype Index
+When enabling the ``future.infer_string`` option, :class:`Index` set operations (like
+union or intersection) will now ignore the dtype of an empty :class:`RangeIndex` or
+empty :class:`Index` with ``object`` dtype when determining the dtype of the resulting
+Index (:issue:`60797`).
+
+This ensures that combining such an empty Index with strings will infer the string dtype
+correctly, rather than defaulting to ``object`` dtype. For example:
+
+.. code-block:: python
+
+    >>> pd.options.future.infer_string = True
+    >>> df = pd.DataFrame()
+    >>> df.columns.dtype
+    dtype('int64')  # default RangeIndex for empty columns
+    >>> df["a"] = [1, 2, 3]
+    >>> df.columns.dtype
+    <StringDtype(na_value=nan)>  # new columns use string dtype instead of object dtype
+
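Reviewer note: the rule described above ("ignore the dtype of an empty RangeIndex or empty object-dtype Index") can be sketched as a filter applied before the result dtype is chosen. This is a simplified toy model in plain Python and does not reflect how pandas resolves dtypes internally. ``IndexInfo``, ``is_ignorable`` and ``set_op_result_dtype`` are hypothetical names, and the two fallbacks marked as assumptions in the comments are not taken from the release notes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical summary of one operand of a set operation."""

    kind: str   # "RangeIndex" or "Index"
    dtype: str  # e.g. "int64", "object", "str"
    size: int   # number of elements


def is_ignorable(op: IndexInfo) -> bool:
    # Empty RangeIndex, or empty Index with object dtype.
    return op.size == 0 and (op.kind == "RangeIndex" or op.dtype == "object")


def set_op_result_dtype(operands: List[IndexInfo]) -> str:
    """Pick a result dtype while skipping ignorable empty operands."""
    considered = [op for op in operands if not is_ignorable(op)]
    if not considered:
        # Assumption: with nothing left to decide on, keep the first dtype.
        return operands[0].dtype
    dtypes = {op.dtype for op in considered}
    if len(dtypes) == 1:
        return dtypes.pop()
    # Assumption: mismatched dtypes fall back to object in this toy model.
    return "object"


empty_columns = IndexInfo("RangeIndex", "int64", 0)  # like pd.DataFrame().columns
new_label = IndexInfo("Index", "str", 1)             # like the inferred label "a"
print(set_op_result_dtype([empty_columns, new_label]))
```

In this model the empty default columns no longer pull the result toward ``object``, which matches the ``df["a"] = [1, 2, 3]`` example in the diff above.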
+.. _whatsnew_231.string_fixes.bugs:
+
+Bug fixes
+^^^^^^^^^
+- Bug in :meth:`.DataFrameGroupBy.min`, :meth:`.DataFrameGroupBy.max`, :meth:`.Resampler.min`, :meth:`.Resampler.max` where all NA values of string dtype would return float instead of string dtype (:issue:`60810`)
+- Bug in :meth:`DataFrame.sum` with ``axis=1``, :meth:`.DataFrameGroupBy.sum` or :meth:`.SeriesGroupBy.sum` with ``skipna=True``, and :meth:`.Resampler.sum` with all NA values of :class:`StringDtype` resulted in ``0`` instead of the empty string ``""`` (:issue:`60229`)
+- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)
 
-Enhancements
-~~~~~~~~~~~~
--
 
 .. _whatsnew_231.regressions:
 
@@ -26,7 +72,7 @@ Fixed regressions
 
 Bug fixes
 ~~~~~~~~~
-- Fixed bug in :meth:`DataFrame.explode` and :meth:`Series.explode` where methods would fail with ``dtype="str"`` (:issue:`61623`)