Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't expose EA.pad_or_backfill to users + switch to DeprecationWarning #54838

Merged
merged 6 commits into from
Aug 29, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/reference/extensions.rst
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ objects.
api.extensions.ExtensionArray._from_sequence
api.extensions.ExtensionArray._from_sequence_of_strings
api.extensions.ExtensionArray._hash_pandas_object
api.extensions.ExtensionArray._pad_or_backfill
api.extensions.ExtensionArray._reduce
api.extensions.ExtensionArray._values_for_argsort
api.extensions.ExtensionArray._values_for_factorize
Expand All @@ -54,7 +55,6 @@ objects.
api.extensions.ExtensionArray.interpolate
api.extensions.ExtensionArray.isin
api.extensions.ExtensionArray.isna
api.extensions.ExtensionArray.pad_or_backfill
api.extensions.ExtensionArray.ravel
api.extensions.ExtensionArray.repeat
api.extensions.ExtensionArray.searchsorted
Expand Down
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.1.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -579,7 +579,7 @@ Other Deprecations
- Deprecated positional indexing on :class:`Series` with :meth:`Series.__getitem__` and :meth:`Series.__setitem__`, in a future version ``ser[item]`` will *always* interpret ``item`` as a label, not a position (:issue:`50617`)
- Deprecated replacing builtin and NumPy functions in ``.agg``, ``.apply``, and ``.transform``; use the corresponding string alias (e.g. ``"sum"`` for ``sum`` or ``np.sum``) instead (:issue:`53425`)
- Deprecated strings ``T``, ``t``, ``L`` and ``l`` denoting units in :func:`to_timedelta` (:issue:`52536`)
- Deprecated the "method" and "limit" keywords in ``.ExtensionArray.fillna``, implement and use ``pad_or_backfill`` instead (:issue:`53621`)
- Deprecated the "method" and "limit" keywords in ``.ExtensionArray.fillna``, implement ``_pad_or_backfill`` instead (:issue:`53621`)
- Deprecated the ``method`` and ``limit`` keywords in :meth:`DataFrame.replace` and :meth:`Series.replace` (:issue:`33302`)
- Deprecated the ``method`` and ``limit`` keywords on :meth:`Series.fillna`, :meth:`DataFrame.fillna`, :meth:`.SeriesGroupBy.fillna`, :meth:`.DataFrameGroupBy.fillna`, and :meth:`.Resampler.fillna`, use ``obj.bfill()`` or ``obj.ffill()`` instead (:issue:`53394`)
- Deprecated the behavior of :meth:`Series.__getitem__`, :meth:`Series.__setitem__`, :meth:`DataFrame.__getitem__`, :meth:`DataFrame.__setitem__` with an integer slice on objects with a floating-dtype index, in a future version this will be treated as *positional* indexing (:issue:`49612`)
Expand Down
9 changes: 2 additions & 7 deletions pandas/core/arrays/_mixins.py
Original file line number Diff line number Diff line change
Expand Up @@ -296,13 +296,8 @@ def _fill_mask_inplace(
func = missing.get_fill_func(method, ndim=self.ndim)
func(self._ndarray.T, limit=limit, mask=mask.T)

def pad_or_backfill(
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill(
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
mask = self.isna()
if mask.any():
Expand Down
17 changes: 5 additions & 12 deletions pandas/core/arrays/arrow/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -924,19 +924,14 @@ def dropna(self) -> Self:
"""
return type(self)(pc.drop_null(self._pa_array))

def pad_or_backfill(
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill(
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
if not self._hasna:
# TODO(CoW): Not necessary anymore when CoW is the default
return self.copy()

if limit is not None and limit_area is not None:
if limit is None:
Comment on lines -939 to +934
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I am missing something, I think this was just wrong before. This code path doesn't use limit, so it should only be used for the cases it was not specified (and this is None).

It worked before as well because the fall back option (the base class implementation) also worked.
And it only started failing now because I removed the limit_area is not None, so the remaining check started to pass when specifying just limit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This change looks correct to me

method = missing.clean_fill_method(method)
try:
if method == "pad":
Expand All @@ -952,9 +947,7 @@ def pad_or_backfill(

# TODO(3.0): after EA.fillna 'method' deprecation is enforced, we can remove
# this method entirely.
return super().pad_or_backfill(
method=method, limit=limit, limit_area=limit_area, copy=copy
)
return super()._pad_or_backfill(method=method, limit=limit, copy=copy)

@doc(ExtensionArray.fillna)
def fillna(
Expand All @@ -974,7 +967,7 @@ def fillna(
return super().fillna(value=value, method=method, limit=limit, copy=copy)

if method is not None:
return super().pad_or_backfill(method=method, limit=limit, copy=copy)
return super().fillna(method=method, limit=limit, copy=copy)

if isinstance(value, (np.ndarray, ExtensionArray)):
# Similar to check_value_size, but we do not mask here since we may
Expand Down
32 changes: 13 additions & 19 deletions pandas/core/arrays/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -132,7 +132,6 @@ class ExtensionArray:
interpolate
isin
isna
pad_or_backfill
ravel
repeat
searchsorted
Expand All @@ -148,6 +147,7 @@ class ExtensionArray:
_from_sequence
_from_sequence_of_strings
_hash_pandas_object
_pad_or_backfill
_reduce
_values_for_argsort
_values_for_factorize
Expand Down Expand Up @@ -183,7 +183,7 @@ class ExtensionArray:
methods:

* fillna
* pad_or_backfill
* _pad_or_backfill
* dropna
* unique
* factorize / _values_for_factorize
Expand Down Expand Up @@ -918,13 +918,8 @@ def interpolate(
f"{type(self).__name__} does not implement interpolate"
)

def pad_or_backfill(
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill(
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
"""
Pad or backfill values, used by Series/DataFrame ffill and bfill.
Expand Down Expand Up @@ -960,26 +955,26 @@ def pad_or_backfill(
Examples
--------
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.pad_or_backfill(method="backfill", limit=1)
>>> arr._pad_or_backfill(method="backfill", limit=1)
<IntegerArray>
[<NA>, 2, 2, 3, <NA>, <NA>]
Length: 6, dtype: Int64
"""

# If a 3rd-party EA has implemented this functionality in fillna,
# we warn that they need to implement pad_or_backfill instead.
# we warn that they need to implement _pad_or_backfill instead.
if (
type(self).fillna is not ExtensionArray.fillna
and type(self).pad_or_backfill is ExtensionArray.pad_or_backfill
and type(self)._pad_or_backfill is ExtensionArray._pad_or_backfill
):
# Check for pad_or_backfill here allows us to call
# super().pad_or_backfill without getting this warning
# Check for _pad_or_backfill here allows us to call
# super()._pad_or_backfill without getting this warning
warnings.warn(
"ExtensionArray.fillna 'method' keyword is deprecated. "
"In a future version. arr.pad_or_backfill will be called "
"In a future version. arr._pad_or_backfill will be called "
"instead. 3rd-party ExtensionArray authors need to implement "
"pad_or_backfill.",
FutureWarning,
"_pad_or_backfill.",
DeprecationWarning,
stacklevel=find_stack_level(),
)
return self.fillna(method=method, limit=limit)
Expand Down Expand Up @@ -1063,8 +1058,7 @@ def fillna(
if method is not None:
warnings.warn(
f"The 'method' keyword in {type(self).__name__}.fillna is "
"deprecated and will be removed in a future version. "
"Use pad_or_backfill instead.",
"deprecated and will be removed in a future version.",
FutureWarning,
stacklevel=find_stack_level(),
)
Expand Down
13 changes: 3 additions & 10 deletions pandas/core/arrays/interval.py
Original file line number Diff line number Diff line change
Expand Up @@ -890,19 +890,12 @@ def max(self, *, axis: AxisInt | None = None, skipna: bool = True) -> IntervalOr
indexer = obj.argsort()[-1]
return obj[indexer]

def pad_or_backfill( # pylint: disable=useless-parent-delegation
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill( # pylint: disable=useless-parent-delegation
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
# TODO(3.0): after EA.fillna 'method' deprecation is enforced, we can remove
# this method entirely.
return super().pad_or_backfill(
method=method, limit=limit, limit_area=limit_area, copy=copy
)
return super()._pad_or_backfill(method=method, limit=limit, copy=copy)

def fillna(
self, value=None, method=None, limit: int | None = None, copy: bool = True
Expand Down
9 changes: 2 additions & 7 deletions pandas/core/arrays/masked.py
Original file line number Diff line number Diff line change
Expand Up @@ -190,13 +190,8 @@ def __getitem__(self, item: PositionalIndexer) -> Self | Any:

return self._simple_new(self._data[item], newmask)

def pad_or_backfill(
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill(
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
mask = self._mask

Expand Down
5 changes: 4 additions & 1 deletion pandas/core/arrays/numpy_.py
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,10 @@ def _values_for_factorize(self) -> tuple[np.ndarray, float | None]:
fv = np.nan
return self._ndarray, fv

def pad_or_backfill(
# Base EA class (and all other EA classes) don't have limit_area keyword
# This can be removed here as well when the interpolate ffill/bfill method
# deprecation is enforced
def _pad_or_backfill(
self,
*,
method: FillnaOptions,
Expand Down
13 changes: 3 additions & 10 deletions pandas/core/arrays/period.py
Original file line number Diff line number Diff line change
Expand Up @@ -791,20 +791,13 @@ def searchsorted(
m8arr = self._ndarray.view("M8[ns]")
return m8arr.searchsorted(npvalue, side=side, sorter=sorter)

def pad_or_backfill(
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill(
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
# view as dt64 so we get treated as timelike in core.missing,
# similar to dtl._period_dispatch
dta = self.view("M8[ns]")
result = dta.pad_or_backfill(
method=method, limit=limit, limit_area=limit_area, copy=copy
)
result = dta._pad_or_backfill(method=method, limit=limit, copy=copy)
if copy:
return cast("Self", result.view(self.dtype))
else:
Expand Down
13 changes: 3 additions & 10 deletions pandas/core/arrays/sparse/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -712,19 +712,12 @@ def isna(self):
mask[self.sp_index.indices] = isna(self.sp_values)
return type(self)(mask, fill_value=False, dtype=dtype)

def pad_or_backfill( # pylint: disable=useless-parent-delegation
self,
*,
method: FillnaOptions,
limit: int | None = None,
limit_area: Literal["inside", "outside"] | None = None,
copy: bool = True,
def _pad_or_backfill( # pylint: disable=useless-parent-delegation
self, *, method: FillnaOptions, limit: int | None = None, copy: bool = True
) -> Self:
# TODO(3.0): We can remove this method once deprecation for fillna method
# keyword is enforced.
return super().pad_or_backfill(
method=method, limit=limit, limit_area=limit_area, copy=copy
)
return super()._pad_or_backfill(method=method, limit=limit, copy=copy)

def fillna(
self,
Expand Down
6 changes: 3 additions & 3 deletions pandas/core/internals/blocks.py
Original file line number Diff line number Diff line change
Expand Up @@ -1451,7 +1451,7 @@ def pad_or_backfill(
vals = cast(NumpyExtensionArray, self.array_values)
if axis == 1:
vals = vals.T
new_values = vals.pad_or_backfill(
new_values = vals._pad_or_backfill(
method=method,
limit=limit,
limit_area=limit_area,
Expand Down Expand Up @@ -1948,9 +1948,9 @@ def pad_or_backfill(

if values.ndim == 2 and axis == 1:
# NDArrayBackedExtensionArray.fillna assumes axis=0
new_values = values.T.pad_or_backfill(method=method, limit=limit).T
new_values = values.T._pad_or_backfill(method=method, limit=limit).T
else:
new_values = values.pad_or_backfill(method=method, limit=limit)
new_values = values._pad_or_backfill(method=method, limit=limit)
return [self.make_block_same_class(new_values)]


Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/arrays/test_datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@ def test_fillna_method_doesnt_change_orig(self, method):

fill_value = arr[3] if method == "pad" else arr[5]

result = arr.pad_or_backfill(method=method)
result = arr._pad_or_backfill(method=method)
assert result[4] == fill_value

# check that the original was not changed
Expand Down
10 changes: 5 additions & 5 deletions pandas/tests/arrays/test_datetimes.py
Original file line number Diff line number Diff line change
Expand Up @@ -497,7 +497,7 @@ def test_fillna_preserves_tz(self, method):
dtype=DatetimeTZDtype(tz="US/Central"),
)

result = arr.pad_or_backfill(method=method)
result = arr._pad_or_backfill(method=method)
tm.assert_extension_array_equal(result, expected)

# assert that arr and dti were not modified in-place
Expand All @@ -510,12 +510,12 @@ def test_fillna_2d(self):
dta[0, 1] = pd.NaT
dta[1, 0] = pd.NaT

res1 = dta.pad_or_backfill(method="pad")
res1 = dta._pad_or_backfill(method="pad")
expected1 = dta.copy()
expected1[1, 0] = dta[0, 0]
tm.assert_extension_array_equal(res1, expected1)

res2 = dta.pad_or_backfill(method="backfill")
res2 = dta._pad_or_backfill(method="backfill")
expected2 = dta.copy()
expected2 = dta.copy()
expected2[1, 0] = dta[2, 0]
Expand All @@ -529,10 +529,10 @@ def test_fillna_2d(self):
assert not dta2._ndarray.flags["C_CONTIGUOUS"]
tm.assert_extension_array_equal(dta, dta2)

res3 = dta2.pad_or_backfill(method="pad")
res3 = dta2._pad_or_backfill(method="pad")
tm.assert_extension_array_equal(res3, expected1)

res4 = dta2.pad_or_backfill(method="backfill")
res4 = dta2._pad_or_backfill(method="backfill")
tm.assert_extension_array_equal(res4, expected2)

# test the DataFrame method while we're here
Expand Down
8 changes: 4 additions & 4 deletions pandas/tests/extension/base/dim2.py
Original file line number Diff line number Diff line change
Expand Up @@ -160,20 +160,20 @@ def test_fillna_2d_method(self, data_missing, method):
assert arr[0].isna().all()
assert not arr[1].isna().any()

result = arr.pad_or_backfill(method=method, limit=None)
result = arr._pad_or_backfill(method=method, limit=None)

expected = data_missing.pad_or_backfill(method=method).repeat(2).reshape(2, 2)
expected = data_missing._pad_or_backfill(method=method).repeat(2).reshape(2, 2)
tm.assert_extension_array_equal(result, expected)

# Reverse so that backfill is not a no-op.
arr2 = arr[::-1]
assert not arr2[0].isna().any()
assert arr2[1].isna().all()

result2 = arr2.pad_or_backfill(method=method, limit=None)
result2 = arr2._pad_or_backfill(method=method, limit=None)

expected2 = (
data_missing[::-1].pad_or_backfill(method=method).repeat(2).reshape(2, 2)
data_missing[::-1]._pad_or_backfill(method=method).repeat(2).reshape(2, 2)
)
tm.assert_extension_array_equal(result2, expected2)

Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/extension/base/missing.py
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ def test_fillna_no_op_returns_copy(self, data):
assert result is not data
tm.assert_extension_array_equal(result, data)

result = data.pad_or_backfill(method="backfill")
result = data._pad_or_backfill(method="backfill")
assert result is not data
tm.assert_extension_array_equal(result, data)

Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/extension/decimal/array.py
Original file line number Diff line number Diff line change
Expand Up @@ -285,7 +285,7 @@ def value_counts(self, dropna: bool = True):
return value_counts(self.to_numpy(), dropna=dropna)

# We override fillna here to simulate a 3rd party EA that has done so. This
# lets us test the deprecation telling authors to implement pad_or_backfill
# lets us test the deprecation telling authors to implement _pad_or_backfill
# Simulate a 3rd-party EA that has not yet updated to include a "copy"
# keyword in its fillna method.
# error: Signature of "fillna" incompatible with supertype "ExtensionArray"
Expand Down