2 changes: 0 additions & 2 deletions doc/redirects.csv
@@ -358,7 +358,6 @@ generated/pandas.DataFrame.ewm,../reference/api/pandas.DataFrame.ewm
 generated/pandas.DataFrame.expanding,../reference/api/pandas.DataFrame.expanding
 generated/pandas.DataFrame.ffill,../reference/api/pandas.DataFrame.ffill
 generated/pandas.DataFrame.fillna,../reference/api/pandas.DataFrame.fillna
-generated/pandas.DataFrame.filter,../reference/api/pandas.DataFrame.filter
 generated/pandas.DataFrame.first,../reference/api/pandas.DataFrame.first
 generated/pandas.DataFrame.first_valid_index,../reference/api/pandas.DataFrame.first_valid_index
 generated/pandas.DataFrame.floordiv,../reference/api/pandas.DataFrame.floordiv
@@ -1023,7 +1022,6 @@ generated/pandas.Series.expanding,../reference/api/pandas.Series.expanding
 generated/pandas.Series.factorize,../reference/api/pandas.Series.factorize
 generated/pandas.Series.ffill,../reference/api/pandas.Series.ffill
 generated/pandas.Series.fillna,../reference/api/pandas.Series.fillna
-generated/pandas.Series.filter,../reference/api/pandas.Series.filter
 generated/pandas.Series.first,../reference/api/pandas.Series.first
 generated/pandas.Series.first_valid_index,../reference/api/pandas.Series.first_valid_index
 generated/pandas.Series.floordiv,../reference/api/pandas.Series.floordiv
2 changes: 1 addition & 1 deletion doc/source/reference/frame.rst
@@ -184,7 +184,6 @@ Reindexing / selection / label manipulation
    DataFrame.drop_duplicates
    DataFrame.duplicated
    DataFrame.equals
-   DataFrame.filter
    DataFrame.idxmax
    DataFrame.idxmin
    DataFrame.reindex
@@ -193,6 +192,7 @@ Reindexing / selection / label manipulation
    DataFrame.rename_axis
    DataFrame.reset_index
    DataFrame.sample
+   DataFrame.select
    DataFrame.set_axis
    DataFrame.set_index
    DataFrame.take
2 changes: 1 addition & 1 deletion doc/source/reference/series.rst
@@ -201,7 +201,7 @@ Reindexing / selection / label manipulation
    Series.mask
    Series.add_prefix
    Series.add_suffix
-   Series.filter
+   Series.select

 Missing data handling
 ---------------------
2 changes: 1 addition & 1 deletion doc/source/user_guide/gotchas.rst
@@ -165,7 +165,7 @@ Mutating with User Defined Function (UDF) methods

 This section applies to pandas methods that take a UDF. In particular, the methods
 :meth:`DataFrame.apply`, :meth:`DataFrame.aggregate`, :meth:`DataFrame.transform`, and
-:meth:`DataFrame.filter`.
+:meth:`DataFrame.select`.

 It is a general rule in programming that one should not mutate a container
 while it is being iterated over. Mutation will invalidate the iterator,
12 changes: 0 additions & 12 deletions doc/source/user_guide/user_defined_functions.rst
@@ -84,8 +84,6 @@ User-Defined Functions can be applied across various pandas methods:
 +-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
 | :ref:`udf.pipe` | Series or DataFrame | Series or DataFrame | Chain functions together to apply to Series or DataFrame |
 +-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
-| :ref:`udf.filter` | Series or DataFrame | Boolean | Only accepts UDFs in group by. Function is called for each group, and the group is removed from the result if the function returns ``False`` |
-+-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
 | :ref:`udf.agg` | Series or DataFrame | Scalar or Series | Aggregates and summarizes values, e.g., sum or custom reducer |
 +-------------------------------+------------------------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------+
 | :ref:`udf.transform` (axis=0) | Column (Series) | Column (Series) | Same as :meth:`apply` with (axis=0), but it raises an exception if the function changes the shape of the data |
@@ -261,16 +259,6 @@ calling multiple functions.

 When to use: Use :meth:`pipe` when you need to create a pipeline of operations and want to keep the code readable and maintainable.

-.. _udf.filter:
-
-:meth:`Series.filter` and :meth:`DataFrame.filter`
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The ``filter`` method is used to select a subset of rows that match certain criteria.
-:meth:`Series.filter` and :meth:`DataFrame.filter` do not support user defined functions,
-but :meth:`SeriesGroupBy.filter` and :meth:`DataFrameGroupBy.filter` do. You can read more
-about ``filter`` in groupby operations in :ref:`groupby.filter`.
-
 .. _udf.agg:

 :meth:`Series.agg` and :meth:`DataFrame.agg`
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -715,6 +715,7 @@ Other Deprecations
 - Deprecated the ``arg`` parameter of ``Series.map``; pass the added ``func`` argument instead. (:issue:`61260`)
 - Deprecated using ``epoch`` date format in :meth:`DataFrame.to_json` and :meth:`Series.to_json`, use ``iso`` instead. (:issue:`57063`)
 - Deprecated allowing ``fill_value`` that cannot be held in the original dtype (excepting NA values for integer and bool dtypes) in :meth:`Series.unstack` and :meth:`DataFrame.unstack` (:issue:`12189`, :issue:`53868`)
+- Deprecated :meth:`Series.filter` and :meth:`DataFrame.filter`, renaming these to ``select`` (:issue:`26642`)
 - Deprecated allowing ``fill_value`` that cannot be held in the original dtype (excepting NA values for integer and bool dtypes) in :meth:`Series.shift` and :meth:`DataFrame.shift` (:issue:`53802`)
 - Deprecated option "future.no_silent_downcasting", as it is no longer used. In a future version accessing this option will raise (:issue:`59502`)
 - Deprecated slicing on a :class:`Series` or :class:`DataFrame` with a :class:`DatetimeIndex` using a ``datetime.date`` object, explicitly cast to :class:`Timestamp` instead (:issue:`35830`)
49 changes: 41 additions & 8 deletions pandas/core/generic.py
@@ -5520,7 +5520,8 @@ def _reindex_with_indexers(
             self
         )

-    def filter(
+    @final
+    def select(
         self,
         items=None,
         like: str | None = None,
@@ -5530,9 +5531,9 @@ def filter(
         """
         Subset the DataFrame or Series according to the specified index labels.

-        For DataFrame, filter rows or columns depending on ``axis`` argument.
-        Note that this routine does not filter based on content.
-        The filter is applied to the labels of the index.
+        For DataFrame, select rows or columns depending on ``axis`` argument.
+        Note that this routine does not select based on content.
+        The selection is applied to the labels of the index.

         Parameters
         ----------
@@ -5551,7 +5552,7 @@ def filter(
         Returns
         -------
         Same type as caller
-            The filtered subset of the DataFrame or Series.
+            The selected subset of the DataFrame or Series.

         See Also
         --------
@@ -5579,22 +5580,54 @@ def filter(
         rabbit    4    5      6

         >>> # select columns by name
-        >>> df.filter(items=["one", "three"])
+        >>> df.select(items=["one", "three"])
                 one  three
         mouse     1      3
         rabbit    4      6

         >>> # select columns by regular expression
-        >>> df.filter(regex="e$", axis=1)
+        >>> df.select(regex="e$", axis=1)
                 one  three
         mouse     1      3
         rabbit    4      6

         >>> # select rows containing 'bbi'
-        >>> df.filter(like="bbi", axis=0)
+        >>> df.select(like="bbi", axis=0)
                 one  two  three
         rabbit    4    5      6
         """
+
+        return self._filter(items=items, like=like, regex=regex, axis=axis)
+
+    @final
+    def filter(
+        self,
+        items=None,
+        like: str | None = None,
+        regex: str | None = None,
+        axis: Axis | None = None,
+    ) -> Self:
+        """
+        Use obj.select instead.
+
+        .. deprecated:: 3.0.0
+        """
+        warnings.warn(
+            f"{type(self).__name__}.filter is deprecated and will be removed "
+            "in a future version. Use obj.select instead.",
+            Pandas4Warning,
+            stacklevel=find_stack_level(),
+        )
+        return self._filter(items=items, like=like, regex=regex, axis=axis)
+
+    @final
+    def _filter(
+        self,
+        items=None,
+        like: str | None = None,
+        regex: str | None = None,
+        axis: Axis | None = None,
+    ) -> Self:
         nkw = common.count_not_none(items, like, regex)
         if nkw > 1:
             raise TypeError(
4 changes: 2 additions & 2 deletions pandas/core/groupby/generic.py
@@ -756,7 +756,7 @@ def filter(self, func, dropna: bool = True, *args, **kwargs):

         See Also
         --------
-        Series.filter: Filter elements of ungrouped Series.
+        Series.select : Select elements of ungrouped Series.
         DataFrameGroupBy.filter : Filter elements from groups based on criterion.

         Notes
@@ -2380,7 +2380,7 @@ def filter(self, func, dropna: bool = True, *args, **kwargs) -> DataFrame:

         See Also
         --------
-        DataFrame.filter: Filter elements of ungrouped DataFrame.
+        DataFrame.select: Select elements of ungrouped DataFrame.
         SeriesGroupBy.filter : Filter elements from groups based on criterion.

         Notes
8 changes: 4 additions & 4 deletions pandas/tests/copy_view/test_methods.py
@@ -365,14 +365,14 @@ def test_select_dtypes():


 @pytest.mark.parametrize(
-    "filter_kwargs", [{"items": ["a"]}, {"like": "a"}, {"regex": "a"}]
+    "select_kwargs", [{"items": ["a"]}, {"like": "a"}, {"regex": "a"}]
 )
-def test_filter(filter_kwargs):
-    # Case: selecting columns using `filter()` returns a new dataframe
+def test_select(select_kwargs):
+    # Case: selecting columns using `select()` returns a new dataframe
     # + afterwards modifying the result
     df = DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [0.1, 0.2, 0.3]})
     df_orig = df.copy()
-    df2 = df.filter(**filter_kwargs)
+    df2 = df.select(**select_kwargs)
     assert np.shares_memory(get_array(df2, "a"), get_array(df, "a"))

     # mutating df2 triggers a copy-on-write for that column/block
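
For readers skimming the `pandas/core/generic.py` hunk: the rename follows a common deprecation-alias pattern, where the new public name and the old one both delegate to a shared private helper and only the old name warns. Below is a minimal sketch of that pattern using only the standard library. `LabelledRow` is a hypothetical toy class, not a pandas type. It uses `DeprecationWarning` with `stacklevel=2` as stand-ins for the `Pandas4Warning` and `find_stack_level()` used in the diff, and its "exactly one of `items` or `like`" check is a simplification, not the pandas validation logic.

```python
import warnings


class LabelledRow:
    """Toy labelled container used only to illustrate the pattern (not pandas)."""

    def __init__(self, data: dict) -> None:
        self.data = dict(data)

    def select(self, items=None, like=None):
        """New public name: subset entries by label."""
        return self._filter(items=items, like=like)

    def filter(self, items=None, like=None):
        """Deprecated alias: warn, then delegate to the same helper."""
        warnings.warn(
            f"{type(self).__name__}.filter is deprecated; use select instead.",
            DeprecationWarning,  # stand-in for the warning class in the diff
            stacklevel=2,  # point the warning at the caller's line
        )
        return self._filter(items=items, like=like)

    def _filter(self, items=None, like=None):
        # Shared implementation, so both names behave identically.
        if (items is None) == (like is None):
            raise TypeError("pass exactly one of `items` or `like`")
        if items is not None:
            keep = [label for label in self.data if label in items]
        else:
            keep = [label for label in self.data if like in label]
        return LabelledRow({label: self.data[label] for label in keep})


row = LabelledRow({"one": 1, "two": 2, "three": 3})
selected = row.select(items=["one", "three"])  # new name, no warning

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = row.filter(like="t")  # old name still works but warns
```

Keeping the logic in one private helper means the deprecated alias cannot drift from the new method while both exist, and removing the alias later is a self-contained deletion.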