# Filtering rows in Pandas `DataFrame`s
Here's how to select the rows of a Pandas `DataFrame` whose columns contain certain values. This is also called filtering by conditions.

In [1]:
import pandas as pd

In [2]:
baseball_df = pd.DataFrame({
    'City': ['Pittsburgh', 'Cincinnati', 'Chicago', 'St. Louis', 'Milwaukee'], 
    'Team': ['Pirates', 'Reds', 'Cubs', 'Cardinals', 'Brewers'], 
    'Division': 5 * ['Central'], 
    'League': 5 * ['NL'], 
    'Wins': [87, 79, 64, 59, 55]
})
baseball_df

Unnamed: 0,City,Team,Division,League,Wins
0,Pittsburgh,Pirates,Central,NL,87
1,Cincinnati,Reds,Central,NL,79
2,Chicago,Cubs,Central,NL,64
3,St. Louis,Cardinals,Central,NL,59
4,Milwaukee,Brewers,Central,NL,55


Filtering in Pandas works by using conditions. A condition such as `baseball_df.Wins > 65` is evaluated for every row and produces a boolean `Series` that holds `True` or `False` for each one.

In [4]:
wins = baseball_df.Wins > 65

In [5]:
wins

0     True
1     True
2    False
3    False
4    False
Name: Wins, dtype: bool

For this condition, the first 2 rows have more than 65 wins and the rest do not.

We can now use `.loc` to filter the dataframe to select just the rows for which this condition is `True`.

In [None]:
baseball_df.loc[baseball_df["Wins"] > 65]

Unnamed: 0,City,Team,Division,League,Wins
0,Pittsburgh,Pirates,Central,NL,87
1,Cincinnati,Reds,Central,NL,79

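The condition does not have to be written inline. As a minimal sketch (the `high_wins` name and the trimmed two-column table are only illustrative), the boolean `Series` can be stored in a variable and reused, and plain bracket indexing with that mask selects the same rows as `.loc`:

```python
import pandas as pd

# A smaller copy of the example data, rebuilt so this snippet runs on its own.
baseball_df = pd.DataFrame({
    "Team": ["Pirates", "Reds", "Cubs", "Cardinals", "Brewers"],
    "Wins": [87, 79, 64, 59, 55],
})

# Store the condition (a boolean Series) under an illustrative name.
high_wins = baseball_df["Wins"] > 65

# Both forms keep only the rows where the mask is True.
with_loc = baseball_df.loc[high_wins]
with_brackets = baseball_df[high_wins]

print(with_loc.equals(with_brackets))  # True
print(high_wins.sum())                 # 2 (True counts as 1)
```
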
You can also select which columns you want in the resulting dataframe by passing a list of column names as the second argument to `.loc`.

In [7]:
baseball_df.loc[baseball_df["Wins"] > 65, ["City", "Team", "Wins"]]

Unnamed: 0,City,Team,Wins
0,Pittsburgh,Pirates,87
1,Cincinnati,Reds,79

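The shape of the result depends on how the columns are named. The sketch below, which rebuilds a trimmed copy of the table so it can run on its own, suggests that a single column label gives back a `Series` while a one-element list gives back a `DataFrame`:

```python
import pandas as pd

# Trimmed copy of the example data so the snippet is self-contained.
baseball_df = pd.DataFrame({
    "Team": ["Pirates", "Reds", "Cubs", "Cardinals", "Brewers"],
    "Wins": [87, 79, 64, 59, 55],
})

mask = baseball_df["Wins"] > 65

# A bare column label returns a Series.
wins_series = baseball_df.loc[mask, "Wins"]

# A list of labels, even with one entry, returns a DataFrame.
wins_frame = baseball_df.loc[mask, ["Wins"]]

print(type(wins_series).__name__)  # Series
print(type(wins_frame).__name__)   # DataFrame
print(wins_series.mean())          # 83.0
```

The `Series` form is convenient when the next step is a calculation on one column, such as the mean above.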

You can also filter rows by checking a column for an exact string match.

In [9]:
baseball_df.loc[baseball_df.Team == "Brewers"]

Unnamed: 0,City,Team,Division,League,Wins
4,Milwaukee,Brewers,Central,NL,55


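To create a filter with multiple possible values (the OR operator), use `|` between conditions. Wrap each condition in parentheses, because `|` binds more tightly than comparison operators such as `==`.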

In [10]:
baseball_df.loc[(baseball_df.Team == "Brewers") | (baseball_df.Team == "Cardinals")]

Unnamed: 0,City,Team,Division,League,Wins
3,St. Louis,Cardinals,Central,NL,59
4,Milwaukee,Brewers,Central,NL,55


If you are checking for matches with many possible values, check out `.isin()`, which accepts a list of values to match.

In [11]:
baseball_df.loc[baseball_df.Team.isin(["Brewers", "Cardinals"])]

Unnamed: 0,City,Team,Division,League,Wins
3,St. Louis,Cardinals,Central,NL,59
4,Milwaukee,Brewers,Central,NL,55

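The same mask can be inverted with `~` to keep every row that is not in the list. This is a small sketch on a trimmed copy of the table; the `excluded` name is just for illustration:

```python
import pandas as pd

# Trimmed copy of the example data so the snippet is self-contained.
baseball_df = pd.DataFrame({
    "Team": ["Pirates", "Reds", "Cubs", "Cardinals", "Brewers"],
    "Wins": [87, 79, 64, 59, 55],
})

excluded = ["Brewers", "Cardinals"]

# ~ flips True and False, so this keeps the teams NOT in the list.
others = baseball_df.loc[~baseball_df["Team"].isin(excluded)]

print(others["Team"].tolist())  # ['Pirates', 'Reds', 'Cubs']
```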

Sometimes you want rows to match multiple conditions (the AND operator). To do that, use `&` between conditions, with each condition wrapped in parentheses.

In [13]:
baseball_df.loc[(baseball_df.Team == "Cardinals") & (baseball_df.Wins > 55)]

Unnamed: 0,City,Team,Division,League,Wins
3,St. Louis,Cardinals,Central,NL,59

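For longer combinations, `DataFrame.query()` is an alternative that some readers find easier to scan. It takes the condition as a string and lets you write `and` / `or` without the extra parentheses. The sketch below, on a trimmed copy of the table, compares it with the `&` version:

```python
import pandas as pd

# Trimmed copy of the example data so the snippet is self-contained.
baseball_df = pd.DataFrame({
    "Team": ["Pirates", "Reds", "Cubs", "Cardinals", "Brewers"],
    "Wins": [87, 79, 64, 59, 55],
})

# Boolean-mask version: each condition needs its own parentheses.
with_mask = baseball_df.loc[
    (baseball_df["Team"] == "Cardinals") & (baseball_df["Wins"] > 55)
]

# query() version: column names are used directly inside the string.
with_query = baseball_df.query('Team == "Cardinals" and Wins > 55')

print(with_mask.equals(with_query))  # True
```

Which form to use is mostly a matter of taste. The mask form works with any column name, while `query()` needs backticks around names that contain spaces.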

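There are many useful functions for matching strings in Pandas, available through the `.str` accessor. For example, `.str.contains()` keeps the rows whose value contains a given substring. Check the [string function documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html#string-methods) for more information.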

In [14]:
baseball_df.loc[baseball_df.City.str.contains("Pitt")]

Unnamed: 0,City,Team,Division,League,Wins
0,Pittsburgh,Pirates,Central,NL,87

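A few optional arguments are worth knowing when matching strings. The sketch below uses a small illustrative table (`cities_df`, with one deliberately missing city that is not in the original data) to show case-insensitive matching, literal matching, and handling of missing values:

```python
import pandas as pd

# Illustrative data: the last city is missing on purpose
# (it is not part of the original table).
cities_df = pd.DataFrame({
    "City": ["Pittsburgh", "Cincinnati", "Chicago", "St. Louis", None],
})

# case=False ignores letter case; na=False treats missing values as non-matches.
pitt = cities_df.loc[cities_df["City"].str.contains("pitt", case=False, na=False)]

# regex=False matches "St." literally instead of treating "." as a wildcard.
saint = cities_df.loc[cities_df["City"].str.contains("St.", regex=False, na=False)]

# startswith() also accepts na= for missing values.
c_cities = cities_df.loc[cities_df["City"].str.startswith("C", na=False)]

print(pitt["City"].tolist())      # ['Pittsburgh']
print(saint["City"].tolist())     # ['St. Louis']
print(c_cities["City"].tolist())  # ['Cincinnati', 'Chicago']
```

Passing `na=False` matters when a column has missing values, because a mask that contains missing entries cannot be used to select rows.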