Link to Medium blog post: https://towardsdatascience.com/how-to-filter-a-pandas-dataframe-in-3-minutes-b8bc4fd3443e

# How To Filter A Pandas Dataframe in 3 Minutes

In [4]:
import pandas as pd
import numpy as np

## Boolean Indexing

Boolean indexing evaluates a condition for every row and keeps only the rows where the result is True. An expression such as df['column'] == 'XY' creates a True/False Series with one entry per row.
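To make the mask itself visible, here is a minimal sketch. The `toy` frame and its values are hypothetical and serve only as an illustration; they are not part of the orders data used in the rest of this post.

```python
import pandas as pd

# Hypothetical toy data, used only to show what a boolean mask looks like.
toy = pd.DataFrame({'column': ['XY', 'AB', 'XY']})

# Comparing a column to a value yields one True/False entry per row.
mask = toy['column'] == 'XY'
print(mask.tolist())  # [True, False, True]

# Passing the mask to .loc keeps only the rows where it is True.
filtered = toy.loc[mask]
print(filtered.index.tolist())  # [0, 2]
```

Storing the mask in a variable first is optional, but it can make longer filters easier to read and reuse.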

Imagine you are on the set of The Simpsons. The characters are stars, and each of them is allowed to order a few things for the next season, paid for by the producers. Your job is to take the protagonists’ orders and forward the relevant data:

In [5]:
# Build the orders table: one row per order.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})
print(df)

       Items Customer  Amount  Costs
0        Car    Homer       0      0
1  Saxophone     Lisa       1      2
2     Curler    Marge       2      4
3        Car     Lisa       3      6
4  Slingshot     Bart       4      8
5       Duff    Homer       5     10


### Example 1 — Select rows with a specific value

Let’s dig up all the entries for Bart so we can forward them to his manager:

In [6]:
df.loc[df['Customer'] == 'Bart']

       Items Customer  Amount  Costs
4  Slingshot     Bart       4      8


### Example 2 — Select rows from a list

Word has got around on set that we are the data experts, and for marketing reasons the advertising partners want to know what the children in the show order:

In [7]:
kids = ['Lisa','Bart']
df.loc[df['Customer'].isin(kids)]

       Items Customer  Amount  Costs
1  Saxophone     Lisa       1      2
3        Car     Lisa       3      6
4  Slingshot     Bart       4      8


We’ll see saxophones, cars and slingshots in the TV commercials…
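As a rough sketch of what `isin` does here, the same selection can be spelled out as one comparison per name, joined with `|` (element-wise "or"). The variable names `via_isin` and `via_or` are illustrative only.

```python
import numpy as np
import pandas as pd

# Same orders table as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})

kids = ['Lisa', 'Bart']
via_isin = df.loc[df['Customer'].isin(kids)]

# Spelled-out version: one comparison per name, joined with | (or).
via_or = df.loc[(df['Customer'] == 'Lisa') | (df['Customer'] == 'Bart')]

print(via_isin.equals(via_or))  # True
```

With two names either form is readable; for longer lists, `isin` tends to be the more practical choice.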

### Example 3 — Combine multiple conditions


The Simpsons have to save money, too. The new rules are: 1) no cars and 2) no order of more than 3 items:

In [8]:
df.loc[(df['Items'] != 'Car') & (df['Amount'] <= 3)]

       Items Customer  Amount  Costs
1  Saxophone     Lisa       1      2
2     Curler    Marge       2      4


I hope Homer and Bart will stay with us and not leave the show in a rage…
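A note on the syntax of the combined filter: each condition sits in its own parentheses because `&` binds more tightly than comparisons such as `!=` and `<=`. The sketch below names the two masks separately and, as one possible alternative, expresses the same rules with `DataFrame.query`. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same orders table as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})

# Naming each mask avoids the need for parentheses in the combined filter.
no_car = df['Items'] != 'Car'
small_order = df['Amount'] <= 3
allowed = df.loc[no_car & small_order]

# Alternative: the same two rules written as a query string.
allowed_query = df.query("Items != 'Car' and Amount <= 3")

print(allowed.equals(allowed_query))  # True
```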



### Example 4 — Select all rows that do not appear in a list

Unfortunately, the stars of the series are not at all enthusiastic about these cuts, so the first sponsors are coming forward to help out:

In [9]:
happy_stars = ['Lisa','Marge']
df.loc[~df['Customer'].isin(happy_stars)]

       Items Customer  Amount  Costs
0        Car    Homer       0      0
4  Slingshot     Bart       4      8
5       Duff    Homer       5     10
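The `~` operator flips every entry of a boolean mask, so a mask and its negation split the table into two parts that do not overlap. A small sketch, with illustrative variable names:

```python
import numpy as np
import pandas as pd

# Same orders table as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})

happy_stars = ['Lisa', 'Marge']
in_list = df['Customer'].isin(happy_stars)

happy = df.loc[in_list]      # rows whose customer is in the list
unhappy = df.loc[~in_list]   # ~ inverts the mask: everyone else

# Together the two parts cover every row exactly once.
print(len(happy) + len(unhappy) == len(df))  # True
```

Note that Python's `not` keyword does not work element-wise on a Series; `~` is the operator to use for masks.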


## Positional Indexing

Sometimes you do not want to filter by a condition but to select rows of the DataFrame by their position. In that case, we slice with df.iloc to get the rows we want.

### Example 1 — Select the first rows of a dataframe

The new trainee in your department should not work directly with the whole data set; he only needs the first three entries:

In [10]:
df.iloc[0:3]

       Items Customer  Amount  Costs
0        Car    Homer       0      0
1  Saxophone     Lisa       1      2
2     Curler    Marge       2      4
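One detail that is easy to trip over: `iloc` slices by position and excludes the end, like a Python list, whereas `loc` slices by index label and includes the end. The sketch below contrasts the two on the same table; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same orders table as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})

by_position = df.iloc[0:3]  # positions 0, 1, 2 (end excluded)
by_label = df.loc[0:3]      # labels 0, 1, 2, 3 (end included)

print(len(by_position), len(by_label))  # 3 4
```

With the default integer index the labels happen to match the positions, which is why the difference only shows up at the end of the slice.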


### Example 2 — Select the last rows of a dataframe

You’ve got a lot of work to do, so you get another trainee. So that both trainees can work on their tasks independently, you now select the last three rows of the data set:

In [11]:
df.iloc[-3:]

       Items Customer  Amount  Costs
3        Car     Lisa       3      6
4  Slingshot     Bart       4      8
5       Duff    Homer       5     10
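For the common case of taking rows from the start or the end, `head` and `tail` are possible shorthands for the two slices used in this section. A brief sketch, with illustrative variable names, that also checks that the two trainees receive different rows:

```python
import numpy as np
import pandas as pd

# Same orders table as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'Items': 'Car Saxophone Curler Car Slingshot Duff'.split(),
                   'Customer': 'Homer Lisa Marge Lisa Bart Homer'.split(),
                   'Amount': np.arange(6),
                   'Costs': np.arange(6) * 2})

first_trainee = df.head(3)   # same rows as df.iloc[0:3]
second_trainee = df.tail(3)  # same rows as df.iloc[-3:]

print(first_trainee.equals(df.iloc[0:3]))   # True
print(second_trainee.equals(df.iloc[-3:]))  # True

# The two halves share no rows, so the trainees work independently.
overlap = first_trainee.index.intersection(second_trainee.index)
print(len(overlap))  # 0
```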
