In [1]:
import pandas as pd

# Conditional Filtering

In [2]:
# Sample DataFrame
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000]
})

df

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
1,Amit,25,Mumbai,65000
2,Sara,23,Nashik,55000
3,Rohit,29,Pune,70000
4,Neha,20,Nagpur,48000


## 1. Basic conditional filtering

A boolean condition on a column selects the rows where that condition is true.  
To combine several conditions, see the next section.

In [3]:
df[df["Age"]>23]

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
3,Rohit,29,Pune,70000


In [4]:
df[df["Salary"] > 60000]

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
3,Rohit,29,Pune,70000


In [12]:
df[df["City"] == "Pune"]

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
3,Rohit,29,Pune,70000
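
The cells above work because the comparison inside the brackets produces a boolean Series, and indexing with that Series keeps the rows marked `True`. The sketch below makes that intermediate step visible. The names `mask`, `older` and `older_names` are illustrative and do not appear in the notebook.

```python
import pandas as pd

# Same sample data as the notebook's df
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000],
})

# The comparison alone returns a boolean Series aligned with df's index
mask = df["Age"] > 23
print(mask.tolist())  # [False, True, False, True, False]

# Indexing with the mask keeps only the rows where it is True
older = df[mask]

# .loc accepts the same mask and can select columns in the same step
older_names = df.loc[mask, ["Name", "Age"]]
print(older_names)
```

Storing the mask in a variable is optional, but it helps when the same condition is reused or combined with others.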


## 2. Multiple conditions filtering

`df[(condition1) & (condition2)]`  
This is the general syntax for combining conditions: use `&` for AND, `|` for OR, and `~` for NOT.  
* Put parentheses around each condition.

In [18]:
df[(df["Age"]>23) & (df["Salary"]>60000)] # With AND

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
3,Rohit,29,Pune,70000


In [19]:
df[(df["City"]=="Pune") | (df["City"]=="Mumbai")] # With OR

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
1,Amit,25,Mumbai,65000
3,Rohit,29,Pune,70000
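
The parentheses matter because `&` and `|` bind more tightly than comparison operators such as `>` in Python. The sketch below shows AND, OR and NOT on named masks, and what happens when the parentheses are left out. The variable names are illustrative, not from the notebook.

```python
import pandas as pd

# Same sample data as the notebook's df
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000],
})

age_mask = df["Age"] > 23
salary_mask = df["Salary"] > 60000

both = df[age_mask & salary_mask]        # AND: both conditions hold
either = df[age_mask | salary_mask]      # OR: at least one holds
neither = df[~(age_mask | salary_mask)]  # NOT: negate the combined mask

# Without parentheses, Python evaluates 23 & df["Salary"] first and then
# treats the rest as a chained comparison, which pandas rejects.
raised = False
try:
    df[df["Age"] > 23 & df["Salary"] > 60000]
except ValueError:
    raised = True
print("Unparenthesized version raised ValueError:", raised)
```

For the same reason, the Python keywords `and`, `or` and `not` cannot be used to combine Series; the element-wise operators `&`, `|` and `~` are needed.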


## 3. Filtering with `!=`

`!=` - Not equal to

In [20]:
df[df["City"] != "Pune"]

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
2,Sara,23,Nashik,55000
4,Neha,20,Nagpur,48000


## 4. Filtering with `.isin()`

Used to filter a single column against several values.  
`.isin(values)`  
The argument is a list (or other list-like) of values to match.  

`df[df["City"].isin(["Pune", "Nashik"])]` to include Pune and Nashik  
`df[~df["City"].isin(["Pune", "Nashik"])]` to exclude Pune and Nashik  

In [23]:
df[df["City"].isin(["Pune", "Nashik"])]

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
2,Sara,23,Nashik,55000
3,Rohit,29,Pune,70000


In [24]:
df[~df["City"].isin(["Pune", "Nashik"])]

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
4,Neha,20,Nagpur,48000
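
As a quick check of what `.isin()` does, the sketch below compares it with the equivalent chain of `==` conditions joined by `|`. The names `cities`, `in_mask` and `or_mask` are illustrative.

```python
import pandas as pd

# Same sample data as the notebook's df
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000],
})

cities = ["Pune", "Nashik"]

# One call instead of one == comparison per value
in_mask = df["City"].isin(cities)
or_mask = (df["City"] == "Pune") | (df["City"] == "Nashik")
print(in_mask.equals(or_mask))  # True

included = df[in_mask]    # rows whose City is in the list
excluded = df[~in_mask]   # ~ goes on the mask, inside the brackets
```

The `.isin()` form stays the same length however many values are in the list, which is the main reason to prefer it over a long OR chain.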


## 5. Filtering with String Methods (`.str`)

The `.str` accessor applies string methods to every value in a column, and the boolean result can be used as a filter.

In [36]:
# Name starts with "O" (case sensitive)
df[df["Name"].str.startswith("O")]

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000


In [37]:
# City name contains "N" (case sensitive)
df[df["City"].str.contains("N")]

Unnamed: 0,Name,Age,City,Salary
2,Sara,23,Nashik,55000
4,Neha,20,Nagpur,48000


In [38]:
# City name contains "N" or "n" (not case sensitive)
df[df["City"].str.contains("N", case=False)]

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
2,Sara,23,Nashik,55000
3,Rohit,29,Pune,70000
4,Neha,20,Nagpur,48000
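
The sample data has no missing values, but real columns often do. Depending on the pandas version and the column dtype, `.str.contains()` can return a missing value for a missing entry, and a mask containing missing values cannot be used for indexing. The sketch below uses the `na=False` argument to treat missing entries as non-matches. The Series `cities` is hypothetical data made up for this illustration.

```python
import pandas as pd

# Hypothetical column with one missing city (not the notebook's data)
cities = pd.Series(["Pune", "Nashik", None, "Nagpur"], name="City")

# na=False makes missing entries count as "no match",
# so the result is a clean True/False mask
safe_mask = cities.str.contains("N", na=False)

matches = cities[safe_mask]
print(matches.tolist())  # ['Nashik', 'Nagpur']
```

The same `na` argument is also accepted by `.str.startswith()` and `.str.endswith()`.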


## 6. Filtering using multiple columns

In [42]:
# People from Pune with age < 25
df[(df["City"]=="Pune") & (df["Age"]<25)]

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000


In [43]:
# People whose salary is strictly between 50k and 70k (both bounds excluded)
df[(df["Salary"]>50000) & (df["Salary"]<70000)]

Unnamed: 0,Name,Age,City,Salary
1,Amit,25,Mumbai,65000
2,Sara,23,Nashik,55000
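
A range check like the one above can also be written with `Series.between()`. The sketch below compares the two. Note that `between()` includes both bounds by default, so it returns more rows here than the strict `>` and `<` version. The variable names are illustrative.

```python
import pandas as pd

# Same sample data as the notebook's df
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000],
})

# Strict comparison: 50000 and 70000 themselves are excluded
strict = df[(df["Salary"] > 50000) & (df["Salary"] < 70000)]
print(strict["Name"].tolist())  # ['Amit', 'Sara']

# between() includes both bounds by default
inclusive = df[df["Salary"].between(50000, 70000)]
print(inclusive["Name"].tolist())  # ['Onkar', 'Amit', 'Sara', 'Rohit']
```

Recent pandas versions (1.3 and later) also accept an `inclusive` argument on `between()` to control which bounds are included.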


## 7. Using `.query()` - SQL-like filtering

`df.query(expression)`  
Write the expression in "" and use '' for string values inside it. Column names are used directly, without `df[...]`.  

In [44]:
df.query("Age>23 and Salary>65000")

Unnamed: 0,Name,Age,City,Salary
3,Rohit,29,Pune,70000


In [48]:
df.query("City == 'Pune'")

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
3,Rohit,29,Pune,70000


In [49]:
df.query("City in ['Pune', 'Mumbai']")

Unnamed: 0,Name,Age,City,Salary
0,Onkar,21,Pune,50000
1,Amit,25,Mumbai,65000
3,Rohit,29,Pune,70000
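
The queries above hard-code their values in the string. To use Python variables instead, `.query()` lets an expression refer to a local variable by prefixing its name with `@`. The sketch below assumes two illustrative variables, `min_age` and `cities`, which are not part of the notebook.

```python
import pandas as pd

# Same sample data as the notebook's df
df = pd.DataFrame({
    "Name": ["Onkar", "Amit", "Sara", "Rohit", "Neha"],
    "Age": [21, 25, 23, 29, 20],
    "City": ["Pune", "Mumbai", "Nashik", "Pune", "Nagpur"],
    "Salary": [50000, 65000, 55000, 70000, 48000],
})

min_age = 23
cities = ["Pune", "Mumbai"]

# @name looks up a Python variable from the surrounding scope
result = df.query("Age > @min_age and City in @cities")
print(result["Name"].tolist())  # ['Amit', 'Rohit']

# The same filter written with boolean masks, for comparison
same = df[(df["Age"] > min_age) & (df["City"].isin(cities))]
print(result.equals(same))  # True
```

Both forms select the same rows; which one reads better is largely a matter of taste and of how long the condition is.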


# Summary
| Task | Method |
| - | - |
| Simple condition | `df[df["Age"]>23]` |
| AND | `df[(condition1) & (condition2)]` |
| OR | `df[(condition1) \| (condition2)]` |
| NOT | `df[~(condition)]` |
| Filter by List | `df[df["City"].isin(list)]` |
| Filter by string pattern | `df[df["Name"].str.method()]` |
| SQL-style filtering | `df.query("...")` |