# Pandas Boolean Indexing

Boolean indexing filters rows using **conditions**. Each condition is evaluated for every row and produces a Series of True/False values, and only the rows marked True are kept.

In this notebook, you'll learn:
- How to filter rows using conditions  
- How to combine conditions with `&`, `|`, `~`  
- How to filter using string methods  
- How to write reusable boolean masks  


🟦 1. Create Sample Data

In [1]:
import pandas as pd

data = {
    "student_id": [101, 102, 103, 104, 105, 106],
    "name": ["Alice", "Bob", "Charlie", "David", "Eva", "Alex"],
    "major": ["Math", "Physics", "Math", "Biology", "Physics", "Math"],
    "math_score": [85, 70, 90, 60, 95, 88],
    "attendance": [90, 80, 95, 70, 98, 85]
}

df = pd.DataFrame(data)
df.set_index("student_id", inplace=True)
df

student_id,name,major,math_score,attendance
101,Alice,Math,85,90
102,Bob,Physics,70,80
103,Charlie,Math,90,95
104,David,Biology,60,70
105,Eva,Physics,95,98
106,Alex,Math,88,85


🟦 2. Filtering Rows Based on Conditions

In [2]:
# Students with math score above 85
df[df["math_score"] > 85]

student_id,name,major,math_score,attendance
103,Charlie,Math,90,95
105,Eva,Physics,95,98
106,Alex,Math,88,85


In [3]:
# Students with attendance below 85
df[df["attendance"] < 85]

student_id,name,major,math_score,attendance
102,Bob,Physics,70,80
104,David,Biology,60,70
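
To see why indexing with a condition works, it can help to inspect the condition on its own. The sketch below rebuilds a reduced copy of the sample data and stores the comparison in a variable; the names `mask` and `filtered` are illustrative choices, not part of the notebook above.

```python
import pandas as pd

# Reduced copy of the sample data (only the columns needed here).
df = pd.DataFrame(
    {
        "student_id": [101, 102, 103, 104, 105, 106],
        "name": ["Alice", "Bob", "Charlie", "David", "Eva", "Alex"],
        "math_score": [85, 70, 90, 60, 95, 88],
    }
).set_index("student_id")

# The comparison is evaluated element-wise and returns one bool per row.
mask = df["math_score"] > 85
print(mask.tolist())  # [False, False, True, False, True, True]
print(mask.dtype)     # bool

# Indexing with the mask keeps only the rows where it is True.
filtered = df[mask]
print(filtered.index.tolist())  # [103, 105, 106]
```

The mask shares the DataFrame's index, which is how pandas knows which rows each True/False value refers to.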


🟦 3. Using Logical Operations (&, |, ~)

3.1 AND Condition

In [4]:
df[(df["math_score"] > 80) & (df["attendance"] > 90)]

student_id,name,major,math_score,attendance
103,Charlie,Math,90,95
105,Eva,Physics,95,98


3.2 OR Condition

In [5]:
df[(df["math_score"] > 90) | (df["attendance"] > 95)]

student_id,name,major,math_score,attendance
105,Eva,Physics,95,98


3.3 NOT Condition

In [6]:
df[~(df["major"] == "Physics")]

student_id,name,major,math_score,attendance
101,Alice,Math,85,90
103,Charlie,Math,90,95
104,David,Biology,60,70
106,Alex,Math,88,85
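
Every example in this section wraps each comparison in parentheses. The reason is operator precedence: in Python, `&`, `|`, and `~` bind more tightly than comparison operators such as `>` and `==`. The following is a minimal sketch of what happens when the parentheses are left out, using a reduced copy of the sample data. The variables `raised` and `result` are illustrative only, and the exact error text may vary between pandas versions.

```python
import pandas as pd

# Reduced copy of the sample data (only the numeric columns).
df = pd.DataFrame(
    {
        "student_id": [101, 102, 103, 104, 105, 106],
        "math_score": [85, 70, 90, 60, 95, 88],
        "attendance": [90, 80, 95, 70, 98, 85],
    }
).set_index("student_id")

# Without parentheses, Python evaluates `80 & df["attendance"]` first and
# then treats the rest as a chained comparison, which needs a single bool.
try:
    df[df["math_score"] > 80 & df["attendance"] > 90]
    raised = False
except (ValueError, TypeError) as e:
    raised = True
    print("Error:", e)

# With parentheses, each comparison is evaluated before `&` combines them.
result = df[(df["math_score"] > 80) & (df["attendance"] > 90)]
print(result.index.tolist())  # [103, 105]
```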


🟦 4. Filtering with String Conditions

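The `.str` accessor provides vectorized string methods such as `startswith`, `contains`, and `endswith`. Each returns a boolean Series that you can use as a filter. These matches are case-sensitive by default.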

4.1 Names Starting with 'A'

In [7]:
df[df["name"].str.startswith("A")]

student_id,name,major,math_score,attendance
101,Alice,Math,85,90
106,Alex,Math,88,85


4.2 Names Containing "ar"

In [8]:
df[df["name"].str.contains("ar")]

student_id,name,major,math_score,attendance
103,Charlie,Math,90,95


4.3 Majors Ending With "ics"

In [11]:
df[df["major"].str.endswith("ics")]

student_id,name,major,math_score,attendance
102,Bob,Physics,70,80
105,Eva,Physics,95,98
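
The sample data has no missing names and consistent capitalization, so the string filters above work as written. Real data is often messier. As a hedged sketch, the example below uses a hypothetical `people` DataFrame (not part of this notebook) containing one missing name and one upper-case name, and passes the `case` and `na` parameters of `str.contains` to handle both.

```python
import pandas as pd

# Hypothetical data: one missing name and one upper-case name.
people = pd.DataFrame({"name": ["Alice", "Charlie", None, "ARTHUR"]})

# case=False ignores letter case; na=False treats missing values as "no match".
mask = people["name"].str.contains("ar", case=False, na=False)
print(mask.tolist())                  # [False, True, False, True]
print(people[mask]["name"].tolist())  # ['Charlie', 'ARTHUR']
```

Without `na=False`, the missing name would yield a missing value in the mask rather than False, and such a mask generally cannot be used for filtering.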


🟦 5. Reusable Boolean Masks

A boolean mask is a condition stored in a variable. Naming masks lets you reuse and combine them, which keeps complex filters clean and readable.


In [12]:
high_math = df["math_score"] > 85
good_attendance = df["attendance"] > 90

# Reuse masks
df[high_math & good_attendance]

student_id,name,major,math_score,attendance
103,Charlie,Math,90,95
105,Eva,Physics,95,98


In [13]:
# Select specific columns for the rows where the mask is True
df.loc[high_math, ["name", "math_score", "attendance"]]

student_id,name,math_score,attendance
103,Charlie,90,95
105,Eva,95,98
106,Alex,88,85


In [14]:
# Invert the mask: students with a math score of 85 or below
df[~high_math]

student_id,name,major,math_score,attendance
101,Alice,Math,85,90
102,Bob,Physics,70,80
104,David,Biology,60,70


In [15]:
# Math majors with math score above 85 and attendance above 88
df[(df["major"] == "Math") & (df["math_score"] > 85) & (df["attendance"] > 88)]

student_id,name,major,math_score,attendance
103,Charlie,Math,90,95
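
The three-condition filter in the last cell can also be written with named masks, which is one way to apply the idea of this section. The sketch below rebuilds the sample data so that it runs on its own. The names `is_math`, `solid_attendance`, and `combined` are illustrative and do not appear in the notebook. It also shows that summing a mask counts its True values.

```python
import pandas as pd

# Same sample data as in section 1.
df = pd.DataFrame(
    {
        "student_id": [101, 102, 103, 104, 105, 106],
        "name": ["Alice", "Bob", "Charlie", "David", "Eva", "Alex"],
        "major": ["Math", "Physics", "Math", "Biology", "Physics", "Math"],
        "math_score": [85, 70, 90, 60, 95, 88],
        "attendance": [90, 80, 95, 70, 98, 85],
    }
).set_index("student_id")

# One named mask per condition.
is_math = df["major"] == "Math"
high_math = df["math_score"] > 85
solid_attendance = df["attendance"] > 88

# Combine the named masks instead of repeating the raw conditions.
combined = is_math & high_math & solid_attendance
print(df[combined].index.tolist())  # [103]

# True counts as 1, so sum() gives the number of matching rows.
print(int(combined.sum()))   # 1
print(int(high_math.sum()))  # 3
```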


## ⚠ Why `and` / `or` don't work in pandas filtering

You **cannot** use Python's `and` / `or` with pandas Series.

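`and` and `or` expect a single True or False, but a Series contains many True/False values, not just one. Pandas cannot reduce the Series to one truth value, so it raises an error. Use `&` and `|` instead, and wrap each condition in parentheses.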


In [17]:
# This will FAIL:
try:
    df[df["math_score"] > 80 and df["attendance"] > 90]
except Exception as e:
    print("Error:", e)

# This will WORK:
df[(df["math_score"] > 80) & (df["attendance"] > 90)]

Error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


student_id,name,major,math_score,attendance
103,Charlie,Math,90,95
105,Eva,Physics,95,98
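
The error message points to `a.any()` and `a.all()`. These are the methods to use when you do want a single True or False from a Series, for example in an `if` statement. A minimal sketch, using a standalone Series of the same math scores (the names `scores` and `passed` are illustrative):

```python
import pandas as pd

scores = pd.Series([85, 70, 90, 60, 95, 88], name="math_score")
passed = scores > 80  # boolean Series, one value per student

# any(): True if at least one value is True.
print(passed.any())  # True

# all(): True only if every value is True.
print(passed.all())  # False

# A reduced value is a single bool, so it is safe in an if statement.
if passed.any():
    print("At least one student scored above 80")
```

This reduces the whole Series to one answer, which is different from filtering: `&` and `|` keep one True/False per row.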
