Data Selection & Filtering
Selecting the right rows and columns is the first step in analyzing any dataset. Pandas gives you several powerful ways to do this.

Selecting Rows & Columns
Selecting Columns
df["column_name"]        # Single column (as Series)
df[["col1", "col2"]]     # Multiple columns (as DataFrame)
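The difference between single and double brackets is easiest to see by checking the returned types. The sketch below uses a small made-up DataFrame (the names and values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [24, 35, 41],
    "City": ["Delhi", "Mumbai", "Delhi"],
})

ages = df["Age"]              # single brackets -> Series
subset = df[["Name", "Age"]]  # list of names -> DataFrame
one_col_frame = df[["Age"]]   # a one-item list still yields a DataFrame

print(type(ages).__name__)    # Series
print(type(subset).__name__)  # DataFrame
print(one_col_frame.shape)    # (3, 1)
```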

Selecting Rows by Index
Use .loc[] (label-based) and .iloc[] (position-based):

df.loc[0]                # Row whose index label is 0
df.iloc[0]               # First row (by position)
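With the default 0, 1, 2, ... index the two calls return the same row, which can hide the difference. A minimal sketch, assuming a made-up DataFrame whose index labels are 10, 20, 30:

```python
import pandas as pd

# Hypothetical data with a non-default integer index, for illustration only
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [24, 35, 41]},
    index=[10, 20, 30],
)

by_label = df.loc[10]      # row whose index label is 10
by_position = df.iloc[0]   # first row, whatever its label is

print(by_label["Name"], by_position["Name"])  # Alice Alice

# No row carries the label 0, so the label-based lookup fails
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False
print(label_zero_found)  # False
```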

Select Specific Rows and Columns
df.loc[0, "Name"]        # Value at row label 0, column 'Name'
df.iloc[0, 1]            # Value at row position 0, column position 1

You can also slice:

df.loc[0:2, ["Name", "Age"]]   # Rows labelled 0 through 2 (end label included), selected columns
df.iloc[0:2, 0:2]              # First two rows and first two columns (end position excluded)
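A common surprise is that .loc slices include the end label while .iloc slices stop before the end position, like ordinary Python slicing. The sketch below, on made-up data, shows the resulting row counts:

```python
import pandas as pd

# Hypothetical sample data with the default 0..3 index
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dina"],
    "Age": [24, 35, 41, 22],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

label_slice = df.loc[0:2, ["Name", "Age"]]  # labels 0, 1 and 2
position_slice = df.iloc[0:2, 0:2]          # positions 0 and 1 only

print(label_slice.index.tolist())       # [0, 1, 2]
print(position_slice.index.tolist())    # [0, 1]
print(position_slice.columns.tolist())  # ['Name', 'Age']
```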

Fast Access: .at and .iat
These are optimized for single element access:

df.at[0, "Name"]       # Fast label-based access
df.iat[0, 1]           # Fast position-based access
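Both accessors can also assign a single value, not only read one. A short sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [24, 35, 41],
})

name = df.at[0, "Name"]  # read by row label and column label
df.at[1, "Age"] = 36     # write a single cell by label
age = df.iat[1, 1]       # read the same cell by position

print(name)  # Alice
print(age)   # 36
```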

Filtering with Conditions
Simple Condition
df[df["Age"] > 30]

Multiple Conditions (AND / OR)
df[(df["Age"] > 25) & (df["City"] == "Delhi")]
df[(df["Name"] == "Bob") | (df["Age"] < 30)]

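Wrap each condition in parentheses: & and | bind more tightly than comparison operators such as > and ==, so without parentheses the expression is evaluated in the wrong order.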
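One way to sidestep the precedence problem is to store each condition in a named boolean mask and combine the masks afterwards. This is a sketch on made-up data; the variable names are illustrative assumptions:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dina"],
    "Age": [24, 35, 41, 22],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

over_25 = df["Age"] > 25       # boolean Series, one value per row
in_delhi = df["City"] == "Delhi"

both = df[over_25 & in_delhi]        # AND
either = df[over_25 | in_delhi]      # OR
neither = df[~(over_25 | in_delhi)]  # NOT

print(both["Name"].tolist())     # ['Charlie']
print(either["Name"].tolist())   # ['Alice', 'Bob', 'Charlie']
print(neither["Name"].tolist())  # ['Dina']
```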

Querying with .query()
The .query() method filters DataFrame rows using a string expression. It is often more readable and concise than boolean indexing, and reads a little like a SQL WHERE clause:

df.query("Age > 25 and City == 'Delhi'")

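To choose the column at runtime, build the expression with an f-string (as written, this works only when the name is a valid Python identifier):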

col = "Age"
df.query(f"{col} > 25")
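If the column name might contain spaces, one option is to wrap the interpolated name in backticks and pass the threshold with @ rather than formatting it into the string. The helper below is a hypothetical sketch, not part of pandas, and the data is made up:

```python
import pandas as pd

# Hypothetical sample data; one column name contains spaces
df = pd.DataFrame({
    "Age": [24, 35, 41],
    "Years of experience": [2, 10, 18],
})

def filter_greater_than(frame, col, threshold):
    """Hypothetical helper: keep rows where column `col` exceeds `threshold`."""
    # Backticks keep the expression valid even if the name has spaces;
    # @threshold reads the local Python variable.
    return frame.query(f"`{col}` > @threshold")

older = filter_greater_than(df, "Age", 30)
experienced = filter_greater_than(df, "Years of experience", 5)

print(older["Age"].tolist())                        # [35, 41]
print(experienced["Years of experience"].tolist())  # [10, 18]
```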

Here are the main rules and tips for using .query() in pandas:

1. Column names become variables
You can reference column names directly in the query string (the examples in this list assume lowercase column names such as age and city):

df.query("age > 25 and city == 'Delhi'")

2. String values must be in quotes
Use single or double quotes around strings in the expression:

df.query("name == 'Harry'")

If the expression itself is wrapped in single quotes, use double quotes around the string value (or the other way round):

df.query('city == "Mumbai"')

3. Use backticks for column names with spaces or special characters
If a column name has spaces, use backticks (`):

df.query("`first name` == 'Alice'")

4. You can use @ to reference Python variables
To pass external variables into .query():

age_limit = 30
df.query("age > @age_limit")
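The @ prefix also works with lists, which pairs well with the in operator. A minimal sketch on made-up data (age_limit and cities are illustrative variables):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dina"],
    "age": [24, 35, 41, 22],
    "city": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

age_limit = 30
cities = ["Delhi", "Pune"]

older = df.query("age > @age_limit")
selected = df.query("city in @cities and age < @age_limit")

print(older["name"].tolist())     # ['Bob', 'Charlie']
print(selected["name"].tolist())  # ['Alice', 'Dina']
```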

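5. Logical operators
Use these:

and, or, not (preferred for readability; &, |, ~ are also accepted)
==, !=, <, >, <=, >=
Inside .query(), & and | take the precedence of and and or, so unlike boolean indexing no extra parentheses are needed. Both lines below return the same rows; the word operators are usually easier to read.

Works, but harder to read:

df.query("age > 30 & city == 'Delhi'")

Preferred:

df.query("age > 30 and city == 'Delhi'")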
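To see how a query expression relates to boolean indexing, the sketch below runs the same filter both ways on made-up data and checks that the results match:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dina"],
    "age": [24, 35, 41, 22],
    "city": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

via_query = df.query("age > 30 and city == 'Delhi'")
via_mask = df[(df["age"] > 30) & (df["city"] == "Delhi")]

print(via_query["name"].tolist())  # ['Charlie']
print(via_query.equals(via_mask))  # True
```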

6. Chained comparisons
Just like Python:

df.query("25 < age <= 40")
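A chained comparison is shorthand for two conditions joined with and. The sketch below, on made-up ages chosen to sit on the boundaries, compares it with the explicit boolean-indexing form:

```python
import pandas as pd

# Hypothetical ages, including the boundary values 25 and 40
df = pd.DataFrame({"age": [24, 25, 35, 40, 41]})

chained = df.query("25 < age <= 40")
explicit = df[(df["age"] > 25) & (df["age"] <= 40)]

print(chained["age"].tolist())   # [35, 40]
print(chained.equals(explicit))  # True
```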

7. Avoid using reserved keywords as column names
If you have a column named class, lambda, etc., you’ll need to use backticks:

df.query("`class` == 'Physics'")

8. Case-sensitive
Column names and string values are case-sensitive:

df.query("City == 'delhi'")  # ❌ if actual value is 'Delhi'
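When the data mixes capitalizations, one common workaround is to normalize the column with .str.lower() before comparing. This sketch uses boolean indexing on made-up data:

```python
import pandas as pd

# Hypothetical data with inconsistent capitalization
df = pd.DataFrame({"City": ["Delhi", "delhi", "Mumbai", "DELHI"]})

exact = df.query("City == 'Delhi'")               # exact match only
any_case = df[df["City"].str.lower() == "delhi"]  # ignores case

print(len(exact))     # 1
print(len(any_case))  # 3
```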

9. .query() returns a copy, not a view
The result is a new DataFrame; the original is left unchanged unless you assign the result back to it (or pass inplace=True):

filtered = df.query("age < 50")
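The sketch below illustrates this on made-up data: the filtered result is edited while the original keeps all its rows and values. The explicit .copy() is an added precaution so the later assignment clearly targets an independent object.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
original = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "age": [24, 35, 61],
})

filtered = original.query("age < 50").copy()
filtered["age"] = filtered["age"] + 1  # edit the filtered result only

print(filtered["age"].tolist())  # [25, 36]
print(original["age"].tolist())  # [24, 35, 61]
```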

Summary
Use df[col], .loc[], .iloc[], .at[], .iat[] to access data
Filter with logical conditions or .query() for readable code
Mastering selection makes the rest of pandas feel easy

In [None]:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(1,10,(3,3)))
print(df)
df.index = ["R1","R2","R3"]
df.columns = ["C1","C2","C3"]
print("\n",df)
print("\nSelecting columns:-")
print(df["C2"])
print(df[["C1","C3"]])

print("\nSelecting rows by index:-")
print(df.loc["R2"])
print(df.loc["R1","C2"])
print(df.iloc[1])

print("\nSingle element access:-")
print(df.at["R2","C3"])
# df.iat[] is the position-based counterpart, e.g. df.iat[1, 2]

print("\nFiltering with conditions:-")
Data = {
    "Name":["Sameer","Sophia","Aliya","Arhan","Rahul","Sam","Harry"]
}
df2 = pd.DataFrame(Data)  # pass the dict itself, not the string "Data"

   0  1  2
0  4  7  7
1  8  6  6
2  8  7  3

     C1  C2  C3
R1   4   7   7
R2   8   6   6
R3   8   7   3

Selecting columns:-
R1    7
R2    6
R3    7
Name: C2, dtype: int64
    C1  C3
R1   4   7
R2   8   6
R3   8   3

Selecting rows by index:-
C1    8
C2    6
C3    6
Name: R2, dtype: int64
7
C1    8
C2    6
C3    6
Name: R2, dtype: int64

Single element access:-


np.int64(6)

In [None]:
# Filtering with condition
# Simple Condition
df