In [1]:
import pandas as pd

# Today we will learn dataframe slicing.
df = pd.read_csv("data.csv")
df

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9


### Selecting multiple columns

In [2]:
df[["Actor", "Film"]]

Unnamed: 0,Actor,Film
0,Shah Rukh Khan,Pathaan
1,Salman Khan,Tiger Zinda Hai
2,Aamir Khan,Dangal
3,Ranbir Kapoor,Brahmastra
4,Ranveer Singh,Padmaavat
5,Ayushmann Khurrana,Andhadhun
6,Rajkummar Rao,Stree
7,Hrithik Roshan,War
8,Akshay Kumar,Good Newwz
9,Kartik Aaryan,Bhool Bhulaiyaa 2
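
The difference between single and double brackets is easy to miss, so here is a minimal sketch. The `movies` frame is a hypothetical three-row sample copied from the table above, not the full `data.csv`.

```python
import pandas as pd

# Hypothetical three-row sample taken from the table above.
movies = pd.DataFrame({
    "Actor": ["Shah Rukh Khan", "Salman Khan", "Aamir Khan"],
    "Film": ["Pathaan", "Tiger Zinda Hai", "Dangal"],
    "Year": [2023, 2017, 2016],
})

one_col = movies["Actor"]              # a single name returns a Series
two_cols = movies[["Actor", "Film"]]   # a list of names returns a DataFrame
still_df = movies[["Actor"]]           # a one-element list is still a DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__)
print(still_df.shape)
```

The inner brackets are an ordinary Python list, which is why the column order in the result follows the order of the list.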


### Single element access

In [5]:
df.at[0, "Actor"]

'Shah Rukh Khan'

Position-based access with `.iat` (integer row and column positions):

In [7]:
df.iat[0,1]

'Pathaan'
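
With the default integer index, `.at` and `.iat` look almost interchangeable. The sketch below uses string row labels (an assumption made purely for illustration) to show that `.at` takes labels while `.iat` takes positions.

```python
import pandas as pd

# Hypothetical sample with string row labels instead of 0, 1, 2.
movies = pd.DataFrame(
    {
        "Actor": ["Shah Rukh Khan", "Salman Khan", "Aamir Khan"],
        "Film": ["Pathaan", "Tiger Zinda Hai", "Dangal"],
    },
    index=["a", "b", "c"],
)

by_label = movies.at["b", "Film"]   # row label "b", column label "Film"
by_position = movies.iat[1, 1]      # second row, second column

print(by_label, by_position)
```

Both calls reach the same cell here. Only the way the cell is addressed differs.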

### Filtering

Movies with IMDb rating greater than 5.0

In [11]:
df["IMDb"] > 5.0  # returns a boolean Series: True where the IMDb rating is above 5.0

0     True
1     True
2     True
3     True
4     True
5     True
6     True
7     True
8     True
9     True
10    True
11    True
Name: IMDb, dtype: bool

In [13]:
df[df["IMDb"] > 5.0]  # keeps only the rows where the condition is True

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
2,Aamir Khan,Dangal,2016,Biography,2024,8.4
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9


Shah Rukh Khan movies with an IMDb rating greater than 6

In [20]:
df[(df["IMDb"] > 6) & (df["Actor"] == "Shah Rukh Khan")]
# You can also select only the IMDb column of the matching rows. This returns a Series.
# df[(df["IMDb"] > 6) & (df["Actor"] == "Shah Rukh Khan")]["IMDb"]

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
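
The row filter and the column selection can also be combined in one `.loc` call rather than chaining two sets of brackets. A minimal sketch, assuming a hypothetical three-row sample of the table:

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Actor": ["Shah Rukh Khan", "Salman Khan", "Aamir Khan"],
    "Film": ["Pathaan", "Tiger Zinda Hai", "Dangal"],
    "IMDb": [7.2, 6.0, 8.4],
})

mask = (movies["IMDb"] > 6) & (movies["Actor"] == "Shah Rukh Khan")

ratings = movies.loc[mask, "IMDb"]           # one column label: Series
subset = movies.loc[mask, ["Film", "IMDb"]]  # list of labels: DataFrame

print(ratings)
print(subset)
```

Storing the mask in a variable keeps the condition readable and lets it be reused for several selections.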


Movies released after 2018 with a box office collection above INR 100 crore

In [17]:
df[(df["Year"] > 2018) & (df["BoxOffice(INR Crore)"] > 100)]  # Wrap each condition in parentheses and combine with & or |, not `and`/`or`.

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
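
The parentheses matter because `&` binds more tightly than `>` in Python. The following sketch, on a hypothetical four-row sample of the table, shows the three mask operators and what happens when the parentheses are left out.

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Film": ["Pathaan", "Dangal", "War", "Andhadhun"],
    "Year": [2023, 2016, 2019, 2018],
    "BoxOffice(INR Crore)": [1050, 2024, 475, 111],
})

recent = movies["Year"] > 2018
big_hit = movies["BoxOffice(INR Crore)"] > 1000

both = movies[recent & big_hit]     # AND: both conditions hold
either = movies[recent | big_hit]   # OR: at least one condition holds
not_recent = movies[~recent]        # NOT: inverts the mask

# Without parentheses, Python evaluates `2018 & movies[...]` first,
# and the resulting chained comparison raises an error.
try:
    movies[movies["Year"] > 2018 & movies["BoxOffice(INR Crore)"] > 1000]
    failed = False
except (ValueError, TypeError):
    failed = True

print(both["Film"].tolist(), either["Film"].tolist(), failed)
```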


### .query()

`.query()` filters a DataFrame using a string expression with SQL-like syntax. It is often easier to read than a boolean mask.

Action movies after 2018

In [18]:
df.query("Genre == 'Action' and Year > 2018")

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
7,Hrithik Roshan,War,2019,Action,475,6.5
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2


To pass Python variables into the query string, prefix them with `@`.
Because the query is an ordinary string, f-strings work too.

In [22]:
year = 2018
df.query(f"Genre == 'Action' and Year > {year}")  # Wrap column names that contain spaces in backticks (``).

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
7,Hrithik Roshan,War,2019,Action,475,6.5
11,Vicky Kaushal,Uri: The Surgical Strike,2019,Action,342,8.2
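
The cell above uses an f-string, so here is a short sketch of the `@` form next to it. The `movies` frame is a hypothetical four-row sample of the table, and the `genre` variable is introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Film": ["Pathaan", "Tiger Zinda Hai", "War", "Good Newwz"],
    "Year": [2023, 2017, 2019, 2019],
    "Genre": ["Action", "Action", "Action", "Comedy"],
})

year = 2018
genre = "Action"

# @name looks up the Python variable, so strings need no manual quoting.
with_at = movies.query("Genre == @genre and Year > @year")

# With an f-string, the value is pasted into the text and must be quoted by hand.
with_fstring = movies.query(f"Genre == '{genre}' and Year > {year}")

print(with_at.equals(with_fstring))
```

The `@` form is usually the safer choice for string values, because a value that itself contains a quote character would break the f-string version.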


Here are the main **rules and tips** for using `.query()` in pandas:

---

### **1. Column names become variables**
You can reference column names directly in the query string:

```python
df.query("age > 25 and city == 'Delhi'")
```

---

### **2. String values must be in quotes**
Use **single** or **double** quotes around strings in the expression:

```python
df.query("name == 'Harry'")
```

If you have quotes inside quotes, mix them:

```python
df.query('city == "Mumbai"')
```

---

### **3. Use backticks for column names with spaces or special characters**
If a column name has spaces, use backticks (`` ` ``):

```python
df.query("`first name` == 'Alice'")
```

---

### **4. You can use `@` to reference Python variables**
To pass external variables into `.query()`:

```python
age_limit = 30
df.query("age > @age_limit")
```

---

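### **5. Logical operators**
Combine conditions with:
- `and`, `or`, `not`, which read more naturally than `&`, `|`, `~`
- `==`, `!=`, `<`, `>`, `<=`, `>=`

With the default parser, `&` and `|` are also accepted inside `.query()` and are given the same precedence as `and` and `or`, so this works without extra parentheses:
```python
df.query("age > 30 & city == 'Delhi'")  # valid, but harder to read
```

Preferred:
```python
df.query("age > 30 and city == 'Delhi'")  # ✅
```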
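
A small sketch of the word operators on the movie data, assuming a hypothetical four-row sample of the table. It also uses `in`, which `.query()` accepts for membership tests.

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Film": ["Pathaan", "Dangal", "Andhadhun", "Good Newwz"],
    "Year": [2023, 2016, 2018, 2019],
    "Genre": ["Action", "Biography", "Thriller", "Comedy"],
})

with_or = movies.query("Genre == 'Action' or Genre == 'Comedy'")
with_in = movies.query("Genre in ['Action', 'Comedy']")   # same rows, shorter
with_not = movies.query("not (Year > 2018)")              # 2018 or earlier

print(with_or["Film"].tolist())
print(with_not["Film"].tolist())
```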

---

### **6. Chained comparisons**
Just like Python:

```python
df.query("25 < age <= 40")
```
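
To check that a chained comparison means what it appears to mean, this sketch compares it with the equivalent boolean mask on a hypothetical four-row sample of the movie table.

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Film": ["Pathaan", "Tiger Zinda Hai", "Dangal", "War"],
    "Year": [2023, 2017, 2016, 2019],
})

chained = movies.query("2017 <= Year < 2020")
masked = movies[(movies["Year"] >= 2017) & (movies["Year"] < 2020)]

print(chained["Film"].tolist())
print(chained.equals(masked))
```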

---

### **7. Avoid using reserved keywords as column names**
If a column is named after a Python keyword such as `class` or `lambda`, it cannot be referenced as a bare name. Wrap it in backticks:

```python
df.query("`class` == 'Physics'")
```

---

### **8. Case-sensitive**
Column names and string values are case-sensitive:

```python
df.query("City == 'delhi'")  # ❌ if actual value is 'Delhi'
```

---

### **9. `.query()` returns a copy, not a view**
The result is a new DataFrame and the original is left unchanged, so assign the result to a name to keep it:

```python
filtered = df.query("age < 50")
```

---

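Note: If you want to build a new DataFrame from a filtered one and modify it later, call `.copy()` on the result. This makes it explicit that the new DataFrame is independent of the original and avoids `SettingWithCopyWarning` on later assignments. The result of `.query()` is likewise a new DataFrame, and calling `.copy()` on it is harmless.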

In [25]:
df2 = df[(df['Year'] > 2016) & (df['IMDb'] > 5)].copy()

In [26]:
df2

Unnamed: 0,Actor,Film,Year,Genre,BoxOffice(INR Crore),IMDb
0,Shah Rukh Khan,Pathaan,2023,Action,1050,7.2
1,Salman Khan,Tiger Zinda Hai,2017,Action,565,6.0
3,Ranbir Kapoor,Brahmastra,2022,Fantasy,431,5.6
4,Ranveer Singh,Padmaavat,2018,Historical,585,7.0
5,Ayushmann Khurrana,Andhadhun,2018,Thriller,111,8.3
6,Rajkummar Rao,Stree,2018,Horror Comedy,180,7.5
7,Hrithik Roshan,War,2019,Action,475,6.5
8,Akshay Kumar,Good Newwz,2019,Comedy,318,7.0
9,Kartik Aaryan,Bhool Bhulaiyaa 2,2022,Horror Comedy,266,5.9
10,Varun Dhawan,Badrinath Ki Dulhania,2017,Romantic Comedy,201,6.1
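
A minimal sketch of why the explicit `.copy()` is useful: changes made to the copy stay in the copy. The `movies` frame is a hypothetical three-row sample of the table, and the `Decade` column is invented for illustration.

```python
import pandas as pd

# Hypothetical sample taken from the table above.
movies = pd.DataFrame({
    "Film": ["Pathaan", "Dangal", "Brahmastra"],
    "Year": [2023, 2016, 2022],
    "IMDb": [7.2, 8.4, 5.6],
})

recent = movies[movies["Year"] > 2016].copy()

# Add a column to the copy only (hypothetical column, for illustration).
recent["Decade"] = "2020s"

print(list(recent.columns))
print(list(movies.columns))   # the original has no "Decade" column
```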
