# **🧼 Day 3 – Data Cleaning & Missing Data 🧹**

#### **Goal:** Learn how to identify and systematically address common data quality issues, focusing primarily on handling missing values (NaN).

#### **Topics To Cover:** Data Cleaning Workflow, Identifying Missing Data, Techniques for Imputation (fillna), and Removing Missing Data (dropna).

----

## **Introduction to Data Quality 💎**
In the real world, data is rarely perfect. Before any meaningful analysis or machine learning can take place, data must be cleaned. **Data Cleaning** is the process of detecting and correcting (or removing) corrupt or inaccurate records from a dataset.

#### **The "Garbage In, Garbage Out" Principle 🗑️:**
This principle is the core reason data cleaning is the most time-consuming (and arguably most important) step in a data science project.

* If your input data is flawed (garbage in), any conclusions or models derived from it will also be flawed (garbage out).

* Cleaning ensures your analysis is based on valid, accurate, and consistent information.

#### *Definition*
* **Data Cleaning**: Data cleaning is the process of detecting and correcting (or removing) corrupt, incorrect, or irrelevant records from a dataset.
It involves identifying errors and inconsistencies and then making changes so that the data is high-quality and reliable.
This can include fixing typos, standardizing formats ('USA' vs 'U.S.A'), removing duplicates, and more.
* **Handling missing data**: This is an important part of the data cleaning process. Missing data refers to the absence of a value for a variable in one or more records of the dataset.

### **The Importance of Data Cleaning in `Machine Learning`**
<p>For Machine Learning, data cleaning isn't just important: it's absolutely critical. A Machine Learning model's performance is directly tied to the quality of the data it's trained on. Dirty, incorrect, or corrupt data can lead to serious problems:</p>

1. **Lower Model Performance:** Many algorithms can't handle missing data directly, and even those that can are confused by inconsistent or incorrect values during training. The model then learns from the errors in the data, which can lead to a significant drop in its predictive accuracy. <br>

2. **Biased Models:** If a portion of the data is missing for a specific group, the model may be biased against that group, which can make its predictions unfair. <br>

3. **Algorithm Failure:** Some Machine Learning algorithms simply fail or crash if they encounter NaN values. Data cleaning is a prerequisite to even begin the training process.

***
## Let's Begin

In [1]:
# import necessary libraries
import pandas as pd
import numpy as np

# read the data
data = pd.read_csv('../data/Netflix Dataset.csv')

# Create the dataframe (read_csv already returns a DataFrame, so this wrapper is optional)
df = pd.DataFrame(data)
# df.isna().sum() # This tells us how many null values are in each column
df

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


## **3.1: Identifying Missing Values (NaN) ❓**
Missing data, typically represented by NaN (Not a Number) in Pandas, is a common issue. The first step is always to quantify how much data is missing and where it is located.


#### **Best Practice:** Calculating Percentages
To get a true sense of the impact, always check the percentage of missing data per column:

$\text{Missing Percentage} = \frac{\text{Missing Count}}{\text{Total Rows}} \times 100$

A high percentage of missing data (e.g., > 50%) might suggest dropping the column entirely, while a low percentage (< 5%) often allows for imputation.
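
The percentage formula can be sketched in a few lines. This is a minimal illustration on a made-up frame (`toy` and its values are hypothetical, not the Netflix data); it relies on the fact that the mean of a boolean column equals the fraction of `True` values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for a real dataset
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Director": ["X", np.nan, np.nan, np.nan],
    "Rating": ["PG", "R", np.nan, "PG"],
})

# isna() yields booleans; their mean is Missing Count / Total Rows
missing_pct = toy.isna().mean() * 100

# Sort so the columns with the most missing data come first
missing_pct = missing_pct.sort_values(ascending=False)
print(missing_pct)  # Director 75.0, Rating 25.0, Title 0.0
```

The same expression should work unchanged on `df`, since it only depends on `isna()`.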

#### Key Methods and Attributes

<table border="1">
  <thead>
    <tr>
      <th>Method/Attribute</th>
      <th>Purpose</th>
      <th>Output</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>df.isnull()</code> or <code>df.isna()</code></td>
      <td>Checks for missing values element-wise.</td>
      <td>DataFrame of Booleans (True for missing).</td>
    </tr>
    <tr>
      <td><code>df.notnull()</code> or <code>df.notna()</code></td>
      <td>Checks for non-missing values.</td>
      <td>DataFrame of Booleans (True for present).</td>
    </tr>
    <tr>
      <td><code>.sum()</code></td>
      <td>When chained after <code>isnull()</code>, calculates the total count of True (missing) values for each column.</td>
      <td>Series of missing value counts per column.</td>
    </tr>
    <tr>
      <td><code>df.info()</code></td>
      <td>Summarizes the DataFrame including column dtypes, non-null counts, and memory usage.</td>
      <td>Printed summary (not returned).</td>
    </tr>
    <tr>
      <td><code>df.count()</code></td>
      <td>Counts the number of non-missing values per column or row.</td>
      <td>Series or DataFrame of counts.</td>
    </tr>
    <tr>
      <td><code>df.any()</code></td>
      <td>Checks if any element is True along an axis.</td>
      <td>Boolean Series or DataFrame.</td>
    </tr>
    <tr>
      <td><code>df.all()</code></td>
      <td>Checks if all elements are True along an axis.</td>
      <td>Boolean Series or DataFrame.</td>
    </tr>
    <tr>
      <td><code>df.empty</code></td>
      <td>Checks whether the DataFrame is empty (has no elements).</td>
      <td>Boolean (True if empty).</td>
    </tr>
  </tbody>
</table>


***

`df.isna()` and `df.isnull()` <br>

**df.isna()** and **df.isnull()** are aliases: both find the null values in the DataFrame and return a boolean DataFrame of the same shape.

In [2]:
df.isna()
df.isna().sum() # Use .sum() with isna() to get the count of null values in each column

df.isnull() # isnull() function works the same as isna()
df.isnull().sum() # isnull() function works the same as isna()

Show_Id            0
Category           0
Title              0
Director        2388
Cast             718
Country          507
Release_Date      10
Rating             7
Duration           0
Type               0
Description        0
dtype: int64

`df.notna()` and `df.notnull()` <br>

**df.notna()** and **df.notnull()** both find the non-null values in the DataFrame. They return a boolean DataFrame of the same shape.

In [3]:
df.notna()
df.notna().sum() # Use .sum() with notna() to get the count of not-null values in each column

df.notnull() # notnull() function works the same as notna()
df.notnull().sum() # notnull() function works the same as notna()

Show_Id         7789
Category        7789
Title           7789
Director        5401
Cast            7071
Country         7282
Release_Date    7779
Rating          7782
Duration        7789
Type            7789
Description     7789
dtype: int64

`.info()`

In [4]:
# use .info() to get a summary of the dataframe
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7789 entries, 0 to 7788
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Show_Id       7789 non-null   object
 1   Category      7789 non-null   object
 2   Title         7789 non-null   object
 3   Director      5401 non-null   object
 4   Cast          7071 non-null   object
 5   Country       7282 non-null   object
 6   Release_Date  7779 non-null   object
 7   Rating        7782 non-null   object
 8   Duration      7789 non-null   object
 9   Type          7789 non-null   object
 10  Description   7789 non-null   object
dtypes: object(11)
memory usage: 669.5+ KB


`.count()`

In [5]:
# use .count() to get the count of non-null values in each column
df.count()

Show_Id         7789
Category        7789
Title           7789
Director        5401
Cast            7071
Country         7282
Release_Date    7779
Rating          7782
Duration        7789
Type            7789
Description     7789
dtype: int64
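
The table above notes that `count()` can also work per row. A small sketch of that, using a hypothetical frame named `toy`, passes `axis=1` to count across the columns of each row:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with gaps in different rows
toy = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, 6.0],
})

# Non-missing values in each row
non_missing_per_row = toy.count(axis=1)    # 1, 0, 2

# Missing values in each row
missing_per_row = toy.isna().sum(axis=1)   # 1, 2, 0
```

Row-wise counts are handy for spotting records that are almost entirely empty.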

`.any()`

In [6]:
# use .any() to check if any value is null in each column
df.isna().any()

Show_Id         False
Category        False
Title           False
Director         True
Cast             True
Country          True
Release_Date     True
Rating           True
Duration        False
Type            False
Description     False
dtype: bool

`.all()`

In [7]:
# use .all() to check if all values are null in each column
df.isna().all()

Show_Id         False
Category        False
Title           False
Director        False
Cast            False
Country         False
Release_Date    False
Rating          False
Duration        False
Type            False
Description     False
dtype: bool
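
`any()` and `all()` also accept `axis=1`, which turns them into row filters. The sketch below uses a hypothetical frame `toy` to pull out the rows that have at least one gap and the rows that are fully populated.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame
toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Director": ["X", np.nan, "Z"],
    "Country": [np.nan, np.nan, "India"],
})

# Rows where at least one column is missing
rows_with_gaps = toy[toy.isna().any(axis=1)]    # rows 0 and 1

# Rows where every column is present
complete_rows = toy[toy.notna().all(axis=1)]    # row 2

# Single yes/no answer: does the frame contain any missing value at all?
has_any_missing = toy.isna().any().any()
```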

`.empty`

In [8]:
# use .empty to check if the dataframe is empty
df.empty
# it can also be used to check if dataframe is empty after some operation or dropping null values
df.dropna().empty

False

***

## **3.2: Strategies for Handling Missing Data 🛠️**

Once identified, missing data must be addressed using one of two primary strategies: Removal or Imputation.

### **3.2.1. Removal: Dropping Missing Values**
The simplest approach is to remove the records (rows or columns) containing NaN values using the `df.dropna()` method.

<table border="1">
  <thead>
    <tr>
      <th>Parameter</th>
      <th>Purpose</th>
      <th>Effect</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>axis=0</code> (default)</td>
      <td>Removes rows that contain at least one NaN.</td>
      <td>Cleans up the data but can lose valuable information.</td>
    </tr>
    <tr>
      <td><code>axis=1</code></td>
      <td>Removes columns that contain at least one NaN.</td>
      <td>Only advisable if a column is mostly empty.</td>
    </tr>
    <tr>
      <td><code>how='any'</code> (default)</td>
      <td>Drops the row/column if any value is NaN.</td>
      <td>The most aggressive removal method.</td>
    </tr>
    <tr>
      <td><code>how='all'</code></td>
      <td>Drops the row/column only if all values are NaN.</td>
      <td>The most conservative removal method.</td>
    </tr>
    <tr>
      <td><code>thresh=N</code></td>
      <td>Keeps rows that have at least N non-NaN values.</td>
      <td>Useful for setting a minimum completeness threshold.</td>
    </tr>
  </tbody>
</table>


**`df.dropna()`:** This method is used to drop rows or columns that contain null values.

In [9]:
# drop rows with any null values
df_dropped_rows = df.dropna() # By default, it drops rows with any null and returns a new DataFrame
df_dropped_rows

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
5,s6,TV Show,46,Serdar Akar,"Erdal Beşikçioğlu, Yasemin Allen, Melis Birkan...",Turkey,"July 1, 2017",TV-MA,1 Season,"International TV Shows, TV Dramas, TV Mysteries",A genetics professor experiments with a treatm...
...,...,...,...,...,...,...,...,...,...,...,...
7780,s7779,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
7782,s7781,Movie,Zoo,Shlok Sharma,"Shashank Arora, Shweta Tripathi, Rahul Kumar, ...",India,"July 1, 2018",TV-MA,94 min,"Dramas, Independent Movies, International Movies",A drug dealer starts having doubts about his t...
7783,s7782,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...


In [10]:
# drop columns with any null values
df_dropped_columns = df.dropna(axis=1) # axis=1 specifies to drop columns instead of rows
df_dropped_columns

Unnamed: 0,Show_Id,Category,Title,Duration,Type,Description
0,s1,TV Show,3%,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [11]:
# the how parameter defines the condition for dropping
df_drop_any = df.dropna(how='any') # Drops rows with any null values, it is the default behavior
df_drop_any

df_drop_all = df.dropna(how='all') # Drops rows where all values are null
df_drop_all

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [12]:
# thresh parameter specifies the minimum number of non-null values required to keep a row or column
df_thresh = df.dropna(thresh=5) # Keeps rows with at least 5 non-null values
df_thresh

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
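
`thresh` can also express a percentage rule for columns, such as the "more than 50% missing" guideline mentioned earlier. The following is a sketch on a hypothetical frame `toy`; the 0.5 cutoff is only an illustrative choice.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical toy frame: Director is 75% missing, Rating is 25% missing
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Director": [np.nan, np.nan, np.nan, "X"],
    "Rating": ["PG", np.nan, "R", "PG"],
})

# Require at least 50% non-null values for a column to survive
min_non_null = math.ceil(0.5 * len(toy))   # 2 of 4 rows

# axis=1 applies the threshold to columns instead of rows
trimmed = toy.dropna(axis=1, thresh=min_non_null)
print(list(trimmed.columns))  # ['Title', 'Rating']
```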


In [13]:
# subset parameter specifies which columns to consider when dropping rows with null values
# important: subset takes labels from the other axis. When dropping rows (axis=0) it lists column names; when dropping columns (axis=1) it must list row labels, so passing column names there raises a KeyError
df_subset = df.dropna(subset=['Director', 'Rating']) # Drops rows where 'Director' or 'Rating' is null
df_subset

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
5,s6,TV Show,46,Serdar Akar,"Erdal Beşikçioğlu, Yasemin Allen, Melis Birkan...",Turkey,"July 1, 2017",TV-MA,1 Season,"International TV Shows, TV Dramas, TV Mysteries",A genetics professor experiments with a treatm...
...,...,...,...,...,...,...,...,...,...,...,...
7782,s7781,Movie,Zoo,Shlok Sharma,"Shashank Arora, Shweta Tripathi, Rahul Kumar, ...",India,"July 1, 2018",TV-MA,94 min,"Dramas, Independent Movies, International Movies",A drug dealer starts having doubts about his t...
7783,s7782,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
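
To make the `subset` behavior concrete, here is a small sketch on a hypothetical frame `toy`. It shows `subset` with column names when dropping rows, with a row label when dropping columns, and the `KeyError` that appears when column names are passed together with `axis=1`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with the default integer row labels 0, 1, 2
toy = pd.DataFrame({
    "Director": ["X", np.nan, "Z"],
    "Rating": ["PG", "R", np.nan],
})

# axis=0 (default): subset holds column names; only Director is inspected
rows_kept = toy.dropna(subset=["Director"])     # keeps rows 0 and 2

# axis=1: subset holds row labels; only row 1 is inspected
cols_kept = toy.dropna(axis=1, subset=[1])      # Director is NaN in row 1, so it is dropped

# axis=1 with column names: the labels are looked up in the row index and not found
try:
    toy.dropna(axis=1, subset=["Director"])
    raised = False
except KeyError:
    raised = True
```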


In [14]:
# If you want to modify the original DataFrame in place, you can use the inplace=True parameter
# df.dropna(inplace=True) # This will drop rows with any null values from the original DataFrame
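
The difference between the default behavior and `inplace=True` can be checked on a throwaway frame. This sketch uses a hypothetical frame `toy` so the Netflix `df` stays untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value
toy = pd.DataFrame({"A": [1.0, np.nan, 3.0]})

# Default: a new DataFrame is returned and the original keeps its NaN
cleaned = toy.dropna()
rows_before = len(toy)      # still 3

# inplace=True: the original object itself loses the row
toy.dropna(inplace=True)
rows_after = len(toy)       # now 2
```

Reassigning the result (`toy = toy.dropna()`) achieves the same effect and is often easier to follow in a chain of operations.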

---

### **3.2.2. Imputation: Filling Missing Values**

Imputation means replacing the NaN values with a substitute value, typically using the `.fillna()` method. The choice of substitute value depends on the data type and distribution.

#### Key Methods:

<table border="1">
  <thead>
    <tr>
      <th>Imputation Method</th>
      <th>Technique</th>
      <th>When to Use</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>ffill</code> (forward fill)</td>
      <td>Replace NaN with the previous valid observation.</td>
      <td>For Time Series data where temporal order matters.</td>
    </tr>
    <tr>
      <td><code>bfill</code> (backward fill)</td>
      <td>Replace NaN with the next valid observation.</td>
      <td>For Time Series data where temporal order matters.</td>
    </tr>
    <tr>
      <td><code>df.fillna()</code></td>
      <td>Fills missing values with a specified value, method, or strategy.</td>
      <td>General-purpose method for flexible filling (constant, mean, ffill, bfill, etc.).</td>
    </tr>
    <tr>
      <td><code>df.interpolate()</code></td>
      <td>Fills NaN values using interpolation (linear, polynomial, time-based, etc.).</td>
      <td>Best for numerical or time series data with trends.</td>
    </tr>
    <tr>
      <td><code>df.mask()</code></td>
      <td>Replaces values where a condition is True.</td>
      <td>Useful for conditional replacement, often the inverse of <code>where()</code>.</td>
    </tr>
    <tr>
      <td><code>df.where()</code></td>
      <td>Keeps values where a condition is True, replaces others with NaN or specified value.</td>
      <td>For conditional filtering while keeping DataFrame shape intact.</td>
    </tr>
    <tr>
      <td><code>df.replace()</code></td>
      <td>Replaces specific values with others (e.g., NaN, strings, or numbers).</td>
      <td>When you need to substitute exact matches or patterns.</td>
    </tr>
  </tbody>
</table>
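
The last four rows of the table can be sketched on a single Series. The data is hypothetical, and treating `-1` as a sentinel for "missing" is an assumption made only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical series: one real NaN and one sentinel value (-1)
s = pd.Series([10.0, np.nan, 30.0, -1.0])

# interpolate(): linear by default, so the gap between 10 and 30 becomes 20
interpolated = s.interpolate()

# where(): keep values where the condition is True, set the rest to NaN
kept = s.where(s >= 0)

# mask(): the inverse, set values to NaN where the condition is True
masked = s.mask(s < 0)

# replace(): swap an exact value, here turning the sentinel into NaN
replaced = s.replace(-1.0, np.nan)
```

After `replace()` or `mask()`, the sentinel becomes an ordinary NaN and can be handled with the same tools as any other missing value.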


**1. `fillna()`:** Replaces null values with a specified value. Pass a scalar to fill every column with the same value, or a dictionary (or Series) to fill column by column.

In [15]:
df_fill = df.fillna(0) # Fill all null values with 0
df_fill

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,0,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,0,Nasty C,0,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,0,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [16]:
# Sample DataFrame
df2 = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, 4, np.nan],
    'C': [5, 6, np.nan]
})
# axis=0 (the default): fillna works column by column. Each NaN is replaced with the median that df2.median() computed for its own column.
df_fillna_axis0 = df2.fillna(df2.median() , axis=0)
df_fillna_axis0

Unnamed: 0,A,B,C
0,1.0,4.0,5.0
1,2.0,4.0,6.0
2,3.0,4.0,5.5


In [17]:
# axis=1 would mean filling row by row, so each NaN should receive the median of its own row.
# the value parameter accepts a scalar, a dict, or a Series; a dict or Series is matched against the column labels
# df_fillna_axis1 = df2.fillna(df2.median(), axis=1) # raises an error (NotImplementedError in current pandas): a dict/Series value can only be applied column by column, and df2.median() returns the column medians anyway
# use .apply() for this purpose: it iterates over the rows, computes each row's median, and uses it to fill the NaN values in that same row.
df_fillna_axis1 = df2.apply(lambda row: row.fillna(row.median()), axis=1)
df_fillna_axis1

Unnamed: 0,A,B,C
0,1.0,3.0,5.0
1,5.0,4.0,6.0
2,3.0,3.0,3.0
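
As an alternative to `.apply()`, the row-median fill can be written with a double transpose. This is only a sketch of one possible approach: after transposing, the original rows become columns, so the Series of row medians lines up with the column labels that `fillna` expects.

```python
import numpy as np
import pandas as pd

# Same sample data as df2 above, rebuilt here so the block is self-contained
df2 = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, 4, np.nan],
    'C': [5, 6, np.nan]
})

# One median per row, indexed by the row labels 0, 1, 2
row_medians = df2.median(axis=1)

# Transpose, fill column by column with the row medians, transpose back
filled = df2.T.fillna(row_medians).T
print(filled)
```

Note that transposing a frame with mixed dtypes converts everything to `object`, so this trick suits all-numeric data best.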


In [18]:
# fill in place
df2.fillna(df2.mean(), inplace=True) # Fills each column's null values with that column's mean, modifying df2 itself
df2

Unnamed: 0,A,B,C
0,1.0,4.0,5.0
1,2.0,4.0,6.0
2,3.0,4.0,5.5


In [19]:
df_fill_mean = df.fillna(df.mean(numeric_only=True)) # Fill null values with the mean of each numeric column; this dataset has only object columns, so nothing changes
df_fill_mean

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [20]:
df_fill_dict = df.fillna({'Director': 'Unknown', 'Rating': df['Rating'].mode()[0]}) # Fill 'Director' nulls with 'Unknown' and 'Rating' nulls with the most frequent rating (Rating is text, so it has no mean)
df_fill_dict

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,Unknown,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Unknown,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Unknown,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
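
A dictionary passed to `fillna()` lets each column get its own replacement. The sketch below uses a small made-up DataFrame (not the Netflix data) and assumes that, for a text column such as `Rating`, the most frequent value is an acceptable stand-in; whether that suits a real analysis is a judgment call.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "Director": ["Ana", np.nan, "Raj"],
    "Rating": ["TV-MA", "TV-MA", np.nan],
})

# mode() returns a Series of the most frequent value(s); take the first one
most_common_rating = toy["Rating"].mode()[0]

# Each key is a column name, each value is the filler for that column
filled = toy.fillna({"Director": "Unknown", "Rating": most_common_rating})
print(filled)
```

Columns that are not listed in the dictionary are left untouched, so their missing values remain.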


<div class="alert alert-block alert-info"><strong>Note:</strong> The 'method' parameter of fillna() is deprecated and will raise an error in future versions of Pandas. Instead of 'method', use <code style="background-color: #9ccfe0ff; color: #fffa69ff;">.ffill()</code> for forward filling and <code style="background-color: #9ccfe0ff; color: #fffa69ff;">.bfill()</code> for backward filling.</div>

**2. `ffill()`:** The `pandas.DataFrame.ffill()` method fills missing values (NaN/NA) in a DataFrame or Series by propagating the last valid observation forward. Each missing value is replaced with the nearest preceding non-missing value in the same column (the default) or in the same row (if `axis=1` is specified). A missing value with no valid value before it stays missing.<br>
Its most commonly used parameters are:
- `axis`: The direction of the fill. `0` or `'index'` (default) fills down each column, using the value from the rows above; `1` or `'columns'` fills across each row, using the value from the columns to the left.
- `inplace`: If True, the operation modifies the DataFrame or Series directly and returns None. If False (default), a new object with filled values is returned.
- `limit`: Specifies the maximum number of consecutive NaN values to fill. This prevents filling overly large gaps with a single preceding value.
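
The parameters above are easier to see on a tiny example. This is a minimal sketch on made-up data (the column names `temp` and `city` are hypothetical), comparing an unlimited forward fill with `limit=1`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a two-row gap in 'temp'
toy = pd.DataFrame({
    "temp": [20.0, np.nan, np.nan, 23.0],
    "city": ["Oslo", None, "Rome", None],
})

filled_all = toy.ffill()          # every gap takes the last valid value above it
filled_one = toy.ffill(limit=1)   # only the first NaN of each gap is filled

print(filled_all)
print(filled_one)
```

With `limit=1`, the second missing `temp` value stays NaN because it is the second consecutive gap after the last valid observation.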

In [21]:
df_ffill = df.ffill() # propagates the last valid observation down each column
df_ffill

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Mozez Singh,Nasty C,India,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Mozez Singh,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [22]:
df_ffill_axis = df.ffill(axis=1) # fills across each row: a NaN takes the value from the column to its left
df_ffill_axis

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,3%,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Zulu Man in Japan,Nasty C,Nasty C,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Zumbo's Just Desserts,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [23]:
df_ffill_limit = df.ffill(limit=1) # fills at most 1 consecutive NaN in each gap
df_ffill_limit

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Mozez Singh,Nasty C,India,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [24]:
# Use inplace=True only if you intend to modify the original DataFrame.
# With inplace=True the method returns None, so do not assign the result:
# df.ffill(inplace=True)

**3. `bfill()`:** The `pandas.DataFrame.bfill()` method backward fills missing values (NaN/NA) in a DataFrame or Series by propagating the next valid observation backward. Each missing value is replaced with the nearest following non-missing value in the same column (the default) or in the same row (if `axis=1` is specified). A missing value with no valid value after it stays missing.<br>
**Parameters:**
* `axis`: Specifies the axis along which to fill the missing values.
    * 0 or 'index' (default): Fills missing values in each column by looking at the next value in the rows below it.
    * 1 or 'columns': Fills missing values in each row by looking at the next value in the columns to the right of it.
* `inplace`: A boolean value (False by default) that determines if the original DataFrame is modified (True) or if a new DataFrame with the filled values is returned (False).
* `limit`: An integer that defines the maximum number of consecutive NaN values to fill. If a gap of NaNs is larger than this number, it will only be partially filled.
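
As a quick check of these parameters, here is a small sketch on made-up data (the columns `a` and `b` are hypothetical) showing a default backward fill, a row-wise fill, and a limited fill.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "a": [np.nan, 1.0, np.nan, np.nan, 4.0],
    "b": [np.nan, np.nan, 2.0, 3.0, np.nan],
})

back = toy.bfill()              # each NaN takes the next valid value below it
back_rows = toy.bfill(axis=1)   # each NaN takes the next valid value to its right
back_one = toy.bfill(limit=1)   # only the NaN closest to the next valid value is filled

print(back)
print(back_rows)
print(back_one)
```

The last value of `b` stays NaN in `back` because nothing valid follows it, which mirrors how a leading NaN survives a forward fill.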

In [25]:
df_bfill = df.bfill()
df_bfill

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,Jorge Michel Grau,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Sam Dunn,Nasty C,Australia,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Sam Dunn,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [26]:
df_bfill_axis = df.bfill(axis=1)
df_bfill_axis

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,"João Miguel, Bianca Comparato, Michel Gomes, R...","João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Nasty C,Nasty C,"September 25, 2020","September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,"Adriano Zumbo, Rachel Khoo","Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [27]:
df_bfill_limit = df.bfill(limit=1)
df_bfill_limit

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,Jorge Michel Grau,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,Australia,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Sam Dunn,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [28]:
# Use inplace=True only if you intend to modify the original DataFrame.
# With inplace=True the method returns None, so do not assign the result:
# df.bfill(inplace=True)

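**4. `interpolate()`:** Fills null values by estimating them from the surrounding non-null values, using one of several methods (e.g., linear, polynomial).

**Methods for `interpolate()`:** `linear` (default), `time`, `slinear`, `quadratic`, `cubic`, `polynomial` and `spline`, among others. Methods such as `slinear`, `quadratic`, `cubic`, `polynomial` and `spline` rely on SciPy, and `polynomial` and `spline` also need an `order` argument. The fill-style options `pad`/`ffill` and `bfill`/`backfill` are deprecated for `interpolate()` in recent Pandas versions; use `.ffill()` and `.bfill()` instead.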
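
To make the difference between two of these methods concrete, the sketch below compares `linear` and `time` on a hypothetical Series with unevenly spaced dates. The data is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical readings: the gap after the missing day is twice as long as the gap before it
readings = pd.Series(
    [10.0, np.nan, 40.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"]),
)

by_position = readings.interpolate(method="linear")  # ignores the index, treats points as equally spaced
by_time = readings.interpolate(method="time")        # weights the estimate by the actual time gaps

print(by_position)
print(by_time)
```

The `linear` estimate lands halfway between the neighbors, while the `time` estimate sits closer to the earlier value because the missing day is nearer to it in time.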

In [29]:
# The default method for interpolate() is linear, and it works along axis=0 (down each column) unless specified otherwise. It estimates each NaN from a straight line between the neighboring non-null values in the same column.
# Note: interpolation only applies to numeric data, so run it on numeric columns; text (object) columns cannot be interpolated.

df3 = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [85.0, np.nan, 92.0, 78.0],
    'Category': ['A', 'B', 'C', 'A']
})
# df3.interpolate() # On recent pandas versions this raises a FutureWarning (or an error) because 'Name' and 'Category' are object columns

# Interpolate only the numeric column and leave the text columns as they are
df3.assign(Score=df3['Score'].interpolate())


Unnamed: 0,Name,Score,Category
0,Alice,85.0,A
1,Bob,88.5,B
2,Charlie,92.0,C
3,David,78.0,A


Using `.replace()`, `.mask()` and `.where()` for filling nulls: Besides `fillna()`, the `.replace()` method and boolean masking (`.mask()` and `.where()`) can also fill null values. `.replace()` targets specific values directly, while masking is a more general technique for conditional selection and assignment.

- The `.replace()` method replaces specific values in a DataFrame or Series, including NaN, with a specified value. It works for any value, not just NaN, which makes it useful for null handling as well as general clean-up.
- The `.mask()` method replaces values where a condition is True. To fill nulls, pass `df.isna()` as the condition and the fill value as the replacement. It is more commonly used for conditional replacements but works for null handling too.
- The `.where()` method is the opposite of `mask()`. Where the condition is True, the original element is kept. Where the condition is False, the element is replaced with the corresponding element from `other` (another DataFrame or a scalar value). If `other` is not specified, NaN is used as the replacement.
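
The three approaches can produce the same result as `fillna()` when the goal is simply to fill nulls. A minimal sketch on a hypothetical Series (the name `director` and its values are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing entry
s = pd.Series(["Ana", np.nan, "Raj"], name="director")

via_fillna = s.fillna("Unknown")
via_replace = s.replace(np.nan, "Unknown")    # replace the NaN value itself
via_mask = s.mask(s.isna(), "Unknown")        # replace where the condition is True
via_where = s.where(s.notna(), "Unknown")     # keep where the condition is True

print(via_mask)
```

Note that `mask()` and `where()` take opposite conditions (`isna()` versus `notna()`) to reach the same output.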

**5. `.replace()`**

In [30]:
# Using replace() to fill NaN directors
df_replaced = df.replace({'Director': {np.nan: 'Gojo Satoru'}}) # A playful placeholder: NaN values in 'Director' are replaced with 'Gojo Satoru'
df_replaced

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,Gojo Satoru,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Gojo Satoru,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Gojo Satoru,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [31]:
# Using replace() to replace every occurrence of a specific value
df_replaced = df.replace(to_replace='Movie', value='Netflix Movie') # replaces every cell that is exactly 'Movie' (here, the Category values) with 'Netflix Movie'
df_replaced

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Netflix Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Netflix Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Netflix Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Netflix Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Netflix Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Netflix Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Netflix Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


**6. `.mask()`**

In [32]:
df_mask = df.mask(df.isna(), 'Unknown') # replace all null/nan values with 'Unknown'
df_mask

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,Unknown,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,Unknown,Nasty C,Unknown,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,Unknown,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [33]:
df_mask = df.mask(df['Country'] == 'Singapore', 'Japan') # the condition is a row-level Series, so every column in rows where Country is 'Singapore' is replaced with 'Japan'
df_mask

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,Japan,Japan,Japan,Japan,Japan,Japan,Japan,Japan,Japan,Japan,Japan
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
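
The output above shows that a row-level condition passed to `DataFrame.mask()` overwrites every column in the matching rows. If the intent is to change only the `Country` value, one option is to apply `mask()` to that single column. The sketch below uses a hypothetical two-column DataFrame rather than the Netflix data.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Country": ["Singapore", "India", "Singapore"],
})

fixed = toy.copy()
# Mask only the 'Country' column; all other columns keep their values
fixed["Country"] = toy["Country"].mask(toy["Country"] == "Singapore", "Japan")

print(fixed)
```

An equivalent alternative is label-based assignment on a copy, for example `fixed.loc[fixed["Country"] == "Singapore", "Country"] = "Japan"`.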


**7. `.where()`**

In [34]:
df_where = df.where(df.isna(), 'Now NaN') # keeps the NaN cells and replaces every non-null value with 'Now NaN'
df_where

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,Now NaN,Now NaN,Now NaN,,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
1,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
2,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
3,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
4,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
...,...,...,...,...,...,...,...,...,...,...,...
7784,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
7785,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
7786,Now NaN,Now NaN,Now NaN,,Now NaN,,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN
7787,Now NaN,Now NaN,Now NaN,,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN,Now NaN


In [35]:
df_where = df.where(df['Rating'] == 'R', 'Pg') # keeps rows where Rating is 'R'; in every other row, all columns are replaced with 'Pg'
df_where

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
1,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
4,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
...,...,...,...,...,...,...,...,...,...,...,...
7784,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
7785,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
7786,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg
7787,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg,Pg


---

## **3.3: Index Handling in Pandas 🏷️**

**Definition:** Index Handling is the process of managing, modifying, and cleaning the row identifiers (the index) of a DataFrame or Series. The index is a core component that facilitates data access, alignment, and operations like merging or joining.

#### **Importance of Index Handling 🔑**
* **Data Alignment:** Ensures proper row-to-row alignment when performing binary operations (like addition) or when merging and joining DataFrames.

* **Uniqueness:** Handling duplicate or non-unique indices prevents subtle errors that can confuse algorithms or lead to unexpected results.

* **Performance:** A well-structured index (e.g., using a datetime index for time-series data) can significantly optimize data retrieval and query performance.

* **Clarity:** Setting meaningful indices (e.g., a unique ID column instead of arbitrary integers) makes the data more interpretable and easier to work with.
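
The data alignment point can be illustrated with two small Series. This is a minimal sketch with made-up labels: Pandas matches values by index label, not by position, and labels present on only one side produce NaN.

```python
import pandas as pd

# Hypothetical Series with partially overlapping labels
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "w"])

total = a + b   # values are paired by label before adding
print(total)

# A quick uniqueness check before relying on label-based operations
print(a.index.is_unique)
```

Only `y` and `z` exist in both Series, so only those labels receive a numeric sum.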

#### Key Methods:

<table border="1">
  <thead>
    <tr>
      <th>Method</th>
      <th>Purpose</th>
      <th>Key Parameters</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>.reset_index()</code></td>
      <td>Converts the current index into a regular column and assigns a new default integer index.</td>
      <td><code>drop=True</code> (discards old index), <code>inplace=True</code></td>
    </tr>
    <tr>
      <td><code>.set_index()</code></td>
      <td>Sets an existing column (or multiple columns) as the new index of the DataFrame.</td>
      <td><code>drop=True</code> (drops column after setting as index), <code>inplace=True</code></td>
    </tr>
    <tr>
      <td><code>.reindex(new_index)</code></td>
      <td>Conforms the DataFrame to a new index, adding NaN for missing labels or dropping labels not present in the new index.</td>
      <td><code>method</code> (ffill, bfill), <code>fill_value</code></td>
    </tr>
    <tr>
      <td><code>.reindex_like(other_df)</code></td>
      <td>Reindexes to match the index and column labels of another DataFrame.</td>
      <td>Usually none beyond <code>other</code></td>
    </tr>
    <tr>
      <td><code>.index.name</code></td>
      <td>Attribute that holds the name of the index (the label shown above the index values); assign to it to rename the index.</td>
      <td>Assignment (e.g., <code>df.index.name = 'New_Name'</code>)</td>
    </tr>
    <tr>
      <td><code>.rename_axis()</code></td>
      <td>Renames the axis labels. Essential for renaming levels in a MultiIndex.</td>
      <td><code>mapper</code> (dictionary), <code>axis</code></td>
    </tr>
  </tbody>
</table>


#### Resetting the Index
`.reset_index()`: Converts the current index to a column and assigns a default integer index (0, 1, 2, ...). Useful when the index is not meaningful or when you need to realign data for analysis.

In [36]:
# Reset the index to a default integer index
df_reset = df.reset_index()
df_reset

Unnamed: 0,index,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...,...
7784,7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [37]:
# If you want to drop the old index without keeping it as a column
df_reset_drop = df.reset_index(drop=True)
df_reset_drop

# Use inplace=True to modify the original DataFrame
# df.reset_index(inplace=True)  # Uncomment to modify df directly

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


#### Setting a New Index
`.set_index()`: Sets a column (or multiple columns) as the index of the DataFrame. This is useful for making data access more meaningful (e.g., using `Show_Id` as the index for unique identification).

In [38]:
# Set 'Show_Id' as the index
df_set_index = df.set_index('Show_Id')
df_set_index

Unnamed: 0_level_0,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
Show_Id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...
s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [39]:
# Set multiple columns as a multi-level index
df_multi_index = df.set_index(['Type', 'Title'])  # Example with 'Type' and 'Title'
df_multi_index

# Use inplace=True to modify the original DataFrame
# df.set_index('Show_Id', inplace=True)  # Uncomment to modify df directly

Unnamed: 0_level_0,Unnamed: 1_level_0,Show_Id,Category,Director,Cast,Country,Release_Date,Rating,Duration,Description
Type,Title,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
"International TV Shows, TV Dramas, TV Sci-Fi & Fantasy",3%,s1,TV Show,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,In a future where the elite inhabit an island ...
"Dramas, International Movies",07:19,s2,Movie,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,After a devastating earthquake hits Mexico Cit...
"Horror Movies, International Movies",23:59,s3,Movie,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"When an army recruit is found dead, his fellow..."
"Action & Adventure, Independent Movies, Sci-Fi & Fantasy",9,s4,Movie,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"In a postapocalyptic world, rag-doll robots hi..."
Dramas,21,s5,Movie,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...
"Dramas, International Movies",Zozo,s7783,Movie,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,When Lebanon's Civil War deprives Zozo of his ...
"Dramas, International Movies, Music & Musicals",Zubaan,s7784,Movie,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,A scrappy but poor boy worms his way into a ty...
"Documentaries, International Movies, Music & Musicals",Zulu Man in Japan,s7785,Movie,,Nasty C,,"September 25, 2020",TV-MA,44 min,"In this documentary, South African rapper Nast..."
"International TV Shows, Reality TV",Zumbo's Just Desserts,s7786,TV Show,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,Dessert wizard Adriano Zumbo looks for the nex...

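To illustrate why a meaningful index makes data access easier, here is a minimal sketch on a small, made-up stand-in for the dataset (the `toy` frame and its three rows are assumptions for illustration, not the real data). It shows label-based lookup with `.loc` on a single-level and a multi-level index, plus the `verify_integrity` option of `.set_index()`.

```python
import pandas as pd

# Hypothetical three-row stand-in for the Netflix DataFrame used above
toy = pd.DataFrame({
    "Show_Id": ["s1", "s2", "s3"],
    "Category": ["TV Show", "Movie", "Movie"],
    "Title": ["3%", "07:19", "23:59"],
})

# Single-level index: look a row up by its ID instead of by position
by_id = toy.set_index("Show_Id")
title_s2 = by_id.loc["s2", "Title"]

# Multi-level index: a tuple addresses one (Category, Title) combination
by_cat_title = toy.set_index(["Category", "Title"]).sort_index()
id_of_2359 = by_cat_title.loc[("Movie", "23:59"), "Show_Id"]

# verify_integrity=True makes set_index raise if the new index has duplicates
try:
    toy.set_index("Category", verify_integrity=True)
    duplicates_detected = False
except ValueError:
    duplicates_detected = True

print(title_s2, id_of_2359, duplicates_detected)
```

Sorting the multi-level index before the lookup is optional here, but it is generally a good habit because label slicing on a MultiIndex works best when the index is sorted.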

#### Checking and Handling Duplicate Indices
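`.index.duplicated()`: Returns a boolean array that marks every index label repeating an earlier label. To remove the corresponding rows, filter the DataFrame with the negated mask (`df.loc[~df.index.duplicated()]`). Note that `.index.drop_duplicates()` returns only the de-duplicated labels, not the rows, so it cannot be used on its own to clean the DataFrame.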

In [40]:
# Check for duplicate indices
duplicate_indices = df.index.duplicated().sum()
print(f"Number of duplicate indices: {duplicate_indices}")

Number of duplicate indices: 0


In [41]:
# If duplicates exist, keep only the first occurrence
df_no_duplicates = df.loc[~df.index.duplicated(keep='first')]
df_no_duplicates

Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
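
Because the dataset above has no duplicate index labels, the effect of the mask is not visible there. The following sketch uses a small, made-up frame (`toy`, an assumption for illustration) in which the label `'b'` appears twice, so the filtering step actually removes a row.

```python
import pandas as pd

# Hypothetical frame whose index label 'b' appears twice
toy = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", "b", "b", "c"])

# True only for the second 'b', because keep='first' protects the first one
dup_mask = toy.index.duplicated(keep="first")
n_duplicates = int(dup_mask.sum())

# Boolean masking keeps whole rows; Index.drop_duplicates() would return labels only
deduped = toy.loc[~dup_mask]

print(n_duplicates)                # 1
print(deduped["value"].tolist())   # [10, 20, 40]
print(deduped.index.is_unique)     # True
```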


#### Renaming the Index
`.index.name` or `.rename_axis()`: Renames the index or index levels for clarity. Useful for multi-level indices or when preparing data for presentation.

In [42]:
# Rename the index name
df_set_index = df.set_index('Show_Id')
df_set_index.index.name = 'Netflix_ID'
df_set_index

Unnamed: 0_level_0,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
Netflix_ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...
s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [43]:
# Rename axis for a multi-level index
df_multi_index = df.set_index(['Type', 'Title'])
df_multi_index = df_multi_index.rename_axis(['Content_Type', 'Show_Title'])
df_multi_index

Unnamed: 0_level_0,Unnamed: 1_level_0,Show_Id,Category,Director,Cast,Country,Release_Date,Rating,Duration,Description
Content_Type,Show_Title,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
"International TV Shows, TV Dramas, TV Sci-Fi & Fantasy",3%,s1,TV Show,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,In a future where the elite inhabit an island ...
"Dramas, International Movies",07:19,s2,Movie,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,After a devastating earthquake hits Mexico Cit...
"Horror Movies, International Movies",23:59,s3,Movie,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"When an army recruit is found dead, his fellow..."
"Action & Adventure, Independent Movies, Sci-Fi & Fantasy",9,s4,Movie,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"In a postapocalyptic world, rag-doll robots hi..."
Dramas,21,s5,Movie,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...
"Dramas, International Movies",Zozo,s7783,Movie,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,When Lebanon's Civil War deprives Zozo of his ...
"Dramas, International Movies, Music & Musicals",Zubaan,s7784,Movie,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,A scrappy but poor boy worms his way into a ty...
"Documentaries, International Movies, Music & Musicals",Zulu Man in Japan,s7785,Movie,,Nasty C,,"September 25, 2020",TV-MA,44 min,"In this documentary, South African rapper Nast..."
"International TV Shows, Reality TV",Zumbo's Just Desserts,s7786,TV Show,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,Dessert wizard Adriano Zumbo looks for the nex...


#### Reindexing
`.reindex()`: Aligns the DataFrame to a new index. Labels in the new index that do not exist in the original produce rows of NaN, and rows whose labels are absent from the new index are dropped. Useful for aligning data across multiple DataFrames.

In [44]:
# Create a new index (example: a subset or extension of Show_Id values)
new_index = ['s1', 's2', 's3', 's4', 's5']  # Replace with actual Show_Id values from your dataset
# reindex() raises ValueError when the existing index contains duplicate labels
# (the length of new_index does not matter), so drop repeated Show_Id values first
df_reindexed = df.drop_duplicates(subset='Show_Id').set_index('Show_Id').reindex(new_index)
df_reindexed

In [45]:
# Fill missing values after reindexing
df_reindexed_filled = df_reindexed.fillna({'Director': 'Unknown', 'Rating': 'Not_Rated'})
df_reindexed_filled
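
As a self-contained illustration of how `.reindex()` adds and drops rows, here is a minimal sketch on made-up data (the `toy` frame, the label `'s9'`, and the Series `s` are assumptions for illustration). It also shows the `fill_value` and `method` parameters listed in the Key Methods table.

```python
import pandas as pd

# Hypothetical frame indexed by a few Show_Id-style labels
toy = pd.DataFrame(
    {"Title": ["3%", "07:19", "23:59"], "Rating": ["TV-MA", "TV-MA", "R"]},
    index=pd.Index(["s1", "s2", "s3"], name="Show_Id"),
)

# 's3' is dropped (not requested); 's9' is added as an all-NaN row
reindexed = toy.reindex(["s1", "s2", "s9"])

# fill_value replaces the NaN that reindexing would otherwise introduce
filled = toy.reindex(["s1", "s2", "s9"], fill_value="Unknown")

# method='ffill' requires a monotonic index; shown here on a numeric Series
s = pd.Series([1.0, 2.0], index=[0, 2])
forward_filled = s.reindex([0, 1, 2, 3], method="ffill")

print(reindexed)
print(filled)
print(forward_filled.tolist())  # [1.0, 1.0, 2.0, 2.0]
```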

`.reindex_like()`: Reindexes a DataFrame so that its row and column labels match those of another DataFrame. Labels that exist only in the other DataFrame are filled with NaN.

In [46]:
import pandas as pd

# Original DataFrame
df4 = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=['A', 'B'],
    columns=['X', 'Y', 'Z']
)

# Another DataFrame with a different index and columns
df5 = pd.DataFrame(
    [[10, 11], [12, 13], [14, 15]],
    index=['B', 'C', 'D'],
    columns=['Y', 'W']
)

# Reindex df4 to match the index and columns of df5
df4_reindexed = df4.reindex_like(df5)

print("Original df4:")
print(df4)
print("\nDataFrame df5 (like which df4 is reindexed):")
print(df5)
print("\nReindexed df4:")
print(df4_reindexed)

Original df4:
   X  Y  Z
A  1  2  3
B  4  5  6

DataFrame df5 (like which df4 is reindexed):
    Y   W
B  10  11
C  12  13
D  14  15

Reindexed df4:
     Y    W
B  5.0  NaN
C  NaN  NaN
D  NaN  NaN


#### Handling Missing Indices
Reindexing or resetting the index can address gaps or missing indices, ensuring a continuous and consistent index for analysis.

In [47]:
# is_monotonic_increasing only checks that the index is sorted; it does not detect gaps
print("Is the index sorted?", df.index.is_monotonic_increasing)

# Reset index to ensure continuity (labels become 0, 1, 2, ...)
df_reset_continuous = df.reset_index(drop=True)
df_reset_continuous

# Reindex against a full range; labels missing from df become NaN rows (the default fill)
df_reindex_range = df.reindex(range(len(df)))
df_reindex_range

Is the index sorted? True


Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...","October 19, 2020",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,"March 2, 2019",TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,"September 25, 2020",TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,"October 31, 2020",TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
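
The dataset above already has a continuous index, so the gap handling is not visible there. The sketch below uses a small, made-up frame (`toy`, an assumption for illustration) whose integer index skips some labels, and shows one way to list the missing labels before choosing between reindexing and resetting.

```python
import pandas as pd

# Hypothetical frame whose integer index skips 2 and 4 (e.g. after dropping rows)
toy = pd.DataFrame({"value": [10, 11, 13, 15]}, index=[0, 1, 3, 5])

# Sorted, but not gap-free: is_monotonic_increasing alone cannot reveal gaps
is_sorted = toy.index.is_monotonic_increasing

# Compare with the full expected range to list the missing labels
expected = pd.RangeIndex(int(toy.index.min()), int(toy.index.max()) + 1)
missing = expected.difference(toy.index).tolist()

# Option 1: keep the labels and insert NaN rows for the gaps
with_gaps_filled = toy.reindex(expected)

# Option 2: discard the old labels and renumber from 0
renumbered = toy.reset_index(drop=True)

print(is_sorted, missing)            # True [2, 4]
print(len(with_gaps_filled))         # 6
print(renumbered.index.tolist())     # [0, 1, 2, 3]
```

Option 1 preserves the original labels and makes the gaps explicit as NaN rows, while Option 2 hides the gaps; which one is appropriate depends on whether the labels carry meaning.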


#### Converting Index Types
`.index.astype()` or `pd.to_datetime()`: Converts the index to a specific type (e.g., datetime for time-series data). Ensures the index is compatible with analysis requirements.

In [48]:
# Convert a date column to datetime and set as index (assuming 'Release_Date' is a column)
df['Release_Date'] = pd.to_datetime(df['Release_Date'], errors='coerce')  # Convert to datetime
df_date_index = df.set_index('Release_Date')
df_date_index

Unnamed: 0_level_0,Show_Id,Category,Title,Director,Cast,Country,Rating,Duration,Type,Description
Release_Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
2020-08-14,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
2016-12-23,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2018-12-20,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
2017-11-16,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
2020-01-01,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...
2020-10-19,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...",TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
2019-03-02,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
2020-09-25,s7785,Movie,Zulu Man in Japan,,Nasty C,,TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
2020-10-31,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [49]:
# Convert the index labels to strings if needed (astype returns an Index, not a DataFrame)
df_string_index = df.set_index('Show_Id').index.astype(str)
df_string_index

Index(['s1', 's2', 's3', 's4', 's5', 's6', 's7', 's8', 's9', 's10',
       ...
       's7778', 's7779', 's7780', 's7781', 's7782', 's7783', 's7784', 's7785',
       's7786', 's7787'],
      dtype='object', name='Show_Id', length=7789)
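
One practical benefit of a datetime index is selection by date. The following is a minimal sketch on made-up rows (the `toy` frame and its titles are assumptions for illustration) showing how `errors='coerce'` handles an unparseable date and how a sorted `DatetimeIndex` supports selecting a whole year with a string.

```python
import pandas as pd

# Hypothetical rows; the third date is deliberately unparseable
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Release_Date": ["August 14, 2020", "December 23, 2016", "not a date", "January 1, 2020"],
})

# errors='coerce' turns unparseable strings into NaT instead of raising
toy["Release_Date"] = pd.to_datetime(toy["Release_Date"], errors="coerce")
n_unparsed = int(toy["Release_Date"].isna().sum())

# Drop NaT rows, then sort so that date-based selection is well defined
by_date = toy.dropna(subset=["Release_Date"]).set_index("Release_Date").sort_index()

# Partial string indexing: select every row released in 2020
titles_2020 = by_date.loc["2020", "Title"].tolist()

print(n_unparsed)     # 1
print(titles_2020)    # ['D', 'A']
```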

----

## **3.4: Removing Duplicate Rows (drop_duplicates()) 🚫**
The `.drop_duplicates()` method is used to remove redundant rows from a DataFrame based on the values in all columns or a specified subset of columns. This is a critical step to ensure data uniqueness and prevent redundancy, which can skew analysis or machine learning models.

### **Why It Matters**
- Removes redundant data to improve data quality.
- Prevents biased or incorrect results in analysis or modeling due to duplicate entries.
- Does **not** directly handle duplicate indices (see "Checking and Handling Duplicate Indices" above for index handling).


<table border="1">
  <thead>
    <tr>
      <th>Parameter</th>
      <th>Description</th>
      <th>Recommended Usage</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>subset</code></td>
      <td>Specifies a list of column names to consider when identifying duplicates.</td>
      <td>Use when rows should count as duplicates if they match on specific fields only (e.g., checking for duplicate customers by ID and Email).</td>
    </tr>
    <tr>
      <td><code>keep='first'</code></td>
      <td>Retain the first occurrence of the duplicate set and discard the rest.</td>
      <td>Default and most common choice.</td>
    </tr>
    <tr>
      <td><code>keep='last'</code></td>
      <td>Retain the last occurrence of the duplicate set and discard the rest.</td>
      <td>Useful if the newest record is likely the most accurate.</td>
    </tr>
    <tr>
      <td><code>keep=False</code></td>
      <td>Drop every row that belongs to a duplicate set, including the first occurrence, leaving only records that were never repeated.</td>
      <td>Use when you require absolutely no ambiguity in your data.</td>
    </tr>
    <tr>
      <td><code>ignore_index=True</code></td>
      <td>Resets the index to a default integer index (0, 1, 2, ...) after dropping duplicates.</td>
      <td>Use to ensure index continuity after removal.</td>
    </tr>
  </tbody>
</table>


In [50]:
# Remove duplicate rows (keep the first occurrence)
df_no_duplicates = df.drop_duplicates()
print("\nDataFrame after removing duplicates (all columns):")
df_no_duplicates


DataFrame after removing duplicates (all columns):


Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,2020-08-14,TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,2016-12-23,TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,2018-12-20,R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,2017-11-16,PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,2020-01-01,PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...",2020-10-19,TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,2019-03-02,TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,2020-09-25,TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,2020-10-31,TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [51]:
# Remove duplicates based on specific columns (e.g., 'Title' and 'Type')
df_subset_duplicates = df.drop_duplicates(subset=['Title', 'Type'], keep='first')
print("\nDataFrame after removing duplicates based on 'Title' and 'Type':")
df_subset_duplicates


DataFrame after removing duplicates based on 'Title' and 'Type':


Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,2020-08-14,TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,2016-12-23,TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,2018-12-20,R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,2017-11-16,PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,2020-01-01,PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7784,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...",2020-10-19,TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7785,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,2019-03-02,TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7786,s7785,Movie,Zulu Man in Japan,,Nasty C,,2020-09-25,TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7787,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,2020-10-31,TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...


In [52]:
# Remove duplicates and reset the index
df_no_duplicates_reset = df.drop_duplicates(ignore_index=True)
print("\nDataFrame after removing duplicates with reset index:")
df_no_duplicates_reset

# Modify the original DataFrame in place
# df.drop_duplicates(inplace=True)  # Uncomment to modify df directly


DataFrame after removing duplicates with reset index:


Unnamed: 0,Show_Id,Category,Title,Director,Cast,Country,Release_Date,Rating,Duration,Type,Description
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,2020-08-14,TV-MA,4 Seasons,"International TV Shows, TV Dramas, TV Sci-Fi &...",In a future where the elite inhabit an island ...
1,s2,Movie,07:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,2016-12-23,TV-MA,93 min,"Dramas, International Movies",After a devastating earthquake hits Mexico Cit...
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,2018-12-20,R,78 min,"Horror Movies, International Movies","When an army recruit is found dead, his fellow..."
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,2017-11-16,PG-13,80 min,"Action & Adventure, Independent Movies, Sci-Fi...","In a postapocalyptic world, rag-doll robots hi..."
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,2020-01-01,PG-13,123 min,Dramas,A brilliant group of students become card-coun...
...,...,...,...,...,...,...,...,...,...,...,...
7782,s7783,Movie,Zozo,Josef Fares,"Imad Creidi, Antoinette Turk, Elias Gergi, Car...","Sweden, Czech Republic, United Kingdom, Denmar...",2020-10-19,TV-MA,99 min,"Dramas, International Movies",When Lebanon's Civil War deprives Zozo of his ...
7783,s7784,Movie,Zubaan,Mozez Singh,"Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan...",India,2019-03-02,TV-14,111 min,"Dramas, International Movies, Music & Musicals",A scrappy but poor boy worms his way into a ty...
7784,s7785,Movie,Zulu Man in Japan,,Nasty C,,2020-09-25,TV-MA,44 min,"Documentaries, International Movies, Music & M...","In this documentary, South African rapper Nast..."
7785,s7786,TV Show,Zumbo's Just Desserts,,"Adriano Zumbo, Rachel Khoo",Australia,2020-10-31,TV-PG,1 Season,"International TV Shows, Reality TV",Dessert wizard Adriano Zumbo looks for the nex...
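
The cells above use the default `keep='first'`. To compare all the `keep` options side by side, here is a minimal sketch on a small, made-up frame (`toy` and its values are assumptions for illustration) in which two rows share the same `Title` and `Type`.

```python
import pandas as pd

# Hypothetical rows: rows 0 and 1 share Title and Type but differ in Rating
toy = pd.DataFrame({
    "Title": ["Zozo", "Zozo", "Zubaan", "Zozo"],
    "Type": ["Dramas", "Dramas", "Dramas", "Comedies"],
    "Rating": ["TV-MA", "TV-14", "TV-14", "TV-MA"],
})

# No two rows match on every column, so nothing is removed
all_cols = toy.drop_duplicates()

# With subset, rows 0 and 1 are duplicates; keep decides which one survives
keep_first = toy.drop_duplicates(subset=["Title", "Type"], keep="first")
keep_last = toy.drop_duplicates(subset=["Title", "Type"], keep="last")
keep_none = toy.drop_duplicates(subset=["Title", "Type"], keep=False)

# ignore_index=True relabels the surviving rows 0..n-1
renumbered = toy.drop_duplicates(subset=["Title", "Type"], ignore_index=True)

# duplicated() exposes the same mask without dropping anything
n_dupes = int(toy.duplicated(subset=["Title", "Type"]).sum())

print(keep_first.index.tolist())   # [0, 2, 3]
print(keep_last.index.tolist())    # [1, 2, 3]
print(keep_none.index.tolist())    # [2, 3]
print(renumbered.index.tolist())   # [0, 1, 2]
print(n_dupes)                     # 1
```

Counting duplicates with `.duplicated()` before dropping them is a cheap way to confirm how many rows a given `subset` would remove.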


---