In [1]:
import numpy as np          # type: ignore
import pandas as pd         # type: ignore

# 🛠️ **Handling Missing Values in Pandas**  

## 🔍 **Detecting Missing Data**
| **Method**  | **Description** |
|------------|----------------|
| `isnull()`  | ✅ Returns `True` for missing values (`NaN`) |
| `notnull()` | ✅ Returns `True` for non-missing values |

---
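
As a minimal sketch of the two detection methods above, the snippet below builds a small frame and inspects it. The frame `sample` and its values are hypothetical and serve only as an illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, used only for illustration.
sample = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": ["x", None, "z"]})

# Boolean mask: True wherever a value is missing.
missing_mask = sample.isnull()

# Summing the mask counts the missing values in each column.
missing_per_column = sample.isnull().sum()

# notnull() is the inverse; here it keeps the rows where "A" is present.
rows_with_a = sample[sample["A"].notnull()]

print(missing_mask)
print(missing_per_column)
print(rows_with_a)
```

Summing the mask is a quick way to see which columns need attention before deciding whether to drop or fill.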

## ✂️ **Removing Missing Data**
| **Method**  | **Description** |
|------------|----------------|
| `dropna()`  | 🗑️ Drops rows with **any** missing values |
| `dropna(how="all")`  | 🗑️ Drops rows **only if all** values are missing |
| `dropna(axis=1, how="all")`  | 🗑️ Drops **columns** where all values are missing |
| `dropna(thresh=5)`  | 🗑️ Drops rows with **fewer than 5** non-missing values |

---

## 🎨 **Filling Missing Data**
| **Method**  | **Description** |
|------------|----------------|
| `fillna(0)`  | 🔄 Replaces missing values with `0` |
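| `ffill()`  | 🔄 Forward fills missing values (propagates the last valid value); replaces `fillna(method="ffill")`, which is deprecated since pandas 2.1 |
| `bfill()`  | 🔄 Backward fills missing values (uses the next valid value); replaces `fillna(method="bfill")`, which is deprecated since pandas 2.1 |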

---
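
The filling strategies can be compared side by side on one small series. This is a sketch on hypothetical values; it uses the `ffill()` and `bfill()` methods directly, since recent pandas versions deprecate passing `method=` to `fillna()`.

```python
import numpy as np
import pandas as pd

# Hypothetical series with gaps in the middle and at the end.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan])

# Replace every missing value with a constant.
filled_zero = s.fillna(0)

# Forward fill: each gap takes the last valid value before it.
filled_forward = s.ffill()

# Backward fill: each gap takes the next valid value after it.
# The trailing gap has no later value, so it stays missing.
filled_backward = s.bfill()

print(pd.DataFrame({"original": s, "zero": filled_zero,
                    "ffill": filled_forward, "bfill": filled_backward}))
```

Note that a backward fill cannot fill a gap at the very end, and a forward fill cannot fill one at the very start.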

💡 **Tip:** Choose the method that fits your data! **Dropping** discards information, while **filling** keeps every row but can introduce bias if the fill value is a poor guess. 🚀  

<div style="width: 100%; height: 10px; background: linear-gradient(to right, orange, red, orange, red, orange); border-radius: 5px; margin: 20px 0;"></div>

# 🗑️ **dropna() - Removing Missing Values in Pandas**  

## 🔥 **Why use `dropna()`?**  
Missing data (`NaN`) can bias statistics and cause errors in code that does not accept missing values.  
`dropna()` handles this by removing the rows or columns that contain missing values.

---

## ✂️ **Usage and Variations**
| **Method**  | **Effect** | **Example** |
|------------|------------|------------|
| `df.dropna()`  | Removes rows with **any** missing values | `df.dropna()` |
| `df.dropna(how="all")`  | Removes rows **only if all** values are missing | `df.dropna(how="all")` |
| `df.dropna(axis=1, how="all")`  | Removes **columns** where all values are missing | `df.dropna(axis=1, how="all")` |
| `df.dropna(thresh=5)`  | Keeps rows with **at least 5** non-missing values | `df.dropna(thresh=5)` |

---

## 🚀 **Example**
```python
import pandas as pd
import numpy as np

data = {"A": [1, 2, np.nan, 4], "B": [np.nan, 2, 3, np.nan]}
df = pd.DataFrame(data)

# Remove rows with any missing values
df_cleaned = df.dropna()
print(df_cleaned)
```

In [2]:
data = {"A": [1, 2, np.nan, 4], "B": [np.nan, 2, 3, np.nan]}
data = pd.DataFrame(data)
data

Unnamed: 0,A,B
0,1.0,
1,2.0,2.0
2,,3.0
3,4.0,


In [3]:
df_cleaned1 = data.dropna()       # Removes rows with **any** missing values
print(df_cleaned1)

     A    B
1  2.0  2.0


In [4]:
df_cleaned2 = data.dropna(how="all")     # Removes rows **only if all** values are missing
print(df_cleaned2)

     A    B
0  1.0  NaN
1  2.0  2.0
2  NaN  3.0
3  4.0  NaN


In [5]:
df_cleaned3 = data.dropna(axis=1, how="all")      # Removes **columns** where all values are missing
print(df_cleaned3)

     A    B
0  1.0  NaN
1  2.0  2.0
2  NaN  3.0
3  4.0  NaN


-------
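
The `thresh` variant from the table is not run in the cells above, so here is a minimal sketch of it. The frame `frame` and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 has 3 values, row 1 has 2, row 2 has none.
frame = pd.DataFrame({
    "A": [1, np.nan, np.nan],
    "B": [2, 5, np.nan],
    "C": [3, 6, np.nan],
})

# thresh=2 keeps rows that have at least 2 non-missing values.
kept_two = frame.dropna(thresh=2)

# thresh=3 is stricter: only fully populated rows survive here.
kept_three = frame.dropna(thresh=3)

print(kept_two)
print(kept_three)
```

`thresh` sits between the two extremes: it is less aggressive than the default `how="any"` and stricter than `how="all"`.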

In [6]:
df = pd.read_csv('../datasets/datasetBad.csv')

print(df.shape)
df

(21, 8)


Unnamed: 0,Duration,Date,Pulse,Maxpulse,Calories,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,130,409.1,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,145,479.0,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,135,340.0,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,175,282.4,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,148,406.0,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,127,300.0,304-762-2467,Swanson,768 City Parkway
6,60,'2020/12/07',110,136,374.0,,Winger,1209 South Street
7,450,'2020/12/08',104,134,253.3,876|678|3469,Holmes,98 Clue Drive
8,30,'2020/12/09',109,133,195.1,N/a,,123 Middle Earth
9,60,'2020/12/10',98,124,269.0,123-545-5421,Parker,"25th Main Street, New York"


In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21 entries, 0 to 20
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Duration      21 non-null     int64  
 1   Date          20 non-null     object 
 2   Pulse         21 non-null     int64  
 3   Maxpulse      21 non-null     int64  
 4   Calories      20 non-null     float64
 5   Phone_Number  19 non-null     object 
 6   Last_Name     20 non-null     object 
 7   Address       21 non-null     object 
dtypes: float64(1), int64(3), object(4)
memory usage: 1.4+ KB


In [8]:
df.dropna(inplace = True)
print(df.shape)
df

(16, 8)


Unnamed: 0,Duration,Date,Pulse,Maxpulse,Calories,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,130,409.1,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,145,479.0,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,135,340.0,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,175,282.4,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,148,406.0,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,127,300.0,304-762-2467,Swanson,768 City Parkway
7,450,'2020/12/08',104,134,253.3,876|678|3469,Holmes,98 Clue Drive
9,60,'2020/12/10',98,124,269.0,123-545-5421,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,120,250.7,7066950392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,120,250.7,123-543-2345,Draper,2039 Main Street


<div style="width: 100%; height: 10px; background: linear-gradient(to right, orange, red, orange, red, orange); border-radius: 5px; margin: 20px 0;"></div>

# 🔄 **drop_duplicates() - Removing Duplicate Rows in Pandas**  

## ❓ **Why use `drop_duplicates()`?**  
Duplicate data can **skew analysis** and **increase redundancy**.  
`drop_duplicates()` helps in **removing duplicate rows** efficiently.

---

## ✂️ **Usage and Variations**
| **Method**  | **Effect** | **Example** |
|------------|------------|------------|
| `df.drop_duplicates()`  | Removes **duplicate rows**, keeping the first occurrence | `df.drop_duplicates()` |
| `df.drop_duplicates(keep="last")`  | Removes **duplicate rows**, keeping the last occurrence | `df.drop_duplicates(keep="last")` |
| `df.drop_duplicates(keep=False)`  | Removes **all occurrences** of duplicate rows | `df.drop_duplicates(keep=False)` |
| `df.drop_duplicates(subset=["col"])`  | Drops duplicates based on **specific column(s)** | `df.drop_duplicates(subset=["name"])` |

---

## 🚀 **Example**
```python
import pandas as pd

data = {"Name": ["Alice", "Bob", "Alice", "Charlie"],
        "Age": [25, 30, 25, 35]}

df = pd.DataFrame(data)

# Remove duplicate rows
df_unique = df.drop_duplicates()
print(df_unique)
```

In [9]:
data = {"Name": ["Alice", "Bob", "Alice", "Charlie"],
        "Age": [25, 30, 25, 35]}
data = pd.DataFrame(data)
data

Unnamed: 0,Name,Age
0,Alice,25
1,Bob,30
2,Alice,25
3,Charlie,35


In [10]:
df_unique1 = data.drop_duplicates()     # Removes **duplicate rows**, keeping the first occurrence
print(df_unique1)

      Name  Age
0    Alice   25
1      Bob   30
3  Charlie   35


In [11]:
df_unique2 = data.drop_duplicates(keep="last")      # Removes **duplicate rows**, keeping the last occurrence
print(df_unique2)

      Name  Age
1      Bob   30
2    Alice   25
3  Charlie   35


In [12]:
df_unique3 = data.drop_duplicates(keep=False)      # Removes **all occurrences** of duplicate rows
print(df_unique3)

      Name  Age
1      Bob   30
3  Charlie   35


In [13]:
df_unique4 = data.drop_duplicates(subset=["Name"])      # Drops duplicates based on **specific column(s)** only
print(df_unique4)

      Name  Age
0    Alice   25
1      Bob   30
3  Charlie   35


------------

In [14]:
df

Unnamed: 0,Duration,Date,Pulse,Maxpulse,Calories,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,130,409.1,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,145,479.0,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,135,340.0,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,175,282.4,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,148,406.0,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,127,300.0,304-762-2467,Swanson,768 City Parkway
7,450,'2020/12/08',104,134,253.3,876|678|3469,Holmes,98 Clue Drive
9,60,'2020/12/10',98,124,269.0,123-545-5421,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,120,250.7,7066950392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,120,250.7,123-543-2345,Draper,2039 Main Street


In [15]:
print(f"before: {df.shape}")
df = df.drop_duplicates()
print(f"AFTER : {df.shape}")
df

before: (16, 8)
AFTER : (16, 8)


Unnamed: 0,Duration,Date,Pulse,Maxpulse,Calories,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,130,409.1,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,145,479.0,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,135,340.0,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,175,282.4,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,148,406.0,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,127,300.0,304-762-2467,Swanson,768 City Parkway
7,450,'2020/12/08',104,134,253.3,876|678|3469,Holmes,98 Clue Drive
9,60,'2020/12/10',98,124,269.0,123-545-5421,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,120,250.7,7066950392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,120,250.7,123-543-2345,Draper,2039 Main Street


In [16]:
data = {
    'A': [1, 2, 2, 4],
    'B': [5, 6, 6, 8],
    'C': [9, 10, 10, 12],
    'D': [9, 10, 10, 12] }

df1 = pd.DataFrame(data)
df1

Unnamed: 0,A,B,C,D
0,1,5,9,9
1,2,6,10,10
2,2,6,10,10
3,4,8,12,12


> ## 👉 **`duplicated()`**: flags each row that repeats an earlier row

In [17]:
df1.duplicated()

0    False
1    False
2     True
3    False
dtype: bool
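
A few common ways to use the `duplicated()` mask are sketched below on a hypothetical frame: counting duplicates, viewing every copy of a repeated row, and checking a single column.

```python
import pandas as pd

# Hypothetical frame in which rows 1 and 2 are identical.
frame = pd.DataFrame({"A": [1, 2, 2, 4], "B": [5, 6, 6, 8]})

# The mask is boolean, so summing it counts the duplicate rows.
n_duplicates = frame.duplicated().sum()

# keep=False marks every copy, which helps when inspecting duplicates.
all_copies = frame[frame.duplicated(keep=False)]

# subset limits the comparison to the listed columns.
dup_on_a = frame.duplicated(subset=["A"])

print(n_duplicates)
print(all_copies)
print(dup_on_a)
```

Inspecting the duplicates first makes it easier to judge whether `drop_duplicates()` should keep the first copy, the last copy, or none.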

In [18]:
df_unique_rows = df1.drop_duplicates()

print("DataFrame after removing duplicate rows:")
df_unique_rows

DataFrame after removing duplicate rows:


Unnamed: 0,A,B,C,D
0,1,5,9,9
1,2,6,10,10
3,4,8,12,12


In [19]:
# Drop duplicates based on a specific column (here: "email").
def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    # Keep the first row seen for each email address.
    return customers.drop_duplicates(subset="email")

data = pd.DataFrame()
data["customer_id"] = pd.Series([1,2,3,4,5,6])
data["name"]  = pd.Series(["ali","omar","amr","samy","khaled","sayed"])
data["email"] = pd.Series([ "emily@example.com",
                            "michael@example.com",
                            "sarah@example.com",
                            "john@example.com",
                            "john@example.com",
                            "alice@example.com"])
data

Unnamed: 0,customer_id,name,email
0,1,ali,emily@example.com
1,2,omar,michael@example.com
2,3,amr,sarah@example.com
3,4,samy,john@example.com
4,5,khaled,john@example.com
5,6,sayed,alice@example.com


In [20]:
dropDuplicateEmails(data)

Unnamed: 0,customer_id,name,email
0,1,ali,emily@example.com
1,2,omar,michael@example.com
2,3,amr,sarah@example.com
3,4,samy,john@example.com
5,6,sayed,alice@example.com


<div style="width: 100%; height: 10px; background: linear-gradient(to right, orange, red, orange, red, orange); border-radius: 5px; margin: 20px 0;"></div>

In [21]:
df

Unnamed: 0,Duration,Date,Pulse,Maxpulse,Calories,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,130,409.1,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,145,479.0,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,135,340.0,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,175,282.4,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,148,406.0,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,127,300.0,304-762-2467,Swanson,768 City Parkway
7,450,'2020/12/08',104,134,253.3,876|678|3469,Holmes,98 Clue Drive
9,60,'2020/12/10',98,124,269.0,123-545-5421,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,120,250.7,7066950392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,120,250.7,123-543-2345,Draper,2039 Main Street


In [22]:
df = df.drop(columns = ["Calories","Maxpulse"])
df

Unnamed: 0,Duration,Date,Pulse,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,123-545-5421,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,123/643/9775,Nadir,93 West Main Street
2,60,'2020/12/03',103,7066950392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,123-543-2345,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,876|678|3469,Snow,123 Dragons Road
5,60,'2020/12/06',102,304-762-2467,Swanson,768 City Parkway
7,450,'2020/12/08',104,876|678|3469,Holmes,98 Clue Drive
9,60,'2020/12/10',98,123-545-5421,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,7066950392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,123-543-2345,Draper,2039 Main Street


<div style="width: 100%; height: 10px; background: linear-gradient(to right, orange, red, orange, red, orange); border-radius: 5px; margin: 20px 0;"></div>

> ### 👉 **`strip()`** , **`split()`**

In [23]:
print(df["Last_Name"])

0         Baggins
1           Nadir
2          /White
3         Schrute
4            Snow
5         Swanson
7          Holmes
9          Parker
11      ...Potter
12         Draper
13          Knope
14    Flenderson_
15        Weasley
17           Kent
19      Skywalker
20      Skywalker
Name: Last_Name, dtype: object


In [24]:
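# df["Last_Name"].str.strip("_") would remove only underscores.
# Here we strip any leading or trailing "_", ".", "/" or "-" characters.
# Note: the result is not assigned back, so df["Last_Name"] itself stays unchanged.
df["Last_Name"].str.strip("_.../-")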

0        Baggins
1          Nadir
2          White
3        Schrute
4           Snow
5        Swanson
7         Holmes
9         Parker
11        Potter
12        Draper
13         Knope
14    Flenderson
15       Weasley
17          Kent
19     Skywalker
20     Skywalker
Name: Last_Name, dtype: object
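
To make the cleanup permanent, the stripped result has to be assigned back to the column. The sketch below shows this on a hypothetical frame `names`, together with the one-sided variants `str.lstrip()` and `str.rstrip()`.

```python
import pandas as pd

# Hypothetical frame with the same kinds of stray characters as above.
names = pd.DataFrame({"Last_Name": ["/White", "...Potter", "Flenderson_", "Snow"]})

# str.strip(chars) removes any of the listed characters from both ends.
# Assigning the result back is what actually updates the column.
names["Last_Name"] = names["Last_Name"].str.strip("._/-")

# lstrip and rstrip work the same way but touch only one side.
left_only = pd.Series(["__a__"]).str.lstrip("_")
right_only = pd.Series(["__a__"]).str.rstrip("_")

print(names)
print(left_only.iloc[0], right_only.iloc[0])
```

The argument is a set of characters, not a prefix or suffix, so their order and repetition inside the string do not matter.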

In [25]:
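#### ------------------ apply() ------------------
# Caution: the raw values still contain separators ("-", "/", "|"), so slicing by
# position cuts through them and yields malformed numbers (see the output below).
# Removing the non-digit characters first would give a clean result.
df["Phone_Number"]=df["Phone_Number"].apply(lambda x: x[0:3]+ "-" + x[3:6] + "-" + x[6:10])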
df["Phone_Number"]

0     123--54-5-54
1     123-/64-3/97
2     706-695-0392
3     123--54-3-23
4     876-|67-8|34
5     304--76-2-24
7     876-|67-8|34
9     123--54-5-54
11    706-695-0392
12    123--54-3-23
13    876-|67-8|34
14    304--76-2-24
15    123--54-5-54
17    706-695-0392
19    876-|67-8|34
20    876-|67-8|34
Name: Phone_Number, dtype: object

> ### **⚠️ For a fuller explanation, see [4-Apply-map-applymap.ipynb](<../📂 03_Advanced_Methods/📂 03.1_Grouping_and_Aggregation/4-Apply-map-applymap.ipynb>)**

In [26]:
#### ------------------ replace() ------------------
# Removes each literal "--" pair; this alone does not restore a valid phone format.
df["Phone_Number"] = df["Phone_Number"].str.replace("--","")
df["Phone_Number"]

0       12354-5-54
1     123-/64-3/97
2     706-695-0392
3       12354-3-23
4     876-|67-8|34
5       30476-2-24
7     876-|67-8|34
9       12354-5-54
11    706-695-0392
12      12354-3-23
13    876-|67-8|34
14      30476-2-24
15      12354-5-54
17    706-695-0392
19    876-|67-8|34
20    876-|67-8|34
Name: Phone_Number, dtype: object
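
The outputs above are malformed because the slicing runs on strings that still contain separators. One possible fix, sketched here on a few phone values taken from the dataset, is to strip every non-digit character first and only then insert the dashes. The names `phones`, `digits` and `formatted` are illustrative and not part of the notebook.

```python
import pandas as pd

# A few raw values in the mixed formats seen in the dataset.
phones = pd.Series(["123-545-5421", "123/643/9775", "7066950392", "876|678|3469"])

# Step 1: remove every non-digit character (\D matches anything but 0-9).
digits = phones.str.replace(r"\D", "", regex=True)

# Step 2: re-insert dashes by slicing the digit-only strings.
formatted = digits.apply(lambda x: f"{x[0:3]}-{x[3:6]}-{x[6:10]}")

print(formatted)
```

This sketch assumes every value holds exactly ten digits; shorter or longer values would need separate handling, for example by checking `digits.str.len()` first.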

In [28]:
df

Unnamed: 0,Duration,Date,Pulse,Phone_Number,Last_Name,Address
0,60,'2020/12/01',110,12354-5-54,Baggins,"123 Shire Lane, Shire"
1,60,'2020/12/02',117,123-/64-3/97,Nadir,93 West Main Street
2,60,'2020/12/03',103,706-695-0392,/White,298 Drugs Driveway
3,45,'2020/12/04',109,12354-3-23,Schrute,"980 Paper Avenue, Pennsylvania, 18503"
4,45,'2020/12/05',117,876-|67-8|34,Snow,123 Dragons Road
5,60,'2020/12/06',102,30476-2-24,Swanson,768 City Parkway
7,450,'2020/12/08',104,876-|67-8|34,Holmes,98 Clue Drive
9,60,'2020/12/10',98,12354-5-54,Parker,"25th Main Street, New York"
11,60,'2020/12/12',100,706-695-0392,...Potter,2394 Hogwarts Avenue
12,60,'2020/12/12',100,12354-3-23,Draper,2039 Main Street


In [27]:
df["Address"].str.split(",",expand=True)

Unnamed: 0,0,1,2
0,123 Shire Lane,Shire,
1,93 West Main Street,,
2,298 Drugs Driveway,,
3,980 Paper Avenue,Pennsylvania,18503.0
4,123 Dragons Road,,
5,768 City Parkway,,
7,98 Clue Drive,,
9,25th Main Street,New York,
11,2394 Hogwarts Avenue,,
12,2039 Main Street,,


In [29]:
# Add three new columns from the split parts.
# This works here because the split yields exactly three columns.
df[["Street_Address", "State", "Post_Code"]] = df["Address"].str.split(',', expand=True)
df[["Street_Address", "State", "Post_Code"]]

Unnamed: 0,Street_Address,State,Post_Code
0,123 Shire Lane,Shire,
1,93 West Main Street,,
2,298 Drugs Driveway,,
3,980 Paper Avenue,Pennsylvania,18503.0
4,123 Dragons Road,,
5,768 City Parkway,,
7,98 Clue Drive,,
9,25th Main Street,New York,
11,2394 Hogwarts Avenue,,
12,2039 Main Street,,
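
Splitting on `","` leaves a leading space on every part after the first, and the number of resulting columns depends on the data. The sketch below, on a few addresses from the dataset, caps the number of splits with `n=2` and strips the leftover whitespace. The name `parts` is illustrative.

```python
import pandas as pd

# A few addresses with zero, one and two commas.
addresses = pd.Series([
    "123 Shire Lane, Shire",
    "93 West Main Street",
    "980 Paper Avenue, Pennsylvania, 18503",
])

# n=2 allows at most 2 splits, so at most 3 columns are produced.
parts = addresses.str.split(",", n=2, expand=True)

# Remove the whitespace that follows each comma.
parts = parts.apply(lambda col: col.str.strip())
parts.columns = ["Street_Address", "State", "Post_Code"]

print(parts)
```

Rows with fewer commas get missing values in the trailing columns, as in the output above.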


In [33]:
df

Unnamed: 0,Duration,Date,Pulse,Phone_Number,Last_Name,Address,Street_Address,State,Post_Code
0,60,'2020/12/01',110,12354-5-54,Baggins,"123 Shire Lane, Shire",123 Shire Lane,Shire,
1,60,'2020/12/02',117,123-/64-3/97,Nadir,93 West Main Street,93 West Main Street,,
2,60,'2020/12/03',103,706-695-0392,/White,298 Drugs Driveway,298 Drugs Driveway,,
3,45,'2020/12/04',109,12354-3-23,Schrute,"980 Paper Avenue, Pennsylvania, 18503",980 Paper Avenue,Pennsylvania,18503.0
4,45,'2020/12/05',117,876-|67-8|34,Snow,123 Dragons Road,123 Dragons Road,,
5,60,'2020/12/06',102,30476-2-24,Swanson,768 City Parkway,768 City Parkway,,
7,450,'2020/12/08',104,876-|67-8|34,Holmes,98 Clue Drive,98 Clue Drive,,
9,60,'2020/12/10',98,12354-5-54,Parker,"25th Main Street, New York",25th Main Street,New York,
11,60,'2020/12/12',100,706-695-0392,...Potter,2394 Hogwarts Avenue,2394 Hogwarts Avenue,,
12,60,'2020/12/12',100,12354-3-23,Draper,2039 Main Street,2039 Main Street,,


> ### 👉 **`reset_index()`**: replaces the index with the default `0, 1, 2, ...` numbering
> ### 👉 `drop=True` discards the old index; `drop=False` (the default) keeps it as a new `index` column

In [35]:
print(df.shape)
print("-----------------------")
df.reset_index(drop=True)

(16, 9)
-----------------------


Unnamed: 0,Duration,Date,Pulse,Phone_Number,Last_Name,Address,Street_Address,State,Post_Code
0,60,'2020/12/01',110,12354-5-54,Baggins,"123 Shire Lane, Shire",123 Shire Lane,Shire,
1,60,'2020/12/02',117,123-/64-3/97,Nadir,93 West Main Street,93 West Main Street,,
2,60,'2020/12/03',103,706-695-0392,/White,298 Drugs Driveway,298 Drugs Driveway,,
3,45,'2020/12/04',109,12354-3-23,Schrute,"980 Paper Avenue, Pennsylvania, 18503",980 Paper Avenue,Pennsylvania,18503.0
4,45,'2020/12/05',117,876-|67-8|34,Snow,123 Dragons Road,123 Dragons Road,,
5,60,'2020/12/06',102,30476-2-24,Swanson,768 City Parkway,768 City Parkway,,
6,450,'2020/12/08',104,876-|67-8|34,Holmes,98 Clue Drive,98 Clue Drive,,
7,60,'2020/12/10',98,12354-5-54,Parker,"25th Main Street, New York",25th Main Street,New York,
8,60,'2020/12/12',100,706-695-0392,...Potter,2394 Hogwarts Avenue,2394 Hogwarts Avenue,,
9,60,'2020/12/12',100,12354-3-23,Draper,2039 Main Street,2039 Main Street,,


In [None]:
df.reset_index(drop=False)

Unnamed: 0,index,Duration,Date,Pulse,Phone_Number,Last_Name,Address,Street_Address,State,Post_Code
0,0,60,'2020/12/01',110,12354-5-54,Baggins,"123 Shire Lane, Shire",123 Shire Lane,Shire,
1,1,60,'2020/12/02',117,123-/64-3/97,Nadir,93 West Main Street,93 West Main Street,,
2,2,60,'2020/12/03',103,706-695-0392,/White,298 Drugs Driveway,298 Drugs Driveway,,
3,3,45,'2020/12/04',109,12354-3-23,Schrute,"980 Paper Avenue, Pennsylvania, 18503",980 Paper Avenue,Pennsylvania,18503.0
4,4,45,'2020/12/05',117,876-|67-8|34,Snow,123 Dragons Road,123 Dragons Road,,
5,5,60,'2020/12/06',102,30476-2-24,Swanson,768 City Parkway,768 City Parkway,,
6,7,450,'2020/12/08',104,876-|67-8|34,Holmes,98 Clue Drive,98 Clue Drive,,
7,9,60,'2020/12/10',98,12354-5-54,Parker,"25th Main Street, New York",25th Main Street,New York,
8,11,60,'2020/12/12',100,706-695-0392,...Potter,2394 Hogwarts Avenue,2394 Hogwarts Avenue,,
9,12,60,'2020/12/12',100,12354-3-23,Draper,2039 Main Street,2039 Main Street,,
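
The difference between the two `drop` settings is easier to see on a tiny frame. This is a minimal sketch with a hypothetical frame whose index has a gap, like the one left behind by `dropna()`.

```python
import pandas as pd

# Hypothetical frame whose index has a gap (0, 1, 7).
frame = pd.DataFrame({"Pulse": [110, 117, 104]}, index=[0, 1, 7])

# drop=True: the old index is discarded and replaced by 0, 1, 2.
dropped = frame.reset_index(drop=True)

# drop=False (the default): the old index is kept as a column named "index".
kept = frame.reset_index()

print(dropped)
print(kept)
```

In both cases `reset_index()` returns a new DataFrame; the original keeps its index unless the result is assigned back.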


<div style="width: 100%; height: 10px; background: linear-gradient(to right, orange, red, orange, red, orange); border-radius: 5px; margin: 20px 0;"></div>

> ### **⚠️ For more on data cleaning, see [3-isnull,notnull & Delete,Drop](<3-isnull,notnull & Delete,Drop.ipynb>)**