### Garbage In, Garbage Out (GIGO): Cleaning Missing Data
**Description**: Load a dataset (e.g., the Titanic dataset) and identify its missing values.
Then handle them with a technique suited to each column, such as dropping rows or imputing with the median or mode.

In [1]:
# Write your code from here
import pandas as pd
import seaborn as sns

# Load Titanic dataset from seaborn
titanic = sns.load_dataset('titanic')

# 1. Identify missing values
print("Missing values per column:")
print(titanic.isnull().sum())

# 2. Handling missing data

# Example techniques:

# a) Drop rows where 'age' or 'embarked' is missing (less preferred when it discards many rows)
titanic_dropped = titanic.dropna(subset=['age', 'embarked'])

# b) Fill missing 'age' values with median age
median_age = titanic['age'].median()
titanic['age_filled'] = titanic['age'].fillna(median_age)

# c) Fill missing 'embarked' with mode (most common port)
mode_embarked = titanic['embarked'].mode()[0]
titanic['embarked_filled'] = titanic['embarked'].fillna(mode_embarked)

# Display summary of changes
print("\nAfter handling missing data:")
print(titanic[['age', 'age_filled', 'embarked', 'embarked_filled']].head(10))


Missing values per column:
survived         0
pclass           0
sex              0
age            177
sibsp            0
parch            0
fare             0
embarked         2
class            0
who              0
adult_male       0
deck           688
embark_town      2
alive            0
alone            0
dtype: int64

After handling missing data:
    age  age_filled embarked embarked_filled
0  22.0        22.0        S               S
1  38.0        38.0        C               C
2  26.0        26.0        S               S
3  35.0        35.0        S               S
4  35.0        35.0        S               S
5   NaN        28.0        Q               Q
6  54.0        54.0        S               S
7   2.0         2.0        S               S
8  27.0        27.0        S               S
9  14.0        14.0        C               C
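
The counts above show that `deck` is missing in 688 rows, far more than `age` (177) or `embarked` (2), yet the cell leaves it untouched. One possible way to handle such a column is to drop it once its share of missing values passes a cutoff, then impute the remaining columns and confirm that nothing is left missing. The sketch below illustrates that idea on a small made-up frame rather than the real dataset, so it runs without downloading anything. The 0.5 cutoff, the `age_was_missing` flag column, and all data values are assumptions for illustration, not part of the original exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the Titanic data (values are made up).
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, np.nan, 54.0],
    "embarked": ["S", "C", "S", None, "Q", "S"],
    "deck": [None, "C", None, None, None, None],
})

# Share of missing values per column, as a fraction between 0 and 1.
missing_share = df.isna().mean()

# Assumed cutoff: drop columns with more than half of their values missing.
SPARSE_THRESHOLD = 0.5
sparse_cols = missing_share[missing_share > SPARSE_THRESHOLD].index.tolist()
cleaned = df.drop(columns=sparse_cols)

# Record which ages were imputed before filling them in,
# so the information "this value was missing" is not lost.
cleaned["age_was_missing"] = cleaned["age"].isna()
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

# Fill the categorical column with its most frequent value.
cleaned["embarked"] = cleaned["embarked"].fillna(cleaned["embarked"].mode()[0])

# Confirm that no missing values remain after cleaning.
remaining = int(cleaned.isna().sum().sum())

print(missing_share.round(2))
print("Dropped columns:", sparse_cols)
print("Remaining missing values:", remaining)
```

The same steps should carry over to the `titanic` frame by replacing `df` with it, although a suitable cutoff depends on how much the sparse column matters to the analysis. Keeping a flag such as `age_was_missing` is a design choice: it lets a later model or summary distinguish observed ages from imputed ones.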
