# 🔍 Day 6: Exploratory Data Analysis (EDA)

## 🧠 Objective
Perform an in-depth exploratory analysis of the Titanic dataset using visual and statistical techniques.

## 🛠️ Setup and Dataset Load

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import skew

# Load Titanic dataset
df = sns.load_dataset("titanic")
df.head()

## 🔸 Data Overview

In [None]:
df.info()

In [None]:
df.describe(include='all')

## 🔸 Variable Types

In [None]:
df.dtypes.value_counts()

## 🔸 Missing Data

In [None]:
df.isnull().sum().sort_values(ascending=False)

In [None]:
# df.isnull() is a True/False grid; the darker cells mark missing values
sns.heatmap(df.isnull(), cbar=False, cmap="YlGnBu")
plt.title("Missing Value Heatmap")
plt.show()
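
Raw missing counts are easier to compare across columns when expressed as a share of all rows. The snippet below is a minimal sketch of that idea; the `toy` frame and its values are made up for illustration and are not Titanic data, so swap in `df` to apply it to the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up toy rows standing in for the Titanic frame (not real passenger data).
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "deck": [np.nan, "C", np.nan, np.nan],
    "fare": [7.25, 71.28, 8.05, 53.10],
})

# isnull().mean() is the fraction of missing entries per column;
# multiplying by 100 turns it into a percentage.
missing_pct = (toy.isnull().mean() * 100).sort_values(ascending=False)
print(missing_pct)
```

Columns with a very high missing share are usually candidates for dropping, while lightly affected columns are candidates for imputation.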

## 🔸 Univariate Analysis

In [None]:
sns.countplot(x='sex', data=df)
plt.title("Gender Distribution")
plt.show()

In [None]:
sns.histplot(df['age'].dropna(), kde=True, bins=30)
plt.title("Age Distribution")
plt.show()

## 🔸 Bivariate Analysis

In [None]:
sns.countplot(x='sex', hue='survived', data=df)
plt.title("Survival Count by Gender")
plt.show()

In [None]:
sns.scatterplot(x='age', y='fare', hue='survived', data=df)
plt.title("Fare vs Age Colored by Survival")
plt.show()

## 🔸 Correlation Analysis

In [None]:
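# Keep only numeric columns; boolean and categorical columns are excluded
numeric = df.select_dtypes(include='number')
# Pairwise Pearson correlation (the pandas default)
corr = numeric.corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()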

## 🔸 Outlier Detection

In [None]:
sns.boxplot(y='fare', data=df)
plt.title("Boxplot of Fare")
plt.show()
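
The points a boxplot draws beyond its whiskers follow, by default, the 1.5 × IQR rule. The sketch below applies that rule numerically so the outliers can be listed rather than only seen. The helper name `iqr_outliers` and the sample values are hypothetical, chosen for illustration only.

```python
import pandas as pd

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    clean = values.dropna()
    q1, q3 = clean.quantile(0.25), clean.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return clean[(clean < lower) | (clean > upper)]

# Made-up fare-like values for illustration (not taken from the dataset).
fares = pd.Series([7.25, 8.05, 13.0, 26.0, 30.0, 512.33])
flagged = iqr_outliers(fares)
print(flagged)
```

Calling `iqr_outliers(df['fare'])` on the loaded dataset should flag the same points the boxplot shows as individual markers.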

## 🔸 Skewness Check

In [None]:
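# Skewness of each numeric column; NaNs are dropped first because
# scipy's skew returns NaN when the input contains missing values
skew_vals = df.select_dtypes(include='number').apply(lambda x: skew(x.dropna()))
skew_vals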
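
One detail worth knowing when comparing results: `scipy.stats.skew` uses the biased (population) formula by default, while pandas' `Series.skew()` applies a sample-size correction. The sketch below shows the two agreeing once `bias=False` is passed to scipy. The input values are made up for illustration.

```python
import pandas as pd
from scipy.stats import skew

# Made-up right-skewed values for illustration.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 20.0])

biased = skew(values.to_numpy())                 # scipy default: bias=True
adjusted = skew(values.to_numpy(), bias=False)   # sample-size corrected
pandas_skew = values.skew()                      # pandas corrects by default

print(biased, adjusted, pandas_skew)
```

On small samples the two conventions can differ noticeably; on a dataset the size of Titanic the gap is small, but it explains why `df.skew(numeric_only=True)` will not match `skew_vals` exactly.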

## 🎮 EDA Challenge


**Try answering these using visualizations:**
- Which group had the higher survival rate: male or female passengers?
- Did younger passengers survive more than older ones?
- Is there any class-based bias in survival?

Use countplots, histplots, or scatter plots to derive answers.
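
A grouped mean is a handy numeric cross-check for whatever the plots suggest, because the mean of a 0/1 `survived` column is the survival rate. The sketch below assumes a hypothetical helper `survival_rate` and uses made-up rows whose outcomes carry no meaning; run it on `df` to get the real figures.

```python
import pandas as pd

# Made-up rows for illustration only; these are not real passenger records.
toy = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female", "male"],
    "pclass": [3, 1, 3, 1, 2, 3],
    "survived": [0, 1, 1, 1, 0, 0],
})

def survival_rate(frame: pd.DataFrame, by: str) -> pd.Series:
    """Mean of the 0/1 'survived' column within each group of `by`."""
    return frame.groupby(by)["survived"].mean().sort_values(ascending=False)

by_sex = survival_rate(toy, "sex")
by_class = survival_rate(toy, "pclass")
print(by_sex)
print(by_class)
```

For the age question, one option is to bin ages first with `pd.cut` and pass the binned column as `by`.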


## ✅ Summary
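- Reviewed the core EDA techniques: data overview, missing-value checks, univariate and bivariate plots, correlation, outlier detection, and skewness
- Learned how to inspect and visualize data with pandas and seaborn
- Ready to draw insights from the data before model building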