# Titanic Dataset - Exploratory Data Analysis (EDA)
This notebook is an exploratory data analysis (EDA) of the Titanic dataset.
It examines passenger demographics and socio-economic status, and how each relates to survival outcomes.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# set_theme is the current name; sns.set is kept only as an older alias.
sns.set_theme(style='whitegrid')

In [None]:
# Assumes train.csv (the Titanic training file) is in the working directory.
df = pd.read_csv('train.csv')
df.head()

In [None]:
df.info()
df.describe()

In [None]:
df.isnull().sum()

In [None]:
plt.figure(figsize=(6,4))
sns.countplot(x='Survived', data=df, palette='pastel')
plt.title('Survival Count')
plt.show()

In [None]:
plt.figure(figsize=(6,4))
sns.countplot(x='Sex', hue='Survived', data=df, palette='Set2')
plt.title('Survival by Sex')
plt.show()

In [None]:
plt.figure(figsize=(6,4))
sns.countplot(x='Pclass', hue='Survived', data=df, palette='muted')
plt.title('Survival by Passenger Class')
plt.show()

In [None]:
plt.figure(figsize=(6,4))
sns.histplot(df['Age'].dropna(), bins=30, kde=True, color='skyblue')
plt.title('Age Distribution')
plt.show()

In [None]:
plt.figure(figsize=(6,4))
sns.histplot(df['Fare'], bins=40, kde=True, color='orange')
plt.title('Fare Distribution')
plt.show()

In [None]:
plt.figure(figsize=(8,6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Heatmap')
plt.show()

### Key Insights
- Women had a much higher survival rate than men.
- First-class passengers had the highest chance of survival, while third-class had the lowest.
- Most passengers with a recorded age were adults between 20 and 40; in the heatmap, `Age` alone is only weakly correlated with survival.
- Higher fares were associated with greater survival probability.
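- The correlation heatmap shows a moderate negative correlation (about -0.34) between `Pclass` and `Survived`: a higher class number, meaning a lower class, goes with lower survival.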
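
The plots above show counts rather than rates, so the insights on sex, class, and fare can be checked numerically by taking the mean of `Survived` within each group. The following is a minimal sketch of that check, not part of the original analysis. The helper names `survival_rate_by` and `survival_rate_by_fare_quartile` are hypothetical, and the `toy` frame is made-up illustration data rather than rows from `train.csv`.

```python
import pandas as pd


def survival_rate_by(frame: pd.DataFrame, column: str) -> pd.Series:
    """Return the survival rate (mean of `Survived`) for each value of `column`."""
    return frame.groupby(column)["Survived"].mean().sort_values(ascending=False)


def survival_rate_by_fare_quartile(frame: pd.DataFrame) -> pd.Series:
    """Bin `Fare` into quartiles and return the survival rate within each bin."""
    fare_bin = pd.qcut(frame["Fare"], q=4, duplicates="drop")
    return frame.groupby(fare_bin, observed=True)["Survived"].mean()


# Toy rows for illustration only; these values are NOT taken from train.csv.
toy = pd.DataFrame({
    "Survived": [1, 1, 0, 0, 1, 0, 0, 1],
    "Sex": ["female", "female", "male", "male", "female", "male", "female", "male"],
    "Pclass": [1, 2, 3, 3, 1, 2, 3, 1],
    "Fare": [80.0, 26.0, 7.9, 8.05, 120.0, 13.0, 7.75, 52.0],
})

sex_rates = survival_rate_by(toy, "Sex")
class_rates = survival_rate_by(toy, "Pclass")
fare_rates = survival_rate_by_fare_quartile(toy)
print(sex_rates, class_rates, fare_rates, sep="\n\n")
```

In the notebook itself, passing `df` instead of `toy` (for example `survival_rate_by(df, 'Sex')`) would give the actual rates behind each bullet. Quartile bins are used for `Fare` because its distribution is heavily right-skewed, so equal-width bins would leave most passengers in the first bin.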