# 🚢 Titanic Survival Analysis

This project performs **Exploratory Data Analysis (EDA)** on the Titanic dataset to identify patterns influencing passenger survival.

## 📌 Objective
Explore the Titanic dataset with visual and statistical tools to understand how age, sex, passenger class, and fare influenced survival.

## 📂 Dataset Description
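The dataset contains the following columns:
- `PassengerId`: Unique passenger identifier
- `Survived`: Survival (0 = No, 1 = Yes)
- `Pclass`: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
- `Name`, `Sex`, `Age`: Passenger name, sex, and age in years
- `SibSp`: Number of siblings or spouses aboard
- `Parch`: Number of parents or children aboard
- `Ticket`, `Fare`, `Cabin`: Ticket number, fare paid, and cabin number
- `Embarked`: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)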

## 🧹 Data Preprocessing

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv("titanic dataset.csv")
df.head()

In [None]:
# Check missing values
df.isnull().sum()
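
Raw counts are easier to judge next to the share of rows they represent. The sketch below is one possible way to report both; `missing_summary` is a hypothetical helper name, and the `toy` frame holds made-up values purely to keep the example self-contained. In the notebook you would pass `df` instead.

```python
import numpy as np
import pandas as pd

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    return summary.sort_values("missing", ascending=False)

# Toy frame with made-up values (NOT the real Titanic data), used only for illustration
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

summary = missing_summary(toy)
print(summary)
```

A column whose `percent` is very high is a candidate for dropping, while a column with only a few gaps is usually better filled.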

In [None]:
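# Drop Cabin (too many missing values) along with Ticket and Name (text columns not used in this analysis)
df.drop(['Cabin', 'Ticket', 'Name'], axis=1, inplace=True)
# Fill the remaining gaps: median for Age, most frequent value for Embarked.
# Assign the result back; fillna(inplace=True) on a selected column is deprecated in pandas 2.x and may not modify df.
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])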

In [None]:
# Verify changes
df.isnull().sum()

## 📊 Exploratory Data Analysis (EDA)

In [None]:
# Survival counts
sns.countplot(data=df, x='Survived')
plt.title('Survival Count')
plt.show()

In [None]:
# Gender vs Survival
sns.countplot(data=df, x='Sex', hue='Survived')
plt.title('Survival by Gender')
plt.show()

In [None]:
# Class vs Survival
sns.countplot(data=df, x='Pclass', hue='Survived')
plt.title('Survival by Class')
plt.show()
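
Count plots show absolute numbers, which can mislead when groups differ in size. Because `Survived` is coded 0/1, its mean within a group is that group's survival rate. The following is a minimal sketch of that idea; `survival_rate` is a hypothetical helper, and the `toy` frame contains made-up rows rather than real passengers.

```python
import pandas as pd

def survival_rate(frame: pd.DataFrame, by: str) -> pd.Series:
    """Mean of the 0/1 Survived column per group, i.e. the survival rate."""
    return frame.groupby(by)["Survived"].mean()

# Toy frame with made-up values (NOT the real Titanic data)
toy = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male", "female"],
    "Pclass": [1, 3, 3, 1, 3, 2],
    "Survived": [1, 1, 0, 1, 0, 0],
})

by_sex = survival_rate(toy, "Sex")
by_class = survival_rate(toy, "Pclass")
print(by_sex)
print(by_class)
```

In the notebook, `survival_rate(df, 'Sex')` and `survival_rate(df, 'Pclass')` would give the rates that correspond to the two count plots above.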

In [None]:
# Age distribution by survival
sns.histplot(data=df, x='Age', hue='Survived', kde=True)
plt.title('Age Distribution by Survival')
plt.show()
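
Overlapping histograms make small differences between age groups hard to read. One option is to bin `Age` into bands with `pd.cut` and compare survival rates per band. The band edges and labels below are assumptions chosen for illustration, not part of the original analysis, and the `toy` frame holds made-up values.

```python
import pandas as pd

# Hypothetical band edges and labels; adjust to whatever grouping suits the analysis
AGE_BINS = [0, 12, 18, 40, 60, 120]
AGE_LABELS = ["child", "teen", "adult", "middle-aged", "senior"]

def survival_by_age_band(frame: pd.DataFrame) -> pd.Series:
    """Survival rate per age band; intervals are right-inclusive, e.g. (0, 12]."""
    bands = pd.cut(frame["Age"], bins=AGE_BINS, labels=AGE_LABELS)
    # observed=True keeps only the bands that actually occur in the data
    return frame.groupby(bands, observed=True)["Survived"].mean()

# Toy frame with made-up values (NOT the real Titanic data)
toy = pd.DataFrame({
    "Age": [4, 10, 30, 35, 70, 16],
    "Survived": [1, 1, 0, 1, 0, 1],
})

rates = survival_by_age_band(toy)
print(rates)
```

Keep in mind that missing ages were filled with the median earlier, so the band containing the median holds imputed passengers as well as observed ones.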

In [None]:
# Correlation heatmap
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.title('Feature Correlation Heatmap')
plt.show()
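
The heatmap covers every numeric pair, but the question here is how each feature relates to `Survived`. A minimal sketch of pulling out just that column and ranking it by absolute value follows; `correlation_with_target` is a hypothetical helper and the `toy` frame is made up. Note that `numeric_only=True` skips text columns such as `Sex` and `Embarked`, so they would need numeric encoding to appear here.

```python
import pandas as pd

def correlation_with_target(frame: pd.DataFrame, target: str = "Survived") -> pd.Series:
    """Pearson correlation of each numeric column with the target, strongest first."""
    corr = frame.corr(numeric_only=True)[target].drop(target)
    # Rank by magnitude so strong negative correlations are not pushed to the bottom
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)

# Toy frame with made-up values (NOT the real Titanic data)
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Fare": [80.0, 60.0, 20.0, 8.0, 7.0, 9.0],
    "Sex": ["female", "male", "female", "male", "male", "female"],
    "Survived": [1, 1, 1, 0, 0, 0],
})

ranked = correlation_with_target(toy)
print(ranked)
```

In the notebook, `correlation_with_target(df)` would also list `PassengerId`, which is only an identifier and can be dropped before interpreting the result.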

## 🔍 Conclusion
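- **Women** had a higher survival rate than men.
- **1st class passengers** were more likely to survive than those in 2nd or 3rd class.
- Younger passengers had slightly better chances of survival.
- Fare and class were both correlated with survival: higher fares and higher classes (lower `Pclass` values) went with better survival.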

This EDA helps highlight the socio-economic and demographic factors behind survival on the Titanic.

## 📚 References
- Dataset: Titanic (Kaggle)
- Tools: Pandas, Matplotlib, Seaborn
- Internship Project by Hex Softwares