# Exploratory Data Analysis on Titanic Dataset

In this notebook, we perform exploratory data analysis (EDA) on seaborn's example Titanic dataset using pandas, Matplotlib, and seaborn.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Apply seaborn's dark grid theme to all plots
sns.set_theme(style="darkgrid")


In [None]:
# Load Titanic dataset from seaborn
df = sns.load_dataset('titanic')
df.head()


In [None]:
# Dataset information
df.info()


In [None]:
# Check for missing values
df.isnull().sum()
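
Raw counts are easier to act on when paired with the share of rows they represent. The following is a minimal sketch, assuming a hypothetical helper named `missing_report` (not part of pandas) and a small toy frame standing in for `df`:

```python
import numpy as np
import pandas as pd


def missing_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages per column, worst first."""
    report = pd.DataFrame({
        "missing": frame.isnull().sum(),
        "percent": (frame.isnull().mean() * 100).round(1),
    })
    return report.sort_values("missing", ascending=False)


# Toy data for illustration only; these are not real Titanic rows.
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 54.0],
    "deck": [np.nan, "C", np.nan, np.nan],
    "fare": [7.25, 71.28, 8.05, 51.86],
})
print(missing_report(toy))
```

In the notebook, calling `missing_report(df)` would show which columns are mostly empty, which helps decide between dropping a column and filling its gaps.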


In [None]:
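# Drop 'deck', which is missing for most passengers
df_cleaned = df.drop(columns=['deck'])
# Fill missing ages with the median age. Assign the result back instead of
# calling fillna(inplace=True) on a column selection, which may not update
# df_cleaned in recent pandas versions.
df_cleaned['age'] = df_cleaned['age'].fillna(df['age'].median())
# Drop the remaining rows that still contain missing values
df_cleaned.dropna(inplace=True)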
df_cleaned.info()


## Data Visualizations

In [None]:
# Survival count
sns.countplot(x='survived', data=df_cleaned)
plt.title('Survival Count')
plt.show()


In [None]:
# Survival by gender
sns.countplot(x='sex', hue='survived', data=df_cleaned)
plt.title('Survival by Gender')
plt.show()
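
The count plot shows absolute numbers, which are hard to compare when the groups differ in size. A survival rate per group can be sketched as below; `survival_rate_by` is a hypothetical helper name, and the toy frame only mimics the `sex` and `survived` columns:

```python
import pandas as pd


def survival_rate_by(frame: pd.DataFrame, column: str) -> pd.Series:
    """Mean of the 0/1 `survived` flag within each group of `column`."""
    return frame.groupby(column, observed=True)["survived"].mean()


# Toy data for illustration only; these are not real Titanic rows.
toy = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "male", "male"],
    "survived": [1, 1, 0, 1, 0, 0],
})
rates = survival_rate_by(toy, "sex")
print(rates)
```

Because `survived` is coded as 0 or 1, its mean within a group is the fraction that survived. In the notebook, the equivalent call would be `survival_rate_by(df_cleaned, 'sex')`.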


In [None]:
# Age distribution
plt.hist(df_cleaned['age'], bins=20, edgecolor='black')
plt.title('Age Distribution of Passengers')
plt.xlabel('Age')
plt.ylabel('Count')
plt.show()
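
To put a number on what the histogram suggests, ages can be grouped into bands with `pd.cut`. This is a sketch under assumptions: the band edges and the helper name `age_band_share` are illustrative choices, and the ages are toy values. Note that ages filled with the median all fall into a single band, so that band is overstated relative to the ages that were actually recorded.

```python
import pandas as pd


def age_band_share(ages: pd.Series) -> pd.Series:
    """Share of passengers per age band, using left-closed intervals."""
    bands = pd.cut(ages, bins=[0, 20, 40, 60, 120], right=False)
    return bands.value_counts(normalize=True, sort=False)


# Toy ages for illustration only; these are not real Titanic values.
toy_ages = pd.Series([4.0, 19.0, 22.0, 28.0, 28.0, 35.0, 41.0, 62.0])
shares = age_band_share(toy_ages)
print(shares)
```

In the notebook, `age_band_share(df_cleaned['age'])` would give the share for each band, and `age_band_share(df['age'].dropna())` would give the same figures without the imputed values.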


## Conclusion
- Women survived at a higher rate than men.
- Most passengers were in the 20–40 age range.
- First-class passengers had a better chance of survival than passengers in the lower classes.
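
The claims about sex and passenger class are statements about rates, which a pivot table of mean survival can check directly. A minimal sketch, assuming the `sex`, `pclass`, and `survived` columns of the seaborn dataset and using toy rows in place of `df_cleaned`:

```python
import pandas as pd

# Toy data for illustration only; these are not real Titanic rows.
toy = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "male", "female"],
    "pclass": [1, 3, 1, 3, 3, 1],
    "survived": [1, 0, 1, 0, 0, 1],
})

# Rows are sexes, columns are passenger classes, cells are survival rates.
rates = toy.pivot_table(
    index="sex", columns="pclass", values="survived", aggfunc="mean"
)
print(rates)
```

Running the same `pivot_table` call on `df_cleaned` would show the survival rate for each combination of sex and class in the actual data.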