
# Titanic Dataset: Exploratory Data Analysis

In this notebook, we explore a small 10-passenger sample in the shape of the Titanic dataset, using summary statistics, count plots, a histogram, a box plot, and a correlation heatmap.

---


In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Build a small 10-row sample in the shape of the Titanic dataset
data = {
    'survived': [0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    'pclass': [3, 1, 3, 1, 3, 3, 1, 2, 2, 3],
    'sex': ['male', 'female', 'female', 'female', 'male', 'male', 'female', 'male', 'female', 'male'],
    'age': [22, 38, 26, 35, 35, 27, 54, 2, 27, 19],
    'fare': [7.25, 71.28, 7.92, 53.1, 8.05, 8.46, 51.86, 21.08, 11.13, 7.89]
}
df = pd.DataFrame(data)

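# Basic info (df.info() prints its report directly and returns None,
# so wrapping it in print() would emit a stray "None")
df.info()
print(df.describe(include='all'))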

# Survival Counts
sns.countplot(x='survived', data=df)
plt.title('Survival Counts')
plt.show()

# Passenger Class Counts
sns.countplot(x='pclass', data=df)
plt.title('Passenger Class Counts')
plt.show()

# Gender Counts
sns.countplot(x='sex', data=df)
plt.title('Gender Counts')
plt.show()

# Age Distribution
df['age'].hist(bins=10, edgecolor='k')
plt.title('Age Distribution')
plt.xlabel('Age')
plt.ylabel('Count')
plt.show()

# Boxplot for Age vs Pclass
sns.boxplot(x='pclass', y='age', data=df)
plt.title('Age Distribution across Passenger Class')
plt.show()

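# Correlation Heatmap (numeric columns only; 'sex' holds strings and
# would make df.corr() raise an error in pandas 2.x)
sns.heatmap(df.select_dtypes(include='number').corr(), annot=True, cmap='coolwarm')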
plt.title('Correlation Heatmap')
plt.show()



### Observations:
- **Survival Counts**: The sample is evenly split: 5 of 10 passengers survived and 5 did not.
- **Passenger Class**: 3rd class is the largest group (5 passengers), followed by 1st class (3) and 2nd class (2).
- **Gender**: The sample contains equal numbers of males and females (5 each).
- **Age**: Ages range from 2 to 54 years, with the highest concentration in the 20-30 range (4 of 10 passengers).
- **Age vs Pclass**: 1st-class passengers are all 35 or older, while 2nd- and 3rd-class passengers are mostly under 30.
- **Correlations**: Survived and pclass are negatively correlated. Fare is negatively correlated with pclass (1st-class tickets cost more) and positively correlated with survived.
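
The counts and correlation signs listed above can be checked numerically rather than read off the plots. The following is a minimal sketch, assuming the same 10-row sample as in the code cell (it is rebuilt here so the snippet runs on its own); the variable names `survival_counts`, `class_counts`, `sex_counts`, and `corr` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same 10-row sample as in the code cell, rebuilt so this snippet is standalone
df = pd.DataFrame({
    'survived': [0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    'pclass': [3, 1, 3, 1, 3, 3, 1, 2, 2, 3],
    'sex': ['male', 'female', 'female', 'female', 'male',
            'male', 'female', 'male', 'female', 'male'],
    'age': [22, 38, 26, 35, 35, 27, 54, 2, 27, 19],
    'fare': [7.25, 71.28, 7.92, 53.1, 8.05, 8.46, 51.86, 21.08, 11.13, 7.89],
})

# Frequency tables behind the three count plots
survival_counts = df['survived'].value_counts()
class_counts = df['pclass'].value_counts()
sex_counts = df['sex'].value_counts()
print(survival_counts, class_counts, sex_counts, sep='\n\n')

# Pearson correlations over the numeric columns only ('sex' is excluded)
corr = df.select_dtypes(include='number').corr()
print(corr.round(2))
```

Note that `pclass` is coded so that 1 is the highest class, which is why a negative correlation with `pclass` means "higher class, higher value".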

---

### Summary of Findings
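- **Women had higher survival rates**: all 5 women in the sample survived, while none of the 5 men did.
- **Higher class passengers had better survival chances**: 3 of 3 survived in 1st class, 1 of 2 in 2nd class, and 1 of 5 in 3rd class.
- **Fare is positively correlated with survival**: survivors paid higher fares on average.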
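
The plots in the code cell do not break survival down by group, so the first two findings are easiest to confirm with a grouped mean. The following is a minimal sketch, assuming the same 10-row sample (only the columns needed here are rebuilt so the snippet is standalone); the names `rate_by_sex` and `rate_by_class` are illustrative. Because `survived` is coded 0/1, its mean within each group is that group's survival rate.

```python
import pandas as pd

# Relevant columns of the 10-row sample, rebuilt so this snippet is standalone
df = pd.DataFrame({
    'survived': [0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    'pclass': [3, 1, 3, 1, 3, 3, 1, 2, 2, 3],
    'sex': ['male', 'female', 'female', 'female', 'male',
            'male', 'female', 'male', 'female', 'male'],
})

# Mean of a 0/1 column per group = share of survivors in that group
rate_by_sex = df.groupby('sex')['survived'].mean()
rate_by_class = df.groupby('pclass')['survived'].mean()

print(rate_by_sex)
print(rate_by_class)
```

With only ten rows, these rates describe the sample itself and should not be read as estimates for the full passenger list.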

This concludes our mini-EDA of the sample Titanic dataset.
