# 🛳️ Titanic Dataset Analysis with Pandas

This notebook analyzes the Titanic passenger dataset with pandas, using Boolean masks and basic summary statistics (mean, median, counts, and rates).

We will answer the following questions using the dataset:

1. Average age of people who died
2. Average and median fare of people who died
3. Average age of males who died
4. Average age of females who died
5. Average age of survivors
6. Average fare of survivors
7. Total number of survivors
8. Median fare of children under 10
9. Comparison of average and median fare by passenger class
10. Comparison of death rates between males and females

In [1]:
# Only pandas is used in the analysis below
import pandas as pd


In [2]:
# Load the dataset (pandas is already imported above; adjust the path to your copy of titanic.csv)
df = pd.read_csv("/Users/sumrubektas/Desktop/titanic.csv")

# Check the first few rows
df.head()


| Unnamed: 0 | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |
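The `Unnamed: 0` header in the preview above suggests that the CSV may have been saved together with a row index. That is an assumption about how the file was produced, not something the notebook confirms. A minimal sketch with made-up CSV text (not the real file) shows how `read_csv` treats such a column and how `index_col=0` changes the result:

```python
import io

import pandas as pd

# Toy CSV text (hypothetical, not the real titanic.csv) whose first
# header cell is empty, as happens when a DataFrame is saved with its index.
csv_text = ",PassengerId,Survived\n0,1,0\n1,2,1\n"

# Default read: pandas names the unlabeled first column "Unnamed: 0".
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default.columns.tolist())

# With index_col=0 the first column becomes the row index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_indexed.columns.tolist())
```

None of the questions in this notebook use that column, so the results are the same either way.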


## 1. Average age of people who died


In [3]:
# Create a mask to select passengers who died (Survived == 0)
died_mask = df["Survived"] == 0

# Select only passengers who died
df_died = df[died_mask]

# Calculate the average age of those passengers
average_age_died = df_died["Age"].mean()

print("Average age of passengers who died:", average_age_died)


Average age of passengers who died: 30.62617924528302
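The average above is computed only over passengers whose age is recorded, because `Series.mean()` skips missing values by default (`skipna=True`). The sketch below uses a small made-up DataFrame (not the Titanic data) to show this and to count how many values were actually used:

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): three passengers died, one of them has no recorded age.
toy = pd.DataFrame({
    "Survived": [0, 0, 0, 1],
    "Age": [20.0, 40.0, np.nan, 30.0],
})

died = toy[toy["Survived"] == 0]

mean_age = died["Age"].mean()            # NaN is skipped: (20 + 40) / 2
n_used = int(died["Age"].count())        # number of non-missing ages
n_missing = int(died["Age"].isna().sum())  # number of missing ages

print(mean_age, n_used, n_missing)
```

Running `df_died["Age"].isna().sum()` on the real data would show how many passengers are left out of the average.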


## 2. Average and median fare of people who died


In [4]:
# Calculate average ticket fare for passengers who died
average_fare_died = df_died["Fare"].mean()

# Calculate median ticket fare for passengers who died
median_fare_died = df_died["Fare"].median()

print("Average ticket fare of passengers who died:", average_fare_died)
print("Median ticket fare of passengers who died:", median_fare_died)


Average ticket fare of passengers who died: 22.117886885245902
Median ticket fare of passengers who died: 10.5


## 3. Average age of males who died


In [6]:
# Mask for males who died
died_male_mask = (df["Survived"] == 0) & (df["Sex"] == "male")

# Select male passengers who died
df_died_male = df[died_male_mask]

# Calculate average age
average_age_died_male = df_died_male["Age"].mean()

print("Average age of males who died:", average_age_died_male)


Average age of males who died: 31.618055555555557


## 4. Average age of females who died

In [7]:
# Mask for females who died
died_female_mask = (df["Survived"] == 0) & (df["Sex"] == "female")

# Select female passengers who died
df_died_female = df[died_female_mask]

# Calculate average age
average_age_died_female = df_died_female["Age"].mean()

print("Average age of females who died:", average_age_died_female)


Average age of females who died: 25.046875
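Sections 3 and 4 build one mask per group. As an alternative sketch (not the approach used in this notebook), `groupby` on both `Survived` and `Sex` can produce all group averages in one call. The DataFrame below is made-up toy data, so the numbers do not match the real results:

```python
import pandas as pd

# Toy data (hypothetical), with the same column names as the Titanic dataset.
toy = pd.DataFrame({
    "Survived": [0, 0, 0, 1, 1],
    "Sex": ["male", "male", "female", "female", "male"],
    "Age": [30.0, 40.0, 20.0, 28.0, 10.0],
})

# One mean per (Survived, Sex) combination; the result has a two-level index.
mean_age = toy.groupby(["Survived", "Sex"])["Age"].mean()
print(mean_age)

# Look up a single group, e.g. males who died.
print(mean_age.loc[(0, "male")])
```

The mask-based version is easier to read step by step, while the `groupby` version scales better when there are many groups to compare.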


## 5. Average age of survivors


In [8]:
# Mask for survivors
survived_mask = df["Survived"] == 1

# Select passengers who survived
df_survived = df[survived_mask]

# Calculate average age
average_age_survived = df_survived["Age"].mean()

print("Average age of passengers who survived:", average_age_survived)


Average age of passengers who survived: 28.343689655172415


## 6. Average fare of survivors


In [9]:
# Calculate average fare of survivors
average_fare_survived = df_survived["Fare"].mean()

print("Average ticket fare of passengers who survived:", average_fare_survived)


Average ticket fare of passengers who survived: 48.39540760233918


## 7. Total number of survivors


In [10]:
# Count survivors: filter the rows where Survived == 1 and take the length of the result
total_survivors = len(df[df["Survived"] == 1])

print("Total number of survivors:", total_survivors)


Total number of survivors: 342
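Counting with `len()` on a filtered DataFrame works, but there are other common ways to get the same number. The sketch below, on made-up toy data, compares it with summing the Boolean mask (each `True` counts as 1) and with `value_counts()`:

```python
import pandas as pd

# Toy data (hypothetical): three survivors, two deaths.
toy = pd.DataFrame({"Survived": [1, 0, 1, 1, 0]})

# Approach used in the notebook: filter, then take the length.
count_len = len(toy[toy["Survived"] == 1])

# Alternative: sum the Boolean mask directly.
count_sum = int((toy["Survived"] == 1).sum())

# Alternative: count every distinct value at once.
counts = toy["Survived"].value_counts()

print(count_len, count_sum)
print(counts)
```

All three agree; `value_counts()` has the advantage of also returning the number of deaths.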


## 8. Median fare of children under 10 years old


In [11]:
# Mask for passengers younger than 10
young_mask = df["Age"] < 10

# Select those passengers
df_young = df[young_mask]

# Calculate median fare
median_fare_young = df_young["Fare"].median()

print("Median ticket fare of passengers younger than 10:", median_fare_young)


Median ticket fare of passengers younger than 10: 27.0
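A comparison such as `df["Age"] < 10` evaluates to `False` when the age is missing, so passengers with an unknown age are not counted as children. A minimal sketch on made-up toy data illustrates this:

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): the second passenger has no recorded age.
toy = pd.DataFrame({
    "Age": [4.0, np.nan, 35.0, 9.0],
    "Fare": [20.0, 50.0, 8.0, 30.0],
})

# NaN < 10 is False, so the row with the missing age is excluded.
young_mask = toy["Age"] < 10
print(young_mask.tolist())

# Median fare over the two rows that pass the mask (ages 4 and 9).
median_fare_young = toy.loc[young_mask, "Fare"].median()
print(median_fare_young)
```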


## 9. Compare average and median fare by passenger class (Pclass)


In [12]:
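# Loop over the three passenger classes and summarize the fares in each
for pclass in [1, 2, 3]:
    # Select the passengers travelling in this class
    df_class = df[df["Pclass"] == pclass]
    avg_fare = df_class["Fare"].mean()
    med_fare = df_class["Fare"].median()
    print(f"Pclass {pclass}: Average fare = {avg_fare:.2f}, Median fare = {med_fare:.2f}")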


Pclass 1: Average fare = 84.15, Median fare = 60.29
Pclass 2: Average fare = 20.66, Median fare = 14.25
Pclass 3: Average fare = 13.68, Median fare = 8.05
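In every class the average fare is higher than the median, which indicates that a few expensive tickets pull the mean upward. The same comparison can be sketched without an explicit loop using `groupby` and `agg`. This is an alternative to the loop above, shown on made-up toy data rather than the real file:

```python
import pandas as pd

# Toy data (hypothetical), with the same column names as the Titanic dataset.
toy = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Fare": [100.0, 50.0, 30.0, 20.0, 10.0, 8.0, 7.0, 9.0],
})

# One row per class, one column per statistic.
fare_by_class = toy.groupby("Pclass")["Fare"].agg(["mean", "median"])
print(fare_by_class.round(2))
```

On the real data, `df.groupby("Pclass")["Fare"].agg(["mean", "median"])` should reproduce the three lines printed by the loop.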


## 10. Compare death rates of males and females


In [14]:
# Total males and females
total_males = len(df[df["Sex"] == "male"])
total_females = len(df[df["Sex"] == "female"])

# Dead males and females
dead_males = len(df[(df["Sex"] == "male") & (df["Survived"] == 0)])
dead_females = len(df[(df["Sex"] == "female") & (df["Survived"] == 0)])

# Calculate death rates
death_rate_males = dead_males / total_males
death_rate_females = dead_females / total_females

print(f"Death rate of males: {death_rate_males:.2%}")
print(f"Death rate of females: {death_rate_females:.2%}")


Death rate of males: 81.11%
Death rate of females: 25.80%
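Because `Survived` is coded as 0 or 1, its mean within a group is that group's survival rate, and the death rate is one minus that value. The sketch below applies this idea with `groupby` on made-up toy data; it is an alternative to the four `len()` calls above, not the notebook's own method:

```python
import pandas as pd

# Toy data (hypothetical): 4 males (3 died) and 4 females (1 died).
toy = pd.DataFrame({
    "Sex": ["male"] * 4 + ["female"] * 4,
    "Survived": [0, 0, 0, 1, 1, 1, 1, 0],
})

# Mean of a 0/1 column is the share of 1s, i.e. the survival rate per group.
survival_rate = toy.groupby("Sex")["Survived"].mean()

# Death rate is the complement of the survival rate.
death_rate = 1 - survival_rate

for sex, rate in death_rate.items():
    print(f"Death rate of {sex}s: {rate:.2%}")
```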


## Summary

This project demonstrates how to use pandas to:

- Filter data using conditions (Boolean masks)
- Calculate averages and medians
- Compare groups using filters
- Count values with `len()`

