# Titanic Survival Analysis

## Introduction

This project explores the Titanic dataset to uncover patterns and factors that influenced passenger survival during the tragic 1912 sinking.  
By analyzing demographic, ticket, and survival data, we aim to identify the factors most strongly associated with survival, using Python, pandas, and Tableau.  
Our final goal is to communicate findings through visualizations and a structured presentation.


## 1. Business Understanding

The sinking of the RMS Titanic in 1912 is one of the most infamous maritime disasters in history. This dataset provides detailed information on the passengers aboard the Titanic, such as age, gender, ticket class, and whether they survived.

The goal of this analysis is to apply data science techniques to gain insights into the factors that influenced survival rates. Specifically, we aim to answer questions like:

- Did gender or age affect a passenger's chance of survival?
- Were passengers in certain classes more likely to survive?
- What role did family size or port of embarkation play?
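
As a rough illustration of how these questions translate into pandas, the sketch below computes survival rates by group. It uses only the five passengers shown in the dataset preview, so the numbers are not representative of the full dataset. `FamilySize` is a hypothetical derived column (siblings/spouses plus parents/children plus the passenger), not a field in the original data.

```python
import pandas as pd

# Five rows copied from the dataset preview; far too few for real conclusions
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Sex": ["male", "female", "female", "female", "male"],
    "SibSp": [1, 1, 0, 1, 0],
    "Parch": [0, 0, 0, 0, 0],
})

# The mean of a 0/1 flag is the share of passengers who survived
rate_by_sex = sample.groupby("Sex")["Survived"].mean()
rate_by_class = sample.groupby("Pclass")["Survived"].mean()

# Hypothetical derived feature: relatives aboard plus the passenger
sample["FamilySize"] = sample["SibSp"] + sample["Parch"] + 1
rate_by_family = sample.groupby("FamilySize")["Survived"].mean()

print(rate_by_sex, rate_by_class, rate_by_family, sep="\n\n")
```

The same `groupby(...)["Survived"].mean()` pattern should apply unchanged to the full DataFrame once it is loaded.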

Understanding these patterns is important not only for historical interest but also for developing predictive models that can simulate or explain survival in similar scenarios.

This project proceeds through four stages:

- Cleaning and preparing the data
- Performing exploratory data analysis (EDA)
- Generating key insights
- Visualizing and communicating findings through Tableau dashboards and a summary presentation

By the end of this analysis, we aim to have a clear, data-driven perspective on the demographics and conditions that most significantly impacted survival on the Titanic.


## 2. Data Understanding

The dataset used for this analysis is the Titanic passenger dataset, which is widely used in data science and machine learning education. It contains detailed information about passengers aboard the Titanic, including demographics, ticket details, and survival status.

### Dataset Overview

- **Total Rows:** 891
- **Total Columns:** 12

### 🧾 Column Descriptions
| Column        | Description |
|---------------|-------------|
| `PassengerId` | Unique ID for each passenger |
| `Survived`    | Survival status (0 = No, 1 = Yes) |
| `Pclass`      | Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd) |
| `Name`        | Full name of the passenger |
| `Sex`         | Sex of the passenger (`male` or `female`) |
| `Age`         | Age in years |
| `SibSp`       | Number of siblings/spouses aboard the Titanic |
| `Parch`       | Number of parents/children aboard the Titanic |
| `Ticket`      | Ticket number |
| `Fare`        | Passenger fare |
| `Cabin`       | Cabin number |
| `Embarked`    | Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton) |

### Initial Observations

- Some features contain missing values, notably `Age`, `Cabin`, and `Embarked`.
- The `Survived` column will serve as our target variable for analysis.
- A mix of categorical and numerical features will require appropriate handling.

Before proceeding to data cleaning, we will perform basic exploration to better understand the data structure and potential challenges.
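
A minimal sketch of that basic exploration is shown below: it checks the shape, counts missing values per column, and separates numerical from categorical columns. To stay self-contained it rebuilds a few columns from the five preview rows rather than reading the CSV, so the counts here describe only that small sample.

```python
import numpy as np
import pandas as pd

# A subset of columns from the five preview rows (NaN marks a missing Cabin)
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Sex": ["male", "female", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "Cabin": [np.nan, "C85", np.nan, "C123", np.nan],
    "Embarked": ["S", "C", "S", "S", "S"],
})

# Rows and columns in the frame
n_rows, n_cols = sample.shape

# Number of missing values in each column
missing_counts = sample.isna().sum()

# Split columns by type so each group can be handled appropriately
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
categorical_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(f"{n_rows} rows x {n_cols} columns")
print(missing_counts)
print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
```

On the full dataset, the same calls would be made on `df`; `df.info()` gives a similar summary in a single call.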

## 3. Data Cleaning

In this section, we will clean the Titanic dataset by:

- Checking for missing values
- Handling missing data appropriately
- Converting categorical variables into usable formats
- Ensuring data types are appropriate

These steps are crucial for accurate analysis and modeling later on.
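
The steps above could be sketched roughly as follows. This is one possible approach under stated assumptions, not necessarily the strategy used later in this project: `Age` is filled with the median, `Embarked` with the most frequent port, and `Cabin` is reduced to a hypothetical `HasCabin` flag. The toy rows are made up for illustration and contain deliberately placed gaps.

```python
import numpy as np
import pandas as pd

# Illustrative toy rows with deliberate gaps; not taken from the real dataset
toy = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
    "Embarked": ["S", "C", np.nan, "S"],
})

def clean(df):
    """Return a cleaned copy of df; every imputation choice is an assumption."""
    out = df.copy()
    # Assumption: the median is a reasonable stand-in for a missing age
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Assumption: fill a missing port with the most common one
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode().iloc[0])
    # Hypothetical flag: 1 if a cabin was recorded, 0 otherwise
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    out = out.drop(columns="Cabin")
    # Store text categories with an explicit categorical dtype
    out["Sex"] = out["Sex"].astype("category")
    out["Embarked"] = out["Embarked"].astype("category")
    return out

cleaned = clean(toy)
print(cleaned)
print(cleaned.dtypes)
```

Working on a copy keeps the raw DataFrame intact, which makes it easy to compare before and after or to try a different imputation rule.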

### Load Dataset and Inspect


```python
# Import essential libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Configure plots to show inline in Jupyter
%matplotlib inline

# Set style for plots
sns.set(style="whitegrid")

# Load the dataset from the project's 'Datasets' folder
# (absolute path on the author's machine; adjust it to your local copy)
df = pd.read_csv("/home/frank/Flatiron/Assignments/Phase_2/Group_1/Project_Titanic/Project_Titanic/Datasets/Titanic-Dataset.csv")

# Preview the first few rows of the dataset
df.head()
```



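|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|-------------|----------|--------|------|-----|-----|-------|-------|--------|------|-------|----------|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |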
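
The loading cell above reads the CSV from an absolute path, which only works on one machine. A more portable variant might look like the sketch below. It assumes the notebook is run from the project root and that the file sits in a `Datasets` folder there; `DATA_DIR` and `load_titanic` are hypothetical names introduced for illustration.

```python
from pathlib import Path

import pandas as pd

# Assumed layout: a "Datasets" folder relative to the working directory
DATA_DIR = Path("Datasets")
CSV_NAME = "Titanic-Dataset.csv"  # file name taken from the loading cell

def load_titanic(data_dir=DATA_DIR, csv_name=CSV_NAME):
    """Read the Titanic CSV from data_dir, failing early with a clear message."""
    csv_path = Path(data_dir) / csv_name
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected dataset at {csv_path.resolve()}")
    return pd.read_csv(csv_path)
```

With that layout in place, `df = load_titanic()` would replace the hard-coded `pd.read_csv(...)` call, and a missing file produces an error that names the path it looked for.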
