### Introduction

The Titanic disaster remains one of the most infamous maritime tragedies in history, capturing public interest for over a century. In this exploratory data analysis (EDA), we delve into the Titanic dataset to gain insights into the passengers who traveled aboard the ill-fated ship. 

Through this analysis, we'll use different data visualization and statistical methods to find patterns and insights about the passengers. The goal is to better understand what happened during this tragic event and to see how data can help us learn about real-life situations.


### Titanic Dataset Description

The Titanic dataset is a well-known collection of data often used in data analysis and machine learning. It provides information about passengers aboard the RMS Titanic, which sank on its maiden voyage in April 1912. The dataset includes details like age, gender, class, ticket fare, and survival status. It’s a useful resource for exploring patterns and factors that might have affected survival during this historic event.

#### Key Features of the Dataset:

- **PassengerId**: A unique identifier for each passenger.
- **Survived**: A binary indicator (0 = No, 1 = Yes) representing whether the passenger survived the disaster.
- **Pclass**: The class of the ticket purchased by the passenger (1 = First, 2 = Second, 3 = Third).
- **Name**: The full name of the passenger.
- **Sex**: The gender of the passenger (male or female).
- **Age**: The age of the passenger in years (some entries may be missing).
- **SibSp**: The number of siblings or spouses aboard the Titanic.
- **Parch**: The number of parents or children aboard the Titanic.
- **Ticket**: The ticket number of the passenger.
- **Fare**: The fare paid for the ticket.
- **Cabin**: The cabin number where the passenger stayed (some entries may be missing).
- **Embarked**: The port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
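
Because the list above notes that some `Age` and `Cabin` entries may be missing, counting missing values per column is a reasonable first check. The sketch below is illustrative only. Instead of reading `titanic.csv`, it builds a ten-row `sample` frame from rows visible in the notebook's printed output further down, and it assumes the lowercase column names shown in that output (`age` rather than `Age`). The names `sample` and `summary` are hypothetical.

```python
import numpy as np
import pandas as pd

# Ten rows copied from the notebook's printed output (not the full dataset).
sample = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    "pclass": [3, 1, 3, 1, 3, 2, 1, 3, 1, 3],
    "sex": ["male", "female", "female", "female", "male",
            "male", "female", "female", "male", "male"],
    "age": [22.0, 38.0, 26.0, 35.0, 35.0, 27.0, 19.0, np.nan, 26.0, 32.0],
})

# isna() marks missing cells; sum() counts them and mean() gives their share.
missing_counts = sample.isna().sum()
missing_share = sample.isna().mean()

summary = pd.DataFrame({"missing": missing_counts, "share": missing_share})
print(summary)
```

On the full data, the same two calls can be applied to `df` directly. Ten rows say little about how much of the real `age` column is missing.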



In [42]:
import requests
import pathlib as path
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np



In [43]:
df = pd.read_csv("titanic.csv")
# head is a method: call it with parentheses to print the first five rows
print(df.head())

   survived  pclass     sex   age  sibsp  parch     fare embarked  class  \
0         0       3    male  22.0      1      0   7.2500        S  Third   
1         1       1  female  38.0      1      0  71.2833        C  First   
2         1       3  female  26.0      0      0   7.9250        S  Third   
3         1       1  female  35.0      1      0  53.1000        S  First   
4         0       3    male  35.0      0      0   8.0500        S  Third   

       who  adult_mal


#### Things to get back to:

**Purpose**

The Titanic dataset serves as an excellent introduction to data science and machine learning concepts, allowing users to explore various data analysis techniques, visualization methods, and predictive modeling algorithms. By analyzing the dataset, one can uncover insights regarding the factors that influenced survival rates among different passenger groups.

**Questions**

- What factors seemed to affect whether someone survived?
- How did things like ticket class and fare relate to survival?
- Were there differences in survival rates based on gender and age?
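
One way to start on these questions is to group passengers and take the mean of the 0/1 `survived` column, which gives the share of survivors in each group. The sketch below is a minimal illustration, not the analysis itself. It uses a ten-row `sample` frame copied from rows visible in the notebook's printed output and assumes the lowercase column names shown there. The age cut point of 25 and the band labels are arbitrary assumptions chosen only to show the mechanics.

```python
import numpy as np
import pandas as pd

# Ten rows copied from the notebook's printed output (not the full dataset).
sample = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
    "pclass": [3, 1, 3, 1, 3, 2, 1, 3, 1, 3],
    "sex": ["male", "female", "female", "female", "male",
            "male", "female", "female", "male", "male"],
    "age": [22.0, 38.0, 26.0, 35.0, 35.0, 27.0, 19.0, np.nan, 26.0, 32.0],
})

# The mean of a 0/1 column is the fraction of survivors within each group.
by_sex = sample.groupby("sex")["survived"].mean()
by_class = sample.groupby("pclass")["survived"].mean()

# Hypothetical age bands; rows with a missing age fall into no band
# and are left out of the grouped result.
banded = sample.assign(
    age_band=pd.cut(sample["age"], bins=[0, 25, 80],
                    labels=["25 and under", "over 25"])
)
by_age = banded.groupby("age_band", observed=True)["survived"].mean()

print(by_sex, by_class, by_age, sep="\n\n")
```

Replacing `sample` with the full `df` gives rates that are actually meaningful. With only ten rows, the numbers here show the mechanics and nothing more.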

**Description**

The dataset records passenger attributes such as age, gender, and class, along with ticket information and survival status.