# Titanic Dataset Practice Exercises
This notebook includes practice questions covering **NumPy**, **Pandas**, **Matplotlib**, **Seaborn**, **Encoding**, and **Missing Value Handling** using the Titanic dataset.
---


```python
import pandas as pd

# Load the Titanic dataset
df = pd.read_csv('Titanic-Dataset.csv')

# Display the first 5 rows
df.head()
```


| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |


## **A. NumPy (5 Questions)**
1. Create a NumPy array of the `Age` column and calculate its **mean**, **median**, and **standard deviation**.
2. Generate a NumPy array of **random ages** between 1 and 80 for 10 passengers.
3. Find the **index positions** where passenger ages are greater than 60.
4. Replace all missing values in the `Age` column with the **mean age** using NumPy.
5. Create a 2D NumPy array with `Fare` and `Age` and find the **correlation** between them.
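
One possible approach to the NumPy questions is sketched below. The `age` and `fare` arrays are hypothetical stand-in values (including one deliberately missing age) so the block runs on its own; in the notebook they would presumably come from `df['Age'].to_numpy()` and `df['Fare'].to_numpy()`.

```python
import numpy as np

# Hypothetical stand-in values; in the notebook these would come from
# df['Age'].to_numpy() and df['Fare'].to_numpy().
age = np.array([22.0, 38.0, 26.0, 35.0, np.nan, 62.0, 71.0])
fare = np.array([7.25, 71.2833, 7.925, 53.1, 8.05, 26.55, 34.6542])

# Q1: NaN-aware statistics (plain np.mean would return nan here).
mean_age = np.nanmean(age)
median_age = np.nanmedian(age)
std_age = np.nanstd(age)  # population std (ddof=0), NumPy's default

# Q2: 10 random integer ages from 1 to 80 inclusive (upper bound is exclusive).
rng = np.random.default_rng(seed=0)
random_ages = rng.integers(1, 81, size=10)

# Q3: index positions where age > 60 (NaN compares as False).
idx_over_60 = np.where(age > 60)[0]

# Q4: replace missing ages with the mean age.
age_filled = np.where(np.isnan(age), mean_age, age)

# Q5: stack Fare and Age as the rows of a 2D array and correlate them.
fare_age = np.vstack([fare, age_filled])
corr = np.corrcoef(fare_age)[0, 1]
```

Note that `np.nanstd` defaults to `ddof=0`, whereas the Pandas `.std()` method defaults to `ddof=1`, so the two can give slightly different standard deviations for the same column.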


## **B. Pandas Basics (5 Questions)**
6. Display the **first 10 rows** and **last 10 rows** of the dataset.
7. Check the **shape** and **summary statistics** (`.shape`, `.describe()`).
8. Count how many passengers **survived** and **did not survive**.
9. Find the **average age** of passengers for each passenger class (`Pclass`).
10. Identify the **top 5 highest fares** paid and the corresponding passenger names.
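
The questions above could be approached roughly as follows. To keep the block self-contained, it rebuilds a small stand-in frame from the five rows shown by `df.head()` (a subset of columns only); with the full dataset, the same calls would apply to `df` directly.

```python
import pandas as pd

# Stand-in frame built from the five rows shown by df.head() earlier
# (subset of columns); in the notebook, use the full df instead.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley (Florence Briggs Th...",
        "Heikkinen, Miss. Laina",
        "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
        "Allen, Mr. William Henry",
    ],
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
})

# Q6: first and last 10 rows (returns fewer if the frame is shorter).
first_rows = df.head(10)
last_rows = df.tail(10)

# Q7: shape and summary statistics of the numeric columns.
n_rows, n_cols = df.shape
summary = df.describe()

# Q8: counts of survivors (1) and non-survivors (0).
survival_counts = df["Survived"].value_counts()

# Q9: average age per passenger class.
mean_age_by_class = df.groupby("Pclass")["Age"].mean()

# Q10: the 5 highest fares with the corresponding names.
top_fares = df.nlargest(5, "Fare")[["Name", "Fare"]]
```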


## **C. Data Cleaning & Missing Values (5 Questions)**
11. Check for **missing values** in each column.
12. Fill missing values in the `Embarked` column with the **most frequent value**.
13. Drop the `Cabin` column entirely as it has many missing values.
14. Replace missing `Age` values with the **median age**.
15. Identify rows where `Fare` is **zero or negative** and replace them with the **mean fare**.
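
A minimal sketch of these cleaning steps follows. The frame is hypothetical: the gaps and the zero fare are inserted on purpose so that every step has something to do. The exercise does not say whether the mean fare should include the invalid rows; this sketch assumes it is taken over the valid (positive) fares only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows with gaps and a zero fare added on purpose.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0, np.nan],
    "Fare": [7.25, 71.2833, 0.0, 53.1, 8.05],
    "Cabin": [np.nan, "C85", np.nan, "C123", np.nan],
    "Embarked": ["S", "C", None, "S", "S"],
})

# Q11: number of missing values per column.
missing_per_column = df.isnull().sum()

# Q12: fill Embarked with its most frequent value (the mode).
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Q13: drop the Cabin column.
df = df.drop(columns="Cabin")

# Q14: fill missing ages with the median age.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Q15: replace zero or negative fares with the mean of the valid fares
# (assumption: invalid rows are excluded from the mean).
bad_fare = df["Fare"] <= 0
df.loc[bad_fare, "Fare"] = df.loc[~bad_fare, "Fare"].mean()
```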


## **D. Encoding & Transformation (4 Questions)**
16. Encode the `Sex` column into **binary form** (0 = male, 1 = female) using Pandas.
17. Perform **one-hot encoding** on the `Embarked` column.
18. Normalize the `Fare` column using **Min-Max scaling**.
19. Create a new column `FamilySize` = `SibSp` + `Parch` + 1.
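
The transformations above might look like the sketch below, again on a stand-in frame built from the five rows shown earlier. The column name `Fare_scaled` is an illustrative choice so the original `Fare` values are kept; overwriting `Fare` in place would work equally well.

```python
import pandas as pd

# Stand-in frame from the five rows shown by df.head() (subset of columns).
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "SibSp": [1, 1, 0, 1, 0],
    "Parch": [0, 0, 0, 0, 0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "Embarked": ["S", "C", "S", "S", "S"],
})

# Q16: binary encoding of Sex (0 = male, 1 = female).
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

# Q17: one-hot encode Embarked into Embarked_<value> indicator columns.
df = pd.get_dummies(df, columns=["Embarked"], prefix="Embarked")

# Q18: Min-Max scaling, (x - min) / (max - min), maps Fare onto [0, 1].
fare_min, fare_max = df["Fare"].min(), df["Fare"].max()
df["Fare_scaled"] = (df["Fare"] - fare_min) / (fare_max - fare_min)

# Q19: family size counts siblings/spouses, parents/children, and the passenger.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1
```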


## **E. Visualization with Matplotlib (4 Questions)**
20. Plot a **histogram** of the `Age` column.
21. Create a **bar chart** showing the number of passengers in each `Pclass`.
22. Plot a **line chart** showing the trend of `Fare` by `PassengerId`.
23. Create a **scatter plot** of `Fare` vs `Age` to analyze their relationship.
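
As a rough sketch, the four Matplotlib plots could be drawn as below. Placing them in one 2×2 grid is a layout choice made here for compactness; the exercises could just as well be answered with four separate figures. The frame is a stand-in built from the five rows shown earlier, and the `Agg` backend line is only needed when running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in frame from the five rows shown by df.head() (subset of columns).
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Pclass": [3, 1, 3, 1, 3],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Q20: histogram of Age (missing values dropped first).
axes[0, 0].hist(df["Age"].dropna(), bins=10)
axes[0, 0].set(title="Age distribution", xlabel="Age", ylabel="Count")

# Q21: bar chart of passenger counts per class.
class_counts = df["Pclass"].value_counts().sort_index()
axes[0, 1].bar(class_counts.index.astype(str), class_counts.values)
axes[0, 1].set(title="Passengers per class", xlabel="Pclass", ylabel="Count")

# Q22: line chart of Fare against PassengerId.
axes[1, 0].plot(df["PassengerId"], df["Fare"])
axes[1, 0].set(title="Fare by PassengerId", xlabel="PassengerId", ylabel="Fare")

# Q23: scatter plot of Fare (y) against Age (x).
axes[1, 1].scatter(df["Age"], df["Fare"])
axes[1, 1].set(title="Fare vs Age", xlabel="Age", ylabel="Fare")

fig.tight_layout()
```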


## **F. Visualization with Seaborn (4 Questions)**
24. Create a **count plot** of passengers by `Sex`.
25. Make a **box plot** to show the distribution of `Age` for each `Pclass`.
26. Draw a **heatmap** to visualize correlations among numeric columns.
27. Plot a **violin plot** of `Fare` vs `Survived`.
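
The Seaborn plots could be sketched along these lines, assuming `seaborn` is installed. As before, the frame is a stand-in built from the five rows shown earlier, so the resulting plots are only illustrative of the calls, not of the real distributions. The 2×2 grid is a layout choice, not a requirement of the exercises.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in frame from the five rows shown by df.head() (subset of columns).
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Q24: count plot of passengers by Sex.
sns.countplot(data=df, x="Sex", ax=axes[0, 0])

# Q25: box plot of Age for each Pclass.
sns.boxplot(data=df, x="Pclass", y="Age", ax=axes[0, 1])

# Q26: heatmap of correlations among the numeric columns only.
corr = df.select_dtypes(include="number").corr()
sns.heatmap(corr, annot=True, cmap="coolwarm", ax=axes[1, 0])

# Q27: violin plot of Fare split by Survived.
sns.violinplot(data=df, x="Survived", y="Fare", ax=axes[1, 1])

fig.tight_layout()
```

Selecting numeric columns before calling `.corr()` avoids errors from text columns such as `Name` or `Sex` when the full dataset is used.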


## **G. Advanced Analysis (3 Questions)**
28. Find the **survival rate by gender**.
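
Because `Survived` is coded as 0 or 1, its mean within each group is the fraction of survivors in that group. A minimal sketch on a stand-in frame built from the five rows shown earlier:

```python
import pandas as pd

# Stand-in frame from the five rows shown by df.head() (subset of columns).
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "female", "male"],
    "Survived": [0, 1, 1, 1, 0],
})

# Q28: the mean of a 0/1 column per group is the survival rate per gender.
survival_rate = df.groupby("Sex")["Survived"].mean()
```

On these five rows the rates are extreme (all sampled women survived, no sampled men did); the full dataset would give less clear-cut values.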

