# Titanic dataset

This notebook will guide you through a series of tasks to help you become familiar with pandas, numpy, and matplotlib using the Titanic dataset.

Fields in the Titanic dataset:
- PassengerId: A unique index for passenger rows. It starts at 1 for the first row and increments by 1 for each new row.
- Survived: Whether the passenger survived. 1 stands for survived and 0 stands for did not survive.
- Pclass: Ticket class. 1 stands for First class ticket. 2 stands for Second class ticket. 3 stands for Third class ticket.
- Name: Passenger's name.
- Sex: Passenger's gender.
- Age: Passenger's age.
- SibSp: Number of siblings or spouses travelling with each passenger.
- Parch: Number of parents or children travelling with each passenger.
- Ticket: Ticket number.
- Fare: How much the passenger paid for the journey.
- Cabin: Cabin number of the passenger.
- Embarked: Port where the passenger boarded (C = Cherbourg, Q = Queenstown, S = Southampton).

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```

## Task 1: Load the data
Read the CSV file into a pandas DataFrame.
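
One possible approach uses `pd.read_csv`. The notebook does not name the file, so this sketch reads a three-row, made-up CSV from memory; the path `titanic.csv` in the comment is a hypothetical placeholder.

```python
import io
import pandas as pd

# Three made-up rows standing in for the real file (illustration only).
csv_text = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,S
"""

# With the real dataset this would be pd.read_csv("titanic.csv"),
# where the file name is an assumption.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
```

Empty fields in the CSV are read as missing values (`NaN`), which matters for the later tasks.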

## Task 2: Display basic information
Use `.info()` to display concise summary information about the DataFrame.
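
A minimal sketch, with a small made-up frame standing in for the Titanic DataFrame so that it runs by itself:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Age": [22.0, None, 26.0],
    "Fare": [7.25, 71.28, 7.92],
})

# Prints the index range, each column's non-null count and dtype,
# and the approximate memory usage.
df.info()
```

The non-null counts give a first hint about which columns have missing values.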

## Task 3: Descriptive statistics
Generate descriptive statistics for the dataset using `.describe()`.
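
As a sketch on made-up data, `.describe()` summarizes the numeric columns by default and skips text columns such as `Sex`:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Age": [20.0, 40.0, None],
    "Fare": [10.0, 20.0, 30.0],
})

# Rows of the result: count, mean, std, min, 25%, 50%, 75%, max.
stats = df.describe()
print(stats)
```

Passing `include="all"` also summarizes the non-numeric columns.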

## Task 4: Missing values
Identify columns with missing values and count the number of missing values in each column.
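
One way to do this is to chain `.isna()` and `.sum()`, then keep only the non-zero counts. The frame below is made up for illustration:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Age": [22.0, None, 30.0, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
    "Cabin": [None, "C85", None, None],
    "Embarked": ["S", "C", None, "S"],
})

# isna() marks missing cells as True; sum() counts them per column.
missing = df.isna().sum()

# Keep only the columns that actually have missing values.
missing_only = missing[missing > 0]
print(missing_only)
```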

## Task 5: Fill missing values
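Fill missing values in each numeric column with the mean of that column. Non-numeric columns such as `Cabin` and `Embarked` have no mean and need a different strategy, for example the most frequent value.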
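
A mean is only defined for numeric columns, so this sketch fills those with the mean and, as an assumption that goes beyond the task text, fills `Embarked` with its most frequent value. The data is made up.

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Age": [20.0, None, 40.0],
    "Fare": [10.0, 20.0, None],
    "Embarked": ["S", None, "S"],
})

# Numeric columns: replace NaN with the mean of the same column.
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# Text column: the mode is one possible choice (an assumption, not the task's rule).
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])
print(df)
```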

## Task 6: Drop columns
Remove the `Name` column from the DataFrame because it is not relevant for our analysis.
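
A short sketch with made-up rows, using `DataFrame.drop` with the `columns` argument:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Name": ["Passenger A", "Passenger B"],
    "Sex": ["male", "female"],
    "Fare": [7.25, 71.28],
})

# drop() returns a new DataFrame, so the result is assigned back to df.
df = df.drop(columns=["Name"])
print(df.columns.tolist())
```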

## Task 7: Filter rows
Filter and display rows where the passengers are female and survived.
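
Boolean masks combined with `&` are one way to express this filter. Each condition needs its own parentheses. The sample is made up.

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "female"],
    "Survived": [0, 1, 0, 1],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

# Both conditions must hold for a row to be kept.
female_survivors = df[(df["Sex"] == "female") & (df["Survived"] == 1)]
print(female_survivors)
```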

## Task 8: New column
Create a new column `FamilySize` that is the sum of the `SibSp` and `Parch` columns.
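
Column arithmetic in pandas is element-wise, so the sum can be written directly. A sketch on made-up values:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "SibSp": [1, 0, 3],
    "Parch": [0, 0, 2],
})

# Row-by-row sum of the two columns, as the task describes.
df["FamilySize"] = df["SibSp"] + df["Parch"]
print(df)
```

Defined this way, `FamilySize` counts relatives on board and does not include the passenger.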

## Task 9: Sorting
Sort the dataset by `Fare` in descending order.
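
A minimal sketch with `sort_values`, again on made-up rows:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Fare": [7.25, 71.28, 8.05],
})

# ascending=False puts the most expensive tickets first.
sorted_df = df.sort_values("Fare", ascending=False)
print(sorted_df)
```

The original row labels are kept; `reset_index(drop=True)` renumbers them if needed.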

## Task 10: Histogram
Plot a histogram of the `Age` column using matplotlib.
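
The sketch below drops missing ages before plotting. The ages and the choice of 8 bins are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({"Age": [2, 22, 24, None, 35, 38, 54, 70]})

# Missing ages are removed so that only real values are counted.
ages = df["Age"].dropna()

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(ages, bins=8, edgecolor="black")
ax.set_xlabel("Age")
ax.set_ylabel("Number of passengers")
ax.set_title("Age distribution")
# In a notebook the figure renders at the end of the cell;
# in a script, call plt.show() here.
```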

## Task 11: Bar chart
Plot a bar chart showing the number of survivors and non-survivors.
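
One way to get the counts is `value_counts()`. The bar labels below are illustrative and the data is made up.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({"Survived": [0, 1, 0, 0, 1]})

# reindex fixes the order (0 first, then 1) and copes with an absent class.
counts = df["Survived"].value_counts().reindex([0, 1], fill_value=0)

fig, ax = plt.subplots()
ax.bar(["Did not survive", "Survived"], counts.to_numpy())
ax.set_ylabel("Number of passengers")
ax.set_title("Survivors vs. non-survivors")
# In a script, call plt.show() here.
```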

## Task 12: Scatter plot
Create a scatter plot of `Age` vs `Fare` colored by `Survived`.
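
In this sketch the `c` argument of `ax.scatter` takes the `Survived` values, so each point is colored by outcome. The colormap and the data are assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86],
    "Survived": [0, 1, 1, 1, 0],
})

# Rows without an age cannot be placed on the x-axis, so they are dropped.
plot_df = df.dropna(subset=["Age", "Fare"])

fig, ax = plt.subplots()
points = ax.scatter(plot_df["Age"], plot_df["Fare"],
                    c=plot_df["Survived"], cmap="coolwarm")
ax.set_xlabel("Age")
ax.set_ylabel("Fare")
# legend_elements() builds one legend entry per distinct color value.
ax.legend(*points.legend_elements(), title="Survived")
# In a script, call plt.show() here.
```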

## Task 13: Mapping
Map the text (categorical) values in columns such as `Sex` and `Embarked` to numeric values.
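
`Series.map` with a dictionary is one way to do this. The specific codes below (for example `male` to 0) are arbitrary assumptions, and the data is made up.

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "Q"],
})

# The numeric codes are an arbitrary choice for illustration.
sex_codes = {"male": 0, "female": 1}
port_codes = {"S": 0, "C": 1, "Q": 2}

# Values that are not keys of the dictionary would become NaN.
df["Sex"] = df["Sex"].map(sex_codes)
df["Embarked"] = df["Embarked"].map(port_codes)
print(df)
```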

## Task 14: Correlation
Calculate and display the correlation matrix for the numerical features. (You can use `sns.heatmap()` from seaborn to plot the matrix; seaborn is not among the imports above, so add `import seaborn as sns` first.)
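
The sketch below computes the matrix with `DataFrame.corr()` on the numeric columns only. For the plot it swaps in matplotlib's `imshow` so that no extra import is needed; with seaborn installed, `sns.heatmap(corr, annot=True)` gives a similar picture. The data is made up.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Pclass": [3, 1, 3, 3, 2],
    "Sex": ["male", "female", "female", "male", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0, 27.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 11.13],
})

# Text columns are excluded; corr() uses Pearson correlation by default.
corr = df.select_dtypes(include="number").corr()
print(corr)

fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(image, ax=ax)
# In a script, call plt.show() here.
```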

## Task 15: Normalize data
Normalize all features in the dataset. You can use either Min-Max scaling or mean-std normalization.
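
Both options can be written with plain column arithmetic. Min-Max scaling maps each column to the range 0 to 1, and mean-std normalization gives each column mean 0 and standard deviation 1. A sketch on made-up numeric features:

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
features = pd.DataFrame({
    "Age": [20.0, 30.0, 40.0],
    "Fare": [10.0, 20.0, 50.0],
})

# Min-Max scaling: (x - min) / (max - min), computed per column.
minmax = (features - features.min()) / (features.max() - features.min())

# Mean-std normalization: (x - mean) / std, computed per column.
# pandas uses the sample standard deviation (ddof=1) by default.
zscore = (features - features.mean()) / features.std()

print(minmax)
print(zscore)
```

A column with a single constant value would cause a division by zero in either formula and needs separate handling.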

## Task 16: Survival rate by age group
Create age groups (bins) and calculate the survival rate for each group.
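
`pd.cut` assigns each age to a bin, and the mean of `Survived` within a bin is its survival rate. The bin edges and labels below are assumptions chosen for illustration, as is the data.

```python
import pandas as pd

# Made-up sample; the real DataFrame comes from Task 1.
df = pd.DataFrame({
    "Age": [5.0, 10.0, 30.0, 40.0, 70.0, None],
    "Survived": [1, 0, 1, 0, 0, 1],
})

# Assumed bins: (0, 12], (12, 18], (18, 60], (60, 100].
bins = [0, 12, 18, 60, 100]
labels = ["Child", "Teen", "Adult", "Senior"]
df["AgeGroup"] = pd.cut(df["Age"], bins=bins, labels=labels)

# Survived is 0 or 1, so its mean per group is the survival rate.
# observed=True lists only groups that contain passengers;
# rows with a missing age fall into no group and are skipped.
rates = df.groupby("AgeGroup", observed=True)["Survived"].mean()
print(rates)
```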