# *Titanic* - Machine Learning from Disaster


> This notebook is my prediction submission to the **Kaggle** *Titanic - Machine Learning from Disaster* **competition**:

The **sinking** of the ***Titanic*** is one of the most infamous shipwrecks in history.

On **April 15, 1912**, during her maiden voyage, the widely considered “unsinkable” *RMS Titanic* sank after colliding with an **iceberg**. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the **death** of **1502 out of 2224** passengers and crew.

While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.

In this challenge, we ask you to build a predictive model that answers the question: “**what sorts of people were more likely to survive?**” using passenger data (i.e. name, age, gender, socio-economic class, etc.).

---

## Let's get started!
First of all, we need to **download the datasets** provided by *Kaggle* for the competition. [Datasets can be found here.](https://www.kaggle.com/competitions/titanic/data) (**You will need an account and must join** the competition in order to view the Data tab.)

Once you have the datasets, import ***TensorFlow*** and ***Pandas***. Then we can load the data and start making our model.

In [1]:
import tensorflow as tf
import pandas as pd
print('Tensorflow version: ' + tf.__version__ + '. Pandas version: ' + pd.__version__)

Tensorflow version: 2.8.0. Pandas version: 1.3.5


I'll be using ***Pandas*** to read the **.csv** files:

In [2]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

test_data = pd.read_csv('/content/drive/MyDrive/DATASETS_4_ML/Titanic/test.csv')
train_data = pd.read_csv('/content/drive/MyDrive/DATASETS_4_ML/Titanic/train.csv')
gender_submission_data = pd.read_csv('/content/drive/MyDrive/DATASETS_4_ML/Titanic/gender_submission.csv')

Mounted at /content/drive


Show the **first 5 rows** of the training set:

In [3]:
train_data.head()

|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S |
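
The preview above leaves `Cabin` empty for three of the five passengers, so it can be worth counting missing values before choosing features. Here is a minimal sketch of that check. So that it runs without the Kaggle files, it re-types the five preview rows as CSV text; the names `csv_text`, `preview`, and `missing_per_column` are just illustrative. On the real data you would call `train_data.isna().sum()` instead.

```python
import io

import pandas as pd

# The five preview rows shown above, re-typed as CSV text so the sketch
# runs without downloading the Kaggle files.
csv_text = '''PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
'''
preview = pd.read_csv(io.StringIO(csv_text))

# Count missing (NaN) entries in every column.
missing_per_column = preview.isna().sum()

# Show only the columns that actually have gaps.
print(missing_per_column[missing_per_column > 0])
```

The features used for the model further down (`Pclass`, `Sex`, `SibSp`, `Parch`) have no gaps in these five rows, but this small sample says nothing about the rest of the file.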


### Let's make our model!

> We'll build what's known as a random forest model. This model is constructed of several "trees" (there are three trees in the picture below, but we'll construct 100!) that will individually consider each passenger's data and vote on whether the individual survived. Then, the random forest model makes a democratic decision: the outcome with the most votes wins! (Excerpt from ***Alexis Cook***'s Kaggle tutorial for the *Titanic* competition)

I'll just reuse Alexis's code, since time flies (thanks 😁):

In [4]:
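from sklearn.ensemble import RandomForestClassifier

# Target column: 1 = survived, 0 = did not survive
y = train_data["Survived"]

# One-hot encode the categorical "Sex" column; the numeric columns pass through unchanged
features = ["Pclass", "Sex", "SibSp", "Parch"]
X = pd.get_dummies(train_data[features])
X_test = pd.get_dummies(test_data[features])

# 100 trees, each limited to depth 5; random_state makes the run reproducible
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
model.fit(X, y)
predictions = model.predict(X_test)

# Write the predictions in the two-column format Kaggle expects
output = pd.DataFrame({'PassengerId': test_data.PassengerId, 'Survived': predictions})
output.to_csv('submission.csv', index=False)
print("Your submission was successfully saved!")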

Your submission was successfully saved!
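
To make the "voting" idea from the quote more concrete, here is a small sketch that asks each tree of a tiny forest for its own prediction and then counts the votes. The passengers in `toy` are hypothetical rows I made up for illustration (not real *Titanic* data), and the names `toy`, `forest`, `tree_votes`, and `majority` are mine. One caveat: scikit-learn's `RandomForestClassifier` averages the trees' predicted probabilities rather than counting hard votes, so the hand-counted majority can occasionally differ from `forest.predict`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy passengers (NOT real Titanic rows), using the same
# four features as the notebook.
toy = pd.DataFrame({
    "Pclass":   [1, 3, 2, 3, 1, 2, 3, 1],
    "Sex":      ["female", "male", "female", "male",
                 "male", "female", "male", "female"],
    "SibSp":    [0, 1, 0, 0, 1, 1, 0, 0],
    "Parch":    [0, 0, 1, 0, 0, 2, 0, 1],
    "Survived": [1, 0, 1, 0, 0, 1, 0, 1],
})

# One-hot encode "Sex" into Sex_female / Sex_male, as in the cell above.
X_toy = pd.get_dummies(toy[["Pclass", "Sex", "SibSp", "Parch"]])
y_toy = toy["Survived"]

# A small forest (5 trees, an odd number so a vote can never tie).
forest = RandomForestClassifier(n_estimators=5, max_depth=3, random_state=1)
forest.fit(X_toy, y_toy)

# Each fitted tree predicts an index into forest.classes_; map it back
# to the original labels. Result shape: (n_trees, n_passengers).
X_array = X_toy.to_numpy(dtype=float)
tree_votes = np.array([
    forest.classes_[tree.predict(X_array).astype(int)]
    for tree in forest.estimators_
])

# Count the "survived" votes per passenger and take the majority.
survive_votes = (tree_votes == 1).sum(axis=0)
majority = (survive_votes > forest.n_estimators / 2).astype(int)

print("Votes per tree:\n", tree_votes)
print("Majority vote: ", majority)
print("forest.predict:", forest.predict(X_toy))
```

Each row of `tree_votes` is one tree's opinion about all eight toy passengers; `majority` is the "democratic decision" described in the quote.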


### If you are a complete beginner, I really recommend you [go and read](https://www.kaggle.com/code/alexisbcook/titanic-tutorial/notebook) *Alexis's guide* and look for other resources.

---


My Notebook collection (GIT): [@jsanchezpc](https://github.com/jsanchezpc/DeepLearning-Notebook)
