# Assignment 1:
The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the RMS Titanic, widely considered “unsinkable”, sank after colliding with an iceberg. There were not enough lifeboats for everyone on board, and 1502 of the 2224 passengers and crew died. While some element of luck was involved in surviving, some groups of people appear to have been more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.).

Dataset Link: https://www.kaggle.com/c/titanic/code

### Import Libraries:

In [32]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Load the Dataset:

In [33]:
train_data = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/titanic/train.csv")
train_data.head()

| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |


### Explore the Dataset:

In [34]:
train_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB


### One-Hot Encode Embarked Column:

In [35]:
ports = pd.get_dummies(train_data.Embarked, prefix = 'Embarked')
ports.head()

| | Embarked_C | Embarked_Q | Embarked_S |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 1 |
| 3 | 0 | 0 | 1 |
| 4 | 0 | 0 | 1 |
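`info()` above reports 889 non-null values for `Embarked`, so two rows have no port recorded. The following is a minimal sketch, on a made-up five-row sample rather than the real file, of how `pd.get_dummies` treats such rows by default: they get 0 in every indicator column. The names `embarked`, `ports_demo` and `missing_rows` are illustrative only.

```python
import pandas as pd

# Hypothetical mini-sample; the real column comes from train.csv.
embarked = pd.Series(["S", "C", "Q", None, "S"], name="Embarked")

# dtype=int keeps the 0/1 display regardless of pandas version
# (newer releases default to boolean indicator columns).
ports_demo = pd.get_dummies(embarked, prefix="Embarked", dtype=int)

# Rows whose port is missing end up with 0 in every indicator column.
missing_rows = ports_demo.index[ports_demo.sum(axis=1) == 0].tolist()

print(ports_demo)
print("rows with no port recorded:", missing_rows)
```

If those rows should be distinguishable, passing `dummy_na=True` adds a separate indicator column for missing values.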


In [36]:
train_data = train_data.join(ports)
train_data.drop(['Embarked'],axis=1,inplace=True)

In [37]:
train_data.head()

| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked_C | Embarked_Q | Embarked_S |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | 0 | 0 | 1 |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | 1 | 0 | 0 |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | 0 | 0 | 1 |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | 0 | 0 | 1 |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | 0 | 0 | 1 |


### Map Gender to Numeric Values:

In [38]:
train_data.Sex = train_data.Sex.map({'male':0,'female':1})
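One thing to keep in mind with `Series.map` and a dictionary: any value that is not a key in the dictionary becomes `NaN`. The short sketch below uses made-up values (the misspelled `"Female"` is hypothetical, not something seen in the dataset) to show how that can be checked after mapping.

```python
import pandas as pd

# Hypothetical values; "Female" is deliberately not a key in the mapping.
sex_demo = pd.Series(["male", "female", "Female", "male"])
encoded = sex_demo.map({"male": 0, "female": 1})

# Unmapped entries turn into NaN, so count them before moving on.
unmapped = int(encoded.isna().sum())
print(encoded.tolist(), "unmapped:", unmapped)
```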

In [39]:
y = train_data.Survived.copy()
x = train_data.drop(['Survived'], axis=1)

### Drop Unnecessary Columns:

In [40]:
x.drop(['Cabin','Ticket','Name','PassengerId'], axis=1,inplace=True)

In [41]:
x.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 9 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Pclass      891 non-null    int64  
 1   Sex         891 non-null    int64  
 2   Age         714 non-null    float64
 3   SibSp       891 non-null    int64  
 4   Parch       891 non-null    int64  
 5   Fare        891 non-null    float64
 6   Embarked_C  891 non-null    uint8  
 7   Embarked_Q  891 non-null    uint8  
 8   Embarked_S  891 non-null    uint8  
dtypes: float64(2), int64(4), uint8(3)
memory usage: 44.5 KB


In [42]:
x.isnull().values.any()

True

In [43]:
x[pd.isnull(x).any(axis=1)]

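| | Pclass | Sex | Age | SibSp | Parch | Fare | Embarked_C | Embarked_Q | Embarked_S |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 3 | 0 | NaN | 0 | 0 | 8.4583 | 0 | 1 | 0 |
| 17 | 2 | 0 | NaN | 0 | 0 | 13.0000 | 0 | 0 | 1 |
| 19 | 3 | 1 | NaN | 0 | 0 | 7.2250 | 1 | 0 | 0 |
| 26 | 3 | 0 | NaN | 0 | 0 | 7.2250 | 1 | 0 | 0 |
| 28 | 3 | 1 | NaN | 0 | 0 | 7.8792 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 859 | 3 | 0 | NaN | 0 | 0 | 7.2292 | 1 | 0 | 0 |
| 863 | 3 | 1 | NaN | 8 | 2 | 69.5500 | 0 | 0 | 1 |
| 868 | 3 | 0 | NaN | 0 | 0 | 9.5000 | 0 | 0 | 1 |
| 878 | 3 | 0 | NaN | 0 | 0 | 7.8958 | 0 | 0 | 1 |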


### Handle Missing Values in Age:

In [44]:
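# Assign the result back instead of using inplace=True on a column,
# which newer pandas versions warn about and may silently ignore.
x["Age"] = x["Age"].fillna(x["Age"].mean())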

In [45]:
x.isnull().values.any()

False
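The cell above fills `Age` with the mean over all 891 rows before the data is split. A common variant, sketched here on made-up numbers and not what this notebook does, is to compute the fill value from the training rows only and reuse it for the test rows, so that no test information enters preprocessing. The names `train_age`, `test_age` and `train_mean` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical ages; NaN marks a missing value.
train_age = pd.Series([20.0, 30.0, np.nan, 40.0])
test_age = pd.Series([np.nan, 50.0])

# Compute the fill value from the training rows only.
# Series.mean() skips NaN, so this is (20 + 30 + 40) / 3.
train_mean = train_age.mean()

# Reuse the same value for both splits.
train_filled = train_age.fillna(train_mean)
test_filled = test_age.fillna(train_mean)

print(train_mean, test_filled.tolist())
```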

### Train-Test Split:

In [46]:
from sklearn.model_selection import train_test_split

In [47]:
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)

In [48]:
x.shape

(891, 9)
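The split above is random on every run, so the final accuracy will vary from one execution to the next. As a sketch on synthetic data (the arrays `x_demo` and `y_demo` and the seed 42 are assumptions, not values from this notebook), `random_state` fixes the shuffle and `stratify` keeps the class ratio the same in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for x and y: 100 rows, 9 features, 40% positives.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(100, 9))
y_demo = np.array([1] * 40 + [0] * 60)

# random_state makes the split repeatable; stratify preserves the 40/60 ratio.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(x_tr.shape, x_te.shape, y_te.mean())
```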

## Neural Network Model:
### Import TensorFlow and Keras:

In [49]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()  # empty model; layers are added one at a time below

Input Layer (first dense layer, fed the 9 feature columns)

In [50]:
model.add(Dense(120, activation="relu", input_shape=(9,)))

Hidden Layers

In [51]:
# Eleven identical hidden layers of 120 ReLU units each
for _ in range(11):
    model.add(Dense(120, activation="relu"))

Output Layer

In [52]:
model.add(Dense(1, activation = "sigmoid"))
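To get a feel for the size of this network, the arithmetic below counts its trainable parameters by hand. It assumes the architecture exactly as built above (9 inputs, twelve `Dense(120)` layers in total, one sigmoid output unit); the helper `dense_params` is illustrative and not part of Keras.

```python
# Layer widths: 9 inputs, twelve Dense(120) layers, one output unit.
layer_sizes = [9] + [120] * 12 + [1]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# One entry per Dense layer, pairing each width with the next one.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer[0], per_layer[1], per_layer[-1], total_params)
```

Roughly 161 thousand parameters are being fitted on about 700 training rows, which is worth keeping in mind when reading the accuracy figure. Calling `model.summary()` on the model should report the same per-layer counts.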

### Compile the Model:

In [53]:
model.compile(optimizer="adam", loss = "binary_crossentropy", metrics = ["accuracy"])
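The `binary_crossentropy` loss penalises confident wrong probabilities much more heavily than confident right ones. The NumPy sketch below implements the textbook formula, the mean of -[y log(p) + (1 - y) log(1 - p)], to make that concrete. It is not Keras's own code, and the clipping constant `eps` is an assumption chosen to avoid log(0).

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Same two labels, once with well-placed and once with badly placed probabilities.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right, confident_wrong)
```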

### Train the Model:

In [54]:
model.fit(xtrain,ytrain, batch_size=50, epochs = 60)

Epoch 1/60
...
Epoch 60/60


<keras.src.callbacks.History at 0x79d65cf32aa0>

In [55]:
from sklearn.metrics import accuracy_score
ypred = model.predict(xtest)



In [56]:
ypred

array([[0.17004819],
       [0.9146969 ],
       [0.12081064],
       [0.12331524],
       [0.11116133],
       [0.9567862 ],
       [0.7208514 ],
       [0.71639484],
       [0.07398109],
       [0.98982316],
       [0.12532522],
       [0.1592703 ],
       [0.55840844],
       [0.9771424 ],
       [0.1323705 ],
       [0.10524708],
       [0.4154689 ],
       [0.28941706],
       [0.4117146 ],
       [0.1220594 ],
       [0.139152  ],
       [0.15073144],
       [0.15073144],
       [0.12112818],
       [0.6865638 ],
       [0.17951673],
       [0.13429736],
       [0.07827498],
       [0.12112818],
       [0.09596555],
       [0.50393295],
       [0.13836616],
       [0.69058913],
       [0.10323772],
       [0.99961144],
       [0.21540013],
       [0.72686446],
       [0.08707963],
       [0.9338829 ],
       [0.15459599],
       [0.15073144],
       [0.14189671],
       [0.11037733],
       [0.95464087],
       [0.49799284],
       [0.03917548],
       [0.11853444],
       ...

In [57]:
int(True)

1

In [58]:
(ypred >= 0.5).astype("int")

array([[0],
       [1],
       [0],
       [0],
       [0],
       [1],
       [1],
       [1],
       [0],
       [1],
       [0],
       [0],
       [1],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [1],
       [0],
       [1],
       [0],
       [1],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       [1],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [1],
       [0],
       [1],
       [1],
       [1],
       [0],
       [1],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       ...

### Convert Probabilities to Class Labels and Evaluate:

In [59]:
ypred = (ypred >= 0.5).astype("int")

In [60]:
accuracy_score(ytest, ypred)  # signature is (y_true, y_pred); accuracy itself is symmetric

0.8268156424581006
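The reported value is consistent with 148 correct predictions out of 179 test rows (20% of 891, rounded up). Accuracy alone does not show which kind of error the model makes, so the sketch below thresholds a few probabilities and tallies a 2x2 confusion table by hand. The probabilities and labels are made up for illustration and are not the notebook's test set.

```python
import numpy as np

# Hypothetical probabilities and labels, not the notebook's actual test set.
probs = np.array([[0.17], [0.91], [0.12], [0.56], [0.41]])
labels = np.array([0, 1, 0, 0, 1])

# Threshold at 0.5 and flatten the (n, 1) column to shape (n,).
preds = (probs >= 0.5).astype(int).ravel()

# Fraction of positions where prediction and label agree.
accuracy = float((preds == labels).mean())

# 2x2 confusion counts: rows are true labels, columns are predictions.
confusion = np.zeros((2, 2), dtype=int)
for t, p in zip(labels, preds):
    confusion[t, p] += 1

print(preds, accuracy)
print(confusion)
```

On real data, `sklearn.metrics.confusion_matrix(ytest, ypred)` produces the same layout without the manual loop.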