# Titanic - Machine Learning from Disaster

This notebook demonstrates a neural network built from scratch using only NumPy. It lives in the project's `examples` folder and applies our custom implementation to the classic Titanic dataset to predict which passengers survived.

We walk through loading the dataset, preprocessing it (one-hot encoding, a train/test split, and feature scaling), training the network, and scoring its predictions on the held-out test set.

Kaggle link: https://www.kaggle.com/competitions/titanic/overview

In [41]:
# Add the parent folder to the import path so the project modules can be found
import sys
sys.path.insert(0, '..')

# Import pandas for loading the dataset and NumPy for array handling
import pandas as pd
import numpy as np

# Import scikit-learn helpers for scaling, scoring, and splitting the data
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Import the custom neural network
from neural_network import NeuralNetwork

In [42]:
# Load the Titanic data (a copy of the dataset hosted in the Stanford CS109 course archive)
df_titanic = pd.read_csv("https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv")

df_titanic.head(10)

| | Survived | Pclass | Name | Sex | Age | Siblings/Spouses Aboard | Parents/Children Aboard | Fare |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | Mr. Owen Harris Braund | male | 22.0 | 1 | 0 | 7.25 |
| 1 | 1 | 1 | Mrs. John Bradley (Florence Briggs Thayer) Cum... | female | 38.0 | 1 | 0 | 71.2833 |
| 2 | 1 | 3 | Miss. Laina Heikkinen | female | 26.0 | 0 | 0 | 7.925 |
| 3 | 1 | 1 | Mrs. Jacques Heath (Lily May Peel) Futrelle | female | 35.0 | 1 | 0 | 53.1 |
| 4 | 0 | 3 | Mr. William Henry Allen | male | 35.0 | 0 | 0 | 8.05 |
| 5 | 0 | 3 | Mr. James Moran | male | 27.0 | 0 | 0 | 8.4583 |
| 6 | 0 | 1 | Mr. Timothy J McCarthy | male | 54.0 | 0 | 0 | 51.8625 |
| 7 | 0 | 3 | Master. Gosta Leonard Palsson | male | 2.0 | 3 | 1 | 21.075 |
| 8 | 1 | 3 | Mrs. Oscar W (Elisabeth Vilhelmina Berg) Johnson | female | 27.0 | 0 | 2 | 11.1333 |
| 9 | 1 | 2 | Mrs. Nicholas (Adele Achem) Nasser | female | 14.0 | 1 | 0 | 30.0708 |


In [43]:
# Drop Name and one-hot encode Sex (Pclass is already numeric, so get_dummies leaves it unchanged)
df_titanic = pd.get_dummies(df_titanic.drop('Name', axis=1), dtype = int)
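
By default, `pd.get_dummies` expands only text and categorical columns, which is why the feature list contains `Sex_female` and `Sex_male` but a single numeric `Pclass`. The sketch below illustrates this on a small made-up frame (the `toy` data is illustrative, not taken from the dataset):

```python
import pandas as pd

# Tiny stand-in for the Titanic frame (values are illustrative only)
toy = pd.DataFrame({
    "Pclass": [3, 1, 3],
    "Sex": ["male", "female", "female"],
})

# Only the text column is expanded; the integer Pclass column is kept as-is
encoded = pd.get_dummies(toy, dtype=int)
print(encoded.columns.tolist())
```

If `Pclass` should be treated as a category rather than an ordered number, passing `columns=['Pclass', 'Sex']` to `pd.get_dummies` would expand both columns.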

In [44]:
# Define the features
features = ['Pclass', 'Age', 'Siblings/Spouses Aboard', 'Parents/Children Aboard', 'Fare', 'Sex_female', 'Sex_male']

In [45]:
# Hold out 33% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(df_titanic[features], df_titanic['Survived'], test_size=0.33, random_state=42)

# Reshape y_train into a column vector of shape (n_samples, 1)
y_train = np.array(y_train).reshape(-1,1)
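
The reshape presumably matters because a network with a single output unit usually produces predictions of shape `(n_samples, 1)`. Assuming that is the case here (the `NeuralNetwork` internals are not shown in this notebook), the sketch below shows how a flat label array would broadcast against such an output. The `y_pred` values are hypothetical:

```python
import numpy as np

y_flat = np.array([0, 1, 1, 0])                   # shape (4,), like a raw label Series
y_col = y_flat.reshape(-1, 1)                     # shape (4, 1), one label per row
y_pred = np.array([[0.2], [0.9], [0.6], [0.1]])   # hypothetical network output, shape (4, 1)

# Matching shapes give one error value per sample
err_ok = y_pred - y_col

# Mismatched shapes broadcast silently into a 4x4 matrix
err_bad = y_pred - y_flat

print(err_ok.shape, err_bad.shape)
```

The second subtraction raises no error, so a loss computed from it would be wrong without any warning.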

In [46]:
# Standardize the features: fit the scaler on the training set only, then reuse it on the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
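
Fitting on the training set and only transforming the test set keeps test-set statistics out of preprocessing. As a rough NumPy equivalent of what `StandardScaler` does in this cell (the arrays below are synthetic stand-ins, not the Titanic features):

```python
import numpy as np

rng = np.random.default_rng(0)
X_tr = rng.normal(loc=30.0, scale=10.0, size=(200, 3))  # synthetic training features
X_te = rng.normal(loc=30.0, scale=10.0, size=(50, 3))   # synthetic test features

# "fit": learn per-column statistics from the training data only
mean = X_tr.mean(axis=0)
std = X_tr.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# "transform": apply the same training statistics to both sets
X_tr_scaled = (X_tr - mean) / std
X_te_scaled = (X_te - mean) / std

print(X_tr_scaled.mean(axis=0).round(6), X_tr_scaled.std(axis=0).round(6))
```

The scaled training columns have mean 0 and standard deviation 1 by construction. The test columns will only be close to that, which is expected.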

In [63]:
# Define the network: 7 inputs, three hidden layers (50, 25, and 10 units), and 1 output
nn = NeuralNetwork(layers=[7,50,25,10,1], activation='relu', loss='cross_entropy', random_seed=42)

# Train the model
nn.train(np.array(X_train), y_train, epochs=5000, learning_rate=0.01)

# Store the results from the prediction
results = nn.predict(X_test)

# Print the accuracy score
print(f'\nThe accuracy score of the model is: {accuracy_score(y_test, results):.4f}')
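
The internals of `NeuralNetwork` are not shown in this notebook, so the following is only a sketch of what a forward pass and loss for `layers=[7,50,25,10,1]` with ReLU and cross-entropy might look like in plain NumPy. The helper names (`init_params`, `forward`, `binary_cross_entropy`), the He-style initialization, the sigmoid output, and the 0.5 threshold are all assumptions for illustration, not the project's actual code:

```python
import numpy as np

def init_params(layers, seed=42):
    """Random weights (He-style scaling, an assumption) and zero biases per layer pair."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros((1, n_out)))
        for n_in, n_out in zip(layers[:-1], layers[1:])
    ]

def forward(X, params):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    a = X
    for W, b in params[:-1]:
        a = np.maximum(0.0, a @ W + b)
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(a @ W + b)))

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

layers = [7, 50, 25, 10, 1]
params = init_params(layers)
n_params = sum(W.size + b.size for W, b in params)  # trainable weights and biases

# Synthetic inputs and labels, only to exercise the functions
X_demo = np.random.default_rng(0).normal(size=(5, 7))
y_demo = np.array([[0], [1], [1], [0], [1]])

probs = forward(X_demo, params)          # shape (5, 1), values in (0, 1)
labels = (probs >= 0.5).astype(int)      # class labels, as accuracy_score expects
loss = binary_cross_entropy(y_demo, probs)
print(n_params, probs.shape, labels.ravel(), loss)
```

Note that `accuracy_score` needs class labels rather than probabilities, so the cell above works only if `nn.predict` already returns 0/1 labels; if it returned probabilities, a thresholding step like the one in this sketch would be needed first.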

Epoch: 0, loss: 1.6846050211243135
Epoch: 100, loss: 0.7526639586502518
Epoch: 200, loss: 0.4700807862930637
Epoch: 300, loss: 0.4345743034448419
Epoch: 400, loss: 0.3600407413366649
Epoch: 500, loss: 0.3922902760905672
Epoch: 600, loss: 0.38445653091731885
Epoch: 700, loss: 0.37970644983387114
Epoch: 800, loss: 0.3273750967389601
Epoch: 900, loss: 0.32371629453677414
Epoch: 1000, loss: 0.3208592766569313
Epoch: 1100, loss: 0.31869473593915676
Epoch: 1200, loss: 0.3169070216983008
Epoch: 1300, loss: 0.2651726728436184
Epoch: 1400, loss: 0.2618072124004991
Epoch: 1500, loss: 0.2595085566014691
Epoch: 1600, loss: 0.21744343972578276
Epoch: 1700, loss: 0.21162014215120273
Epoch: 1800, loss: 0.20726852891211164
Epoch: 1900, loss: 0.2050769585057394
Epoch: 2000, loss: 0.20358289216525063
Epoch: 2100, loss: 0.20175106995626385
Epoch: 2200, loss: 0.2003442754291006
Epoch: 2300, loss: 0.1989995020799597
Epoch: 2400, loss: 0.19795145517561485
Epoch: 2500, loss: 0.19685365544783276
Epoch: 2600, 