# Loading Data

In [1]:
import numpy as np 
import tflearn

# Download the Titanic dataset
from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')

# Load the CSV file; column 0 holds the labels (survived or not)
from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0, categorical_labels=True, n_classes=2)
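
With `categorical_labels=True` and `n_classes=2`, the labels are returned as two-element one-hot vectors rather than as single 0/1 values. The encoding can be sketched in plain Python as follows. Note that `to_one_hot` is a hypothetical helper written for illustration, not a TFLearn function, and the label values are made up.

```python
def to_one_hot(label, n_classes):
    """Return a list with 1.0 at position `label` and 0.0 elsewhere."""
    vector = [0.0] * n_classes
    vector[int(label)] = 1.0
    return vector

# Survival is stored as 0 (did not survive) or 1 (survived).
# These three labels are made up for illustration.
raw_labels = ['0', '1', '1']
encoded = [to_one_hot(label, 2) for label in raw_labels]
print(encoded)
```

Position 1 of each vector corresponds to class 1 (survived), which is the column read from the predictions later on.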


# Data Preprocessing

We need to convert all our data to numerical values, because a neural network can only operate on numbers. However, our dataset contains some non-numerical fields, such as 'name', 'ticket', and 'sex'. The 'name' and 'ticket' columns are discarded, so only the 'sex' field needs converting. In this simple case, we assign 0 to males and 1 to females.

In [2]:
# Preprocessing function

def preprocess(passengers, columns_to_delete):
    # Delete columns in descending index order so that earlier
    # deletions do not shift the positions of later ones
    for column_to_delete in sorted(columns_to_delete, reverse=True):
        for passenger in passengers:
            passenger.pop(column_to_delete)

    for passenger in passengers:
        # Convert the 'sex' field to float (index 1 once 'name' is removed)
        passenger[1] = 1. if passenger[1] == 'female' else 0.
    return np.array(passengers, dtype=np.float32)

# Ignore the 'name' and 'ticket' columns (indices 1 and 6 of the data array)
to_ignore = [1, 6]

# Preprocess data
data = preprocess(data, to_ignore)
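
The order of deletion matters because every `pop` shifts the remaining items to the left. The sketch below shows the effect on a single made-up row laid out like a row of `data` before preprocessing. `drop_columns` is a hypothetical helper for illustration only.

```python
# Layout: pclass, name, sex, age, sibsp, parch, ticket, fare
# (made-up passenger values, for illustration only)
row = ['1', 'Doe, Mr. John', 'male', '30', '0', '0', 'A/5 1234', '7.25']

def drop_columns(values, indices, descending):
    """Pop the given positions from a copy of `values` in the chosen order."""
    values = list(values)
    for index in sorted(indices, reverse=descending):
        values.pop(index)
    return values

# Descending order: 'ticket' (6) goes first, so 'name' is still at position 1
print(drop_columns(row, [1, 6], descending=True))
# Ascending order: once 'name' is popped, position 6 holds the fare instead
print(drop_columns(row, [1, 6], descending=False))
```

In the ascending case the fare is removed and the ticket string stays in the row, which would then fail to convert to a float.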
    

# Build a Deep Neural Network

We are building a 3-layer neural network using TFLearn. First we need to specify the shape of our input data. Each sample has 6 features, and samples are processed in batches to save memory, so the input shape is [None, 6]. 'None' stands for an unknown dimension, which lets us change the number of samples processed per batch.

In [3]:
# Build neural network
nnet = tflearn.input_data(shape=[None, 6])
nnet = tflearn.fully_connected(nnet, 32)
nnet = tflearn.fully_connected(nnet, 32)
nnet = tflearn.fully_connected(nnet, 2, activation='softmax')
nnet = tflearn.regression(nnet)
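
To get a feel for the size of this network, the number of trainable parameters can be counted by hand. This is a rough sketch that assumes every fully connected layer has one weight per input/output pair plus one bias per output unit. `count_parameters` is a hypothetical helper, not part of TFLearn.

```python
# Layer widths: 6 input features, two hidden layers of 32 units, 2 output classes
layer_sizes = [6, 32, 32, 2]

def count_parameters(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layers."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        total += n_in * n_out + n_out
    return total

print(count_parameters(layer_sizes))  # 224 + 1056 + 66 = 1346
```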

# Training

TFLearn provides a model wrapper, 'DNN', that automatically handles neural network classifier tasks such as training, prediction, and save/restore. We will run it for 40 epochs (the network will see all the data 40 times) with a batch size of 32.

In [4]:
# define model
model = tflearn.DNN(nnet)

# Start training; the log below reports the Adam optimizer
model.fit(data, labels, n_epoch=40, batch_size=32, show_metric=True)

Training Step: 1639  | total loss: 0.45191 | time: 0.510s
| Adam | epoch: 040 | loss: 0.45191 - acc: 0.8050 -- iter: 1280/1309
Training Step: 1640  | total loss: 0.45112 | time: 0.520s
| Adam | epoch: 040 | loss: 0.45112 - acc: 0.8027 -- iter: 1309/1309
--
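
The final step count in the log follows from the dataset size, the batch size, and the number of epochs. A short sketch of the arithmetic, assuming the last, smaller batch of each epoch counts as one step:

```python
import math

n_samples = 1309   # from the "iter: 1309/1309" counter in the log
batch_size = 32
n_epoch = 40

steps_per_epoch = math.ceil(n_samples / batch_size)
total_steps = steps_per_epoch * n_epoch
# Samples left over for the final batch of each epoch
last_batch = n_samples - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, total_steps, last_batch)
```

This gives 41 steps per epoch and 1640 steps in total, matching the last "Training Step" line. The final batch holds 29 samples, which is why the counter jumps from 1280 to 1309.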


The model finishes training with an accuracy of around 80% on the training data, which means that it predicts the correct outcome (survived or not) for about 80% of the passengers it was trained on.
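
Accuracy here is the fraction of passengers whose most probable predicted class matches the true label. A minimal sketch of that calculation follows, using made-up predictions and one-hot labels rather than values from the dataset. `accuracy` is a hypothetical helper, not the metric implementation TFLearn uses internally.

```python
def accuracy(predictions, one_hot_labels):
    """Fraction of rows where the most probable class matches the label."""
    def argmax(values):
        return max(range(len(values)), key=lambda i: values[i])
    hits = sum(argmax(p) == argmax(t) for p, t in zip(predictions, one_hot_labels))
    return hits / len(predictions)

# Made-up predictions and labels, for illustration only
predictions = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
labels = [[1, 0], [0, 1], [0, 1], [0, 1], [1, 0]]
print(accuracy(predictions, labels))  # 4 of 5 rows are correct
```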

# Testing the Model

It's time to try out the model. Let's take the protagonists of the Titanic movie (played by Leonardo DiCaprio and Kate Winslet) and calculate their chance of surviving (class 1).

In [5]:
# Let's create some data for Leonardo DiCaprio and Kate Winslet
l_dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
k_winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]

# preprocess data
l_dicaprio, k_winslet = preprocess([l_dicaprio, k_winslet], to_ignore)

# Let's predict the chances of surviving (class 1 results)
pred = model.predict([l_dicaprio, k_winslet])
print("Leonardo DiCaprio Surviving Rate:", pred[0][1])
print("Kate Winslet Surviving Rate:", pred[1][1])

Leonardo DiCaprio Surviving Rate: 0.119364284
Kate Winslet Surviving Rate: 0.93090546
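
Each prediction is a pair of softmax probabilities, so the class 0 value is one minus the class 1 value printed above. The sketch below turns the two reported probabilities into a yes/no outcome. The 0.5 cut-off and the `classify` helper are assumptions made for illustration, not part of the original notebook.

```python
def classify(p_survive, threshold=0.5):
    """Map a class-1 probability to a label; the 0.5 cut-off is an assumption."""
    return 'survived' if p_survive >= threshold else 'did not survive'

# Class-1 probabilities copied from the output above
survival_probability = {
    'Leonardo DiCaprio': 0.119364284,
    'Kate Winslet': 0.93090546,
}

for name, p_survive in survival_probability.items():
    p_die = 1.0 - p_survive  # the two softmax outputs sum to 1
    print(f"{name}: P(class 0)={p_die:.3f}, P(class 1)={p_survive:.3f} -> {classify(p_survive)}")
```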


Awesome! The model accurately predicted the outcome of the movie. The odds were against Leonardo DiCaprio, but Kate Winslet had a high chance of surviving.

More generally, this study suggests that women and children travelling in first class have the highest chance of surviving, while male third-class passengers have the lowest.

Thank You!