This is a simple example and starting point for neural networks with TensorFlow.
We create a feed-forward neural network with two hidden layers (128 and 256 nodes)
and ReLU units.
The test accuracy is around 78.5%, which is not too bad for such a simple model.

In [None]:
import numpy as np
import pandas as pd        # For loading and processing the dataset
import tensorflow as tf    # Of course, we need TensorFlow.
from sklearn.model_selection import train_test_split

## Reading and cleaning the input data

We first read the CSV input file using Pandas.
Next, we remove irrelevant entries, and prepare the data for our neural network.

In [None]:
# Read the CSV input file and show first 5 rows
df_train = pd.read_csv('/content/input/train.csv')
df_train.head(5)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [None]:
# We can't do much with the PassengerId, Name, Ticket number, and Cabin, so we drop them.
df_train = df_train.drop(['PassengerId','Name','Ticket', 'Cabin'], axis=1)

In [None]:
# To make 'Sex' numeric, we replace 'female' by 0 and 'male' by 1
df_train['Sex'] = df_train['Sex'].map({'female':0, 'male':1}).astype(int)

In [None]:
# We replace 'Embarked' by three dummy variables 'Embarked_S', 'Embarked_C', and 'Embarked_Q',
# which are 1 if the person embarked there, and 0 otherwise.
df_train = pd.concat([df_train, pd.get_dummies(df_train['Embarked'], prefix='Embarked')], axis=1)
df_train = df_train.drop('Embarked', axis=1)
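To see what `pd.get_dummies` produces, here is a minimal sketch on a made-up `Embarked` column (the toy values are an assumption, not rows from the dataset). Recent pandas versions return boolean columns by default; the optional `dtype=int` argument used here gives 0/1 integers instead.

```python
import pandas as pd

# Toy column with the three ports that occur in the dataset (values are made up)
toy = pd.DataFrame({'Embarked': ['S', 'C', 'Q', 'S']})

# One indicator column per distinct value, named '<prefix>_<value>'
dummies = pd.get_dummies(toy['Embarked'], prefix='Embarked', dtype=int)
print(dummies)
```

Each row has exactly one indicator set, so the three columns together carry the same information as the original column.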

In [None]:
# We normalize the age and the fare by subtracting their mean and dividing by the standard deviation
age_mean = df_train['Age'].mean()
age_std = df_train['Age'].std()
df_train['Age'] = (df_train['Age'] - age_mean) / age_std

fare_mean = df_train['Fare'].mean()
fare_std = df_train['Fare'].std()
df_train['Fare'] = (df_train['Fare'] - fare_mean) / fare_std

In [None]:
# In many cases, the 'Age' is missing, which can cause problems. Let's see how bad it is:
print("Number of missing 'Age' values: {:d}".format(df_train['Age'].isnull().sum()))

# A simple method to handle these missing values is to replace them by the mean age
# (which is approximately 0 after the normalization above).
df_train['Age'] = df_train['Age'].fillna(df_train['Age'].mean())

Number of missing 'Age' values: 177
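The order of the two steps (normalize first, fill afterwards) works because pandas skips missing values when computing `mean()` and `std()`. The following sketch illustrates this on made-up ages; the numbers are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np
import pandas as pd

# Toy ages with one missing value (made-up numbers)
age = pd.Series([20.0, 30.0, np.nan, 40.0])

# mean() and std() skip NaN by default, so the statistics come from the known ages only
standardized = (age - age.mean()) / age.std()

# After standardization the mean is (numerically) zero,
# so filling with the mean is the same as filling with 0
filled = standardized.fillna(standardized.mean())
print(filled.tolist())
```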


In [None]:
# With that, we're almost ready for training
df_train.head()

Unnamed: 0,Survived,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked_C,Embarked_Q,Embarked_S
0,0,3,1,-0.530005,1,0,-0.502163,False,False,True
1,1,1,0,0.57143,1,0,0.786404,True,False,False
2,1,3,0,-0.254646,0,0,-0.48858,False,False,True
3,1,1,0,0.364911,1,0,0.420494,False,False,True
4,0,3,1,0.364911,0,0,-0.486064,False,False,True


In [None]:
# Finally, we convert the Pandas dataframe to NumPy arrays and split them into a training and a test set.
# The dummy columns are boolean, so we cast all features to float32.
X_train = df_train.drop('Survived', axis=1).to_numpy(dtype=np.float32)
y_train = df_train['Survived'].to_numpy()

# Hold out 20% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.2)

In [None]:
# We'll build a classifier with two classes: "survived" and "didn't survive",
# so we create the corresponding one-hot labels
# This is taken from https://www.kaggle.com/klepacz/titanic/tensor-flow
labels_train = (np.arange(2) == y_train[:,None]).astype(np.float32)
labels_test = (np.arange(2) == y_test[:,None]).astype(np.float32)
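The comparison above relies on NumPy broadcasting. As a small illustration on made-up class indices (an assumption, not data from the notebook):

```python
import numpy as np

y = np.array([0, 1, 1, 0])  # toy class indices (made up)

# y[:, None] has shape (4, 1) and np.arange(2) has shape (2,).
# Broadcasting compares every label with every class index, giving shape (4, 2).
one_hot = (np.arange(2) == y[:, None]).astype(np.float32)
print(one_hot)
```

Column 0 is set for "didn't survive" and column 1 for "survived", so `np.argmax(one_hot, axis=1)` recovers the original labels.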

## Define TensorFlow model
In a first step, we define how our neural network will look.
We create a network with 2 hidden layers with ReLU activations, and an output layer with softmax.
We use dropout for regularization.

In [None]:
import tensorflow as tf

# The graph below uses TF1-style placeholders, which require eager execution to be disabled in TensorFlow 2
tf.compat.v1.disable_eager_execution()

# Placeholders for a batch of feature vectors and the matching one-hot labels
inputs = tf.compat.v1.placeholder(tf.float32, shape=(None, X_train.shape[1]), name='inputs')
label = tf.compat.v1.placeholder(tf.float32, shape=(None, 2), name='labels')


# First layer: weights have shape (outputs, inputs), so the input batch is transposed
hid1_size = 128
w1 = tf.Variable(tf.compat.v1.random_normal([hid1_size, X_train.shape[1]], stddev=0.01), name='w1')
b1 = tf.Variable(tf.constant(0.1, shape=(hid1_size, 1)), name='b1')
# Note: the dropout rate is fixed at 0.5, so dropout is also active whenever the graph is evaluated
y1 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(w1, tf.transpose(inputs)), b1)), rate=0.5)

# Second layer
hid2_size = 256
w2 = tf.Variable(tf.compat.v1.random_normal([hid2_size, hid1_size], stddev=0.01), name='w2')
b2 = tf.Variable(tf.constant(0.1, shape=(hid2_size, 1)), name='b2')
y2 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(w2, y1), b2)), rate=0.5)

# Output layer: produces the logits, transposed back to shape (batch, 2)
wo = tf.Variable(tf.compat.v1.random_normal([2, hid2_size], stddev=0.01), name='wo')
bo = tf.Variable(tf.random.normal([2, 1]), name='bo')
yo = tf.transpose(tf.add(tf.matmul(wo, y2), bo))
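Because the weights are stored as (outputs, inputs) and the batch is transposed, it can help to trace the shapes once outside TensorFlow. The following NumPy sketch mirrors the layer sizes above but leaves out dropout; the batch size of 5 and the random values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 9          # toy batch; 9 matches the feature columns shown earlier
hid1_size, hid2_size = 128, 256

x = rng.normal(size=(n_samples, n_features))

# Weights are stored as (outputs, inputs), biases as column vectors
w1 = rng.normal(scale=0.01, size=(hid1_size, n_features))
b1 = np.full((hid1_size, 1), 0.1)
w2 = rng.normal(scale=0.01, size=(hid2_size, hid1_size))
b2 = np.full((hid2_size, 1), 0.1)
wo = rng.normal(scale=0.01, size=(2, hid2_size))
bo = rng.normal(size=(2, 1))

def relu(z):
    return np.maximum(z, 0.0)

y1 = relu(w1 @ x.T + b1)   # shape (128, 5): one column per sample
y2 = relu(w2 @ y1 + b2)    # shape (256, 5)
yo = (wo @ y2 + bo).T      # shape (5, 2): one row of logits per sample
print(y1.shape, y2.shape, yo.shape)
```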


The output is a softmax output, and we train it with the cross entropy loss.
We further define functions which calculate the predicted label, and the accuracy of the network.

In [None]:
# Loss function and optimizer
lr = tf.compat.v1.placeholder(tf.float32, shape=(), name='learning_rate')
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=yo, labels=label))
optimizer = tf.compat.v1.train.GradientDescentOptimizer(lr).minimize(loss)

# Prediction
pred = tf.nn.softmax(yo)
pred_label = tf.argmax(pred, 1)
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(label, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
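For reference, the loss and accuracy defined above can be written out in plain NumPy. This is a sketch of the standard formulas, not TensorFlow's implementation; the function names and the toy logits are assumptions for illustration.

```python
import numpy as np

def softmax_np(logits):
    # Subtract the row-wise maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_np(logits, one_hot_labels):
    # Mean over samples of -sum(labels * log(softmax(logits)))
    probs = softmax_np(logits)
    return float(np.mean(-np.sum(one_hot_labels * np.log(probs), axis=1)))

def accuracy_np(logits, one_hot_labels):
    # Fraction of samples whose largest logit sits at the labelled class
    return float(np.mean(np.argmax(logits, axis=1) == np.argmax(one_hot_labels, axis=1)))

logits = np.array([[2.0, 0.0], [0.0, 0.0], [-1.0, 3.0]])   # toy logits (made up)
labels = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])    # toy one-hot labels
print(cross_entropy_np(logits, labels), accuracy_np(logits, labels))
```

Only the first toy sample is classified correctly, so the accuracy is one third, and the confident wrong prediction in the last row dominates the loss.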

## Train the network!

Finally, we are ready to train our network. Let's initialize TensorFlow and start training.

In [None]:
# Create operation which will initialize all variables
init = tf.compat.v1.global_variables_initializer()

# Configure GPU not to use all memory
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True

# Start a new tensorflow session and initialize variables
sess = tf.compat.v1.InteractiveSession(config=config)
sess.run(init)

In [None]:
# This is the main training loop: we train for 50 epochs with a learning rate of 0.05 and another
# 50 epochs with a smaller learning rate of 0.01
for learning_rate in [0.05, 0.01]:
    for epoch in range(50):
        avg_cost = 0.0

        # For each epoch, we go through all the samples we have.
        for i in range(X_train.shape[0]):
            # This is where the magic happens: run the optimizer, feeding the current example into `inputs` and its target into `label`
            _, c = sess.run([optimizer, loss], feed_dict={lr:learning_rate,
                                                          inputs: X_train[i, None],
                                                          label: labels_train[i, None]})
            avg_cost += c
        avg_cost /= X_train.shape[0]

        # Print the average cost of every 10th epoch to the console.
        if epoch % 10 == 0:
            print("Epoch: {:3d}    Train Cost: {:.4f}".format(epoch, avg_cost))

Epoch:   0    Train Cost: 0.6486
Epoch:  10    Train Cost: 0.5212
Epoch:  20    Train Cost: 0.5192
Epoch:  30    Train Cost: 0.5067
Epoch:  40    Train Cost: 0.5239
Epoch:   0    Train Cost: 0.4679
Epoch:  10    Train Cost: 0.4254
Epoch:  20    Train Cost: 0.4221
Epoch:  30    Train Cost: 0.4182
Epoch:  40    Train Cost: 0.4037
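The loop above feeds one sample per optimizer step. Since the placeholders leave the batch dimension open (`None`), a mini-batch variant is possible. The sketch below shows only the batching part in NumPy; `iterate_minibatches` is a hypothetical helper, not part of the notebook or of TensorFlow, and the toy arrays are made up.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield (X_batch, y_batch) pairs covering the data once in random order."""
    order = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

rng = np.random.default_rng(0)
X_toy = np.arange(20, dtype=np.float32).reshape(10, 2)   # row i is [2i, 2i + 1]
y_toy = np.arange(10)
batches = list(iterate_minibatches(X_toy, y_toy, batch_size=4, rng=rng))
print([len(xb) for xb, _ in batches])
```

Each `(xb, yb)` pair could then be passed as `feed_dict={inputs: xb, label: yb, lr: learning_rate}`. The learning rates used above were chosen for single-sample updates and would probably need retuning.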


We calculate the accuracy on our training set, and (more importantly) our test set.

In [None]:
acc_train = accuracy.eval(feed_dict={inputs: X_train, label: labels_train})
print("Train accuracy: {:3.2f}%".format(acc_train*100.0))

acc_test = accuracy.eval(feed_dict={inputs: X_test, label: labels_test})
print("Test accuracy:  {:3.2f}%".format(acc_test*100.0))

Train accuracy: 84.13%
Test accuracy:  81.56%
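One caveat: the graph applies `tf.nn.dropout` with a fixed `rate=0.5`, so these accuracies are computed with dropout still active and will vary slightly between evaluations. Dropout is normally switched off at evaluation time. The NumPy sketch below shows the idea of inverted dropout and why a rate of 0 leaves the activations untouched; the `dropout` function here is an illustrative assumption, not TensorFlow's implementation.

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of the entries and rescale the rest."""
    if rate == 0.0:
        return x                              # evaluation mode: nothing is dropped
    keep = rng.random(x.shape) >= rate        # True for entries that survive
    return x * keep / (1.0 - rate)            # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
activations = np.ones((1000, 10))
train_out = dropout(activations, rate=0.5, rng=rng)
eval_out = dropout(activations, rate=0.0, rng=rng)
print(train_out.mean(), eval_out.mean())
```

In the TensorFlow graph, one option would be to make the rate a placeholder (for example with `tf.compat.v1.placeholder_with_default`) and feed 0.5 during training and 0.0 otherwise. That is a suggestion, not what this notebook does.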


## Predict new passengers

If we're happy with these results, we load the test dataset, and do all pre-processing steps we also did for the training set.

In [None]:
df_test = pd.read_csv('/content/input/test.csv')
df_test.head()

Unnamed: 0,PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,892,3,"Kelly, Mr. James",male,34.5,0,0,330911,7.8292,,Q
1,893,3,"Wilkes, Mrs. James (Ellen Needs)",female,47.0,1,0,363272,7.0,,S
2,894,2,"Myles, Mr. Thomas Francis",male,62.0,0,0,240276,9.6875,,Q
3,895,3,"Wirz, Mr. Albert",male,27.0,0,0,315154,8.6625,,S
4,896,3,"Hirvonen, Mrs. Alexander (Helga E Lindqvist)",female,22.0,1,1,3101298,12.2875,,S


In [None]:
# Drop the columns we also dropped from the training set (PassengerId is kept for the submission file)
df_test = df_test.drop(['Name', 'Ticket', 'Cabin'], axis=1)

# Encode 'Sex' and 'Embarked' exactly as for the training set
df_test['Sex'] = df_test['Sex'].map({'female': 0, 'male': 1}).astype(int)
df_test = pd.concat([df_test, pd.get_dummies(df_test['Embarked'], prefix='Embarked')], axis=1)
df_test = df_test.drop('Embarked', axis=1)

# Normalize with the training-set statistics; missing values become 0, i.e. the training mean
df_test['Age'] = ((df_test['Age'] - age_mean) / age_std).fillna(0.0)
df_test['Fare'] = ((df_test['Fare'] - fare_mean) / fare_std).fillna(0.0)

# Note: this cell modifies df_test in place, so run it only once per freshly loaded df_test
X_test = df_test.drop('PassengerId', axis=1).to_numpy(dtype=np.float32)
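Because these steps modify `df_test` in place, re-running the cell applies them a second time to data that is already processed. One way to avoid this is a function that returns a new frame and reuses the training statistics. The sketch below is a possible shape for such a helper; `preprocess`, `stats`, and `feature_columns` are hypothetical names, and the toy frame and statistics are made up.

```python
import pandas as pd

def preprocess(df, stats, feature_columns):
    """Return a new, model-ready frame; the input frame is left untouched."""
    out = df.drop(columns=['Name', 'Ticket', 'Cabin'], errors='ignore').copy()
    out['Sex'] = out['Sex'].map({'female': 0, 'male': 1})
    dummies = pd.get_dummies(out['Embarked'], prefix='Embarked', dtype=int)
    out = pd.concat([out.drop(columns='Embarked'), dummies], axis=1)
    for col in ('Age', 'Fare'):
        mean, std = stats[col]
        # Standardize with the training statistics; missing values become 0 (the training mean)
        out[col] = ((out[col] - mean) / std).fillna(0.0)
    # Same columns in the same order as during training; absent dummy columns are filled with 0
    return out.reindex(columns=feature_columns, fill_value=0)

toy = pd.DataFrame({
    'PassengerId': [1, 2], 'Pclass': [3, 1], 'Name': ['A', 'B'],
    'Sex': ['male', 'female'], 'Age': [40.0, None], 'SibSp': [0, 1],
    'Parch': [0, 0], 'Ticket': ['x', 'y'], 'Fare': [10.0, 30.0],
    'Cabin': [None, 'C1'], 'Embarked': ['S', 'C'],
})
stats = {'Age': (30.0, 10.0), 'Fare': (20.0, 10.0)}   # made-up training mean and std
feature_columns = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare',
                   'Embarked_C', 'Embarked_Q', 'Embarked_S']
features = preprocess(toy, stats, feature_columns)
print(features)
```

Calling `preprocess(toy, ...)` twice gives the same result both times, because `toy` itself is never changed.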


Then we predict the label for every passenger in the test data.

In [None]:
# Predict
for i in range(X_test.shape[0]):
    df_test.loc[i, 'Survived'] = sess.run(pred_label, feed_dict={inputs: X_test[i, None]}).squeeze()

In [None]:
# Important: close the TensorFlow session, now that we're finished.
sess.close()

Finally, we can create an output file to upload to Kaggle.

In [None]:
output = pd.DataFrame()
output['PassengerId'] = df_test['PassengerId']
output['Survived'] = df_test['Survived'].astype(int)
output.to_csv('./prediction.csv', index=False)
output.head()

Unnamed: 0,PassengerId,Survived
0,892,1
1,893,1
2,894,1
3,895,1
4,896,1
