# My First Neural Network with Keras

Keras is a powerful, easy-to-use Python library for developing and evaluating deep learning models.

It wraps the efficient numerical computation libraries Theano and TensorFlow and allows you to define and train neural network models in a few short lines of code.

Here we will create our first neural network model in Python using Keras. Let’s get started.

The steps involved are:

- Load Data
- Training and Testing Data Split
- Define Model
- Compile Model
- Fit Model
- Make Predictions
- Evaluate Model

## Load Data

Let us use the Pima Indians onset of diabetes [dataset](http://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes). This is a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years.

As such, it is a binary classification problem (onset of diabetes as 1 or not as 0). All of the input variables that describe each patient are numerical. This makes it easy to use directly with neural networks that expect numerical input and output values, and ideal for our first neural network in Keras.

In [81]:
from keras.models import Sequential
from keras.layers import Dense
import numpy

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
print("Dataset Shape: {}".format(dataset.shape))

# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

print("Number of Samples: {}".format(len(X)))
print("Number of Features: {}".format(len(X[0])))

Dataset Shape: (768, 9)
Number of Samples: 768
Number of Features: 8
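
To make the column split concrete, the sketch below applies the same slicing to two made-up rows held in plain Python lists. The values are invented for illustration and are not rows from the dataset file.

```python
# Two made-up rows in the same layout: 8 input columns, then the class label.
# (Illustrative values only, not taken from pima-indians-diabetes.csv.)
rows = [
    [2.0, 120.0, 70.0, 30.0, 100.0, 28.5, 0.45, 35.0, 1.0],
    [1.0, 90.0, 65.0, 20.0, 80.0, 24.0, 0.20, 25.0, 0.0],
]

# Same idea as dataset[:, 0:8] and dataset[:, 8] on the NumPy array.
features = [row[0:8] for row in rows]  # first 8 columns are the inputs
labels = [row[8] for row in rows]      # last column is the 0/1 class

print(len(features[0]))  # 8
print(labels)            # [1.0, 0.0]
```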


## Training and Testing Data Split

In [82]:
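# train_test_split lives in sklearn.model_selection as of scikit-learn 0.18;
# older releases exposed it from sklearn.cross_validation.
from sklearn.model_selection import train_test_split

# Shuffle and split the dataset: 90% for training, 10% for testing,
# stratified so both sets keep the same class balance as Y.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.1, random_state=42, stratify=Y)

print("Training Samples: {}".format(len(X_train)))
print("Testing Samples: {}".format(len(X_test)))

Training Samples: 691
Testing Samples: 77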
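
The `stratify=Y` argument asks for a split that preserves the class balance in both sets. The sketch below illustrates that idea in plain Python. It is not scikit-learn's implementation; the helper name `stratified_split` and the 70/30 label counts are assumptions made up for this example.

```python
import random

def stratified_split(labels, test_size, seed):
    """Return (train_indices, test_indices), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # shuffle within the class
        n_test = int(round(len(indices) * test_size))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Hypothetical labels: 70 negatives and 30 positives (not the real dataset).
labels = [0] * 70 + [1] * 30
train_idx, test_idx = stratified_split(labels, test_size=0.1, seed=42)

print("{} train, {} test".format(len(train_idx), len(test_idx)))  # 90 train, 10 test
print(sum(labels[i] for i in test_idx))  # 3 positives in the test set
```

Because each class is sampled separately, the test set keeps the 30% share of positives found in the full label list.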


## Define Model

Models in Keras are defined as a sequence of layers.

We create a Sequential model and add layers one at a time until we are happy with our network topology.

The first thing to get right is to ensure the input layer has the right number of inputs. This can be specified when creating the first layer with the **input_dim** argument and setting it to 8 for the 8 input variables.

How do we know the number of layers and their types?

This is a hard question with no general answer. Heuristics exist, but the best network structure is often found through trial-and-error experimentation. As a rule of thumb, the network needs to be large enough to capture the structure of the problem.

In this example, we will use a fully-connected network structure with three layers.

Fully connected layers are defined using the Dense class. The first argument sets the number of neurons in the layer, the **init** argument sets the weight initialization method, and the **activation** argument sets the activation function. (Keras 2 renamed **init** to **kernel_initializer**.)

In this case, we initialize the network weights to small random numbers drawn from a uniform distribution ('**uniform**'), between -0.05 and 0.05, which is the default range for uniform weight initialization in Keras. A traditional alternative is '**normal**', which draws small random numbers from a Gaussian distribution.

We will use the [rectifier](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) ('**relu**') activation function on the first two layers and the sigmoid function in the output layer. Sigmoid and tanh activation functions used to be preferred for all layers. These days, the rectifier activation function often gives better performance in hidden layers. We use a sigmoid on the output layer to ensure the network output is between 0 and 1, so it is easy to read as the probability of class 1 or to snap to a hard classification using a default threshold of 0.5.
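
For reference, the two activation functions can be written in a few lines of plain Python. This is a minimal sketch of the math only, not how Keras implements them; the function names are chosen for this example.

```python
import math

def relu(x):
    """Rectifier: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squash a real number into the interval (0, 1).

    Fine for moderate inputs; very large negative x would overflow math.exp.
    """
    return 1.0 / (1.0 + math.exp(-x))

print(relu(-2.0))    # 0.0
print(relu(3.0))     # 3.0
print(sigmoid(0.0))  # 0.5
```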

We can piece it all together by adding each layer. The first layer has 12 neurons and expects 8 input variables. The second hidden layer has 8 neurons and finally the output layer has 1 neuron to predict the class (onset of diabetes or not).

In [83]:
# create model
model = Sequential()
model.add(Dense(12, input_dim=8, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
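
As a quick check on the size of this network, the sketch below counts its trainable parameters by hand. It assumes each Dense layer has one weight per input-neuron pair plus one bias per neuron, which is the Keras default; the helper name `dense_param_count` is made up for this example.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [8, 12, 8, 1]  # inputs, first hidden, second hidden, output
counts = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [108, 104, 9]
print(sum(counts))  # 221
```

So the whole model has only 221 numbers to learn, which is small by deep learning standards.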

## Compile Model

Now that the model is defined, we can compile it.

Compiling the model uses the efficient numerical libraries under the covers (the so-called backend), such as Theano or TensorFlow. The backend automatically chooses the best way to represent the network for training and making predictions on your hardware, such as a CPU, a GPU, or even a distributed setup.

When compiling, we must specify some additional properties required when training the network. Remember that training a network means finding the best set of weights to make predictions for this problem.

We must specify the loss function to use to evaluate a set of weights, the optimizer used to search through different weights for the network and any optional metrics we would like to collect and report during training.

In this case, we will use logarithmic loss, which for a binary classification problem is defined in Keras as "**binary_crossentropy**". We will also use the gradient descent algorithm "**adam**" for no other reason than that it is an efficient default.
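
For intuition about logarithmic loss, here is a minimal plain-Python sketch of the formula. It is not the Keras implementation; the function name and the clipping constant `eps` are assumptions chosen for this example.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean log loss over paired 0/1 labels and predicted probabilities."""
    total = 0.0
    for target, prob in zip(y_true, y_pred):
        prob = min(max(prob, eps), 1.0 - eps)  # keep log() away from zero
        total += -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))
    return total / len(y_true)

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(confident_right < confident_wrong)  # True
```

The loss is small when the predicted probability is close to the true label and grows quickly for confident wrong predictions.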

Finally, because it is a classification problem, we will collect and report the classification accuracy as the metric.

In [84]:
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

## Fit Model

We have defined our model and compiled it ready for efficient computation.

Now it is time to execute the model on some data.

We can train or fit our model on our loaded data by calling the **fit()** function on the model.

The training process runs for a fixed number of passes through the dataset, called epochs, which we specify using the **nb_epoch** argument (renamed **epochs** in Keras 2). We can also set the number of instances that are evaluated before a weight update is performed, called the **batch size**, using the **batch_size** argument.

For this problem, we will run for a small number of epochs (150) and use a relatively small batch size of 10. Again, these can be chosen experimentally by trial and error.
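
These two settings determine how many weight updates training performs. The arithmetic below is a rough sketch that assumes the final, smaller batch of each epoch is still processed.

```python
import math

n_train = 691    # training samples reported by the split earlier
batch_size = 10
n_epochs = 150

batches_per_epoch = int(math.ceil(n_train / float(batch_size)))
total_updates = batches_per_epoch * n_epochs

print(batches_per_epoch)  # 70 (69 full batches plus one batch of 1 sample)
print(total_updates)      # 10500
```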

In [85]:
from time import time

# Fit the model
# This is where the work happens on your CPU or GPU.

start = time()
model.fit(X_train, Y_train, nb_epoch=150, batch_size=10, verbose=0)
end = time()

print("Took: {:.4f} seconds".format(end - start))


Took: 28.6099 seconds


## Make Predictions

Making predictions is as easy as calling **model.predict()**. We are using a sigmoid activation function on the output layer, so the predictions will be in the range between 0 and 1. We can easily convert them into a crisp binary prediction for this classification task by rounding them.

In [86]:
# calculate predictions
predictions = model.predict(X_test)
# round predictions
# predict() returns one single-element row per sample, so round its only value
rounded_predictions = [round(x[0]) for x in predictions]
print(rounded_predictions)

[0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
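
An alternative to rounding is to compare each probability against the threshold directly. The sketch below uses made-up sigmoid outputs, nested one value per sample in the same way as the array returned by `predict()`.

```python
# Hypothetical sigmoid outputs, one inner list per sample (illustration only).
predictions = [[0.12], [0.87], [0.50], [0.49]]

# Explicit threshold: probabilities of 0.5 or more become class 1.
hard_labels = [1 if row[0] >= 0.5 else 0 for row in predictions]

print(hard_labels)  # [0, 1, 1, 0]
```

The explicit comparison also avoids a subtle difference in `round()`: Python 2 rounds an exact 0.5 away from zero, while Python 3 rounds it to the nearest even number, which is 0.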


## Evaluate Model

We have trained our neural network on the training dataset, and we can first evaluate its performance on that same dataset. This only gives us an idea of how well we have modeled the training data (the train accuracy), not of how well the model might perform on new data.

In [87]:
# evaluate the model
scores = model.evaluate(X_train, Y_train)
print ("\n Training Accuracy - %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))


 Training Accuracy - acc: 79.74%


You can evaluate the model on the held-out test dataset with **evaluate()**, as shown below. This generates a prediction for each input, compares it with the expected output, and collects scores, including the average loss and any metrics you have configured, such as accuracy.

In [88]:
# evaluate the model
scores = model.evaluate(X_test, Y_test)
print("\n Test Accuracy - %s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

 Test Accuracy - acc: 71.43%
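
Classification accuracy is just the fraction of predictions that match the true labels. The sketch below computes it by hand on made-up labels, then checks the arithmetic for the test score: with 77 test samples, 71.43% is consistent with 55 correct predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / float(len(y_true))

# Hypothetical labels and rounded predictions, for illustration only.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print("%.2f%%" % (accuracy(y_true, y_pred) * 100))  # 80.00%

# 55 correct out of 77 test samples reproduces the reported test accuracy.
print("%.2f%%" % (55 / 77.0 * 100))  # 71.43%
```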


## Final Results

Training Accuracy - 79.74%

Test Accuracy - 71.43%

Not bad for a simple neural network put together quickly. With some trial and error on the hyperparameters, we can likely improve the test accuracy further.