# Getting Started: Keras with a TensorFlow Backend (In Anaconda)

Install Miniconda from https://conda.io/miniconda.html (Python 3.6) and launch the Anaconda Prompt. Create a new environment with a name of your choice (1), then activate it (2). Install Spyder (3), then install TensorFlow with conda install or, failing that, with pip (4). From the taskbar, click on the Anaconda folder and then open Spyder (your_env_name).

In [None]:
'''(1) conda create -n your_env_name python=3.6'''

'''(2) activate your_env_name'''

'''(3) conda install spyder'''

'''(4) conda install -c conda-forge tensorflow
       or, if that fails: pip install tensorflow'''
    

Import the relevant modules, such as TensorFlow, NumPy, SciPy and Matplotlib. For image processing you may also need to install and import PIL (available through the Pillow package).

In [12]:
import scipy.io as sio

import tensorflow as tf
import tensorflow.keras as keras
import numpy as np

import matplotlib.pyplot as plt


# MNIST

The MNIST data set is a great place to start. It contains labelled 28 x 28 pixel images of handwritten digits from 0 to 9, and can be accessed through TensorFlow, as it ships with the TensorFlow examples package. On this data even a very simple model can achieve an accuracy (measured as the proportion of digits from the test set which are correctly identified) upwards of 98%, with the exact figure depending on the network configuration.


First, import the necessary packages.


In [None]:
import scipy.io as sio

import tensorflow as tf

import numpy as np

import matplotlib.pyplot as plt

from tensorflow.examples.tutorials.mnist import input_data

# Getting the Data

In [None]:
#Load MNIST Data--------------------------------------------------------------------------------

mnist = input_data.read_data_sets("MNIST_data/", one_hot = True)


#Identify images and labels---------------------------------------------------------------------

features = mnist.train.images      #Features are all the images of handwritten digits

labels = mnist.train.labels     #Labels describe the above features (one-hot encoded)
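
Because the data is loaded with one_hot = True, each label is a vector of length 10 rather than a single integer. The sketch below illustrates that encoding on a few made-up digits. The helper name `one_hot` and the example values are assumptions for illustration and are not part of the MNIST loader.

```python
import numpy as np

def one_hot(indices, num_classes=10):
    """Turn integer class indices into one-hot rows (illustrative helper)."""
    indices = np.asarray(indices)
    encoded = np.zeros((indices.size, num_classes), dtype=np.float32)
    # Put a 1 in the column that matches each class index.
    encoded[np.arange(indices.size), indices] = 1.0
    return encoded

example_digits = [3, 0, 9]                 # made-up digits, not taken from MNIST
example_labels = one_hot(example_digits)

print(example_labels.shape)                # (3, 10)
print(np.argmax(example_labels, axis=1))   # [3 0 9]
```

Taking np.argmax along each row recovers the original digit, which is useful later when comparing predictions to labels.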

# Preparing the Data

In [10]:
#shuffle data---------------------------------------------------------------

ran = np.arange(features.shape[0])

np.random.shuffle(ran)

features = features[ran]

labels = labels[ran]


#Split data into training and test subsets------------------------------------

training_features = features.reshape(features.shape[0],28,28,1)     #Use the shuffled training images

training_labels = labels     #Shuffled in the same order as the features

test_features = mnist.test.images.reshape(mnist.test.images.shape[0],28,28,1)

test_labels = mnist.test.labels
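
The two preparation steps above (shuffling with a shared index array, then reshaping flat 784-pixel rows into 28 x 28 x 1 images) can be checked on a small synthetic array. This is only a sketch: the names `toy_features` and `toy_labels` and the six fake images are assumptions made up for the example.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed so the sketch is repeatable

# Six fake "images" of 784 pixels each; image i is filled with the value i.
toy_features = np.repeat(np.arange(6, dtype=np.float32)[:, None], 784, axis=1)
toy_labels = np.arange(6)        # label i belongs to image i

# Index both arrays with the same permutation so pairs stay together.
order = rng.permutation(toy_features.shape[0])
shuffled_features = toy_features[order]
shuffled_labels = toy_labels[order]

# Every image still sits next to its own label after shuffling.
assert np.all(shuffled_features[:, 0] == shuffled_labels)

# Conv2D layers expect (samples, height, width, channels).
toy_images = shuffled_features.reshape(-1, 28, 28, 1)
print(toy_images.shape)          # (6, 28, 28, 1)
```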

# Building a Sequential Model

Keras has two main methods for building models: the Sequential API and the functional API. The high-level Sequential API, accessed via the keras.Sequential class, allows you to quickly build a network model layer by layer using the commonly used layer types in keras.layers. There is also the option to construct layer types from scratch if this is required for more complex network designs.
We are going to build a convolutional neural network, as this type of network is well suited to image processing.

In [3]:
#Build model------------------------------------------------------------------

model = keras.Sequential([

        keras.layers.Conv2D(32,(5, 5),input_shape=(28,28,1), activation='relu'),
    
        keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
        keras.layers.Conv2D(64, (5,5), activation='relu'),
    
        keras.layers.MaxPooling2D(pool_size=(2, 2)),
    
        keras.layers.Flatten(),
    
        keras.layers.Dense(10, activation=tf.nn.softmax)
])
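
The layer sizes in the model above can be traced by hand. The sketch below assumes the Keras defaults for these layers (no padding and a stride of 1 for Conv2D, a stride equal to the pool size for MaxPooling2D); the helper names `conv_out` and `pool_out` are made up for the example.

```python
def conv_out(size, kernel):
    """Output width of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of non-overlapping pooling."""
    return size // pool

size = 28
size = conv_out(size, 5)   # 24 after the first 5x5 convolution
size = pool_out(size, 2)   # 12 after the first 2x2 max pooling
size = conv_out(size, 5)   # 8 after the second 5x5 convolution
size = pool_out(size, 2)   # 4 after the second 2x2 max pooling

flat_units = size * size * 64            # Flatten: 4 * 4 * 64 = 1024

# Trainable parameters: kernel weights plus one bias per filter or unit.
conv1_params = 5 * 5 * 1 * 32 + 32       # 832
conv2_params = 5 * 5 * 32 * 64 + 64      # 51264
dense_params = flat_units * 10 + 10      # 10250
total_params = conv1_params + conv2_params + dense_params

print(size, flat_units, total_params)    # 4 1024 62346
```

If the assumptions hold, calling model.summary() on the model should report the same shapes and parameter counts.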

The model has two 2D convolution layers, each followed by a max pooling layer, and ends with a densely connected layer of width 10 with a softmax activation function. The max pooling layers scale down the image by sliding a kernel window over the input array and picking the maximum value from that window. The softmax layer assigns a probability to each of the 10 possible output categories (the digits 0-9), and the category with the highest probability is taken as the predicted digit. In the one-hot labels, the correct digit is marked with 1 and all others with 0.
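
To make the two operations described above concrete, here is a minimal NumPy sketch of 2x2 max pooling and of softmax. It is not how Keras implements these layers internally; the function names and the toy inputs are assumptions chosen for illustration.

```python
import numpy as np

def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2D array with even side lengths."""
    h, w = image.shape
    # Group the pixels into non-overlapping 2x2 blocks, then take each block's max.
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)    # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

toy_image = np.array([[1, 2, 5, 6],
                      [3, 4, 7, 8],
                      [9, 1, 0, 2],
                      [1, 1, 3, 4]], dtype=np.float32)
pooled = max_pool_2x2(toy_image)
print(pooled)
# [[4. 8.]
#  [9. 4.]]

toy_scores = np.array([1.0, 2.0, 0.5, 4.0])   # made-up scores for 4 classes
probs = softmax(toy_scores)
print(np.argmax(probs))                       # 3
```

The largest score always receives the largest probability, so the predicted class is the position of the maximum.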

In [4]:
#Compile model----------------------------------------------------------------

model.compile(optimizer=tf.train.AdamOptimizer(), loss='categorical_crossentropy', metrics=['accuracy'])

The Adam optimizer (with parameters alpha, beta1, beta2 and epsilon) is a variation of the stochastic gradient descent algorithm in which the learning rate is adapted for each parameter throughout training. Adam adapts the parameter learning rates using estimates of the first moment of the gradients (the mean) and of the second moment (the uncentered variance). It calculates an exponential moving average of the gradient and of the squared gradient, and the parameters beta1 and beta2 control the decay rates of these moving averages. It can be used for non-stationary objectives and for problems with very noisy and/or sparse gradients.
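
The update rule described above can be sketched in a few lines of NumPy. This is a simplified illustration, not TensorFlow's implementation: the function name `adam_minimize`, the toy quadratic objective, the step count and the value of alpha are all assumptions chosen for the example.

```python
import numpy as np

def adam_minimize(grad_fn, theta, steps=500, alpha=0.1,
                  beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Run a fixed number of Adam updates and return the final parameters."""
    m = np.zeros_like(theta)   # moving average of the gradient (first moment)
    v = np.zeros_like(theta)   # moving average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)      # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + epsilon)
    return theta

# Toy objective J(theta) = sum((theta - 3)^2), whose gradient is 2 * (theta - 3).
start = np.array([0.0, 10.0])
result = adam_minimize(lambda th: 2.0 * (th - 3.0), start)
print(result)   # both entries should end up close to 3
```

Dividing by the square root of the second-moment estimate is what gives each parameter its own effective step size.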

The loss function is equivalent to the cost function (J). When solving an optimization problem we try to minimise the loss function, as its value represents the error in the output of the network. This error can be propagated back through the network, and the network weights can be adjusted accordingly. If your labels are one-hot encoded (vectors of ones and zeros) you can use categorical_crossentropy, but if they are integer class indices (0, 1, 2, 3, ...) it is better to use sparse_categorical_crossentropy.
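
The two loss names differ only in the label format they expect. The NumPy sketch below computes the cross-entropy both ways on made-up probabilities to show that the values agree. The function bodies are a simplified illustration written for this example, not the Keras source.

```python
import numpy as np

def categorical_crossentropy(one_hot_labels, probs, eps=1e-7):
    """Cross-entropy per sample for one-hot labels."""
    probs = np.clip(probs, eps, 1.0)     # avoid log(0)
    return -np.sum(one_hot_labels * np.log(probs), axis=1)

def sparse_categorical_crossentropy(int_labels, probs, eps=1e-7):
    """Cross-entropy per sample for integer class labels."""
    probs = np.clip(probs, eps, 1.0)
    # Pick out the predicted probability of the true class for each sample.
    return -np.log(probs[np.arange(len(int_labels)), int_labels])

probs = np.array([[0.7, 0.2, 0.1],       # made-up softmax outputs
                  [0.1, 0.1, 0.8]])
int_labels = np.array([0, 2])            # integer class indices
one_hot_labels = np.eye(3)[int_labels]   # the same labels, one-hot encoded

loss_a = categorical_crossentropy(one_hot_labels, probs)
loss_b = sparse_categorical_crossentropy(int_labels, probs)
print(np.allclose(loss_a, loss_b))       # True
```

Since the MNIST labels in this notebook are loaded with one_hot = True, categorical_crossentropy is the matching choice here.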

# Training the Model

In [11]:
#Train the model--------------------------------------------------------------------

model.fit(training_features , training_labels, epochs = 5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x11c808455c0>

The .fit() function trains the model on the training features and compares the outputs to the training labels. An epoch is a single pass through the whole dataset, where each training sample in the set is presented to the model once. Therefore the number of epochs is the number of complete cycles through the training data. The loss should decrease with every epoch until it levels out. The algorithm continues to run for the number of epochs you have specified, but it can be stopped early when the loss stops decreasing (you can set a condition for this).
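
The stopping condition mentioned above is usually a patience rule: stop once the loss has failed to improve for a set number of epochs in a row. The sketch below shows that rule in plain Python on made-up loss values. The function name `epochs_until_stop` and its parameters are assumptions for illustration, not a Keras API.

```python
def epochs_until_stop(losses, patience=2, min_delta=0.0):
    """Return the epoch at which a patience-based rule would stop training."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best = loss        # the loss improved, so reset the counter
            waited = 0
        else:
            waited += 1        # no improvement this epoch
            if waited >= patience:
                return epoch
    return len(losses)         # never triggered: all epochs were run

toy_losses = [0.90, 0.40, 0.25, 0.24, 0.26, 0.25, 0.27]   # made-up values
print(epochs_until_stop(toy_losses))   # 6
```

In Keras the same idea is available as a callback: passing callbacks=[keras.callbacks.EarlyStopping(monitor='loss', patience=2)] to model.fit should stop training under a comparable rule.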

# Evaluating on the Test Set

In [9]:
#Evaluate the model on the test set--------------------------------------------

scores = model.evaluate(test_features, test_labels)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))



acc: 98.85%


The .evaluate() function runs the model on the test features and returns the loss together with the metrics chosen at compile time, here the accuracy. Accuracy is determined by comparing the predicted labels to the ground truths.
It is also possible to calculate the accuracy manually: the .predict() function returns the predicted class probabilities for the test set, and a simple for loop, as shown below, counts the number of correct predictions by the model.

In [7]:
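#Evaluate the accuracy of the model--------------------------------------------

predictions = model.predict(test_features)     #Class probabilities, one row per test image

count = 0

for i in range(0, len(test_labels)):
    
    pred = np.argmax(predictions[i])     #Predicted digit
    
    true = np.argmax(test_labels[i])     #Labels are one-hot, so argmax recovers the digit
    
    if true == pred:
        
        count += 1

#count / len(test_labels) should match the accuracy reported by model.evaluate
print("Correct predictions: ", count)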
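
The counting loop above can also be written without an explicit loop. The sketch below applies the same argmax comparison to made-up predictions and one-hot labels; the array names and values are assumptions for illustration, so no trained model is needed to run it.

```python
import numpy as np

# Made-up softmax outputs for 4 samples and 3 classes.
toy_predictions = np.array([[0.8, 0.1, 0.1],
                            [0.2, 0.7, 0.1],
                            [0.3, 0.3, 0.4],
                            [0.6, 0.3, 0.1]])
# One-hot ground truth for the classes 0, 1, 2, 1.
toy_labels = np.eye(3)[[0, 1, 2, 1]]

predicted_classes = np.argmax(toy_predictions, axis=1)
true_classes = np.argmax(toy_labels, axis=1)

# Count the samples where the predicted class matches the true class.
correct = int(np.sum(predicted_classes == true_classes))
accuracy = correct / len(true_classes)
print(correct, accuracy)   # 3 0.75
```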
