# Keras Tutorial
This is a quick intro to the functionality of Keras. The network itself is completely functional and able to train as is, but I encourage you to play around with it! Here is a list of things you could change to observe different results:
- Number of layers
- Size of layers
- Types of layers
- Optimization function
- Error function

In [1]:
#Import all the functionality we will need. If you are going to use
#other layers or error functions then make sure you are importing
#the correct modules!
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils

Using Theano backend.


Using gpu device 0: GeForce GTX 980 (CNMeM is disabled)


## Setting Hyper-Parameters
Recall that hyperparameters are not learned during training the way the connection weights are; we have to set them ourselves. Some examples are the number of output classes, the batch size, the number of epochs, and the layer sizes.

In [2]:
#Set parameters
nb_classes = 10
batch_size = 256
nb_epoch = 10
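
As a rough illustration of how these settings interact, batch size and epoch count together determine how many weight updates the network performs. The sketch below assumes the 60,000 MNIST training images used in this tutorial; `n_train`, `updates_per_epoch`, and `total_updates` are names introduced only for this example.

```python
# Values from the cell above
batch_size = 256
nb_epoch = 10

# Assumption: the MNIST training split has 60,000 images
n_train = 60000

# Ceiling division: the last batch of an epoch is smaller when
# batch_size does not divide n_train evenly
updates_per_epoch = -(-n_train // batch_size)
total_updates = updates_per_epoch * nb_epoch

print(updates_per_epoch, 'updates per epoch')
print(total_updates, 'updates in total')
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason these two settings are usually tuned together.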

## Import the Dataset and Pre-process It
Keras comes with the MNIST dataset built in so you can quickly test out your networks. It is also important to preprocess your data before throwing it at a network. Some good things to do are subtracting the mean and scaling the variance. Then we must make sure our data is the right shape to give to the network.
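
The cell in this section only rescales pixel values to the range [0, 1]. As a sketch of the mean subtraction and variance scaling mentioned above, here is one common way to standardize a dataset with NumPy. It uses a small random array as a stand-in for image data, and `standardize` is a helper name made up for this example, not a Keras function.

```python
import numpy as np

def standardize(train, test, eps=1e-8):
    """Subtract the training mean and divide by the training standard deviation."""
    mean = train.mean()
    std = train.std()
    # Reuse the training statistics on the test set so both are scaled identically
    return (train - mean) / (std + eps), (test - mean) / (std + eps)

# Stand-in data: 100 "training" and 20 "test" images of 28x28 pixels in [0, 255]
rng = np.random.RandomState(0)
fake_train = rng.randint(0, 256, size=(100, 28, 28)).astype("float32")
fake_test = rng.randint(0, 256, size=(20, 28, 28)).astype("float32")

train_std, test_std = standardize(fake_train, fake_test)
print('train mean:', train_std.mean(), 'train std:', train_std.std())
```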

In [3]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

#Preprocess data
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

('X_train shape:', (60000, 1, 28, 28))
(60000, 'train samples')
(10000, 'test samples')
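
To make the "binary class matrices" step concrete, here is a minimal NumPy sketch of one-hot encoding, which is the kind of output `np_utils.to_categorical` produces. `one_hot` is a hypothetical helper written for this example; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, nb_classes):
    """Turn integer class labels into rows with a single 1 in the label's column."""
    labels = np.asarray(labels, dtype="int64")
    encoded = np.zeros((len(labels), nb_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Three example digit labels encoded over 10 classes
example = one_hot([3, 0, 9], 10)
print(example.shape)
print(example[0])
```

Each row lines up with the 10 softmax outputs of the network, which is what lets the loss compare predictions and labels element by element.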


## Seeding your Random Number Generator
It's good practice to always seed your random number generator. That way, when we make a change to our network and get better results, we can be more confident that the improvement is due to the change we made and not to the inherent randomness of training.

In [4]:
np.random.seed(1234)
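
A quick way to see what seeding buys you: resetting the seed replays the same sequence of random numbers. This sketch only covers NumPy's generator, which is what the cell above seeds; whether every source of randomness in a given backend is covered is a separate question.

```python
import numpy as np

np.random.seed(1234)
first = np.random.rand(3)

np.random.seed(1234)   # same seed again
second = np.random.rand(3)

np.random.seed(4321)   # a different seed
third = np.random.rand(3)

# Same seed gives the same draws; a different seed gives different ones
print(np.array_equal(first, second))
print(np.array_equal(first, third))
```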

## Constructing the Network
I will be making two different networks: one a simple MLP and the other a convolutional neural network. Keras supports many other types of layers, so I encourage you to look into them and try them out!

In [9]:
#Create network
model = Sequential()

#32 conv filters using 5x5 kernels and relu activation
model.add(Convolution2D(32,5,5,border_mode='same', input_shape=(1,28,28)))
model.add(Activation('relu'))

#32 conv filters using 3x3 kernels and relu activation
model.add(Convolution2D(32,3,3,border_mode='same'))
model.add(Activation('relu'))

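#Max pooling and 50% Dropout
#Max pooling keeps the largest value in each 2x2 block, halving width and height.
#Dropout randomly zeroes half of the activations during training to reduce overfitting.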
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(.5))

#16 conv filters using 3x3 kernels and relu activation
model.add(Convolution2D(16,3,3,border_mode='same'))
model.add(Activation('relu'))

#50% Dropout
model.add(Dropout(.5))

#Flatten and add dense network
#Dense here just means normal neuron connections (every input connects to every output).
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(nb_classes))
model.add(Activation('softmax'))
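
To get a feel for the network just defined, the following plain-Python sketch traces the tensor shape through each layer and counts the trainable weights. It assumes that `border_mode='same'` preserves the 28x28 spatial size, that 2x2 max pooling halves it, and that every convolutional and dense layer has one bias per output unit. The helper names are made up for this example.

```python
def conv_same(shape, filters, kernel):
    """Shape and weight count of a 'same'-padded convolution on (channels, h, w)."""
    channels, h, w = shape
    params = filters * channels * kernel * kernel + filters  # weights + biases
    return (filters, h, w), params

def dense(n_in, n_out):
    """Output size and weight count of a fully connected layer."""
    return n_out, n_in * n_out + n_out

shape = (1, 28, 28)
total = 0

shape, p = conv_same(shape, 32, 5); total += p     # (32, 28, 28)
shape, p = conv_same(shape, 32, 3); total += p     # (32, 28, 28)
shape = (shape[0], shape[1] // 2, shape[2] // 2)   # 2x2 max pooling -> (32, 14, 14)
shape, p = conv_same(shape, 16, 3); total += p     # (16, 14, 14)

flat = shape[0] * shape[1] * shape[2]              # Flatten
units, p = dense(flat, 128); total += p
units, p = dense(units, 10); total += p

print('flattened size:', flat)
print('trainable weights:', total)
```

Under these assumptions most of the weights sit in the first dense layer, so shrinking it (or pooling again before flattening) is the quickest way to make the model smaller.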

## Compilation
The process of compiling the network is USUALLY a one-liner. Simply tell Keras what type of loss (error) function and optimizer you want to use and that's it! If you want to use a custom loss function or optimizer then the process is a bit more involved, but it is covered in the Keras documentation.

In [10]:
#NOTE: Compilation will take a while! Don't fret.
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
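
For intuition about what `categorical_crossentropy` measures, here is a small NumPy sketch of softmax followed by the cross-entropy between one-hot targets and predicted probabilities. This is the textbook formula, not Keras's internal code, and the function names are chosen for this example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the max keeps the exponentials stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples with three classes each
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
targets = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(targets, probs)
print('probabilities:', probs)
print('loss:', loss)
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1.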

## Train and Test the Network

In [11]:
#If you are going to try using this network then either leave it running overnight or reduce the number of
#convolutional layers.
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)

print('Test score:', score[0])
print('Test accuracy:', score[1])

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
 2304/60000 [>.............................] - ETA: 1710s - loss: 2.0295 - acc: 0.2791

KeyboardInterrupt: 

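99% accuracy! You can see that convolutional networks are VERY good at image recognition tasks. (The log above comes from a run that was interrupted during the first epoch, so it does not show the final score.)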
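
The accuracy figure reported during training and evaluation is the fraction of images whose most probable predicted class matches the true label. Here is a minimal NumPy sketch of that calculation, using made-up predictions rather than real model output; `accuracy` is a name chosen for this example.

```python
import numpy as np

def accuracy(y_true_onehot, y_prob):
    """Fraction of rows where the highest-probability class is the true class."""
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_onehot, axis=1)
    return float(np.mean(predicted == actual))

# Made-up values for four samples and three classes
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, truth is class 2
                   [0.1, 0.6, 0.3]])

print('accuracy:', accuracy(y_true, y_prob))
```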

## Simple MLP Network
I won't outline any of the specifics here as it follows the same general structure as the convolutional network. This is the best one I could come up with without convolutional layers and in the short time I had to make this tutorial.

In [8]:
np.random.seed(1234)

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

#Set parameters (defined before they are used below)
nb_epoch = 20
nb_classes = 10
batch_size = 256

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

#Create network
model2 = Sequential()

#Note that you can put the activation in the layer declaration
#You can do a lot more also so check out the documentation!
#Layer sizes are written the same way as in the convolutional network above:
#output size first, with the input size given only for the first layer.
model2.add(Dense(250, input_dim=28*28, activation='relu'))

model2.add(Dense(400, activation='relu'))
model2.add(Dropout(.5))

model2.add(Dense(128, activation='relu'))

model2.add(Dense(nb_classes))
model2.add(Activation('softmax'))

(60000, 'train samples')
(10000, 'test samples')
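
To show what the stacked `Dense` layers compute, here is a NumPy sketch of a forward pass through an MLP with the same layer sizes (784, 250, 400, 128, 10). The weights are random rather than trained, dropout is left out because it is only active during training, and all names are illustrative.

```python
import numpy as np

rng = np.random.RandomState(1234)
layer_sizes = [784, 250, 400, 128, 10]

# One (weights, bias) pair per Dense layer, randomly initialised for illustration
layers = [(rng.randn(n_in, n_out) * 0.01, np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, layers):
    """ReLU after every hidden layer, softmax after the last one."""
    for i, (W, b) in enumerate(layers):
        x = x.dot(W) + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)                  # relu
    x = x - x.max(axis=1, keepdims=True)            # numerical stability
    exps = np.exp(x)
    return exps / exps.sum(axis=1, keepdims=True)   # softmax

batch = rng.rand(5, 784)   # five fake flattened images
probs = forward(batch, layers)
print(probs.shape)
```

Each row of `probs` is a probability distribution over the 10 digit classes; training adjusts the weights so that the largest entry lands on the correct digit.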


In [9]:
model2.compile(loss='categorical_crossentropy', optimizer='adadelta')

In [10]:
#This network should not take much time at all to train.
model2.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
score = model2.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)

print('Test score:', score[0])
print('Test accuracy:', score[1])

Train on 60000 samples, validate on 10000 samples
Epoch 0
Epoch 1
Epoch 2
Epoch 3
Epoch 4
Epoch 5
Epoch 6
Epoch 7
Epoch 8
Epoch 9
Epoch 10
Epoch 11
Epoch 12
Epoch 13
Epoch 14
Epoch 15
Epoch 16
Epoch 17
Epoch 18
Epoch 19
('Test score:', 0.091494554800701733)
('Test accuracy:', 0.98140000000000005)
