## Deep learning introduction

Example:  Classify handwritten digits

* grayscale images of 28x28 pixels, to be sorted into 10 categories (the digits 0 to 9)
* [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset: 60k training images, 10k test images


Terminology:

* **Class**: a category in a classification problem
* **Sample**: a data point
* **Label**: the class associated with a specific sample

The MNIST dataset comes preloaded in Keras as a set of four NumPy arrays:

In [2]:
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz


* `train_images` and `train_labels` form the training set, which the model learns from
* `test_images` and `test_labels` form the test set, on which the model is evaluated

In [7]:
print(f"Shape train_image = {train_images.shape}")
print(f"Length train_labels = {len(train_labels)}")
print(f"Shape test_images= {test_images.shape}")
print(f"Length test_labels = {len(test_labels)}")

Shape train_image = (60000, 28, 28)
Length train_labels = 60000
Shape test_images= (10000, 28, 28)
Length test_labels = 10000


***Workflow:*** 

1. feed the neural network the training data
2. the network learns to associate images with labels
3. produce predictions for test images
4. verify whether predictions match test labels

In [9]:
from keras import models
from keras import layers

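# Stack layers one after another
network = models.Sequential()
# Hidden layer: 512 units, takes flattened 28 * 28 images as input
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
# Output layer: one probability score per digit class
network.add(layers.Dense(10, activation='softmax'))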

* a layer acts as a filter for data
* a layer extracts representations from the data fed into it
* layers are chained, each one taking the previous layer's output as its input

The network above consists of a sequence of two `Dense` layers, which are *densely* connected (or *fully* connected) neural layers. The second layer is a 10-way *softmax* layer, which returns 10 probability scores (summing to 1). Each score is the probability that the current digit image belongs to one of the 10 digit classes.
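To make the two layers concrete, here is a minimal NumPy sketch of what such a network computes in a forward pass. The weights are random placeholders (an assumption for illustration, not trained values), and the helper names `relu` and `softmax` are written here by hand rather than taken from Keras.

```python
import numpy as np

def relu(x):
    # Element-wise max(x, 0)
    return np.maximum(x, 0.0)

def softmax(logits):
    # Subtract the row-wise max for numerical stability, then normalize
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical random weights, only to illustrate the shapes involved
W1 = rng.normal(scale=0.01, size=(28 * 28, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.01, size=(512, 10))
b2 = np.zeros(10)

# A fake batch of 3 flattened images with values in [0, 1]
x = rng.random((3, 28 * 28))

hidden = relu(x @ W1 + b1)         # first dense layer, shape (3, 512)
probs = softmax(hidden @ W2 + b2)  # second dense layer, shape (3, 10)

# Number of trainable values: weights plus biases of both layers
total_params = W1.size + b1.size + W2.size + b2.size

print(probs.shape)
print(probs.sum(axis=1))
print(total_params)
```

Each row of `probs` holds 10 non-negative scores that sum to 1, and the parameter count works out to 784 * 512 + 512 + 512 * 10 + 10 = 407,050.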

To make the network ready for training, three more things need to be picked as part of the *compilation* step:

* Optimizer (the method used to update the network)
* Loss function (how the network measures its performance on the training data)
* Metrics to monitor during training and testing (here: accuracy of classification)

In [10]:
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
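As a rough illustration of what a categorical cross-entropy loss measures, the sketch below computes it by hand for two made-up sets of predictions. The function `categorical_crossentropy` and the clipping constant `eps` are assumptions written for this example; they are not Keras's own implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two samples, three classes, one-hot encoded true labels
y_true = np.array([[0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

# Predictions that put most probability on the correct class
confident = np.array([[0.05, 0.90, 0.05],
                      [0.10, 0.10, 0.80]])

# Predictions that put little probability on the correct class
unsure = np.array([[0.40, 0.30, 0.30],
                   [0.40, 0.40, 0.20]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is small when the predicted probability of the true class is close to 1 and grows as that probability shrinks, which is the signal the optimizer uses to update the network.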

* reshape the training images into a `float32` array of shape `(60000, 28 * 28)` (and the test images into `(10000, 28 * 28)`)
* rescale the 8-bit pixel values from `[0, 255]` to the interval `[0, 1]`

In [13]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32')/255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32')/255
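The effect of the reshape and rescale steps can be checked on a small stand-in array without downloading anything. The array `fake_images` below is a hypothetical substitute for the MNIST images, filled with random pixel values.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Each 28x28 image becomes one row of 784 values (rows are laid end to end)
flat = fake_images.reshape((4, 28 * 28))

# uint8 values in 0..255 become float32 values in 0.0..1.0
scaled = flat.astype("float32") / 255

print(flat.shape, scaled.dtype, scaled.min(), scaled.max())
```

Reshaping does not change any pixel value; for instance, `flat[0, 28]` is the first pixel of the second row of the first image, `fake_images[0, 1, 0]`.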

* categorically (one-hot) encode the labels, so that each label becomes a vector of length 10

In [15]:
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
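For intuition, categorical encoding can be sketched in plain NumPy. The helper `one_hot` below is a hypothetical re-implementation for illustration; it is not how `to_categorical` is written internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    # Row i gets a 1.0 in column labels[i] and 0.0 everywhere else
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([5, 0, 4])
encoded = one_hot(labels, num_classes=10)

print(encoded.shape)
print(encoded[0])
```

The first row has a single 1 at index 5, matching the label 5. Taking `np.argmax` along each row recovers the original integer labels.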

* train the network by calling its `fit` method

In [18]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x647a08b90>
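The `epochs` and `batch_size` arguments determine how many weight updates the training run performs. The arithmetic below is a sketch that assumes one update per batch and that the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

num_samples = 60000  # number of training images, from the shapes printed earlier
batch_size = 128
epochs = 5

# The last batch of each epoch is smaller when the sizes do not divide evenly
batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)
```

Under these assumptions, each epoch consists of 469 batches (the last one holding 96 images), for 2345 updates over 5 epochs.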

In [19]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc:', test_acc)


test_acc: 0.9205999970436096


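The test accuracy is lower than the accuracy reached on the training set. This gap between training and test performance is an example of *overfitting*: the model performs worse on new data than on the data it was trained on.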
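The accuracy metric itself is simply the fraction of samples whose most probable predicted class matches the true class. A minimal sketch with made-up predictions follows; the function `accuracy` and the arrays are illustrative assumptions, not output from the network above.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    # Compare the most probable class with the true class for each sample
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_one_hot, axis=1)
    return float(np.mean(predicted == actual))

# Four samples, three classes; true classes are 0, 1, 2, 1
y_true = np.eye(3)[[0, 1, 2, 1]]
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, truth is class 2
                   [0.2, 0.6, 0.2]])

acc = accuracy(y_true, y_prob)
print(acc)
```

Computing this quantity once on the training data and once on the test data gives the two numbers whose difference indicates overfitting.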