# Recognizing hand-written digits
This notebook demonstrates how to use a convolutional neural network for image recognition.
We're using the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). 

This notebook requires a specially prepared dataset, which you can create with the notebook "Prepare the dataset.ipynb".

## The network structure
The neural network alternates two convolution layers with two max-pooling layers.
The output of the neural network is a softmax-activated dense layer with 10 neurons, one for each label that the model can predict.

In [5]:
from cntk.layers import Convolution2D, Sequential, Dense, MaxPooling
from cntk.ops import log_softmax, relu
from cntk.initializer import glorot_uniform
from cntk import input_variable, default_options

features = input_variable((3,28,28))
labels = input_variable(10)

with default_options(initialization=glorot_uniform, activation=relu):
    model = Sequential([
        Convolution2D(filter_shape=(5,5), strides=(1,1), num_filters=8, pad=True),
        MaxPooling(filter_shape=(2,2), strides=(2,2)),
        Convolution2D(filter_shape=(5,5), strides=(1,1), num_filters=16, pad=True),
        MaxPooling(filter_shape=(3,3), strides=(3,3)),
        Dense(10, activation=log_softmax)
    ])

z = model(features)
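
To see how the spatial size shrinks through the layers above, here is a rough shape trace in plain Python. It is only a sketch: it assumes that `pad=True` keeps the spatial size for a stride of 1, and that the `MaxPooling` layers use no padding, so only windows that fit entirely inside the input are counted. The helper names are made up for this illustration.

```python
def conv_same(size, stride=1):
    # Assumption: pad=True produces ceil(size / stride) outputs per axis.
    return -(-size // stride)

def pool_valid(size, window, stride):
    # Assumption: no padding, so only windows that fit fully are counted.
    return (size - window) // stride + 1

def trace_shapes(height=28, width=28):
    """Return (layer description, output shape) pairs for the model above."""
    shapes = []
    h, w = conv_same(height), conv_same(width)
    shapes.append(("convolution 5x5, 8 filters", (8, h, w)))
    h, w = pool_valid(h, 2, 2), pool_valid(w, 2, 2)
    shapes.append(("max-pooling 2x2, stride 2", (8, h, w)))
    h, w = conv_same(h), conv_same(w)
    shapes.append(("convolution 5x5, 16 filters", (16, h, w)))
    h, w = pool_valid(h, 3, 3), pool_valid(w, 3, 3)
    shapes.append(("max-pooling 3x3, stride 3", (16, h, w)))
    shapes.append(("dense", (10,)))
    return shapes

for description, shape in trace_shapes():
    print(description, shape)
```

Under these assumptions the dense layer receives 16 x 4 x 4 = 256 values per image.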

The loss for the model is categorical cross-entropy. We're using a `Function` object to combine the loss with a metric that measures the performance of the model. The resulting `criterion_factory` creates the objective for the training logic. We're using an SGD learner for this model.

In [6]:
from cntk import Function
from cntk.losses import cross_entropy_with_softmax
from cntk.metrics import classification_error
from cntk.learners import sgd

@Function
def criterion_factory(output, targets):
    loss = cross_entropy_with_softmax(output, targets)
    metric = classification_error(output, targets)
    
    return loss, metric

loss = criterion_factory(z, labels)
learner = sgd(z.parameters, lr=0.2)
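
For intuition about the two values the criterion returns, here is a plain-Python reference for a single sample. It is a sketch of the underlying math (softmax followed by cross-entropy against a one-hot target, and a 0/1 error on the arg-max), not CNTK's implementation; the `_ref` function names are invented for this example.

```python
import math

def cross_entropy_with_softmax_ref(scores, one_hot_target):
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in scores))
    # Cross-entropy: -sum(target * log(softmax(scores))).
    return -sum(t * (v - log_sum_exp) for v, t in zip(scores, one_hot_target))

def classification_error_ref(scores, one_hot_target):
    # 0.0 when the highest score is on the target class, 1.0 otherwise.
    predicted = max(range(len(scores)), key=scores.__getitem__)
    actual = max(range(len(one_hot_target)), key=one_hot_target.__getitem__)
    return 0.0 if predicted == actual else 1.0

# A model that gives every class the same score is maximally undecided.
uniform_scores = [0.0] * 10
target = [0.0] * 10
target[3] = 1.0

uniform_loss = cross_entropy_with_softmax_ref(uniform_scores, target)
print(round(uniform_loss, 2))  # 2.3, which is ln(10)
```

A loss of about 2.3 together with an error rate near 0.9 therefore corresponds to guessing uniformly among 10 classes, which is a useful baseline when reading a training log.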

## The data source
The data is stored as images on disk, with a mapping file that pairs each image filename with its label. Random transforms can be applied during training to augment the training data in an attempt to improve performance; in the code below, the `transforms` list is left empty.

Note that such transforms are not applied during testing.

In [7]:
import os
from cntk.io import MinibatchSource, StreamDef, StreamDefs, ImageDeserializer, INFINITELY_REPEAT
import cntk.io.transforms as xforms

def create_datasource(folder, train=True, max_sweeps=INFINITELY_REPEAT):
    # The mapping file pairs each image filename with its label.
    mapping_file = os.path.join(folder, 'mapping.bin')

    # Note: `train` is not used yet. Augmentation transforms for training
    # would go into the (currently empty) `transforms` list.
    stream_definitions = StreamDefs(
        features=StreamDef(field='image', transforms=[]),
        labels=StreamDef(field='label', shape=10)
    )

    deserializer = ImageDeserializer(mapping_file, stream_definitions)

    return MinibatchSource(deserializer, max_sweeps=max_sweeps)
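
As an illustration of what such a mapping file can look like, the sketch below writes and reads a tab-separated text file with one image path and one integer label per line, which is the layout CNTK's image reader documents for its map files. Whether the prepared dataset uses exactly this layout is an assumption, and the image file names are hypothetical. The `one_hot` helper shows the 10-element vector that a label stream of shape 10 represents.

```python
import os
import tempfile

def write_mapping_file(path, entries):
    """Write one 'image path<TAB>label' line per entry."""
    with open(path, 'w') as f:
        for image_path, label in entries:
            f.write('{}\t{}\n'.format(image_path, label))

def read_mapping_file(path):
    """Read the file back as a list of (image path, integer label) pairs."""
    entries = []
    with open(path) as f:
        for line in f:
            image_path, label = line.rstrip('\n').split('\t')
            entries.append((image_path, int(label)))
    return entries

def one_hot(label, num_classes=10):
    """Encode an integer label as a one-hot list."""
    return [1 if index == label else 0 for index in range(num_classes)]

# Hypothetical file names, for illustration only.
example_entries = [('img_00000.png', 5), ('img_00001.png', 0)]

with tempfile.TemporaryDirectory() as folder:
    mapping_path = os.path.join(folder, 'mapping.bin')
    write_mapping_file(mapping_path, example_entries)
    round_trip = read_mapping_file(mapping_path)

print(round_trip)
print(one_hot(5))
```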

Training data and testing data are stored in separate folders, so you need a separate data source for each.

In [8]:
train_datasource = create_datasource('mnist_train')
test_datasource = create_datasource('mnist_test', max_sweeps=1, train=False)

## Training the model
The model is trained for up to 10 epochs with a minibatch size of 64. We've added a progress printer to visualize the output of the training session. We've also included the test set here to validate the performance of the model.
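
The arithmetic behind these settings can be sketched in plain Python. With `epoch_size=60000` and `minibatch_size=64`, one epoch takes 938 minibatches. The second helper assumes that a progress printer frequency of 0 reports after 1, 3, 7, 15, ... minibatches within an epoch; that doubling pattern is inferred from the `examples` column of the training log rather than taken from the CNTK source. Both helper names are made up for this illustration.

```python
import math

def minibatches_per_epoch(epoch_size, minibatch_size):
    # The last minibatch of an epoch may be smaller than the others.
    return math.ceil(epoch_size / minibatch_size)

def geometric_report_points(epoch_size, minibatch_size):
    """Sample counts at which progress is reported within one epoch,
    assuming reports after 1, 3, 7, 15, ... minibatches."""
    points = []
    minibatches = 1
    while minibatches * minibatch_size <= epoch_size:
        points.append(minibatches * minibatch_size)
        minibatches = minibatches * 2 + 1
    return points

print(minibatches_per_epoch(60000, 64))
# 938
print(geometric_report_points(60000, 64))
# [64, 192, 448, 960, 1984, 4032, 8128, 16320, 32704]
```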

In [9]:
from cntk.logging import ProgressPrinter
from cntk.train import TestConfig


progress_writer = ProgressPrinter(0)

test_config = TestConfig(test_datasource)

input_map = {
    features: train_datasource.streams.features,
    labels: train_datasource.streams.labels
}

loss.train(train_datasource, 
           max_epochs=10,
           minibatch_size=64,
           epoch_size=60000, 
           parameter_learners=[learner], 
           model_inputs_to_streams=input_map,  
           callbacks=[progress_writer, test_config])

 average      since    average      since      examples
    loss       last     metric       last              
 ------------------------------------------------------
Learning rate per minibatch: 0.2
      170        170      0.938      0.938            64
 2.04e+07   3.06e+07      0.901      0.883           192
 8.74e+06          4      0.897      0.895           448
 4.08e+06          4      0.905      0.912           960
 1.97e+06          4      0.896      0.887          1984
 9.71e+05          4      0.893      0.891          4032
 4.82e+05          4      0.891      0.889          8128
  2.4e+05          4       0.89      0.889         16320
  1.2e+05          4       0.89       0.89         32704
      2.3        2.3      0.891      0.891            64
      2.3        2.3      0.885      0.883           192
      2.3       2.31        0.9       0.91           448
      2.3        2.3      0.893      0.887           960
      2.3        2.3      0.886       0.88          1984
 

KeyboardInterrupt: 

RuntimeError: SWIG director method error.