# Lab 2 - Fully Connected Feedforward Network with MNIST
# Model Overview

In this lab, we will train a fully connected feedforward network on MNIST data. 

The lab comprises two parts. In the first part, the instructor will walk you through the code to define, train, and evaluate the initial version of the FCNN model. In the second part, you will compete with other students to improve the performance of the model.


Our fully connected feedforward network, also known as a multi-layer perceptron, will be relatively simple, with 2 hidden layers (`num_hidden_layers`). The number of nodes in each hidden layer is a parameter specified by `hidden_layers_dim`. The figure below illustrates the entire model we will use in this tutorial in the context of MNIST data.

![model-mlp](http://cntk.ai/jup/cntk103c_MNIST_MLP.png)
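
To make the layer shapes concrete, here is a minimal NumPy sketch of a forward pass through a network of this form. It is not the CNTK implementation: the helper names (`init_layer`, `forward`) and the random input batch are assumptions made for illustration, and the hidden width of 400 is the value used later in this lab.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(rng, n_in, n_out):
    # Glorot-uniform-style initialization (illustrative, not CNTK's code)
    limit = np.sqrt(6.0 / (n_in + n_out))
    W = rng.uniform(-limit, limit, size=(n_in, n_out))
    b = np.zeros(n_out)
    return W, b

def forward(x, layers):
    h = x
    # Hidden layers: affine transform followed by a sigmoid
    for W, b in layers[:-1]:
        h = sigmoid(h @ W + b)
    # Output layer: affine transform only, producing raw scores (logits)
    W, b = layers[-1]
    return h @ W + b

rng = np.random.default_rng(0)
dims = [784, 400, 400, 10]  # input, two hidden layers, output
layers = [init_layer(rng, i, o) for i, o in zip(dims[:-1], dims[1:])]

batch = rng.uniform(0.0, 1.0, size=(4, 784))  # 4 fake "images"
logits = forward(batch, layers)
print(logits.shape)  # (4, 10)
```

The output layer has no activation because the softmax is applied later, inside the loss function.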

In this and the following labs we will demonstrate the use of the Functional API. 

# Code Walkthrough
## Initialize environment

In [None]:
import sys
import os
import time
import numpy as np
import cntk as C
from cntk.logging.progress_print import ProgressPrinter
from cntk.layers import Dense, Sequential, For


## Data reading

In this lab we are using the MNIST data pre-processed into the CNTK text format (CTF).


    |labels 0 0 0 0 0 0 0 1 0 0 |features 0 0 0 0 ... 
                                                  (784 integers each representing a pixel)
                                                 

Each line in the file contains two key-value pairs, also referred to as streams. The `labels` stream is the one-hot encoded representation of a digit 0-9. The `features` stream is a 784-element vector of integers in the range 0-255 representing a 28 x 28 pixel grayscale image.
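
As an illustration of this layout, the following sketch parses one CTF line by hand. The `parse_ctf_line` helper is hypothetical and handles only the simple dense case shown above (no comments, sequence IDs, or sparse values); in the lab itself the CNTK reader does this work.

```python
import numpy as np

def parse_ctf_line(line):
    """Split one dense CTF line into {stream_name: 1-D float array}."""
    streams = {}
    for chunk in line.strip().split("|"):
        if not chunk.strip():
            continue  # skip the empty piece before the first '|'
        name, *values = chunk.split()
        streams[name] = np.array(values, dtype=np.float32)
    return streams

# A synthetic line: label for digit 7 and an all-black image
sample = "|labels 0 0 0 0 0 0 0 1 0 0 |features " + " ".join(["0"] * 784)
parsed = parse_ctf_line(sample)

digit = int(np.argmax(parsed["labels"]))     # position of the 1 in the one-hot vector
image = parsed["features"].reshape(28, 28)   # back to a 28 x 28 pixel grid
print(digit, image.shape)  # 7 (28, 28)
```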

Our dataset includes three files: the training file with 50,000 images, the validation file with 10,000 images, and the testing file with 10,000 images.

To read and sample the files, we define a `create_reader` function that configures and returns a CNTK `MinibatchSource` object.
    

In [None]:
# Fix the NumPy random seed so that NumPy-generated random numbers are reproducible
np.random.seed(0)

# Read a CTF-formatted text file (described above) using the CTF deserializer
def create_reader(path, is_training, input_dim, num_label_classes):
    return C.io.MinibatchSource(C.io.CTFDeserializer(path, C.io.StreamDefs(
        labels = C.io.StreamDef(field='labels', shape=num_label_classes),
        features   = C.io.StreamDef(field='features', shape=input_dim)
    )), randomize = is_training, max_sweeps = C.io.INFINITELY_REPEAT if is_training else 1)

## Network definition and training

### Define the network


In [None]:
# Define a fully connected feedforward classification network factory with sigmoid neurons in the hidden layers
def create_fcnn_model_factory(num_hidden_layers, hidden_layers_dim, num_output_classes):
    with C.layers.default_options(init = C.layers.glorot_uniform(), activation = C.ops.sigmoid):
        return Sequential([
            For(range(num_hidden_layers), lambda i: Dense(hidden_layers_dim, name = 'hidden' + str(i))),
            Dense(num_output_classes, activation = None, name='classify')])


# Create model factory
num_hidden_layers = 2
hidden_layers_dim = 400
num_output_classes = 10
mn_factory = create_fcnn_model_factory(num_hidden_layers, hidden_layers_dim, num_output_classes)  



### Define the criterion function

In [None]:
@C.Function
def criterion_mn_factory(data, label_one_hot):
    z = mn_factory(data)
    loss = C.cross_entropy_with_softmax(z, label_one_hot)
    metric = C.classification_error(z, label_one_hot)
    return loss, metric

input_dim = 784
features = C.input_variable(input_dim)/255
labels = C.input_variable(num_output_classes, is_sparse=True)

criterion_mn = criterion_mn_factory(features, labels)
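
The two quantities returned by the criterion function can be sketched in plain NumPy as follows. This is an illustrative re-implementation under simplifying assumptions (dense one-hot labels, three classes instead of ten), not the code CNTK runs; the function names `softmax_cross_entropy` and `error_rate` are made up for this example.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot):
    # Numerically stable log-softmax: subtract the row maximum first
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Cross entropy: negative log-probability of the true class
    return -(one_hot * log_probs).sum(axis=1)

def error_rate(logits, one_hot):
    # 1.0 where the highest-scoring class differs from the true class, else 0.0
    return (logits.argmax(axis=1) != one_hot.argmax(axis=1)).astype(np.float64)

logits = np.array([[2.0, 0.5, -1.0],    # predicts class 0
                   [0.1, 0.2, 3.0]])    # predicts class 2
labels = np.array([[1.0, 0.0, 0.0],     # true class 0 (correct)
                   [0.0, 1.0, 0.0]])    # true class 1 (wrong)

loss = softmax_cross_entropy(logits, labels)
err = error_rate(logits, labels)
print(loss.mean(), err.mean())
```

The loss is what the learner minimizes; the error rate is only reported, since it is not differentiable.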

### Train the model using the SGD learner

In [None]:
# Define an SGD learner
learner = C.sgd(mn_factory.parameters, C.learning_rate_schedule(0.2, C.UnitType.minibatch))

# Create a progress writer to report on training progress
progress_writer = ProgressPrinter()

# Create the reader for the training data set
train_file = "../Data/MNIST_train.txt"
reader_train = create_reader(train_file, True, input_dim, num_output_classes)

# Initiate training
progress = criterion_mn.train(minibatch_source = reader_train,
                    streams = (reader_train.streams.features, reader_train.streams.labels),
                    minibatch_size = 64,
                    epoch_size = 12800,
                    max_epochs = 40,
                    parameter_learners=[learner],
                    callbacks = [progress_writer])
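
To see how much data these settings process, the short calculation below relates them to the size of the training set. It assumes that `epoch_size` is counted in samples and uses the 50,000-image training set described earlier; the variable names other than the three training arguments are introduced here for illustration.

```python
minibatch_size = 64
epoch_size = 12800       # samples per reporting "epoch"
max_epochs = 40
train_set_size = 50000   # images in the training file

minibatches_per_epoch = epoch_size // minibatch_size
total_samples = epoch_size * max_epochs
sweeps = total_samples / train_set_size   # full passes over the training data

print(minibatches_per_epoch, total_samples, sweeps)  # 200 512000 10.24
```

Under these assumptions an "epoch" here is shorter than one full pass over the data, and the whole run covers the training set roughly ten times.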


### Evaluate the model



In [None]:
# Create the reader on the validation data set
validation_file = "../Data/MNIST_validate.txt"
reader_validate = create_reader(validation_file, False, input_dim, num_output_classes)

# Score the validation set and calculate the classification error metric
validation_metric = criterion_mn.test(minibatch_source = reader_validate,
                                  minibatch_size = 64,
                                  streams = (reader_validate.streams.features, reader_validate.streams.labels),
                                  callbacks = [progress_writer])

# Hackathon

Try to improve the performance of the model. 

Hints:
- Try different activation functions in hidden layers
- Play with the learning rate, minibatch size and the number of sweeps
- You can look at regularization - check the `l1_regularization_weight` and `l2_regularization_weight` hyperparameters of the `sgd` learner
- Try different optimization algorithms
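
As background for the regularization hint, here is a minimal NumPy sketch of a single SGD update with an L2 penalty. It shows the general idea only; the `sgd_step` helper and its `l2_weight` argument are hypothetical, and CNTK may scale the penalty differently internally.

```python
import numpy as np

def sgd_step(w, grad, lr, l2_weight=0.0):
    # The penalty 0.5 * l2_weight * ||w||^2 adds l2_weight * w to the gradient,
    # which pulls every weight toward zero on each step
    return w - lr * (grad + l2_weight * w)

w = np.array([1.0, -2.0, 0.5])
grad = np.zeros_like(w)  # zero data gradient isolates the effect of the penalty

w_plain = sgd_step(w, grad, lr=0.2)                  # weights unchanged
w_decay = sgd_step(w, grad, lr=0.2, l2_weight=0.01)  # weights shrink by a factor of 0.998
print(w_plain, w_decay)
```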

## Final testing


DON'T CHEAT. DON'T USE MNIST_test.txt FOR MODEL TRAINING AND SELECTION. DON'T EXECUTE THE CELL BELOW UNTIL YOU ARE READY FOR THE FINAL TEST.



In [None]:
# Create the reader on the testing data set
test_file = '../Data/MNIST_test.txt'
reader_test = create_reader(test_file, False, input_dim, num_output_classes)

# Score the testing data set and calculate the classification error metric
test_metric = criterion_mn.test(minibatch_source = reader_test,
                           minibatch_size = 64,
                           streams = (reader_test.streams.features, reader_test.streams.labels),
                           callbacks = [progress_writer])