In [None]:
import graphlab
from IPython.display import Image

Deep Learning Basics
=============

Today we'll walk through building a deep learning network for handwritten digit recognition using the <a href='http://yann.lecun.com/exdb/mnist/'>MNIST dataset</a>. MNIST is real-world data that is already formatted and labeled, so we can focus on building our network instead of cleaning the data.

<img src="images/mnist.png"></img>

We're going to walk through 4 steps today:
- Loading the Data
- Training our Model
- Evaluating the results of the model
- Looking at ways to improve the performance of our model
-----

<img src="images/load.png"></img>

We've downloaded the MNIST dataset for you. We load the data into an SFrame, which is a powerful and scalable data structure that is used by many of the models in GraphLab Create.

In [None]:
data = graphlab.SFrame('mnist_train.gl/')
data

Let's visualize the data using canvas:

In [None]:
data.show()   # Opens the visualization in a separate window

--------
<img src="images/train.png"></img>

We now use the ```neuralnet_classifier``` provided by GraphLab Create to create a neural network for our dataset. The ```create``` method can pick a default network architecture for you based on the dataset; here we pass it the built-in MNIST network explicitly. We also specify the number of iterations we want to train for (more is generally better, but also takes more time). You should adjust ```max_iterations``` and verify that the performance of the model improves.

In [None]:
net = graphlab.deeplearning.get_builtin_neuralnet('mnist')
neuralnet = graphlab.neuralnet_classifier.create(data,
                                                 network = net,
                                                 target ='label',
                                                 max_iterations = 6,
                                                 metric=['accuracy', 'recall@2'],
                                                 validation_set=None)

Below is a typical error curve for neural network training. Here, epochs and iterations are interchangeable: both count the number of passes through the data.

In [None]:
Image('images/error_curve.png')
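
The `create` call above asks for two metrics, `'accuracy'` and `'recall@2'`. As a rough illustration in plain Python (not GraphLab's implementation), the sketch below assumes that `recall@2` counts an example as correct when the true label is among the model's two highest-scoring classes. The function names and toy data are hypothetical.

```python
def accuracy(true_labels, ranked_predictions):
    """Fraction of examples whose top-ranked prediction equals the true label."""
    hits = sum(1 for truth, ranked in zip(true_labels, ranked_predictions)
               if ranked[0] == truth)
    return hits / float(len(true_labels))


def recall_at_k(true_labels, ranked_predictions, k):
    """Fraction of examples whose true label appears in the top-k predictions.

    Assumption: this is how 'recall@k' is interpreted for this sketch.
    """
    hits = sum(1 for truth, ranked in zip(true_labels, ranked_predictions)
               if truth in ranked[:k])
    return hits / float(len(true_labels))


# Hypothetical toy data: each inner list ranks digit classes from most to least likely.
true_labels = [7, 2, 1, 0]
ranked_predictions = [[7, 1, 9], [3, 2, 8], [1, 7, 4], [6, 8, 0]]

toy_accuracy = accuracy(true_labels, ranked_predictions)
toy_recall_at_2 = recall_at_k(true_labels, ranked_predictions, 2)
```

Under that reading, `recall@2` can never be lower than accuracy, since a correct top prediction is always within the top two.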

-----
<img src="images/evaluate.png"></img>
To ensure that the deep learning model is actually learning how to recognize the digits, instead of memorizing features of the training data, we want to validate it with a dataset it hasn't seen before.

In [None]:
validation_data = graphlab.SFrame('mnist_test.gl/')
validation_data

In [None]:
neuralnet.evaluate(validation_data)

Let's explore the examples that are misclassified by the model:

In [None]:
neuralnet.show()

In [None]:
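def find_misclassifications(validation_data, nn_model):
    """Return the rows of validation_data whose predicted class differs from the true label."""
    # Get the model's predictions for every example.
    classifications = nn_model.classify(validation_data)
    # Attach each prediction to its example by matching 'id' to 'row_id'.
    joined_classifications = validation_data.join(classifications, on={'id': 'row_id'})
    # Keep only the rows where the prediction disagrees with the label.
    mismatch = joined_classifications['label'] != joined_classifications['class']
    return joined_classifications[mismatch]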

In [None]:
misclassifications = find_misclassifications(validation_data, neuralnet)

In [None]:
misclassifications

Let's sort by score in descending order. The score represents the model's confidence in its prediction. Examples that are misclassified with high confidence are particularly interesting because they may give us some insight into where the model is *very* wrong. So let's sort the misclassifications:

In [None]:
sorted_misclassifications = misclassifications.sort('score', ascending=False)

And visualize it:

In [None]:
sorted_misclassifications.show()
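
The filter-and-sort steps above can be mimicked on plain Python records, which may help if you want to see the logic without an SFrame. This is only a sketch: the rows below are hypothetical and simply reuse the column names `label`, `class`, and `score` from the cells above.

```python
# Hypothetical records mirroring the joined columns: true label, predicted class, score.
rows = [
    {'id': 0, 'label': 7, 'class': 7, 'score': 0.99},
    {'id': 1, 'label': 4, 'class': 9, 'score': 0.95},
    {'id': 2, 'label': 5, 'class': 3, 'score': 0.51},
    {'id': 3, 'label': 2, 'class': 2, 'score': 0.80},
    {'id': 4, 'label': 8, 'class': 3, 'score': 0.72},
]

# Keep only the rows where the prediction disagrees with the label.
wrong = [row for row in rows if row['label'] != row['class']]

# Put the most confident mistakes first.
wrong_by_confidence = sorted(wrong, key=lambda row: row['score'], reverse=True)
worst_ids = [row['id'] for row in wrong_by_confidence]
```

Here the row with `id` 1 comes first: the model was 95% confident in a wrong answer, which makes it the most informative example to inspect.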

-----
<img src="images/improve.png"></img>
Our network does well on the data, but we'd like to do better. How can we improve it? Here's what Andrew Ng has to say:
<img src="images/machine_learning_recipe.png"></img>

Our training accuracy is ~80%, which leaves room to improve, so let's make the network larger! As a reminder, here is what a neural network layer is. If you'd like to learn more about the different layers, please refer to our [API docs](https://dato.com/products/create/docs/generated/graphlab.deeplearning.layers.html).

<img src="images/neural_net.png"></img>
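
In code, a fully connected layer can be sketched in a few lines of plain Python. This assumes the standard formulation, where each hidden unit computes a weighted sum of all inputs plus a bias and then applies a nonlinearity (ReLU here). The function names, weights, and inputs are hypothetical and are not taken from the MNIST network.

```python
def fully_connected(inputs, weights, biases):
    """Each output unit is a weighted sum of all inputs plus that unit's bias."""
    return [sum(w * x for w, x in zip(unit_weights, inputs)) + b
            for unit_weights, b in zip(weights, biases)]


def relu(values):
    """Rectified linear activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical layer with 3 inputs and 2 hidden units.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.5],   # weights of hidden unit 0
           [1.0, 1.0, -1.0]]   # weights of hidden unit 1
biases = [0.5, -1.0]

pre_activation = fully_connected(inputs, weights, biases)
activation = relu(pre_activation)
```

Adding hidden units means adding rows to `weights` (and entries to `biases`), which is what "making the network larger" amounts to for this kind of layer.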

Let's look at the network that was generated. Layer 3 is a good place to increase the network size. Let's increase the number of hidden units in layer 3 from 100 to 500 and retrain the model.

In [None]:
neuralnet['network']

In [None]:
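new_network = neuralnet['network']
# Swap in a 500-unit fully connected layer (the layer previously had 100 hidden units).
# The text calls this layer 3; check the printed network above to confirm index 4 is the right position.
new_network.layers[4] = graphlab.deeplearning.layers.FullConnectionLayer(500)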
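
To get a feel for how much larger this makes the layer, here is a small sketch that counts the parameters of a fully connected layer, assuming one weight per input per hidden unit plus one bias per unit. The input size of 1000 is a hypothetical placeholder, not the real input size of this layer in the MNIST network.

```python
def fc_parameter_count(num_inputs, num_hidden_units):
    """One weight per (input, unit) pair, plus one bias per unit."""
    return num_inputs * num_hidden_units + num_hidden_units


num_inputs = 1000  # hypothetical input size, chosen only for illustration

params_before = fc_parameter_count(num_inputs, 100)
params_after = fc_parameter_count(num_inputs, 500)
growth = params_after // params_before
```

For a fixed input size, the parameter count grows linearly with the number of hidden units, so going from 100 to 500 units gives this layer five times as many parameters to fit.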

In [None]:
improved_neuralnet = graphlab.neuralnet_classifier.create(data, target='label', 
                                                          max_iterations=3, 
                                                          validation_set=None, 
                                                          network=new_network)

Looks like our validation accuracy jumped to about 93%!

In [None]:
improved_neuralnet.evaluate(validation_data)

Beware of Overfitting
===========

If you make the network too big relative to the dataset, overfitting can occur. This is when a model describes random noise instead of the underlying visual structure. For instance, maybe all the 7's in the training set have a strike through the middle. The model may learn this fact and get confused when it sees a new 7 without a strike. This can be solved either with more data (as in the flowchart above) or by making the model smaller or less complex, so that it does not have the expressiveness to memorize details that are not critical to the visual structure.

In [None]:
Image('images/overfitting.png')
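
One practical way to watch for this is to compare training accuracy with validation accuracy. The sketch below turns the reasoning from this notebook into a small helper: low training accuracy suggests a larger network or more iterations, while a large gap between training and validation accuracy suggests overfitting. The function name and both thresholds are hypothetical and chosen only for illustration.

```python
def diagnose(train_accuracy, validation_accuracy,
             target_accuracy=0.95, max_gap=0.05):
    """Suggest a next step from training and validation accuracy.

    target_accuracy and max_gap are hypothetical thresholds, not recommended values.
    """
    if train_accuracy < target_accuracy:
        # The model cannot even fit the training data well.
        return 'underfitting: try a larger network or more iterations'
    if train_accuracy - validation_accuracy > max_gap:
        # The model fits the training data but does not generalize.
        return 'overfitting: try more data or a smaller network'
    return 'ok'


first_model = diagnose(0.80, 0.78)       # low training accuracy
memorizing_model = diagnose(0.99, 0.85)  # large train/validation gap
healthy_model = diagnose(0.97, 0.96)     # high accuracy, small gap
```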