Deep Learning
=============

Assignment 2
------------

Previously in `1_notmnist.ipynb`, we created a pickle with formatted datasets for training, development and testing on the [notMNIST dataset](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html).

The goal of this assignment is to progressively train deeper and more accurate models using TensorFlow.

In [2]:
from __future__ import print_function
import numpy as np
import tensorflow as tf
import pickle

First, reload the data we generated in `1_notmnist.ipynb`.

In [7]:
pickle_file = '../../data/notMNIST_sanitized.pickle'

with open(pickle_file, 'rb') as f:
    save = pickle.load(f)
    train_dataset = save['train_dataset_sanitized']
    train_labels = save['train_labels_sanitized']
    valid_dataset = save['valid_dataset']
    valid_labels = save['valid_labels']
    test_dataset = save['test_dataset']
    test_labels = save['test_labels']
    del save # hint to help gc free up memory
    print('Training_set',train_dataset.shape, train_labels.shape)
    print('Validation_set',valid_dataset.shape, valid_labels.shape)
    print('Test_set',test_dataset.shape, test_labels.shape)

Training_set (192407, 28, 28) (192407,)
Validation_set (10000, 28, 28) (10000,)
Test_set (10000, 28, 28) (10000,)


Reformat into a shape that's more adapted to the models we're going to train:
- data as a flat matrix,
- labels as float 1-hot encodings.
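As a rough illustration of the 1-hot encoding named above, here is a minimal sketch using NumPy broadcasting. The names `toy_labels` and `toy_num_labels` are hypothetical and are not part of the notMNIST data; only the comparison trick itself is the point.

```python
import numpy as np

# Hypothetical toy labels for illustration only (not taken from notMNIST).
toy_labels = np.array([0, 2, 1])
toy_num_labels = 3

# toy_labels[:, None] has shape (3, 1); comparing it with a (3,) range
# broadcasts to a (3, 3) boolean matrix with exactly one True per row.
one_hot = (np.arange(toy_num_labels) == toy_labels[:, None]).astype(np.float32)

print(one_hot.shape)
print(one_hot)
```

Each row holds a single 1.0 in the column given by the label, so label 2 becomes `[0.0, 0.0, 1.0]`.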

In [9]:
image_size = 28
num_labels = 10

In [14]:
def reformat(dataset, labels):
    """Flatten the images and one-hot encode the labels."""
    dataset = dataset.reshape(dataset.shape[0], -1).astype(np.float32)
    # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]
    # Use the `labels` argument, not the global `train_labels`.
    labels = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
    return dataset, labels

train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training_set',train_dataset.shape, train_labels.shape)
print('Validation_set',valid_dataset.shape, valid_labels.shape)
print('Test_set',test_dataset.shape, test_labels.shape)

Training_set (192407, 784) (192407, 10)
Validation_set (10000, 784) (10000, 10)
Test_set (10000, 784) (10000, 10)


We're first going to train a multinomial logistic regression using simple gradient descent.
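To make the idea concrete before moving to TensorFlow, the following is a minimal NumPy sketch of multinomial logistic regression trained with full-batch gradient descent. It is not the notebook's TensorFlow implementation: the helper names (`softmax`, `cross_entropy`, `gradient_step`), the synthetic data, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot_labels):
    # Mean negative log-likelihood of the true class; epsilon avoids log(0).
    return -np.mean(np.sum(one_hot_labels * np.log(probs + 1e-12), axis=1))

def gradient_step(X, Y, W, b, learning_rate):
    # For softmax + mean cross-entropy, the gradient w.r.t. the logits
    # is (probs - Y) / num_examples.
    probs = softmax(X @ W + b)
    grad_logits = (probs - Y) / X.shape[0]
    W_new = W - learning_rate * (X.T @ grad_logits)
    b_new = b - learning_rate * grad_logits.sum(axis=0)
    return W_new, b_new

# Synthetic toy problem (assumption): 60 examples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
labels = np.argmax(X @ rng.normal(size=(4, 3)), axis=1)
Y = (np.arange(3) == labels[:, None]).astype(np.float32)

W = np.zeros((4, 3))
b = np.zeros(3)
initial_loss = cross_entropy(softmax(X @ W + b), Y)
for _ in range(200):
    W, b = gradient_step(X, Y, W, b, learning_rate=0.5)
final_loss = cross_entropy(softmax(X @ W + b), Y)

print('loss before:', initial_loss, 'loss after:', final_loss)
```

With zero-initialized weights every class gets probability 1/3, so the starting loss is about log(3); the loss should then fall as the updates proceed.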

TensorFlow works like this:
* First you describe the computation that you want to see performed: what the inputs, the variables, and the operations look like. These get created as nodes over a computation graph. This description is all contained within the block below:

      with graph.as_default():
          ...

* Then you can run the operations on this graph as many times as you want by calling `session.run()`, passing it the outputs to fetch from the graph; their values are returned to you. This runtime operation is all contained in the block below:

      with tf.Session(graph=graph) as session:
          ...

Let's load all the data into TensorFlow and build the computation graph corresponding to our training: