# Sample Model Training and Saving
Here is a sample model that classifies notMNIST data and saves the trained weights.  **Spoiler alert:** the following reveals a simple solution for one of the Udacity assignments.

The lines pertinent to saving your model are highlighted with Python comments in the final two sections.  You should label the input and output tensors of your model with the optional `name` parameter.  An instance of [tf.train.Saver](https://www.tensorflow.org/api_docs/python/tf/train/Saver) can write out the model and TensorFlow state to a file.

In this example, the files will be named `model.data-00000-of-00001`, `model.index`, and `checkpoint`.  By default, `Saver.save` also writes the graph definition to `model.meta`.  You must include both the `model*` files and the `checkpoint` file in your submission to recover your model state.

In [1]:
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range
from pathlib import Path

The data input follows the [Udacity coursework](https://www.udacity.com/course/deep-learning--ud730) and reads the pickled training data.

*We do not include the data (`notMNIST.pickle` file) in this repository.*

In [2]:
pickle_file = 'notMNIST.pickle'

with open(pickle_file, 'rb') as f:
    save = pickle.load(f)
    train_dataset = save['train_dataset']
    train_labels = save['train_labels']
    valid_dataset = save['valid_dataset']
    valid_labels = save['valid_labels']
    test_dataset = save['test_dataset']
    test_labels = save['test_labels']
    del save  # hint to help gc free up memory
    
image_size = 28
num_labels = 10

def reformat(dataset, labels):
    dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)
    labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
    return dataset, labels

def accuracy(predictions, labels):
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
          / predictions.shape[0])

train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 784) (200000, 10)
Validation set (10000, 784) (10000, 10)
Test set (10000, 784) (10000, 10)
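
The one-hot encoding inside `reformat` and the arg-max comparison inside `accuracy` can be seen on a toy input.  This is a minimal sketch with made-up labels and predictions (`toy_labels` and `toy_predictions` are illustrative names, not part of the notebook):

```python
import numpy as np

num_labels = 10

# Hypothetical class indices for three samples.
toy_labels = np.array([0, 3, 9])

# Broadcasting compares a (10,) row of class ids against a (3, 1) column of
# labels, giving a (3, 10) boolean matrix with exactly one True per row.
one_hot = (np.arange(num_labels) == toy_labels[:, None]).astype(np.float32)

# Made-up predictions: correct for the first two samples, wrong for the third.
toy_predictions = np.zeros((3, num_labels), dtype=np.float32)
toy_predictions[0, 0] = 1.0
toy_predictions[1, 3] = 1.0
toy_predictions[2, 4] = 1.0

# Same formula as accuracy(): percentage of rows whose arg-max matches.
toy_accuracy = (100.0 * np.sum(np.argmax(toy_predictions, 1) == np.argmax(one_hot, 1))
                / toy_predictions.shape[0])
print(one_hot.shape, toy_accuracy)
```

Two of the three toy predictions match, so `toy_accuracy` comes out at two thirds of 100.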


## Define the Model
Here is a sample model prepared for saving to disk, so that we can use the training results later.  The pertinent aspect here is to label any tensor that we will read from or write to during evaluation, using the optional `name` argument.
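
To see why the label matters, the following minimal sketch builds a throwaway graph and looks a tensor up by name.  The graph and the variable names `demo_graph`, `labeled`, and `unlabeled` are illustrative only; the lookup call is `tf.Graph.get_tensor_by_name`, which expects the op name followed by `:0` for the op's first output.

```python
import numpy as np
import tensorflow as tf

demo_graph = tf.Graph()
with demo_graph.as_default():
    # Labeled: the op is called 'test_data', so its output is 'test_data:0'.
    labeled = tf.constant(np.zeros((2, 3), dtype=np.float32), name='test_data')
    # Unlabeled: TensorFlow picks an auto-generated name instead.
    unlabeled = tf.constant(np.zeros((2, 3), dtype=np.float32))

# A labeled tensor can be found again later using nothing but its name.
found = demo_graph.get_tensor_by_name('test_data:0')

print(labeled.name)    # the name we chose, with the ':0' output suffix
print(unlabeled.name)  # an auto-generated name that is harder to rely on
print(found.name)
```

Without the `name` argument, the tensor is still in the graph, but you would have to guess or inspect its auto-generated name to retrieve it.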

In [3]:
batch_size = 128
num_hidden = 1024

sgd_hidden_graph = tf.Graph()
with sgd_hidden_graph.as_default():
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    # ---------------------------------------> LABELED FOR LATER USE <----------------
    tf_test_dataset = tf.constant(test_dataset, name='test_data')
    
    w0 = tf.Variable(tf.truncated_normal([image_size * image_size, num_hidden]), name='W0')
    w1 = tf.Variable(tf.truncated_normal([num_hidden, num_labels]), name='W1')

    b0 = tf.Variable(tf.zeros([num_hidden]), name='b0')
    b1 = tf.Variable(tf.zeros([num_labels]), name='b1')

    def reluLayer(dataset):
        return tf.nn.relu(tf.matmul(dataset, w0) + b0)
    def logitLayer(dataset):
        return tf.matmul(reluLayer(dataset), w1) + b1
    
    sgd_hidden_loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            labels=tf_train_labels, 
            logits=logitLayer(tf_train_dataset)))
    sgd_hidden_optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(sgd_hidden_loss)
  
    # ---------------------------------------------------------------------> LABELED FOR LATER USE <--
    sgd_hidden_train_prediction = tf.nn.softmax(logitLayer(tf_train_dataset), name='train_predictor')
    sgd_hidden_valid_prediction = tf.nn.softmax(logitLayer(tf_valid_dataset), name='validate_predictor')
    sgd_hidden_test_prediction = tf.nn.softmax(logitLayer(tf_test_dataset), name='test_predictor')

## Train the Model
We demonstrate how to write the model state to disk.  You will need to save the `model*` and `checkpoint` files for later use.

In [4]:
num_steps = 3001

with tf.Session(graph=sgd_hidden_graph) as sgd_hidden_session:
    # ------> CREATE A SAVER INSTANCE TO SAVE YOUR MODEL <---
    saver = tf.train.Saver()
    tf.global_variables_initializer().run()
    print("Initialized")
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :].reshape(batch_size, num_labels)
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, sgd_hidden_predictions = sgd_hidden_session.run(
            [sgd_hidden_optimizer, sgd_hidden_loss, sgd_hidden_train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            print("Minibatch loss at step %d: %f" % (step, l))
            print("Minibatch accuracy: %.1f%%" % accuracy(sgd_hidden_predictions, batch_labels))
            print("Validation accuracy: %.1f%%" % accuracy(sgd_hidden_valid_prediction.eval(), valid_labels))
    print("Test accuracy: %.1f%%" % accuracy(sgd_hidden_test_prediction.eval(), test_labels))
    # -----------------> MODEL IS SAVED HERE <-------------------
    saver.save(sgd_hidden_session, '{}/model'.format(Path.cwd()))

Initialized
Minibatch loss at step 0: 302.069550
Minibatch accuracy: 13.3%
Validation accuracy: 29.4%
Minibatch loss at step 500: 15.300731
Minibatch accuracy: 80.5%
Validation accuracy: 79.5%
Minibatch loss at step 1000: 12.396424
Minibatch accuracy: 77.3%
Validation accuracy: 79.9%
Minibatch loss at step 1500: 5.337847
Minibatch accuracy: 87.5%
Validation accuracy: 81.3%
Minibatch loss at step 2000: 3.442477
Minibatch accuracy: 85.2%
Validation accuracy: 81.7%
Minibatch loss at step 2500: 3.387420
Minibatch accuracy: 85.9%
Validation accuracy: 82.0%
Minibatch loss at step 3000: 1.928396
Minibatch accuracy: 82.8%
Validation accuracy: 82.2%
Test accuracy: 89.2%
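
To read the saved state back, one option is a sketch like the following.  It assumes the default `model.meta` graph file was written next to the checkpoint data, and that the TensorFlow 1.x-style API is reachable (through `tf.compat.v1` on newer installs).  The helper name `restore_and_predict` is hypothetical; the tensor name `test_predictor:0` comes from the `name='test_predictor'` label used when the model was defined.

```python
import tensorflow as tf

# Assumption: fall back to the top-level module on old 1.x installs that
# do not ship tf.compat.v1.
tf1 = getattr(tf.compat, 'v1', tf)


def restore_and_predict(checkpoint_prefix, tensor_name='test_predictor:0'):
    """Rebuild the saved graph, restore its variables, and evaluate one tensor."""
    graph = tf.Graph()
    with graph.as_default():
        # Recreates the ops and variables recorded in '<prefix>.meta'.
        saver = tf1.train.import_meta_graph(checkpoint_prefix + '.meta')
    with tf1.Session(graph=graph) as session:
        # Loads the trained variable values from the checkpoint files.
        saver.restore(session, checkpoint_prefix)
        # Look the output up by the label given at model-definition time.
        predictor = graph.get_tensor_by_name(tensor_name)
        return session.run(predictor)
```

With the files from this notebook in the working directory, `restore_and_predict('./model')` should return the test-set predictions, since the test data was stored in the graph as a constant.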
