# TensorFlow Mechanics 101

Code: [tensorflow/g3doc/tutorials/mnist/](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/)

The goal of this tutorial is to show how to use TensorFlow to train and evaluate a simple feed-forward neural network for handwritten digit classification using the (classic) MNIST data set. The intended audience for this tutorial is experienced machine learning users interested in using TensorFlow.

These tutorials are not intended for teaching Machine Learning in general.

Please ensure you have followed the instructions to [install TensorFlow](http://www.tensorflow.org/get_started/os_setup.html).


## Tutorial Files

This tutorial references the following files:

| File | Purpose |
|:--|:--|
| [mnist.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/mnist.py) | The code to build a fully-connected MNIST model. |
| [fully_connected_feed.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py) | The main code to train the built MNIST model against the downloaded dataset using a feed dictionary. |

Simply run the ```fully_connected_feed.py``` file directly to start training:

In [1]:
%run fully_connected_feed.py

Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Step 0: loss = 2.32 (0.111 sec)
Step 100: loss = 2.15 (0.010 sec)
Step 200: loss = 1.92 (0.009 sec)
Step 300: loss = 1.58 (0.009 sec)
Step 400: loss = 1.32 (0.009 sec)
Step 500: loss = 0.88 (0.009 sec)
Step 600: loss = 0.85 (0.009 sec)
Step 700: loss = 0.67 (0.009 sec)
Step 800: loss = 0.61 (0.017 sec)
Step 900: loss = 0.62 (0.013 sec)
Training Data Eval:
  Num examples: 55000  Num correct: 47841  Precision @ 1: 0.8698
Validation Data Eval:
  Num examples: 5000  Num correct: 4386  Precision @ 1: 0.8772
Test Data Eval:
  Num examples: 10000  Num correct: 8731  Precision @ 1: 0.8731
Step 1000: loss = 0.54 (0.044 sec)
Step 1100: loss = 0.50 (0.266 sec)
Step 1200: loss = 0.48 (0.013 sec)
Step 1300: loss = 0.38 (0.009 sec)
Step 1400: loss = 0.50 (0.011 sec)
Step 1500: loss = 0.44 (0.010 sec)
Step 1600: loss = 0.29 (0.012 se

In [1]:
"""Trains and Evaluates the MNIST network using a feed dictionary.
TensorFlow install instructions:
https://tensorflow.org/get_started/os_setup.html
MNIST tutorial:
https://tensorflow.org/tutorials/mnist/tf/index.html
"""
# pylint: disable=missing-docstring
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os.path
import time
import tensorflow.python.platform
import numpy
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf
import input_data
import mnist
# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('max_steps', 2000, 'Number of steps to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size.  '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
                     'for unit testing.')
def placeholder_inputs(batch_size):
  """Generate placeholder variables to represent the input tensors.
  These placeholders are used as inputs by the rest of the model building
  code and will be fed from the downloaded data in the .run() loop, below.
  Args:
    batch_size: The batch size will be baked into both placeholders.
  Returns:
    images_placeholder: Images placeholder.
    labels_placeholder: Labels placeholder.
  """
  # Note that the shapes of the placeholders match the shapes of the full
  # image and label tensors, except the first dimension is now batch_size
  # rather than the full size of the train or test data sets.
  images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                         mnist.IMAGE_PIXELS))
  labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
  return images_placeholder, labels_placeholder
def fill_feed_dict(data_set, images_pl, labels_pl):
  """Fills the feed_dict for training the given step.
  A feed_dict takes the form of:
  feed_dict = {
      <placeholder>: <tensor of values to be passed for placeholder>,
      ....
  }
  Args:
    data_set: The set of images and labels, from input_data.read_data_sets()
    images_pl: The images placeholder, from placeholder_inputs().
    labels_pl: The labels placeholder, from placeholder_inputs().
  Returns:
    feed_dict: The feed dictionary mapping from placeholders to values.
  """
  # Create the feed_dict for the placeholders filled with the next
  # `batch_size` examples.
  images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size,
                                                 FLAGS.fake_data)
  feed_dict = {
      images_pl: images_feed,
      labels_pl: labels_feed,
  }
  return feed_dict
def do_eval(sess,
            eval_correct,
            images_placeholder,
            labels_placeholder,
            data_set):
  """Runs one evaluation against the full epoch of data.
  Args:
    sess: The session in which the model has been trained.
    eval_correct: The Tensor that returns the number of correct predictions.
    images_placeholder: The images placeholder.
    labels_placeholder: The labels placeholder.
    data_set: The set of images and labels to evaluate, from
      input_data.read_data_sets().
  """
  # And run one epoch of eval.
  true_count = 0  # Counts the number of correct predictions.
  steps_per_epoch = data_set.num_examples // FLAGS.batch_size
  num_examples = steps_per_epoch * FLAGS.batch_size
  for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
  precision = true_count / num_examples
  print('  Num examples: %d  Num correct: %d  Precision @ 1: %0.04f' %
        (num_examples, true_count, precision))
def run_training():
  """Train MNIST for a number of steps."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    # Generate placeholders for the images and labels.
    images_placeholder, labels_placeholder = placeholder_inputs(
        FLAGS.batch_size)
    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images_placeholder,
                             FLAGS.hidden1,
                             FLAGS.hidden2)
    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels_placeholder)
    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)
    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels_placeholder)
    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()
    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()
    # Create a session for running Ops on the Graph.
    sess = tf.Session()
    # Run the Op to initialize the variables.
    init = tf.initialize_all_variables()
    sess.run(init)
    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)
    # And then after everything is built, start the training loop.
    for step in xrange(FLAGS.max_steps):
      start_time = time.time()
      # Fill a feed dictionary with the actual set of images and labels
      # for this particular training step.
      feed_dict = fill_feed_dict(data_sets.train,
                                 images_placeholder,
                                 labels_placeholder)
      # Run one step of the model.  The return values are the activations
      # from the `train_op` (which is discarded) and the `loss` Op.  To
      # inspect the values of your Ops or variables, you may include them
      # in the list passed to sess.run() and the value tensors will be
      # returned in the tuple from the call.
      _, loss_value = sess.run([train_op, loss],
                               feed_dict=feed_dict)
      duration = time.time() - start_time
      # Write the summaries and print an overview fairly often.
      if step % 100 == 0:
        # Print status to stdout.
        print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration))
        # Update the events file.
        summary_str = sess.run(summary_op, feed_dict=feed_dict)
        summary_writer.add_summary(summary_str, step)
      # Save a checkpoint and evaluate the model periodically.
      if (step + 1) % 1000 == 0 or (step + 1) == FLAGS.max_steps:
        saver.save(sess, FLAGS.train_dir, global_step=step)
        # Evaluate against the training set.
        print('Training Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.train)
        # Evaluate against the validation set.
        print('Validation Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.validation)
        # Evaluate against the test set.
        print('Test Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.test)

run_training()


## Prepare the Data

MNIST is a classic problem in machine learning. The problem is to look at greyscale 28x28 pixel images of handwritten digits and determine which digit the image represents, for all the digits from zero to nine.

<p align="center">
    <img src="images/mnist_digits.png" style="width: 300px;" />
</p>

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).

## Download

At the top of the ```run_training()``` function, the ```input_data.read_data_sets()``` call ensures that the correct data has been downloaded to your local training folder, then unpacks that data and returns an object holding the ```DataSet``` instances, accessed as attributes such as ```data_sets.train```.

In [None]:
data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

NOTE: The ```fake_data``` flag is used for unit-testing purposes and may be safely ignored by the reader.

| Dataset | Purpose |
| -- | -- |
| ```data_sets.train```      | 55000 images and labels, for primary training. |
| ```data_sets.validation``` | 5000 images and labels, for iterative validation of training accuracy. |
| ```data_sets.test```       | 10000 images and labels, for final testing of trained accuracy. |

For more information about the data, please read the [Download](http://www.tensorflow.org/tutorials/mnist/download/index.html) tutorial.
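
The sizes in the table add up to the 60,000 examples in the MNIST training file plus the 10,000 in the test file. The sketch below shows one way such a train/validation split could be made. It is only an illustration: ```split_train_validation``` and ```VALIDATION_SIZE``` are hypothetical names, the examples are stand-in integers rather than images, and taking the validation set from the front of the file is an assumption, not something stated above.

```python
# Hypothetical helper: shows how a 60,000-example training file could be
# divided into the 55,000/5,000 train/validation split listed in the table.
VALIDATION_SIZE = 5000  # assumed constant, chosen to match the table


def split_train_validation(examples, validation_size=VALIDATION_SIZE):
    """Return (train, validation) lists carved from one list of examples."""
    validation = examples[:validation_size]
    train = examples[validation_size:]
    return train, validation


# Stand-in for the real images: integer ids instead of pixel arrays.
all_training_examples = list(range(60000))
train, validation = split_train_validation(all_training_examples)
print(len(train), len(validation))
```

Keeping the validation examples out of the training list is what makes the validation score a useful check while the model is still being tuned.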

## Inputs and Placeholders

The ```placeholder_inputs()``` function creates two ```tf.placeholder``` ops. They define the shape of the inputs, including the ```batch_size```, for the rest of the graph, and they are the ops into which the actual training examples will be fed.

In [None]:
images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       mnist.IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))

Further down, in the training loop, the full image and label datasets are sliced to fit the ```batch_size``` for each step, matched with these placeholder ops, and then passed into the ```sess.run()``` function using the ```feed_dict``` parameter.
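
The slicing and matching described above can be sketched without TensorFlow. In this simplified variant, ```TinyDataSet``` is a hypothetical stand-in for the tutorial's ```DataSet``` class, the placeholders are represented by plain strings, and ```batch_size``` is passed as an argument instead of being read from ```FLAGS```. All of these are assumptions made to keep the example self-contained.

```python
# Minimal stand-in for the batching behaviour described above. The class and
# the string placeholder keys are illustrative assumptions, not TensorFlow APIs.
class TinyDataSet(object):
    def __init__(self, images, labels):
        assert len(images) == len(labels)
        self._images = images
        self._labels = labels
        self._index = 0

    def next_batch(self, batch_size):
        """Return the next `batch_size` examples, restarting when exhausted."""
        if self._index + batch_size > len(self._images):
            self._index = 0  # not enough examples left: start a new epoch
        start = self._index
        self._index += batch_size
        return (self._images[start:self._index],
                self._labels[start:self._index])


def fill_feed_dict(data_set, images_pl, labels_pl, batch_size):
    """Map each placeholder key to the next slice of data."""
    images_feed, labels_feed = data_set.next_batch(batch_size)
    return {images_pl: images_feed, labels_pl: labels_feed}


# Ten fake "images" (lists of two pixels) with labels 0..9.
data = TinyDataSet([[i, i] for i in range(10)], list(range(10)))
feed = fill_feed_dict(data, 'images_placeholder', 'labels_placeholder', 4)
print(feed['labels_placeholder'])
```

In the real script the dictionary keys are the placeholder ops themselves, and the resulting dictionary is what gets passed to ```sess.run()``` through the ```feed_dict``` parameter.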

## Build the Graph

After creating placeholders for the data, the graph is built from the ```mnist.py``` file according to a three-stage pattern: ```inference()```, ```loss()```, and ```training()```.

1. ```inference()``` - Builds the graph as far as is required for running the network forward to make predictions.
2. ```loss()``` - Adds to the inference graph the ops required to generate loss.
3. ```training()``` - Adds to the loss graph the ops required to compute and apply gradients.
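
To make the division of labour between the three stages concrete, here is a rough plain-Python sketch of the same pattern. It is not the code in ```mnist.py```: the real functions add ops to a TensorFlow graph and build a network with two hidden layers, whereas this version computes values eagerly for a single linear layer on a toy two-pixel problem. The function signatures and the toy data are assumptions chosen for illustration.

```python
import math

# Plain-Python stand-ins for the three stages. The real functions in mnist.py
# add ops to a TensorFlow graph; these compute values eagerly instead.


def inference(image, weights):
    """Stage 1: forward pass. Returns one logit per class."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def loss(logits, label):
    """Stage 2: softmax cross-entropy for one example. Returns (loss, probs)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label]), probs


def training(image, label, weights, learning_rate):
    """Stage 3: one gradient-descent step. Returns the pre-update loss."""
    loss_value, probs = loss(inference(image, weights), label)
    for k, row in enumerate(weights):
        # d(loss)/d(logit_k) for softmax cross-entropy is p_k - 1[k == label].
        grad_logit = probs[k] - (1.0 if k == label else 0.0)
        for j, x in enumerate(image):
            row[j] -= learning_rate * grad_logit * x
    return loss_value


# Toy problem: two "pixels", two classes.
examples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]
losses = []
for step in range(50):
    image, label = examples[step % len(examples)]
    losses.append(training(image, label, weights, learning_rate=0.5))
print('first loss = %.3f, last loss = %.3f' % (losses[0], losses[-1]))
```

Each stage consumes the output of the previous one, which is why the tutorial describes ```loss()``` as adding to the inference graph and ```training()``` as adding to the loss graph.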