# TensorFlow MNIST Exercise
I modify the MNIST tutorial from the TensorFlow website to understand how graphs and sessions work.

## MNIST Tutorial
From: https://www.tensorflow.org/get_started/mnist/beginners  
MNIST is a data set of handwritten digits from 0 to 9. Each 28 x 28 image is flattened into a vector of 784 values, so a batch of images has shape [None, 784]. The data is split into 55,000 training images, 10,000 test images, and 5,000 validation images. Keeping these three sets separate is very important: the model is fit on the training set, tuned on the validation set, and scored on the test set.
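
To make the two data layouts concrete, here is a minimal sketch in plain Python of what "flattened" and "one-hot" mean. The helper names `flatten_image` and `one_hot` are made up for illustration; the tutorial's loader produces these layouts for you.

```python
# Hypothetical helpers for illustration; not part of TensorFlow.

def flatten_image(image):
    """Flatten a 2-D list of pixel rows into one 1-D list, row by row."""
    return [pixel for row in image for pixel in row]

def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

# A blank 28 x 28 image with a single bright pixel at row 1, column 2.
image = [[0.0] * 28 for _ in range(28)]
image[1][2] = 1.0

flat = flatten_image(image)
print(len(flat))        # 784
print(flat.index(1.0))  # 30, because row 1 * 28 + column 2 = 30
print(one_hot(3))       # 1.0 in position 3, 0.0 in the other nine positions
```

The flattening throws away the 2-D layout, which is fine for the simple softmax model used here because it treats every pixel as an independent input.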

In [1]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [2]:
import tensorflow as tf

I copy the graph from the tutorial and write another simple graph. Every placeholder is a point where feed_dict supplies external data when the session runs later. I define the tutorial's operations inside a graph called g_1, and a second graph, g_2, to practice working with multiple graphs at once.  
A graph is built from a few basic elements: **placeholders** (for feeding data into the graph), **variables** (the weights that you train), and **operations** (anything from a simple addition to the optimizer step). In general, every session.run() call names an **operation** to evaluate and feeds data into the **placeholders** that the operation depends on. The **variables** change only when the operation being run updates them, as the optimizer's training step does. I'm sure it gets more complicated than this, but this is a start.
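
The "build first, run later" idea can be sketched without TensorFlow at all. The toy classes below (`Placeholder`, `Variable`, `Op`, and the `run` function) are hypothetical and only mimic the pattern; they are not how TensorFlow is implemented.

```python
# Toy illustration only: none of these classes come from TensorFlow.

class Placeholder:
    """A named slot whose value is supplied at run time through feed_dict."""
    def __init__(self, name):
        self.name = name

class Variable:
    """A stored value that persists between run() calls and can be updated."""
    def __init__(self, value):
        self.value = value

class Op:
    """A deferred computation over other nodes."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

def run(node, feed_dict=None):
    """Evaluate `node`, reading placeholder values from feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("no value fed for placeholder " + node.name)
        return feed_dict[node]
    if isinstance(node, Variable):
        return node.value
    # An Op: evaluate its inputs first, then apply its function.
    return node.fn(*(run(i, feed_dict) for i in node.inputs))

# Build the graph out = weight * inp + bias. Nothing is computed yet.
inp = Placeholder("inp")
weight = Variable(2.0)
bias = Variable(1.0)
out = Op(lambda a, c: a + c, Op(lambda a, c: a * c, weight, inp), bias)

# Run the same graph with different fed values.
print(run(out, {inp: 3.0}))   # 7.0
print(run(out, {inp: 10.0}))  # 21.0

# A variable keeps its value until something updates it.
weight.value = 3.0
print(run(out, {inp: 3.0}))   # 10.0
```

Running `out` without feeding `inp` raises an error, which is the toy version of what happens when a placeholder is left out of feed_dict.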

In [17]:
# graph for training with MNIST data 
g_1 = tf.Graph()
with g_1.as_default():   
    # placeholder: x represents the flattened 28 x 28 image input  
    x = tf.placeholder(tf.float32, [None, 784])

    # variable: W represents a 784 x 10 weight tensor that is variable during training 
    W = tf.Variable(tf.zeros([784, 10]))

    # variable: b represents a length-10 bias vector (one bias per digit class) that is updated during training 
    b = tf.Variable(tf.zeros([10]))

    # operation: y is the output of this network
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # placeholder: y_ receives the correct labels as one-hot vectors (one row per image)
    y_ = tf.placeholder(tf.float32, [None, 10])

    # operation: cross_entropy is the cross-entropy loss per image, averaged over the batch 
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

    # operation: one gradient descent step (learning rate 0.5) that updates W and b to reduce cross_entropy
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
    
    # operation: for each image, check whether the predicted digit matches the labeled digit 
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    
    # operation: accuracy is the fraction of images predicted correctly
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    
# a second, unrelated toy graph
g_2 = tf.Graph()
with g_2.as_default():
    # placeholder: accepts any number of rows with 2 columns each
    randomPlaceholder = tf.placeholder(tf.float32, [None, 2])
    
    # constant: fixed tensor 
    matrix1 = tf.constant([[1., 2.]])
    
    # constant: fixed tensor
    matrix2 = tf.constant([[3.], [4.]])

    # operation: product1 multiplies the two constants ([1, 2] x [2, 1] gives [1, 1])
    product1 = tf.matmul(matrix1, matrix2)
    
    # operation: product2 multiplies the fed rows by matrix2 ([None, 2] x [2, 1] gives [None, 1])
    product2 = tf.matmul(randomPlaceholder, matrix2)  

I can then create a TensorFlow session in a "with-as" statement. Every sess.run() call has to happen inside that block, because the session is closed when the block exits. I use a .run() call to execute the training step on small batches of data, and each call updates the weights with one step of SGD. The loss is printed for the current mini-batch only, so it is noisy from one printout to the next, but if the network trains well it should trend downward as shown below. The test accuracy of about 0.91 means that the network predicts the labeled digit for about 91% of the test images, which reflects real performance only if the labels themselves are correct.  
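
To see what the loss and accuracy numbers measure, here is a minimal plain-Python sketch of the same three steps the graph performs: softmax, cross-entropy averaged over a batch, and accuracy from the argmax. The scores and labels are made up, and there are only 3 classes to keep it readable. Unlike the graph above, this sketch subtracts the largest score before exponentiating and skips zero-valued label entries, both for numerical safety.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max does not change the result
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot_label):
    """-sum(label * log(prob)) for one example; zero label entries add nothing."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up scores (the x * W + b part) for a batch of 3 examples.
logits = [[2.0, 1.0, 0.1],
          [0.5, 2.5, 0.3],
          [1.2, 0.2, 0.4]]
# Made-up one-hot labels: the true classes are 0, 1 and 2.
labels = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]

probs = [softmax(row) for row in logits]
losses = [cross_entropy(p, t) for p, t in zip(probs, labels)]
mean_loss = sum(losses) / len(losses)

hits = [argmax(p) == argmax(t) for p, t in zip(probs, labels)]
batch_accuracy = sum(hits) / len(hits)

print("mean cross-entropy:", mean_loss)
print("accuracy:", batch_accuracy)  # 2 of the 3 predictions match their labels
```

The third example is the miss: its largest score is for class 0 while its label is class 2, so it lowers the accuracy and contributes the largest share of the loss.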

In [39]:
# Do session with the MNIST graph 
with tf.Session(graph = g_1) as sess1: 
    tf.global_variables_initializer().run()
    for step in range(1000):
        # Draw the next mini-batch of 32 training images and labels
        batch_xs, batch_ys = mnist.train.next_batch(32)
        # Run one SGD update (train_step) on this batch;
        # x and y_ are the placeholders that receive the images and labels
        sess1.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        # Every 100 steps, print the loss on the current batch.
        # The label says "epoch", but each iteration is one training step (one mini-batch).
        if step % 100 == 0:
            print('CE loss at epoch', step, ':', sess1.run(cross_entropy, feed_dict={x: batch_xs, y_: batch_ys}))
    # Use sess.run to call the accuracy operation, feeding in the test images and test labels this time
    print('MNIST g_1 graph accuracy:', sess1.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
tf.reset_default_graph()

CE loss at epoch 0 : 1.29822
CE loss at epoch 100 : 0.249976
CE loss at epoch 200 : 0.156431
CE loss at epoch 300 : 0.332993
CE loss at epoch 400 : 0.17977
CE loss at epoch 500 : 0.240799
CE loss at epoch 600 : 0.122391
CE loss at epoch 700 : 0.284209
CE loss at epoch 800 : 0.076709
CE loss at epoch 900 : 0.123379
MNIST g_1 graph accuracy: 0.912


In [40]:
# Do a separate session on the random graph 
with tf.Session(graph = g_2) as sess2:
    # g_2 has no variables, so this initializer does nothing here; kept for symmetry with sess1
    tf.global_variables_initializer().run()
    # Running operation product1
    print(sess2.run(product1))
    # Running operation product2, which requires an input
    print(sess2.run(product2, feed_dict = {randomPlaceholder: [[6., 7.]]}))
tf.reset_default_graph()

[[ 11.]]
[[ 46.]]
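
The two printed values can be checked by hand with a small matrix multiply in plain Python. The `matmul` helper below is a hypothetical stand-in written for this check, not TensorFlow's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

matrix1 = [[1.0, 2.0]]    # shape [1, 2], same values as the constant in g_2
matrix2 = [[3.0], [4.0]]  # shape [2, 1]
fed_rows = [[6.0, 7.0]]   # the value fed to randomPlaceholder

print(matmul(matrix1, matrix2))   # [[11.0]] because 1*3 + 2*4 = 11
print(matmul(fed_rows, matrix2))  # [[46.0]] because 6*3 + 7*4 = 46

# The placeholder's first dimension is None, so several rows can be fed at once.
print(matmul([[6.0, 7.0], [1.0, 0.0]], matrix2))  # [[46.0], [3.0]]
```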
