### Previous:   <a href="../Keras/keras_01.ipynb">Check out Keras 1.1 MNIST</a>

# <center> TensorFlow </center>
## <center>2.1 Structure</center>

# Explanation

# Example

Importing the MNIST dataset

In [None]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
import matplotlib.pyplot as plt
import numpy as np
import random as ran

The helper functions below control how many training and test examples we load from the data set.<br>
We also define functions for reshaping and displaying the data.<br>
You do not need to study these functions in detail.

In [None]:
def TRAIN_SIZE(num):
    """Return the first `num` training images and labels, printing their shapes."""
    print ('Total Training Images in Dataset = ' + str(mnist.train.images.shape))
    print ('--------------------------------------------------')
    x_train = mnist.train.images[:num,:]
    print ('x_train Examples Loaded = ' + str(x_train.shape))
    y_train = mnist.train.labels[:num,:]
    print ('y_train Examples Loaded = ' + str(y_train.shape))
    print('')
    return x_train, y_train

def TEST_SIZE(num):
    """Return the first `num` test images and labels, printing their shapes."""
    print ('Total Test Examples in Dataset = ' + str(mnist.test.images.shape))
    print ('--------------------------------------------------')
    x_test = mnist.test.images[:num,:]
    print ('x_test Examples Loaded = ' + str(x_test.shape))
    y_test = mnist.test.labels[:num,:]
    print ('y_test Examples Loaded = ' + str(y_test.shape))
    return x_test, y_test

def display_digit(num):
    """Show training example `num` as a 28x28 image (reads the global x_train and y_train)."""
    print(y_train[num])
    label = y_train[num].argmax(axis=0)
    image = x_train[num].reshape([28,28])
    plt.title('Example: %d  Label: %d' % (num, label))
    plt.imshow(image, cmap=plt.get_cmap('gray_r'))
    plt.show()

def display_mult_flat(start, stop):
    """Show training examples start..stop-1 in flattened form, one row of 784 pixels per example."""
    images = x_train[start].reshape([1,784])
    for i in range(start+1,stop):
        images = np.concatenate((images, x_train[i].reshape([1,784])))
    plt.imshow(images, cmap=plt.get_cmap('gray_r'))
    plt.show()

First, we choose how many training examples to load. Here we load all 55,000.

In [None]:
x_train, y_train = TRAIN_SIZE(55000)

So, what does this mean? In our data set, there are 55,000 training examples of handwritten digits from zero to nine.
Each example is a 28x28 pixel image flattened into an array of 784 values, one per pixel, giving that pixel’s intensity.
The images are flattened because the simple classifier used here treats each example as a single vector of pixel values rather than as a two-dimensional image.
The output shows that x_train holds 55,000 examples with 784 pixels each.
In other words, x_train is a matrix with 55,000 rows and 784 columns.
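
To make the flattening concrete, here is a minimal sketch in plain Python. It uses a made-up 3x3 "image" in place of a real 28x28 digit, and the helper names `flatten` and `unflatten` are ours, not part of TensorFlow or the MNIST loader.

```python
# A tiny 3x3 "image" stands in for a 28x28 MNIST digit (hypothetical data).
image = [
    [0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0],
]

def flatten(rows):
    """Concatenate the rows of a 2-D image into one flat list (row by row)."""
    return [pixel for row in rows for pixel in row]

def unflatten(flat, width):
    """Rebuild the 2-D image by cutting the flat list into rows of `width`."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]

flat = flatten(image)
print(len(flat))                    # 9 values, just as 28 * 28 gives 784
print(unflatten(flat, 3) == image)  # True: flattening loses no pixel values
```

Reshaping a row of x_train back to 28x28, as display_digit does, is the same idea in reverse.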

The y_train data holds the labels for the x_train examples.
Rather than storing each label as an integer, it is stored as a 1x10 binary array with a single 1 at the index of the digit.
This is known as one-hot encoding.
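
As a small illustration of one-hot encoding, the sketch below encodes a digit and decodes it again in plain Python. The function names `one_hot` and `decode` are our own and only serve the example.

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1.0 at index `digit`."""
    encoded = [0.0] * num_classes
    encoded[digit] = 1.0
    return encoded

def decode(encoded):
    """Recover the digit: the index of the largest entry (an argmax)."""
    return max(range(len(encoded)), key=lambda i: encoded[i])

label = one_hot(3)
print(label)          # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(label))  # 3
```

The decoding step is what `argmax` does in display_digit when it turns a label row back into a digit for the plot title.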

In [None]:
# randint includes both endpoints, so the upper bound must be the last valid index
display_digit(ran.randint(0, x_train.shape[0] - 1))

Here is what multiple training examples look like to the classifier in their flattened form. 
Of course, instead of pixels, our classifier sees values from zero to one representing pixel intensity.

In [None]:
display_mult_flat(0,500)

Until this point, we actually have not been using TensorFlow at all. <br>
The next step is importing TensorFlow and defining our session. 
TensorFlow, in a sense, creates a graph which you later feed with data and run in a session.

In [None]:
import tensorflow as tf
sess = tf.Session()

Next, we define a placeholder. <br>
A placeholder is a node in the graph that we feed data into when we run it.
The only requirement is that the data we feed must match the placeholder's shape and type. <br><br>Here, we define our x placeholder as the node to feed our x_train data into.

In [None]:
x = tf.placeholder(tf.float32, shape=[None, 784])

Using __None__ as the first dimension of the shape means the placeholder can be fed any number of examples.<br><br>
We then define y_, into which we will feed y_train. It is used later to compare the ground truths to our predictions.

In [None]:
y_ = tf.placeholder(tf.float32, shape=[None, 10])
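
The shapes `[None, 784]` and `[None, 10]` above can be read as "any number of rows, but exactly this many columns". The sketch below spells out that rule in plain Python. It is only an illustration of the idea, not how TensorFlow checks shapes internally, and `shape_matches` is a hypothetical helper of ours.

```python
def shape_matches(actual, spec):
    """Check a concrete shape against a spec in which None means 'any size'."""
    if len(actual) != len(spec):
        return False
    return all(want is None or want == got for got, want in zip(actual, spec))

x_spec = [None, 784]   # the same shape we gave the x placeholder

print(shape_matches((3, 784), x_spec))      # True: a batch of 3 examples
print(shape_matches((55000, 784), x_spec))  # True: the full training set
print(shape_matches((3, 28, 28), x_spec))   # False: the images are not flattened
print(shape_matches((3, 10), x_spec))       # False: wrong number of columns
```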

Next, we define the weights W, the bias b, and the prediction y. We will take a closer look at this part in the <a href="url">following notebook</a>.

In [None]:
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x,W) + b)

You cannot just print a TensorFlow graph object to get its values; you must evaluate it in a session and feed it data.
Before we can evaluate anything that depends on variables, we must initialize those variables in our session.<br>
Let’s feed our classifier three examples and see what it predicts.

In [None]:
x_train, y_train = TRAIN_SIZE(3)
sess.run(tf.global_variables_initializer())
#If using TensorFlow prior to 0.12 use:
#sess.run(tf.initialize_all_variables())
print(sess.run(y, feed_dict={x: x_train}))

So, here we can see our predictions for the first three training examples. Of course, our classifier knows nothing at this point: W and b are all zeros, so every class gets the same score, and the output is an equal 10% probability for each of the ten classes.

How did TensorFlow arrive at the probabilities?<br>
It did not learn them; it computed them by applying the softmax function to our results. Softmax takes a set of values and rescales them so that they sum to one, which lets us read them as probabilities. Every softmax value is greater than zero and less than one.
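
Here is a minimal sketch of softmax in plain Python, so you can check the 10% result by hand. The function below is our own illustration and not TensorFlow's implementation. Subtracting the maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math

def softmax(scores):
    """Exponentiate each score and divide by the total so the results sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# With W and b all zeros, every class gets the same raw score of 0.
uniform = softmax([0.0] * 10)
print(uniform[0])               # 0.1

# Unequal scores give unequal probabilities that still sum to 1.
probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))     # 1.0
print(probs.index(max(probs)))  # 0: the largest score gets the largest probability
```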

Next, we create our __cross_entropy function__, also known as the loss or cost function. It measures how well (or badly) we are classifying. <br>The higher the cost, the less accurate the predictions. It computes the loss by comparing the true values from y_train to our prediction y for each example. Our goal is to minimize the loss.

In [None]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
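
To get a feel for the numbers, the expression in the cell above can be mirrored in plain Python. This is a sketch for illustration only: the three-class predictions are made up, and `cross_entropy_py` is our own name.

```python
import math

def cross_entropy_py(y_true, y_pred):
    """Mean over examples of -sum(y_true * log(y_pred))."""
    losses = []
    for truth, pred in zip(y_true, y_pred):
        losses.append(-sum(t * math.log(p) for t, p in zip(truth, pred)))
    return sum(losses) / len(losses)

truth = [[0.0, 1.0, 0.0]]               # the correct class is index 1

confident_right = [[0.05, 0.90, 0.05]]
unsure = [[1 / 3, 1 / 3, 1 / 3]]
confident_wrong = [[0.90, 0.05, 0.05]]

print(round(cross_entropy_py(truth, confident_right), 4))  # 0.1054
print(round(cross_entropy_py(truth, unsure), 4))           # 1.0986
print(round(cross_entropy_py(truth, confident_wrong), 4))  # 2.9957
```

Because the labels are one-hot, only the probability assigned to the correct class contributes to the loss. With ten classes and a uniform 10% prediction, the loss is ln(10), roughly 2.303.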

Below is where we set the values used for training. Any value in all caps is meant to be changed and experimented with.<br>In fact, just try it out! <br>First, use these values, then notice what happens when you use too few training examples or a learning rate that is too high or too low.
<br>
If you set TRAIN_SIZE to a large number, be prepared to wait for a while. At any point, you can rerun all the code starting from here and try different values:

In [None]:
x_train, y_train = TRAIN_SIZE(5500)
x_test, y_test = TEST_SIZE(10000)
LEARNING_RATE = 0.1
TRAIN_STEPS = 2500

We now initialize all variables again, so that W and b are reset and training starts from fresh values.

In [None]:
init = tf.global_variables_initializer()
sess.run(init)

Now, we need to train our classifier using gradient descent. We first define our training operation and some operations for measuring accuracy. Each time training is run, it performs one step of gradient descent with the chosen LEARNING_RATE, adjusting W and b to reduce our loss.

In [None]:
training = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
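
To see what the learning rate does, here is a sketch of gradient descent on a deliberately simple problem: finding the w that minimizes (w - 3)^2. The function and the three rates are our own choices for illustration and have nothing to do with MNIST, but the update rule is the same idea the optimizer applies to W and b.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3)^2 by repeatedly stepping against its gradient."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)         # derivative of (w - 3)^2
        w = w - learning_rate * gradient   # one gradient-descent step
    return w

# The minimum is at w = 3.
for rate in (0.001, 0.1, 1.1):
    print(rate, gradient_descent(rate))
```

With a rate of 0.1 the result lands very close to 3. With 0.001 it has barely moved from the starting point after 50 steps, and with 1.1 every step overshoots by more than the last, so w ends up far away from 3.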

Now, we’ll define a loop that runs for steps 0 through __TRAIN_STEPS__.<br>
On each iteration, it runs training, feeding in values from x_train and y_train using feed_dict.<br>

Every 100 steps, it runs accuracy on the __unseen__ data in x_test, comparing the predictions y with y_test.
It is vitally important that the test data is unseen and was not used for training. If a teacher were to give students a practice exam and then use that same exam as the final, the result would be a very biased measure of the students’ knowledge.
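
The accuracy measure itself is simple: for each example, take the class with the highest predicted probability and check whether it matches the label. Here is a sketch in plain Python with three made-up predictions. The helper names are ours, chosen to echo `tf.argmax` and the `accuracy` node.

```python
def argmax(values):
    """Index of the largest value in one row."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_py(predictions, labels):
    """Fraction of rows whose predicted class equals the one-hot label's class."""
    hits = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(hits) / len(hits)

# Hypothetical predictions for three test examples with three classes each.
predictions = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

print(accuracy_py(predictions, labels))  # 2 of the 3 predictions are correct
```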

In [None]:
for i in range(TRAIN_STEPS+1):
    sess.run(training, feed_dict={x: x_train, y_: y_train})
    if i%100 == 0:
        acc = sess.run(accuracy, feed_dict={x: x_test, y_: y_test})
        loss = sess.run(cross_entropy, feed_dict={x: x_train, y_: y_train})
        print('Training Step:' + str(i) + '  Accuracy =  ' + str(acc) + '  Loss = ' + str(loss))

Reference: <br>
https://www.oreilly.com/learning/not-another-mnist-tutorial-with-tensorflow <br>
https://www.tensorflow.org/versions/master/images/softmax-regression-scalargraph.png <br>
https://www.tensorflow.org/versions/master/images/softmax-regression-vectorequation.png

# Feedback

### Next: <a href = "tensorflow_02.ipynb">2.2 Weights</a>