## The usual import statements

In [1]:
import tensorflow as tf
# Note: this tutorials module ships with TensorFlow 1.x only; it was removed in TensorFlow 2.
from tensorflow.examples.tutorials.mnist import input_data
import math

## Defining the hyperparameters and other important constants

In [2]:
# The MNIST dataset has 10 classes, representing the digits 0 through 9.
NUM_CLASSES = 10

# The MNIST images are always 28x28 pixels.
IMAGE_SIZE = 28

# Each 28x28 image is flattened into a vector of 784 features, one feature per pixel.
IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE

# Number of training examples fed to the graph in each batch
BATCH_SIZE = 100

# Defining the number of units in hidden layers
HIDDEN_LAYER_1 = 50
HIDDEN_LAYER_2 = 20
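
To make the flattening concrete, here is a minimal sketch in plain Python. It assumes row-major order (rows laid end to end), and the helper name `flat_index` is hypothetical, introduced only for this illustration.

```python
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE

# A stand-in 28x28 "image" whose pixel value records its own (row, col) position.
image = [[(row, col) for col in range(IMAGE_SIZE)] for row in range(IMAGE_SIZE)]

# Flatten row by row into a single list of 784 features.
flat_image = [pixel for row in image for pixel in row]


def flat_index(row, col):
    """Hypothetical helper: position of pixel (row, col) in the flattened vector."""
    return row * IMAGE_SIZE + col


print(len(flat_image))               # 784
print(flat_image[flat_index(3, 5)])  # (3, 5)
```

Each of the 784 positions becomes one input feature of the network.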

## Bringing the MNIST dataset into our program

TensorFlow provides helpers for loading the classic datasets used to benchmark different areas of AI. We'll use one of them to load the MNIST dataset.

In [3]:
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
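
As a small illustration of what `one_hot=True` means for the labels, the sketch below encodes a digit as a 10-element vector. The helper `one_hot` is a hypothetical name written for this example; it is not part of the MNIST loader.

```python
NUM_CLASSES = 10


def one_hot(digit, num_classes=NUM_CLASSES):
    """Return a list with 1.0 at index `digit` and 0.0 everywhere else."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in [0, num_classes)")
    return [1.0 if index == digit else 0.0 for index in range(num_classes)]


label = one_hot(3)
print(label)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# The original digit is recovered as the index of the largest entry.
print(label.index(max(label)))  # 3
```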


To use the finer controls in TF, we need to understand the mechanism that runs under the hood. In the last demo, TF constructed an efficient computational graph and then performed all of its calculations on that graph. So, essentially, the TF mechanism follows two steps:

1. Building a computational graph
2. Running the computational graph

Let's take these two distinct but intertwined processes, graph construction and computation, one at a time.
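
The two-step idea can be sketched in plain Python. This is a toy illustration and not how TF is implemented: the classes `Placeholder`, `Add`, `Multiply` and the function `run` are hypothetical names invented for this example.

```python
class Placeholder:
    """A graph node whose value is supplied only when the graph is run."""

    def __init__(self, name):
        self.name = name


class Add:
    """A graph node that adds the values of two other nodes."""

    def __init__(self, left, right):
        self.left, self.right = left, right


class Multiply:
    """A graph node that multiplies the values of two other nodes."""

    def __init__(self, left, right):
        self.left, self.right = left, right


def run(node, feed):
    """Step 2: evaluate `node`, reading placeholder values from `feed`."""
    if isinstance(node, Placeholder):
        return feed[node]
    if isinstance(node, Add):
        return run(node.left, feed) + run(node.right, feed)
    if isinstance(node, Multiply):
        return run(node.left, feed) * run(node.right, feed)
    return node  # plain Python numbers act as constants


# Step 1: build the graph. Nothing is computed here.
x = Placeholder("x")
y = Placeholder("y")
result = Add(Multiply(x, 3), y)

# Step 2: run the same graph with different inputs.
print(run(result, {x: 2, y: 1}))   # 7
print(run(result, {x: 10, y: 5}))  # 35
```

The graph is described once and can then be evaluated as many times as needed, each time with fresh inputs.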

## Building the computational graph
Recall the three steps we discussed earlier. We'll build the computational graph to carry out these three tasks, which are needed to make any ML model (as shown in the image below).
![The three abstract steps to create any ML model](./media/Steps.jpg)

Before that, let's create a way to feed inputs into our computational graph. Recall that TF calls these nodes placeholders.

In [4]:
# Allocate nodes in the computational graph that accept inputs.
# The leading None dimension lets the batch size vary from one run to the next.

images = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS))
true_labels = tf.placeholder(tf.int32, shape=(None, NUM_CLASSES))

##### Now we define the portion of the graph to do the INFERENCE

![Portion of the graph to do the INFERENCE](./media/Infer.jpeg)

In [5]:
# Defining the first hidden layer
weights_layer_1 = tf.Variable(tf.truncated_normal([IMAGE_PIXELS, HIDDEN_LAYER_1],
                      stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))))

biases_layer_1 = tf.Variable(tf.zeros([HIDDEN_LAYER_1]))

hidden_output_1 = tf.nn.relu(tf.matmul(images, weights_layer_1) + biases_layer_1)

# Defining the second hidden layer
weights_layer_2 = tf.Variable(tf.truncated_normal([HIDDEN_LAYER_1, HIDDEN_LAYER_2],
                      stddev=1.0 / math.sqrt(float(HIDDEN_LAYER_1))))

biases_layer_2 = tf.Variable(tf.zeros([HIDDEN_LAYER_2]))

hidden_output_2 = tf.nn.relu(tf.matmul(hidden_output_1, weights_layer_2) + biases_layer_2)

# Defining the output layer. It produces raw scores (logits); no softmax is applied here.
weights_output = tf.Variable(tf.truncated_normal([HIDDEN_LAYER_2, NUM_CLASSES],
                      stddev=1.0 / math.sqrt(float(HIDDEN_LAYER_2))))

biases_output = tf.Variable(tf.zeros([NUM_CLASSES]))

prediction = tf.matmul(hidden_output_2, weights_output) + biases_output
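
To check the shapes flowing through these layers, here is a rough NumPy sketch of the same forward pass. It is an approximation under stated assumptions: ordinary normal samples stand in for `tf.truncated_normal`, random numbers stand in for a batch of images, and the helper `dense_params` is a hypothetical name.

```python
import math

import numpy as np

IMAGE_PIXELS, HIDDEN_LAYER_1, HIDDEN_LAYER_2, NUM_CLASSES = 784, 50, 20, 10
BATCH_SIZE = 100

rng = np.random.default_rng(0)


def dense_params(fan_in, fan_out):
    """Hypothetical helper: weights with stddev 1/sqrt(fan_in) and zero biases.

    Assumption: an ordinary normal distribution is used here, whereas the
    TF code samples from a truncated normal.
    """
    weights = rng.normal(0.0, 1.0 / math.sqrt(fan_in), size=(fan_in, fan_out))
    biases = np.zeros(fan_out)
    return weights, biases


w1, b1 = dense_params(IMAGE_PIXELS, HIDDEN_LAYER_1)
w2, b2 = dense_params(HIDDEN_LAYER_1, HIDDEN_LAYER_2)
w3, b3 = dense_params(HIDDEN_LAYER_2, NUM_CLASSES)

# Stand-in batch: random values in [0, 1) instead of real MNIST pixels.
images = rng.random((BATCH_SIZE, IMAGE_PIXELS))

hidden_1 = np.maximum(images @ w1 + b1, 0.0)    # ReLU, shape (100, 50)
hidden_2 = np.maximum(hidden_1 @ w2 + b2, 0.0)  # ReLU, shape (100, 20)
logits = hidden_2 @ w3 + b3                     # raw scores, shape (100, 10)

num_parameters = sum(p.size for p in (w1, b1, w2, b2, w3, b3))
print(logits.shape)    # (100, 10)
print(num_parameters)  # 40480
```

Every image in the batch ends up with ten raw scores, one per digit, and the whole network has 40,480 trainable values.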

##### Now we define the portion of the graph to determine how good our current model is or what the LOSS is

![Portion of the graph to determine how good our current model is or what the LOSS is](./media/Loss.jpeg)

In [6]:
# Mean softmax cross-entropy between the true labels and the predicted logits
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=true_labels, logits=prediction))
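
For a single example, the quantity averaged by this loss can be worked out by hand. The sketch below is a plain-Python illustration (the function names are hypothetical) of softmax followed by cross-entropy against a one-hot label. It also shows that ten equal logits give a loss of ln(10), roughly 2.303, which is about what to expect from a 10-class model that has not learned anything yet.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(one_hot_label, logits):
    """Negative log-probability assigned to the true class."""
    probabilities = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probabilities))


label = [0.0] * 10
label[3] = 1.0  # the true digit is 3

uniform_logits = [0.0] * 10   # the model has no preference
confident_logits = [0.0] * 10
confident_logits[3] = 5.0     # the model favours the correct digit

print(cross_entropy(label, uniform_logits))    # ln(10), about 2.3026
print(cross_entropy(label, confident_logits))  # much smaller
```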

##### Now we define the portion of the graph to UPDATE the parameters so that the future predictions are better than the current one

![Portion of the graph to UPDATE the parameters so that the future predictions are better than the current one](./media/Optimize.jpeg)

In [7]:
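# Updating the parameters: each run of `training` takes one gradient descent step
# (learning rate 0.05) that lowers the loss.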
optimizer = tf.train.GradientDescentOptimizer(0.05)
training = optimizer.minimize(loss)
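
The rule behind plain gradient descent is `parameter = parameter - learning_rate * gradient`. As a minimal sketch, the example below applies that rule by hand to a one-parameter loss, `(w - 3)^2`, chosen purely for illustration; in the graph above, TF derives the gradients automatically.

```python
LEARNING_RATE = 0.05


def loss_fn(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0
history = [loss_fn(w)]
for _ in range(50):
    w = w - LEARNING_RATE * gradient(w)  # one gradient descent step
    history.append(loss_fn(w))

print(history[0])   # 9.0
print(history[-1])  # far smaller than the starting loss
print(w)            # close to 3
```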

## Doing the computations on the computational graph
Once we have created our efficient computational graph, it's time for us to pass in the inputs and run the three abstract steps mentioned above.  
But before that, we'll create a session, which serves as a handle for accessing the computational graph. We'll also create the op that initializes the variables declared in our computational graph.

In [8]:
# Session/Interactive Session: Captures the control and state of the computational graph.
# It is the connection to the backend computational graph.
sess = tf.Session()
init = tf.global_variables_initializer()

##### The following lines of code run those three abstract steps. Notice that the loss tends to decrease as the number of training steps increases.

In [9]:
sess.run(init)

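for train_step in range(2000):
    batch = mnist.train.next_batch(BATCH_SIZE)
    # Note: `training` is run only inside this `if`, so the parameters are updated
    # on every 100th step only (20 updates in total), which is why the loss printed
    # by this loop falls so slowly. To update on every step, run `training` outside the `if`.
    if train_step % 100 == 0:
        loss_value, _ = sess.run([loss, training], {images: batch[0], true_labels: batch[1]})
        print('Loss at ', str(train_step), ' training step is ', str(loss_value))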

Loss at  0  training step is  2.31667
Loss at  100  training step is  2.29753
Loss at  200  training step is  2.29272
Loss at  300  training step is  2.3062
Loss at  400  training step is  2.29006
Loss at  500  training step is  2.27786
Loss at  600  training step is  2.28041
Loss at  700  training step is  2.27704
Loss at  800  training step is  2.25153
Loss at  900  training step is  2.27913
Loss at  1000  training step is  2.24235
Loss at  1100  training step is  2.24778
Loss at  1200  training step is  2.26063
Loss at  1300  training step is  2.25246
Loss at  1400  training step is  2.23242
Loss at  1500  training step is  2.24243
Loss at  1600  training step is  2.22911
Loss at  1700  training step is  2.22286
Loss at  1800  training step is  2.19076
Loss at  1900  training step is  2.20274
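
One detail of the loop above is worth isolating: the `training` op sits inside the `if`, so the parameters change only on the steps that are logged. The sketch below uses a toy one-parameter problem (all names are hypothetical, and it is not the MNIST model) to compare that pattern with one that updates on every step and gates only the logging.

```python
LEARNING_RATE = 0.05
LOG_EVERY = 100


def loss_fn(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def train(num_steps, update_only_when_logging):
    """Run gradient descent on the toy loss and return (w, logged losses)."""
    w = 0.0
    logged = []
    for step in range(num_steps):
        is_log_step = step % LOG_EVERY == 0
        if is_log_step:
            logged.append(loss_fn(w))
        if is_log_step or not update_only_when_logging:
            w -= LEARNING_RATE * 2.0 * (w - 3.0)  # gradient of (w - 3)^2
    return w, logged


w_gated, _ = train(2000, update_only_when_logging=True)
w_every, _ = train(2000, update_only_when_logging=False)

print(abs(w_gated - 3.0))  # still noticeably off after only 20 updates
print(abs(w_every - 3.0))  # essentially zero after 2000 updates
```

The same change applies to the TF loop: running `training` on every iteration and fetching `loss` only on logging steps gives the optimizer all 2000 updates.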


## Finally, we test how good our final model is by checking the loss on previously unseen data

In [10]:
print(sess.run([loss], {images: mnist.test.images, true_labels: mnist.test.labels}))

[2.201046]
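
One way to read this number: the mean cross-entropy is the average negative log-probability given to the correct digit, so `exp(-loss)` is the geometric mean of those probabilities. The short calculation below, a sketch that uses only the test loss printed above, compares it with the 1/10 that uniform guessing would give.

```python
import math

NUM_CLASSES = 10
test_loss = 2.201046  # value printed by the cell above

# Geometric-mean probability assigned to the correct class.
typical_probability = math.exp(-test_loss)

# Loss of a model that spreads probability evenly over all classes.
chance_loss = math.log(NUM_CLASSES)

print(round(typical_probability, 4))  # 0.1107
print(round(chance_loss, 4))          # 2.3026
```

A test loss only slightly below ln(10) means the model is still close to uniform guessing at this point.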


# Voila!!
## We did it. We got through this. We made it. Yay!! Woo Hoo!!