# This is a quick tutorial on Tensorflow based on Stanford CS 20SI

Now it's time for us to train a machine learning model

<font color = green> **This section will be a basebone for orther project (seriously, deep learning models are only different on model, cost function, optimizations, oops... (too much))**  <font>

In [2]:
'''The first model will be simple Softmax regression model'''
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data #These one, well, just remember them... or copy from tensorflow website
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


<font color = purple>**MINI TUTORIAL: VARIABLE SCOPE, NAME SCOPE IN TENSORFLOW** </font>

Assume that we will be able to implement a model y = f(x) where y is our predicted output and x is our features. I take code sample from tensorflow so that we can have a better intuition. 

In [3]:
''' This is a sample code y = f(x) where x is an input images. 
---------------------------------------------------------------------------------

def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),      <<<<<<<<<< VARIABLES!
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")    <<<<<<<<<< VARIABLES!
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)

    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),       <<<<<<<<<< VARIABLES!
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")     <<<<<<<<<< VARIABLES!
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
-----------------------------------------------------------------------------------
'''
print 'This is for demo only'

This is for demo only


Assume that we have 2 images, and by sometask, for example, we might want to compare 2 images after being filtered. Thus we use: <br>

 - result1 = my_image_filter(x1) , result2 = my_image_filter(x2) <br>
 
<font color = green>**This will cause 1 problem, tensorflow will create 2 graph, with total 8 variables. Imagine in our deep model, each variables is a vector with a pretty large size, creating 2 graph will make us suffer from memory problem**. </font> One solution we might easily come up with is creating variables separately from the function, but this will cause encapsulation problem (just google the term if you are not familiar with it). In this tutorial, we will use **'scope'**, which is defined in Tensorflow for dealing with this problem. For more information, check this link: https://www.tensorflow.org/versions/r0.12/how_tos/variable_scope/#variable_scope_example   <br>

*** For whatever reasons, the first step is to define the dimensions of input and output***

In [4]:
print 'Number of training examples: ', mnist.train.num_examples
print 'Dimension of each image (should be 784 1d array)',len(mnist.train.images[0])
print 'Number of classes: ',len(mnist.train.labels[0])

Number of training examples:  55000
Dimension of each image (should be 784 1d array) 784
Number of classes:  10


In [5]:
#Remember, for feeding inputs, we use placeholders, in this example x: input, y:output
with tf.name_scope('input_data'):
    x = tf.placeholder(tf.float32, shape = [None , 784]) #None means that it can be any arbitary
    y = tf.placeholder(tf.float32, shape = [None, 10])   #Why? We will feed batch into our model

In [6]:
#For softmax, define a weight matrix (and bias) first
with tf.name_scope('Weights'):                          
    W = tf.get_variable("weights", [784,10],initializer=tf.random_normal_initializer())
    b = tf.get_variable("bias", [1,10],initializer=tf.random_normal_initializer())
#Define our softmax output
with tf.name_scope('model'):    
    y_pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

**Basically** <br>

x ----> W,b ----> y_pred ---> Compute cost function ---> backprob (kind of...) --> Update W,b then feed again      

In [7]:
with tf.name_scope('Cost_function'):
    cost = tf.losses.softmax_cross_entropy(y, y_pred)

In [8]:
with tf.name_scope('Optimizer'):
    optimizer = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)

** Time for training our model **
First, let see what will happen when we train our model
- We feed batch of inputs into our model, use optimizer to minimiz the cose
- We need to see the cost function (loss) reduce in order to make sure everything is good.
- <font color = red> It is always necessary to save our model! </font>
- Since mnist dataset we used has a function that generate batch for us, in other kind of data, we need to write our own batch generator function (it is easy by the way). <br>

Now its time for us to train the model and see each corresponding steps

In [10]:
BATCH_SIZE = 128 #Define a batch size, it should be at the head of the program, not this cell,anyway
NUM_EPO = 10
SKIP_STEP = 10 # How many times we want to see our update, this means we want to see update every 10 batchs
total_cost = 0
init = tf.global_variables_initializer() 
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init)     #Init variables
    
    '''Start to init model'''
    for index in range(BATCH_SIZE*NUM_EPO):
        train_x, train_y = mnist.train.next_batch(BATCH_SIZE)
        _, loss_batch = sess.run([optimizer, cost], feed_dict={x: train_x, y:train_y}) #Doing optimization part
        total_cost += loss_batch   # See the comment in the next 3 lines
        if (index + 1) % SKIP_STEP == 0:
            print('Average loss at step {}: {:5.1f}'.format(index + 1, total_cost / SKIP_STEP))
            total_cost = 0.0       # Reset for next average
    '''Finish training'''
    
    
    # test the model
    n_batches = int(mnist.test.num_examples/BATCH_SIZE)
    total_correct_preds = 0
    for i in range(n_batches):
        X_batch, Y_batch = mnist.test.next_batch(BATCH_SIZE)
        loss_batch, logits_batch = sess.run([cost, y_pred], 
                                        feed_dict={x: X_batch, y:Y_batch}) 
        preds = tf.nn.softmax(logits_batch)
        
        correct_preds = tf.equal(tf.argmax(preds, 1), tf.argmax(Y_batch, 1))  # Count correct classification 
        
        accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))
        total_correct_preds += sess.run(accuracy)   
    
    print("Accuracy {0}".format(total_correct_preds/mnist.test.num_examples))
    

Average loss at step 10:   2.2
Average loss at step 20:   2.0
Average loss at step 30:   2.0
Average loss at step 40:   1.9
Average loss at step 50:   1.9
Average loss at step 60:   1.9
Average loss at step 70:   1.9
Average loss at step 80:   1.9
Average loss at step 90:   1.9
Average loss at step 100:   1.9
Average loss at step 110:   1.9
Average loss at step 120:   1.9
Average loss at step 130:   1.9
Average loss at step 140:   1.9
Average loss at step 150:   1.9
Average loss at step 160:   1.9
Average loss at step 170:   1.9
Average loss at step 180:   1.9
Average loss at step 190:   1.9
Average loss at step 200:   1.9
Average loss at step 210:   1.9
Average loss at step 220:   1.9
Average loss at step 230:   1.9
Average loss at step 240:   1.9
Average loss at step 250:   1.9
Average loss at step 260:   1.9
Average loss at step 270:   1.9
Average loss at step 280:   1.9
Average loss at step 290:   1.9
Average loss at step 300:   1.9
Average loss at step 310:   1.9
Average loss at s