Purpose: Use logistic regression in TensorFlow to recognize handwritten digits from the MNIST database, such as the following:

<img src='./pic/MnistExamples.png'>

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data # import MNIST database
import time

The MNIST handwritten digit set contains 55,000 training images, 10,000 testing images, and 5,000 validation images, each 28x28 pixels.

In [2]:
learning_rate=0.01 
batch_size=128 # load 128 images per batch
n_epochs=30 # number of passes over the training set
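
To make these hyperparameters concrete, the following sketch works out how many full batches one pass over the training set contains, assuming the 55,000 training images stated above. The names `n_train`, `leftover`, and `total_updates` are illustrative and do not appear elsewhere in this notebook.

```python
# Sketch: how many full batches fit into one pass over the training set.
# n_train = 55000 is the training-set size quoted above (hard-coded here
# because the data has not been loaded yet at this point in the notebook).
n_train = 55000
batch_size = 128
n_epochs = 30

n_batches = n_train // batch_size             # full batches per epoch
leftover = n_train - n_batches * batch_size   # images outside the full batches
total_updates = n_batches * n_epochs          # gradient steps over the whole run

print(n_batches, leftover, total_updates)
```

With these values, each epoch runs 429 gradient steps and 88 images fall outside the full batches of a single pass.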

Step 1: Load data

In [4]:
mnist=input_data.read_data_sets('../MNIST-data', one_hot=True)

Extracting ../MNIST-data\train-images-idx3-ubyte.gz
Extracting ../MNIST-data\train-labels-idx1-ubyte.gz
Extracting ../MNIST-data\t10k-images-idx3-ubyte.gz
Extracting ../MNIST-data\t10k-labels-idx1-ubyte.gz


Step 2: Set up two placeholders, X and Y
- shape for X: [batch_size, 784]; shape for Y: [batch_size, 10]
- 784=28x28 (each image is 28x28; we flatten each image from 2 dimensions to 1)
- 10 means there are 10 labels in total (handwritten digits 0~9)

In [5]:
X=tf.placeholder(tf.float32, [batch_size, 784], name="X_placeholder")
Y=tf.placeholder(tf.int32, [batch_size, 10], name="Y_placeholder")
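
As a rough illustration of the shapes above, this NumPy sketch flattens a batch of 28x28 images into rows of 784 values and one-hot encodes integer labels into rows of 10. The random images and labels are made-up stand-ins, not MNIST data; `input_data.read_data_sets(..., one_hot=True)` already returns data in this layout, so the notebook itself does not need this step.

```python
import numpy as np

batch_size = 128
rng = np.random.default_rng(0)

# Stand-in data (not real MNIST): random "images" and integer labels 0..9.
images_2d = rng.random((batch_size, 28, 28)).astype(np.float32)
labels = rng.integers(0, 10, size=batch_size)

# Flatten each 28x28 image into a single row of 784 pixel values.
X_batch = images_2d.reshape(batch_size, 28 * 28)

# One-hot encode: row i has a 1 in column labels[i] and 0 elsewhere.
Y_batch = np.eye(10, dtype=np.float32)[labels]

print(X_batch.shape, Y_batch.shape)
```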

Step 3: Use Variables to express w and b (Y=Xw+b regression)
- w has shape [784, 10] -> Xw will be [batch_size, 784]x[784, 10] = [batch_size, 10]
- b has shape [1, 10], not [batch_size, 10]: the bias is one offset per class, shared by every example, and TensorFlow broadcasts it across the batch_size rows of Xw when adding
    - A [batch_size, 10] bias also runs and looked like it performed the same when I tried it, but it gives each position in the batch its own offsets, which has no meaning for the model and ties it to one batch size

In [6]:
w=tf.Variable(tf.random_normal(shape=[784,10], stddev=0.01), name='weights') #mean is 0.0 by default
b=tf.Variable(tf.zeros([1,10]), name='bias') #one offset per class, broadcast across the batch
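
The question of the bias shape can be checked with NumPy, whose broadcasting rules TensorFlow follows for this kind of addition. In this sketch (random stand-in data, not the notebook's variables), a `[1, 10]` bias is added to a `[batch_size, 10]` product: the single row is reused for every example, so all rows share the same 10 per-class offsets.

```python
import numpy as np

batch_size = 128
rng = np.random.default_rng(0)

X = rng.random((batch_size, 784))                    # stand-in inputs
w = rng.normal(0.0, 0.01, size=(784, 10))            # same shape as the weights
b = np.arange(10, dtype=np.float64).reshape(1, 10)   # one offset per class

Xw = X @ w          # shape (128, 10)
logits = Xw + b     # b is broadcast across the 128 rows

# Broadcasting is equivalent to stacking the same bias row 128 times.
b_tiled = np.tile(b, (batch_size, 1))
print(logits.shape, np.allclose(logits, Xw + b_tiled))
```

A `[batch_size, 10]` bias would instead learn separate offsets for each position in the batch, which says nothing about a new image and only works when the batch size stays fixed.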

Step 4: Define the linear layer (i.e. the relationship between the predicted Y and X, w, b -> Ypred=Xw+b)

In [7]:
logit=tf.matmul(X,w)+b
logit #please note the shape of the logit

<tf.Tensor 'add:0' shape=(128, 10) dtype=float32>

Step 5: Define the cross-entropy loss function
- 1. Apply softmax to the logits so that each row becomes a probability distribution (values between 0 and 1 that sum to 1), then compute the cross-entropy between those probabilities and the one-hot labels; softmax_cross_entropy_with_logits does both in one call
- 2. Calculate the mean of the loss across all examples in the batch

In [9]:
cross_entropy=tf.nn.softmax_cross_entropy_with_logits(logits=logit, labels=Y, name='loss') #step1 above
loss = tf.reduce_mean(cross_entropy) #step2 above
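
To make the two steps concrete, here is a NumPy sketch of what `tf.nn.softmax_cross_entropy_with_logits` followed by `tf.reduce_mean` computes mathematically. The tiny logits and labels are made up for illustration, and TensorFlow's own implementation may differ in numerical details.

```python
import numpy as np

# Made-up example: 2 examples, 3 classes (MNIST would have 10 classes).
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])   # one-hot rows

# Softmax: subtract the row max for numerical stability, exponentiate, normalize.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# Cross-entropy per example: -sum(label * log(prob)).
cross_entropy = -(labels * np.log(probs)).sum(axis=1)

# The mean over the batch gives the scalar loss.
loss = cross_entropy.mean()
print(probs.sum(axis=1), cross_entropy, loss)
```

Because the labels are one-hot, the per-example cross-entropy reduces to the negative log of the probability assigned to the correct class.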

Step 6: Define the training op that updates w and b to minimize the loss
- here we are using GradientDescentOptimizer

In [10]:
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
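
For intuition about what each run of this training op does, the NumPy sketch below performs a few plain gradient-descent updates for softmax regression on random stand-in data. It uses the standard analytic gradient of the mean softmax cross-entropy, `X.T @ (probs - Y) / n`; TensorFlow derives its gradients automatically, so this is an illustration rather than the library's code path. The helper name `softmax_loss_and_grads` is hypothetical.

```python
import numpy as np

def softmax_loss_and_grads(X, Y, w, b):
    """Return the mean cross-entropy loss and its gradients w.r.t. w and b."""
    logits = X @ w + b
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -(Y * np.log(probs)).sum(axis=1).mean()
    d_logits = (probs - Y) / X.shape[0]            # gradient of the mean loss
    grad_w = X.T @ d_logits                        # shape (784, 10)
    grad_b = d_logits.sum(axis=0, keepdims=True)   # shape (1, 10)
    return loss, grad_w, grad_b

rng = np.random.default_rng(0)
X = rng.random((128, 784))                         # stand-in inputs
Y = np.eye(10)[rng.integers(0, 10, size=128)]      # stand-in one-hot labels
w = rng.normal(0.0, 0.01, size=(784, 10))
b = np.zeros((1, 10))
learning_rate = 0.01

losses = []
for _ in range(5):
    loss, grad_w, grad_b = softmax_loss_and_grads(X, Y, w, b)
    losses.append(loss)
    w -= learning_rate * grad_w    # step against the gradient
    b -= learning_rate * grad_b

print(losses)
```

Each update moves w and b a small step in the direction that lowers the loss on the current batch, scaled by `learning_rate`.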

Step 7: Start the session

In [11]:
with tf.Session() as sess:
    #visualize using tensorboard
    writer = tf.summary.FileWriter('./TFExample2_Logistic_reg', sess.graph)
    start_time=time.time()
    sess.run(tf.global_variables_initializer())
    
    
    ######Training######
    n_batches=int(mnist.train.num_examples/batch_size) #calculate how many batches in total in MNIST training data set
    
    #train model n_epochs times
    for i in range(n_epochs): 
        total_loss=0 #initialize from 0
        
        for _ in range(n_batches):
            
            X_batch, Y_batch = mnist.train.next_batch(batch_size) #fetch data for X and Y for each training batch
            
            _, loss_batch = sess.run([optimizer, loss], feed_dict={X: X_batch, Y: Y_batch}) #loss_batch is the loss calculated from each training batch
            
            total_loss+=loss_batch
            
        avg_loss_per_epoch=total_loss/n_batches
        
        print ('Average loss for epoch {0} is {1}'.format(i, avg_loss_per_epoch))
        
    print ('Total running time: {0} seconds'.format(time.time()-start_time))
    
    
    ######Testing Model Accuracy######
    
    n_batches=int(mnist.test.num_examples/batch_size) #calculate how many batches in total in MNIST testing data set
    
    #define some variables (tensors) for testing
    Y_predicted = logit
    
    correct_preds_boolean = tf.equal(tf.argmax(Y_predicted,axis=1), tf.argmax(Y, axis=1)) #tf.equal(A,B) -> elementwise True where A==B, else False
        #reason to use argmax:
            #output labels 0~9 use one-hot encoding, i.e. [0,0,0,0,0,0,0,0,0,1] -> means output label is "9" (index=9)
                                                            #[1,0,0,0,0,0,0,0,0,0] -> means output label is "0" (index=0)
            #since argmax returns the index/position of the max value along the axis, it gives the corresponding label (0~9) for both the logits and the one-hot rows
            #note: each image has exactly one label, so there is exactly one "1" in each row of the one-hot encoding.
                
                
    count_of_correct_predictions=tf.reduce_sum(tf.cast(correct_preds_boolean, tf.float32))
    
    total_correct_preds=0 #initialize from 0
    
    for i in range(n_batches):
        X_batch, Y_batch = mnist.test.next_batch(batch_size) #fetch data for X and Y for each testing batch
        count_of_correct_predictions_per_batch = sess.run(count_of_correct_predictions, feed_dict={X: X_batch, Y: Y_batch})
        total_correct_preds+=count_of_correct_predictions_per_batch
        
    print ('Accuracy on testing sets is {0}'.format(total_correct_preds/(n_batches*batch_size))) #divide by the number of test images actually evaluated
    
    writer.close()
        
            
        

Average loss for epoch 0 is 1.290301343638858
Average loss for epoch 1 is 0.7351923608557606
Average loss for epoch 2 is 0.6028370971863086
Average loss for epoch 3 is 0.5398999777012494
Average loss for epoch 4 is 0.5009538739016561
Average loss for epoch 5 is 0.4734705094690923
Average loss for epoch 6 is 0.45539031717882844
Average loss for epoch 7 is 0.4390068252881368
Average loss for epoch 8 is 0.4262276328665949
Average loss for epoch 9 is 0.4171778878548762
Average loss for epoch 10 is 0.40778209147475536
Average loss for epoch 11 is 0.401035820826506
Average loss for epoch 12 is 0.39446703617250445
Average loss for epoch 13 is 0.38763466435712535
Average loss for epoch 14 is 0.3841316726032671
Average loss for epoch 15 is 0.3795027020441624
Average loss for epoch 16 is 0.37313466503486764
Average loss for epoch 17 is 0.3716307927470107
Average loss for epoch 18 is 0.3671290382638678
Average loss for epoch 19 is 0.36432908173207634
Average loss for epoch 20 is 0.360690665377047
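
The accuracy bookkeeping in the testing part of the cell above can be sketched in NumPy as follows. The scores and labels are made up for illustration; the point is that `argmax` along axis 1 turns both the logits and the one-hot labels into class indices, which are then compared elementwise and counted.

```python
import numpy as np

# Made-up logits for 4 examples and 3 classes.
logits = np.array([[0.1, 2.0, 0.3],
                   [1.5, 0.2, 0.1],
                   [0.3, 0.4, 0.9],
                   [2.2, 0.1, 0.5]])
# One-hot labels: the true classes are 1, 0, 2, 1.
labels = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

predicted = logits.argmax(axis=1)    # index of the highest score per row
actual = labels.argmax(axis=1)       # index of the 1 in each one-hot row

correct = predicted == actual        # boolean array, one entry per example
n_correct = int(correct.sum())       # True counts as 1
accuracy = n_correct / len(labels)   # divide by the number actually evaluated

print(predicted, actual, n_correct, accuracy)
```

Here three of the four predictions match, so the accuracy is 0.75.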

To view the graph and summary in TensorBoard:
- In the Anaconda Prompt, navigate to the Example2 folder, then run
> tensorboard --logdir=./