# Simple Linear Model using TensorFlow
by Teppei Suzuki

## Contents
* Introduction
* Setup
* Examining the Data
* Building the Graph
* Cost Function
* Optimization
* Performance Measure
* Training
* Summary



## Introduction

In this demonstration, we will build a simple linear model in TensorFlow and train it on the MNIST dataset of handwritten digit images to see how useful such a simple model can be.

## Setup

Let's first import TensorFlow along with some other useful libraries. Note that the general convention is to import TensorFlow as `tf`.

In [45]:
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

Let's now load the sample MNIST dataset.

In [8]:
from tensorflow.examples.tutorials.mnist import input_data
data = input_data.read_data_sets("data/MNIST/", one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting data/MNIST/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting data/MNIST/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting data/MNIST/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/MNIST/t10k-labels-idx1-ubyte.gz


## Examining the Data

Now that we have loaded the data, let's examine its contents to get a better understanding of what we are dealing with. The data consists of a training set, a test set, and a validation set.

The training set is used solely for training the model. After the training is over, the test set is used to measure the accuracy of the trained model. A validation set can be used during training to determine whether your model is overfitting the training dataset. Note that it is important to not use the test set for training.

In [47]:
print("Size of:")
print("- Training-set:\t\t{}".format(len(data.train.labels)))
print("- Test-set:\t\t{}".format(len(data.test.labels)))
print("- Validation-set:\t{}".format(len(data.validation.labels)))

Size of:
- Training-set:		55000
- Test-set:		10000
- Validation-set:	5000
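
The sizes above are consistent with holding back 5,000 of the 60,000 original MNIST training images as a validation set. A minimal sketch of such a hold-out split, using plain index arrays instead of the real images (the variable names are illustrative and not part of the `input_data` API):

```python
import numpy as np

num_train_total = 60000   # size of the original MNIST training set
validation_size = 5000    # held-out examples, matching the size printed above

# Stand-in for the real examples: one index per image.
indices = np.arange(num_train_total)

# Hold out one block for validation and train on the rest.
validation_idx = indices[:validation_size]
train_idx = indices[validation_size:]

print(len(train_idx), len(validation_idx))
```

The two index sets do not overlap, so the validation examples are never seen during training.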


Here we look at the first element of the training labels and training images. Notice that the label has length 10, one entry per digit class. The index that holds the 1 is the digit shown in the image, so the example printed below is labeled as a 0.


In [68]:
print(data.train.labels[0])
print(data.train.images[0])

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0

In [56]:
print(len(data.train.labels[0]))
print(len(data.train.images[0]))

10
784
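
The one-hot labels printed above can be converted to and from plain digit values. A small NumPy sketch of both directions, assuming a label vector like the one shown earlier:

```python
import numpy as np

num_classes = 10

# One-hot vector like the label printed above: a single 1 at the class index.
label = np.array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

# Decoding: the position of the 1 is the digit.
digit = int(np.argmax(label))
print(digit)

# Encoding: pick row `digit` of the identity matrix.
encoded = np.eye(num_classes)[digit]
print(encoded)
```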


## Building the Graph

Let's first set up the constant values.

Since the handwritten digit images are 28 pixels by 28 pixels, the flattened version of each image is a vector of length 784. The number of classes is 10 (the digits 0-9).

In [78]:
img_row, img_col = 28, 28
img_flat = img_row * img_col
num_classes = 10
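
To make the flattening concrete, here is a small NumPy sketch. The image is a hypothetical array whose pixel values are simply 0 to 783, so it is easy to see where each pixel ends up:

```python
import numpy as np

img_row, img_col = 28, 28

# Hypothetical image: pixel values 0..783 so positions are easy to track.
image_2d = np.arange(img_row * img_col, dtype=np.float32).reshape(img_row, img_col)

# Flatten row by row into a vector of length 784.
image_flat = image_2d.reshape(-1)
print(image_flat.shape)

# Reshaping back recovers the original 2-D layout, e.g. for plotting.
restored = image_flat.reshape(img_row, img_col)
```

Element 28 of the flat vector is the first pixel of the second row, since rows are laid out one after another.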

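We now create placeholders so that the graph can accept inputs: `x` for the flattened images, `y_one_hot` for the one-hot labels, and `y_true` for the integer class labels.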

In [79]:
x = tf.placeholder(tf.float32, [None, img_flat])
y_one_hot = tf.placeholder(tf.float32, [None, num_classes])
y_true = tf.placeholder(tf.int64, [None])

We can now create the variables that will be tuned during the training phase.

In [80]:
W = tf.Variable(tf.zeros([img_flat, num_classes]))
b = tf.Variable(tf.zeros([num_classes]))

Let's combine the placeholders and variables to create the linear model.

In [81]:
logits = tf.matmul(x, W) + b
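
The shapes involved in this matrix product can be checked with a NumPy sketch. The batch of 3 random "images" is a stand-in for real data, and the names mirror the graph above only for readability:

```python
import numpy as np

batch, img_flat, num_classes = 3, 784, 10

rng = np.random.default_rng(0)
x = rng.random((batch, img_flat))        # stand-in for a batch of flattened images
W = np.zeros((img_flat, num_classes))    # same zero initialisation as above
b = np.zeros(num_classes)

# (3, 784) @ (784, 10) -> (3, 10); b is broadcast across the rows.
logits = x @ W + b
print(logits.shape)
```

Each row of the result holds one score per class for the corresponding image. With zero weights and biases, every score starts at 0.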

Since we want the outputs to behave like probabilities, we apply the softmax function to the logits.

In [82]:
y_pred_one_hot = tf.nn.softmax(logits)
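
As an illustration of what softmax computes, here is a row-wise NumPy version. This is a sketch of the standard formula, not TensorFlow's implementation; subtracting the row maximum is a common trick to avoid overflow and does not change the result:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)

# Every row is positive and sums to 1, so it can be read as a distribution.
print(probs.sum(axis=1))
```

The second row shows that equal logits give equal probabilities for every class.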

We then take, for each row, the index of the largest value. That index is the predicted class.

In [83]:
y_pred = tf.argmax(y_pred_one_hot, axis=1)
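
Because softmax preserves the ordering of the values in each row, taking the argmax of the probabilities gives the same class as taking the argmax of the raw logits. A short NumPy check with made-up logits:

```python
import numpy as np

logits = np.array([[0.5, 2.0, -1.0],
                   [3.0, 0.1, 0.2]])

# Softmax, written inline for this example.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

# axis=1 searches across the columns of each row.
pred_from_probs = np.argmax(probs, axis=1)
pred_from_logits = np.argmax(logits, axis=1)
print(pred_from_probs, pred_from_logits)
```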

### Cost Function

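We now define the cost function: the cross-entropy between the predicted class distribution and the one-hot labels, averaged over all examples in the batch.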

In [90]:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_one_hot)

cost = tf.reduce_mean(cross_entropy)
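
To see what this cost measures, the same quantity can be sketched in NumPy. The helper `mean_cross_entropy` is a hypothetical name used only here. The example uses all-zero logits, which is what the model produces while `W` and `b` are still zero:

```python
import numpy as np

def mean_cross_entropy(logits, labels_one_hot):
    """Average of -sum(labels * log(softmax(logits))) over the batch."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -(labels_one_hot * log_probs).sum(axis=1)
    return per_example.mean()

# With W and b initialised to zero, every logit is 0 ...
logits = np.zeros((4, 10))
labels = np.eye(10)[[3, 1, 4, 1]]   # made-up one-hot labels

# ... so each class gets probability 1/10 and the cost is ln(10).
cost = mean_cross_entropy(logits, labels)
print(cost, np.log(10))
```

A cost near ln(10), roughly 2.3, therefore corresponds to guessing uniformly among the 10 classes; training should push it below that.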

### Optimization

In [91]:
# Assumed value; the cell that originally defined learning_rate is not shown.
learning_rate = 0.5
# We use gradient descent as our optimization method.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
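
TensorFlow derives the gradients automatically, but for this model a single update is simple enough to write by hand. The following NumPy sketch performs one gradient descent step on a tiny made-up problem (8 examples, 5 features, 3 classes). The data, sizes, and learning rate are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cost_fn(x, y, W, b):
    p = softmax(x @ W + b)
    return -np.mean(np.sum(y * np.log(p), axis=1))

rng = np.random.default_rng(0)
x = rng.random((8, 5))                       # hypothetical batch: 8 examples, 5 features
y = np.eye(3)[rng.integers(0, 3, size=8)]    # hypothetical one-hot labels, 3 classes
W, b = np.zeros((5, 3)), np.zeros(3)
learning_rate = 0.5                          # illustrative value

cost_before = cost_fn(x, y, W, b)

# Gradient of the mean cross-entropy with respect to the logits.
grad_logits = (softmax(x @ W + b) - y) / len(x)

# One gradient descent update: move against the gradient.
W -= learning_rate * (x.T @ grad_logits)
b -= learning_rate * grad_logits.sum(axis=0)

cost_after = cost_fn(x, y, W, b)
print(cost_before, cost_after)
```

The cost after the update is lower than before, which is the behaviour the optimizer repeats on every training iteration.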

### Performance Measure

In [92]:
correct_prediction = tf.equal(y_pred, y_true)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
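
The same accuracy calculation can be illustrated in NumPy with a handful of made-up predictions:

```python
import numpy as np

y_pred = np.array([7, 2, 1, 0, 4])   # hypothetical predicted classes
y_true = np.array([7, 2, 1, 0, 9])   # hypothetical true classes

# Element-wise comparison gives booleans; their mean is the fraction correct.
correct_prediction = (y_pred == y_true)
accuracy = correct_prediction.astype(np.float32).mean()
print(accuracy)
```

Four of the five predictions match, so the accuracy here is 0.8.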

### Training

We are now ready to start training our model!

In [93]:
session = tf.Session()
init = tf.global_variables_initializer()
session.run(init)

In [122]:
num_iterations = 100000
batch_size = 100

In [123]:
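for i in range(num_iterations):
    # Fetch the next mini-batch of images and one-hot labels.
    x_batch, y_true_batch = data.train.next_batch(batch_size)

    # Only the placeholders the optimizer depends on need to be fed.
    feed_dict_train = {x: x_batch,
                       y_one_hot: y_true_batch}

    # Run one gradient descent step on this batch.
    session.run(optimizer, feed_dict=feed_dict_train)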
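
Each iteration of the loop above trains on a small batch rather than the whole training set. The sketch below shows one simple way to draw such a batch in NumPy. `sample_batch` is a hypothetical helper and a simplified stand-in, not how `next_batch` is implemented, and the arrays are random placeholders for the real data:

```python
import numpy as np

def sample_batch(images, labels, batch_size, rng):
    """Return a random mini-batch of matching images and labels."""
    idx = rng.choice(len(images), size=batch_size, replace=False)
    return images[idx], labels[idx]

rng = np.random.default_rng(0)
images = rng.random((1000, 784))                      # hypothetical flattened images
labels = np.eye(10)[rng.integers(0, 10, size=1000)]   # hypothetical one-hot labels

x_batch, y_batch = sample_batch(images, labels, batch_size=100, rng=rng)
print(x_batch.shape, y_batch.shape)
```

Using the same index array for both inputs keeps every image paired with its own label.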

In [120]:
# Convert the one-hot test labels to integer class indices.
data.test.cls = np.argmax(data.test.labels, axis=1)

feed_dict_test = {x: data.test.images,
                  y_one_hot: data.test.labels,
                  y_true: data.test.cls}

In [124]:
# Use TensorFlow to compute the accuracy.
acc = session.run(accuracy, feed_dict=feed_dict_test)
    
print("Accuracy on test-set: {0:.1%}".format(acc))

Accuracy on test-set: 92.4%
