# Training a model in TF

This notebook contains an example of how to specify and train a model in TensorFlow.

1. Build the model
2. Specify a cost/loss function and an optimizer (in this case SGD)
3. Launch a session to do the actual training

[Bonus] Define the variables you want to track during learning, and see how you can use TensorBoard to monitor your training.

This is starter code, so feel free to play around with it.

In [1]:
# for Python 2/3 compatibility
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [2]:
import numpy as np
import tensorflow as tf

In [3]:
from tensorflow.contrib.learn.python.learn import datasets
dataset = datasets.load_iris()

# split into input data (x) and ground-truth (GT) labels (y)
data_x = dataset.data
data_y = dataset.target
print(data_x.shape)
print(data_y.shape)

(150, 4)
(150,)


Next, we split the data into training and test sets. We shuffle the data first, because the dataset is ordered by label.

In [4]:
new_indices = np.random.permutation(data_y.shape[0])

train_x = data_x[new_indices[:125],:]
train_y = data_y[new_indices[:125]]

test_x = data_x[new_indices[125:],:]
test_y = data_y[new_indices[125:]]
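
To see why the shuffle matters, here is a minimal sketch using a stand-in label vector. It assumes 150 labels stored class by class, 50 per class, to mirror the ordering described above. The names `labels`, `rng`, and `perm` are illustrative and not part of the notebook.

```python
import numpy as np

# Stand-in for data_y (assumption): 150 labels ordered by class, 50 per class.
labels = np.repeat([0, 1, 2], 50)

# Without shuffling, the last 25 examples all belong to a single class.
unshuffled_test = labels[125:]
print(np.unique(unshuffled_test))  # [2]

# With shuffling, the held-out examples are drawn from the whole dataset.
rng = np.random.RandomState(0)  # fixed seed so the split is reproducible
perm = rng.permutation(labels.shape[0])
train_labels = labels[perm[:125]]
test_labels = labels[perm[125:]]
print(train_labels.shape, test_labels.shape)  # (125,) (25,)
```

Fixing the seed is optional. It only makes the split repeatable between runs.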

## Step 1: Building the computation graph

In [5]:
# placeholders for the input x (4 attributes) and one ground-truth label per example
x    = tf.placeholder("float", shape=[None, 4])
y_GT = tf.placeholder("int64", shape=[None, ])

# hidden layer parameters
n_hidden = 100
W_h = tf.Variable(0.1*tf.random_normal([4, n_hidden]), name="W_h")
b_h = tf.Variable(tf.random_normal([n_hidden]), name="b_h")
hidden_layer = tf.matmul(x, W_h) + b_h

# output layer parameters
W = tf.Variable(0.1*tf.random_normal([n_hidden, 3]), name="W")
b = tf.Variable(tf.zeros([3]), name="b")

# putting the model together
z = tf.matmul(hidden_layer, W) + b
y = tf.nn.softmax(z)

In [6]:
print(y)

Tensor("Softmax:0", shape=(?, 3), dtype=float32)


## Step 2: Specify loss and training/optimization

In [7]:
# one-hot encoding of the class labels
y_GT_one_hot  = tf.one_hot(y_GT, depth=3)

# cross-entropy loss
cross_entropy = -tf.reduce_sum(y_GT_one_hot * tf.log(y+1e-10))

# alternative implementation of cross-entropy in TF (numerically more stable)
#cross_entropy = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_GT_one_hot, logits=z))

# define optimizer
opt = tf.train.GradientDescentOptimizer(0.001)
train_op = opt.minimize(cross_entropy)
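
The difference between the two loss formulations can be illustrated outside TensorFlow. The NumPy sketch below is not the notebook's implementation. The helper names and the example logits are hypothetical. It computes the summed cross-entropy once from softmax probabilities with a small epsilon, as in the cell above, and once directly from the logits using the log-sum-exp trick.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating to avoid overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_from_probs(probs, labels, eps=1e-10):
    # Same form as the loss above: -sum(one_hot * log(y + eps)).
    one_hot = np.eye(probs.shape[1])[labels]
    return -np.sum(one_hot * np.log(probs + eps))

def cross_entropy_from_logits(z, labels):
    # Log-softmax computed directly from the logits (log-sum-exp trick),
    # so no epsilon is needed.
    z = z - z.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.sum(log_probs[np.arange(len(labels)), labels])

# Hypothetical logits for 3 examples and 3 classes.
z = np.array([[2.0, 1.0, 0.1],
              [0.5, 2.5, 0.3],
              [0.2, 0.2, 3.0]])
labels = np.array([0, 1, 2])

loss_probs = cross_entropy_from_probs(softmax(z), labels)
loss_logits = cross_entropy_from_logits(z, labels)
print(loss_probs, loss_logits)  # the two values agree closely
```

For moderate logits the two versions agree. For extreme logits they can diverge, because the epsilon caps how large the per-example loss can get.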

In [8]:
cross_entropy

<tf.Tensor 'Neg:0' shape=() dtype=float32>

## Step 3: Launch a session

In [9]:
# create operation to initialize all variables
init_op = tf.initialize_all_variables()

In [10]:
# launch session
with tf.Session() as sess:
    sess.run(init_op)
    
    print(W)
    # print(sess.run(W))  # or: print(W.eval())

<tensorflow.python.ops.variables.Variable object at 0x7fbc2ca1dd90>


In [11]:
# one step of training
with tf.Session() as sess:
    
    sess.run(init_op)
    
    # one step training -- repeat this to fully optimize
    sess.run(train_op, feed_dict={x: data_x, y_GT: data_y})
    
    xen = sess.run(cross_entropy, feed_dict={x: data_x, y_GT: data_y})
    print(xen)
    

1342.77


### Adding summary variables that we want to track during training

We will track the cross-entropy and the accuracy. The log files will be saved in a separate folder, **tf_logs**.

To visualize the plots/learning curves, go to your terminal and execute:

```
tensorboard --logdir="tf_logs"
```

Then open your browser (preferably Chrome) and navigate to http://127.0.1.1:6006.


In [12]:
# define the accuracy 
accuracy = tf.reduce_mean(tf.cast(tf.equal(y_GT, tf.argmax(y, 1)), tf.float32))
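
The accuracy expression above compares the most likely class per example with the ground-truth label and averages the matches. A minimal NumPy sketch of the same computation, using made-up probabilities and labels (all values here are hypothetical):

```python
import numpy as np

# Hypothetical predicted class probabilities for 4 examples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
labels = np.array([0, 1, 2, 2])

predicted = np.argmax(probs, axis=1)           # most likely class per row
accuracy_value = np.mean(predicted == labels)  # fraction of correct predictions
print(predicted)       # [0 1 1 2]
print(accuracy_value)  # 0.75
```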

In [13]:
# summary 
tf.scalar_summary("cross_entropy", cross_entropy)
tf.scalar_summary("accuracy", accuracy)
summary_op = tf.merge_all_summaries()

In [14]:
# multiple steps of training

with tf.Session() as sess:
    sess.run(init_op)
    
    summary_writer  = tf.train.SummaryWriter('tf_logs', sess.graph)
    for iterIndex in range(200):
        sess.run(train_op, feed_dict={x: train_x, y_GT: train_y})
        
        summary = sess.run(summary_op, feed_dict={x: train_x, y_GT: train_y})
        summary_writer.add_summary(summary, iterIndex)
        
        if iterIndex%10==0:
            
            xen = sess.run(cross_entropy, feed_dict={x: test_x, y_GT: test_y})
            print("Iter %3d -- Cross-entropy: %f"%(iterIndex,xen))
            
            acc = sess.run(accuracy, feed_dict={x: test_x, y_GT: test_y})
            print("Iter %3d -- Accuracy: %f"%(iterIndex,acc))

Iter   0 -- Cross-entropy: 179.825623
Iter   0 -- Accuracy: 0.360000
Iter  10 -- Cross-entropy: 9.767296
Iter  10 -- Accuracy: 0.960000
Iter  20 -- Cross-entropy: 7.014232
Iter  20 -- Accuracy: 0.960000
Iter  30 -- Cross-entropy: 5.565076
Iter  30 -- Accuracy: 0.960000
Iter  40 -- Cross-entropy: 4.591428
Iter  40 -- Accuracy: 0.960000
Iter  50 -- Cross-entropy: 3.880028
Iter  50 -- Accuracy: 0.960000
Iter  60 -- Cross-entropy: 3.360386
Iter  60 -- Accuracy: 0.960000
Iter  70 -- Cross-entropy: 13.685889
Iter  70 -- Accuracy: 0.720000
Iter  80 -- Cross-entropy: 5.321474
Iter  80 -- Accuracy: 0.880000
Iter  90 -- Cross-entropy: 5.777423
Iter  90 -- Accuracy: 0.880000
Iter 100 -- Cross-entropy: 5.502826
Iter 100 -- Accuracy: 0.880000
Iter 110 -- Cross-entropy: 5.147684
Iter 110 -- Accuracy: 0.880000
Iter 120 -- Cross-entropy: 5.065189
Iter 120 -- Accuracy: 0.880000
Iter 130 -- Cross-entropy: 4.779284
Iter 130 -- Accuracy: 0.920000
Iter 140 -- Cross-entropy: 4.528471
Iter 140 -- Accuracy: 0

### Interpreting results

Here are some suggested experiments to test your understanding and intuition:
    
* Look at the progression of the loss/accuracy during training. What do you observe? Is that what you expected?

* Re-run the training with minibatches, so that each update in the training step looks at only a small sample of the data. 
Run this for minibatches of size $1, 5, 10$.

* Re-run the training above with a smaller sample size, say $10$ training examples.

* Try deeper or shallower models, or models with more neurons.
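
For the minibatch experiment in the list above, one possible starting point is a small batching helper. This is a sketch under assumptions: `iterate_minibatches` is a hypothetical helper, and `fake_x` / `fake_y` are stand-ins with the same shapes as `train_x` / `train_y`.

```python
import numpy as np

def iterate_minibatches(inputs, targets, batch_size, rng):
    """Yield (inputs, targets) minibatches in a freshly shuffled order."""
    n = inputs.shape[0]
    order = rng.permutation(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield inputs[idx], targets[idx]

# Stand-in data with the same shapes as train_x / train_y (assumption).
rng = np.random.RandomState(0)
fake_x = rng.rand(125, 4)
fake_y = rng.randint(0, 3, size=125)

batch_sizes = [bx.shape[0] for bx, _ in iterate_minibatches(fake_x, fake_y, 10, rng)]
print(len(batch_sizes), batch_sizes[-1])  # 13 5
```

In the training loop, each batch would then be fed in place of the full training set, for example `sess.run(train_op, feed_dict={x: batch_x, y_GT: batch_y})`. Because the loss in this notebook is a sum over examples rather than a mean, its magnitude scales with the batch size, so the learning rate may need adjusting.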

This is a very small dataset, not ideal for deep learning, but you will get results instantaneously, which will hopefully help you build some intuition about the training process and about building and optimizing models in TF.
  