# Higher Level Interface

Using the functions provided by `tf.layers`, we can easily build many standard network models. In this notebook we build a convolutional network.

The notebook also demonstrates different model modes: for training we build a slightly different version of the model than for evaluation.

In [None]:
import tensorflow as tf
import tensorflow.contrib.learn as tflearn
import numpy as np
from collections import namedtuple

NUM_EPISODES = 10
RESTORE = False

Model = namedtuple("Model", ["logits", "probabilities", "loss", "train_step", "accuracy"])
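
The training and evaluation cells further down unpack this tuple by position, so the field order matters. A small sketch of that behaviour, with plain strings standing in for the tensors (the stand-in values are assumptions for illustration only):

```python
from collections import namedtuple

Model = namedtuple("Model", ["logits", "probabilities", "loss", "train_step", "accuracy"])

# Plain strings stand in for the tensors that model_fn would return.
m = Model(logits="l", probabilities="p", loss="loss", train_step=None, accuracy="acc")

# Positional unpacking follows the field order given above.
_, _, loss, train_op, _ = m

# Access by name does not depend on the field order.
assert m.loss == loss
assert m.train_step is train_op
```

Accessing fields by name (`m.loss`) is the more robust choice if fields are added or reordered later.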

## Model Function

### accuracy

In [None]:
def calc_accuracy(logits, labels, name="accuracy"):
    with tf.name_scope(name, values=[logits, labels]):
        predicted = tf.argmax(logits, axis=1, name="predicted")
        correct = tf.equal(tf.cast(predicted, tf.int32), labels, name="is_correct")
        return tf.reduce_mean(tf.cast(correct, tf.float32), name="accuracy")
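
To make the computation concrete, here is a minimal NumPy sketch of the same accuracy calculation. `calc_accuracy_np` is a hypothetical helper, not part of the notebook, and the logits and labels are made-up example values.

```python
import numpy as np

def calc_accuracy_np(logits, labels):
    """NumPy counterpart of calc_accuracy (hypothetical helper)."""
    # The predicted class is the index of the largest logit in each row.
    predicted = np.argmax(logits, axis=1)
    # Compare against the integer labels, then average the 0/1 hits.
    correct = predicted.astype(np.int32) == labels
    return correct.astype(np.float32).mean()

# Made-up logits for four examples and three classes.
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 0.1, 3.0],
                   [0.5, 1.5, 0.1],
                   [1.0, 0.2, 0.3]])
labels = np.array([0, 2, 0, 0], dtype=np.int32)

acc = calc_accuracy_np(logits, labels)  # three of four predictions match
```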

### cnn forward pass
A function that takes some input features and applies a cascade of convolutional layers, followed by a dense layer. Depending on the application, it might make sense to expose more configuration parameters (e.g. the kernel size or the nonlinearity).

An additional parameter `is_training` indicates whether the forward model should be built in training or in evaluation mode. This influences the behaviour of the `dropout` layer.

In [None]:
def cnn_fn(x, channels=(32, 64), outputs=10, is_training=True):
    hidden = x
    for c in channels:
        hidden = tf.layers.conv2d(hidden, c, kernel_size=3, strides=2,
                                  activation=tf.nn.relu)
    hidden = tf.layers.flatten(hidden)
    hidden = tf.layers.dropout(hidden, 0.5, training=is_training)
    return tf.layers.dense(hidden, outputs)
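
It can help to know how large the flattened feature vector is before it reaches the dense layer. The sketch below traces the spatial size through the two convolutions, assuming 28x28 inputs and the default `"valid"` padding of `tf.layers.conv2d`; `conv_output_size` and `flattened_features` are hypothetical helpers.

```python
def conv_output_size(size, kernel_size=3, strides=2):
    """Spatial output size of a convolution with 'valid' padding."""
    return (size - kernel_size) // strides + 1

def flattened_features(size=28, channels=(32, 64)):
    """Return the final side length and the flattened feature count."""
    for _ in channels:
        size = conv_output_size(size)
    # Only the last layer's channel count survives into the flatten step.
    return size, size * size * channels[-1]

side, features = flattened_features()
print(side, features)  # 6 2304 (28 -> 13 -> 6, then 6 * 6 * 64)
```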

### model_fn
Because we use the higher-level layers interface, we no longer have explicit access to any variables. Since a convolutional layer expects its input to be shaped like an image, we first reshape the data. If we build the network in evaluation mode, we simply set the `train_op` to `None`.
For evaluation we have also added the accuracy tensor to our model.

In [None]:
def model_fn(x, y, is_training):
    image_shaped = tf.reshape(x, (-1, 28, 28, 1))
    tf.summary.image("image", image_shaped)
    l = cnn_fn(image_shaped, is_training=is_training)
    tf.summary.histogram("logits", l)
    p = tf.nn.softmax(l)
    tf.summary.histogram("probabilities", p)
    
    with tf.name_scope("loss_calculation"):
        loss = tf.nn.softmax_cross_entropy_with_logits(labels=tf.one_hot(y, depth=10), logits=l)
        loss = tf.reduce_mean(loss)
    
    tf.summary.scalar("loss", loss)
    accuracy = calc_accuracy(l, y)
    tf.summary.scalar("accuracy", accuracy)

    global_step = tf.train.create_global_step()
    if is_training:
        optimizer = tf.train.GradientDescentOptimizer(0.1)
        train_op = optimizer.minimize(loss, global_step=global_step)
    else:
        train_op = None
    return Model(logits=l, probabilities=p, loss=loss, train_step=train_op, accuracy=accuracy)
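
The loss inside `model_fn` is the mean softmax cross-entropy against one-hot labels. A minimal NumPy sketch of that calculation follows; the helper names and the all-zero example logits are assumptions for illustration. With perfectly uniform predictions over 10 classes the loss equals ln(10), roughly 2.30, which is a handy reference value when reading the loss curve of a freshly initialised model.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_cross_entropy(logits, labels, depth=10):
    # One-hot encode the integer labels, as tf.one_hot does in model_fn.
    one_hot = np.eye(depth)[labels]
    log_probs = np.log(softmax(logits))
    # Per-example loss is the negative log-probability of the true class.
    per_example = -(one_hot * log_probs).sum(axis=1)
    return per_example.mean()

# All-zero logits give a uniform distribution over the 10 classes.
uniform_logits = np.zeros((4, 10))
labels = np.array([3, 1, 4, 1])
loss = mean_cross_entropy(uniform_logits, labels)
```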

## Training
We build the graph in training mode. Since everything model-specific happens in `model_fn`, this code remains unchanged.

In [None]:
graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, (None, 784), name="x")
    y = tf.placeholder(tf.int32, (None,), name="y")
    _, _, loss, train_op, _ = model_fn(x, y, is_training=True)
    summaries = tf.summary.merge_all()
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()

# load the dataset
mnist = tflearn.datasets.load_dataset("mnist")
images = mnist.train.images
labels = mnist.train.labels

writer = tf.summary.FileWriter("high_level_demo", graph)

### Training Loop
We have put the training loop inside a `tf.Session` with-block. This ensures that the session will be closed after this cell is executed. Since we no longer build the model in the global default graph, we need to explicitly pass the graph object to the newly created session. 

Since the model is much bigger than before, it needs much more memory and cannot process the complete dataset in one batch on older GPUs. We therefore add some rudimentary minibatching.

In [None]:
with tf.Session(graph=graph) as session:
    if RESTORE:
        saver.restore(session, tf.train.latest_checkpoint("high_level_demo"))
    else:
        init.run()
    
    for i in range(NUM_EPISODES):
        for j in range(len(images) // 100):
            imgs = images[100*j:100*j+100]
            lbls = labels[100*j:100*j+100]
            summary, _, step = session.run([summaries, train_op, tf.train.get_global_step()], {x: imgs, y: lbls})
        writer.add_summary(summary, step)

    saver.save(session, "high_level_demo/model", tf.train.global_step(session, tf.train.get_global_step()))
    writer.close()
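
The slicing in the loop above can be factored into a small generator. This is a sketch only: `minibatches` is a hypothetical helper, and the zero-filled arrays stand in for the MNIST data. Like the loop, it yields full batches only and skips a trailing partial batch.

```python
import numpy as np

def minibatches(images, labels, batch_size=100):
    """Yield consecutive full batches; a trailing partial batch is skipped."""
    for j in range(len(images) // batch_size):
        start = batch_size * j
        yield images[start:start + batch_size], labels[start:start + batch_size]

# Stand-in data: 250 examples, so only two full batches of 100 are produced.
fake_images = np.zeros((250, 784), dtype=np.float32)
fake_labels = np.zeros(250, dtype=np.int32)
batches = list(minibatches(fake_images, fake_labels))
```

Inside the training loop one would then iterate with `for imgs, lbls in minibatches(images, labels):` and feed each pair to `session.run`.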

## Evaluation
For evaluation we build the complete graph with training disabled. Since we want to process everything in one big batch for evaluation, we put the complete model on the CPU. (See how using a `model_fn` makes this a trivial task!)

Then we construct a new session, restore the model we trained above, and run the loss and accuracy tensors. We calculate them for both the training set and the test set. In this way we can identify whether we overfit. 

In [None]:
graph = tf.Graph()
with graph.as_default(), tf.device("/cpu:0"):
    x = tf.placeholder(tf.float32, (None, 784), name="x")
    y = tf.placeholder(tf.int32, (None,), name="y")
    _, _, loss, _, accuracy = model_fn(x, y, is_training=False)
    saver = tf.train.Saver()

with tf.Session(graph=graph) as session:
    saver.restore(session, tf.train.latest_checkpoint("high_level_demo"))
    
    loss_v, accuracy_v = session.run([loss, accuracy], {x: images, y: labels})
    print("training set: total loss: %s, accuracy: %s" % (loss_v, accuracy_v))
    
    loss_v, accuracy_v = session.run([loss, accuracy], {x: mnist.test.images, y: mnist.test.labels})
    print("test set: total loss: %s, accuracy: %s" % (loss_v, accuracy_v))

## Closing Remarks
To see the difference between the training-mode and evaluation-mode graphs, change the evaluation code to build the model with `is_training=True`. Dropout then stays active during evaluation, and you should see a slight drop in accuracy (~1%).
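
The only layer in `cnn_fn` that behaves differently between the two modes is `dropout`. The NumPy sketch below illustrates the idea, assuming the inverted-dropout scaling described in the `tf.layers.dropout` documentation (kept units are scaled by `1 / (1 - rate)`). `dropout_np` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def dropout_np(x, rate=0.5, training=True, rng=None):
    """Inverted dropout: random masking in training, identity in evaluation."""
    if not training:
        return x
    rng = np.random.default_rng(0) if rng is None else rng
    # Keep each unit with probability (1 - rate).
    keep = rng.random(x.shape) >= rate
    # Scale the kept units so the expected activation is unchanged.
    return np.where(keep, x / (1.0 - rate), 0.0)

x = np.ones((2, 8))
eval_out = dropout_np(x, training=False)   # unchanged input
train_out = dropout_np(x, training=True)   # entries are either 0.0 or 2.0
```

Because roughly half of the features are zeroed in training mode, predictions made with dropout active are noisier, which is consistent with the small accuracy drop mentioned above.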