# Hi!

Today we'll get back to neural networks and take a look at how image processing is done. We'll talk about the current state-of-the-art approach, Convolutional Neural Networks. We'll also talk about a brand new approach, Capsule Networks, that may or may not dethrone CNNs in the future ;) 

In [None]:
# __future__ imports must be the first statements in the cell,
# otherwise Python raises a SyntaxError
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

from ipywidgets import interact, fixed
import ipywidgets as widgets

tf.logging.set_verbosity(tf.logging.INFO)

### Data loading
As a very basic example of images, the MNIST dataset will accompany us through the rest of the lab.

In [None]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Let's explore it!

In [None]:
def present_mnist(i):
    pixels = mnist.train.images[i].reshape(28,28)
    plt.imshow(pixels, cmap='gray')
    labels = mnist.train.labels[i]
    print(labels)
    print(np.argmax(labels))
    
    
print(mnist.train.images.shape)
print(mnist.train.labels.shape)
print(mnist.test.images.shape)
print(mnist.test.labels.shape)

# notice that the data is already normalized. Yay!
print(mnist.train.images.max()) 
print(mnist.train.images.min())

interact(present_mnist,
        i=widgets.IntSlider(min=0, max=100)
        )
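
Because the dataset was loaded with `one_hot=True`, each label is a vector of ten zeros and a single one, which is why `present_mnist` uses `np.argmax` to recover the digit. Here is a minimal plain-Python sketch of that round trip. The helper names `one_hot` and `argmax` are made up for this illustration and are not part of the MNIST loader.

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `digit`."""
    return [1.0 if k == digit else 0.0 for k in range(num_classes)]


def argmax(values):
    """Return the index of the largest element (the first one on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


label = one_hot(7)
print(label)          # one-hot vector for the digit 7
print(argmax(label))  # recovers the digit from the vector
```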

## Let's classify it!

We've already discussed various ways to approach classification of this dataset. 
We'll rephrase some of them below. You can find further explanations in **lab4**.

### It's always good to start with linear classifier

Which, as you may remember, is essentially a matrix multiplication!
But, for the educational value of this example not to be lost on us, let's now implement it in TensorFlow!
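
Before the TensorFlow version, it may help to see the "matrix multiplication" spelled out by hand. The sketch below is a toy, assuming 3 "pixels" and 2 classes instead of 784 and 10. The function name `linear_scores` and all the numbers are invented for illustration.

```python
def linear_scores(x, W, b):
    """Compute x @ W + b for one flattened image x (a list of n floats).

    W is a list of n rows with one column per class; b has one entry per class.
    """
    num_classes = len(b)
    return [
        sum(x[i] * W[i][c] for i in range(len(x))) + b[c]
        for c in range(num_classes)
    ]


# Toy example: 3 "pixels" and 2 classes instead of 784 pixels and 10 classes.
x = [1.0, 0.0, 2.0]
W = [[0.5, -1.0],
     [0.3, 0.8],
     [-0.25, 1.0]]
b = [0.1, -0.1]

scores = linear_scores(x, W, b)
# The predicted class is the one with the highest score.
predicted_class = max(range(len(scores)), key=lambda c: scores[c])
print(scores, predicted_class)
```

Training only changes the numbers in `W` and `b`; the prediction rule itself stays this simple.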

In [None]:
# if you want to deal with low-level variables in tf, everything happens inside a Session
outside_W = None
with tf.Session() as sess:
    # in tensorflow you build computational graphs from variables and operations on them
    # and then you can evaluate their outputs based on what you input
    # you can think of placeholders as such inputs
    
    # X = features vector (28 x 28 = 784) - we'll deal with bias differently this time
    X = tf.placeholder(tf.float32, shape=(None, 784)) 
    # L = labels vector - shape 10 because we've got 10 possible classes
    L = tf.placeholder(tf.float32, shape=(None, 10))
    # W - weights matrix
    W = tf.Variable(tf.random_uniform((784,10)))
    # b - bias weights - we can simply add them to the output of X @ W multiplication
    b = tf.Variable(tf.random_uniform((1,10)))
    sess.run(tf.global_variables_initializer())
    # Y - the outputs of our model - defined as a computation!
    Y = X @ W + b
    # this is a built-in method to calculate loss 
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=L, logits=Y))
    
    # this is built-in Gradient Descent
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    
    correct_prediction = tf.equal(tf.argmax(Y,axis=1), tf.argmax(L,axis=1))
    
    # this dictionary will tell tensorflow what values to pass to which placeholder (input)
    # when we evaluate a computational graph
    # here, we pass test images and their respective labels to X and L respectively
    eval_dict = {X: mnist.test.images, L: mnist.test.labels}
    loss_history = []
    
    num_iterations = 1000
    batch_size = 1000
    
    for i in range(num_iterations):
        # for the training to go quicker, we'll use stochastic gradient descent
        # in each iteration we train the model on a random batch of data
        batch = mnist.train.next_batch(batch_size)
        # analogous to eval_dict
        train_dict = {X: batch[0], L: batch[1]}
        # one full training step
        train_step.run(feed_dict=train_dict)
        # evaluating the loss after an iteration
        loss_tmp = loss.eval(feed_dict=eval_dict)
        loss_history.append(loss_tmp)
        
    # evaluating the accuracy after training
    accuracy_test = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print('final accuracy on test data:', accuracy_test.eval(feed_dict=eval_dict))
    print('loss graph:')
    plt.plot(list(range(num_iterations)), loss_history)
    outside_W = sess.run(W) # a little something that will let us keep the final value of W 
    
            

We should achieve about 88% accuracy. Not bad! 

(actually, for MNIST it *is* bad)
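
The loss used above, `tf.nn.softmax_cross_entropy_with_logits`, first turns the raw scores into probabilities with softmax and then takes the negative log-probability of the correct class. `tf.reduce_mean` averages that over the batch. Here is a rough single-example sketch in plain Python. The function names are chosen for this illustration and this is not TensorFlow's actual implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, logits):
    """Negative log-probability that the model assigns to the correct class."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)


logits = [2.0, 1.0, 0.1]
print(softmax(logits))
print(cross_entropy([1.0, 0.0, 0.0], logits))  # correct class has the top score
print(cross_entropy([0.0, 0.0, 1.0], logits))  # correct class has the lowest score
```

The second loss is larger than the first: the more probability the model puts on the wrong classes, the more it is penalized.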

#### An example of a (very simple) computational graph
<img src="img/comp_graph.png" style="width: 500px;"/>

### Digression: Let's visualize what actually excites the classifier!

In [None]:
def display_W(i, outside_W):
    plt.imshow(outside_W[:, i].reshape(28,28), cmap='gray')

interact(display_W,
         i=widgets.IntSlider(min=0, max=9),
         outside_W=widgets.fixed(outside_W)
        )

## Maybe a Neural Network will do better?

### First, we will try out a fully connected (dense) network

As you may remember, a fully connected network is essentially a bunch of linear classifiers stacked on top of each other, with non-linearities between them:

<img src="img/neural_dense.jpg" style="width: 500px;"/>
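
To make "linear classifiers with non-linearities between them" concrete, here is a tiny hand-rolled forward pass with one hidden layer. It is only a sketch: the layer sizes and weights are made up, and the helper `dense` merely mimics the idea of a dense layer rather than reproducing `tf.layers.dense`.

```python
def relu(values):
    """Element-wise ReLU: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def dense(x, W, b, activation=None):
    """One fully connected layer: x @ W + b, optionally followed by an activation."""
    out = [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]
    return activation(out) if activation else out


x = [1.0, -2.0]                      # 2 input features
W1 = [[1.0, -1.0, 0.5],
      [0.5, 1.0, -1.0]]              # 2 inputs -> 3 hidden units
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, -1.0]]                   # 3 hidden units -> 2 classes
b2 = [0.0, 0.5]

hidden = dense(x, W1, b1, activation=relu)  # linear map followed by ReLU
logits = dense(hidden, W2, b2)              # output layer stays linear
print(hidden, logits)
```

Without the ReLU in the middle, the two layers would collapse into a single linear classifier.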

#### Enter Tensorflow's layers and estimators!

First, let's write down a function which defines a model for TensorFlow's estimator API. In the function we define the model's behaviour in various cases:

In [None]:
def dense_model_fn(features, labels, mode):
    # labels arrive as class indices (integers 0-9) - let's not mess with API standards :D
    onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)

    input_layer = features["x"]
    # hidden layers
    dense_1 = tf.layers.dense(inputs=input_layer, units=200, activation=tf.nn.relu)
    dense_2 = tf.layers.dense(inputs=dense_1, units=200, activation=tf.nn.relu)
    # output layer - no activation here: softmax and the loss expect raw logits
    logits = tf.layers.dense(inputs=dense_2, units=10)

    predictions = {
      # Generate predictions (for PREDICT and EVAL mode)
      "classes": tf.argmax(input=logits, axis=1),
      # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
      # `logging_hook`.
      "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.softmax_cross_entropy(
      onehot_labels=onehot_labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        # We can use different optimizers!
#         optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        optimizer = tf.train.AdamOptimizer()
        
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step() #keeps track of steps taken
        )
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
      "accuracy": tf.metrics.accuracy(
          labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
      mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

When we have the function, we can create an instance of the model:

In [None]:
mnist_dense_classifier = tf.estimator.Estimator(model_fn=dense_model_fn)

In order to train the model, we need to specify the training input. It's also a function:

In [None]:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": mnist.train.images},
    y=mnist.train.labels.argmax(axis=1),
    batch_size=100,
    num_epochs=None,
    shuffle=True)
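
The arguments above control how the data is fed: `batch_size` examples at a time, reshuffled, and with `num_epochs=None` the data repeats indefinitely, so the length of training is decided by `steps` later on. The generator below is a simplified stand-in for that behaviour, not how `numpy_input_fn` is implemented (the real one uses queues, for instance). The name `batch_iterator` and its `seed` parameter are hypothetical.

```python
import itertools
import random


def batch_iterator(features, labels, batch_size, num_epochs=None,
                   shuffle=True, seed=0):
    """Yield (features, labels) batches; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        if shuffle:
            rng.shuffle(indices)  # new order on every pass over the data
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            yield [features[i] for i in chunk], [labels[i] for i in chunk]
        epoch += 1


toy_features = list(range(5))
toy_labels = list("abcde")

# With num_epochs=None the iterator never ends, so we cap the number of steps.
for xs, ys in itertools.islice(batch_iterator(toy_features, toy_labels, 2), 4):
    print(xs, ys)
```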

Finally, we can perform the training!

In [None]:
mnist_dense_classifier.train(
    input_fn=train_input_fn,
    steps=1000)

In [None]:
# Visualize the first hidden layer: each column of 'dense/kernel' (784 x 200)
# holds the input weights of one hidden unit, shown as a 28 x 28 image.

interact(display_W,
         i=widgets.IntSlider(min=0, max=199),
         outside_W=widgets.fixed(mnist_dense_classifier.get_variable_value('dense/kernel'))
        )


If we want to evaluate the performance of the model, we first define the testing input function (just like above). Then, on evaluation we get the metrics we specified in the model_fn:

In [None]:
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": mnist.test.images},
    y=mnist.test.labels.argmax(axis=1),
    num_epochs=1,
    shuffle=False)

eval_results = mnist_dense_classifier.evaluate(input_fn=eval_input_fn)
print(eval_results)
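
The `accuracy` entry in the results is simply the fraction of test examples whose predicted class matches the true class. Here is a minimal sketch of that computation, with made-up labels and a hypothetical helper name:

```python
def accuracy(true_classes, predicted_classes):
    """Fraction of positions where the prediction equals the true class."""
    matches = sum(1 for t, p in zip(true_classes, predicted_classes) if t == p)
    return matches / len(true_classes)


true_classes = [3, 1, 4, 1, 5]
predicted_classes = [3, 1, 4, 0, 5]  # one mistake out of five
print(accuracy(true_classes, predicted_classes))
```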

Woooow! 97% with just two hidden layers!

Feel free to experiment with different numbers of layers and their sizes ;)

### Enter Convolutional Neural Networks!

<img src="img/conv_net.png"/>
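
The two building blocks of such a network are convolution (slide a small kernel over the image and take a weighted sum at each position) and max pooling (keep only the strongest response in each small window). Below is a toy plain-Python sketch of both. It assumes a single channel, stride 1 and no padding for the convolution, unlike the `padding="same"` used in the model, and the function names are invented for illustration. As in deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block (stride 2)."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


# A 5x5 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4, non-zero only at the edge
pooled = max_pool_2x2(feature_map)         # 2x2 summary of the feature map
print(feature_map)
print(pooled)
```

The same kernel is reused at every position, which is why a convolutional layer needs far fewer weights than a dense layer applied to the same image.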


In [None]:
def cnn_model_fn(features, labels, mode):
    # Input Layer
    input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])

    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(
      inputs=input_layer,
      filters=32,  # number of kernels to learn = number of output channels
      strides=(1, 1),  # how many pixels the kernel moves between applications
      kernel_size=[5, 5],
      padding="same",  # zero-pad the input so the output stays 28 x 28
      activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(
        inputs=conv1, 
        pool_size=[2, 2],  # each output is the max over a 2 x 2 window of conv outputs
        strides=2  # same as (2, 2): the window moves 2 pixels at a time, so windows don't overlap
    )

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
      inputs=pool1,
      filters=64,
      kernel_size=[5, 5],
      padding="same",
      activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
    dropout = tf.layers.dropout(
      inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits Layer
    logits = tf.layers.dense(inputs=dropout, units=10)

    predictions = {
      # Generate predictions (for PREDICT and EVAL mode)
      "classes": tf.argmax(input=logits, axis=1),
      # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
      # `logging_hook`.
      "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
    #   onehot_labels = labels
    loss = tf.losses.softmax_cross_entropy(
      onehot_labels=onehot_labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.AdamOptimizer()
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
      "accuracy": tf.metrics.accuracy(
          labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
      mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
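
The magic number `7 * 7 * 64` in the reshape comes from following the image size through the layers: a `"same"`-padded convolution with stride 1 keeps the spatial size, and each 2 x 2 pooling with stride 2 halves it. Here is a short sketch of that bookkeeping, with helper names made up for this illustration:

```python
def conv_same_output(size, stride=1):
    """Spatial size after a convolution with padding="same": ceil(size / stride)."""
    return -(-size // stride)


def pool_output(size, pool_size=2, stride=2):
    """Spatial size after pooling without padding."""
    return (size - pool_size) // stride + 1


size = 28                      # MNIST images are 28 x 28
size = conv_same_output(size)  # conv1 keeps 28 x 28 (32 channels)
size = pool_output(size)       # pool1 halves it to 14 x 14
size = conv_same_output(size)  # conv2 keeps 14 x 14 (64 channels)
size = pool_output(size)       # pool2 halves it to 7 x 7

flat_units = size * size * 64  # 64 channels come out of conv2
print(size, flat_units)
```

If you change the number of filters, the kernel strides or the pooling, the size in `tf.reshape(pool2, ...)` has to be recomputed the same way.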

In [None]:
mnist_cnn_classifier = tf.estimator.Estimator(model_fn=cnn_model_fn)

In [None]:
mnist_cnn_classifier.train(
    input_fn=train_input_fn,
    steps=1000)

In [None]:
eval_results = mnist_cnn_classifier.evaluate(input_fn=eval_input_fn)
print(eval_results)

### Enter capsule nets!

There is already an excellent notebook by Aurélien Géron explaining the whole idea here - we'll proceed with it now! ;)

https://github.com/ageron/handson-ml/blob/master/extra_capsnets.ipynb

I tweaked it a bit, so now please switch to the one in this directory!