# Intro to TensorFlow

TensorFlow is an open-source machine learning framework developed by Google. At its heart is a library for performing multi-dimensional matrix algebra, which is the basis of most machine learning techniques. It works by breaking algebraic expressions up into a series of simple mathematical operations, which are linked together in a computational graph (or 'dataflow graph'). The graph explicitly defines the dependencies between the operations, which makes optimisation and parallelisation easier.
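To make the idea of a dataflow graph concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented; the `GRAPH` dictionary and the `evaluate` helper are hypothetical names invented for this illustration. The expression $(a + b) \times (b - c)$ is broken into three simple operations, each listing the nodes it depends on.

```python
import operator

# Each node maps to (operation, names of the nodes it depends on).
# Together these encode the expression (a + b) * (b - c).
GRAPH = {
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["b", "c"]),
    "out": (operator.mul, ["sum", "diff"]),
}


def evaluate(node, feeds, cache=None):
    """Evaluate `node`, computing each dependency at most once."""
    cache = {} if cache is None else cache
    if node in feeds:
        # Input value supplied by the caller.
        return feeds[node]
    if node not in cache:
        op, deps = GRAPH[node]
        cache[node] = op(*(evaluate(d, feeds, cache) for d in deps))
    return cache[node]


result = evaluate("out", {"a": 2.0, "b": 3.0, "c": 1.0})
print(result)  # (2 + 3) * (3 - 1) = 10.0
```

Because `sum` and `diff` do not depend on each other, a scheduler is free to compute them in parallel. That is the kind of opportunity an explicit graph exposes.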

This tutorial will show you several ways to build a Convolutional Neural Network (CNN) in TensorFlow, and explain the pros and cons of each. It will not, however, go into the theory of how a CNN works. As is tradition with CNN tutorials, the MNIST dataset will be used as the example task.

In [None]:
import tensorflow as tf

# Get MNIST dataset and normalise pixel values.
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

## 1. Keras Interface
Keras was one of the original high-level APIs for machine learning libraries, and has since been subsumed into TensorFlow. It provides a very easy-to-use interface that is still relatively powerful. To achieve this, it provides several ways to define a neural network. The first of these is the Sequential API:

In [None]:
# Define the sequential model.
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile model, specifying the training configuration.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs and then test the model.
model.fit(x_train, y_train, batch_size=32, epochs=5)
model.evaluate(x_test, y_test)

This is all the code you need to create and train a neural network in TensorFlow. Admittedly it is a very simple network, but it is a network nonetheless. It is stored in the variable `model` and is made up of a `Flatten` layer, which turns each 28×28 image into a 784-element vector, followed by two fully connected layers (`tf.keras.layers.Dense`) separated by a dropout layer that drops each unit with probability $\frac{1}{5}$ during training.
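To see what this stack of layers computes, the forward pass can be sketched in NumPy. This is an illustration only: the weights are random rather than trained, the `forward` helper and the weight names are hypothetical, and the dropout step assumes the usual "inverted dropout" scaling, where the surviving units are multiplied by $1/(1 - \text{rate})$ during training.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def forward(images, params, rate=0.2, training=False):
    """Flatten -> Dense(512, relu) -> Dropout(rate) -> Dense(10, softmax)."""
    w1, b1, w2, b2 = params
    x = images.reshape(len(images), -1)   # (batch, 28, 28) -> (batch, 784)
    h = relu(x @ w1 + b1)                 # (batch, 512)
    if training:
        # Inverted dropout: zero a fraction `rate` of units, rescale the rest.
        keep = rng.random(h.shape) >= rate
        h = h * keep / (1.0 - rate)
    return softmax(h @ w2 + b2)           # (batch, 10)


# Hypothetical random weights and a random batch standing in for MNIST images.
params = (
    rng.normal(0.0, 0.05, (784, 512)), np.zeros(512),
    rng.normal(0.0, 0.05, (512, 10)), np.zeros(10),
)
batch = rng.random((4, 28, 28))
probs = forward(batch, params)
print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Each row of `probs` is a probability distribution over the ten digit classes, which is what the `softmax` output layer of the Keras model produces.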

As with all high-level APIs, simplicity comes with restrictions. If your network is not composed of a simple series of layers, you will find it very difficult to define it this way. Because of this, Keras also provides a Functional API. This allows you to create more complex models with features such as multiple inputs/outputs, shared layers and non-sequential data flows (e.g. residual connections). A network can be defined as follows:

In [None]:
# First define a placeholder for the inputs.
inputs = tf.keras.Input(shape=(28, 28))

# Define the model. Note: Each layer instance is callable on a tensor, and returns a tensor.
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(512, activation='relu')(x)
x = tf.keras.layers.Dropout(0.2)(x)
predictions = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=predictions)

# Compile model, specifying the training configuration.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs and then test the model.
model.fit(x_train, y_train, batch_size=32, epochs=5)
model.evaluate(x_test, y_test)

This network is exactly the same as the one in the Sequential example above, although the definition is not quite as clean. In this API each layer instance is callable on a tensor and returns a tensor. The input and output tensors are then used to define a `tf.keras.Model` instance, which is trained just like the Sequential model. Despite the increased complexity, this style leads naturally to a third approach, model subclassing, which is the way the majority of your future networks should be laid out:

In [None]:
# Define the network as a subclass of tf.keras.Model
class MyModel(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.num_classes = num_classes
        # Define your layers here.
        self.flatten = tf.keras.layers.Flatten()
        self.fc1 = tf.keras.layers.Dense(512, activation='relu')
        self.dropout = tf.keras.layers.Dropout(0.2)
        self.fc2 = tf.keras.layers.Dense(num_classes, activation='softmax')

    def call(self, inputs):
        # Define your forward pass here.
        x = self.flatten(inputs)
        x = self.fc1(x)
        x = self.dropout(x)
        return self.fc2(x)

    
# Initialise an instance of your model
model = MyModel(num_classes=10)

# Compile model, specifying the training configuration.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs and then test the model.
model.fit(x_train, y_train, batch_size=32, epochs=5)
model.evaluate(x_test, y_test)

Model subclassing lets us build fully customisable models by subclassing `tf.keras.Model` and defining our own forward pass. We create the model's layers in the `__init__` method and set them as attributes of the class instance. We then define the dependencies between them (the forward pass) in the `call` method.

Model subclassing is particularly useful when eager execution is enabled, since the forward pass can be written imperatively. However, while it offers flexibility, the additional complexity provides more opportunities for user error.

For more information on TensorFlow and Keras, please see: https://www.tensorflow.org/guide/keras
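Returning briefly to the Functional API: the non-sequential data flows mentioned earlier are easiest to understand from an example. The following is a minimal sketch of a residual connection, assuming a TensorFlow 2 installation. The layer sizes and the name `residual_model` are arbitrary choices for illustration, and the model is left untrained.

```python
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(128, activation='relu')(x)

# Residual block: the block's input skips past one Dense layer
# and is added back to that layer's output.
shortcut = x
y = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Add()([shortcut, y])

outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
residual_model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Run a random batch through the untrained model to check the output shape.
probs = residual_model.predict(np.random.rand(2, 28, 28).astype('float32'), verbose=0)
print(probs.shape)  # (2, 10)
```

A data flow like this cannot be written as a plain `Sequential` list, because the tensor `shortcut` is consumed by two different layers.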

## 2. Advanced Capabilities
In this section, we will look at several additional aspects of the Keras interface that provide more advanced capabilities.

### Callbacks
Callbacks are objects that extend the training procedure. They provide capabilities such as checkpointing, learning-rate scheduling, early stopping and metric logging. It is also possible to create your own custom callbacks. Callbacks are passed as a list to a model's `fit` method:

In [None]:
# Initialise another instance of the model
model2 = MyModel(num_classes=10)

# Compile model, specifying the training configuration.
model2.compile(optimizer=tf.keras.optimizers.Adam(),
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])

# Define list of callback objects
callbacks = [
    # Save the model weights to disk during training.
    tf.keras.callbacks.ModelCheckpoint(filepath='./checkpoints', save_weights_only=True),
    # Write logs that can be viewed in TensorBoard.
    tf.keras.callbacks.TensorBoard(log_dir='./logs'),
]

# Train for 5 epochs (passing in callbacks as a parameter) and then test the model.
model2.fit(x_train, y_train, batch_size=32, epochs=5, callbacks=callbacks)
model2.evaluate(x_test, y_test)
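A custom callback is written by subclassing `tf.keras.callbacks.Callback` and overriding the hooks you care about. The sketch below assumes TensorFlow 2. The class name `LossHistory`, the tiny `demo_model` and the random data are all hypothetical stand-ins chosen so that the example runs in a few seconds.

```python
import numpy as np
import tensorflow as tf


class LossHistory(tf.keras.callbacks.Callback):
    """Hypothetical custom callback that records the training loss after each epoch."""

    def __init__(self):
        super().__init__()
        self.losses = []

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.losses.append(logs.get('loss'))


# Small random dataset standing in for MNIST so the sketch runs quickly.
x = np.random.rand(64, 28, 28).astype('float32')
y = np.random.randint(0, 10, size=(64,))

demo_model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
demo_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

history = LossHistory()
demo_model.fit(x, y, batch_size=32, epochs=3, callbacks=[history], verbose=0)
print(len(history.losses))  # 3, one entry per epoch
```

Other hooks, such as `on_train_begin` and `on_train_batch_end`, follow the same pattern.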

### Saving and Loading Models
Saving and loading models is obviously important; otherwise you would never be able to reuse the models you have trained. It is also important when you need more advanced techniques, such as fine-tuning.

The code below shows how to save and load the model weights. Note that the model architecture has to be defined before the weights can be loaded into it.

In [None]:
# Save weights to a TensorFlow Checkpoint file
model.save_weights('./weights/my_model')

# Restore the model's state,
# this requires a model with the same architecture.
model.load_weights('./weights/my_model')
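The requirement that the architecture exists before the weights are loaded can be checked with a small round trip. This is a sketch assuming TensorFlow 2. The `build_model` helper, the small layer sizes and the file name are hypothetical, and the `.weights.h5` suffix is used because recent Keras versions expect it for weight files.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # The architecture must be rebuilt in code before weights can be loaded into it.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])


source = build_model()
restored = build_model()  # Same architecture, different random initial weights.

batch = np.random.rand(2, 28, 28).astype('float32')

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name inside a temporary directory.
    path = os.path.join(tmp_dir, 'demo.weights.h5')
    source.save_weights(path)
    restored.load_weights(path)

same = np.allclose(source.predict(batch, verbose=0),
                   restored.predict(batch, verbose=0))
print(same)  # True
```

After `load_weights`, the two models give identical predictions even though they were initialised independently.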