# Tutorial

This notebook walks through several exercises that use TensorFlow to implement a range of machine learning algorithms. In particular, it covers supervised learning and reinforcement learning algorithms that use neural networks.

To start, make sure you have all dependencies installed by running `pip install -r requirements.txt` from the directory containing this notebook.

In [1]:
# Import modules
import numpy as np
import tensorflow as tf

## Loading MNIST

We'll use a simple classification problem as an example to introduce how to create, train, and evaluate a model. The next cell loads the MNIST dataset, which contains 28x28 grayscale images of handwritten digits, and rescales the pixel values from [0, 255] to [0, 1].

In [2]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train/255, x_test/255

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
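
As a quick check of what the division by 255 does, the sketch below applies the same scaling to a small synthetic array standing in for MNIST images. The array `fake_images` is hypothetical and is not the real dataset; the sketch assumes the raw pixels are stored as `uint8`.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 4 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same scaling as in the cell above: true division promotes uint8 to float64
scaled = fake_images / 255

print(scaled.dtype)   # float64
print(scaled.shape)   # (4, 28, 28)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```

Keeping inputs in a small, fixed range like [0, 1] generally makes training better behaved than feeding raw 0 to 255 values.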


## Creating a Model

We'll walk through two methods of creating a model. The first is to use Keras' Sequential model. This is convenient for quickly creating and training a model for basic supervised learning and some reinforcement learning.

In [5]:
# Create sequential model
model = tf.keras.models.Sequential()

# Add a flatten layer to turn MNIST images into a vector
model.add(tf.keras.layers.Flatten(input_shape=(28,28)))

# Add some dense layers with ReLU activations
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))

# Add a layer for the scores of each class. Note that we leave the activation linear.
model.add(tf.keras.layers.Dense(10))

# Add an output layer for the softmax predictions
model.add(tf.keras.layers.Softmax())

# Test this on a sample from the dataset
prediction = model(x_train[:1]).numpy()
print(prediction)


[[0.08526918 0.08805387 0.08779302 0.06259194 0.13711196 0.11185778
  0.13748164 0.13741958 0.06497267 0.08744836]]
(60000, 28, 28)
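
The Softmax layer turns the 10 class scores into probabilities that are non-negative and sum to 1, which is why the entries of the untrained prediction above are all of similar size. A minimal NumPy sketch of that computation follows; the `softmax` helper and the `scores` values are illustrative and are not taken from the model.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up scores for one sample and 10 classes (illustrative values only)
scores = np.array([[0.1, -0.3, 0.2, 0.0, 0.5, -0.1, 0.4, 0.3, -0.2, 0.05]])
probs = softmax(scores)

print(probs.sum(axis=-1))      # each row sums to 1 (up to rounding)
print(probs.argmax(axis=-1))   # index of the largest score
```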


Keras' Model class (from which Sequential inherits) provides a convenient function for fitting data. First we must compile the model with an optimizer and a loss function. We'll use the Adam optimizer and sparse categorical crossentropy, which expects integer class labels (here 0 to 9) rather than one-hot vectors.

In [13]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
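
To make the loss concrete: for a sample with integer label `y` and predicted probabilities `p`, sparse categorical crossentropy is `-log(p[y])`, averaged over the batch. The NumPy sketch below follows that definition. The helper name and example values are assumptions for illustration, and the simple clip stands in for the numerical safeguards a library implementation would use.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.clip(probs, eps, 1.0)
    # Pick, for each row, the probability of that row's true class
    true_class_probs = probs[np.arange(len(y_true)), y_true]
    return -np.log(true_class_probs).mean()

# Two samples, three classes (illustrative values only)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
y_true = np.array([0, 2])

loss = sparse_categorical_crossentropy(y_true, probs)
print(loss)  # mean of -log(0.7) and -log(0.8)
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.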

In [14]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x1bbe313aa90>

In [15]:
model.evaluate(x_test, y_test, verbose=2)

313/313 - 1s - loss: 0.1123 - accuracy: 0.9723


[0.11232912540435791, 0.9722999930381775]
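
Because the model was compiled with `metrics=['accuracy']`, the list returned by `evaluate` holds the loss followed by the accuracy. Accuracy here is the fraction of samples whose highest-probability class matches the label, which can be sketched in NumPy as follows (the arrays are made-up examples, not model output):

```python
import numpy as np

# Made-up predicted probabilities for 4 samples and 3 classes
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 1, 0])

# Predicted class = index of the largest probability in each row
predicted = np.argmax(probs, axis=1)

# Accuracy = fraction of predictions that equal the true label
accuracy = np.mean(predicted == labels)
print(predicted)  # [0 1 2 0]
print(accuracy)   # 0.75
```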

Next, let's look at the alternative way of creating and training a model: writing a custom class that inherits from Keras' Model class. This method is more involved than the previous one, but it allows for more customization, which is often critical in ML research. Simply running an existing model on a new problem is usually not enough to constitute a novel contribution. Your contributions will likely come in the form of new model architectures, loss functions, optimization strategies, etc. All of these are easier to implement if you understand how to build the model and its training loop from scratch.

In [35]:
class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()

        # Add a flatten layer to turn MNIST images into a vector
        self.flatten = tf.keras.layers.Flatten(input_shape=(28,28))

        # Add some dense layers with ReLU activations
        self.hidden_layer1 = tf.keras.layers.Dense(128, activation='relu')
        self.hidden_layer2 = tf.keras.layers.Dense(64, activation='relu')

        # Add a layer for the scores of each class. Note that we leave the activation linear.
        self.score = tf.keras.layers.Dense(10)

        # Add an output layer for the softmax predictions
        self.probability = tf.keras.layers.Softmax()

        # Create an optimizer and loss function for training
        self.optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
        self.loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

        # Parameters for minibatching
        self.batch_size = 64

        # Metrics
        self.loss_metric = tf.keras.metrics.Mean()
        self.accuracy_metric = tf.keras.metrics.Accuracy()
    
    def call(self, x):
        # Create a call function to pass input through each layer
        # IMPORTANT NOTE: Always implement this method as 'call', don't override the '__call__' method
        x = self.flatten(x)
        x = self.hidden_layer1(x)
        x = self.hidden_layer2(x)
        x = self.score(x)
        return self.probability(x)
    
    def train(self, x, y):
        N = y.shape[0]
        i = 0

        # Break up the data into minibatches
        while i < N:
            x_batch = x[i:min(i+self.batch_size, N)]
            y_batch = y[i:min(i+self.batch_size, N)]

            # Record the forward pass on the tape so gradients can be computed afterwards
            with tf.GradientTape() as tape:
                y_pred = self(x_batch)
                loss = self.loss_fn(y_batch, y_pred)
            
            grads = tape.gradient(loss, self.trainable_weights)
            self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        
            self.loss_metric(loss)
            self.accuracy_metric(y_batch, np.argmax(y_pred.numpy(), axis=1))

            i += self.batch_size

            if i % (self.batch_size * 100) == 0:
                print("Step {}: Average loss = {}, Accuracy = {}".format(i/self.batch_size, self.loss_metric.result(), self.accuracy_metric.result()))
                self.loss_metric.reset_state()
                self.accuracy_metric.reset_state()
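
The slicing in `train` can be checked in isolation. The sketch below reproduces only the minibatch loop on index ranges, assuming 60,000 training samples and a batch size of 64 as above; `batch_bounds` is a hypothetical helper, not part of the class.

```python
def batch_bounds(n_samples, batch_size):
    """Yield (start, stop) index pairs covering n_samples in order."""
    start = 0
    while start < n_samples:
        stop = min(start + batch_size, n_samples)
        yield start, stop
        start += batch_size

bounds = list(batch_bounds(60000, 64))

print(len(bounds))                    # 938
print(bounds[-1][1] - bounds[-1][0])  # 32
```

With 938 batches per epoch and a report every 100 steps, the last report in each epoch falls at step 900, and the final batch is smaller than the others.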

In [36]:
my_model = MyModel()

In [37]:
for epoch in range(5):
    print("Epoch: ", epoch)
    my_model.train(x_train, y_train)

Epoch:  0
Step 100.0: Average loss = 0.7794746160507202, Accuracy = 0.7806249856948853
Step 200.0: Average loss = 0.3627699911594391, Accuracy = 0.8951562643051147
Step 300.0: Average loss = 0.2991768419742584, Accuracy = 0.9098437428474426
Step 400.0: Average loss = 0.23929233849048615, Accuracy = 0.9298437237739563
Step 500.0: Average loss = 0.2374669462442398, Accuracy = 0.9312499761581421
Step 600.0: Average loss = 0.20516353845596313, Accuracy = 0.9417187571525574
Step 700.0: Average loss = 0.20398493111133575, Accuracy = 0.9417187571525574
Step 800.0: Average loss = 0.20411214232444763, Accuracy = 0.9393749833106995
Step 900.0: Average loss = 0.16094501316547394, Accuracy = 0.9515625238418579
Epoch:  1
Step 100.0: Average loss = 0.12226498872041702, Accuracy = 0.965113639831543
Step 200.0: Average loss = 0.15222826600074768, Accuracy = 0.9532812237739563
Step 300.0: Average loss = 0.12604905664920807, Accuracy = 0.9623437523841858
Step 400.0: Average loss = 0.11800150573253632, A