# Advanced CNNs with tf.keras and tf.data


### Functional API

We saw in the last notebook how to use `tf.keras.Sequential` to stack layers together for the classification task. One limitation of the stacking API is that it cannot express arbitrary model topologies, which are the bread and butter of Deep Learning research.

Keras provides a [functional](https://keras.io/getting-started/functional-api-guide/) style of API to build complex model topologies such as:
* multi-input models (think images and their descriptions)
* multi-output models (think classification and a summary of an image)
* models with non-sequential data flows (e.g. skip connections or by-passing parts of the network)
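
To make the last bullet concrete, here is a minimal sketch of a skip connection in the functional style. The helper name `build_skip_model` and the layer sizes are illustrative assumptions, not part of this notebook's classifier; `padding="same"` is chosen only so that both branches have matching shapes for the addition.

```python
import tensorflow as tf


def build_skip_model(num_classes=10):
    """Hypothetical example: a tiny model with one residual (skip) connection."""
    inputs = tf.keras.Input(shape=(28, 28, 1))
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)

    # Side branch: two convolutions that keep the spatial size and channel count.
    y = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    y = tf.keras.layers.Conv2D(32, 3, padding="same")(y)

    # The skip connection: add the branch output back onto its own input.
    x = tf.keras.layers.Add()([x, y])
    x = tf.keras.layers.Activation("relu")(x)

    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)


skip_model = build_skip_model()
```

A data flow like this cannot be written with `tf.keras.Sequential`, because the tensor `x` is consumed twice: once by the side branch and once by the `Add` layer.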

So let's rewrite the previous classifier using the functional API.

In [1]:
import tensorflow as tf

import numpy as np

In [2]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()

In [3]:
TRAINING_SIZE = len(train_images)
TEST_SIZE = len(test_images)

train_images = np.asarray(train_images, dtype=np.float32) / 255

# Reshape the training images to add a channel dimension
train_images = train_images.reshape((TRAINING_SIZE, 28, 28, 1))

test_images = np.asarray(test_images, dtype=np.float32) / 255
# Reshape the test images to add a channel dimension
test_images = test_images.reshape((TEST_SIZE, 28, 28, 1))

In [4]:
# Number of categories we are predicting (labels 0-9)
LABEL_DIMENSIONS = 10

train_labels  = tf.keras.utils.to_categorical(train_labels, LABEL_DIMENSIONS)
test_labels = tf.keras.utils.to_categorical(test_labels, LABEL_DIMENSIONS)

# Cast the labels to floats, needed later
train_labels = train_labels.astype(np.float32)
test_labels = test_labels.astype(np.float32)
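
For intuition, the one-hot encoding performed by `to_categorical` can be sketched in plain NumPy. The helper `one_hot` below is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Illustrative helper: map integer class ids to one-hot float32 rows."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i,
    # so indexing with the labels picks one row per example.
    return np.eye(num_classes, dtype=np.float32)[labels]


encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
```

Each row has a single 1 in the column of its class, which is the label format that the `categorical_crossentropy` loss expects.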

We start with the input tensor and use the simple rule that **any layer instance is callable on a tensor and will return a tensor**:

In [5]:
inputs = tf.keras.Input(shape=(28,28,1))  # Returns a placeholder tensor

In [6]:
x = tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation=tf.nn.relu)(inputs)
x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)
x = tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), activation=tf.nn.relu)(x)
x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)
x = tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), activation=tf.nn.relu)(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(64, activation=tf.nn.relu)(x)
predictions = tf.keras.layers.Dense(LABEL_DIMENSIONS, activation=tf.nn.softmax)(x)

In [7]:
# Instantiate the model given inputs and outputs.
model = tf.keras.Model(inputs=inputs, outputs=predictions)

model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0     
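
The output shapes and parameter counts in the summary follow from simple arithmetic. The sketch below reproduces them, assuming unpadded ("valid") convolutions with stride 1, which is the `Conv2D` default; the helper names are made up for this illustration.

```python
def window_output_size(size, kernel, stride=1):
    """Output length of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def conv2d_param_count(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


# (kernel, stride) for conv, pool, conv, pool, conv, in model order.
size = 28
sizes = []
for kernel, stride in [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]:
    size = window_output_size(size, kernel, stride)
    sizes.append(size)

flattened = size * size * 64  # the last conv layer has 64 filters
params = [
    conv2d_param_count(3, 1, 32),
    conv2d_param_count(3, 32, 64),
    conv2d_param_count(3, 64, 64),
]

print(sizes)      # [26, 13, 11, 5, 3]
print(flattened)  # 576
print(params)     # [320, 18496, 36928]
```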

Training the model is the same as before:

In [8]:
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

In [9]:
BATCH_SIZE=128

# Because tf.data may work with potentially **large** collections of data
# we do not shuffle the entire dataset by default
# Instead, we maintain a buffer of SHUFFLE_SIZE elements
# and sample from there.
SHUFFLE_SIZE = 10000 

# Create the dataset
dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
dataset = dataset.shuffle(SHUFFLE_SIZE)
dataset = dataset.batch(BATCH_SIZE)
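
The buffered shuffle described in the comments above can be sketched in pure Python. This is a simplified model of the idea, not the `tf.data` implementation; the function name `buffered_shuffle` and its `seed` parameter are assumptions for illustration, and `buffer_size` is assumed to be at least 1.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
print(shuffled)
```

Because an element can only be emitted once it has entered the buffer, a small buffer only shuffles locally. A buffer at least as large as the dataset gives a full shuffle, at the cost of holding every element in memory.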

In [10]:
EPOCHS = 5  # Number of passes over the entire training dataset

for epoch in range(EPOCHS):
    for (batch, (images, labels)) in enumerate(dataset):
        train_loss, train_accuracy = model.train_on_batch(images, labels)
    
        if batch % 10 == 0: print(batch, train_accuracy)
  
    # Here you can gather any metrics or adjust your training parameters
    print('Epoch #%d\t Loss: %.6f\tAccuracy: %.6f' % (epoch + 1, train_loss, train_accuracy))

0 0.140625
10 0.515625
20 0.6484375
30 0.6484375
40 0.703125
50 0.6171875
60 0.6640625
70 0.625
80 0.796875
90 0.671875
100 0.75
110 0.7734375
120 0.796875
130 0.7890625
140 0.7578125
150 0.7109375
160 0.8203125
170 0.7890625
180 0.734375
190 0.828125
200 0.796875
210 0.7578125
220 0.7421875
230 0.796875
240 0.8125
250 0.7421875
260 0.8515625
270 0.8046875
280 0.859375
290 0.90625
300 0.8046875
310 0.796875
320 0.8515625
330 0.8359375
340 0.84375
350 0.8125
360 0.765625
370 0.890625
380 0.8046875
390 0.78125
400 0.8671875
410 0.859375
420 0.8359375
430 0.8203125
440 0.875
450 0.7890625
460 0.8984375
Epoch #1	 Loss: 0.461277	Accuracy: 0.833333
0 0.796875
10 0.8984375
20 0.8125
30 0.8515625
40 0.859375
50 0.8046875
60 0.8359375
70 0.8125
80 0.921875
90 0.796875
100 0.8828125
110 0.8359375
120 0.890625
130 0.890625
140 0.9140625
150 0.890625
160 0.90625
170 0.859375
180 0.7890625
190 0.890625
200 0.8671875
210 0.875
220 0.8671875
230 0.890625
240 0.8359375
250 0.875
260 0.8984375
270 0.89

Again, to evaluate the model we need to check its accuracy on unseen (test) data:

In [11]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest Model \t\t Loss: %.6f\tAccuracy: %.6f' % (test_loss, test_acc))


Test Model 		 Loss: 0.294421	Accuracy: 0.892000


### Model Subclassing

It is also possible to build a fully customizable model by subclassing `tf.keras.Model` and defining your own forward pass (just don't tell that to the PyTorch people 🙈). Just like in PyTorch, you create layers in the `__init__` method and set them as attributes of the class instance. Define the forward pass in the `call` method and boom, you are ready to go!

This is particularly useful when eager execution is enabled, since the forward pass can be written imperatively.

```python
class MyModel(tf.keras.Model):

  def __init__(self, num_classes=10):
    super(MyModel, self).__init__(name='my_model')
    self.num_classes = num_classes
    # Define your layers here.
    self.dense_1 = tf.keras.layers.Dense(32, activation='relu')
    self.dense_2 = tf.keras.layers.Dense(num_classes, activation='sigmoid')

  def call(self, inputs):
    # Define your forward pass here,
    # using layers you previously defined (in `__init__`).
    x = self.dense_1(inputs)
    return self.dense_2(x)

  def compute_output_shape(self, input_shape):
    # You need to override this function if you want to use the subclassed model
    # as part of a functional-style model.
    # Otherwise, this method is optional.
    shape = tf.TensorShape(input_shape).as_list()
    shape[-1] = self.num_classes
    return tf.TensorShape(shape)


# Instantiates the subclassed model.
model = MyModel(num_classes=10)
```

As an exercise, write the above model for Fashion-MNIST by subclassing `tf.keras.Model` and train it!

In [12]:
### YOUR CODE HERE

### Custom layers

Researchers often need to write their own custom layers. You can do this by subclassing `tf.keras.layers.Layer` and implementing the following methods:

* `build`: Create the weights of the layer. Add weights with the `add_weight` method.
* `call`: Define the forward pass.
* `compute_output_shape`: Specify how to compute the output shape of the layer given the input shape.
* Optionally, a layer can be serialized by implementing the `get_config` method and the `from_config` class method.

As an example:

```python
class MyLayer(tf.keras.layers.Layer):

  def __init__(self, output_dim, **kwargs):
    self.output_dim = output_dim
    super(MyLayer, self).__init__(**kwargs)

  def build(self, input_shape):
    shape = tf.TensorShape((input_shape[1], self.output_dim))
    # Create a trainable weight variable for this layer.
    self.kernel = self.add_weight(name='kernel',
                                  shape=shape,
                                  initializer='uniform',
                                  trainable=True)
    # Be sure to call this at the end
    super(MyLayer, self).build(input_shape)

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

  def compute_output_shape(self, input_shape):
    shape = tf.TensorShape(input_shape).as_list()
    shape[-1] = self.output_dim
    return tf.TensorShape(shape)

  def get_config(self):
    # Extend the base config with this layer's constructor arguments
    base_config = super(MyLayer, self).get_config()
    base_config['output_dim'] = self.output_dim
    return base_config

  @classmethod
  def from_config(cls, config):
    return cls(**config)


# Create a model using the custom layer
model = tf.keras.Sequential([MyLayer(10),
                             tf.keras.layers.Activation('softmax')])
```
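
To illustrate the serialization bullet, here is a minimal sketch of a config round trip. The `Scale` layer is a hypothetical example introduced only for this purpose; it relies on the default `from_config`, which rebuilds the layer from the dictionary returned by `get_config`.

```python
import tensorflow as tf


class Scale(tf.keras.layers.Layer):
    """Hypothetical weightless layer that multiplies its input by a constant."""

    def __init__(self, factor=1.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add our own argument.
        config = super().get_config()
        config["factor"] = self.factor
        return config


layer = Scale(factor=0.5, name="half")
clone = Scale.from_config(layer.get_config())
print(clone.name, clone.factor)
```

The round trip only works if `get_config` returns every constructor argument, which is why forgetting the `return` statement silently breaks serialization.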