<a href="https://colab.research.google.com/github/zerotodeeplearning/ztdl-masterclasses/blob/master/notebooks/Convolutional_Neural_Networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Learn with us: www.zerotodeeplearning.com

Copyright © 2021: Zero to Deep Learning ® Catalit LLC.

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Convolutional Neural Networks

This notebook is best run using a GPU backend.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.layers import Conv2D, MaxPooling2D, AveragePooling2D, Flatten

In [None]:
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

In [None]:
X_train.shape

In [None]:
model = Sequential([
    Conv2D(32, (3, 3), input_shape=(28, 28, 1), activation='relu'),
    Flatten(),
    Dense(10, activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

h = model.fit(X_train, y_train, batch_size=128, epochs=3, validation_split=0.1)

In [None]:
pd.DataFrame(h.history).plot()
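
Before improving the model, it can help to see where its parameters live. The sketch below reproduces by hand the shape and parameter arithmetic for the baseline network above, assuming the Keras defaults for `Conv2D` (`'valid'` padding, stride 1). The helper names are made up for this illustration and are not part of Keras.

```python
def conv2d_output_size(size, kernel, padding="valid", stride=1):
    """Output size along one spatial dimension of a convolution."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1


def conv2d_params(in_channels, filters, kernel):
    """Weights of a square kernel plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(n_inputs, units):
    """One weight per input plus one bias, for each unit."""
    return (n_inputs + 1) * units


side = conv2d_output_size(28, kernel=3)          # 28x28 input, 3x3 kernel
conv_params = conv2d_params(1, filters=32, kernel=3)
flat_size = side * side * 32                     # what Flatten hands to Dense
fc_params = dense_params(flat_size, units=10)

print(side, flat_size)                           # 26 21632
print(conv_params, fc_params)                    # 320 216330
print(conv_params + fc_params)                   # 216650
```

Almost all of the parameters sit in the `Dense` layer, because `Flatten` receives a 26 x 26 x 32 volume. Shrinking that volume before `Flatten`, for example with pooling, is one of the design choices explored in the exercise below.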

### Exercise 1: A better CNN

The CNN we just implemented is very basic. Improve its design by adding layers before and after the `Flatten` layer. It is up to you what the model will be. Here are some things you may consider:

- how many convolutional layers to use?
- what filter size?
- how many filters in each layer?
- what activation function?
- pooling? what type?
- how many fully-connected layers after Flatten?
- dropout or batch normalization?
- what batch size for training?

Once you've defined your new model, compile it and train it on the training data for 5 epochs. Can you get the accuracy above 90%?

Your code should look like:

```python
model = Sequential([

  # YOUR CODE HERE
  # ...
])

model.compile(# YOUR CODE HERE)


model.fit(# YOUR CODE HERE)
```

### Exercise 2: Modularize the network

You may have noticed that, as a network gets deeper, some of its layers can be naturally grouped into small sub-networks, or modules, within the larger architecture. Can we leverage this and build a network out of modules?

Define 2 functions to achieve that.

#### Function 1
The first function should return a block with the following components:
- Conv2D with a variable number of filters and `'same'` padding
- Conv2D with a variable number of filters
- Optional BatchNormalization
- MaxPooling2D
- Dropout

Use the functional API to do this, and write a function with the following signature:
```python
def cnn_block(inputs, n_filters, activation, dropout, bn):
  x = ....(inputs)
  ....
  return x
```
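
When such blocks are stacked, each one shrinks the feature map, which limits how many blocks fit on a 28 x 28 image. The sketch below tracks the spatial size through repeated blocks. The exercise does not fix the kernel or pool size, so the 3 x 3 kernels and 2 x 2 max pooling used here are assumptions, and the helper names are hypothetical.

```python
def block_output_size(size, kernel=3, pool=2):
    """Spatial size after one block: 'same' conv, 'valid' conv, max pooling."""
    # The 'same' convolution leaves the size unchanged.
    after_valid = size - kernel + 1          # 'valid' convolution, stride 1
    if after_valid < pool:
        raise ValueError(f"feature map of size {size} is too small for another block")
    return (after_valid - pool) // pool + 1  # pooling with stride equal to pool size


def sizes_through_blocks(size=28, kernel=3, pool=2):
    """List the spatial size after each block until no further block fits."""
    sizes = []
    while True:
        try:
            size = block_output_size(size, kernel, pool)
        except ValueError:
            return sizes
        sizes.append(size)


print(sizes_through_blocks())  # [13, 5, 1]
```

Under these assumptions at most three blocks fit: the map goes 28 → 13 → 5 → 1, and a fourth block would fail at the `'valid'` convolution. Other kernel sizes or padding choices change this limit.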

#### Function 2
The second function should return a compiled model. It should use the first function to add CNN blocks, and it should have the following signature:

```python
def convolutional_model(n_blocks=2):
  inputs = Input(shape=(28, 28, 1))
  x = inputs

  for i in range(n_blocks):
    x = cnn_block(x, 32, 'relu', 0.4, True)
  
  # YOUR CODE HERE
  # ..

  outputs = Dense(10)(x)

  model = Model(# YOUR CODE HERE)

  model.compile(
     # YOUR CODE HERE
  )
  return model
```
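
Note that the final `Dense(10)` in this template has no activation, so the model outputs raw scores (logits) rather than probabilities. If you keep that layer as is, one option is a loss that applies the softmax internally, such as `SparseCategoricalCrossentropy(from_logits=True)`. The plain-Python sketch below shows what such a loss computes for a single example. It is an illustration only, not the Keras implementation, and the function name is made up.

```python
import math


def sparse_crossentropy_from_logits(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) = log_sum_exp - logits[label]
    return log_sum_exp - logits[label]


# With 10 identical logits the prediction is uniform, so the loss is ln(10).
uniform_loss = sparse_crossentropy_from_logits([0.0] * 10, label=3)
print(round(uniform_loss, 4))  # 2.3026

# A confident, correct prediction gives a loss close to zero.
confident_loss = sparse_crossentropy_from_logits([1000.0, 0.0], label=0)
print(confident_loss)
```

A loss near ln(10) ≈ 2.30 at the start of training is therefore what an untrained 10-class model is expected to produce.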

Test your functions by creating a model and training it for 1 epoch.

In [None]:
from tensorflow.keras.layers import Input, BatchNormalization
from tensorflow.keras.models import Model
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.optimizers import SGD, Adam, RMSprop, Adamax

### Exercise 3: Hyperparameter search with TensorBoard

Extend your search for the best model using what you've learned in the Hyperparameter tuning class.

Take the `convolutional_model` function from the previous exercise and modify it to accept hyperparameters.

You could define a dictionary to hold your ranges:

```python
hp_ranges = {
    'n_blocks': # YOUR CODE HERE
    'n_conv_filters_1': # YOUR CODE HERE
    # YOUR CODE HERE
    # ...
}
```
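
As a hedged illustration of how scikit-learn's `ParameterSampler` reads such a dictionary: values given as lists are sampled uniformly, while scipy distributions are sampled through their `rvs` method. The keys and ranges below are toy placeholders chosen for this sketch, not suggested values for the exercise.

```python
from scipy.stats import randint, uniform
from sklearn.model_selection import ParameterSampler

# Hypothetical ranges, for illustration only.
toy_ranges = {
    'toy_activation': ['relu', 'tanh'],   # list: pick one item uniformly
    'toy_units': randint(16, 65),         # integers from 16 to 64 inclusive
    'toy_dropout': uniform(0.1, 0.4),     # floats in [0.1, 0.5] (loc, scale)
}

# Draw 5 reproducible combinations from the space defined above.
samples = list(ParameterSampler(toy_ranges, n_iter=5, random_state=0))

for sample in samples:
    print(sample)
```

Each sample is a plain dictionary with one value per key, which is the form the `hparams` argument takes in the functions below.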

and then change the signature of your function to look like this:

```python
def convolutional_model(hparams):
  inputs = Input(shape=(28, 28, 1))
  x = inputs
  
  for i in range(hparams['n_blocks']):
    x = cnn_block(x,
                  hparams[f'n_conv_filters_{i+1}'],
                  ...
  # YOUR CODE HERE
  # ...
```

Finally, sample the parameter space and see if you can identify which parameters are important and which are unimportant for this particular dataset. The `train_test_model` function and the search loop are provided for you.

In [None]:
from tensorflow.keras.callbacks import TensorBoard
from tensorboard.plugins.hparams import api as hp
from sklearn.model_selection import ParameterGrid, ParameterSampler
from scipy.stats.distributions import expon, uniform, randint

In [None]:
logdir = 'logs/hparam_tuning/'

In [None]:
# YOUR CODE HERE

In [None]:
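def train_test_model(hparams, run):

  # Build a fresh model for this hyperparameter combination
  model = convolutional_model(hparams)

  # Train for 5 epochs, logging the training curves of this run to TensorBoard.
  # The log directory uses the `run` argument rather than the global `run_name`.
  model.fit(X_train, y_train, epochs=5,
            verbose=2,
            batch_size=hparams['batch_size'],
            validation_split=0.1,
            callbacks=[TensorBoard(logdir + run)]
            )

  # Train for 1 more epoch, recording the hyperparameters and final metrics
  # so that the run shows up in the HParams dashboard.
  model.fit(X_train, y_train, epochs=1,
            verbose=2,
            batch_size=hparams['batch_size'],
            validation_split=0.1,
            callbacks=[hp.KerasCallback(logdir + run, hparams)]
            )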

In [None]:
session_num = 0

for hparams in ParameterSampler(hp_ranges, n_iter=50, random_state=0):
  print(hparams)
  run_name = "run-%d" % session_num
  print('--- Starting trial: %s' % run_name)
  train_test_model(hparams, run_name)
  session_num += 1

In [None]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir logs