# Deep Dives & Best Practices

---

1. Consider the following neural net:

    ```python
    import tensorflow as tf
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu", input_shape=(28, 28, 3)))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Conv2D(64, kernel_size=3, activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Conv2D(128, kernel_size=3, activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(512, activation="relu"))
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    model.compile()
    model.summary()
    ```
     
2. Rewrite it using the Keras functional API.
3. Add an extra convolutional layer of 128 filters with `kernel_size` 3 before the `Flatten` layer.
4. Add a residual connection between the two 128-filter layers (DLWP section 9.3.2). Hint: specify `padding='same'` to guarantee shape compatibility.
5. Compile and train on MNIST using one or more callbacks, such as `tf.keras.callbacks.ModelCheckpoint` or `tf.keras.callbacks.EarlyStopping` (DLWP section 7.3.2). Note that MNIST images are greyscale, so the model's `input_shape` needs to be `(28, 28, 1)` rather than `(28, 28, 3)`.
6. Try adding batch normalisation (DLWP section 9.3.3).
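
The `padding='same'` hint for the residual connection follows from shape arithmetic. The sketch below, in plain Python with no TensorFlow dependency, traces the spatial size of the feature maps through the network above, assuming stride-1 convolutions and non-overlapping pooling as in the code. `conv_out` and `pool_out` are helper names introduced here for illustration only.

```python
def conv_out(size, kernel_size=3, padding="valid"):
    """Spatial output size of a stride-1 convolution."""
    if padding == "same":
        return size  # zero-padding keeps the spatial size unchanged
    return size - kernel_size + 1


def pool_out(size, pool_size=2):
    """Spatial output size of non-overlapping max pooling (rounds down)."""
    return size // pool_size


size = 28
trace = [size]
for _ in range(3):  # the three Conv2D + MaxPooling2D pairs
    size = conv_out(size)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3, 1]
print(conv_out(size, padding="valid"))  # -1: an unpadded 3x3 convolution no longer fits
print(conv_out(size, padding="same"))   # 1: same shape as the input, so the two tensors can be added
```

By the time the feature map reaches the extra layer it is only 1x1, so the new convolution needs `padding='same'` both to be valid at all and to produce an output whose shape matches the tensor it is added to.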

Optional: check out the [TPU template notebook](https://drive.google.com/file/d/1sQXVc4WmWZWRxWGy6yAzz2vrA-qIh0fP/view?usp=share_link). Encapsulate your net so that it can run with TPU acceleration (note: TPU availability in Google Colab is currently quite irregular).
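
One possible way to encapsulate the net is sketched below, under several assumptions. The `build_model` function and its parameters are hypothetical names, not taken from the template notebook. The sketch uses `tf.distribute.get_strategy()`, which returns TensorFlow's default strategy, so it also runs without a TPU; on a TPU runtime you would replace it with a `tf.distribute.TPUStrategy` set up as in the template notebook. The greyscale `(28, 28, 1)` input shape, the Adam optimizer and the sparse categorical cross-entropy loss are also assumptions, since the original code calls `model.compile()` without arguments.

```python
import tensorflow as tf


def build_model(input_shape=(28, 28, 1), num_classes=10):
    """Functional-API version of the ConvNet above, wrapped in a function."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):  # same three conv/pool pairs as the Sequential model
        x = tf.keras.layers.Conv2D(filters, kernel_size=3, activation="relu")(x)
        x = tf.keras.layers.MaxPooling2D(pool_size=2)(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


# Default (no-op) strategy; assumed stand-in for a TPUStrategy.
strategy = tf.distribute.get_strategy()

# Variables must be created inside the scope, so build and compile here.
with strategy.scope():
    model = build_model()
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

model.summary()
```

Keeping model creation inside a function means the same code can be reused under any strategy, and architectural changes stay in one place.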

---

# 2. Advanced practice (optional)

### Pick **one** of these two topics. 

These can be useful experiments if you feel like dedicating your coursework 2 to either computer vision or text-related tasks. Note that next week's labs focus on generative systems for artistic experiments, also a possible topic for the coursework.

## 2.1 Image segmentation with a ConvNet

Set up and run [this notebook](https://drive.google.com/file/d/1EPPh0vjqwLgzWBlf9TVGZ9dVbo3g4vgW/view?usp=share_link).

- Experiment with various hyperparameters.
- Can you encapsulate your ConvNet model so that it is easier to change its architecture (number of layers, number of filters, etc.)?

## 2.2 Sequence-to-sequence learning with RNNs/Transformers

Set up and run [this notebook](https://drive.google.com/file/d/1EPPh0vjqwLgzWBlf9TVGZ9dVbo3g4vgW/view?usp=share_link).

- Compare the RNN and Transformer architectures.
- Experiment with various hyperparameters.
- Can you encapsulate your Transformer model so that it is easier to change its architecture (number of layers, number of heads, etc.)?
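
A minimal sketch of one way to organise such experiments, in plain Python: a small configuration object holds the architectural choices, and a helper enumerates every combination to try. The class `TransformerConfig`, its field names and defaults, and the helper `config_grid` are all hypothetical and not taken from the notebook; adapt them to whatever arguments your own model-building function accepts.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TransformerConfig:
    # Hypothetical field names and defaults, for illustration only.
    num_layers: int = 1
    num_heads: int = 8
    embed_dim: int = 256
    dense_dim: int = 2048


def config_grid(**options):
    """Return one TransformerConfig per combination of the given option lists."""
    names = list(options)
    return [
        TransformerConfig(**dict(zip(names, values)))
        for values in product(*(options[name] for name in names))
    ]


# Three depths times two head counts gives six configurations.
configs = config_grid(num_layers=[1, 2, 4], num_heads=[2, 8])
for config in configs:
    print(config)  # pass each config to your own model-building function
```

Fields that are not listed in the call keep their defaults, so each experiment only has to name the hyperparameters it varies.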


Note: What would it take for the final Transformer model to be trained on more than one language?

---

#### Remember to run your best-performing models on the test set.