# < Deep Learning - PART2 TF2 CNNs >

# Ch 5. CNNs Workshop 1 - MNIST : Digit Recognition with Batch Normalization
2021/10/01

**[ Reference ]**
1. TensorFlow Core - Tutorials : **`TensorFlow 2 quickstart for experts`**, 2019. https://www.tensorflow.org/tutorials/quickstart/advanced
2. TensorFlow Core - Tutorials : **`Get started with TensorBoard`**, 2019. https://www.tensorflow.org/tensorboard/get_started
3. IT邦幫忙, **`Day 17: Tensorflow 2.0 再造訪 tf.GradientTape`**, 2019/10/02. 
https://ithelp.ithome.com.tw/articles/10223779
4. TensorFlow Core - Guides : **`Better performance with tf.function and AutoGraph`** 
https://www.tensorflow.org/guide/function
5. TensorFlow Core - API : **`tf.keras.layers.BatchNormalization`**
https://www.tensorflow.org/api_docs/python/tf/keras/layers/BatchNormalization
6. François Chollet, **Deep Learning with Python**, Section 7.3.1 (**BATCH NORMALIZATION**), pp. 260~261, Manning, 2018.
http://www.deeplearningitalia.com/wp-content/uploads/2017/12/Dropbox_Chollet.pdf#

In [24]:
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
import datetime

from tensorflow.keras.layers import Dense, Flatten, Conv2D, BatchNormalization
from tensorflow.keras import Model

In [25]:
tf.__version__

'2.6.0'

## 1. Load and prepare the `MNIST` dataset
+ **`MNIST`** dataset - http://yann.lecun.com/exdb/mnist/

In [26]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

### Checking the data info...

In [27]:
x_train.shape, y_train.shape

((60000, 28, 28), (60000,))

In [28]:
x_test.shape, y_test.shape

((10000, 28, 28), (10000,))

In [29]:
y_train    #  y_test

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [30]:
x_train.min(), x_train.max()    # x_test.min(), x_test.max()

(0, 255)

###  Normalization

In [31]:
x_train, x_test = x_train / 255.0, x_test / 255.0

In [32]:
x_train.min(), x_train.max()

(0.0, 1.0)

### Changing the data format into 4D Tensors for CNN's Inputs
+ **4D Tensor : (samples, height, width, channels)**

In [33]:
# Add a "channels" dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

In [34]:
x_train.shape  # x_test.shape

(60000, 28, 28, 1)

### Using `tf.data` to batch and shuffle the dataset:

+ **`tf.data.Dataset`** : https://www.tensorflow.org/api_docs/python/tf/data/Dataset
> `Class` **`Dataset`** 
        - Represents a potentially large set of elements.

+ **Use `tf.data` to batch and shuffle the dataset**:
>
> For example :
>
> `train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))`
     `.shuffle(10000).batch(32)`

> [ NOTE ]:
+ `tf.data.Dataset.from_tensor_slices` - https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensor_slices
+ `shuffle()` - https://www.tensorflow.org/api_docs/python/tf/data/Dataset?version=stable#shuffle
+ `batch()` - https://www.tensorflow.org/api_docs/python/tf/data/Dataset?version=stable#batch
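
The `shuffle(10000).batch(32)` chain above can be pictured with a small plain-Python stand-in. This is only a sketch of the documented behaviour (a fixed-size buffer that is sampled at random and refilled from the input, followed by grouping into batches that keep the final partial batch); the helper names `shuffle_with_buffer` and `batch` are hypothetical and are not part of `tf.data`.

```python
import random


def shuffle_with_buffer(items, buffer_size, seed=None):
    """Yield items in buffer-shuffled order (plain-Python stand-in, not tf.data)."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain the remaining elements in random order.
    rng.shuffle(buffer)
    yield from buffer


def batch(items, batch_size):
    """Group items into lists of batch_size; the last batch may be smaller."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


shuffled = list(shuffle_with_buffer(range(100), buffer_size=10, seed=0))
batch_sizes = [len(b) for b in batch(shuffled, 32)]
print(batch_sizes)  # [32, 32, 32, 4]
```

With the sizes used in this workshop, 60000 training samples in batches of 32 give exactly 1875 batches, while 10000 test samples give 312 full batches plus a final batch of 16. Because the buffer (10000) is smaller than the training set, the shuffle is only approximate: an element cannot be emitted much earlier than its position in the input.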

In [35]:
# Input pipelines: shuffle the training set (buffer of 10000), then batch by 32
train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)).shuffle(10000).batch(32)

test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

In [36]:
train_ds

<BatchDataset shapes: ((None, 28, 28, 1), (None,)), types: (tf.float64, tf.uint8)>

## 2. Forward Propagation
+ Build the `tf.keras` model using the Keras [model subclassing API](https://www.tensorflow.org/guide/keras#model_subclassing):

In [37]:
class MyModel(Model):   # tf.keras.Model class
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')  # tf.keras.layers.Conv2D()
        self.batchnorm1 = BatchNormalization()   # tf.keras.layers.BatchNormalization()
        self.flatten = Flatten()                 # tf.keras.layers.Flatten()
        self.d1 = Dense(128, activation='relu')  # tf.keras.layers.Dense()
        self.batchnorm2 = BatchNormalization()   # tf.keras.layers.BatchNormalization()
        self.d2 = Dense(10)

    def call(self, x):
        x = self.conv1(x)
        x = self.batchnorm1(x)
        x = self.flatten(x)
        x = self.d1(x)
        x = self.batchnorm2(x)
        return self.d2(x)

# Create an instance of the model
model = MyModel()
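
To make the tensor shapes inside `MyModel` concrete, the arithmetic below walks through the layers by hand. It is a sketch that assumes the Keras defaults (`Conv2D` with `padding='valid'` and stride 1, and `BatchNormalization` keeping four vectors per channel: gamma, beta, moving mean, moving variance). The helper functions are hypothetical names used only for this illustration.

```python
def conv2d_valid_output(height, width, kernel, filters):
    """Output (height, width, channels) of a stride-1 'valid' convolution."""
    return height - kernel + 1, width - kernel + 1, filters


def conv2d_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units


def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


h, w, c = conv2d_valid_output(28, 28, kernel=3, filters=32)  # (26, 26, 32)
flat = h * w * c                                             # 21632 after Flatten

layers = {
    "conv1": conv2d_params(3, 1, 32),        # 320
    "batchnorm1": batchnorm_params(32),      # 128
    "d1": dense_params(flat, 128),           # 2769024
    "batchnorm2": batchnorm_params(128),     # 512
    "d2": dense_params(128, 10),             # 1290
}
total = sum(layers.values())                 # 2771274
non_trainable = 2 * 32 + 2 * 128             # moving statistics only: 320

for name, count in layers.items():
    print(f"{name:>10}: {count}")
print("total:", total, "non-trainable:", non_trainable)
```

Almost all parameters sit in the first `Dense` layer, because the flattened convolution output has 21632 features.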

In [38]:
model

<__main__.MyModel at 0x197192a5220>

## 3. Backpropagation
### Choosing an optimizer and loss function for training
> **[Loss Function] :  `tf.keras.losses.SparseCategoricalCrossentropy()`** : https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy
> + **Class `SparseCategoricalCrossentropy`** - Computes the crossentropy loss between the labels and predictions.
    + **Use this crossentropy loss function when there are two or more label classes.**
    + **We expect labels to be provided as integers.** 
    
> [NOTE] :
> + **If you want to provide labels using `one-hot` representation, please use `CategoricalCrossentropy` loss.**
> + With `SparseCategoricalCrossentropy`, `y_pred` should hold `# classes` floating point values per sample and `y_true` a single class index per sample (here, `y_pred` has shape `(batch, 10)` and `y_true` has shape `(batch,)`).

In [39]:
#  < tf.keras.losses.SparseCategoricalCrossentropy >
#    from_logits: Whether y_pred is expected to be a logits tensor. 
#                 By default, we assume that y_pred encodes 
#                 a probability distribution. 
#   [Note]: Using from_logits=True may be more numerically stable.

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

optimizer = tf.keras.optimizers.Adam()
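
As a rough illustration of what this loss computes when `from_logits=True`, the plain-Python sketch below evaluates `-log(softmax(logits)[label])` for integer labels with the log-sum-exp trick and averages over the batch. It is not the TensorFlow implementation; the function names are hypothetical, and the batch averaging is assumed to match the default reduction.

```python
import math


def sparse_categorical_crossentropy_from_logits(label, logits):
    """Loss for one sample: -log(softmax(logits)[label]), computed stably."""
    m = max(logits)
    # Subtracting the max keeps every exponent <= 0, so exp() cannot overflow.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_loss(labels, batch_logits):
    """Average the per-sample losses over a batch."""
    losses = [
        sparse_categorical_crossentropy_from_logits(y, z)
        for y, z in zip(labels, batch_logits)
    ]
    return sum(losses) / len(losses)


# Ten equal logits: every class has probability 1/10, so the loss is ln(10).
print(sparse_categorical_crossentropy_from_logits(3, [0.0] * 10))

# A very large logit is handled without overflow; a naive exp(1000.0) would fail.
print(sparse_categorical_crossentropy_from_logits(0, [1000.0, 0.0]))  # 0.0
```

Working on logits directly is what makes the large-logit case safe, which is one way to read the note above that `from_logits=True` may be more numerically stable.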

### Selecting metrics to measure the loss and the accuracy of the model 
+ These metrics accumulate values over the batches of an epoch; the overall result is printed at the end of each epoch, and the metrics are reset before the next one.
+ Module: **`tf.keras.metrics`**
https://www.tensorflow.org/api_docs/python/tf/keras/metrics
    > **Built-in metrics classes** : (*to name a few*)
    > + **`class SparseCategoricalAccuracy`** : Calculates how often predictions match integer labels. https://www.tensorflow.org/api_docs/python/tf/keras/metrics/SparseCategoricalAccuracy
    > + **`class CategoricalAccuracy`** : Calculates how often predictions match one-hot labels.
    > + **`class Mean`** : Computes the (weighted) mean of the given values. 

In [40]:
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')
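
The accumulate / read / reset life cycle of these metric objects can be sketched in plain Python. The two classes below are hypothetical stand-ins, not the Keras classes: `RunningMean` averages the scalars it receives, and `RunningSparseAccuracy` counts how often the arg-max of the logits equals the integer label.

```python
class RunningMean:
    """Stand-in for a streaming mean: one update per batch, reset per epoch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total = 0.0
        self.count = 0


class RunningSparseAccuracy:
    """Stand-in for a streaming accuracy over integer labels."""

    def __init__(self):
        self.correct = 0
        self.seen = 0

    def update(self, labels, batch_logits):
        for label, logits in zip(labels, batch_logits):
            predicted = max(range(len(logits)), key=logits.__getitem__)
            self.correct += int(predicted == label)
            self.seen += 1

    def result(self):
        return self.correct / self.seen if self.seen else 0.0

    def reset(self):
        self.correct = 0
        self.seen = 0


loss_metric = RunningMean()
for batch_loss in [0.9, 0.5, 0.1]:      # one scalar loss per batch
    loss_metric.update(batch_loss)
print(loss_metric.result())             # mean of the three batch losses

accuracy_metric = RunningSparseAccuracy()
accuracy_metric.update([1, 0], [[0.1, 0.9], [0.2, 0.8]])
print(accuracy_metric.result())         # 0.5: one of two predictions is correct
```

Under this reading, a loss metric that is fed one scalar per batch reports a mean of batch means, so a smaller final batch carries the same weight as a full one.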

------------
### Building A Training Model with `tf.function` & `tf.GradientTape`  
+ Use the `tf.function` decorator and the `tf.GradientTape` context manager to write a custom training loop instead of calling `tf.keras`'s `model.fit()`.
+ **`tf.GradientTape()`** (https://www.tensorflow.org/api_docs/python/tf/GradientTape)
    + When training with methods such as **`tf.GradientTape()`**, we can use **`tf.summary`** to log the required information.
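> + `tf.GradientTape()` can be used inside the training loop to record the forward pass and build its computation graph.
> + Once recording is finished, call the tape object's `gradient()` method, passing in the loss value (loss score) and the model's trainable parameters. [from **`Ref 3.`**]
> + As soon as the gradients are computed, call `optimizer.apply_gradients()` with a list of tuples; in each tuple the first element is the gradient computed for a parameter and the second element is the parameter variable itself. [from **`Ref 3.`**]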
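
The three steps above (forward pass, gradients, parameter update) can be mimicked without TensorFlow on a one-variable linear model. This is a deliberately simplified stand-in: the gradients are derived by hand instead of being recorded by a tape, the update is plain SGD rather than Adam, and every function name is hypothetical. What it keeps is the pairing convention, with the gradient first and the variable second.

```python
def forward_and_loss(params, xs, ys):
    """Forward pass of y = w * x + b and its mean squared error."""
    preds = [params["w"] * x + params["b"] for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    return preds, loss


def gradients(preds, xs, ys):
    """Hand-derived d(loss)/dw and d(loss)/db (a tape would compute these)."""
    n = len(xs)
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    return [grad_w, grad_b]


def apply_gradients(grads_and_vars, params, learning_rate=0.1):
    """Plain SGD update; each pair is (gradient, variable name)."""
    for grad, name in grads_and_vars:
        params[name] -= learning_rate * grad


params = {"w": 0.0, "b": 0.0}
xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]   # samples of y = 2x + 1
losses = []
for _ in range(500):
    preds, loss = forward_and_loss(params, xs, ys)     # 1. forward pass
    losses.append(loss)
    grads = gradients(preds, xs, ys)                   # 2. gradients of the loss
    apply_gradients(zip(grads, ["w", "b"]), params)    # 3. parameter update

print(params)  # w approaches 2 and b approaches 1
```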

In [42]:
@tf.function
def train_step(images, labels):    # images : x_train , labels : y_train
    ## ----------------------------------------------------------------------
    ##  Forward propagation - 
    ##    tf.GradientTape() can be used inside the training loop to record
    ##    the forward pass and build its computation graph.
    ## ----------------------------------------------------------------------
    with tf.GradientTape() as tape:
        # training=True matters whenever layers behave differently during
        # training and inference (e.g. Dropout, or BatchNormalization here).
        predictions = model(images, training=True)
        loss = loss_object(labels, predictions)

    ## ----------------------------------------------------------------------
    ##  Backpropagation - 
    ##    Once recording is finished, call the tape's gradient() method with
    ##    the loss value and the model's trainable parameters. [from Ref 3.]
    ## ----------------------------------------------------------------------
    gradients = tape.gradient(loss, model.trainable_variables)
    
    ## ----------------------------------------------------------------------
    ##  Parameters' update - 
    ##    As soon as the gradients are computed, call
    ##    optimizer.apply_gradients() with a list of tuples; in each tuple the
    ##    first element is the gradient and the second is the variable. [from Ref 3.]
    ## ----------------------------------------------------------------------
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)

### Testing the model :

In [43]:
@tf.function
def test_step(images, labels):
    # training=False matters whenever layers behave differently during
    # training and inference (e.g. Dropout, or BatchNormalization here).
    predictions = model(images, training=False)
    t_loss = loss_object(labels, predictions)

    test_loss(t_loss)
    test_accuracy(labels, predictions)
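
Why the `training` flag matters for the `BatchNormalization` layers in this model can be seen in a single-feature, plain-Python sketch. It assumes the documented Keras defaults (`momentum=0.99`, `epsilon=0.001`, gamma initialised to 1, beta to 0, moving mean to 0 and moving variance to 1) and uses the biased batch variance throughout. The class `BatchNorm1Feature` is hypothetical, and the real layer may differ in details.

```python
import math


class BatchNorm1Feature:
    """Batch normalization of one scalar feature (illustrative stand-in)."""

    def __init__(self, momentum=0.99, epsilon=1e-3):
        self.gamma, self.beta = 1.0, 0.0
        self.moving_mean, self.moving_var = 0.0, 1.0
        self.momentum, self.epsilon = momentum, epsilon

    def __call__(self, batch, training):
        if training:
            # Training: normalize with the statistics of the current batch
            # and move the running statistics a small step towards them.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference: normalize with the running statistics only.
            mean, var = self.moving_mean, self.moving_var
        scale = self.gamma / math.sqrt(var + self.epsilon)
        return [scale * (x - mean) + self.beta for x in batch]


bn = BatchNorm1Feature()
batch_values = [10.0, 12.0, 14.0]
train_out = bn(batch_values, training=True)    # centred on the batch mean (12)
infer_out = bn(batch_values, training=False)   # uses moving_mean, now about 0.12
print(train_out)
print(infer_out)
```

After a single training step the moving statistics are still close to their initial values, so the same batch produces very different outputs in the two modes. This is why `train_step` passes `training=True` and `test_step` passes `training=False`.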

## 4. TensorBoard with `tf.summary`
+ Set up two summary writers, one for training and one for testing, each writing its summaries to disk in a separate log directory:

In [44]:
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

log_directory = 'logs/CNN_MNIST_BN/'
train_log_dir = log_directory + current_time + '/train'
test_log_dir = log_directory + current_time + '/test'

train_summary_writer = tf.summary.create_file_writer(train_log_dir)
test_summary_writer = tf.summary.create_file_writer(test_log_dir)

> **[NOTE] :** 
+ **Use `tf.summary.scalar()` to log metrics (loss and accuracy) during training/testing within the scope of the summary writers to write the summaries to disk.** 
+ You have control over which metrics to log and how often to do it. 
+ **Other `tf.summary` functions enable logging other types of data.**

## 5. Training & Testing Processes

In [45]:
EPOCHS = 10

for epoch in range(EPOCHS):
    # Reset the metrics at the start of each epoch
    train_loss.reset_states()
    train_accuracy.reset_states()
    test_loss.reset_states()
    test_accuracy.reset_states()

    for images, labels in train_ds:
        train_step(images, labels)
        
    with train_summary_writer.as_default():
        tf.summary.scalar('loss', train_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)
        
    with test_summary_writer.as_default():
        tf.summary.scalar('loss', test_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', test_accuracy.result(), step=epoch)

    template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
    print(template.format(epoch+1,
                          train_loss.result(),
                          train_accuracy.result()*100,
                          test_loss.result(),
                          test_accuracy.result()*100))

Epoch 1, Loss: 0.13404148817062378, Accuracy: 96.05500030517578, Test Loss: 0.06725851446390152, Test Accuracy: 97.83999633789062
Epoch 2, Loss: 0.05010765418410301, Accuracy: 98.44667053222656, Test Loss: 0.08503703773021698, Test Accuracy: 97.47000122070312
Epoch 3, Loss: 0.02724248543381691, Accuracy: 99.12999725341797, Test Loss: 0.05962126702070236, Test Accuracy: 98.18999481201172
Epoch 4, Loss: 0.019286485388875008, Accuracy: 99.35333251953125, Test Loss: 0.05750630050897598, Test Accuracy: 98.3699951171875
Epoch 5, Loss: 0.013315939344465733, Accuracy: 99.58499908447266, Test Loss: 0.0671699196100235, Test Accuracy: 98.22999572753906
Epoch 6, Loss: 0.010507801547646523, Accuracy: 99.63666534423828, Test Loss: 0.07832274585962296, Test Accuracy: 97.98999786376953
Epoch 7, Loss: 0.007947785779833794, Accuracy: 99.73333740234375, Test Loss: 0.05841346085071564, Test Accuracy: 98.37999725341797
Epoch 8, Loss: 0.006727313157171011, Accuracy: 99.78166961669922, Test Loss: 0.080632574

The image classifier is now trained to ~98% accuracy on this dataset. To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials).

> ###  To run TensorBoard, run the following command in the `Anaconda (PowerShell) Prompt` :
`tensorboard --logdir=`_path/to/log-directory_

> + For instance, **`tensorboard --logdir logs/CNN_MNIST_BN`**

> Then open **`http://localhost:6006`** in a browser.