<a href="https://colab.research.google.com/github/AmirJlr/Deep-Learning-FUM/blob/master/00-tensorflow-first-steps/TensorFlowHandsOn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Writing a training loop from scratch in TensorFlow

Writing low-level training & evaluation loops in TensorFlow.

In [27]:
import time
import os

# This guide can only be run with the TensorFlow backend.
os.environ["KERAS_BACKEND"] = "tensorflow"

import tensorflow as tf
import keras
import numpy as np

### MNIST model

In [28]:
def get_model():
    inputs = keras.Input(shape=(784,), name='digits')
    x1 = keras.layers.Dense(64, activation = 'relu')(inputs)
    x2 = keras.layers.Dense(64, activation = 'relu')(x1)
    outputs = keras.layers.Dense(10, name='prediction')(x2)

    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

model = get_model()
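
Because the last `Dense` layer has no activation, the model returns raw logits rather than probabilities. As a quick sanity check, the sketch below rebuilds the same architecture under a hypothetical name (`demo_model`), feeds it an all-zeros batch, and converts the logits to probabilities with `tf.nn.softmax`. The dummy batch and the variable names are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import tensorflow as tf
import keras

# Same architecture as get_model(), rebuilt here so the snippet is self-contained.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu")(inputs)
x = keras.layers.Dense(64, activation="relu")(x)
outputs = keras.layers.Dense(10, name="prediction")(x)
demo_model = keras.Model(inputs=inputs, outputs=outputs)

# Hypothetical dummy batch: 4 flattened 28x28 images, all zeros.
dummy_batch = np.zeros((4, 784), dtype="float32")

logits = demo_model(dummy_batch)        # raw scores, shape (4, 10)
probs = tf.nn.softmax(logits, axis=-1)  # each row now sums to 1

# (784*64 + 64) + (64*64 + 64) + (64*10 + 10) = 55,050 parameters
print(logits.shape, demo_model.count_params())
```

Keeping the softmax out of the model is what lets the loss be created with `from_logits=True`.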

Let's train it using `mini-batch gradient descent` with a custom training loop.

First, we need an `optimizer`, a `loss function`, and a `dataset`:

In [29]:
# Optimizer
optimizer = keras.optimizers.Adam(learning_rate=1e-3)

# Loss Function
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Prepare Data
batch_size = 32
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
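
To make `from_logits=True` more concrete, here is a minimal NumPy sketch of what a sparse categorical cross-entropy on raw logits computes: a log-softmax over each row, followed by picking the log-probability of the integer label and averaging over the batch. The function name `sparse_ce_from_logits` and the toy inputs are hypothetical, and the averaging assumes the loss's default mean-over-batch reduction. It is an illustration of the formula, not Keras's implementation.

```python
import numpy as np

def sparse_ce_from_logits(labels, logits):
    """Mean cross-entropy for integer labels and raw (unnormalized) logits."""
    # Subtract the row-wise max for numerical stability (does not change softmax).
    shifted = logits - logits.max(axis=1, keepdims=True)
    # log-softmax: log(exp(z_i) / sum_j exp(z_j))
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability assigned to the true class of each sample.
    picked = log_probs[np.arange(len(labels)), labels]
    return -picked.mean()

# Toy batch: 2 samples, 3 classes (illustrative values only).
labels = np.array([0, 2])
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])

loss = sparse_ce_from_logits(labels, logits)
print(round(float(loss), 4))
```

Note that the labels stay as plain integers (the "sparse" part), so no one-hot encoding of `y_train` is needed.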


In [30]:
x_train = np.reshape(x_train, (-1, 784))
x_test = np.reshape(x_test, (-1, 784))
print(x_train.shape)
print(x_test.shape)

(60000, 784)
(10000, 784)


In [31]:
# Reserve 10,000 samples for validation.
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

In [32]:
print(x_train.shape)
print(y_train.shape)

(50000, 784)
(50000,)


In [33]:
y_train[0], x_train[0]

(5,
 array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   3,  18,  18,  18,
        126, 136, 175,  26, 166, 255, 247, 127,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,  30,
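
The pixel values printed above are raw `uint8` intensities in the range 0 to 255, and the notebook feeds them to the model unscaled. A common optional preprocessing step, not applied in this notebook, is to cast the data to `float32` and divide by 255 so that inputs lie in [0, 1]. A minimal sketch on a hypothetical handful of pixel values:

```python
import numpy as np

# Hypothetical stand-in for a few raw MNIST pixels (uint8, 0-255).
raw = np.array([[0, 3, 18, 126, 255]], dtype=np.uint8)

# Cast first, then divide, so the result is float32 rather than float64.
scaled = raw.astype("float32") / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```

The same two operations could be applied to `x_train`, `x_val`, and `x_test` before building the datasets. Whether to do so is a design choice left open here.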

In [34]:
### Prepare Training Data
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset

<_TensorSliceDataset element_spec=(TensorSpec(shape=(784,), dtype=tf.uint8, name=None), TensorSpec(shape=(), dtype=tf.uint8, name=None))>

In [35]:
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)
train_dataset

<_BatchDataset element_spec=(TensorSpec(shape=(None, 784), dtype=tf.uint8, name=None), TensorSpec(shape=(None,), dtype=tf.uint8, name=None))>
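
The `None` in the batched shapes reflects that the last batch of an epoch may be smaller than `batch_size`. The sketch below shows this on a hypothetical toy dataset of 10 samples batched in groups of 4, and computes how many batches per epoch to expect for 50,000 samples with a batch size of 32. The toy arrays and the `seed` argument are assumptions made for illustration.

```python
import math

import numpy as np
import tensorflow as tf

# Hypothetical toy data: 10 samples with one feature each.
features = np.arange(10, dtype=np.float32).reshape(10, 1)
labels = np.arange(10, dtype=np.int64)

toy = tf.data.Dataset.from_tensor_slices((features, labels))
toy = toy.shuffle(buffer_size=10, seed=0).batch(4)

# Batch sizes within one pass: two full batches and one partial batch.
batch_sizes = [int(x.shape[0]) for x, _ in toy]

# Shuffling reorders samples but every sample still appears exactly once.
seen = sorted(int(v) for _, y in toy for v in y.numpy())

# Batches per epoch for the notebook's training split.
steps_per_epoch = math.ceil(50_000 / 32)

print(batch_sizes, steps_per_epoch)
```

With `buffer_size=1024` on 50,000 samples, the shuffle is only approximate: each element is drawn from a sliding buffer of 1,024 candidates rather than from the whole dataset.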

In [36]:
### Prepare the validation dataset.
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.batch(batch_size)

Calling a model inside a **GradientTape scope** enables you to **retrieve the gradients of a loss value with respect to the model's trainable weights**.

Using an **optimizer** instance, you can use these gradients to **update these variables** (which you can retrieve using `model.trainable_weights`).
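
As a minimal illustration of this mechanism, the sketch below records a one-variable "loss" inside a `tf.GradientTape` scope, retrieves its gradient, and applies one update. The variable `w`, the loss `w ** 2`, and the learning rate are hypothetical, and the manual `assign_sub` stands in for an optimizer with a plain gradient-descent step. It is not what `Adam` computes.

```python
import tensorflow as tf

# Hypothetical single trainable variable.
w = tf.Variable(3.0)

# Operations on w inside the scope are recorded for differentiation.
with tf.GradientTape() as tape:
    loss = w ** 2

# d(w^2)/dw evaluated at w = 3.0
grad = tape.gradient(loss, w)

# One plain gradient-descent step: w <- w - lr * grad
learning_rate = 0.1
w.assign_sub(learning_rate * grad)

print(float(grad), float(w.numpy()))
```

In the training loop, `model.trainable_weights` plays the role of `w`, and `optimizer.apply_gradients` replaces the manual update.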


#### Here's our training loop, step by step:

- We open a for loop that iterates over epochs

- For each epoch, we open a for loop that iterates over the dataset, in batches

- For each batch, we open a GradientTape() scope

- Inside this scope, we call the model (forward pass) and compute the loss

- Outside the scope, we retrieve the gradients of the weights of the model with regard to the loss

- Finally, we use the optimizer to update the weights of the model based on the gradients

In [37]:
EPOCHES = 3

for epoch in range(1, EPOCHES+1):
    print(f'\nStart of epoch {epoch}')

    # iterate over the dataset in batches
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):

        # open a GradientTape() scope
        with tf.GradientTape() as tape:
            # call the model (forward pass) and compute the loss
            logits = model(x_batch_train, training=True)

            loss_value = loss_fn(y_batch_train, logits)

        # retrieve the gradients of the weights of the model with regard to the loss
        grads = tape.gradient(loss_value, model.trainable_weights)

        # use the optimizer to update the weights of the model based on the gradients
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        # Log every 100 batches.
        if step % 100 == 0:
            print(
                f"Training loss (for 1 batch) at step {step}: {float(loss_value):.4f}"
            )
            print(f"Seen so far: {(step + 1) * batch_size} samples")






Start of epoch 1

Training loss (for 1 batch) at step 0: 120.1026
Seen so far: 32 samples
Training loss (for 1 batch) at step 100: 2.4283
Seen so far: 3232 samples
Training loss (for 1 batch) at step 200: 2.0205
Seen so far: 6432 samples
Training loss (for 1 batch) at step 300: 1.7353
Seen so far: 9632 samples
Training loss (for 1 batch) at step 400: 0.6094
Seen so far: 12832 samples
Training loss (for 1 batch) at step 500: 1.8939
Seen so far: 16032 samples
Training loss (for 1 batch) at step 600: 1.6675
Seen so far: 19232 samples
Training loss (for 1 batch) at step 700: 0.4265
Seen so far: 22432 samples
Training loss (for 1 batch) at step 800: 1.8246
Seen so far: 25632 samples
Training loss (for 1 batch) at step 900: 0.7020
Seen so far: 28832 samples
Training loss (for 1 batch) at step 1000: 1.0976
Seen so far: 32032 samples
Training loss (for 1 batch) at step 1100: 1.0294
Seen so far: 35232 samples
Training loss (for 1 batch) at step 1200: 0.6754
Seen so far: 38432 samples
Training loss (for 1 batch) 