In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import numpy as np
import tensorflow as tf
import time
from typing import Tuple

Up until now, you've been running your code in *eager execution* mode, which is enabled by default. In this mode, code executes in the order you're accustomed to, and you can add breakpoints and inspect the values of your tensors and variables as usual.

In contrast, in [graph execution](https://www.tensorflow.org/guide/intro_to_graphs) mode, execution flows a bit differently. The first time the code runs, TensorFlow builds a computation graph that records the operations and tensors in that code. On subsequent calls, it executes the graph instead of the Python code. One consequence of this flow is that you can't debug your code in the usual manner. You gain two major advantages though:
- The graph can be deployed to environments that don't have Python, such as embedded devices. 
- The graph can take advantage of several performance optimizations, such as running parts of the code in parallel.
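
The "first pass builds the graph, later passes reuse it" behavior can be observed directly. The following is a minimal sketch, not part of the training code: the function `scaled_sum` and the counter `trace_count` are hypothetical names introduced only for illustration. The Python-side counter increments while the graph is being traced, and stays untouched when the graph is reused for inputs of the same shape and dtype.

```python
import tensorflow as tf

trace_count = 0  # hypothetical counter: how many times the Python body runs


@tf.function
def scaled_sum(x: tf.Tensor) -> tf.Tensor:
  global trace_count
  trace_count += 1  # Python side effect: runs only while the graph is traced
  return tf.reduce_sum(x) * 2.0


a = scaled_sum(tf.constant([1.0, 2.0, 3.0]))  # first call: traces the graph
b = scaled_sum(tf.constant([4.0, 5.0, 6.0]))  # same shape and dtype: reuses it

print(f'results: {float(a)}, {float(b)}; Python body ran {trace_count} time(s)')
```

This is also why ordinary breakpoints and `print` calls inside a decorated function don't fire on every call.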

To get the best of both worlds, use eager execution mode during the development phase, and then switch to graph execution mode once you're done debugging the model. To switch from eager to graph execution, add the `@tf.function` decorator to the function containing your model operations.
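
As an alternative to removing the decorator while debugging, TensorFlow also offers a global switch, `tf.config.run_functions_eagerly`. The sketch below assumes a hypothetical function `probe` and a list `seen`, both introduced only for illustration, and records which mode the function body runs in under each setting.

```python
import tensorflow as tf

seen = []  # hypothetical log of the execution mode observed inside the function


@tf.function
def probe(x: tf.Tensor) -> tf.Tensor:
  # Python code: runs on every call in eager mode, only during tracing in graph mode.
  seen.append(tf.executing_eagerly())
  return x + 1


tf.config.run_functions_eagerly(True)   # debugging: decorated functions run eagerly
probe(tf.constant(1))

tf.config.run_functions_eagerly(False)  # back to the default: graph execution
probe(tf.constant(1))

print(seen)
```

The switch affects every decorated function in the process, so it is best reset to `False` before measuring performance.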

Let's look at the training code again, but this time with the `@tf.function` decorator applied to the `fit_one_batch` function, which is where you have all the model operations.

In [2]:
!wget -Nq https://raw.githubusercontent.com/MicrosoftDocs/tensorflow-learning-path/main/intro-tf/tintro.py
from tintro import *

In [None]:
@tf.function
def fit_one_batch(X: tf.Tensor, y: tf.Tensor, model: tf.keras.Model, loss_fn: tf.keras.losses.Loss, 
optimizer: tf.keras.optimizers.Optimizer) -> Tuple[tf.Tensor, tf.Tensor]:
  with tf.GradientTape() as tape:
    y_prime = model(X, training=True)
    loss = loss_fn(y, y_prime)

  grads = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(grads, model.trainable_variables))

  return (y_prime, loss)


def fit(dataset: tf.data.Dataset, model: tf.keras.Model, loss_fn: tf.keras.losses.Loss, 
optimizer: tf.optimizers.Optimizer) -> None:
  loss_sum = 0
  correct_item_count = 0
  current_item_count = 0
  print_every = 100
  batch_loss = 0
  batch_index = 0

  for (X, y) in dataset:
    (y_prime, loss) = fit_one_batch(X, y, model, loss_fn, optimizer)

    y = tf.cast(y, tf.int64)
    correct_item_count += (tf.math.argmax(y_prime, axis=1) == y).numpy().sum()

    batch_loss = loss.numpy()
    loss_sum += batch_loss
    current_item_count += len(X)
    batch_index += 1

    if ((batch_index) % print_every == 0):
      batch_accuracy = correct_item_count / current_item_count * 100
      print(f'[Batch {batch_index:>3d} - {current_item_count:>5d} items] accuracy: {batch_accuracy:>0.1f}%, loss: {batch_loss:>7f}')

  batch_accuracy = correct_item_count / current_item_count * 100
  print(f'[Batch {batch_index:>3d} - {current_item_count:>5d} items] accuracy: {batch_accuracy:>0.1f}%, loss: {batch_loss:>7f}')


learning_rate = 0.1
batch_size = 64
epochs = 5

(train_dataset, test_dataset) = get_data(batch_size)

model = NeuralNetwork()

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.optimizers.SGD(learning_rate)

print('\nFitting:')
t_begin = time.time()
for epoch in range(epochs):
  print(f'\nEpoch {epoch + 1}\n-------------------------------')
  fit(train_dataset, model, loss_fn, optimizer)
t_elapsed = time.time() - t_begin
print(f'\nTime per epoch: {t_elapsed / epochs :>.3f} sec' )

Notice that the code also times the training loop and prints the average time per epoch. Comment out the `@tf.function` decorator and rerun the cell to compare the elapsed times. Eager execution can take more than twice as long to train as graph execution.
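
The same comparison can be made on a toy function without retraining the model. This is a rough sketch under stated assumptions: `many_small_ops` and `seconds_per_call` are hypothetical helpers, and the measured ratio depends on your hardware and TensorFlow version, so no particular speedup is implied.

```python
import time
import tensorflow as tf


def many_small_ops(x: tf.Tensor) -> tf.Tensor:
  # Many tiny operations: the kind of workload where per-op Python overhead shows.
  for _ in range(50):
    x = tf.tanh(x * 1.01 + 0.1)
  return tf.reduce_sum(x)


graph_version = tf.function(many_small_ops)  # same code, wrapped for graph execution


def seconds_per_call(fn, x: tf.Tensor, repeats: int = 20) -> float:
  fn(x)  # warm-up call, so one-time tracing isn't included in the timing
  t_begin = time.perf_counter()
  for _ in range(repeats):
    fn(x)
  return (time.perf_counter() - t_begin) / repeats


x = tf.random.uniform((64, 128))
eager_time = seconds_per_call(many_small_ops, x)
graph_time = seconds_per_call(graph_version, x)
print(f'eager: {eager_time:.6f} sec/call, graph: {graph_time:.6f} sec/call')
```

The warm-up call matters: the first call to a `tf.function` includes the cost of building the graph, which would otherwise distort the comparison.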

Now that you've trained your model, you're ready to test it, which you can do by running a single forward pass through the network. The function `evaluate_one_batch` contains the code that does this: you simply call the `model` to get a prediction, then call the loss function `loss_fn` to get a score for how the predicted labels `y_prime` compare to the actual labels `y`. Notice that you don't add a `tf.GradientTape()` this time: you don't do a backward pass during testing, so you don't need to calculate derivatives for gradient descent. Notice also that you add a `@tf.function` decorator once you're done with development and debugging, to get a performance boost.

In [4]:
@tf.function
def evaluate_one_batch(X: tf.Tensor, y: tf.Tensor, model: tf.keras.Model, 
loss_fn: tf.keras.losses.Loss) -> Tuple[tf.Tensor, tf.Tensor]:
  y_prime = model(X, training=False)
  loss = loss_fn(y, y_prime)

  return (y_prime, loss)
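
To see what the loss value returned by `evaluate_one_batch` represents, you can check `SparseCategoricalCrossentropy(from_logits=True)` against a hand computation. The sketch below uses made-up logits and labels for two items and three classes (illustrative values, not taken from the dataset) and compares the Keras loss with the mean negative log-softmax of the correct class.

```python
import numpy as np
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Made-up example: 2 items, 3 classes. y holds class indices, y_prime holds raw logits.
y = tf.constant([0, 2])
y_prime = tf.constant([[3.0, 0.5, 0.1],
                       [0.2, 0.1, 2.5]])

loss = loss_fn(y, y_prime)

# Hand computation: log-softmax of the logits, then pick the correct class per item.
logits = y_prime.numpy()
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
manual_loss = -log_probs[np.arange(len(logits)), y.numpy()].mean()

print(f'Keras loss: {loss.numpy():.6f}, manual loss: {manual_loss:.6f}')
```

Because `from_logits=True` applies the softmax inside the loss, the model's output layer does not need its own softmax activation.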

The `evaluate` function calls the `evaluate_one_batch` function for the entire dataset, once per mini-batch. The important code in the function below is just the `for` loop and the call to `evaluate_one_batch` within it. The rest is just boilerplate code to print progress during execution.

In [5]:
def evaluate(dataset: tf.data.Dataset, model: tf.keras.Model, 
loss_fn: tf.keras.losses.Loss) -> Tuple[float, float]:
  batch_count = 0
  loss_sum = 0
  correct_item_count = 0
  current_item_count = 0

  for (X, y) in dataset:
    (y_prime, loss) = evaluate_one_batch(X, y, model, loss_fn)

    correct_item_count += (tf.math.argmax(y_prime, axis=1).numpy() == y.numpy()).sum()
    loss_sum += loss.numpy()
    current_item_count += len(X)
    batch_count += 1

  average_loss = loss_sum / batch_count
  accuracy = correct_item_count / current_item_count
  return (average_loss, accuracy)
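
The accuracy bookkeeping in `evaluate` comes down to comparing the index of the largest logit with the true label. Here is a minimal sketch on a made-up batch of four items (the values are illustrative only):

```python
import tensorflow as tf

# Made-up batch: 4 items, 3 classes.
y_prime = tf.constant([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.1],
                       [0.0, 0.1, 3.0],
                       [0.9, 0.8, 0.1]])
y = tf.constant([1, 0, 1, 0], dtype=tf.int64)

predicted = tf.math.argmax(y_prime, axis=1)       # class with the largest logit
correct = (predicted.numpy() == y.numpy()).sum()  # number of matching predictions
accuracy = correct / len(y)

print(f'predicted: {predicted.numpy()}, correct: {correct}, accuracy: {accuracy}')
```

In `evaluate`, the same counts are accumulated across all mini-batches before dividing, so a smaller final batch is weighted correctly.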

And finally, you print the test loss and accuracy, and save the learned model parameters.

In [None]:
print('\nEvaluating:')
(test_loss, test_accuracy) = evaluate(test_dataset, model, loss_fn)
print(f'Test accuracy: {test_accuracy * 100:>0.1f}%, test loss: {test_loss:>8f}')

model.save_weights('outputs/weights')
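
Saved weights are only useful if you can load them back into a model with the same architecture. The following sketch illustrates the round trip with a tiny stand-in model; `build_model` is a hypothetical helper rather than the `NeuralNetwork` class used above, and the `.weights.h5` file name is an assumption chosen because recent Keras versions require that suffix.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model() -> tf.keras.Model:
  # Hypothetical stand-in model: the architecture must match when loading weights.
  return tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
  ])


trained = build_model()  # stands in for a model you've just trained
path = os.path.join(tempfile.mkdtemp(), 'demo.weights.h5')
trained.save_weights(path)

restored = build_model()  # fresh model with different random weights
restored.load_weights(path)

x = tf.ones((2, 4))
same_output = np.allclose(trained(x).numpy(), restored(x).numpy())
print(f'restored model reproduces the original outputs: {same_output}')
```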

The training loss and accuracy should be similar to the values you obtained with the Keras code. 