# Built-in Training Loop: Refresher

This notebook is a refresher on training and evaluating Keras models with the built-in methods.

Based on [Training & evaluation with the built-in methods](https://www.tensorflow.org/guide/keras/training_with_built_in_methods) tutorial.


## Setup 

Prepare the environment:

In [1]:
import os 

# Suppress unwanted TF logs
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Load compressed models from tensorflow_hub
os.environ['TFHUB_MODEL_LOAD_FORMAT'] = 'COMPRESSED'

# Fix duplicated CUDA paths (only on my current env):
from socket import gethostname
if gethostname() == 'stepan-pc':
    OTHER_PATHS = os.environ['PATH']
    CUDA_12_5_PATH = '/usr/local/cuda-12.5/bin'
    os.environ['PATH'] = f'{CUDA_12_5_PATH}:{OTHER_PATHS}'

Then import the libraries:

In [2]:
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
import matplotlib as mpl
import matplotlib.pyplot as plt

## Intro

This tutorial covers the following `Model` methods: [Model.compile()](https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile), [Model.fit()](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit), [Model.evaluate()](https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate) and [Model.predict()](https://www.tensorflow.org/api_docs/python/tf/keras/Model#predict).

## Overview: End-to-End Example

In [3]:
inputs = keras.Input(shape=(784,), name="digits")
dense1 = layers.Dense(units=64, activation=tf.nn.relu, name="dense_1")(inputs)
dense2 = layers.Dense(units=64, activation=tf.nn.relu, name="dense_2")(dense1)
outputs = layers.Dense(10, activation=tf.nn.softmax, name="predictions")(dense2)

model = keras.Model(inputs=inputs, outputs=outputs)
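As a quick sanity check on the architecture above, the number of trainable parameters can be worked out by hand: each `Dense` layer holds an `inputs x units` weight matrix plus one bias per unit. A minimal sketch in plain Python (the helper `dense_params` is hypothetical, not a Keras function):

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer widths of the model above: 784 inputs -> 64 -> 64 -> 10 outputs.
layer_sizes = [784, 64, 64, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer, total_params)
```

These figures should agree with what `model.summary()` reports for this model.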

In [4]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

In [7]:
x_train.shape

(60000, 28, 28)

In [9]:
x_train = x_train.reshape(60000, 784).astype("float32") / 255
y_train = y_train.astype("float32")

x_test = x_test.reshape(10000, 784).astype("float32") / 255
y_test = y_test.astype("float32")

val_count = 10000
x_val = x_train[-val_count:]
y_val = y_train[-val_count:]
x_train = x_train[:-val_count]
y_train = y_train[:-val_count]
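The preprocessing above (flatten, scale to `[0, 1]`, hold out the tail as a validation set) can be checked on a small stand-in array without downloading MNIST. This is only a sketch: the random `fake_images` and `fake_labels` arrays and the reduced sizes are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 60 "images" of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(60, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=60)

# Flatten each image to a 784-vector and scale pixel values to [0, 1].
x = fake_images.reshape(len(fake_images), 784).astype("float32") / 255
y = fake_labels.astype("float32")

# Hold out the last 10 samples for validation (the notebook holds out 10000).
demo_val_count = 10
x_val_demo, y_val_demo = x[-demo_val_count:], y[-demo_val_count:]
x_train_demo, y_train_demo = x[:-demo_val_count], y[:-demo_val_count]

print(x_train_demo.shape, x_val_demo.shape)
```

With the real data the same slicing leaves 50000 training samples and 10000 validation samples.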

In [12]:
model.compile(
    optimizer=keras.optimizers.RMSprop(),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
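To make the loss and metric chosen in `compile()` more concrete, here is a rough NumPy sketch of what they compute on integer labels. It is an illustration, not the Keras implementation (which, among other things, clips probabilities for numerical stability); the `probs` and `labels` arrays are made up.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 4 classes (rows sum to 1).
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
labels = np.array([0, 1, 2])  # integer class indices, not one-hot vectors

# Sparse categorical crossentropy: mean negative log-probability of the true class.
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.log(true_class_probs).mean()

# Sparse categorical accuracy: fraction of samples whose argmax equals the label.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(f"loss={loss:.4f}, accuracy={accuracy:.4f}")
```

The "sparse" variants are used here because `y_train` holds class indices rather than one-hot rows.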

In [14]:
model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=10,
    validation_data=(x_val, y_val)
)

Epoch 1/10
I0000 00:00:1719901405.734584   19523 service.cc:145] XLA service 0x75ee540057b0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1719901405.734615   19523 service.cc:153]   StreamExecutor device (0): NVIDIA GeForce RTX 3070, Compute Capability 8.6
351/782 ━━━━━━━━━━━━━━━━━━━━ 0s 430us/step - loss: 0.8041 - sparse_categorical_accuracy: 0.7745
I0000 00:00:1719901406.563226   19523 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
782/782 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - loss: 0.5800 - sparse_categorical_accuracy: 0.8363 - val_loss: 0.1825 - val_sparse_categorical_accuracy: 0.9472
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 0s 611us/step - loss: 0.1830 - sparse_categorical_accuracy: 0.9462 - val_loss: 0.1525 - val_sparse_categorical_accuracy: 0.9533
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 1s 645us/step - loss: 0.1263 - sparse_categorical_accuracy: 0.9616 - val_loss: 0.1218 - val_sparse_categorical_accuracy: 0.9625
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 1s 775us/step - loss: 0.0988 - sparse_categorical_accuracy: 0.9702 - val_loss: 0.1019 - val_sparse_categorical_accuracy: 0.9690
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 1s 743us/step - loss: 0.0759 - sparse_categorical_accuracy: 0.9763 - val_loss: 0.0958 - val_sparse_categorical_accuracy: 0.9709
Epo

<keras.src.callbacks.history.History at 0x75ef7af35cc0>
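The `782/782` counter in the training log is the number of batches per epoch, and the counters printed by `evaluate()` and `predict()` follow the same arithmetic. A small sketch in plain Python (the helper `steps_per_pass` is hypothetical; the `predict` line assumes Keras' default `batch_size` of 32, since none is passed in this notebook):

```python
import math


def steps_per_pass(n_samples: int, batch_size: int) -> int:
    """Batches needed to cover n_samples once (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


train_steps = steps_per_pass(50_000, 64)   # fit: 60000 samples minus 10000 held out
eval_steps = steps_per_pass(10_000, 128)   # evaluate: test or validation set
predict_steps = steps_per_pass(128, 32)    # predict: assumed default batch_size of 32

print(train_steps, eval_steps, predict_steps)
```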

In [15]:
model.evaluate(x_test, y_test, batch_size=128)

79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 412us/step - loss: 0.1276 - sparse_categorical_accuracy: 0.9667


[0.11052051931619644, 0.9714000225067139]
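`Model.evaluate()` returns the loss first, followed by the compiled metrics in the order they were passed to `compile()`. A minimal sketch of unpacking that list, using the test-set values copied from the output above (the variable names are illustrative):

```python
# Values copied from the evaluate() output above: [loss, sparse_categorical_accuracy].
test_results = [0.11052051931619644, 0.9714000225067139]
test_loss, test_acc = test_results

# Rough count of misclassified digits among the 10000 test samples.
n_test = 10_000
n_misclassified = round((1 - test_acc) * n_test)

print(f"test loss: {test_loss:.4f}, test accuracy: {test_acc:.2%}")
print(f"about {n_misclassified} of {n_test} test digits misclassified")
```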

In [16]:
model.evaluate(x_val, y_val, batch_size=128)

79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 440us/step - loss: 0.1044 - sparse_categorical_accuracy: 0.9694


[0.09574047476053238, 0.9742000102996826]

In [19]:
model.predict(x_val[:128])

4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 939us/step


array([[1.4352001e-16, 3.5445311e-10, 1.2919582e-10, ..., 4.5783321e-12,
        1.4284174e-07, 7.0319544e-17],
       [1.2047295e-10, 1.0237178e-08, 8.3884754e-07, ..., 5.2512844e-10,
        9.9998844e-01, 2.4112142e-08],
       [3.3136199e-05, 4.6018240e-08, 1.5922182e-05, ..., 2.1999713e-06,
        7.5325293e-09, 9.2212105e-07],
       ...,
       [2.1975108e-08, 1.1580967e-09, 9.8336084e-10, ..., 4.3023838e-09,
        3.9801327e-09, 1.5999518e-07],
       [1.3570403e-06, 5.4941860e-09, 1.4140634e-08, ..., 9.9998832e-01,
        3.8362882e-11, 8.0998780e-06],
       [1.0000000e+00, 1.2819496e-16, 4.0759250e-08, ..., 4.7118742e-10,
        6.4482707e-14, 3.0323299e-11]], dtype=float32)
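Each row of the array above is a softmax distribution over the 10 digit classes, so the predicted digit is the index of the largest value in the row. A small NumPy sketch on made-up probabilities (the `probs` array is hypothetical, not the model's real output):

```python
import numpy as np

# Hypothetical predictions for 3 samples over 10 classes (each row sums to 1).
probs = np.full((3, 10), 0.01)
probs[0, 3] = 0.91
probs[1, 8] = 0.91
probs[2, 0] = 0.91

predicted_classes = probs.argmax(axis=1)  # most probable digit per row
confidence = probs.max(axis=1)            # probability assigned to that digit

print(predicted_classes, confidence)
```

With the real model, the same idea would be `model.predict(x_val[:128]).argmax(axis=1)`, which can then be compared against `y_val[:128]`.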