# Using TensorBoard in Workbenches

Run and open TensorBoard to monitor the training process.

### 1. Import the required libraries and packages.

You can safely ignore the TensorFlow import warnings.
TensorFlow typically produces these warnings in CPU-only environments, where no GPU is available or where the installed build does not use every instruction set extension that the CPU supports.

In [None]:
import os
import datetime
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

### 2. Load and preprocess the data.

Load the dataset, separate the `Outcome` label from the features, and split the data into training, test, and validation subsets.

In [None]:
data = pd.read_csv('./data/diabetes.csv')

X = data.drop('Outcome', axis=1)
y = data['Outcome']

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=0
)

X_test, X_validation, y_test, y_validation = train_test_split(
    X_test,
    y_test,
    test_size=0.3,
    random_state=0
)

print(f"Number of samples in training set: {X_train.shape[0]}")
print(f"Number of samples in test set: {X_test.shape[0]}")
print(f"Number of samples in validation set: {X_validation.shape[0]}")
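
Because the second call splits the 30% hold-out set again, the overall proportions are not obvious from the code. The following is a minimal arithmetic sketch that assumes only the two `test_size=0.3` values used above. Actual row counts can differ slightly, because scikit-learn rounds each split to whole samples.

```python
from fractions import Fraction

# Both train_test_split calls above use test_size=0.3.
TEST_SIZE = Fraction(3, 10)

# First split: 70% of the rows go to training, 30% to a hold-out set.
train_share = 1 - TEST_SIZE
holdout_share = TEST_SIZE

# Second split: the hold-out set is divided again. The 30% part becomes
# the validation subset, and the remaining 70% stays as the test subset.
validation_share = holdout_share * TEST_SIZE
test_share = holdout_share * (1 - TEST_SIZE)

for name, share in [
    ("train", train_share),
    ("test", test_share),
    ("validation", validation_share),
]:
    print(f"{name:<10} {float(share):.0%}")
```

Overall, roughly 70% of the rows are used for training, 21% for testing, and 9% for validation.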

### 4. Create and train the model.

Define the model.
Use `accuracy` as the metric to evaluate the model.

In [None]:
# Seed for reproducible results
tf.random.set_seed(10)
tf.keras.utils.set_random_seed(10)

model = tf.keras.Sequential([
    tf.keras.layers.Input((8,)),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
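
The output layer produces one softmax probability per class, and the labels are the integers 0 and 1, which is why the model is compiled with `sparse_categorical_crossentropy`. The following plain-Python sketch illustrates the per-sample formula only. It is not the Keras implementation, which also averages over the batch, and the raw scores are hypothetical values chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps the exponentials numerically stable.
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, probabilities):
    """Per-sample loss: negative log of the probability of the true class."""
    return -math.log(probabilities[label])


# Hypothetical raw scores for class 0 and class 1 (not taken from the model).
probs = softmax([0.2, 1.5])
loss_if_1 = sparse_categorical_crossentropy(1, probs)
loss_if_0 = sparse_categorical_crossentropy(0, probs)

print(f"P(class 0)={probs[0]:.3f}, P(class 1)={probs[1]:.3f}")
print(f"loss if the true label is 1: {loss_if_1:.3f}")
print(f"loss if the true label is 0: {loss_if_0:.3f}")
```

The loss is small when the model assigns a high probability to the true class and grows as that probability approaches zero.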

Define the logging callback.

In [None]:
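# Use hyphens in the timestamp: colons are not valid in directory names on every file system.
log_dir = "logs/training/" + datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
# The callback writes event files for TensorBoard into log_dir.
logging_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)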
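
Giving every training run its own timestamped subdirectory lets TensorBoard list the runs separately when it is pointed at the parent `logs/training` directory. The sketch below shows the naming scheme in isolation. `make_run_dir` is a hypothetical helper that is not part of the notebook, and it uses hyphens in the time part of the name.

```python
import datetime
import os


def make_run_dir(base_dir, now=None):
    """Return a timestamped run directory path under base_dir (hypothetical helper)."""
    now = now or datetime.datetime.now()
    return os.path.join(base_dir, now.strftime("%Y-%m-%d_%H-%M-%S"))


# Two runs started at different (made-up) times get different directories.
first = make_run_dir("logs/training", datetime.datetime(2024, 1, 1, 9, 0, 0))
second = make_run_dir("logs/training", datetime.datetime(2024, 1, 1, 9, 30, 0))

print(first)
print(second)
```

Reusing one fixed directory for every run would instead mix the event files of different runs in the same place.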

Train the model.
Use `validation_data` to validate the model after each epoch.

TensorFlow reports the value of the loss function and the `accuracy` metric for both the training and the validation subsets.

In [None]:
model.fit(
    X_train,
    y_train,
    epochs=100,
    validation_data=(X_validation, y_validation),
    callbacks=[logging_callback]
)
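
The values that TensorFlow prints are also returned by `model.fit` as a `History` object, whose `history` attribute maps each metric name to a list with one value per epoch. With the settings above, the expected keys are `loss`, `accuracy`, `val_loss`, and `val_accuracy`. The sketch below assumes the return value was captured (for example, `history = model.fit(...)`) and works on a plain dictionary of that shape. `best_epoch` is a hypothetical helper and the numbers are made up for illustration.

```python
def best_epoch(history, metric="val_loss"):
    """Return (1-based epoch, value) at which the metric is lowest."""
    values = history[metric]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Made-up numbers for illustration only; these are not results from this notebook.
example_history = {
    "loss": [0.70, 0.58, 0.51, 0.47],
    "accuracy": [0.62, 0.70, 0.74, 0.77],
    "val_loss": [0.68, 0.60, 0.55, 0.57],
    "val_accuracy": [0.64, 0.69, 0.73, 0.72],
}

epoch, value = best_epoch(example_history)
print(f"Lowest val_loss {value:.2f} at epoch {epoch}")
```

With a real run, `history.history` would be passed in place of `example_history`.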

### 5. Run TensorBoard

Load the `tensorboard` notebook extension.

In [None]:
%load_ext tensorboard

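Set the `TENSORBOARD_PROXY_URL` environment variable so that the notebook extension reaches TensorBoard through the notebook server's proxy.
The URL is built from the `NB_PREFIX` environment variable, which the workbench is expected to provide, and port 6006, the default TensorBoard port.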

In [None]:
os.environ["TENSORBOARD_PROXY_URL"] = os.getenv("NB_PREFIX") + "/proxy/6006/"

print("TensorBoard URL:", os.environ["TENSORBOARD_PROXY_URL"])
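
The concatenation above raises a `TypeError` if `NB_PREFIX` is not set, because `os.getenv` then returns `None`. The following is a minimal sketch of a more defensive variant. `tensorboard_proxy_url` is a hypothetical helper that is not part of the notebook, and the prefix value is invented for illustration.

```python
def tensorboard_proxy_url(prefix, port=6006):
    """Build the proxy path for a notebook prefix and port (hypothetical helper)."""
    if not prefix:
        raise RuntimeError(
            "NB_PREFIX is not set; cannot build the TensorBoard proxy URL."
        )
    # Strip a trailing slash so the result never contains a double slash.
    return f"{prefix.rstrip('/')}/proxy/{port}/"


# Hypothetical prefix value, for illustration only.
print(tensorboard_proxy_url("/notebook/my-namespace/my-workbench"))

# In the notebook, this could be used as:
# os.environ["TENSORBOARD_PROXY_URL"] = tensorboard_proxy_url(os.getenv("NB_PREFIX"))
```

Failing with an explicit message makes a missing environment variable easier to diagnose than a `TypeError` raised by string concatenation.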

Run the TensorBoard server.

TensorBoard plots accuracy and loss for each epoch, for both the training and validation subsets.

As training progresses, the loss value should decrease and the accuracy of the model should increase.

If the validation accuracy stays close to the training accuracy, this suggests that the model generalizes well to samples not seen during training.


In [None]:
%tensorboard --logdir logs/training
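
The comparison between training and validation accuracy described above can also be expressed as a number. The sketch below averages the difference over the last few epochs. `generalization_gap` is a hypothetical helper, the accuracy values are made up for illustration, and what counts as a small gap is a judgment call that depends on the problem.

```python
def generalization_gap(train_accuracy, val_accuracy, last_n=10):
    """Mean (training - validation) accuracy over the last `last_n` epochs."""
    train_tail = train_accuracy[-last_n:]
    val_tail = val_accuracy[-last_n:]
    gaps = [t - v for t, v in zip(train_tail, val_tail)]
    return sum(gaps) / len(gaps)


# Made-up per-epoch accuracies; these are not results from this notebook.
train_acc = [0.65, 0.72, 0.76, 0.78, 0.79]
val_acc = [0.63, 0.70, 0.74, 0.75, 0.76]

gap = generalization_gap(train_acc, val_acc, last_n=3)
print(f"Mean accuracy gap over the last 3 epochs: {gap:.3f}")
```

A gap that keeps growing while the training accuracy still improves is a common sign of overfitting.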