# Using Neptune with TensorFlow

In this example, we use Neptune to log metadata generated while training a TensorFlow model.

By the end of this guide, you will be able to:
* Track and version the data.
* Log losses and other metrics generated from training.
* Log predictions over multiple epochs.
* Save the trained model with the model registry.

[See this example in Neptune](https://app.neptune.ai/common/tensorflow-support/e/TFSUP-101)

## Before you start

This notebook example lets you try out Neptune anonymously, with zero setup.

* If you're running the notebook on your local machine, you need to have [Python](https://www.python.org/downloads/) and [pip](https://pypi.org/project/pip/) installed.
* If you want to see the example logged to your own workspace instead:
    * Create a Neptune account → [Take me to registration](https://neptune.ai/register)
    * Create a Neptune project that you will use for tracking metadata → [Tell me more about projects](https://docs.neptune.ai/administration/projects)

## Install Neptune and dependencies

In [None]:
! pip install -U neptune-client tensorflow numpy requests

## Start a run

To connect your script to Neptune and create a new run, you need to tell Neptune:
* **Who you are** - with a Neptune API token
* **Where to send your data** - to a Neptune project

The cell below lets you record data to the public project [common/tensorflow-support](https://app.neptune.ai/common/tensorflow-support) as an anonymous user.

In [None]:
import neptune.new as neptune

run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/tensorflow-support",
)

Alternatively, you can log the example to your own workspace.

To do that, replace the code above with the following:

```python
from getpass import getpass

run = neptune.init_run(
    api_token=getpass("Enter your Neptune API token: "),
    project="workspace-name/project-name",  # replace with your own
)
```

For example, if your workspace name is `ml-team` and the project name is `classification`, the project argument is: `project="ml-team/classification"`.

To find your API token and project name, [log in to Neptune](https://app.neptune.ai/).
- In the top-right corner, click your avatar and select **Get your API token**.
- To find and copy your project name, navigate to the project, then click **Settings** → **Properties**.
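
If you would rather not paste the token interactively, one option is to keep the credentials in environment variables. The sketch below assumes the variable names `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` and uses a hypothetical helper, `read_neptune_settings`, that only reads and validates the values. It does not contact Neptune.

```python
import os


def read_neptune_settings(env=None):
    """Return (api_token, project) from environment-style variables.

    Hypothetical helper for illustration; it is not part of the Neptune API.
    """
    env = os.environ if env is None else env
    api_token = env.get("NEPTUNE_API_TOKEN")
    project = env.get("NEPTUNE_PROJECT")
    if not api_token:
        raise ValueError("NEPTUNE_API_TOKEN is not set")
    # A project name has the form "workspace-name/project-name".
    if not project or project.count("/") != 1 or not all(project.split("/")):
        raise ValueError("NEPTUNE_PROJECT must look like 'workspace-name/project-name'")
    return api_token, project
```

The returned values can then be passed as the `api_token` and `project` arguments of `neptune.init_run()`.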

---

You now have a new run in Neptune! From here on, we'll use the `run` object to log metadata.

**To open the run in Neptune, follow the link that appeared in the cell output.**

There's not much to display yet, but keep the tab with the run open to see what happens next.

## Log metadata to Neptune



Import libraries

In [None]:
import io

import requests
import tensorflow as tf
import numpy as np

Download the MNIST data

In [None]:
response = requests.get(
    "https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz"
)
with open("mnist.npz", "wb") as f:
    f.write(response.content)

In [None]:
# (Neptune) Track and version data files used for training
run["datasets/version"].track_files("mnist.npz")
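
Versioning a data file generally comes down to recording a fingerprint of its contents, so that runs can be compared by the data they used. As a rough illustration only (this is not Neptune's implementation, and the choice of SHA-256 is an assumption), the sketch below computes a content digest with the standard library. The helper name `file_fingerprint` is hypothetical.

```python
import hashlib
import os
import tempfile


def file_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstrate on a small temporary file instead of the real dataset.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.bin")
    with open(sample_path, "wb") as f:
        f.write(b"mnist-placeholder")
    first = file_fingerprint(sample_path)
    with open(sample_path, "ab") as f:
        f.write(b"!")  # any change to the contents changes the digest
    second = file_fingerprint(sample_path)

print(first != second)
```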

In [None]:
with np.load("mnist.npz") as data:
    train_examples = data["x_train"]
    train_labels = data["y_train"]
    test_examples = data["x_test"]
    test_labels = data["y_test"]

Parameters for training

In [None]:
params = {
    "batch_size": 1024,
    "shuffle_buffer_size": 100,
    "lr": 0.001,
    "num_epochs": 10,
    "num_visualization_examples": 10,
}

In [None]:
# (Neptune) Log training parameters
run["training/model/params"] = params
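
Assigning a dictionary stores each key as its own field under the given namespace, so the learning rate ends up at `training/model/params/lr`. The sketch below only illustrates how such field paths are formed from a (possibly nested) dictionary. `flatten_params` is a hypothetical helper, not a Neptune function, and the nested `optimizer` entry is made up for the example.

```python
def flatten_params(params, prefix):
    """Map a nested dict to {"prefix/key/...": value} pairs."""
    flat = {}
    for key, value in params.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical parameters; the nested "optimizer" entry is for illustration only.
example_params = {"batch_size": 1024, "lr": 0.001, "optimizer": {"name": "Adam"}}
fields = flatten_params(example_params, "training/model/params")
for path, value in fields.items():
    print(path, "=", value)
```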

Normalize data for training

In [None]:
def normalize_img(image):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32) / 255.0


train_examples = normalize_img(train_examples)
test_examples = normalize_img(test_examples)

Prepare data for training

In [None]:
train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_examples, test_labels))

train_dataset = train_dataset.shuffle(params["shuffle_buffer_size"]).batch(
    params["batch_size"]
)
test_dataset = test_dataset.batch(params["batch_size"])
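
With a batch size of 1024, the number of batches per pass follows from simple arithmetic. The sketch below assumes the standard MNIST split of 60,000 training and 10,000 test images and that the final partial batch is kept. The helper name `batch_layout` is hypothetical.

```python
import math


def batch_layout(num_examples, batch_size):
    """Return (number_of_batches, size_of_last_batch) when no remainder is dropped."""
    num_batches = math.ceil(num_examples / batch_size)
    last_batch = num_examples - (num_batches - 1) * batch_size
    return num_batches, last_batch


train_batches, train_last = batch_layout(60_000, 1024)
test_batches, test_last = batch_layout(10_000, 1024)
print(train_batches, train_last)
print(test_batches, test_last)
```

Note that a shuffle buffer of 100 is much smaller than the training set, so `shuffle` only mixes examples within a sliding window of 100 elements rather than across the whole dataset.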

Prepare model

In [None]:
# Model
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ]
)

# Loss
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Optimizer
optimizer = tf.keras.optimizers.Adam(params["lr"])
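
Because the last `Dense` layer has no activation, the model outputs raw logits, which is why the loss is created with `from_logits=True`. As a minimal sketch of what that loss computes for a single example (plain Python, not the Keras implementation), the function below applies a numerically stable log-sum-exp to the logits and returns the negative log-probability of the true class. The function name is hypothetical.

```python
import math


def sparse_cross_entropy_from_logits(logits, true_class):
    """Cross-entropy for one example given raw logits and an integer label."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[true_class]


uniform_loss = sparse_cross_entropy_from_logits([0.0] * 10, true_class=3)
confident_loss = sparse_cross_entropy_from_logits([0.0, 8.0, 0.0], true_class=1)
print(uniform_loss, confident_loss)
```

A 10-class model that predicts all classes with equal probability has a loss of ln 10 (about 2.30), which is a useful reference point for the first logged loss values.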

In [None]:
with io.StringIO() as s:
    model.summary(print_fn=lambda x: s.write(x + "\n"))
    model_summary = s.getvalue()

# (Neptune) Log model summary
run["training/model/summary"] = model_summary
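
The parameter counts reported in the summary can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and `Flatten` has no parameters. A quick sketch of that arithmetic for this model (the helper name is hypothetical):

```python
def dense_param_count(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


flattened = 28 * 28  # Flatten turns each 28x28 image into 784 values
hidden = dense_param_count(flattened, 128)
output = dense_param_count(128, 10)
total = hidden + output
print(hidden, output, total)
```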

Helper functions for training loop

In [None]:
def loss_and_preds(model, x, y, training):
    # training=training is needed only if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    y_ = model(x, training=training)

    return loss_object(y_true=y, y_pred=y_), y_


def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value, _ = loss_and_preds(model, inputs, targets, training=True)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)
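
The two helpers separate the forward pass from the gradient computation; the optimizer then applies the gradients. To make that loss, gradient, update cycle concrete without TensorFlow, here is a toy sketch for a one-parameter model `y = w * x` with squared error. It uses a hand-derived gradient and plain gradient descent rather than `tf.GradientTape` and Adam, and all names and values are hypothetical.

```python
def loss_and_grad(w, x, y):
    """Squared error of the prediction w * x and its derivative with respect to w."""
    error = w * x - y
    return error**2, 2.0 * error * x


w = 0.0
learning_rate = 0.1
losses = []
for step in range(20):
    loss_value, grad_w = loss_and_grad(w, x=1.0, y=3.0)
    w -= learning_rate * grad_w  # the "apply gradients" step
    losses.append(loss_value)

print(round(w, 3), losses[0], losses[-1])
```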

Training loop

In [None]:
for epoch in range(params["num_epochs"]):
    epoch_loss_avg = tf.keras.metrics.Mean()
    epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

    for x, y in train_dataset:
        loss_value, grads = grad(model, x, y)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        epoch_loss_avg.update_state(loss_value)
        epoch_accuracy.update_state(y, model(x, training=True))

    # (Neptune) Log metrics for the epoch
    # Train metrics
    run["training/train/loss"].log(epoch_loss_avg.result())
    run["training/train/accuracy"].log(epoch_accuracy.result())

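    # (Neptune) Log test metrics
    test_loss, test_preds = loss_and_preds(model, test_examples, test_labels, False)
    run["training/test/loss"].log(test_loss)
    # Use a fresh metric so test accuracy is not mixed into the training accuracy state
    test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
    acc = test_accuracy(test_labels, test_preds)
    run["training/test/accuracy"].log(acc)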

    # (Neptune) Log test prediction
    for idx in range(params["num_visualization_examples"]):
        np_image = test_examples[idx].numpy().reshape(28, 28)
        image = neptune.types.File.as_image(np_image)
        pred_label = test_preds[idx].numpy().argmax()
        true_label = test_labels[idx]
        run[f"training/visualization/epoch_{epoch}"].log(
            image, description=f"pred={pred_label} | actual={true_label}"
        )

    if epoch % 5 == 0 or epoch == (params["num_epochs"] - 1):
        print(
            "Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
                epoch, epoch_loss_avg.result(), epoch_accuracy.result()
            )
        )
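
Keras metric objects such as `tf.keras.metrics.Mean` and `SparseCategoricalAccuracy` are stateful: every update adds to running totals, and `result()` reports the value accumulated since the object was created or reset. The pure-Python sketch below mimics that behavior with a hypothetical `RunningAccuracy` class (not the Keras implementation) to show why training and test data should be measured with separate metric objects.

```python
class RunningAccuracy:
    """Accumulates correct/total counts across calls, like a stateful metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update_state(self, y_true, y_pred):
        self.correct += sum(int(t == p) for t, p in zip(y_true, y_pred))
        self.total += len(y_true)

    def result(self):
        return self.correct / self.total if self.total else 0.0


# Hypothetical labels and predictions: 4/4 correct on "train", 1/4 on "test".
train_true, train_pred = [0, 1, 2, 3], [0, 1, 2, 3]
test_true, test_pred = [0, 1, 2, 3], [0, 0, 0, 0]

shared = RunningAccuracy()
shared.update_state(train_true, train_pred)
shared.update_state(test_true, test_pred)  # mixes test data into the train state

separate = RunningAccuracy()
separate.update_state(test_true, test_pred)

print(shared.result(), separate.result())
```

The shared object reports a blend of both datasets, while the separate object reports the test accuracy alone.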

Tracking model with Neptune model registry

Refer to the [documentation](https://neptune.ai/product/model-registry) for more information.

In [None]:
# (Neptune) Create a model_version object
model_version = neptune.init_model_version(
    model="TFSUP-TFMOD",
    project="common/tensorflow-support",
    api_token=neptune.ANONYMOUS_API_TOKEN,
)

In [None]:
# (Neptune) Log metadata to model version
# fetch() retrieves the stored run ID as a plain string
model_version["run_id"] = run["sys/id"].fetch()
model_version["metrics/test_loss"] = test_loss
model_version["metrics/test_accuracy"] = acc
model_version["datasets/version"].track_files("mnist.npz")

In [None]:
# Saves model artifacts to "weights" folder
model.save("weights")

# (Neptune) Log model artifacts
model_version["model/weights"].upload_files("weights/*")

## Stop logging

Once you are done logging, stop tracking the run.

In [None]:
run.stop()
model_version.stop()
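
If an exception interrupts the notebook before these calls, the run may stay active longer than intended. One way to guard against that is a `try`/`finally` block. The sketch below uses a hypothetical stand-in class, `FakeRun`, so that it can run without a Neptune connection; with a real run, the same pattern would wrap the training code.

```python
class FakeRun:
    """Stand-in for a tracked run; only records whether stop() was called."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def train_with_cleanup(run, fail=False):
    """Run a (placeholder) training step and always stop the run afterwards."""
    try:
        if fail:
            raise RuntimeError("training failed")
        return "done"
    finally:
        run.stop()


ok_run, failed_run = FakeRun(), FakeRun()
train_with_cleanup(ok_run)
try:
    train_with_cleanup(failed_run, fail=True)
except RuntimeError:
    pass

print(ok_run.stopped, failed_run.stopped)
```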

## Explore the results in Neptune

To browse the logged metadata, open the run link that appeared in the cell output. You can also check out an [example run](https://app.neptune.ai/common/tensorflow-support/e/TFSUP-101) and the corresponding [model version](https://app.neptune.ai/common/tensorflow-support/m/TFSUP-TFMOD/v/TFSUP-TFMOD-112/metadata).