# Neptune + PyTorch

## Introduction

This guide will show you how to:

* Create a `NeptuneLogger()`
* Log training metrics to Neptune using `NeptuneLogger()`
* Upload model checkpoints to Neptune using `NeptuneLogger()`
* Log model predictions to Neptune using `NeptuneLogger()`

## Before you start

This notebook example lets you try out Neptune anonymously, with zero setup.

* If you're running the notebook on your local machine, you need to have [Python](https://www.python.org/downloads/) and [pip](https://pypi.org/project/pip/) installed.
* If you want to see the example logged to your own workspace instead:
    * Create a Neptune account → [Take me to registration](https://neptune.ai/register)
    * Create a Neptune project that you will use for tracking metadata → [Tell me more about projects](https://docs.neptune.ai/administration/projects)

## Install Neptune and dependencies

In [None]:
%pip install -U neptune[pytorch] numpy torch torchvision torchviz

## Start a run

To create a new run for tracking metadata, tell Neptune:
* **Who you are** - with your Neptune API token
* **Where to send the metadata** - to your Neptune project

For example, if your workspace name is `ml-team` and the project name is `classification`, the project argument is: `project="ml-team/classification"`.

To find your API token and project name, [log in to Neptune](https://app.neptune.ai/).
- In the bottom-left corner, expand your user menu and select **Get your API token**.
- To copy the project path, in the top-right corner, open the settings menu and select **Properties**.
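
If you prefer not to paste credentials into the notebook, Neptune can also read them from the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables. The sketch below shows one possible way to assemble the arguments for `neptune.init_run()` with explicit fallbacks. The helper name `resolve_neptune_settings` and its parameters are illustrative assumptions, not part of the Neptune API.

```python
import os


def resolve_neptune_settings(fallback_token, fallback_project, env=None):
    """Hypothetical helper: prefer environment variables, else use the fallbacks."""
    env = os.environ if env is None else env
    return {
        "api_token": env.get("NEPTUNE_API_TOKEN", fallback_token),
        "project": env.get("NEPTUNE_PROJECT", fallback_project),
    }


# Simulated environment so the example does not depend on your shell setup.
fake_env = {"NEPTUNE_API_TOKEN": "my-token", "NEPTUNE_PROJECT": "ml-team/classification"}
settings = resolve_neptune_settings("anonymous", "common/pytorch-integration", env=fake_env)
print(settings["project"])
```

In the notebook you would likely pass `neptune.ANONYMOUS_API_TOKEN` as the fallback token and call `neptune.init_run(**settings)`.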


In [None]:
import neptune

run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,  # replace with your own
    project="common/pytorch-integration",  # replace with your own
)

You now have a new run in Neptune! From here on, we'll use the `run` object to log metadata.

**To open the run in Neptune, click on the link that appeared in the cell output.**

There's not much to display yet, but keep the tab with the run open to see what happens next.

### Imports

In [None]:
import torch
from torch import nn
from torch import optim
from torchvision import transforms, datasets
import numpy as np

### Hyperparameters for training

In [None]:
parameters = {
    "lr": 1e-2,
    "bs": 128,
    "input_sz": 32 * 32 * 3,
    "n_classes": 10,
    "model_filename": "basemodel",
    "device": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    "epochs": 2,
}

### Model

In [None]:
class Model(nn.Module):
    def __init__(self, input_sz, hidden_dim, n_classes):
        super().__init__()
        self.seq_model = nn.Sequential(
            nn.Linear(input_sz, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )

    def forward(self, input):
        x = input.view(-1, 32 * 32 * 3)
        return self.seq_model(x)


model = Model(parameters["input_sz"], parameters["input_sz"], parameters["n_classes"]).to(
    parameters["device"]
)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=parameters["lr"])
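
Because `hidden_dim` is set to `parameters["input_sz"]`, the fully connected stack is fairly large. As a rough check, the sketch below counts the trainable parameters using only the layer widths from the `Model` definition. The helper names `linear_params` and `mlp_params` are hypothetical and exist only for this illustration.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(input_sz, hidden_dim, n_classes):
    """Parameter count for the four linear layers used by Model."""
    widths = [input_sz, hidden_dim * 2, hidden_dim, hidden_dim // 2, n_classes]
    return sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))


total = mlp_params(32 * 32 * 3, 32 * 32 * 3, 10)
print(f"{total:,} trainable parameters")
```

At 4 bytes per float32 value, that is roughly 170 MB of weights, which is worth keeping in mind before uploading checkpoints.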

### Download and transform the data for training

In [None]:
data_dir = "data/CIFAR10"
compressed_ds = "./data/CIFAR10/cifar-10-python.tar.gz"
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    ),
    "val": transforms.Compose(
        [
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    ),
}

trainset = datasets.CIFAR10(data_dir, transform=data_tfms["train"], download=True)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=parameters["bs"], shuffle=True, num_workers=0
)
validset = datasets.CIFAR10(data_dir, train=False, transform=data_tfms["val"], download=True)
validloader = torch.utils.data.DataLoader(validset, batch_size=parameters["bs"], num_workers=0)

classes = [
    "airplane",
    "automobile",
    "bird",
    "cat",
    "deer",
    "dog",
    "frog",
    "horse",
    "ship",
    "truck",
]

### (Neptune) Create NeptuneLogger

In [None]:
from neptune_pytorch import NeptuneLogger

npt_logger = NeptuneLogger(
    run, model=model, log_model_diagram=True, log_gradients=True, log_parameters=True, log_freq=30
)

### (Neptune) Log hyperparams

In [None]:
from neptune.utils import stringify_unsupported

# (Neptune) Use the logger's base_namespace attribute to keep manually logged
# metadata under the same namespace that NeptuneLogger writes to.
run[npt_logger.base_namespace]["hyperparams"] = stringify_unsupported(parameters)
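
The `parameters` dictionary contains a `torch.device`, which is not a type Neptune stores natively; `stringify_unsupported` handles such values by converting them to strings. The following is a rough standard-library illustration of that idea, not Neptune's implementation. `stringify_values` and `FakeDevice` are hypothetical names, and the list of supported types is an assumption made for the example.

```python
SUPPORTED = (bool, int, float, str)  # assumed set of natively storable types


class FakeDevice:
    """Stand-in for torch.device so the sketch runs without PyTorch."""

    def __init__(self, kind):
        self.kind = kind

    def __str__(self):
        return self.kind


def stringify_values(obj):
    """Recursively replace values outside SUPPORTED with their str() form."""
    if isinstance(obj, dict):
        return {key: stringify_values(value) for key, value in obj.items()}
    return obj if isinstance(obj, SUPPORTED) else str(obj)


params = {"lr": 1e-2, "bs": 128, "device": FakeDevice("cpu")}
clean = stringify_values(params)
print(clean)
```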

### (Neptune) Log metrics while training

In [None]:
for epoch in range(parameters["epochs"]):
    for i, (x, y) in enumerate(trainloader, 0):
        x, y = x.to(parameters["device"]), y.to(parameters["device"])
        optimizer.zero_grad()
        outputs = model(x)
        _, preds = torch.max(outputs, 1)
        loss = criterion(outputs, y)
        acc = (torch.sum(preds == y.data)) / len(x)

        # Log after every 30 steps
        if i % 30 == 0:
            run[npt_logger.base_namespace]["batch/loss"].append(loss.item())
            run[npt_logger.base_namespace]["batch/acc"].append(acc.item())

        loss.backward()
        optimizer.step()

    # The checkpoint number is incremented automatically on each call.
    # Call 1 -> ckpt_1.pt
    # Call 2 -> ckpt_2.pt
    # npt_logger.save_checkpoint()  # uncomment to save checkpoint
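
To get a feel for how many points the `batch/loss` and `batch/acc` series will contain, the arithmetic can be sketched as below. It assumes the standard CIFAR-10 training split of 50,000 images and the default `DataLoader` behavior of keeping the last partial batch; the constant names are illustrative.

```python
import math

TRAIN_IMAGES = 50_000  # size of the CIFAR-10 training split
BATCH_SIZE = 128       # parameters["bs"]
LOG_EVERY = 30         # matches the `i % 30 == 0` check
EPOCHS = 2             # parameters["epochs"]

# The last partial batch is kept, so round up.
batches_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
# Steps 0, 30, 60, ... are logged, so count the multiples below the batch count.
logged_per_epoch = len(range(0, batches_per_epoch, LOG_EVERY))
total_logged = logged_per_epoch * EPOCHS

print(batches_per_epoch, logged_per_epoch, total_logged)
```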

### (Neptune) Log prediction from model

In [None]:
from neptune.types import File

dataiter = iter(validloader)
images, labels = next(dataiter)

# Predict batch of n_samples
n_samples = 10
imgs = images[:n_samples].to(parameters["device"])
probs = torch.nn.functional.softmax(model(imgs), dim=1)

# Decode the probabilities and log each tensor as an image
for i, ps in enumerate(probs):
    pred = classes[torch.argmax(ps)]
    ground_truth = classes[labels[i]]
    description = f"pred: {pred} | ground truth: {ground_truth}"

    # Log each image with its prediction; permute CHW -> HWC so the image is not transposed.
    run[npt_logger.base_namespace]["predictions"].append(
        File.as_image(imgs[i].cpu().squeeze().permute(1, 2, 0).clip(0, 1)),
        name=f"{i}_{pred}_{ground_truth}",
        description=description,
    )
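
The images in `imgs` have already passed through `transforms.Normalize`, so their values are no longer in the [0, 1] range and clipping alone discards part of the signal. One option is to invert the normalization before logging. The sketch below shows the arithmetic on a single RGB pixel in plain Python; `normalize` and `denormalize` are hypothetical helpers, and the same formula would apply channel-wise to a tensor.

```python
MEAN = [0.485, 0.456, 0.406]  # values passed to transforms.Normalize
STD = [0.229, 0.224, 0.225]


def normalize(pixel):
    """Forward transform: (x - mean) / std per channel."""
    return [(x - m) / s for x, m, s in zip(pixel, MEAN, STD)]


def denormalize(pixel):
    """Inverse transform: x * std + mean per channel."""
    return [x * s + m for x, m, s in zip(pixel, MEAN, STD)]


original = [0.2, 0.5, 0.9]
normalized = normalize(original)
restored = denormalize(normalized)
print([round(v, 6) for v in restored])
```

Note that the first channel of `normalized` is negative, which is why clipping the normalized tensor to [0, 1] darkens the logged image.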

### (Neptune) Save final model

In [None]:
# Save final model as "model.pt"
# npt_logger.save_model("model")  # uncomment to save final model

## Stop logging

Once you are done logging, stop tracking the run.

In [None]:
run.stop()

## Explore the results in Neptune

Go to the run link and explore the metadata (metrics, parameters, predictions) that was logged to the run in Neptune.

You can also check out an [example run](https://app.neptune.ai/o/common/org/pytorch-integration/runs/details?viewId=standard-view&detailsTab=dashboard&dashboardId=Training-Overview-9920962e-ff6a-4dea-b551-88006799b116&shortId=PYTOR1-7046).