# How to use Neptune in an HPO training job

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/how-to-guides/neptune-hpo/notebooks/Neptune_hpo.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>
<a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/how-to-guides/neptune-hpo/notebooks/Neptune_hpo.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a>
<a target="_blank" href="https://app.neptune.ai/o/common/org/showroom/runs/details?viewId=standard-view&detailsTab=charts&shortId=SHOW-27624"> 
  <img alt="Explore in Neptune" src="https://neptune.ai/wp-content/uploads/2024/01/neptune-badge.svg">
</a>
<a target="_blank" href="https://docs.neptune.ai/tutorials/hpo/">
  <img alt="View tutorial in docs" src="https://neptune.ai/wp-content/uploads/2024/01/docs-badge-2.svg">
</a>

## Introduction

When running a hyperparameter optimization job, you can use Neptune to track all the metadata from the study and each trial.

In this guide, you'll learn how to configure Neptune to track the metadata of your hyperparameter optimization job.

## Before you start

This notebook example lets you try out Neptune anonymously, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [None]:
! pip install -U neptune numpy torch torchvision tqdm

## Import libraries

In [None]:
import neptune
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from tqdm.auto import trange
from functools import reduce
from neptune.utils import stringify_unsupported

## Log metadata across HPO trials into a single run

Create a single, sweep-level Neptune run to log metadata (such as metrics and hyperparameters) from all trials.

To create a new run for tracking the metadata, you tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in a public project. **Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### Log to your own project instead

Replace the code below with the following:

```python
import neptune
from getpass import getpass

run = neptune.init_run(
    project="workspace-name/project-name",  # replace with your own (see instructions below)
    api_token=getpass("Enter your Neptune API token: "),
    tags=["sweep-level"],  # to identify a run that contains multiple trials
)
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. The workspace name is displayed in the top-left corner of the app. 

    To copy the project path, in the top-right corner, open the settings menu and select **Properties**.

For more help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.
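If you would rather not type the token in every session, Neptune can also read credentials from the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables when the corresponding arguments are omitted from `init_run()`. The sketch below checks that both variables are set before a long sweep starts. `missing_neptune_vars` is a hypothetical helper written for this guide, not part of the Neptune API.

```python
import os

# Environment variables that Neptune falls back to when
# api_token and project are not passed to init_run()
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: `env` defaults to the real process environment,
    but any mapping can be passed in, which keeps the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment
print(missing_neptune_vars({"NEPTUNE_PROJECT": "workspace-name/project-name"}))
```

If the returned list is empty, `neptune.init_run(tags=["sweep-level"])` should pick up both values without further arguments.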

In [None]:
run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/pytorch-integration",
    tags=["sweep-level"],  # to identify a run that contains multiple trials
)

**To open the run in the Neptune web app, click the link that appeared in the cell output.**

We'll use the `run` object we just created to log metadata. You'll see the metadata appear in the app.

### Hyperparameters

In [None]:
parameters = {
    "batch_size": 128,
    "epochs": 1,
    "input_size": (3, 32, 32),
    "n_classes": 10,
    "dataset_size": 1000,
    "model_filename": "basemodel",
    "device": torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
}

input_size = reduce(lambda x, y: x * y, parameters["input_size"])
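As a quick check of the arithmetic in the cell above, flattening a 3 × 32 × 32 image gives 3 · 32 · 32 = 3072 features. The sketch below reproduces the `reduce` call and compares it with `math.prod` (Python 3.8+), an equivalent standard-library alternative. The names `input_shape` and `flat_size` are illustrative only.

```python
from functools import reduce
from math import prod

input_shape = (3, 32, 32)  # channels, height, width

# Same reduction as in the notebook cell
flat_size = reduce(lambda x, y: x * y, input_shape)

# math.prod computes the same product without a lambda
assert flat_size == prod(input_shape)
print(flat_size)
```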

### Hyperparameter search space

In [None]:
learning_rates = [1e-4, 1e-3, 1e-2]  # learning rate choices
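This guide searches over a single hyperparameter. If you later add more, one possible approach is to expand the choices into a grid of configurations and loop over that instead. The sketch below assumes a second hyperparameter, `hidden_dims`, which is hypothetical and not used anywhere else in this notebook.

```python
from itertools import product

learning_rates = [1e-4, 1e-3, 1e-2]  # same choices as in the cell above
hidden_dims = [256, 512]  # hypothetical second hyperparameter, for illustration only

# One dictionary per trial, covering every combination of the choices
grid = [
    {"lr": lr, "hidden_dim": hidden_dim}
    for lr, hidden_dim in product(learning_rates, hidden_dims)
]

for i, config in enumerate(grid):
    print(i, config)
```

Each dictionary could then be logged per trial in the same way as the `parameters` dictionary in this guide.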

## Model

In [None]:
class BaseModel(nn.Module):
    def __init__(self, input_size, hidden_dim, n_classes):
        super(BaseModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_size, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )
        self.input_size = input_size

    def forward(self, input):
        x = input.view(-1, self.input_size)
        return self.main(x)

In [None]:
model = BaseModel(
    input_size,
    input_size,
    parameters["n_classes"],
).to(parameters["device"])
criterion = nn.CrossEntropyLoss()
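Because the cell above passes `input_size` as `hidden_dim`, the layer widths follow directly from the image size. The sketch below works out the widths and the resulting number of trainable parameters in plain Python. `mlp_param_count` is a hypothetical helper, not a PyTorch function.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


input_size = 3 * 32 * 32
hidden_dim = input_size  # as in the notebook: BaseModel(input_size, input_size, ...)
n_classes = 10

# Widths of BaseModel: input -> 2*hidden -> hidden -> hidden // 2 -> classes
layer_sizes = [input_size, hidden_dim * 2, hidden_dim, hidden_dim // 2, n_classes]

print(layer_sizes)
print(mlp_param_count(layer_sizes))
```

With these settings the model has roughly 42.5 million parameters, which is worth keeping in mind if you run the notebook on a CPU.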

## Dataset

In [None]:
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
}
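`transforms.Normalize` computes `(value - mean) / std` for each channel, after `ToTensor` has scaled pixel values to the range [0, 1]. The sketch below applies the same formula to a single RGB pixel in plain Python. `normalize_pixel` is an illustrative helper, not part of torchvision.

```python
mean = [0.485, 0.456, 0.406]  # per-channel means used in the cell above
std = [0.229, 0.224, 0.225]  # per-channel standard deviations


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel(mean, mean, std))

# A fully white pixel (1.0 in every channel) maps to positive values
print(normalize_pixel([1.0, 1.0, 1.0], mean, std))
```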

In [None]:
trainset = datasets.FakeData(
    size=parameters["dataset_size"],
    image_size=parameters["input_size"],
    num_classes=parameters["n_classes"],
    transform=data_tfms["train"],
)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=parameters["batch_size"], shuffle=True, num_workers=0
)
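With the settings above, the number of batches per epoch, and therefore the number of points logged per metric in each trial, can be worked out ahead of time. The sketch below assumes the `DataLoader` default of `drop_last=False`, so the final, smaller batch is kept.

```python
import math

dataset_size = 1000  # parameters["dataset_size"]
batch_size = 128  # parameters["batch_size"]

# The last batch is kept even when it is smaller than batch_size
n_batches = math.ceil(dataset_size / batch_size)
last_batch_size = dataset_size - (n_batches - 1) * batch_size

print(n_batches, last_batch_size)
```

The training loop divides by `len(x)` rather than by `batch_size`, so the accuracy stays correct for the smaller final batch.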

## Training loop

In [None]:
for i, lr in enumerate(learning_rates):
    # Log hyperparameters
    run[f"trials/{i}/parms"] = stringify_unsupported(parameters)
    run[f"trials/{i}/parms/lr"] = lr

    # Start every trial from freshly initialized weights so that trials are comparable
    model = BaseModel(input_size, input_size, parameters["n_classes"]).to(parameters["device"])
    optimizer = optim.SGD(model.parameters(), lr=lr)
    for _ in trange(parameters["epochs"]):
        for x, y in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)

            _, preds = torch.max(outputs, 1)
            acc = torch.sum(preds == y) / len(x)

            # Log metrics as plain Python floats
            run[f"trials/{i}/training/batch/loss"].append(loss.item())
            run[f"trials/{i}/training/batch/acc"].append(acc.item())

            loss.backward()
            optimizer.step()

# Stop logging
run.stop()
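In the loop above, the trial index is part of the field path, so all trials end up side by side under the `trials` namespace of one run. The sketch below lists the metric fields that the loop creates. `trial_field` is a hypothetical helper used only to make the layout explicit.

```python
learning_rates = [1e-4, 1e-3, 1e-2]  # same search space as in this guide
metrics = ["loss", "acc"]


def trial_field(trial_index, metric):
    """Build the field path used for one metric of one trial."""
    return f"trials/{trial_index}/training/batch/{metric}"


fields = [
    trial_field(i, metric)
    for i, _ in enumerate(learning_rates)
    for metric in metrics
]

for field in fields:
    print(field)
```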

## Log metadata from each HPO trial into separate runs

Alternatively, create a new Neptune run inside the loop so that the metadata from each trial is logged to its own run.

### Training loop

In [None]:
for i, lr in enumerate(learning_rates):
    # Create a new run
    run = neptune.init_run(
        api_token=neptune.ANONYMOUS_API_TOKEN,
        project="common/pytorch-integration",
        name=f"trial-{i}",
        tags=["trial-level"],  # to indicate that the run only contains results from a single trial
    )

    # Log hyperparameters
    run["parms"] = stringify_unsupported(parameters)
    run["parms/lr"] = lr

    # Start every trial from freshly initialized weights so that trials are comparable
    model = BaseModel(input_size, input_size, parameters["n_classes"]).to(parameters["device"])
    # Create the optimizer with this trial's learning rate
    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for x, y in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)

            _, preds = torch.max(outputs, 1)
            acc = torch.sum(preds == y) / len(x)

            # Log metrics as plain Python floats
            run["training/batch/loss"].append(loss.item())
            run["training/batch/acc"].append(acc.item())

            loss.backward()
            optimizer.step()

    # Stop logging
    run.stop()
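When every trial has its own run, an exception in the middle of a trial would skip the `run.stop()` call at the end of the loop body. One way to guard against that is a `try`/`finally` block, sketched below. `StandInRun` is a hypothetical stand-in for a Neptune run so that the example executes without a connection, and `run_trial` and `failing_training` are illustrative names.

```python
class StandInRun:
    """Hypothetical stand-in for a Neptune run; only mimics stop()."""

    def __init__(self, name):
        self.name = name
        self.stopped = False

    def stop(self):
        self.stopped = True


def run_trial(run, train_fn):
    """Train one trial and stop the run even if training raises."""
    try:
        train_fn(run)
    finally:
        run.stop()


def failing_training(run):
    # Simulates a trial that crashes partway through
    raise RuntimeError("simulated failure")


demo_run = StandInRun("trial-0")
try:
    run_trial(demo_run, failing_training)
except RuntimeError:
    pass  # the error still propagates, but the run has been stopped

print(demo_run.stopped)
```

In the notebook, the object passed as `run` would be the one returned by `neptune.init_run()`.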

## Explore the results in Neptune

Follow the link in the cell output to open a run, then explore the logged metadata, such as metrics and hyperparameters.

You can also check out these example runs:
- [Log metadata across HPO trials into a single run](https://app.neptune.ai/o/common/org/pytorch-integration/e/PYTOR1-1025/all)
- [Log metadata from each HPO trial into separate runs](https://app.neptune.ai/o/common/org/pytorch-integration/experiments?split=tbl&dash=Loss-vs-Accuracy-bf72be6c-d771-457f-8f51-30fef2bee3d5&viewId=97f3180c-a819-4054-99d7-d62dac102450)