# How to use Neptune in an HPO training job

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/how-to-guides/neptune-hpo/notebooks/Neptune_hpo.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>
<a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/how-to-guides/neptune-hpo/notebooks/Neptune_hpo.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a>
<a target="_blank" href="https://app.neptune.ai/o/showcase/org/hpo/runs/table?viewId=9ca5a860-361e-4b3e-aae8-ddd8c5454cba&detailsTab=dashboard&dash=table&type=run"> 
  <img alt="Explore in Neptune" src="https://neptune.ai/wp-content/uploads/2024/01/neptune-badge.svg">
</a>
<a target="_blank" href="https://docs-legacy.neptune.ai/tutorials/hpo/">
  <img alt="View tutorial in docs" src="https://neptune.ai/wp-content/uploads/2024/01/docs-badge-2.svg">
</a>

## Introduction

When running a hyperparameter optimization job, you can use Neptune to track all the metadata from the study and each trial.

In this guide, you'll learn how to configure Neptune to track the metadata of your hyperparameter optimization job.

## Before you start

This notebook example lets you try out Neptune anonymously, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs-legacy.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [None]:
! pip install -qU neptune torch torchvision tqdm "numpy<2.0"

## Import libraries

In [None]:
import neptune
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from tqdm.auto import trange
from functools import reduce
from neptune.utils import stringify_unsupported

## Hyperparameters

In [None]:
parameters = {
    "batch_size": 128,
    "epochs": 2,
    "input_size": (3, 32, 32),
    "n_classes": 10,
    "dataset_size": 1000,
    "model_filename": "basemodel",
    "device": torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
}

input_size = reduce(lambda x, y: x * y, parameters["input_size"])
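
The flattened input size is simply the product of the image dimensions. As a quick sanity check, here is a minimal sketch that compares the `reduce` approach with `math.prod` (assuming Python 3.8 or later; the `image_shape` name is introduced here only for illustration):

```python
from functools import reduce
from math import prod

image_shape = (3, 32, 32)  # same shape as parameters["input_size"]

# Multiply all dimensions together, as in the cell above
flat_reduce = reduce(lambda x, y: x * y, image_shape)

# Equivalent standard-library shortcut (Python 3.8+)
flat_prod = prod(image_shape)

print(flat_reduce, flat_prod)
```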

### Hyperparameter search space

In [None]:
learning_rates = [0.01, 0.05, 0.1]  # learning rate choices
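
This guide searches over the learning rate only. If you wanted to search over several hyperparameters, one possible approach is to expand the search space into a grid of trial configurations. The sketch below is illustrative: the `search_space` dict and the `momentum` values are assumptions and are not used elsewhere in this guide.

```python
from itertools import product

# Hypothetical search space; only "lr" appears in this guide
search_space = {
    "lr": [0.01, 0.05, 0.1],
    "momentum": [0.0, 0.9],  # assumed extra hyperparameter, for illustration
}

# One dict per trial, covering every combination of values
trial_configs = [
    dict(zip(search_space, values)) for values in product(*search_space.values())
]

for i, config in enumerate(trial_configs):
    print(i, config)
```

Each entry in `trial_configs` could then be logged under its own `trials/{i}/params` namespace in the same way as the single learning rate.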

## Model

In [None]:
class BaseModel(nn.Module):
    def __init__(self, input_size, hidden_dim, n_classes):
        super(BaseModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_size, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )
        self.input_size = input_size

    def forward(self, input):
        x = input.view(-1, self.input_size)
        return self.main(x)

In [None]:
model = BaseModel(
    input_size,
    input_size,
    parameters["n_classes"],
).to(parameters["device"])

criterion = nn.CrossEntropyLoss()
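
Because `hidden_dim` is set to `input_size` when the model is instantiated, the network is fairly large. The following framework-free sketch estimates the layer shapes and parameter count, assuming each `nn.Linear(in, out)` layer holds `in * out` weights plus `out` biases (the default behavior with `bias=True`):

```python
input_size = 3 * 32 * 32  # flattened image size used in this guide
hidden_dim = input_size   # the model is built with hidden_dim == input_size
n_classes = 10

# (in_features, out_features) for each nn.Linear layer in BaseModel
layer_shapes = [
    (input_size, hidden_dim * 2),
    (hidden_dim * 2, hidden_dim),
    (hidden_dim, hidden_dim // 2),
    (hidden_dim // 2, n_classes),
]

# Weights plus biases per layer
param_counts = [n_in * n_out + n_out for n_in, n_out in layer_shapes]
total_params = sum(param_counts)

print(layer_shapes)
print(f"Total trainable parameters: {total_params:,}")
```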

## Dataset

In [None]:
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
}

In [None]:
trainset = datasets.FakeData(
    size=parameters["dataset_size"],
    image_size=parameters["input_size"],
    num_classes=parameters["n_classes"],
    transform=data_tfms["train"],
)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=parameters["batch_size"], shuffle=True, num_workers=0
)

## Log metadata across HPO trials into a single run

Create a global Neptune run to log metadata across different trials.

To connect to the Neptune app, you need to tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in the public project [common/hpo](https://app.neptune.ai/common/hpo).  

**Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

#### Log to your own project instead

Replace the code below with the following:

```python
import os
from getpass import getpass

os.environ["NEPTUNE_API_TOKEN"] = getpass("Enter your Neptune API token: ")
os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"  # Replace with your workspace and project names
```

In [None]:
import os

os.environ["NEPTUNE_API_TOKEN"] = neptune.ANONYMOUS_API_TOKEN
os.environ["NEPTUNE_PROJECT"] = "common/hpo"

In [None]:
run = neptune.init_run(tags=["notebook"])

**To view the newly created run and its metadata in the Neptune app, use the link that appeared in the cell output.**



### Training loop

In [None]:
# Initialize the best loss once so that it is tracked across all trials
best_loss = None

for i, lr in enumerate(learning_rates):
    # Log hyperparameters
    run[f"trials/{i}/params"] = stringify_unsupported(parameters)
    run[f"trials/{i}/params/lr"] = lr

    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for x, y in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)

            _, preds = torch.max(outputs, 1)
            acc = (torch.sum(preds == y.data)) / len(x)

            # Log trial metrics
            run[f"trials/{i}/metrics/batch/loss"].append(loss)
            run[f"trials/{i}/metrics/batch/acc"].append(acc)

            # Log best values across all trials
            if best_loss is None or loss < best_loss:
                run["best/trial"] = i
                run["best/metrics/loss"] = best_loss = loss
                run["best/metrics/acc"] = acc
                run["best/params"] = stringify_unsupported(parameters)
                run["best/params/lr"] = lr

            loss.backward()
            optimizer.step()

### Stop logging

In [None]:
run.stop()

### Explore the results in Neptune
Follow the link to the run and explore the logged metadata (such as metrics and hyperparameters) in Neptune:

- The best trial, with its metrics and parameters, is available in the *best* namespace
- Metadata across all trials is available in the *trials* namespace
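
To make the namespace layout concrete, here is a minimal sketch of the bookkeeping, assuming the best value is tracked across all trials. A plain dict stands in for the Neptune run, and the loss values are made up for illustration; neither `fake_run` nor `trial_losses` is part of the Neptune API.

```python
# Stand-in for a Neptune run: a plain dict keyed by field path (illustration only)
fake_run = {}

# Hypothetical per-trial batch losses; real values come from the training loop
trial_losses = {0: [2.31, 2.27], 1: [2.40, 2.12], 2: [2.35, 2.30]}
learning_rates = [0.01, 0.05, 0.1]

best_loss = None  # initialized once so that it spans all trials
for i, lr in enumerate(learning_rates):
    fake_run[f"trials/{i}/params/lr"] = lr
    for loss in trial_losses[i]:
        # Per-trial series, mirroring trials/{i}/metrics/batch/loss
        fake_run.setdefault(f"trials/{i}/metrics/batch/loss", []).append(loss)

        # Overwrite the "best" fields whenever a lower loss is seen
        if best_loss is None or loss < best_loss:
            best_loss = loss
            fake_run["best/trial"] = i
            fake_run["best/metrics/loss"] = loss
            fake_run["best/params/lr"] = lr

print(fake_run["best/trial"], fake_run["best/metrics/loss"])
```

Fields under `trials/{i}` grow with every batch, while fields under `best` are overwritten and always hold a single value.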

To organize all relevant metadata in one view, create a [custom dashboard](https://docs-legacy.neptune.ai/app/custom_dashboard/). [See an example](https://app.neptune.ai/o/showcase/org/hpo/runs/details?viewId=9ca5a9f2-e889-435c-a6f4-77cc41886832&detailsTab=dashboard&dashboardId=9ca5aa39-24cd-43bf-8cef-07aae8b4478b&shortId=HPO-1&type=run).

You can also create [saved table views](https://docs-legacy.neptune.ai/app/experiments/#custom-views) to view best trials across different runs. An example is available [here](https://app.neptune.ai/o/showcase/org/hpo/runs/table?viewId=9ca5a9f2-e889-435c-a6f4-77cc41886832&detailsTab=dashboard&dash=table&type=run).

## Log metadata from each HPO trial into separate runs

You can also log metadata from each trial into separate runs. This way, you can track metadata from each trial separately.  
Aggregated values can be logged to a parent sweep-level run. Sweep-level identifiers can be used to group all trials from the same sweep.
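
The grouping idea can be sketched without Neptune: every run that belongs to a sweep carries the same identifier, so runs can be collected by that identifier later. In this illustration, the `runs` records and the `group_tags` key are hypothetical stand-ins for Neptune runs and their `sys/group_tags` field.

```python
import uuid
from collections import defaultdict

# Two independent sweeps, each with its own identifier
sweep_a = str(uuid.uuid4())
sweep_b = str(uuid.uuid4())

# Hypothetical run records; in Neptune, the tag would live in "sys/group_tags"
runs = [
    {"name": "sweep", "group_tags": {sweep_a}},
    {"name": "trial-0", "group_tags": {sweep_a}},
    {"name": "trial-1", "group_tags": {sweep_a}},
    {"name": "trial-0", "group_tags": {sweep_b}},
]

# Collect run names under each sweep identifier
runs_by_sweep = defaultdict(list)
for record in runs:
    for tag in record["group_tags"]:
        runs_by_sweep[tag].append(record["name"])

print(runs_by_sweep[sweep_a])
```

Trial names may repeat across sweeps, which is why a unique sweep-level identifier is needed to tell them apart.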

### Create a sweep-level identifier

In [None]:
import uuid

sweep_id = str(uuid.uuid4())

### Initialize sweep-level run

In [None]:
sweep_run = neptune.init_run(
    tags=["notebook", "sweep-level"],
)

### Assign sweep_id to sweep-level run as a group tag


In [None]:
sweep_run["sys/group_tags"].add(sweep_id)

### Training loop

In [None]:
# Initialize the best loss once so that it is tracked across all trials
best_loss = None

for i, lr in enumerate(learning_rates):
    # Create trial-level run
    with neptune.init_run(
        name=f"trial-{i}",
        tags=[
            "notebook",
            "trial-level",
        ],  # to indicate that the run only contains results from a single trial
    ) as trial_run:
        # Add sweep_id to the trial-level run
        trial_run["sys/group_tags"].add(sweep_id)

        # Log hyperparameters
        trial_run["params"] = stringify_unsupported(parameters)
        trial_run["params/lr"] = lr

        optimizer = optim.SGD(model.parameters(), lr=lr)

        for _ in trange(parameters["epochs"]):
            for x, y in trainloader:
                x, y = x.to(parameters["device"]), y.to(parameters["device"])
                optimizer.zero_grad()
                outputs = model(x)
                loss = criterion(outputs, y)

                _, preds = torch.max(outputs, 1)
                acc = (torch.sum(preds == y.data)) / len(x)

                # Log trial metrics
                trial_run["metrics/batch/loss"].append(loss)
                trial_run["metrics/batch/acc"].append(acc)

                # Log best values across all trials to sweep-level run
                if best_loss is None or loss < best_loss:
                    sweep_run["best/trial"] = i
                    sweep_run["best/metrics/loss"] = best_loss = loss
                    sweep_run["best/metrics/acc"] = acc
                    sweep_run["best/params"] = stringify_unsupported(parameters)
                    sweep_run["best/params/lr"] = lr

                loss.backward()
                optimizer.step()

### Stop the sweep-level run

In [None]:
sweep_run.stop()

### Explore the results in Neptune
Follow the link to the runs and explore the logged metadata (such as metrics and hyperparameters) in Neptune:

- The best trial, with its metrics and parameters, is available in the *best* namespace of the sweep-level run
- Metadata from each trial is available in the corresponding trial-level run

To group all trials under a sweep, use [run groups](https://docs-legacy.neptune.ai/usage/groups/). [See an example](https://app.neptune.ai/o/showcase/org/hpo/runs/table?viewId=9ca5a860-361e-4b3e-aae8-ddd8c5454cba&detailsTab=dashboard&dash=table&type=run).