# How to present cross-validation results with Neptune

## Introduction

When training models with cross-validation, you can use Neptune namespaces to organize, visualize and compare models.

By the end of this guide, you will know how to organize a run to track cross-validation metadata so that the results are easy to analyze.

[See this example in Neptune](https://app.neptune.ai/o/common/org/showroom/e/SHOW-3700)

[![image](https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MT0sYKbymfLAAtTq4-t%2Fuploads%2FLRq4l78xEp31FFS5WzFM%2Fimage.png?alt=media&token=be868c83-7b1b-4314-b138-677531720f0e)](https://app.neptune.ai/o/common/org/showroom/e/SHOW-3700)
<center><small>CV results presented in Neptune UI</small></center>



## Before you start

Make sure that you have:
* [Python 3.7+ installed](https://www.python.org/downloads/),
* [Basic familiarity with Neptune (creating a run and logging metadata to it)](https://docs.neptune.ai/you-should-know/what-can-you-log-and-display),
* Familiarity with cross-validation techniques in machine learning.

**Install dependencies**

In [None]:
! pip install neptune-client torch==1.10.2 torchvision==0.11.3 scikit-learn==1.0.2 #add other dependencies

## Step 1: Create a Neptune *Run*

To log metadata to a Neptune project, you need the project name and an API token.

To make this example easy to follow, we have created a public project **'common/showroom'** and a shared user **'neptuner'** with the API token **'ANONYMOUS'**, as you will see in the code cell below.

**(Optional)** To log to your own project, you need a [Neptune account](https://app.neptune.ai/register/) and a [project](https://docs.neptune.ai/getting-started/installation#setting-the-project-name).
You can then pass the [project](https://docs.neptune.ai/getting-started/installation#setting-the-project-name) and [api_token](https://docs.neptune.ai/getting-started/installation#authentication-neptune-api-token) arguments to the `init()` method:

`run = neptune.init(api_token='<YOUR_API_TOKEN>', project='<YOUR_WORKSPACE/YOUR_PROJECT>')` 
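A private token that is hard-coded in a notebook is easy to leak. The following is a minimal sketch, assuming the `NEPTUNE_PROJECT` and `NEPTUNE_API_TOKEN` environment variables are set for your own project. `resolve_credentials` is a hypothetical helper, not part of the Neptune client, and it falls back to the public demo values used in this guide.

```python
import os


def resolve_credentials(env=None):
    """Return (project, api_token), falling back to the public demo values."""
    env = os.environ if env is None else env
    project = env.get("NEPTUNE_PROJECT", "common/showroom")
    api_token = env.get("NEPTUNE_API_TOKEN", "ANONYMOUS")
    return project, api_token


project, api_token = resolve_credentials()
# The pair can then be passed on: neptune.init(project=project, api_token=api_token)
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.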


In [None]:
import neptune.new as neptune

run = neptune.init(
    project="common/showroom",
    api_token="ANONYMOUS",
    tags=["Colab Notebook", "cross-validation"],
)

Running this cell creates a [Run in Neptune](https://app.neptune.ai/o/common/org/showroom/e/SHOW-3700), and you can log model building metadata to it.

**Click on the link above to open the Run in Neptune UI.** 

For now, it is empty, but you should keep the tab open to see what happens next.

## Step 2: Log config and hyperparameters

### Log Hyperparameters

In [None]:
parameters = {
    "epochs": 3,
    "learning_rate": 1e-2,
    "batch_size": 10,
    "input_size": 32 * 32 * 3,
    "n_classes": 10,
    "k_folds": 5,
    "model_name": "checkpoint.pth",
    "seed": 42,
}

In [None]:
run["global/params"] = parameters
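Assigning a dictionary to a namespace stores each key as its own field under that namespace. The sketch below only illustrates the resulting field paths in plain Python; `flatten` is a hypothetical helper written for this guide, not a Neptune function.

```python
def flatten(prefix, values):
    """Yield (field_path, value) pairs for a possibly nested dictionary."""
    for key, value in values.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Nested dictionaries become nested namespaces
            yield from flatten(path, value)
        else:
            yield path, value


parameters = {"epochs": 3, "learning_rate": 1e-2, "k_folds": 5}
fields = dict(flatten("global/params", parameters))
for path, value in fields.items():
    print(path, "=", value)
```

Keeping all shared settings under `global/` separates them from the per-fold namespaces created later in the training loop.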

### Log Config
Model and Dataset

In [None]:
import torch.nn as nn


class BaseModel(nn.Module):
    def __init__(self, input_sz, hidden_dim, n_classes):
        super(BaseModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_sz, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )

    def forward(self, input):
        # Flatten each image to a vector without hard-coding the input size
        x = input.view(input.size(0), -1)
        return self.main(x)

In [None]:
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [None]:
torch.manual_seed(parameters["seed"])
model = BaseModel(
    parameters["input_size"], parameters["input_size"], parameters["n_classes"]
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=parameters["learning_rate"])
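Because `hidden_dim` is set to `parameters["input_size"]` here, the network is fairly large. As a rough check, the sketch below recomputes the layer shapes of `BaseModel` in plain Python and counts weights plus biases, assuming every linear layer has a bias term (the PyTorch default). The helper names are hypothetical.

```python
def linear_layer_shapes(input_sz, hidden_dim, n_classes):
    """Return the (in_features, out_features) pairs used by BaseModel."""
    return [
        (input_sz, hidden_dim * 2),
        (hidden_dim * 2, hidden_dim),
        (hidden_dim, hidden_dim // 2),
        (hidden_dim // 2, n_classes),
    ]


def count_parameters(shapes):
    """Each linear layer holds in*out weights plus out biases."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


shapes = linear_layer_shapes(32 * 32 * 3, 32 * 32 * 3, 10)
total = count_parameters(shapes)
print(shapes)
print(f"{total:,} trainable parameters")
```

With these settings the count comes to roughly 42.5 million parameters, so a smaller `hidden_dim` may be worth trying if training is slow.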

Log model, criterion and optimizer name

In [None]:
run["global/config/model"] = type(model).__name__
run["global/config/criterion"] = type(criterion).__name__
run["global/config/optimizer"] = type(optimizer).__name__

In [None]:
from torchvision import datasets, transforms

data_dir = "data/CIFAR10"
compressed_ds = "./data/CIFAR10/cifar-10-python.tar.gz"
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
}

In [None]:
trainset = datasets.CIFAR10(data_dir, transform=data_tfms["train"], download=True)

validset = datasets.CIFAR10(
    data_dir, train=False, transform=data_tfms["train"], download=True
)

dataset_size = len(trainset)

Log dataset details

In [None]:
run["global/dataset/CIFAR-10"].track_files(data_dir)
run["global/dataset/dataset_transforms"] = data_tfms
run["global/dataset/dataset_size"] = dataset_size

## Step 3: Log losses and metrics per fold 
Training Loop

In [None]:
from sklearn.model_selection import KFold

# Seed the shuffle so that the fold assignment is reproducible
splits = KFold(
    n_splits=parameters["k_folds"], shuffle=True, random_state=parameters["seed"]
)
epoch_acc_list, epoch_loss_list = [], []
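To make the fold structure concrete, here is a minimal pure-Python sketch of k-fold index splitting. It is an unshuffled illustration, not scikit-learn's implementation, and `kfold_indices` is a hypothetical helper; the guide itself uses `KFold` with shuffling enabled.

```python
def kfold_indices(n_samples, n_splits):
    """Split range(n_samples) into n_splits contiguous validation folds."""
    base, extra = divmod(n_samples, n_splits)
    indices = list(range(n_samples))
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample
        size = base + (1 if fold < extra else 0)
        valid_ids = indices[start:start + size]
        train_ids = indices[:start] + indices[start + size:]
        yield train_ids, valid_ids
        start += size


for fold, (train_ids, valid_ids) in enumerate(kfold_indices(10, 5)):
    print(fold, train_ids, valid_ids)
```

Every sample appears in exactly one validation fold and in the training indices of all other folds.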

In [None]:
from torch.utils.data import SubsetRandomSampler, DataLoader

for fold, (train_ids, _) in enumerate(splits.split(trainset)):
    # Sample only from this fold's training indices
    train_sampler = SubsetRandomSampler(train_ids)
    train_loader = DataLoader(
        trainset, batch_size=parameters["batch_size"], sampler=train_sampler
    )
    for epoch in range(parameters["epochs"]):
        # Reset the accumulators at the start of every epoch
        epoch_acc, epoch_loss = 0, 0.0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            outputs = model(x)
            _, preds = torch.max(outputs, 1)
            loss = criterion(outputs, y)
            acc = torch.sum(preds == y).item() / len(x)

            # Log batch loss and acc as plain Python floats
            run[f"fold_{fold}/training/batch/loss"].log(loss.item())
            run[f"fold_{fold}/training/batch/acc"].log(acc)

            loss.backward()
            optimizer.step()

            # Accumulate over every batch, not only the last one
            epoch_acc += torch.sum(preds == y).item()
            epoch_loss += loss.item() * x.size(0)

    # Record the final epoch's accuracy (%) and mean loss for this fold
    epoch_acc_list.append((epoch_acc / len(train_loader.sampler)) * 100)
    epoch_loss_list.append(epoch_loss / len(train_loader.sampler))

    # Log model checkpoint
    torch.save(model.state_dict(), f"./{parameters['model_name']}")
    run[f"fold_{fold}/checkpoint"].upload(parameters["model_name"])
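Epoch-level metrics are only meaningful when every batch contributes to them. The sketch below shows that aggregation in plain Python, weighting each batch's mean loss by its size so that a smaller final batch does not skew the average. `epoch_metrics` is a hypothetical helper and the batch values are made up for illustration.

```python
def epoch_metrics(batches):
    """Aggregate per-batch results into epoch accuracy (%) and mean loss.

    Each batch is a (mean_loss, n_correct, n_samples) tuple.
    """
    total_loss, total_correct, total_samples = 0.0, 0, 0
    for mean_loss, n_correct, n_samples in batches:
        total_loss += mean_loss * n_samples  # undo the per-batch averaging
        total_correct += n_correct
        total_samples += n_samples
    return 100 * total_correct / total_samples, total_loss / total_samples


# Made-up batches: two full batches of 10 and a final batch of 5
batches = [(0.9, 6, 10), (0.7, 7, 10), (0.5, 4, 5)]
acc, loss = epoch_metrics(batches)
print(f"accuracy: {acc:.1f}%  loss: {loss:.3f}")
```

The same weighting is what `loss.item() * x.size(0)` followed by a division by the sampler length achieves in a PyTorch loop.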

In [None]:
from statistics import mean

# log global acc and loss
run["global/metrics/train/mean_acc"] = mean(epoch_acc_list)
run["global/metrics/train/mean_loss"] = mean(epoch_loss_list)
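A mean alone hides how much the folds disagree. As a possible extension, the sketch below also computes the sample standard deviation across folds with the standard library. `summarize_folds` is a hypothetical helper, the numbers are invented, and a field name such as `global/metrics/train/std_acc` would be your own choice.

```python
from statistics import mean, stdev


def summarize_folds(values):
    """Return the mean and sample standard deviation of per-fold values."""
    # stdev needs at least two data points
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"mean": mean(values), "std": spread}


fold_acc = [41.0, 43.0, 42.0, 40.0, 44.0]  # made-up numbers, not real results
summary = summarize_folds(fold_acc)
print(summary)
```

A large spread relative to the mean suggests that the result depends heavily on how the data was split.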

## Stop run

**Warning**

Once you are done logging, stop tracking the run with the `stop()` method.
This is needed only when logging from a notebook environment. When logging from a script, Neptune automatically stops tracking once the script has finished executing.

In [None]:
run.stop()
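One way to make sure `stop()` is always called, even when the training code raises, is a `try`/`finally` block. The sketch below uses `DummyRun`, a hypothetical stand-in that only records calls, so it runs without a Neptune connection; with a real run the same pattern would apply.

```python
class DummyRun:
    """Hypothetical stand-in for a Neptune run; it only records calls."""

    def __init__(self):
        self.fields = {}
        self.stopped = False

    def __setitem__(self, path, value):
        self.fields[path] = value

    def stop(self):
        self.stopped = True


def train_and_log(run, fail=False):
    """Log one value, optionally fail, and always stop the run."""
    try:
        run["global/metrics/train/mean_acc"] = 42.0  # placeholder value
        if fail:
            raise RuntimeError("simulated training failure")
    finally:
        run.stop()


run_demo = DummyRun()
try:
    train_and_log(run_demo, fail=True)
except RuntimeError:
    pass
print(run_demo.stopped)
```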

## Explore the run in the Neptune UI

After you run the code cell in **Step 1**, the cell output contains a link similar to https://app.neptune.ai/common/showroom/e/SHOW-3700, with:
* **common/showroom** replaced by **your_workspace/your_project**,
* **SHOW-3700** replaced by your *Run ID*.

**Click on the link to open the Run in Neptune UI.**
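If you need the link programmatically, it can be rebuilt from its parts. This is a small sketch that assumes the URL pattern of the example link in this guide; `run_url` is a hypothetical helper, not a Neptune API.

```python
def run_url(workspace, project, run_id):
    """Build a run link following the pattern of the example URL above."""
    return f"https://app.neptune.ai/{workspace}/{project}/e/{run_id}"


print(run_url("common", "showroom", "SHOW-3700"))
```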

![image](https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MT0sYKbymfLAAtTq4-t%2Fuploads%2FnP5NQ7TUcuqr47cCERGk%2Fper_fold_metadata.gif?alt=media&token=1f851480-3881-4320-8b67-fb0bcc3e0bce)
<center><small>Analysing per-fold metadata</small></center>

## Cross-validation with Integrations
If you are using Neptune with XGBoost or LightGBM, the available integrations create the cross-validation structure automatically.
<div style="position: relative; padding-bottom: 62.5%; height: 0;"><iframe src="https://www.loom.com/embed/98dc6247c65f49b8baf7476cf996dbe4" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;"></iframe></div>

## Conclusion

You have learned how to organize a run to track cross-validation metadata with Neptune and how to present the results in the Neptune UI for further comparison and analysis.

Visit our docs for more tutorials and guides on how to use Neptune: https://docs.neptune.ai
