# How to reproduce a Neptune run

## Introduction
When building ML models for research or production, it's crucial to be able to reproduce a run to validate its results and performance. With Neptune, you can reproduce any run by retrieving the metadata that was logged to it, such as hyperparameters, data, and code version.

In this guide, we'll show you how to re-open an existing Neptune run to retrieve the metadata required for reproducing it. 

[See this example in Neptune](https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5234/metadata)



## Before you start

Make sure that you have:
* [Python 3.7+ installed](https://www.python.org/downloads/)
* [Basic familiarity with Neptune (creating a run and logging metadata to it)](https://docs.neptune.ai/usage/#getting-started)

In [2]:
! pip install -U neptune torch torchvision

Collecting neptune
  Downloading neptune-1.0.0-py3-none-any.whl (442 kB)
Installing collected packages: neptune
Successfully installed neptune-1.0.0


## Step 1: Get run ID
You will get the ID of the run you want to reproduce **programmatically**.

**Note**: To log or retrieve metadata from Neptune, you need the project name and the API token.

To make this example easy to follow, we'll use the public project **'common/pytorch-integration'** and a shared token for anonymous logging.

**(Optional)** If you want to log to your own project, you need a [Neptune account](https://app.neptune.ai/register/) and a [project](https://docs.neptune.ai/setup/creating_project).
Then you can pass [project](https://docs.neptune.ai/setup/creating_project/#next-steps) and [api_token](https://docs.neptune.ai/setup/setting_api_token/#setting-your-api-token) arguments to the `init_run()` method.

`run = neptune.init_run(api_token='YOUR_API_TOKEN', project='YOUR_WORKSPACE/YOUR_PROJECT')` 


In [18]:
project_name="common/pytorch-integration"

In [19]:
import neptune

# Fetch project
project = neptune.init_project(
    project=project_name, api_token=neptune.ANONYMOUS_API_TOKEN, mode="read-only"
)

# Fetch only inactive runs that carry the tags passed below
runs_table_df = project.fetch_runs_table(
    state="inactive", tag=["showcase-run", "reproduce", "Basic script"]
).to_pandas()

# Extract the ID of the first run in the table that did not fail
old_run_id = runs_table_df[runs_table_df["sys/failed"] == False]["sys/id"].values[0]

https://app.neptune.ai/common/pytorch-integration/


In [20]:
print("old_run_id = ", old_run_id)

old_run_id =  PYTOR1-5233
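The selection above indexes `values[0]` directly, which raises an `IndexError` when no run matches the filter. A minimal sketch of a more defensive selection, written against plain dictionaries so it runs without a Neptune connection. The helper name `first_successful_run_id` and the sample rows are hypothetical; the rows only assume the `sys/id` and `sys/failed` columns used above.

```python
def first_successful_run_id(rows):
    """Return the "sys/id" of the first row whose "sys/failed" flag is false."""
    for row in rows:
        if not row["sys/failed"]:
            return row["sys/id"]
    raise LookupError("no successful run matched the filter")


# Hypothetical rows, shaped like runs_table_df.to_dict("records")
rows = [
    {"sys/id": "RUN-3", "sys/failed": True},
    {"sys/id": "RUN-2", "sys/failed": False},
    {"sys/id": "RUN-1", "sys/failed": False},
]

print(first_successful_run_id(rows))  # RUN-2
```

Raising a descriptive error here makes an empty result easier to diagnose than a bare index error further down the notebook.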


## Step 2: Resume old run
Use the `neptune.init_run()` method to:
* Re-open a run using the ID you got from the previous step 
* Re-open it in the `read-only` mode

Use the `read-only` mode so the metadata previously logged to the run is not accidentally changed. Also, you can re-open a run as many times as needed.

**(Optional)** If you already have a [Neptune account](https://app.neptune.ai/register/), you can pass your credentials to the **[project](https://docs.neptune.ai/getting-started/installation#setting-the-project-name)** and **[api_token](https://docs.neptune.ai/getting-started/installation#authentication-neptune-api-token)** arguments of `neptune.init_run()`.

```python
from getpass import getpass

run = neptune.init_run(
    api_token=getpass("Enter your Neptune API token: "),
    project="workspace-name/project-name",  # replace with your own
) 
```
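As an alternative to passing credentials in code, Neptune can read them from the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables when the arguments are omitted. The sketch below is a pre-flight check that reports which of the two are unset before any run is opened. The helper name `missing_neptune_settings` is hypothetical and not part of the Neptune API.

```python
import os

NEPTUNE_ENV_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_settings(env=None):
    """Return the names of the Neptune environment variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in NEPTUNE_ENV_VARS if not env.get(name)]


# Passing a plain dict keeps the check easy to try without touching the real environment
print(missing_neptune_settings({"NEPTUNE_PROJECT": "workspace-name/project-name"}))
# ['NEPTUNE_API_TOKEN']
```

If the returned list is empty, `neptune.init_run()` can be called without the `project` and `api_token` arguments.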

In [21]:
old_run = neptune.init_run(
    project=project_name,
    api_token=neptune.ANONYMOUS_API_TOKEN,
    with_id=old_run_id,
    mode="read-only",
)

https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5233


## Step 3: Fetch relevant metadata from Neptune

Fetch the metadata needed to re-run the training. Specifically, you will download the hyperparameters and the dataset path used in the old run, so that you can instantiate the model and dataset objects with the same configuration.

To do that, use the [fetch()](https://docs.neptune.ai/api/field-types/#fetch-1) method on each field you want to retrieve.

In [23]:
# Fetch hyperparameters
old_run_params = old_run["config/params"].fetch()

In [25]:
# Fetch dataset path
dataset_path = old_run["config/dataset/path"].fetch()
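The rest of this guide reads the keys `bs`, `input_sz`, `n_classes`, and `lr` from the fetched hyperparameters. A minimal sketch of a check that catches a missing key right after fetching, rather than later during model construction. The helper name `missing_params` and the example values are hypothetical.

```python
REQUIRED_KEYS = ("bs", "input_sz", "n_classes", "lr")


def missing_params(params, required=REQUIRED_KEYS):
    """Return the required hyperparameter names that are absent from params."""
    return [key for key in required if key not in params]


# Hypothetical values standing in for old_run["config/params"].fetch()
example_params = {"bs": 128, "input_sz": 3072, "n_classes": 10}

print(missing_params(example_params))  # ['lr']
```

In the notebook, the same call would take `old_run_params` as its argument.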

## Step 4: Create a new run
Create a new Neptune run that will be used to log metadata in the re-run session.

In [26]:
new_run = neptune.init_run(
    project=project_name,
    api_token=neptune.ANONYMOUS_API_TOKEN,
    tags=["reproduce", "new-run", "showcase-run"],
)


https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5234


Running this cell creates a run in Neptune, and you can log model building metadata to it.

**Click on the link above to open the run in the Neptune app.** 

For now, it is empty, but you should keep the tab open to see what happens next.

## Step 5: Log hyperparameters and dataset details from the old run to the new run
Now you can continue working and logging metadata to a brand new run.
You can log metadata using the Neptune API Client. For details, see [What you can log and display](https://docs.neptune.ai/logging/what_you_can_log).

In [27]:
new_run["config/params"] = old_run_params
new_run["config/dataset/path"] = dataset_path
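It can also help to record which run the new one reproduces, so the link between the two is visible in the app. The sketch below assumes a field path of `reproduce/source_run_id`, which is a hypothetical name chosen for illustration. A plain dictionary stands in for the run object here, since both support item assignment with a string key.

```python
def link_to_source(run, source_run_id, field="reproduce/source_run_id"):
    """Store the ID of the run being reproduced under the given field path."""
    run[field] = source_run_id
    return run


# A dict stands in for new_run so the example runs without a Neptune connection
new_run_stub = {}
link_to_source(new_run_stub, "PYTOR1-5233")

print(new_run_stub)  # {'reproduce/source_run_id': 'PYTOR1-5233'}
```

With a real run, the call would be `link_to_source(new_run, old_run_id)`.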

### Load dataset and model

Dataset

In [28]:
import torch
from torchvision import datasets, transforms

data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    ),
}

In [29]:
trainset = datasets.CIFAR10(dataset_path, transform=data_tfms["train"], download=True)

trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=old_run_params["bs"], shuffle=True, num_workers=0
)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/CIFAR10/cifar-10-python.tar.gz


  0%|          | 0/170498071 [00:00<?, ?it/s]

Extracting data/CIFAR10/cifar-10-python.tar.gz to data/CIFAR10


Model

In [30]:
import torch.nn as nn


class BaseModel(nn.Module):
    """Fully connected classifier for flattened 32x32 RGB images."""

    def __init__(self, input_sz, hidden_dim, n_classes):
        super().__init__()
        self.main = nn.Sequential(
            nn.Linear(input_sz, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )

    def forward(self, x):
        # Flatten each 32x32x3 image into a single vector
        x = x.view(-1, 32 * 32 * 3)
        return self.main(x)

In [32]:
model = BaseModel(
    old_run_params["input_sz"],  # input_sz
    old_run_params["input_sz"],  # hidden_dim (this example reuses input_sz)
    old_run_params["n_classes"],  # n_classes
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=old_run_params["lr"])
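To see how the fetched hyperparameters determine the network size, the layer shapes of `BaseModel` can be worked out without PyTorch. The sketch below mirrors the four `nn.Linear` layers defined above. The concrete values are assumptions: `input_sz` is taken from the `32 * 32 * 3` reshape in `forward`, `hidden_dim` reuses `input_sz` as in the constructor call, and `n_classes` is 10 for CIFAR-10. The values actually fetched from the old run may differ.

```python
def layer_shapes(input_sz, hidden_dim, n_classes):
    """(in_features, out_features) of each linear layer in BaseModel, in order."""
    return [
        (input_sz, hidden_dim * 2),
        (hidden_dim * 2, hidden_dim),
        (hidden_dim, hidden_dim // 2),
        (hidden_dim // 2, n_classes),
    ]


def parameter_count(shapes):
    """Total number of weights plus biases across all linear layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


input_sz = 32 * 32 * 3  # assumed; matches the reshape in BaseModel.forward
shapes = layer_shapes(input_sz, hidden_dim=input_sz, n_classes=10)

print(shapes[0])                # (3072, 6144)
print(parameter_count(shapes))  # 42493450
```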

### Log losses and metrics

In [33]:
for i, (x, y) in enumerate(trainloader, 0):
    optimizer.zero_grad()
    outputs = model(x)
    _, preds = torch.max(outputs, 1)
    loss = criterion(outputs, y)
    acc = (preds == y).sum().item() / len(x)

    # Log plain Python floats rather than tensors
    new_run["training/batch/loss"].append(loss.item())

    new_run["training/batch/acc"].append(acc)

    loss.backward()
    optimizer.step()
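Once the loop finishes, one way to judge whether the reproduction succeeded is to compare a metric from the new run with the same metric from the old run. Because the data loader shuffles and the transform flips images at random, the values are unlikely to match exactly, so the sketch below compares within a relative tolerance. The helper name `metrics_match`, the 5% tolerance, and the sample values are hypothetical.

```python
import math


def metrics_match(old_value, new_value, rel_tol=0.05):
    """Return True if the two metric values agree within the relative tolerance."""
    return math.isclose(old_value, new_value, rel_tol=rel_tol)


# Hypothetical final losses. With Neptune, they could be read with, for example,
# old_run["training/batch/loss"].fetch_last(), assuming the old run logged that field.
print(metrics_match(1.80, 1.85))  # True
print(metrics_match(1.80, 2.50))  # False
```

The comparison needs to happen before the runs are stopped, since values cannot be fetched from a stopped run object.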

## Stop logging

Once you are done logging, stop tracking the run.

In [34]:
old_run.stop()
new_run.stop()

Shutting down background jobs, please wait a moment...
Done!
Explore the metadata in the Neptune app:
https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5233/metadata
Shutting down background jobs, please wait a moment...
Done!
Waiting for the remaining 50 operations to synchronize with Neptune. Do not kill this process.
All 50 operations synced, thanks for waiting!
Explore the metadata in the Neptune app:
https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5234/metadata


## Explore the run in the Neptune app

After running the code cell in **Step 4**, the cell output shows a link similar to https://app.neptune.ai/common/pytorch-integration/e/PYTOR1-5234/metadata. When you log to your own project:
* **common/pytorch-integration** is replaced by **your_workspace/your_project**,
* **PYTOR1-5234** is replaced by your run ID.

**Click on the link to open the run in the Neptune app.**

## Conclusion
You learned how to:
* Re-open an old run to fetch the metadata needed to reproduce it.
* Use fetched metadata to parametrize a new run with the same training loop.

**This knowledge can be applied to any other scenario as well!**

Visit our docs for more tutorials and guides on how to use Neptune: https://docs.neptune.ai
