# Neptune Advanced Quickstart (with forking)

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/scale-examples/blob/lb/advanced_quickstart/how-to-guides/advanced_quickstart/neptune_scale_quickstart.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>


This guide shows how to:
- **Create a Neptune run**
- **Log training metrics to the run**
- **Fork the run from a saved checkpoint**

We also demonstrate how Neptune handles large-scale training by logging 66 unique metrics at every step. With the `n_steps=100_000` setting used in this notebook, that amounts to roughly 6.6M data points. You can adjust these values to see how Neptune's ingestion and UI responsiveness hold up.
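
The data point count is simply the number of unique metrics multiplied by the number of steps. As a quick sanity check, here is a minimal sketch of that arithmetic; `total_datapoints` is a hypothetical helper, not part of Neptune.

```python
def total_datapoints(n_metrics: int, n_steps: int) -> int:
    """Return how many data points are logged when every metric is logged at every step."""
    return n_metrics * n_steps


# 66 metrics logged for 20,000 steps
print(f"{total_datapoints(66, 20_000):,}")   # 1,320,000
# 66 metrics logged for 100,000 steps
print(f"{total_datapoints(66, 100_000):,}")  # 6,600,000
```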

## Install Neptune and dependencies

In [None]:
! pip install -q -U neptune_scale numpy tqdm

Import the required packages.

In [11]:
import os
import random
import time
from datetime import datetime  # used below when filtering config value types

import numpy as np
from neptune_scale import Run
from tqdm import trange

from training_simulation import *  # code that simulates neural network training metrics without actual training

### Get and set your API token

If you haven't already, [create a project](https://docs-beta.neptune.ai/setup#1-create-a-project).

To find your API token and full project name:
1. Log in to Neptune Scale.
2. In the bottom-left corner, expand your user menu and select **Get your API token**.
3. Using your token, set the `NEPTUNE_API_TOKEN` environment variable before running this notebook.
4. To find the full project name, open the project settings and copy the project path. Paste it into the cell below in the form `workspace-name/project-name`.

In [None]:
os.environ["NEPTUNE_PROJECT"] = "copy-paste-your-project-id-from-neptune-ui-here"
print(os.environ["NEPTUNE_PROJECT"])
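
Once both values are set, a quick check can catch a missing variable before any run is created. The sketch below only inspects the two environment variables named in this guide; the helper name `missing_neptune_settings` is hypothetical.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_settings(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_neptune_settings()
if missing:
    print(f"Set these environment variables before creating a run: {', '.join(missing)}")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching your real environment.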

## Initialize the Neptune run

To initialize the `Run` object, you can provide a unique `run_id` to identify your run; if you omit it, one is generated automatically. You can also pass your API token and project name as arguments to the `Run` constructor (as in the example below), but here they have already been set as environment variables.

```python
run = Run(
    api_token="YOUR_API_TOKEN",
    project="YOUR_WORKSPACE_NAME/YOUR_PROJECT_NAME",
    run_id="UNIQUE_RUN_IDENTIFIER",
)
```
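
The notebook derives its run ID from `random.randint(0, 10_000)`, which can repeat across sessions. If you need IDs that are very unlikely to collide, one option is a UUID-based suffix. This is a minimal sketch using only the standard library; `make_run_id` is a hypothetical helper and not part of `neptune_scale`.

```python
import uuid


def make_run_id(prefix: str = "quickstart") -> str:
    """Build a run ID from a prefix and a random 8-character hex suffix."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


print(make_run_id())  # the suffix differs on every call
```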

To log configuration parameters or other single values, such as the learning rate, batch size, or optimizer, use the `run.log_configs()` method.

In [None]:
run_index = random.randint(0, 10_000)
custom_id = f"quickstart-{run_index}"  # Sets a random value for the custom run_id

run = Run(
    experiment_name=custom_id,  # This run becomes the head of an experiment
    run_id=custom_id,  # You can customize your run_id, but if not specified, will be generated automatically
)

# Add any tags to identify your runs
run.add_tags(["Quickstart", "Long"])
run.add_tags(["Notebook"], group_tags=True)

parameters = get_parameters(run_index=run_index, n_steps=100_000)
allowed_datatypes = (int, float, str, datetime, bool)  # values of these types are logged as-is
run.log_configs(
    {
        # Anything that is not an allowed type is converted to a string
        **{
            f"parameters/model/{k}": v if isinstance(v, allowed_datatypes) else str(v)
            for k, v in parameters["model"].items()
        },
        **{
            f"parameters/optimizer/{k}": v if isinstance(v, allowed_datatypes) else str(v)
            for k, v in parameters["optimizer"].items()
        },
    }
)

print(run.get_experiment_url())
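
The config dictionary above is built by prefixing each key with a path and converting unsupported values to strings. The same idea can be sketched as a small recursive helper that handles arbitrarily nested dictionaries. This is an illustrative sketch: `flatten_configs` and the example values are hypothetical, and the allowed types simply mirror the list used in the cell above.

```python
from datetime import datetime

ALLOWED_TYPES = (int, float, str, datetime, bool)


def flatten_configs(params: dict, prefix: str = "parameters") -> dict:
    """Flatten nested dicts into 'a/b/c' keys, stringifying unsupported values."""
    flat = {}
    for key, value in params.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(flatten_configs(value, prefix=path))
        elif isinstance(value, ALLOWED_TYPES):
            flat[path] = value
        else:
            flat[path] = str(value)  # e.g. lists or tuples become their string form
    return flat


# Hypothetical parameters, for illustration only
example = {"model": {"layers": 25, "dims": [64, 64]}, "optimizer": {"lr": 0.001}}
print(flatten_configs(example))
```

The resulting dictionary could then be passed to `run.log_configs()` in one call.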

## Execute training loop that logs to Neptune

This training loop tracks 66 different training metrics at every step, for the number of steps stored in `parameters["training"]["steps"]`. You can increase both the number of metrics in the logging dictionary and the number of steps.

```python
metrics_to_log = {
    "metric_1": metric_1,
    # ... add as many metrics as you need
    "metric_x": metric_x,
}
```
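
To see where the figure of 66 metrics can come from, the sketch below rebuilds the metric names used in the training loop and counts them. The layer count of 25 is an assumption inferred from that total (2 training metrics, 2 per layer, 4 evaluation metrics, and 10 hardware metrics); the simulation code may define it differently. `build_metric_names` is a hypothetical helper.

```python
def build_metric_names(n_layers: int, n_gpus: int = 10) -> list[str]:
    """List the metric paths logged at each step of the simulated training loop."""
    names = ["train/metrics/loss", "train/metrics/accuracy"]
    for layer_idx in range(n_layers):
        names.append(f"metrics/layer/{layer_idx:02d}/activation_mean")
        names.append(f"metrics/layer/{layer_idx:02d}/gradient_norm")
    names += [f"test/metrics/{m}" for m in ("loss", "accuracy", "bleu", "wer")]
    names += [f"hardware/gpu_{i}" for i in range(n_gpus)]
    return names


names = build_metric_names(n_layers=25)  # 25 layers is an assumed value
print(len(names))  # 66
```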

To find the run in the Neptune web app, navigate to the [**All runs**](https://scale.neptune.ai/) tab. Next to the search bar, enable the **Show all runs** toggle.

In [None]:
start_time = time.time()  # Start timing the whole training loop
logging_time = 0

# Simulate the training loop
model_state = simulate_init_new_model(parameters)
optimizer_state = simulate_init_optimizer()
for step in trange(1, parameters["training"]["steps"] + 1):
    # simulate training step
    model_state, optimizer_state, loss, accuracy = simulate_training_step(model_state, optimizer_state, step, parameters)
    metrics = {
        "train/metrics/loss": loss,
        "train/metrics/accuracy": accuracy,
    }

    # capture debugging metrics for each layer of the model
    for layer_idx, layer_state in enumerate(model_state["layers"]):
        metrics[f"metrics/layer/{layer_idx:02d}/activation_mean"] = layer_state["activation_mean"]
        metrics[f"metrics/layer/{layer_idx:02d}/gradient_norm"] = layer_state["gradient_norm"]
        
    # simulate evaluation step
    eval_loss, eval_accuracy, eval_bleu, eval_wer = simulate_eval_step(model_state, step, parameters)
    metrics["test/metrics/loss"] = eval_loss
    metrics["test/metrics/accuracy"] = eval_accuracy
    metrics["test/metrics/bleu"] = eval_bleu
    metrics["test/metrics/wer"] = eval_wer

    # Simulate hardware metrics
    for i in range(10):
        metrics[f"hardware/gpu_{i}"] = random.random()
    
    # Log metrics using the run.log_metrics() method
    logging_time_start = time.time()
    run.log_metrics(data=metrics, step=step)
    logging_time_end = time.time()
    logging_time += logging_time_end - logging_time_start

    # save checkpoint every 100 steps
    save_checkpoint(custom_id, step, model_state, optimizer_state, parameters)

# Close run and ensure all operations are processed
run.close()

# Calculate some post run metrics for review
num_ops = parameters["training"]["steps"] * len(metrics)
end_time = time.time()
execution_time = end_time - start_time

print(f"Unique metrics per run: {len(metrics)}")
print(f'Number of steps per run: {parameters["training"]["steps"]}')
print(f"Total data points logged per run: {num_ops}")
print(
    f"Total execution time: {execution_time:.2f} seconds to process {num_ops} operations ({num_ops/execution_time:.0f} datapoints/second)."
)
print(f"Logging time: {logging_time:.2f} seconds ({num_ops/logging_time:.0f} datapoints/second).")
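
The loop above accumulates the time spent inside `run.log_metrics()` by hand. One way to tidy this pattern is a small reusable stopwatch built on a context manager. This is a sketch with a hypothetical `Stopwatch` class; it uses `time.perf_counter()`, which is generally better suited to measuring short intervals than `time.time()`.

```python
import time
from contextlib import contextmanager


class Stopwatch:
    """Accumulates elapsed wall-clock time across many timed sections."""

    def __init__(self):
        self.total = 0.0
        self.calls = 0

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the timed section raises
            self.total += time.perf_counter() - start
            self.calls += 1


logging_timer = Stopwatch()
for step in range(3):
    with logging_timer.measure():
        pass  # stand-in for run.log_metrics(data=metrics, step=step)

print(f"{logging_timer.calls} calls, {logging_timer.total:.6f} seconds in total")
```

Keep in mind that this measures only the time spent in the timed call itself. If logging calls return before the data is fully processed, as the comment on `run.close()` suggests, the total execution time is the more conservative throughput figure.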

### Forking

To fork a run, you only need the parent run ID and the fork step, which is the step in the parent run from which the new run branches off.

In most scenarios, you'll want these to correspond to a checkpoint saved at a specific step of a previous run.

Forking is especially useful if you want to keep the original experiment running in parallel with the new run.
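
As a mental model, a forked run's metric history can be thought of as the parent's points up to the fork step followed by the forked run's own points. The sketch below illustrates that idea with plain dictionaries. It is a conceptual illustration under that assumption, not Neptune's implementation, and `forked_series` is a hypothetical helper.

```python
def forked_series(parent: dict[int, float], fork_step: int, child: dict[int, float]) -> dict[int, float]:
    """Combine the parent's history up to fork_step with the child's own points."""
    if any(step <= fork_step for step in child):
        raise ValueError("child steps must be greater than fork_step")
    inherited = {s: v for s, v in parent.items() if s <= fork_step}
    return {**inherited, **child}


parent_loss = {1: 0.9, 2: 0.7, 3: 0.6, 4: 0.55}  # parent keeps training past the fork
child_loss = {3: 0.5, 4: 0.4}                    # forked run diverges after step 2
print(forked_series(parent_loss, fork_step=2, child=child_loss))
# {1: 0.9, 2: 0.7, 3: 0.5, 4: 0.4}
```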

In [None]:
fork_at_step = 70_000
forked_run = Run(
    experiment_name=f"{custom_id}-forked",
    run_id=f"{custom_id}-forked",
    fork_run_id=custom_id,  # ID of the parent run
    fork_step=fork_at_step,  # checkpoint step from which the forked run starts
)

# change tags & configs if you want to
forked_run.add_tags(["Forked"])
forked_run.log_configs(
    {
        "parameters/optimizer/lr": 0.001,
        "parameters/model/batch_size": 32,
    }
)

print(forked_run.get_experiment_url())
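
When resuming from a checkpoint, it helps to be explicit about which steps the forked run will log. The sketch below assumes that the first step logged by the forked run has to come after the fork step, and that the forked run trains for the same number of steps as the parent. `resume_steps` is a hypothetical helper.

```python
def resume_steps(fork_step: int, n_steps: int) -> range:
    """Steps for a forked run: n_steps steps starting right after fork_step."""
    return range(fork_step + 1, fork_step + n_steps + 1)


steps = resume_steps(fork_step=70_000, n_steps=100_000)
print(steps[0], steps[-1], len(steps))  # 70001 170000 100000
```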

In [None]:
start_time = time.time()  # Start timing the whole training loop
logging_time = 0

# Simulate the training loop
checkpoint = load_checkpoint(
    checkpoint_path=f"./{checkpoint_path(custom_id, fork_at_step)}",
    parameters=parameters
)
model_state = checkpoint["model_state"]
model_state["simulation_behavior"] = "fast_convergence"
model_state["simulation_behavior_start_step"] = model_state["simulation_behavior_start_step"] * 2
optimizer_state = checkpoint["optimizer_state"]
# The forked run logs steps after the fork point, so start at fork_at_step + 1
for step in trange(fork_at_step + 1, parameters["training"]["steps"] + fork_at_step + 1):
    # simulate training step
    model_state, optimizer_state, loss, accuracy = simulate_training_step(model_state, optimizer_state, step, parameters)
    metrics = {
        "train/metrics/loss": loss,
        "train/metrics/accuracy": accuracy,
    }

    # capture debugging metrics for each layer of the model
    for layer_idx, layer_state in enumerate(model_state["layers"]):
        metrics[f"metrics/layer/{layer_idx:02d}/activation_mean"] = layer_state["activation_mean"]
        metrics[f"metrics/layer/{layer_idx:02d}/gradient_norm"] = layer_state["gradient_norm"]
        
    # simulate evaluation step
    eval_loss, eval_accuracy, eval_bleu, eval_wer = simulate_eval_step(model_state, step, parameters)
    metrics["test/metrics/loss"] = eval_loss
    metrics["test/metrics/accuracy"] = eval_accuracy
    metrics["test/metrics/bleu"] = eval_bleu
    metrics["test/metrics/wer"] = eval_wer

    # Simulate hardware metrics
    for i in range(10):
        metrics[f"hardware/gpu_{i}"] = random.random()
    
    # Log metrics using the run.log_metrics() method
    logging_time_start = time.time()
    forked_run.log_metrics(data=metrics, step=step)
    logging_time_end = time.time()
    logging_time += logging_time_end - logging_time_start

# Close run and ensure all operations are processed
forked_run.close()

# Calculate some post run metrics for review
num_ops = parameters["training"]["steps"] * len(metrics)
end_time = time.time()
execution_time = end_time - start_time

print(f"Unique metrics per run: {len(metrics)}")
print(f'Number of steps per run: {parameters["training"]["steps"]}')
print(f"Total data points logged per run: {num_ops}")
print(
    f"Total execution time: {execution_time:.2f} seconds to process {num_ops} operations ({num_ops/execution_time:.0f} datapoints/second)."
)
print(f"Logging time: {logging_time:.2f} seconds ({num_ops/logging_time:.0f} datapoints/second).")
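
The `save_checkpoint`, `load_checkpoint`, and `checkpoint_path` helpers come from `training_simulation` and are not shown in this guide. To illustrate the general pattern of saving state every N steps and restoring it at a fork point, here is a hypothetical stand-in that writes JSON files to a temporary directory. The function names, file naming scheme, and state layout are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def checkpoint_file(directory: Path, run_id: str, step: int) -> Path:
    """Build the file path for a given run and step (naming scheme is illustrative)."""
    return directory / f"{run_id}_step_{step}.json"


def save_every(directory: Path, run_id: str, step: int, state: dict, every: int = 100):
    """Write the state to disk only on steps that are multiples of `every`."""
    if step % every != 0:
        return None
    path = checkpoint_file(directory, run_id, step)
    path.write_text(json.dumps({"step": step, "state": state}))
    return path


def load(directory: Path, run_id: str, step: int) -> dict:
    """Read back the checkpoint saved at the given step."""
    return json.loads(checkpoint_file(directory, run_id, step).read_text())


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    for step in range(1, 301):
        save_every(tmp_dir, "quickstart-demo", step, {"loss": 1.0 / step})
    saved = sorted(p.name for p in tmp_dir.iterdir())
    restored = load(tmp_dir, "quickstart-demo", 200)  # restore at the chosen fork step

print(saved)
print(restored["step"], restored["state"])
```

Choosing a fork step that is a multiple of the checkpoint interval ensures that a matching checkpoint exists to resume from.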