![Neptune + Optuna](https://neptune.ai/wp-content/uploads/2023/09/optuna.svg)

# Neptune + Optuna

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/integrations-and-supported-tools/optuna/notebooks/Neptune_Optuna_integration.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a><a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/integrations-and-supported-tools/optuna/notebooks/Neptune_Optuna_integration.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a><a target="_blank" href="https://app.neptune.ai/o/common/org/optuna-integration/runs/details?viewId=b6190a29-91be-4e64-880a-8f6085a6bb78&detailsTab=dashboard&dashboardId=Vizualizations-5ea92658-6a56-4656-b225-e81c6fbfc8ab&shortId=NEP1-18517&type=run"> 
  <img alt="Explore in Neptune" src="https://neptune.ai/wp-content/uploads/2024/01/neptune-badge.svg">
</a><a target="_blank" href="https://docs.neptune.ai/integrations/optuna/">
  <img alt="View tutorial in docs" src="https://neptune.ai/wp-content/uploads/2024/01/docs-badge-2.svg">
</a>

## Introduction

This guide will show you how to:

* Create a Neptune `run`,
* Create a `NeptuneCallback()`,
* Log an Optuna study using `NeptuneCallback()`,
* Load an Optuna study from an existing Neptune `run`,
* Log both study-level and trial-level Optuna runs to Neptune.

## Before you start

This notebook example lets you try out Neptune as an anonymous user, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [None]:
! pip install -U lightgbm "neptune[optuna]" optuna plotly

## Import libraries

In [None]:
import lightgbm as lgb
import neptune
import neptune.integrations.optuna as optuna_utils
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

## Create a sample `objective` function for Optuna

In [None]:
def objective(trial):
    data, target = load_breast_cancer(return_X_y=True)
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "verbose": -1,
        "objective": "binary",
        "metric": "binary_logloss",
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.2, 1.0, step=0.1),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.2, 1.0, step=0.1),
        "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
    }

    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(test_x)  # predicted probabilities of the positive class
    auc = roc_auc_score(test_y, preds)  # the score is ROC AUC, not accuracy

    return auc

## Quickstart

### Start a run

To create a new run for tracking the metadata, you tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in the public project [common/optuna-integration](https://app.neptune.ai/o/common/org/optuna-integration). **Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### Log to your own project instead

Replace the code below with the following:

```python
from getpass import getpass

run = neptune.init_run(
    project="workspace-name/project-name",  # replace with your own (see instructions below)
    api_token=getpass("Enter your Neptune API token: "),
)
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. The workspace name is displayed in the top-left corner of the app. To copy the project path, in the top-right corner, open the settings menu and select **Properties**.

For more help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.
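As an alternative to typing the token with `getpass`, the Neptune client also reads the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables. When both are set, `neptune.init_run()` can be called without arguments. The sketch below makes that lookup explicit so you can check which values will be used. It uses only the standard library, and the helper name `neptune_init_kwargs` is hypothetical, not part of Neptune.

```python
import os


def neptune_init_kwargs(environ=None):
    """Collect init_run() keyword arguments from environment variables.

    Hypothetical helper: only keys that are set and non-empty are returned,
    so anything missing falls back to the defaults of neptune.init_run().
    """
    environ = os.environ if environ is None else environ
    kwargs = {}
    if environ.get("NEPTUNE_PROJECT"):
        kwargs["project"] = environ["NEPTUNE_PROJECT"]
    if environ.get("NEPTUNE_API_TOKEN"):
        kwargs["api_token"] = environ["NEPTUNE_API_TOKEN"]
    return kwargs


# Example with an explicit mapping instead of the real environment
example_env = {"NEPTUNE_PROJECT": "workspace-name/project-name"}
print(neptune_init_kwargs(example_env))
```

With the variables exported in your shell, you could then write `run = neptune.init_run(**neptune_init_kwargs())`.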

In [None]:
run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/optuna-integration",
)

**To open the run in the Neptune web app, click the link that appeared in the cell output.**

We'll use the `run` object we just created to log metadata. You'll see the metadata appear in the app.

### Initialize the NeptuneCallback

In [None]:
neptune_callback = optuna_utils.NeptuneCallback(run)

### Pass the NeptuneCallback to the Optuna study's `.optimize()` method

In [None]:
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

While the cell above runs, you can watch the metadata being logged live in the Neptune app.

### Stop logging

Once you are done logging, stop tracking the run.

In [None]:
run.stop()

## More Options

### Customize which plots you want to log and how often

By default, `NeptuneCallback` creates and logs all of the plots from `optuna.visualization`, which adds overhead to your Optuna sweep.
You can decide which plots to create and log, and how often, with:
* `plots_update_freq` argument: pass an integer `k` to update plots every `k` trials, or `"never"` to not log any plots
* `log_plot_contour`, `log_plot_slice`, and other `log_{OPTUNA_PLOT_FUNCTION}` arguments: pass `False`, and the plots will not be created or logged

In [None]:
# Create a Neptune run
run = neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")

# Create a NeptuneCallback for Optuna
neptune_callback = optuna_utils.NeptuneCallback(
    run,
    plots_update_freq=2,  # create/log plots every 2 trials
    log_plot_slice=False,  # do not create/log plot_slice
    log_plot_contour=False,  # do not create/log plot_contour
)

# Pass NeptuneCallback to Optuna Study .optimize()
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

# Stop logging to a Neptune run
run.stop()
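If you switch off several plots, building the `log_plot_*` flags from a list can keep the call readable. The helper below is a hypothetical convenience, not part of the integration. It relies only on the `log_{OPTUNA_PLOT_FUNCTION}` naming pattern described above, so check each plot name against the integration's API reference before using it.

```python
def disabled_plot_flags(plot_names):
    """Map names such as "contour" to {"log_plot_contour": False}."""
    return {f"log_plot_{name}": False for name in plot_names}


flags = disabled_plot_flags(["slice", "contour"])
print(flags)  # {'log_plot_slice': False, 'log_plot_contour': False}
```

The resulting dictionary can be unpacked into the callback, for example `optuna_utils.NeptuneCallback(run, plots_update_freq=2, **flags)`.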

### Log charts and study object after the sweep

If you want to log study metadata after the study has finished, you can use the `log_study_metadata()` function.
It logs the same things that `NeptuneCallback` logs, and you can customize what is logged with similar flags.

In [None]:
# Create a new Neptune run
run = neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")

# Run Optuna without a callback
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)

# Log Optuna charts and study object after the sweep is complete
optuna_utils.log_study_metadata(study, run, log_plot_contour=False)

# Get run-id for the next step
prev_run_id = run["sys/id"].fetch()

# Stop logging
run.stop()

### Load the Optuna Study from an existing Neptune run

If you logged the Optuna Study to Neptune, you can load the Study directly from the run with the `load_study_from_run()` function and continue working with it.

It works for both Optuna `InMemoryStorage` and database storage.

In [None]:
# Fetch an existing Neptune run
run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/optuna-integration",
    with_id=prev_run_id,  # You can pass the ID of some other run
)

# Load the Optuna study that was logged to that run
study = optuna_utils.load_study_from_run(run)

# Create a callback to log the additional trials
neptune_callback = optuna_utils.NeptuneCallback(run)

# Continue logging to the same run
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

# Stop logging
run.stop()

### Keep track of both study-level and trial-level runs

You can log trial-level information to separate Neptune runs and keep one main run for the study-level information.

**Warning**
The sweep will take longer as each trial-level run needs to synchronize with Neptune. 

#### Create a unique sweep ID

In [None]:
import uuid

sweep_id = uuid.uuid1()
print("sweep-id: ", sweep_id)
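`uuid.uuid1()` derives the ID from a host identifier (typically the network address) and the current time. If you would rather not embed host information in metadata logged to a public project, a random `uuid.uuid4()` should work equally well for grouping runs. A small sketch, where the `new_sweep_id` helper name is hypothetical:

```python
import uuid


def new_sweep_id():
    """Return a random sweep ID as a string, ready to assign to a run field."""
    return str(uuid.uuid4())


example_sweep_id = new_sweep_id()
print("sweep-id:", example_sweep_id)
```

Returning a string up front avoids the separate `str(sweep_id)` conversion when the ID is assigned to a run field.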

#### Create a study-level Neptune run

In [None]:
run_study_level = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration"
)

#### Log the sweep ID to the study-level run 

You can also add the tag `study-level` to distinguish between the study-level and trial-level runs for the sweep.

In [None]:
run_study_level["sys/tags"].add("study-level")
run_study_level["sweep-id"] = str(sweep_id)

#### Create an objective function that logs each trial to Neptune as a run

Inside the objective function, you need to:
* create a trial-level Neptune run
* log the sweep ID and the tag `trial-level` to distinguish between study-level and trial-level runs
* log parameters and scores to the trial-level run
* stop the trial-level run

In [None]:
def objective_with_logging(trial):
    data, target = load_breast_cancer(return_X_y=True)
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "verbose": -1,
        "objective": "binary",
        "metric": "binary_logloss",
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.2, 1.0, step=0.1),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.2, 1.0, step=0.1),
        "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
    }

    # create a trial-level run
    run_trial_level = neptune.init_run(
        api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration"
    )

    # log sweep id to trial-level run
    run_trial_level["sys/tags"].add("trial-level")
    run_trial_level["sweep-id"] = str(sweep_id)

    # log parameters of a trial-level run
    run_trial_level["parameters"] = param

    # run model training
    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(test_x)  # predicted probabilities of the positive class
    auc = roc_auc_score(test_y, preds)  # the score is ROC AUC, not accuracy

    # log score of a trial-level run
    run_trial_level["score"] = auc

    # stop trial-level run
    run_trial_level.stop()

    return auc
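If training raises an exception inside `objective_with_logging`, the call to `run_trial_level.stop()` is never reached, so the trial-level run is not stopped explicitly. One way to guard against that is a `try`/`finally` block. The sketch below shows the pattern with a stand-in `FakeRun` class, which is hypothetical and exists only so the example runs without a Neptune connection. With a real run, you would pass the object returned by `neptune.init_run()`.

```python
class FakeRun:
    """Stand-in for a Neptune run: supports item assignment and stop()."""

    def __init__(self):
        self.fields = {}
        self.stopped = False

    def __setitem__(self, key, value):
        self.fields[key] = value

    def stop(self):
        self.stopped = True


def score_and_stop(run, train_and_score):
    """Log the score returned by train_and_score(), always stopping the run."""
    try:
        score = train_and_score()
        run["score"] = score
        return score
    finally:
        run.stop()  # executed on success and on failure


def failing_training():
    raise RuntimeError("training failed")


ok_run, failed_run = FakeRun(), FakeRun()
print(score_and_stop(ok_run, lambda: 0.5), ok_run.stopped)

try:
    score_and_stop(failed_run, failing_training)
except RuntimeError:
    print("error propagated, run stopped:", failed_run.stopped)
```

Neptune run objects can also be used as context managers (`with neptune.init_run(...) as run:`), which should give the same guarantee with less code.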

#### Create a study-level NeptuneCallback

In [None]:
neptune_callback = optuna_utils.NeptuneCallback(run_study_level)

#### Pass the NeptuneCallback to the `study.optimize()` method and run the parameter sweep

In [None]:
study = optuna.create_study(direction="maximize")
study.optimize(objective_with_logging, n_trials=5, callbacks=[neptune_callback])

#### Stop logging to the Neptune run

In [None]:
run_study_level.stop()

## Go to the Neptune app to see your parameter sweep

Now when you go to the Neptune app, you have:
* all the trial-level runs logged with `"sys/tags"="trial-level"`
* the study-level run logged with `"sys/tags"="study-level"`

You can use filters to find all the runs that belong to the `sweep-id` of the parameter sweep and compare them. You can also look only at the study-level run to see the high-level picture of the sweep.

To compare sweeps with each other or find your current sweep, use Group by:
* Go to the runs table
* Click **+ Group by** in the top right
* Type `sweep-id` and click on it
* Click **Show all** to see your trials in a separate table view