![Neptune + Optuna](https://neptune.ai/wp-content/uploads/2023/09/optuna.svg)

# Neptune + Optuna

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/integrations-and-supported-tools/optuna/notebooks/Neptune_Optuna_integration.ipynb">
  <img alt="Open in Colab" src="https://colab.research.google.com/assets/colab-badge.svg"/>
</a><a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/integrations-and-supported-tools/optuna/notebooks/Neptune_Optuna_integration.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a><a target="_blank" href="https://app.neptune.ai/o/showcase/org/optuna/runs/details?viewId=9c1c6f28-8234-4548-b44c-26b88e4c82a7&detailsTab=dashboard&dashboardId=Visualizations-9c1c70b1-6609-4961-8e93-d321e4006574"> 
  <img alt="Explore in Neptune" src="https://neptune.ai/wp-content/uploads/2024/01/neptune-badge.svg">
</a><a target="_blank" href="https://docs.neptune.ai/integrations/optuna/">
  <img alt="View tutorial in docs" src="https://neptune.ai/wp-content/uploads/2024/01/docs-badge-2.svg">
</a>

## Introduction

This guide will show you how to:

* Create a Neptune `run`,
* Create a `NeptuneCallback()`,
* Log an Optuna study using `NeptuneCallback()`,
* Load an Optuna study from an existing Neptune `run`,
* Log both study-level and trial-level Optuna runs to Neptune.

## Before you start

This notebook example lets you try out Neptune as an anonymous user, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [None]:
%pip install -U -q lightgbm "neptune[optuna]" optuna plotly

## Import libraries

In [None]:
import lightgbm as lgb
import neptune
import neptune.integrations.optuna as optuna_utils
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

## Create a sample `objective` function for Optuna

In [None]:
def objective(trial):
    data, target = load_breast_cancer(return_X_y=True)
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "verbose": -1,
        "objective": "binary",
        "metric": "binary_logloss",
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.2, 1.0, step=0.1),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.2, 1.0, step=0.1),
        "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
    }

    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(test_x)
    # the score returned to Optuna is ROC AUC, not accuracy
    auc = roc_auc_score(test_y, preds)

    return auc

## Quickstart

### Create the Neptune run

To create a new run for tracking the metadata, you must tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in a public project. 

**Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### To log to your own project instead

Replace the cell below with the following:

```python
import os
from getpass import getpass

os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"  # replace with your own (see instructions below)
os.environ["NEPTUNE_API_TOKEN"] = getpass("Enter your Neptune API token: ")
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. To find the full project name, open the project view, select the menu in the top-right corner, and then select **Edit project details**.

For help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.

In [None]:
import os

os.environ["NEPTUNE_PROJECT"] = "common/optuna"
os.environ["NEPTUNE_API_TOKEN"] = neptune.ANONYMOUS_API_TOKEN
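Before creating the run, it can help to confirm that both environment variables are actually set. The following is a minimal sketch using only the standard library; `missing_neptune_vars` is a hypothetical helper name, not part of the Neptune API.

```python
import os

# Environment variables that the cells in this notebook set before creating a run.
REQUIRED_VARS = ("NEPTUNE_PROJECT", "NEPTUNE_API_TOKEN")


def missing_neptune_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Set these variables before calling neptune.init_run():", ", ".join(missing))
else:
    print("Neptune credentials found for project:", os.environ["NEPTUNE_PROJECT"])
```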

In [None]:
run = neptune.init_run(tags=["quickstart", "study", "notebook"])

**To open the run in the Neptune web app, click the link that appeared in the cell output.**

We'll use the `run` object we just created to log metadata. You'll see the metadata appear in the app.

### Initialize the NeptuneCallback

By default, the callback logs all the plots from the `optuna.visualization` module, details of all trials, and the Study object itself. For how to customize the NeptuneCallback further, see the [integration guide](https://docs.neptune.ai/integrations/optuna/#customizing-which-plots-to-log-and-how-often) in the Neptune docs.

In [None]:
neptune_callback = optuna_utils.NeptuneCallback(run)
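As a rough illustration of the customization mentioned above, the sketch below collects a few callback options in one place and forwards them to `NeptuneCallback`. The option names are taken from the integration guide as best understood here, so check them against the version you have installed. The values are arbitrary, and `make_neptune_callback` is a hypothetical helper.

```python
# Options forwarded to NeptuneCallback; the values are illustrative, not recommendations.
CALLBACK_OPTIONS = {
    "plots_update_freq": 10,    # update the plots every 10 trials instead of every trial
    "study_update_freq": 10,    # update the logged Study object every 10 trials
    "log_plot_contour": False,  # skip the contour plot
    "log_plot_slice": False,    # skip the slice plot
}


def make_neptune_callback(run, options=CALLBACK_OPTIONS):
    """Build a NeptuneCallback for an existing Neptune run (hypothetical helper)."""
    # Imported here so the options above can be inspected without Neptune installed.
    import neptune.integrations.optuna as optuna_utils

    return optuna_utils.NeptuneCallback(run, **options)
```

With a run already created, `neptune_callback = make_neptune_callback(run)` would then stand in for the default callback.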

### Pass the NeptuneCallback to the Optuna Study `.optimize()` method

In [None]:
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

While `study.optimize()` runs, you can watch the metadata being logged live in the Neptune app.
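To see why passing the callback to `.optimize()` is enough, it helps to know that Optuna calls every entry in `callbacks` with the study and the just-finished trial at the end of each trial. The sketch below mimics that contract with a hypothetical `BestValueTracker` class and stand-in objects, so it runs without Optuna or Neptune. It is not how `NeptuneCallback` is implemented.

```python
from types import SimpleNamespace


class BestValueTracker:
    """Hypothetical callback that records the best value seen after each trial."""

    def __init__(self):
        self.history = []

    def __call__(self, study, trial):
        # Optuna passes the Study and the just-finished FrozenTrial, in this order.
        self.history.append((trial.number, study.best_value))


# Stand-in objects so the sketch runs offline; real ones come from study.optimize().
tracker = BestValueTracker()
fake_study = SimpleNamespace(best_value=0.0)
for number, value in enumerate([0.91, 0.97, 0.95]):
    fake_study.best_value = max(fake_study.best_value, value)
    tracker(fake_study, SimpleNamespace(number=number, value=value))

print(tracker.history)  # [(0, 0.91), (1, 0.97), (2, 0.97)]
```

In a real sweep, an instance of such a class could be listed next to the Neptune callback, for example `callbacks=[neptune_callback, tracker]`.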

### Stop logging

Once you are done logging, stop tracking the run.

In [None]:
run.stop()

## More options

### Log charts and study object after the sweep

To log study metadata after the Study is finished, use the `log_study_metadata()` function.
This method is generally faster than using `NeptuneCallback`, as it doesn't log the data live. It logs the same metadata as the callback and accepts the same flags for customization.

In [None]:
# Create a new Neptune run
run = neptune.init_run(tags=["log-after-study", "study", "notebook"])

# Run the Optuna study without a Neptune callback
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)

# Log Optuna charts and study object after the sweep is complete
optuna_utils.log_study_metadata(study, run)

# Get run-id for the next step
prev_run_id = run["sys/id"].fetch()

# Stop logging
run.stop()

### Load the Optuna Study from an existing Neptune run

If you logged the Optuna Study to Neptune, you can load the Study directly from the run with the `load_study_from_run()` function and continue working with it.

It works both for Optuna `InMemoryStorage` and database storage.

In [None]:
# Fetch an existing Neptune run
run = neptune.init_run(with_id=prev_run_id)  # You can pass the ID of some other run

# Load the Optuna study that was logged to the run
study = optuna_utils.load_study_from_run(run)

# Create callback to log advanced options during the sweep
neptune_callback = optuna_utils.NeptuneCallback(run)

# Continue logging to the same run
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

# Stop logging
run.stop()

### Keep track of both study-level and trial-level runs

You can log trial-level information to separate Neptune runs and keep one main run for the study-level information.

**Warning**  
The sweep will take longer as each trial-level run needs to synchronize with Neptune. 

#### Create an objective function that logs each trial to Neptune as a run

Inside the objective function, you need to:  
* create a trial-level Neptune run
* log the study name and a "trial" tag to distinguish between study-level and trial-level runs
* log parameters and scores to the trial-level run
* stop the trial-level run

In [None]:
def objective_with_logging(trial):
    data, target = load_breast_cancer(return_X_y=True)
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "verbose": -1,
        "objective": "binary",
        "metric": "binary_logloss",
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.2, 1.0, step=0.1),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.2, 1.0, step=0.1),
        "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
    }

    # create a trial-level run
    run_trial_level = neptune.init_run(tags=["trial", "notebook"])

    # log study name and trial number to trial-level run
    # (read the name from the trial so the function doesn't depend on a global `study`)
    run_trial_level["study/study_name"] = trial.study.study_name
    run_trial_level["trial/number"] = trial.number

    # log parameters of a trial-level run
    run_trial_level["trial/parameters"] = param

    # run model training
    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(test_x)
    # the score is ROC AUC, not accuracy
    auc = roc_auc_score(test_y, preds)

    # log score of a trial-level run
    run_trial_level["trial/score"] = auc

    # stop trial-level run
    run_trial_level.stop()

    return auc
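If training raises an exception, the trial-level run in the function above is never stopped. One way to guard against that is a small context manager that always calls `stop()`. This is a sketch under assumptions: `stopping` and `FakeRun` are hypothetical names, and `FakeRun` only stands in for a Neptune run so the example runs offline.

```python
from contextlib import contextmanager


@contextmanager
def stopping(run):
    """Yield the run and always call run.stop(), even if the body raises."""
    try:
        yield run
    finally:
        run.stop()


class FakeRun(dict):
    """Stand-in for a Neptune run so the sketch runs offline (hypothetical)."""

    stopped = False

    def stop(self):
        self.stopped = True


fake_run = FakeRun()
try:
    with stopping(fake_run) as run_trial_level:
        run_trial_level["trial/number"] = 0
        raise RuntimeError("training failed")  # simulate a failing trial
except RuntimeError:
    pass

print(fake_run.stopped)  # True
```

Inside the objective function, the same pattern would wrap the real run, for example `with stopping(neptune.init_run(tags=["trial", "notebook"])) as run_trial_level:`.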

#### Create an Optuna study

In [None]:
study = optuna.create_study(direction="maximize")

#### Create a study-level Neptune run

In [None]:
run_study_level = neptune.init_run(tags=["study", "notebook"])

#### Create a study-level NeptuneCallback

In [None]:
neptune_callback = optuna_utils.NeptuneCallback(run_study_level)

#### Pass the NeptuneCallback to the `study.optimize()` method and run the parameter sweep

In [None]:
study.optimize(objective_with_logging, n_trials=5, callbacks=[neptune_callback])

#### Stop logging to the Neptune run

In [None]:
run_study_level.stop()

## Go to the Neptune app to see your parameter sweep

Now when you go to the Neptune app, you have:
* all the trial-level runs tagged as `trial`
* the study-level run tagged as `study`

You can use filters to find all the trials that belong to the same study and compare them. You can also look only at the `study` runs to see the high-level picture.

To compare trials within each study, group the runs by study name:
1. Go to the runs table.
1. Next to the search input box, click **Group by**.
1. Type `study/study_name` and select the field from the list.
1. To see your trials in a separate table view, click **Show all**.
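
The same grouping can be sketched offline once the trial-level fields are available as plain records. The example below assumes each record carries the `study/study_name` and `trial/score` fields logged by the objective function. The rows are made up for illustration, and `best_score_per_study` is a hypothetical helper, not part of Neptune.

```python
from collections import defaultdict


def best_score_per_study(rows):
    """Map each study name to the highest trial score found in `rows` (hypothetical helper)."""
    best = defaultdict(lambda: float("-inf"))
    for row in rows:
        name = row["study/study_name"]
        best[name] = max(best[name], row["trial/score"])
    return dict(best)


# Made-up rows for illustration only; real values would come from your trial-level runs.
rows = [
    {"study/study_name": "study-a", "trial/score": 0.95},
    {"study/study_name": "study-a", "trial/score": 0.98},
    {"study/study_name": "study-b", "trial/score": 0.91},
]

print(best_score_per_study(rows))  # {'study-a': 0.98, 'study-b': 0.91}
```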