# Neptune + Scikit-Optimize

## Before you start

### Install dependencies

In [None]:
! pip install --quiet lightgbm==2.2.3 scikit-optimize==0.8.1 neptune-client==0.4.132 neptune-contrib['monitoring']==0.25.0

### Create a sample objective function for skopt

In [None]:
import skopt
import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

space = [skopt.space.Real(0.01, 0.5, name='learning_rate', prior='log-uniform'),
          skopt.space.Integer(1, 30, name='max_depth'),
          skopt.space.Integer(2, 100, name='num_leaves'),
          skopt.space.Integer(10, 1000, name='min_data_in_leaf'),
          skopt.space.Real(0.1, 1.0, name='feature_fraction', prior='uniform'),
          skopt.space.Real(0.1, 1.0, name='subsample', prior='uniform'),
          ]

@skopt.utils.use_named_args(space)
def objective(**params):
    # Load the data and hold out 25% of it for evaluation
    data, target = load_breast_cancer(return_X_y=True)
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    # Fixed settings plus the hyperparameters suggested by skopt
    param = {
        'objective': 'binary',
        'metric': 'binary_logloss',
        'verbosity': -1,
        **params
    }

    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(test_x)
    auc = roc_auc_score(test_y, preds)
    # skopt minimizes the objective, so return the negated AUC
    return -1.0 * auc

### Initialize Neptune

Neptune lets you log data to a public project as an anonymous user. This is handy when you are just trying out the application and don't have a Neptune account yet.

In [None]:
import neptune

neptune.init(api_token='ANONYMOUS', project_qualified_name='shared/scikit-optimize-integration')

**Note:**

Instead of logging data to the public project 'shared/scikit-optimize-integration' as the anonymous user 'neptuner', you can log it to your own project.

To do that:

1. Get your Neptune API token

  ![image](https://neptune.ai/wp-content/uploads/get_token.gif)

2. Pass the token to the `api_token` parameter of `neptune.init()` (learn how to do this securely [here](https://docs.neptune.ai/security-and-privacy/api-tokens/how-to-find-and-set-neptune-api-token.html))
3. Create a new Neptune Project (learn how to do that [here](https://docs.neptune.ai/workspace-project-and-user-management/projects/create-project.html))
4. Pass your username and project name, in the form `username/project_name`, to the `project_qualified_name` parameter of `neptune.init()`

For example:

```
neptune.init(api_token='eyJhcGlfYW908fsdf23f940jiri0bn3085gh03riv03irn', project_qualified_name='siddhantsadangi/skopt')
```
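
To keep the token out of the notebook, one option is to read it from an environment variable. The sketch below assumes the token is stored in `NEPTUNE_API_TOKEN`; `neptune_init_kwargs` is a hypothetical helper (not part of Neptune) that falls back to the anonymous settings used earlier, and `my_workspace/my_project` is a placeholder project name.

```python
import os

ANONYMOUS_PROJECT = "shared/scikit-optimize-integration"


def neptune_init_kwargs(project=None, env=None):
    """Build keyword arguments for neptune.init() (hypothetical helper).

    Uses the token from NEPTUNE_API_TOKEN when it is set together with a
    project name; otherwise falls back to the anonymous public project.
    """
    env = os.environ if env is None else env
    token = env.get("NEPTUNE_API_TOKEN")
    if token and project:
        return {"api_token": token, "project_qualified_name": project}
    return {"api_token": "ANONYMOUS",
            "project_qualified_name": ANONYMOUS_PROJECT}


# Usage (requires neptune-client to be installed and imported):
# neptune.init(**neptune_init_kwargs(project="my_workspace/my_project"))
```

This way the same notebook runs both for readers without an account and for those who have exported their own token.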

## Quickstart

### Step 1: Create an Experiment

This creates an experiment in Neptune.

Once you have a live experiment you can log things to it.

In [None]:
neptune.create_experiment(name='skopt-sweep')

Click on the link above to open this experiment in Neptune.

For now it is empty, but keep the experiment tab open to see what happens next.

### Step 2: Run skopt with the Neptune Callback

With the callback in place, the metrics, parameters, and results pickle are logged after every iteration. Everything can be inspected live in the experiment tab (through the link displayed earlier).

In [None]:
# Create Neptune Callback
import neptunecontrib.monitoring.skopt as skopt_utils

neptune_callback = skopt_utils.NeptuneCallback()

In [None]:
# Run the skopt minimize function with the Neptune Callback
results = skopt.forest_minimize(objective, space, n_calls=25, n_random_starts=10,
                                callback=[neptune_callback])
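
For intuition about what happens on each iteration: skopt calls every callback with the current optimization result after each evaluation of the objective. The class below is a minimal stand-in, not the `NeptuneCallback` implementation. `BestScoreTracker` is a hypothetical name, and the fake result object only mimics the `func_vals` and `x_iters` attributes of a skopt result so that the sketch runs without skopt installed.

```python
from types import SimpleNamespace


class BestScoreTracker:
    """Record the latest and best objective values seen so far."""

    def __init__(self):
        self.history = []

    def __call__(self, res):
        latest = res.func_vals[-1]   # value of the most recent call
        best = min(res.func_vals)    # skopt minimizes the objective
        self.history.append((len(res.func_vals), latest, best))
        # Returning None (falsy) lets the optimization continue.


# Simulate three iterations with a stand-in for skopt's result object.
tracker = BestScoreTracker()
values = [-0.91, -0.97, -0.95]  # made-up objective values
for i in range(1, len(values) + 1):
    tracker(SimpleNamespace(func_vals=values[:i], x_iters=[[0.1]] * i))

print(tracker.history[-1])
```

With skopt installed, an object like this could be passed next to the Neptune callback, for example `callback=[neptune_callback, tracker]`.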

### Step 3: Log best parameter configuration, best score and diagnostic plots

You can log additional information from skopt results after the training has completed.

You can choose the Neptune experiment to which the results are logged with the ``experiment`` parameter, and choose whether to log the plots and the pickled results object with the ``log_plots`` and ``log_pickle`` parameters.

For more information, see the [``log_results()`` reference](https://docs.neptune.ai/api-reference/neptunecontrib/monitoring/skopt/index.html?highlight=skopt#neptunecontrib.monitoring.skopt.log_results).

In [None]:
skopt_utils.log_results(results)
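
The best configuration and score can also be read directly from a skopt result object. The sketch below assumes the `x` and `fun` attributes of such a result and the dimension names used in `space`. A stand-in object with made-up values replaces the real `results` so that the snippet runs on its own, and `summarize` is a hypothetical helper. Because the objective returns the negated AUC, the sign is flipped back.

```python
from types import SimpleNamespace

# Dimension names in the same order as the `space` list defined earlier.
param_names = ['learning_rate', 'max_depth', 'num_leaves',
               'min_data_in_leaf', 'feature_fraction', 'subsample']

# Stand-in for the object returned by skopt.forest_minimize (made-up values).
fake_results = SimpleNamespace(x=[0.05, 7, 31, 20, 0.8, 0.9], fun=-0.98)


def summarize(res, names):
    """Return (best_params, best_auc) from a skopt-style result."""
    best_params = dict(zip(names, res.x))
    best_auc = -res.fun  # the objective returned -AUC, so negate it back
    return best_params, best_auc


best_params, best_auc = summarize(fake_results, param_names)
print(best_params, best_auc)
```

In the notebook itself, `summarize(results, param_names)` would be the corresponding call on the real result.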

### Step 4: Stop logging and explore results in the Neptune UI

When you track experiments with Neptune in Jupyter notebooks, you need to stop the experiment explicitly by running `neptune.stop()`.

If you are running Neptune in regular `.py` scripts, the experiment stops automatically when your code finishes running.

In [None]:
neptune.stop()

## Logging BayesSearchCV 

### Prepare the data and initialize BayesSearchCV optimizer

In [None]:
from skopt import BayesSearchCV
from skopt.space import Real, Categorical, Integer

from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    train_size=0.75,
                                                    random_state=0)

opt = BayesSearchCV(
    SVC(),
    {
        'C': Real(1e-6, 1e+6, prior='log-uniform'),
        'gamma': Real(1e-6, 1e+1, prior='log-uniform'),
        'degree': Integer(1,8),
        'kernel': Categorical(['linear', 'poly', 'rbf']),
    },
    n_iter=32,
    random_state=0
)

### Create a Neptune experiment and pass NeptuneCallback to the `fit` method

In [None]:
neptune.create_experiment(name='skopt-sweep-bayes-search')

opt.fit(X_train, y_train, callback=skopt_utils.NeptuneCallback())

### Log diagnostic plots and best parameters via the ``log_results`` function

In [None]:
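# _optim_results is a private attribute holding one result per search space;
# only one search space was defined above, so log the first entry
skopt_utils.log_results(opt._optim_results[0])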

### Stop experiment

In [None]:
neptune.stop()