# Organize ML runs

## Introduction

This guide will show you how to:

- Keep track of code, data, environment and parameters
- Log results like evaluation metrics and model files
- Find runs on the dashboard with tags
- Organize runs in a dashboard view and save it for later

## Setup

Install the dependencies.

In [None]:
! pip install --quiet neptune-client==0.5.4 scikit-learn==0.23.1

## Step 1: Create a basic training script

In [None]:
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target,
                                                    test_size=0.4, random_state=1234)

params = {'n_estimators': 10,
          'max_depth': 3,
          'min_samples_leaf': 1,
          'min_samples_split': 2,
          'max_features': 3,
          }

clf = RandomForestClassifier(**params)

clf.fit(X_train, y_train)
y_train_pred = clf.predict_proba(X_train)
y_test_pred = clf.predict_proba(X_test)

train_f1 = f1_score(y_train, y_train_pred.argmax(axis=1), average='macro')
test_f1 = f1_score(y_test, y_test_pred.argmax(axis=1), average='macro')
print(f'Train f1:{train_f1} | Test f1:{test_f1}')

## Step 2: Initialize Neptune and create a new run

Connect your script to the Neptune application and create a new run.

In [None]:
import neptune.new as neptune

run = neptune.init(project='common/quickstarts',
                   api_token='ANONYMOUS')

Click the link printed above to open this run in Neptune.

The run is empty for now, but keep the tab open to see what happens next.

**A few explanations**

In the code above, you tell Neptune:

* **who you are**: your Neptune API token, `api_token`
* **where you want to send your data**: your Neptune `project`

At this point you have a new run in Neptune. From now on, you will use `run` to log metadata to it.

---

**Note**

Instead of logging data to the public project 'common/quickstarts' as the anonymous user 'neptuner', you can log it to your own project.

To do that:

1. Get your [Neptune API token](https://docs-beta.neptune.ai/administration/security-and-privacy/how-to-find-and-set-neptune-api-token)
2. Pass the token to the ``api_token`` argument of ``neptune.init()``: ``api_token=YOUR_API_TOKEN``
3. Pass your project name to the ``project`` argument of ``neptune.init()``.

For example:

```python
run = neptune.init(project='my_workspace/my_project',
                   api_token='MY_API_TOKEN')
```
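To keep the token out of the notebook itself, one option is to read it from the environment. The sketch below assumes you have exported `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` in your shell. The helper name `neptune_init_kwargs` is hypothetical and introduced only for illustration. When the variables are missing, it falls back to the public project and anonymous token used in this guide.

```python
import os


def neptune_init_kwargs(env=None):
    """Build keyword arguments for neptune.init() from environment variables.

    Hypothetical helper for illustration. Falls back to the public
    'common/quickstarts' project and the 'ANONYMOUS' token from this guide.
    """
    env = os.environ if env is None else env
    return {
        "project": env.get("NEPTUNE_PROJECT", "common/quickstarts"),
        "api_token": env.get("NEPTUNE_API_TOKEN", "ANONYMOUS"),
    }


kwargs = neptune_init_kwargs()
print(kwargs["project"])  # print only the project, never the token
# In the notebook you would then call: run = neptune.init(**kwargs)
```

Passing a plain dictionary as `env` makes the helper easy to check without touching your real environment.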

## Step 3: Save parameters

In [None]:
run['parameters'] = params

## Step 4. Add tags to organize things

Pass a list of strings to the ``.add()`` method of the run's ``sys/tags`` field.

In [None]:
run["sys/tags"].add(['run-organization', 'me'])

## Step 5. Add logging of train and evaluation metrics

In [None]:
run['train/f1'] = train_f1
run['test/f1'] = test_f1

A run can be viewed as a dictionary-like structure whose keys you define in your code. Keys that share a prefix, such as `train/f1` and `test/f1`, are grouped into **namespaces**. This hierarchy is reflected in the UI, so you can organize your metadata in whatever way is most convenient.

There is one special namespace: the **system namespace**, denoted `sys`. You can use it to add a name and tags to the run.
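To make the namespace idea concrete, here is a plain-Python sketch of how slash-separated field paths map onto a nested structure. It is only an illustration of the concept, not how Neptune stores data internally. The function `nest` and the metric values are made up for this example.

```python
def nest(flat):
    """Turn {'train/f1': 0.9}-style paths into nested dictionaries."""
    tree = {}
    for path, value in flat.items():
        *parents, leaf = path.split("/")
        node = tree
        for name in parents:
            # Create the intermediate namespace if it does not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return tree


# Illustrative values only, not results from an actual run.
logged = {
    "parameters/n_estimators": 10,
    "parameters/max_depth": 3,
    "train/f1": 0.98,
    "test/f1": 0.93,
}

tree = nest(logged)
print(sorted(tree))   # top-level namespaces
print(tree["train"])  # fields inside the 'train' namespace
```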

## Step 6. Execute a few runs with different parameters

Let's execute a few runs with different model configurations.

Change the values in the ``params`` dictionary from **Step 1: Create a basic training script**, for example ``n_estimators`` or ``max_depth``:

```python
params = {'n_estimators': 10,
          'max_depth': 3,
          'min_samples_leaf': 1,
          'min_samples_split': 2,
          'max_features': 3,
          }
```

Then run all the cells again to log each configuration to Neptune.
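If you prefer not to edit the dictionary by hand, one possible approach is to generate the configurations up front. The sketch below uses only the standard library. The search space in `grid` and the helper `param_variants` are hypothetical and chosen for illustration. For each variant you would repeat the steps of this guide: create a run, train, and log parameters and metrics.

```python
from itertools import product

base_params = {'n_estimators': 10,
               'max_depth': 3,
               'min_samples_leaf': 1,
               'min_samples_split': 2,
               'max_features': 3,
               }

# Hypothetical search space; pick values that suit your problem.
grid = {'n_estimators': [10, 50],
        'max_depth': [3, 5]}


def param_variants(base, grid):
    """Yield one parameter dict per combination of the values in grid."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        # Override the base parameters with this combination.
        yield {**base, **dict(zip(keys, values))}


variants = list(param_variants(base_params, grid))
for params in variants:
    print(params)
```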

## Step 7. Go to Neptune UI

Click one of the links that were printed when you ran the script, or go directly to the app.

If you are logging things to the public project ``common/quickstarts`` you can just [follow this link](https://alpha.neptune.ai/o/common/org/quickstarts/e/QUI-10/parameters).

## Step 8. See that everything got logged

Open one of the runs you executed and check that everything was logged correctly:

- Click the run link or one of the rows in the runs table in the UI
- Go to the ``Parameters`` section to see your parameters
- Go to ``Monitoring`` to see hardware utilization charts
- Go to ``All metadata`` to review all logged metadata

![image](https://neptune.ai/wp-content/uploads/docs-organize-runs-review.gif)

## Step 9. Filter runs by tag

Go to the runs table and filter by the ``run-organization`` tag.

Neptune will show only the runs that have this tag.

![img](https://neptune.ai/wp-content/uploads/docs-organize-ml-runs-tags.gif)

## Step 10. Choose parameter and metric columns you want to see

Use the ``Add column`` button to choose the columns for the runs table:

- Click ``Add column``
- Type the name of the metadata field you are interested in, for example ``test/f1``
- Click ``test/f1`` to add it

![img](https://neptune.ai/wp-content/uploads/docs-organize-ml-runs-cols.gif)

## Step 11. Save the view of runs table

You can save the current view of the runs table for later:

- Click ``Save as new``

Both the selected columns and the row filters will be saved as a view.

![img](https://neptune.ai/wp-content/uploads/docs-organize-ml-runs-view.gif)

---

**Tip**

Create and save multiple views of the runs table for different use cases or groups of runs.

---