# Organize ML runs

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/how-to-guides/organize-ml-experimentation/notebooks/Organize_ML_runs.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>
<a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/how-to-guides/organize-ml-experimentation/notebooks/Organize_ML_runs.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a>
<a target="_blank" href="https://app.neptune.ai/o/common/org/quickstarts/runs/table?viewId=9b012c15-0971-49a2-827b-0d53d0907164"> 
  <img alt="Explore in Neptune" src="https://neptune.ai/wp-content/uploads/2024/01/neptune-badge.svg">
</a>
<a target="_blank" href="https://docs.neptune.ai/tutorials/basic_ml_run_tracking/">
  <img alt="View tutorial in docs" src="https://neptune.ai/wp-content/uploads/2024/01/docs-badge-2.svg">
</a>

## Introduction

This guide will show you how to:

- Keep track of code, data, environment and parameters
- Log results like evaluation metrics and model files
- Find runs on the dashboard with tags
- Organize runs in a dashboard view and save it for later

## Before you start

This notebook example lets you try out Neptune anonymously, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Setup

Install dependencies

In [None]:
! pip install -U neptune scikit-learn

## Create a basic training script

In [None]:
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.4, random_state=1234
)

params = {
    "n_estimators": 10,
    "max_depth": 3,
    "min_samples_leaf": 1,
    "min_samples_split": 2,
    "max_features": 3,
}

clf = RandomForestClassifier(**params)

clf.fit(X_train, y_train)
y_train_pred = clf.predict_proba(X_train)
y_test_pred = clf.predict_proba(X_test)

train_f1 = f1_score(y_train, y_train_pred.argmax(axis=1), average="macro")
test_f1 = f1_score(y_test, y_test_pred.argmax(axis=1), average="macro")
print(f"Train f1:{train_f1} | Test f1:{test_f1}")
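
The scores above use `average="macro"`, which is the unweighted mean of the per-class F1 scores. As a rough illustration of what that number means, here is a pure-Python sketch. The helper `macro_f1` and the toy labels are made up for this example; it is not scikit-learn's implementation.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); fall back to 0.0 if the class never occurs
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels, not taken from the wine dataset
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

score = macro_f1(y_true, y_pred)
print(f"Macro F1: {score:.4f}")
```

Because every class counts equally, a class with few samples influences the macro score as much as a large one.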

## Initialize Neptune and create a new run

To create a new run for tracking the metadata, you tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in a public project. **Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### Log to your own project instead

Replace the code below with the following:

```python
import neptune
from getpass import getpass

run = neptune.init_run(
    project="workspace-name/project-name",  # replace with your own (see instructions below)
    api_token=getpass("Enter your Neptune API token: "),
    capture_hardware_metrics=True,
    capture_stderr=True,
    capture_stdout=True,
)
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. The workspace name is displayed in the top-left corner of the app. 

    To copy the project path, in the top-right corner, open the settings menu and select **Properties**.

For more help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.

In [None]:
import neptune

run = neptune.init_run(
    project="common/quickstarts",
    api_token=neptune.ANONYMOUS_API_TOKEN,
    capture_hardware_metrics=True,
    capture_stderr=True,
    capture_stdout=True,
)  # Hardware metrics, stderr, and stdout are not captured by default in interactive kernels

**To open the run in the Neptune web app, click the link that appeared in the cell output.**

We'll use the `run` object we just created to log metadata. You'll see the metadata appear in the app.

## Save parameters

In [None]:
run["parameters"] = params

## Add tags to organize things

Pass a list of strings to the ``.add()`` method of the run's ``sys/tags`` field.

In [None]:
run["sys/tags"].add(["run-organization", "me"])

## Add logging of train and evaluation metrics

In [None]:
run["train/f1"] = train_f1
run["test/f1"] = test_f1

A run can be viewed as a dictionary-like structure whose fields are grouped into **namespaces** that you define in your code. The hierarchy you define is reflected in the web app, so you can organize your metadata in whatever way is most convenient.

There is one special namespace: the **system namespace**, denoted `sys`. You can use it to add a name and tags to the run.
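
To make the link between nested structures and slash-separated field paths concrete, the sketch below flattens a nested dictionary into paths such as `train/f1`. It is only an illustration: `flatten_to_paths` is a hypothetical helper, not part of the Neptune API, and the metric values are placeholders.

```python
def flatten_to_paths(metadata, prefix=""):
    """Flatten a nested dict into {"namespace/field": value} pairs."""
    paths = {}
    for key, value in metadata.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict plays the role of a namespace
            paths.update(flatten_to_paths(value, path))
        else:
            paths[path] = value
    return paths


# Placeholder metadata mirroring the structure logged in this guide
metadata = {
    "parameters": {"n_estimators": 10, "max_depth": 3},
    "train": {"f1": 0.98},  # placeholder value
    "test": {"f1": 0.93},  # placeholder value
}

paths = flatten_to_paths(metadata)
for path, value in paths.items():
    print(f"{path}: {value}")
```

The printed paths have the same shape as the field names used in this guide, for example `test/f1` and `parameters/max_depth`.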

## Stop logging

Once you are done logging, stop tracking the run.

In [None]:
run.stop()

## Execute a few runs with different parameters

Let's execute some runs with different model configurations.

Change the values in the `params` dictionary in the **Create a basic training script** step:

```python
params = {
    "n_estimators": 10,
    "max_depth": 3,
    "min_samples_leaf": 1,
    "min_samples_split": 2,
    "max_features": 3,
}
``` 

Then rerun all the cells to log each configuration to Neptune as a new run.
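
If you prefer not to edit the dictionary by hand each time, one option is to generate the configurations up front. The sketch below is a minimal illustration using only the standard library; `search_space`, its values, and the helper `iter_param_configs` are assumptions made for this example and are not part of the original notebook.

```python
from itertools import product

base_params = {
    "n_estimators": 10,
    "max_depth": 3,
    "min_samples_leaf": 1,
    "min_samples_split": 2,
    "max_features": 3,
}

# Hypothetical values to try; adjust to your own experiment
search_space = {
    "n_estimators": [10, 50],
    "max_depth": [3, 5],
}


def iter_param_configs(base, space):
    """Yield one params dict per combination of values in `space`."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        # Start from the base config and override the searched keys
        yield {**base, **dict(zip(keys, values))}


configs = list(iter_param_configs(base_params, search_space))
for i, params in enumerate(configs, start=1):
    print(f"Config {i}: {params}")
```

For each generated `params` dictionary, you would then repeat the steps from this guide: train the model, create a run, log the parameters, tags, and metrics, and stop the run.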

## Go to the Neptune app

Click one of the run links printed when you ran the cells, or go directly to the app.

## See if everything is logged

Go to one of the runs you executed and check that everything was logged correctly:

- In the console output or the runs table in the web app, click the run link.
- Go to the **Parameters** section to see your parameters.
- Go to **Monitoring** to see hardware utilization charts.
- Go to **All metadata** to review all logged metadata.

## Filter runs by tag

Go to the runs table and filter by the ``run-organization`` tag.

Neptune then shows only the runs that have this tag.

## Choose the parameter and metric columns you want to see

Use the **Add column** button to choose the columns for the runs table:

- Click **Add column**.
- Type the name of the metadata field you are interested in, for example `test/f1`.
- Click `test/f1` to add it.

## Save the view of the runs table

You can save the current view of the runs table for later by clicking **Save as new**.

Both the selected columns and the row filters are saved as part of the view.

---

**Tip:**  
Create and save multiple views of the runs table for different use cases or groups of runs.

---