# Neptune + skorch

## Introduction

In this guide, you will learn how to use `NeptuneLogger()` to log training metrics to Neptune.

[See this example in Neptune](https://app.neptune.ai/o/common/org/skorch-integration/e/SKOR-13).

## Before you start

This notebook example lets you try out Neptune as an anonymous user, with zero setup.

* If you are running the notebook on your local machine, you need to have [Python](https://www.python.org/downloads/) and [pip](https://pypi.org/project/pip/) installed.
* If you want to see the example recorded to your own workspace instead:
    * Create a Neptune account → [Take me to registration](https://neptune.ai/register)
    * Create a Neptune project that you will use for tracking metadata → [Tell me more about projects](https://docs.neptune.ai/administration/projects)

## Install Neptune and dependencies

In [None]:
! pip install "neptune-client>=0.11.0" torch==1.13.0 scikit-learn==1.1.3 skorch==0.12.1

## Import libraries

In [None]:
import torch
from torch import nn
import torch.nn.functional as F
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np
import matplotlib.pyplot as plt
import neptune.new as neptune
from neptune.new.types import File
from skorch.callbacks import NeptuneLogger, Checkpoint
from skorch import NeuralNetClassifier

## Loading data
Use scikit-learn's `fetch_openml` to load the MNIST data.

In [None]:
mnist = fetch_openml("mnist_784", as_frame=False, cache=False)

## Preprocessing data

Each image in the MNIST dataset is encoded as a 784-dimensional vector representing a 28 x 28 pixel image. Each pixel has a value between 0 and 255, corresponding to its grey value.

We convert the `data` and `target` arrays returned by `fetch_openml` above to `float32` and `int64`, respectively.

In [None]:
X = mnist.data.astype("float32")
y = mnist.target.astype("int64")

Raw pixel values in the range [0, 255] are large compared with typical initial weights and can make training harder, so we scale `X` down. A commonly used range is [0, 1].

In [None]:
X /= 255.0

In [None]:
X.min(), X.max()
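
As a quick illustration of the scaling step, here is a minimal sketch on a tiny made-up array (`pixels` is hypothetical and not part of the notebook). Dividing by 255 maps the original range [0, 255] onto [0, 1] without changing the relative brightness of the pixels.

```python
import numpy as np

# Hypothetical 2 x 3 "image" with raw grey values in [0, 255].
pixels = np.array([[0, 51, 102], [153, 204, 255]], dtype="float32")

# Same scaling as in the notebook: divide by the maximum grey value.
scaled = pixels / 255.0

print(scaled.min(), scaled.max())
```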

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

In [None]:
assert X_train.shape[0] + X_test.shape[0] == mnist.data.shape[0]

In [None]:
X_train.shape, y_train.shape
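
The expected split sizes can be worked out by hand. The sketch below assumes the full `mnist_784` dataset of 70,000 samples and mirrors how scikit-learn sizes the two sets when only `test_size` is given as a fraction.

```python
import math

n_samples = 70_000   # size of the full MNIST dataset (mnist_784)
test_size = 0.25     # same fraction as in the train_test_split call

# scikit-learn rounds the test set size up and gives the rest to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 52500 17500
```

With 784 features per image, `X_train.shape` should therefore be `(52500, 784)` and `y_train.shape` should be `(52500,)`.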

### Print a selection of training images and their labels

In [None]:
def plot_example(X, y):
    """Plot the first 5 images and their labels in a row."""
    # Use a separate loop variable so the `y` argument is not shadowed.
    for i, (img, label) in enumerate(zip(X[:5].reshape(5, 28, 28), y[:5])):
        plt.subplot(151 + i)
        plt.imshow(img)
        plt.xticks([])
        plt.yticks([])
        plt.title(label)

In [None]:
plot_example(X_train, y_train)

## Build a neural network with PyTorch
Next, we'll build a simple, fully connected neural network with one hidden layer. The input layer has 784 dimensions (28 x 28), the hidden layer has 98 (= 784 / 8), and the output layer has 10 neurons, representing the digits 0-9.

In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"

In [None]:
mnist_dim = X.shape[1]
hidden_dim = int(mnist_dim / 8)
output_dim = len(np.unique(mnist.target))

In [None]:
mnist_dim, hidden_dim, output_dim
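
To get a feel for the model size these dimensions imply, the following sketch counts the trainable parameters of the two fully connected layers by hand. The values are hard-coded so that the snippet runs on its own.

```python
mnist_dim = 784              # 28 x 28 input pixels
hidden_dim = mnist_dim // 8  # 98, same value as int(mnist_dim / 8)
output_dim = 10              # digits 0-9

# A fully connected layer has (inputs x outputs) weights plus one bias per output.
hidden_params = mnist_dim * hidden_dim + hidden_dim
output_params = hidden_dim * output_dim + output_dim

total_params = hidden_params + output_params
print(total_params)  # 77920
```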

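Define the network as a PyTorch `nn.Module`. The `forward` method applies the hidden layer with a ReLU activation, then dropout, then a softmax over the output layer.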

In [None]:
class ClassifierModule(nn.Module):
    def __init__(
        self,
        input_dim=mnist_dim,
        hidden_dim=hidden_dim,
        output_dim=output_dim,
        dropout=0.5,
    ):
        super(ClassifierModule, self).__init__()
        self.dropout = nn.Dropout(dropout)

        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, output_dim)

    def forward(self, X, **kwargs):
        X = F.relu(self.hidden(X))
        X = self.dropout(X)
        X = F.softmax(self.output(X), dim=-1)
        return X
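
The `forward` method ends with a softmax, so the module returns class probabilities rather than raw scores. This fits skorch's `NeuralNetClassifier`, which by default uses a negative log-likelihood criterion and takes the logarithm of the module's output itself. The NumPy sketch below illustrates the two steps on made-up scores. It is an illustration of the math, not skorch's implementation, and the `logits` and `targets` values are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical raw scores for 2 samples and 3 classes.
logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
targets = np.array([0, 1])  # true class index per sample

probs = softmax(logits)

# Negative log-likelihood: minus the log-probability of the true class, averaged.
nll = -np.log(probs[np.arange(len(targets)), targets]).mean()

print(probs.sum(axis=-1))
print(nll)
```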

## Start a run

To connect your script to Neptune and create a new run, you need to tell Neptune:
* **Who you are** - with a Neptune API token
* **Where to send your data** - to a Neptune project

The cell below lets you record data to the public project [common/skorch-integration](https://app.neptune.ai/common/skorch-integration) as an anonymous user.

In [None]:
run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/skorch-integration",
    name="skorch-example",
)

Alternatively, you can log the example to your own workspace.

To do that, replace the code above with the following:

```python
from getpass import getpass

run = neptune.init_run(
    api_token=getpass("Enter your Neptune API token: "),
    project="workspace-name/project-name",  # replace with your own
)
```

For example, if your workspace name is `ml-team` and the project name is `classification`, the project argument is: `project="ml-team/classification"`.

To find your API token and project name, [log in to Neptune](https://app.neptune.ai/).
- In the top-right corner, click your avatar and select **Get your API token**.
- To find and copy your project name, navigate to the project, then click **Settings** → **Properties**.
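
As an alternative to typing the token into the notebook, Neptune also reads the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables when `api_token` and `project` are not passed. The sketch below checks that both are set before you start a run. The helper `missing_neptune_settings` is hypothetical and not part of the Neptune client.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_settings(env=None):
    """Return the names of required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Made-up environment in which only the project is configured.
example_env = {"NEPTUNE_PROJECT": "ml-team/classification"}
missing = missing_neptune_settings(example_env)
print(missing)  # ['NEPTUNE_API_TOKEN']
```

In a real session, you would call `missing_neptune_settings()` without arguments and start the run only if the returned list is empty.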

---

You now have a new run in Neptune! From here on, we'll use the `run` object to initialize the `NeptuneLogger`.

**To open the run in Neptune, follow the link that appeared in the cell output.**

There's not much to display yet, but keep the tab with the run open to see what happens next.

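## Create NeptuneLogger

Pass the run to `NeptuneLogger`. Setting `close_after_train=False` keeps the run open after training finishes, so you can log more metadata to it later.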

In [None]:
neptune_logger = NeptuneLogger(run, close_after_train=False)

In [None]:
checkpoint_dirname = "./checkpoints"
checkpoint = Checkpoint(dirname=checkpoint_dirname)

## Initialize a trainer and pass neptune_logger

In [None]:
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
    device=device,
    callbacks=[neptune_logger, checkpoint],
)

In [None]:
net.fit(X_train, y_train);

## More options

### Log model weights
Use the `Checkpoint` callback to save the model files to disk. You can then upload the files to Neptune.

In [None]:
neptune_logger.run["training/model/checkpoints"].upload_files(checkpoint_dirname)
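
Before uploading, it can help to see which files are in the checkpoint directory. This is a minimal sketch, assuming a hypothetical helper `list_checkpoint_files`. The file names in the demonstration are stand-ins created in a temporary directory, not output from this notebook.

```python
import tempfile
from pathlib import Path


def list_checkpoint_files(dirname):
    """Return the relative paths of all regular files under dirname, sorted (hypothetical helper)."""
    root = Path(dirname)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


# Demonstration on a temporary directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("params.pt", "history.json"):
        (Path(tmp) / name).write_bytes(b"")
    found = list_checkpoint_files(tmp)

print(found)  # ['history.json', 'params.pt']
```

In the notebook, you would call `list_checkpoint_files(checkpoint_dirname)` after training.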

### Log test score

In [None]:
y_pred = net.predict(X_test)
neptune_logger.run["training/test/acc"] = accuracy_score(y_test, y_pred)

### Log misclassified images

In [None]:
error_mask = y_pred != y_test
plot_example(X_test[error_mask], y_pred[error_mask])

for (x, y_hat, y) in zip(X_test[error_mask], y_pred[error_mask], y_test[error_mask]):
    x_reshaped = x.reshape(28, 28)
    neptune_logger.run["training/test/misclassified_images"].log(
        File.as_image(x_reshaped), description=f"y_pred={y_hat}, y_true={y}"
    )
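
The boolean mask used above can be tried out on a small example. The sketch below uses made-up labels (`y_true` and `y_hat` are hypothetical) to show how the mask picks out the misclassified samples and how it relates to the accuracy.

```python
import numpy as np

# Hypothetical labels for six test samples.
y_true = np.array([3, 1, 4, 1, 5, 9])
y_hat = np.array([3, 1, 4, 7, 5, 8])

# True where the prediction differs from the label.
error_mask = y_hat != y_true

misclassified_idx = np.flatnonzero(error_mask)  # positions of the errors
accuracy = 1.0 - error_mask.mean()              # share of correct predictions

print(misclassified_idx, accuracy)
```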

## Stop logging
Once you are done logging, stop tracking the run.

In [None]:
neptune_logger.run.stop()

## Explore the results in Neptune

Follow the link to the run and explore the metadata (such as metrics, hyperparameters, and model checkpoints) that was logged to the run in Neptune.

You can also check out an [example run](https://app.neptune.ai/o/common/org/skorch-integration/e/SKOR-13).