![Neptune + MosaicML Composer](https://neptune.ai/wp-content/uploads/2023/09/mosiacml.svg)

# Neptune + MosaicML

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/integrations-and-supported-tools/mosaicml/notebooks/Neptune_MosaicML.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>

<a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/integrations-and-supported-tools/mosaicml/notebooks/Neptune_MosaicML.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a>

## Introduction

[MosaicML Composer](https://github.com/mosaicml/composer) is a PyTorch library for efficient neural network training.

This guide will show you how to:

* Create a Neptune logger for MosaicML Composer
* Automatically log your Composer training metadata to Neptune

## Before you start

This notebook example lets you try out Neptune as an anonymous user, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [2]:
%pip install -U neptune mosaicml torch torchvision

# Temporary: install Composer from the branch that adds NeptuneLogger (update once it is released)
%pip install git+https://github.com/AleksanderWWW/composer.git@neptune-logger

Collecting mosaicml
  Using cached mosaicml-0.16.2-py3-none-any.whl (604 kB)
Installing collected packages: mosaicml
  Attempting uninstall: mosaicml
    Found existing installation: mosaicml 0.16.0
    Uninstalling mosaicml-0.16.0:
      Successfully uninstalled mosaicml-0.16.0
Successfully installed mosaicml-0.16.2
Note: you may need to restart the kernel to use updated packages.
Collecting git+https://github.com/AleksanderWWW/composer.git
  Cloning https://github.com/AleksanderWWW/composer.git to /tmp/pip-req-build-s07a00ty
  Running command git clone --filter=blob:none --quiet https://github.com/AleksanderWWW/composer.git /tmp/pip-req-build-s07a00ty
  Resolved https://github.com/AleksanderWWW/composer.git to commit f34f86bd3e23def060577d559fc9a557ff11a19f
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Buildi

## Import libraries

In [1]:
import torch

from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from composer import Trainer
from composer.models import mnist_model
from composer.loggers import NeptuneLogger
from composer.algorithms import LabelSmoothing, BlurPool, ProgressiveResizing


## Prepare dataset and dataloaders

In [2]:
transform = transforms.Compose([transforms.ToTensor()])
train_dataset = datasets.MNIST('data', download=True, train=True, transform=transform)
eval_dataset = datasets.MNIST('data', download=True, train=False, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=128)
eval_dataloader = DataLoader(eval_dataset, batch_size=128)

## (Neptune) Create `neptune_logger`

To create a new run for tracking the metadata, you tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in the public project [common/mosaicml](https://app.neptune.ai/o/common/org/mosaicml). **Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### Log to your own project instead

Replace the code below with the following:

```python
from getpass import getpass

neptune_logger = NeptuneLogger(
    project="workspace-name/project-name",  # replace with your own (see instructions below)
    api_token=getpass("Enter your Neptune API token: "),
    tags=["mnist", "notebook"],  # (optional) use your own
)
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. The workspace name is displayed in the top-left corner of the app.

    To copy the project path, in the top-right corner, open the settings menu and select **Properties**.

For more help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.
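
As an alternative to typing the token each time, the Neptune client can read the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables. The sketch below only collects those values into keyword arguments; the helper name `neptune_kwargs_from_env` and the placeholder values are hypothetical and not part of Neptune or Composer.

```python
import os


def neptune_kwargs_from_env(env=None):
    """Collect logger keyword arguments from Neptune's environment variables.

    Hypothetical helper for illustration; it is not part of Neptune or Composer.
    """
    env = os.environ if env is None else env
    kwargs = {}
    # Only include a key when the variable is set and non-empty.
    if env.get("NEPTUNE_PROJECT"):
        kwargs["project"] = env["NEPTUNE_PROJECT"]
    if env.get("NEPTUNE_API_TOKEN"):
        kwargs["api_token"] = env["NEPTUNE_API_TOKEN"]
    return kwargs


# Placeholder values standing in for a real environment.
example_env = {
    "NEPTUNE_PROJECT": "workspace-name/project-name",
    "NEPTUNE_API_TOKEN": "placeholder-token",
}
credentials = neptune_kwargs_from_env(example_env)
print(sorted(credentials))
```

The resulting dictionary could then be unpacked into the logger, for example `NeptuneLogger(**neptune_kwargs_from_env(), tags=["mnist", "notebook"])`.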

In [21]:
from neptune import ANONYMOUS_API_TOKEN

neptune_logger = NeptuneLogger(
    api_token=ANONYMOUS_API_TOKEN,  # Replace with your own
    project="common/mosaicml",  # Replace with your own
    capture_hardware_metrics=True,  # This is turned off by default in Notebooks
    tags=["mnist", "notebook"],  # (optional) use your own
)

https://app.neptune.ai/siddhant.sadangi/sandbox/e/SAN-1007


**To open the run in the Neptune web app, click the link that appeared in the cell output.**

## Configure Composer algorithms

### Label smoothing

In [22]:
label_smoothing = LabelSmoothing(0.1)
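
To give a feel for what a smoothing value of `0.1` does, here is a minimal sketch of the usual label smoothing formula, which mixes the one-hot target with a uniform distribution over the classes. It is a plain-Python illustration, not Composer's implementation, and the function name `smooth_labels` is chosen here for the example.

```python
def smooth_labels(one_hot, smoothing):
    """Blend a one-hot target with a uniform distribution over all classes."""
    num_classes = len(one_hot)
    uniform = 1.0 / num_classes
    return [(1.0 - smoothing) * p + smoothing * uniform for p in one_hot]


# A four-class target whose true class is index 2.
target = [0.0, 0.0, 1.0, 0.0]
smoothed = smooth_labels(target, smoothing=0.1)

# The true class keeps most of the probability mass; the rest is shared evenly.
print([round(p, 3) for p in smoothed])
```

With `smoothing=0` the target is unchanged, and with `smoothing=1` every class gets the same probability.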

### Blur pooling

In [23]:
blurpool = BlurPool(
    replace_convs=True,
    replace_maxpools=True,
    blur_first=True,
)
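
The idea behind blur pooling is to apply a low-pass filter before downsampling so that fine detail is not lost or aliased. The following one-dimensional sketch illustrates that idea with a `[1, 2, 1] / 4` kernel and stride 2; the kernel, the edge padding, and the function name are assumptions made for this example and do not describe Composer's internals.

```python
def blur_then_downsample(signal, stride=2):
    """Blur with a [1, 2, 1] / 4 kernel, then keep every `stride`-th sample."""
    kernel = [0.25, 0.5, 0.25]
    # Repeat the edge values so the output has the same length as the input.
    padded = [signal[0]] + list(signal) + [signal[-1]]
    blurred = [
        sum(weight * padded[i + offset] for offset, weight in enumerate(kernel))
        for i in range(len(signal))
    ]
    return blurred[::stride]


# A single spike that sits on an odd index.
spike = [0, 0, 0, 1, 0, 0, 0, 0]

plain = spike[::2]                      # naive stride-2 subsampling drops the spike
blurred = blur_then_downsample(spike)   # the blurred version keeps a trace of it

print(plain)
print(blurred)
```

Naive subsampling misses the spike entirely, while the blurred version spreads it onto neighboring samples that survive the downsampling.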

### Progressive resizing

In [24]:
prog_resize = ProgressiveResizing(
    initial_scale=0.6,
    finetune_fraction=0.34,
)
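
To illustrate how `initial_scale` and `finetune_fraction` interact, the sketch below computes an input scale that grows linearly from `initial_scale` to full size and then stays at full size for the final `finetune_fraction` of training. This is a simplified model under stated assumptions: the `delay_fraction` argument is a hypothetical parameter of this example, and the library's actual schedule may differ in details such as rounding of image sizes.

```python
def input_scale(progress, initial_scale=0.6, finetune_fraction=0.34, delay_fraction=0.0):
    """Return the input scale factor for a training progress value in [0, 1].

    Simplified illustration; assumes delay_fraction < 1 - finetune_fraction.
    """
    ramp_end = 1.0 - finetune_fraction
    if progress <= delay_fraction:
        return initial_scale          # before the ramp: smallest inputs
    if progress >= ramp_end:
        return 1.0                    # fine-tuning phase: full-sized inputs
    ramp = (progress - delay_fraction) / (ramp_end - delay_fraction)
    return initial_scale + (1.0 - initial_scale) * ramp


# Sample the schedule at a few points of training.
for progress in (0.0, 0.33, 0.66, 1.0):
    print(progress, round(input_scale(progress), 3))
```

Under these assumptions, roughly the last third of training (`finetune_fraction=0.34`) runs on full-sized images.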

## Train model with Composer algorithms

In [25]:
trainer = Trainer(
    model=mnist_model(),
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="3ep",
    device="gpu" if torch.cuda.is_available() else "cpu",
    algorithms=[label_smoothing, blurpool, prog_resize],
    loggers=neptune_logger,
)

trainer.fit()

Value '3765361996' is outside the range of 32-bit integers ('-2147483648' to '2147483647') and will be logged as float


******************************
Config:
blurpool/num_blurconv_layers: 0
blurpool/num_blurpool_layers: 0
enabled_algorithms/BlurPool: true
enabled_algorithms/LabelSmoothing: true
enabled_algorithms/ProgressiveResizing: true
node_name: unknown because NODENAME environment variable not set
num_gpus_per_node: 1
num_nodes: 1
rank_zero_seed: 3765361996

******************************


train          Epoch   0:    0%|| 0/469 [00:00<?, ?ba/s]         



eval           Epoch   0:    0%|| 0/79 [00:00<?, ?ba/s]         
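
The batch totals in the progress bars above (469 for training, 79 for evaluation) can be checked with a little arithmetic, assuming the standard MNIST split of 60,000 training and 10,000 test images and the `batch_size=128` used earlier. The helper name below is chosen for this example only.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a data loader yields in one pass over the dataset."""
    if drop_last:
        return num_samples // batch_size          # incomplete final batch is skipped
    return math.ceil(num_samples / batch_size)    # incomplete final batch is kept


train_batches = batches_per_epoch(60_000, 128)
eval_batches = batches_per_epoch(10_000, 128)
print(train_batches, eval_batches)
```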



## Log additional metadata to your custom namespaces
The `neptune_run` property of `neptune_logger` exposes the Neptune `Run` object, which you can use to log additional metadata to your own namespaces.

For example, the following cell logs a sample of the training and eval datasets as two series of images.

In [43]:
from neptune.types import File
neptune_logger.neptune_run[neptune_logger._base_namespace]["images/training"].extend([File.as_image(img/255) for img in train_dataset.data[:50]])
neptune_logger.neptune_run[neptune_logger._base_namespace]["images/eval"].extend([File.as_image(img/255) for img in eval_dataset.data[:50]])

The images are logged to the `images/training` and `images/eval` fields under the logger's base namespace, for example "training/images/eval".
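
The way the full field path is formed can be sketched as a simple join of the base namespace and the relative path. The helper below is hypothetical and only mimics the path composition; the base namespace value `"training"` is taken from the example path above.

```python
def field_path(base_namespace, relative_path):
    """Join a base namespace and a relative field path with single slashes."""
    combined = f"{base_namespace}/{relative_path}"
    # Drop empty segments so stray or doubled slashes do not matter.
    return "/".join(part for part in combined.split("/") if part)


for name in ("images/training", "images/eval"):
    print(field_path("training", name))
```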

## Stop logging

Once you are done logging, stop tracking the run.

In [16]:
trainer.close()

Shutting down background jobs, please wait a moment...
Done!
All 0 operations synced, thanks for waiting!
Explore the metadata in the Neptune app:
https://app.neptune.ai/siddhant.sadangi/sandbox/e/SAN-1005/metadata


## Analyze run in the Neptune app
Follow the run link in the above cell output and explore the logged metadata.  
You can also explore this [example run](https://app.neptune.ai/o/common/org/mosaicml/runs/details?viewId=standard-view&detailsTab=dashboard&dashboardId=Overview-99f571df-0fec-4447-9ffe-5a4c668577cd&shortId=CAT-2).