![Neptune + MosaicML Composer](https://neptune.ai/wp-content/uploads/2023/09/mosiacml.svg)

# Neptune + MosaicML

<a target="_blank" href="https://colab.research.google.com/github/neptune-ai/examples/blob/main/integrations-and-supported-tools/mosaicml-composer/notebooks/Neptune_MosaicML_Composer.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>

<a target="_blank" href="https://github.com/neptune-ai/examples/blob/main/integrations-and-supported-tools/mosaicml-composer/notebooks/Neptune_MosaicML_Composer.ipynb">
  <img alt="Open in GitHub" src="https://img.shields.io/badge/Open_in_GitHub-blue?logo=github&labelColor=black">
</a>

## Introduction

[MosaicML Composer](https://github.com/mosaicml/composer) is a PyTorch library for efficient neural network training.

This guide will show you how to:

* Create a Neptune logger for MosaicML Composer
* Automatically log your Composer training metadata to Neptune

## Before you start

This notebook example lets you try out Neptune as an anonymous user, with zero setup.

If you want to see the example logged to your own workspace instead:

  1. Create a Neptune account. [Register &rarr;](https://neptune.ai/register)
  1. Create a Neptune project that you will use for tracking metadata. For instructions, see [Creating a project](https://docs.neptune.ai/setup/creating_project) in the Neptune docs.

## Install Neptune and dependencies

In [1]:
%pip install -U neptune torch torchvision #mosaicml

# TODO: Update after released
%pip install git+https://github.com/AleksanderWWW/composer.git@neptune-logger


## Import libraries

In [1]:
import torch

from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from composer import Trainer
from composer.models import mnist_model
from composer.loggers import NeptuneLogger
from composer.algorithms import LabelSmoothing, BlurPool, ProgressiveResizing

## Prepare dataset and dataloaders

In [2]:
transform = transforms.Compose([transforms.ToTensor()])
train_dataset = datasets.MNIST('data', download=True, train=True, transform=transform)
eval_dataset = datasets.MNIST('data', download=True, train=False, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=128)
eval_dataloader = DataLoader(eval_dataset, batch_size=128)
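As a quick sanity check on the loaders above, the number of batches per epoch follows from the dataset size and `batch_size`. The sketch below assumes the standard MNIST split sizes (60,000 training and 10,000 test images) and the PyTorch `DataLoader` default of `drop_last=False`. The helper name `batches_per_epoch` is hypothetical.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Return how many batches a DataLoader yields per epoch."""
    if drop_last:
        # The final, smaller batch is discarded.
        return num_samples // batch_size
    # The final, smaller batch is kept.
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(60_000, 128)
eval_batches = batches_per_epoch(10_000, 128)
print(train_batches, eval_batches)
```

These counts are the totals you should see in the training and evaluation progress bars during `trainer.fit()`.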

## (Neptune) Create `neptune_logger`

To create a new run for tracking the metadata, you tell Neptune who you are (`api_token`) and where to send the data (`project`).

You can use the default code cell below to create an anonymous run in the public project [common/mosaicml](https://app.neptune.ai/o/common/org/mosaicml). **Note**: Public projects are cleaned regularly, so anonymous runs are only stored temporarily.

### Log to your own project instead

Replace the code below with the following:

```python
from getpass import getpass

neptune_logger = NeptuneLogger(
    project="workspace-name/project-name",  # replace with your own (see instructions below)
    api_token=getpass("Enter your Neptune API token: "),
    tags=["mnist", "notebook"],  # (optional) use your own
)
```

To find your API token and full project name:

1. [Log in to Neptune](https://app.neptune.ai/).
1. In the bottom-left corner, expand your user menu and select **Get your API token**.
1. The workspace name is displayed in the top-left corner of the app.

    To copy the project path, in the top-right corner, open the settings menu and select **Properties**.

For more help, see [Setting Neptune credentials](https://docs.neptune.ai/setup/setting_credentials) in the Neptune docs.
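If you would rather not paste the token into the notebook, Neptune can also read credentials from the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables. The sketch below shows one way to collect the logger arguments from the environment and fail early if a variable is missing. The helper `neptune_kwargs_from_env` is hypothetical and not part of Neptune or Composer.

```python
import os
from typing import Mapping, Optional


def neptune_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect NeptuneLogger keyword arguments from environment variables.

    Hypothetical helper: raises early if a variable is missing, so a typo
    does not surface later as an authentication error.
    """
    env = os.environ if env is None else env
    required = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Set these environment variables first: {', '.join(missing)}")
    return {"api_token": env["NEPTUNE_API_TOKEN"], "project": env["NEPTUNE_PROJECT"]}


# Example with placeholder values; omit the argument to read os.environ instead.
kwargs = neptune_kwargs_from_env(
    {"NEPTUNE_API_TOKEN": "<your-token>", "NEPTUNE_PROJECT": "workspace-name/project-name"}
)
print(kwargs["project"])
```

You could then create the logger with `NeptuneLogger(**kwargs, tags=["mnist", "notebook"])`.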

In [6]:
from neptune import ANONYMOUS_API_TOKEN

neptune_logger = NeptuneLogger(
    api_token=ANONYMOUS_API_TOKEN,  # Replace with your own
    project="common/mosaicml-composer",  # Replace with your own
    capture_hardware_metrics=True,  # This is turned off by default in Notebooks
    tags=["mnist", "notebook"],  # (optional) use your own
)

https://app.neptune.ai/siddhant.sadangi/sandbox/e/SAN-1028


**To open the run in the Neptune web app, click the link that appeared in the cell output.**

## Configure Composer algorithms

### Label smoothing

In [11]:
label_smoothing = LabelSmoothing(0.1)
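
For intuition, label smoothing replaces one-hot targets with slightly softened ones. The sketch below uses the common uniform formulation, in which a fraction `smoothing` of the probability mass is spread evenly over all classes. It is a plain-Python illustration with a hypothetical helper name, not Composer's implementation.

```python
def smooth_one_hot(label: int, num_classes: int, smoothing: float) -> list[float]:
    """Return a smoothed target distribution for a single class label."""
    uniform = smoothing / num_classes  # mass given to every class
    target = [uniform] * num_classes
    target[label] += 1.0 - smoothing  # remaining mass stays on the true class
    return target


# MNIST has 10 classes; 0.1 matches the value passed to LabelSmoothing above.
smoothed = smooth_one_hot(label=3, num_classes=10, smoothing=0.1)
print(smoothed)
```

With these values, the true class keeps most of the probability mass and each of the other nine classes receives a small, equal share, so the targets still sum to 1.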

### Blur pooling

In [12]:
blurpool = BlurPool(
    replace_convs=True,
    replace_maxpools=True,
    blur_first=True,
)

### Progressive resizing

In [13]:
prog_resize = ProgressiveResizing(
    initial_scale=0.6,
    finetune_fraction=0.34,
)
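
To get a feel for what these two parameters control, the sketch below models the input scale as a function of training progress. It is a simplified illustration under stated assumptions, not Composer's code: it assumes a linear ramp from `initial_scale` to full size, ignores any rounding of image sizes, and takes `delay_fraction` (the part of training spent at the initial scale before the ramp starts) as an explicit argument. The value 0.5 used here is an assumed default, so check the Composer documentation for your version. The function name `resize_scale` is hypothetical.

```python
def resize_scale(
    elapsed: float,
    initial_scale: float,
    finetune_fraction: float,
    delay_fraction: float,
) -> float:
    """Scale factor applied to inputs after `elapsed` fraction of training (0 to 1)."""
    if elapsed < delay_fraction:
        # Before the ramp starts, inputs stay at the initial (smallest) scale.
        return initial_scale
    ramp = 1.0 - finetune_fraction - delay_fraction
    if ramp <= 0:
        # No room for a ramp: jump straight to full-size inputs.
        return 1.0
    # Linear ramp, capped at 1.0 for the final fine-tuning phase.
    progress = min((elapsed - delay_fraction) / ramp, 1.0)
    return initial_scale + (1.0 - initial_scale) * progress


for elapsed in (0.0, 0.5, 0.58, 0.66, 1.0):
    scale = resize_scale(elapsed, initial_scale=0.6, finetune_fraction=0.34, delay_fraction=0.5)
    print(f"{elapsed:.2f} -> {scale:.2f}")
```

Under these assumptions, the last 34% of training runs on full-size inputs, which is what `finetune_fraction=0.34` reserves.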

## Train model with Composer algorithms

In [14]:
trainer = Trainer(
    model=mnist_model(),
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="3ep",
    device="gpu" if torch.cuda.is_available() else "cpu",
    algorithms=[label_smoothing, blurpool, prog_resize],
    loggers=neptune_logger,
)

trainer.fit()

Value '2240760941' is outside the range of 32-bit integers ('-2147483648' to '2147483647') and will be logged as float


******************************
Config:
blurpool/num_blurconv_layers: 0
blurpool/num_blurpool_layers: 0
enabled_algorithms/BlurPool: true
enabled_algorithms/LabelSmoothing: true
enabled_algorithms/ProgressiveResizing: true
node_name: unknown because NODENAME environment variable not set
num_cpus_per_node: 1
num_nodes: 1
rank_zero_seed: 2240760941

******************************


train          Epoch   0:    0%|| 0/469 [00:00<?, ?ba/s]         



eval           Epoch   0:    0%|| 0/79 [00:00<?, ?ba/s]         



train          Epoch   1:    0%|| 0/469 [00:00<?, ?ba/s]         

Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute: training/metrics/metrics/train/MulticlassAccuracy. Invalid point: 469.0


eval           Epoch   1:    0%|| 0/79 [00:00<?, ?ba/s]         

train          Epoch   2:    0%|| 0/469 [00:00<?, ?ba/s]         

eval           Epoch   2:    0%|| 0/79 [00:00<?, ?ba/s]         

Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute: training/metrics/metrics/train/MulticlassAccuracy. Invalid point: 938.0


## Log additional metadata

The `base_handler` property of `neptune_logger` exposes the [namespace handler](https://docs.neptune.ai/api/field_types/#namespace-handler) used by the logger. You can use it to log any additional metadata.

**Note:** The default namespace used by `NeptuneLogger` is `"training"`. To change it, pass your own `base_namespace` when initializing `NeptuneLogger`.

In the example below, we log a sample of the training dataset as a series of images in the `"training/images"` namespace.

In [24]:
from neptune.types import File
neptune_logger.base_handler["images"].extend([File.as_image(img/255) for img in train_dataset.data[:50]])

## Log to your custom namespace

To log to a namespace other than the one used by the base handler, use the `neptune_run` property of `neptune_logger` to access the underlying `Run` object and pass your own namespace.

In the example below, we log a sample of the evaluation dataset as a series of images in the `"eval/images"` namespace.

In [21]:
neptune_logger.neptune_run["eval/images"].extend([File.as_image(img/255) for img in eval_dataset.data[:50]])
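
The difference between the two access points comes down to where the field path starts. The sketch below illustrates how the final paths are composed; `field_path` is a hypothetical helper written for this explanation and does not call Neptune.

```python
def field_path(namespace: str, key: str) -> str:
    """Join a namespace and a relative key into a full Neptune-style field path."""
    return "/".join(part.strip("/") for part in (namespace, key) if part.strip("/"))


# Via the base handler: keys are relative to the logger's base namespace.
print(field_path("training", "images"))  # training/images

# Via the run object: keys are relative to the run root (empty namespace).
print(field_path("", "eval/images"))  # eval/images
```

If you initialize the logger with a different `base_namespace`, only the first case changes.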

## Stop logging

Once you are done logging, stop tracking the run.

In [25]:
trainer.close()

Shutting down background jobs, please wait a moment...
Done!
Waiting for the remaining 50 operations to synchronize with Neptune. Do not kill this process.
All 50 operations synced, thanks for waiting!
Explore the metadata in the Neptune app:
https://app.neptune.ai/siddhant.sadangi/sandbox/e/SAN-1028/metadata


## Analyze run in the Neptune app
Follow the run link in the above cell output and explore the logged metadata.  
You can also explore this [example run](https://app.neptune.ai/o/common/org/mosaicml/runs/details?viewId=standard-view&detailsTab=dashboard&dashboardId=Overview-99f571df-0fec-4447-9ffe-5a4c668577cd&shortId=CAT-2).