[![](https://img.shields.io/badge/Source%20on%20GitHub-orange)](https://github.com/laminlabs/lamin-mlops/blob/main/docs/wandb.ipynb)
[![](https://img.shields.io/badge/Source%20%26%20report%20on%20LaminHub-mediumseagreen)](https://lamin.ai/laminlabs/lamindata/transform/nrPNwWEVUsL95zKv)

# Weights & Biases

This guide shows how to integrate LaminDB with W&B to track the training process and to associate datasets and parameters with models.

In [None]:
# pip install lamindb torchvision lightning wandb
!lamin init --storage ./lamin-mlops
!wandb login

In [None]:
import lamindb as ln
import wandb
import lightning

from torch import utils
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
from autoencoder import LitAutoEncoder

ln.track()

## Define a model

We use a basic PyTorch Lightning autoencoder as an example model.

````{dropdown} Code of LitAutoEncoder
```{eval-rst}
.. literalinclude:: autoencoder.py
   :language: python
   :caption: Simple autoencoder model
```
````

## Query & download the MNIST dataset

We saved the MNIST dataset in the [curation notebook](/mnist), and it now shows up in the Artifact registry:

In [None]:
ln.Artifact.filter(kind="dataset").to_dataframe()

You can also find it on lamin.ai if you connected your instance.

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/LlMSvBjHuXbs36TBGoCM.png" alt="instance view" width="800px">

Let's get the dataset:

In [None]:
artifact = ln.Artifact.get(key="testdata/mnist")
artifact

And download it to a local cache:

In [None]:
path = artifact.cache()
path

Create a PyTorch-compatible dataset:

In [None]:
dataset = MNIST(path.as_posix(), transform=ToTensor())
dataset

## Monitor training with wandb

Train our example model and track the training progress with `wandb`.

In [None]:
from lightning.pytorch.loggers import WandbLogger

MODEL_CONFIG = {"hidden_size": 32, "bottleneck_size": 16, "batch_size": 32}

# create the data loader
train_loader = utils.data.DataLoader(
    dataset, batch_size=MODEL_CONFIG["batch_size"], shuffle=True
)

# init model
autoencoder = LitAutoEncoder(
    MODEL_CONFIG["hidden_size"], MODEL_CONFIG["bottleneck_size"]
)

# initialize the logger
wandb_logger = WandbLogger(project="lamin")

# add batch size to the wandb config
wandb_logger.experiment.config["batch_size"] = MODEL_CONFIG["batch_size"]

In [None]:
from lightning.pytorch.callbacks import ModelCheckpoint

# store checkpoints to disk and upload to LaminDB after training
checkpoint_callback = ModelCheckpoint(
    dirpath=f"model_checkpoints/{wandb_logger.version}.ckpt",
    filename="last_epoch",
    save_top_k=1,
    monitor="train_loss",
)

# train model
trainer = lightning.Trainer(
    accelerator="cpu",
    limit_train_batches=3,
    max_epochs=2,
    logger=wandb_logger,
    callbacks=[checkpoint_callback],
)
trainer.fit(model=autoencoder, train_dataloaders=train_loader)

In [None]:
wandb_logger.experiment.name

In [None]:
wandb_logger.version

In [None]:
wandb.finish()

**See the training progress in the `wandb` UI:**

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/awrTvbxrLaiNav17VxBN.png" alt="Wandb training ui" width="800px">

## Save model in LaminDB

In [None]:
# save checkpoint as a model
artifact = ln.Artifact(
    f"model_checkpoints/{wandb_logger.version}.ckpt",
    key="testmodels/wandb/litautoencoder.ckpt",
    kind="model",
).save()

# create a label with the wandb experiment name
experiment_label = ln.ULabel(
    name=wandb_logger.experiment.name, description="wandb experiment name"
).save()

# annotate the model artifact
artifact.ulabels.add(experiment_label)
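
The checkpoint location is spelled out twice above, once for `ModelCheckpoint` and once for `ln.Artifact`. A minimal sketch of one way to keep the two in sync, assuming a hypothetical helper `checkpoint_dir` that is not part of LaminDB, Lightning, or W&B (the run version `"abc123"` is made up for illustration):

```python
from pathlib import Path


# Hypothetical helper, not part of any library used in this guide.
def checkpoint_dir(run_version: str, root: str = "model_checkpoints") -> Path:
    """Return the folder that holds the checkpoints of one W&B run."""
    # The ".ckpt" suffix mirrors the folder name used in this guide.
    return Path(root) / f"{run_version}.ckpt"


# "abc123" stands in for wandb_logger.version
print(checkpoint_dir("abc123").as_posix())
```

The same value could then be passed as `dirpath=checkpoint_dir(wandb_logger.version)` and as the first argument to `ln.Artifact`, assuming both accept path-like objects in your installed versions.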

**See the checkpoints:**

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/248fOMXqxT0U4f7LRSgj.png" alt="Wandb check points" width="800px">

If you later want to reuse the checkpoint, you can download it like so:

In [None]:
ln.Artifact.get(key="testmodels/wandb/litautoencoder.ckpt").cache()

Or with the CLI:
```
lamin get artifact --key 'testmodels/wandb/litautoencoder.ckpt'
```
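
Because the artifact above was created from a checkpoint folder, the cached path presumably points to a directory rather than a single file. A minimal sketch for locating the checkpoint file inside it, assuming the file is named `last_epoch.ckpt` (the `filename` passed to `ModelCheckpoint` plus the `.ckpt` extension); `find_checkpoint` is a hypothetical helper, not a LaminDB or Lightning function:

```python
from pathlib import Path


# Hypothetical helper, not part of LaminDB or Lightning.
def find_checkpoint(cache_dir, filename: str = "last_epoch.ckpt") -> Path:
    """Return the checkpoint file inside a cached checkpoint folder."""
    cache_dir = Path(cache_dir)
    candidate = cache_dir / filename
    if candidate.is_file():
        return candidate
    # Fall back to any ".ckpt" file, e.g. a suffixed name like "last_epoch-v1.ckpt".
    matches = sorted(p for p in cache_dir.rglob("*.ckpt") if p.is_file())
    if not matches:
        raise FileNotFoundError(f"no .ckpt file found in {cache_dir}")
    return matches[0]
```

The returned path could then be handed to `LitAutoEncoder.load_from_checkpoint(...)`, provided the constructor arguments are either stored in the checkpoint or passed explicitly.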

In [None]:
ln.finish()