[![](https://img.shields.io/badge/Source%20on%20GitHub-orange)](https://github.com/laminlabs/lamin-mlops/blob/main/docs/mlflow.ipynb)

# MLflow

We show how LaminDB can be integrated with [MLflow](https://mlflow.org/) to track the training process and associate datasets & parameters with models.

In [None]:
# !pip install 'lamindb[jupyter]' torchvision lightning mlflow
!lamin init --storage ./lamin-mlops

In [None]:
import lamindb as ln
import mlflow
import lightning

from torch import utils
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
from autoencoder import LitAutoEncoder

````{dropdown} Tracking models in both LaminDB and MLflow
```{note}
It is not always necessary to track all model parameters and metrics in both LaminDB and MLflow.
However, if artifacts or runs should be queryable by specific model attributes, such as the learning rate, then these attributes should be tracked in LaminDB as well.
Below, we show how to do this for the batch size and learning rate; the approach generalizes to other features.
```
````

In [None]:
# define model run parameters & features
MODEL_CONFIG = {"batch_size": 32, "lr": 0.001}

# TODO redo this with Feature.from_dict
hyperparameter = ln.Feature(name="Autoencoder hyperparameter", is_type=True).save()
for param_name, param_value in MODEL_CONFIG.items():
    ln.Feature(
        name=param_name, dtype=type(param_value).__name__, type=hyperparameter
    ).save()

ln.track(params=MODEL_CONFIG)
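
As a quick check of the naming used above, the following sketch shows the dtype strings that `type(param_value).__name__` produces for this configuration. It uses only the standard library; `feature_dtypes` is an illustrative name, and whether LaminDB accepts every Python type name as a `dtype` is not verified here.

```python
# Same configuration as in the cell above
MODEL_CONFIG = {"batch_size": 32, "lr": 0.001}

# Map each hyperparameter name to the Python type name of its value,
# i.e. the string that is passed as `dtype` when defining the features
feature_dtypes = {
    name: type(value).__name__ for name, value in MODEL_CONFIG.items()
}

print(feature_dtypes)  # {'batch_size': 'int', 'lr': 'float'}
```

Other value types would yield names such as `str` or `bool` in the same way.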

## Define a model

We use a basic PyTorch Lightning autoencoder as an example model.

````{dropdown} Code of LitAutoEncoder
```{eval-rst}
.. literalinclude:: autoencoder.py
   :language: python
   :caption: Simple autoencoder model
```
````

## Query & download the MNIST dataset

We saved the MNIST dataset in the [curation notebook](/mnist); it now shows up in the Artifact registry:

In [None]:
ln.Artifact.filter(kind="dataset").df()

On LaminHub it looks like this:

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/LlMSvBjHuXbs36TBGoCM.png" alt="instance view" width="800px">

Let's get the dataset:

In [None]:
artifact = ln.Artifact.get(key="testdata/mnist")
artifact

And download it to a local cache:

In [None]:
path = artifact.cache()
path

Create a PyTorch-compatible dataset:

In [None]:
dataset = MNIST(path.as_posix(), transform=ToTensor())
dataset

## Monitor training with MLflow

Train our example model and track the training progress with `MLflow`.

In [None]:
# enable MLflow PyTorch autologging
mlflow.pytorch.autolog()

In [None]:
with mlflow.start_run() as mlflow_run:
    train_dataset = MNIST(
        root="./data", train=True, download=True, transform=ToTensor()
    )
    # use the tracked batch size so the logged parameter matches the run
    train_loader = utils.data.DataLoader(
        train_dataset, batch_size=MODEL_CONFIG["batch_size"]
    )

    # Initialize model
    autoencoder = LitAutoEncoder(32, 16)

    # Create checkpoint callback
    from lightning.pytorch.callbacks import ModelCheckpoint

    checkpoint_callback = ModelCheckpoint(
        dirpath="model_checkpoints",
        filename=f"{mlflow_run.info.run_id}_last_epoch",
        save_top_k=1,
        monitor="train_loss",
    )

    # Train model
    trainer = lightning.Trainer(
        accelerator="cpu",
        limit_train_batches=3,
        max_epochs=2,
        callbacks=[checkpoint_callback],
    )
    trainer.fit(model=autoencoder, train_dataloaders=train_loader)

    # Get run information
    run_id = mlflow_run.info.run_id
    ln.context.run.reference = run_id

    # save model summary artifact
    local_model_summary_path = (
        f"{mlflow_run.info.artifact_uri.removeprefix('file://')}/model_summary.txt"
    )

    mlflow_model_summary_af = ln.Artifact(
        local_model_summary_path,
        key="testmodels/mlflow/model_summary.txt",
        kind="model",
    ).save()

    # save checkpoint as a model
    mlflow_model_ckpt_af = ln.Artifact(
        f"model_checkpoints/{run_id}_last_epoch.ckpt",
        key="testmodels/mlflow/litautoencoder.ckpt",
        kind="model",
    ).save()
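
The summary path in the cell above is derived by stripping the `file://` prefix from `artifact_uri`. A slightly more defensive variant, sketched below with the standard library only, parses the URI and rejects non-file schemes. `artifact_uri_to_local_path` is a hypothetical helper, not part of MLflow or LaminDB, and the sketch assumes a POSIX-style local tracking directory.

```python
from urllib.parse import unquote, urlparse


def artifact_uri_to_local_path(artifact_uri: str) -> str:
    """Convert a local artifact URI to a filesystem path (hypothetical helper)."""
    parsed = urlparse(artifact_uri)
    # Accept plain paths (no scheme) and file:// URIs; anything else is remote
    if parsed.scheme not in ("", "file"):
        raise ValueError(f"not a local artifact location: {artifact_uri}")
    # Decode percent-escapes such as %20 in directory names
    return unquote(parsed.path)


# Illustrative URI; a real one comes from mlflow_run.info.artifact_uri
example_uri = "file:///tmp/mlruns/0/abc123/artifacts"
summary_path = f"{artifact_uri_to_local_path(example_uri)}/model_summary.txt"
print(summary_path)
```

Raising on other schemes makes it explicit when a run logs to a remote artifact store, in which case the file would first have to be downloaded.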

**See the training progress in the `mlflow` UI:**

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/C0seowxsq4Du2B4T0000.png" alt="MLFlow training UI" width="800px">

**See the checkpoints:**

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/n0xxFoMRtZPiQ7VT0001.png" alt="MLFlow checkpoints UI" width="800px">

To reuse the checkpoint later, retrieve it via:

In [None]:
ln.Artifact.get(key="testmodels/mlflow/litautoencoder.ckpt").cache()

Or on the CLI:

```
lamin get artifact --key 'testmodels/mlflow/litautoencoder.ckpt'
```
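
Because artifacts are looked up by their exact key, it can help to build keys in one place so that the saving and the retrieving code cannot drift apart. The following is a minimal sketch; `MODEL_KEY_PREFIX` and `model_key` are illustrative names rather than LaminDB API.

```python
import posixpath

# Key prefix used for the model artifacts in this notebook
MODEL_KEY_PREFIX = "testmodels/mlflow"


def model_key(filename: str) -> str:
    """Build an artifact key under the shared prefix (illustrative helper)."""
    return posixpath.join(MODEL_KEY_PREFIX, filename)


checkpoint_key = model_key("litautoencoder.ckpt")
print(checkpoint_key)  # testmodels/mlflow/litautoencoder.ckpt
```

The resulting string can be passed as `key=` when saving the artifact and again when retrieving it.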

In [None]:
ln.finish()

In [None]:
!rm -rf ./lamin-mlops
!lamin delete --force lamin-mlops