# 🔥 LightlyTrain - Semantic Segmentation with DINOv3 EoMT 🔥

This notebook demonstrates how to use LightlyTrain for semantic segmentation with our state-of-the-art [EoMT](https://arxiv.org/abs/2503.19108) model built on [DINOv3](https://github.com/facebookresearch/dinov3) backbones, using our publicly released weights trained on the [COCO-Stuff](https://arxiv.org/abs/1612.03716) and [Cityscapes](https://www.cityscapes-dataset.com/) datasets.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lightly-ai/lightly-train/blob/main/examples/notebooks/eomt_semantic_segmentation.ipynb)

> **Important**: When running on Google Colab, make sure to select a GPU runtime for faster processing. To do so, go to `Runtime` > `Change runtime type` and select a GPU hardware accelerator.

## Installation

LightlyTrain can be installed directly via `pip`:

In [None]:
!pip install lightly-train

> **Important**: LightlyTrain is officially supported on:
> - Linux: CPU or CUDA
> - macOS: CPU only
> - Windows (experimental): CPU or CUDA
>
> We are planning to support MPS on macOS.
>
> Check the [installation instructions](https://docs.lightly.ai/train/stable/installation.html) for more details on installation.
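
To confirm that the install succeeded before moving on, the installed version can be queried from Python. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is illustrative and not part of LightlyTrain.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "lightly-train" is the distribution name used in the pip command above.
print("lightly-train:", installed_version("lightly-train") or "not installed")
```

If this prints `not installed` on Colab, restarting the runtime after the `pip install` cell may help.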

## Prediction using LightlyTrain's model weights

### Download the model weights

As an example, the command below downloads the ViT-S/16 weights trained on COCO-Stuff:

In [None]:
!wget -O lightlytrain_dinov3_eomt_vits16_cocostuff.pt https://lightly-train-checkpoints.s3.us-east-1.amazonaws.com/dinov3_eomt/lightlytrain_dinov3_eomt_vits16_cocostuff.pt

You can also check our [semantic segmentation docs](https://docs.lightly.ai/train/stable/semantic_segmentation.html) for all available model weights.

### Download an example image

After downloading the model weights, we download an example image for inference with the following command:

In [None]:
!wget -O cat.jpg https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg
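
On systems where `wget` is not available (for example, a default Windows setup), the two downloads above can be done from Python instead. This is a rough sketch using only the standard library; the helper name `download_if_missing` and the `User-Agent` value are assumptions made for illustration, not part of LightlyTrain.

```python
import shutil
from pathlib import Path
from urllib.request import Request, urlopen


def download_if_missing(url: str, dest: str) -> Path:
    """Download `url` to `dest`, skipping the transfer if `dest` already exists."""
    path = Path(dest)
    if path.exists():
        return path
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted transfer never
    # leaves a truncated file under the final name.
    tmp = path.with_name(path.name + ".part")
    # Some hosts reject requests that carry no explicit User-Agent header.
    request = Request(url, headers={"User-Agent": "lightly-train-notebook-example"})
    with urlopen(request) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(path)
    return path


# Usage with the URLs from the cells above:
# download_if_missing(
#     "https://lightly-train-checkpoints.s3.us-east-1.amazonaws.com/dinov3_eomt/lightlytrain_dinov3_eomt_vits16_cocostuff.pt",
#     "lightlytrain_dinov3_eomt_vits16_cocostuff.pt",
# )
# download_if_missing(
#     "https://upload.wikimedia.org/wikipedia/commons/3/3a/Cat03.jpg",
#     "cat.jpg",
# )
```

Because existing files are skipped, re-running the cell does not fetch the checkpoint again.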

### Load the model weights to LightlyTrain

Then, we load the model weights with LightlyTrain's `load_model_from_checkpoint` function:

In [None]:
import lightly_train

model = lightly_train.load_model_from_checkpoint(
    "lightlytrain_dinov3_eomt_vits16_cocostuff.pt",
)

### Predict the mask

Call the model's `predict` method on the example image to predict the segmentation mask:

In [None]:
masks = model.predict("cat.jpg")

### Visualize the image and mask

Finally, we overlay the predicted mask on the image to check that the prediction looks plausible:

In [None]:
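import matplotlib.pyplot as plt
import torch
from torchvision.io import read_image
from torchvision.utils import draw_segmentation_masks

# Load the image as a uint8 tensor of shape (C, H, W).
image = read_image("cat.jpg")
# Build one boolean mask per class ID present in the prediction.
# A separate name keeps `masks` intact, so this cell can be re-run safely.
class_masks = torch.stack([masks == class_id for class_id in masks.unique()])
image_with_masks = draw_segmentation_masks(image, class_masks, alpha=0.6)
# matplotlib expects (H, W, C), so move the channel dimension last.
plt.imshow(image_with_masks.permute(1, 2, 0))
plt.show()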
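
The `torch.stack` line in the cell above converts the predicted class-ID map into one boolean mask per class, which is the input format `draw_segmentation_masks` accepts. Here is a minimal sketch of that step on a tiny hand-made tensor. It assumes, as the visualization code implies, that `predict` returns a 2D integer tensor of per-pixel class IDs; the helper name `split_into_class_masks` and the toy values are illustrative and not part of LightlyTrain.

```python
import torch


def split_into_class_masks(label_map: torch.Tensor):
    """Split an (H, W) class-ID map into a (num_classes_present, H, W) bool stack."""
    # torch.unique returns the distinct class IDs in ascending order.
    class_ids = label_map.unique()
    # Comparing against each ID yields one boolean (H, W) mask per class.
    class_masks = torch.stack([label_map == class_id for class_id in class_ids])
    return class_ids, class_masks


# Toy 2x3 "prediction" containing three hypothetical class IDs.
toy_label_map = torch.tensor([[0, 0, 2], [2, 5, 5]])
toy_class_ids, toy_class_masks = split_into_class_masks(toy_label_map)

print(toy_class_ids.tolist())        # distinct class IDs present
print(tuple(toy_class_masks.shape))  # (number of classes present, H, W)
```

Each pixel belongs to exactly one of the resulting masks, so `toy_class_masks[i].float().mean()` gives the fraction of the image covered by class `toy_class_ids[i]`.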

Congrats! As the example above shows, running inference with LightlyTrain's EoMT model weights takes only a few lines of code.