# Semantic Segmentation with convpaint and DINOv2

This notebook demonstrates how to run semantic segmentation on an image, using DINOv2 for feature extraction and a random forest for classification. It is based on the notebook provided by convpaint.


## Imports

In [45]:
%load_ext autoreload
%autoreload 2

import napari
import numpy as np
import skimage
from matplotlib import pyplot as plt
from dino_paint_utils import (train_dino_forest,
                              predict_dino_forest,
                              selfpredict_dino_forest)
                              



## Choose the model

1) Choose the **DINOv2 model** to be used (assign None to not use DINOv2):

|key|model|features|
|---|---|---|
|'s' | dinov2_vits14| 384|
|'b' | dinov2_vitb14| 768|
|'l' | dinov2_vitl14| 1024|
|'g' | dinov2_vitg14| 1536|

2) Choose the **layers of VGG16** to be attached as additional features (give a list of indices; only use Conv2d layers; assign None to not use VGG16):

|index|layer|
|---|---|
|**0**|**Conv2d(3, 64, kernel_size=3, stride=1, padding=1)**|
|1|ReLU(inplace=True)|
|**2**|**Conv2d(64, 64, kernel_size=3, stride=1, padding=1)**|
|3|ReLU(inplace=True)|
|4|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**5**|**Conv2d(64, 128, kernel_size=3, stride=1, padding=1)**|
|6|ReLU(inplace=True)|
|**7**|**Conv2d(128, 128, kernel_size=3, stride=1, padding=1)**|
|8|ReLU(inplace=True)|
|9|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**10**|**Conv2d(128, 256, kernel_size=3, stride=1, padding=1)**|
|11|ReLU(inplace=True)|
|**12**|**Conv2d(256, 256, kernel_size=3, stride=1, padding=1)**|
|13|ReLU(inplace=True)|
|**14**|**Conv2d(256, 256, kernel_size=3, stride=1, padding=1)**|
|15|ReLU(inplace=True)|
|16|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**17**|**Conv2d(256, 512, kernel_size=3, stride=1, padding=1)**|
|18|ReLU(inplace=True)|
|**19**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|20|ReLU(inplace=True)|
|**21**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|22|ReLU(inplace=True)|
|23|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**24**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|25|ReLU(inplace=True)|
|**26**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|27|ReLU(inplace=True)|
|**28**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|29|ReLU(inplace=True)|
|30|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|

3) Choose whether the **image itself** (3 RGB channels) should be added as features.
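
To get a feel for how the three choices above affect the size of the feature space, the following sketch adds up the feature counts from the two tables. It assumes that the selected feature sets are simply concatenated, so that their sizes add, and that each selected VGG16 layer contributes one feature per output channel. The helper `count_features` is hypothetical and not part of `dino_paint_utils`.

```python
# Feature dimensions copied from the DINOv2 table above.
DINOV2_FEATURES = {"s": 384, "b": 768, "l": 1024, "g": 1536}

# Output channels of the Conv2d layers in the VGG16 table above, by layer index.
VGG16_CONV_CHANNELS = {
    0: 64, 2: 64,
    5: 128, 7: 128,
    10: 256, 12: 256, 14: 256,
    17: 512, 19: 512, 21: 512,
    24: 512, 26: 512, 28: 512,
}


def count_features(dinov2_model=None, vgg16_layers=None, image_as_feature=False):
    """Hypothetical helper: total number of features for one configuration.

    Assumes the feature sets are concatenated, so their sizes add up.
    """
    total = 0
    if dinov2_model is not None:
        if dinov2_model not in DINOV2_FEATURES:
            raise ValueError(f"unknown DINOv2 key: {dinov2_model!r}")
        total += DINOV2_FEATURES[dinov2_model]
    if vgg16_layers is not None:
        # Only Conv2d layers are valid choices; reject ReLU/MaxPool2d indices.
        invalid = [i for i in vgg16_layers if i not in VGG16_CONV_CHANNELS]
        if invalid:
            raise ValueError(f"not Conv2d layers: {invalid}")
        total += sum(VGG16_CONV_CHANNELS[i] for i in vgg16_layers)
    if image_as_feature:
        total += 3  # the three RGB channels of the image itself
    return total


print(count_features("s"))                 # 384
print(count_features("s", [0, 12], True))  # 384 + 64 + 256 + 3 = 707
```

Passing a non-Conv2d index such as `1` raises a `ValueError`, which mirrors the "only use Conv2d layers" restriction stated above.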



In [46]:
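dinov2_model = 's'        # key from the DINOv2 table above, or None
vgg16 = None              # e.g. [0, 12], or all Conv2d layers: [0, 2, 5, 7, 10, 12, 14, 17, 19, 21, 24, 26, 28]
image_as_feature = False  # set to True to add the image channels as features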

## Train

Load an image and its annotation/labels to train the model on.

In [47]:
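# Load the cells3d sample volume and take channel 1 (nuclei) of z-slice 30.
image_to_train = skimage.data.cells3d()
image_to_train = image_to_train[30, 1]
# Load the matching sample annotation that ships with napari-convpaint.
from napari_convpaint.convpaint_sample import create_annotation_cell3d
labels_to_train = create_annotation_cell3d()[0][0]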
# crop = ((60,288), (0,178))
# crop = ((20,20+224), (0,224))
# image_to_train = image_to_train[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]
# labels_to_train = labels_to_train[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]

# LOAD ASTRONAUT IMAGE (RGB) AND ANNOTATION
# image_to_train = skimage.data.astronaut()#[0:504,0:504,:]
# labels_to_train = plt.imread('astro_labels_2.tif')[:,:,0]#[0:504,0:504]
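
Before training, it can help to check that the image and the annotation line up and that every class has some annotated pixels. The sketch below does this on a toy array. It assumes that label 0 means "not annotated" and that positive integers are class ids; the helper `summarize_annotation` is hypothetical and not part of convpaint or `dino_paint_utils`.

```python
import numpy as np


def summarize_annotation(image, labels):
    """Return {class_id: pixel_count} for the annotated pixels.

    Assumes label 0 marks unannotated pixels.
    """
    if image.shape[:2] != labels.shape[:2]:
        raise ValueError(
            f"image {image.shape[:2]} and labels {labels.shape[:2]} differ in size"
        )
    classes, counts = np.unique(labels[labels > 0], return_counts=True)
    return {int(c): int(n) for c, n in zip(classes, counts)}


# Toy data standing in for image_to_train / labels_to_train.
image = np.zeros((4, 6))
labels = np.zeros((4, 6), dtype=np.uint8)
labels[0, :3] = 1  # three pixels of class 1
labels[3, 2:] = 2  # four pixels of class 2

print(summarize_annotation(image, labels))  # {1: 3, 2: 4}
```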

Extract the features using DINOv2 and use them to train a random forest classifier.
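
The training idea can be sketched independently of DINOv2: take one feature vector per pixel, keep only the annotated pixels, and fit a random forest on them. The sketch below uses random numbers as a stand-in for real features and scikit-learn's `RandomForestClassifier`; the array layout (height, width, features), the rule that label 0 means "not annotated", and all variable names are assumptions for illustration, not what `train_dino_forest` necessarily does internally.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for extracted features: one vector of 8 values per pixel.
height, width, n_features = 16, 16, 8
features = rng.normal(size=(height, width, n_features)).astype(np.float32)
# Make the lower half of the image clearly different in feature 0.
features[height // 2:, :, 0] += 10.0

# Sparse annotation: class 1 on the top rows, class 2 on the bottom rows.
labels = np.zeros((height, width), dtype=np.uint8)
labels[:4, :] = 1
labels[-4:, :] = 2

# Keep only annotated pixels (assumed: label 0 = not annotated).
annotated = labels > 0
X_train = features[annotated]  # shape (n_annotated, n_features)
y_train = labels[annotated]    # shape (n_annotated,)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

print(X_train.shape)    # (128, 8)
print(forest.classes_)  # [1 2]
```

Only the annotated pixels enter the fit, which is why a few scribbles per class can be enough as long as the features separate the classes well.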

In [48]:
train = train_dino_forest(image_to_train, labels_to_train,
                          crop_to_patch=True, scale=2, upscale_order=1,
                          dinov2_model=dinov2_model, vgg16_layers=vgg16,
                          append_image_as_feature=image_as_feature,
                          show_napari=True)
random_forest, image_train, labels_train, features_space_train = train
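
The `crop_to_patch` and `scale` arguments are not explained in this notebook. One plausible reading, sketched below, is that the image is enlarged by `scale` and then trimmed so that each side is a multiple of the DINOv2 patch size of 14 pixels (the "14" in `dinov2_vits14`). Both the meaning of the arguments and the order of the two steps are assumptions here, and the helper functions are hypothetical.

```python
PATCH_SIZE = 14  # ViT patch size of the dinov2_vit*14 models, in pixels


def crop_shape_to_patch(height, width, patch_size=PATCH_SIZE):
    """Largest (height, width) within the input that is a multiple of patch_size."""
    return (height // patch_size) * patch_size, (width // patch_size) * patch_size


def patch_grid(height, width, scale=1, patch_size=PATCH_SIZE):
    """Patch rows and columns after scaling by `scale` and cropping (assumed order)."""
    cropped_h, cropped_w = crop_shape_to_patch(height * scale, width * scale, patch_size)
    return cropped_h // patch_size, cropped_w // patch_size


# A 256 x 256 slice, as obtained from skimage.data.cells3d()[30, 1]:
print(crop_shape_to_patch(256, 256))  # (252, 252)
print(patch_grid(256, 256, scale=1))  # (18, 18)
print(patch_grid(256, 256, scale=2))  # (36, 36)
```

Under these assumptions, `scale=2` doubles the number of patches along each axis, which gives a finer grid of feature vectors at the cost of more computation.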

## Predict

Load an image whose labels will be predicted with the model trained above.

In [49]:
image_to_pred = skimage.data.cells3d()
image_to_pred = image_to_pred[40, 1].T
# crop = ((20,248), (50,278))
# crop = ((20,20+224), (0,224))
# image_to_pred = image_to_pred[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]

# LOAD AN IMAGE TO PREDICT BASED ON THE CLASSIFIER TRAINED ON THE ASTRONAUT IMAGE
# image_to_pred = skimage.data.camera()
# image_to_pred = skimage.data.cat()
# image_to_pred = skimage.data.horse().astype(np.int32)
# image_to_pred = skimage.data.binary_blobs().astype(np.int32)
# image_to_pred = skimage.data.coins()

Extract the features and use them together with the trained classifier to predict the labels.

In [50]:
pred = predict_dino_forest(image_to_pred, random_forest,
                           crop_to_patch=True, scale=2, upscale_order=1,
                           dinov2_model=dinov2_model, vgg16_layers=vgg16,
                           append_image_as_feature=image_as_feature,
                           show_napari=True)
predicted_labels, image_pred, features_space_pred = pred
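
Since a ViT with patch size 14 yields one feature vector per 14 x 14 patch, a per-patch result has to be brought back to pixel resolution at some point. The sketch below shows the simplest variant, nearest-neighbour upscaling of a patch-level label map with NumPy. This is an illustration only: `upscale_order=1` in the call above suggests that `predict_dino_forest` uses a different interpolation order, and it may upscale the features rather than the labels. The helper `upscale_nearest` is hypothetical.

```python
import numpy as np


def upscale_nearest(patch_labels, patch_size=14):
    """Repeat each patch label patch_size times along both image axes."""
    rows = np.repeat(patch_labels, patch_size, axis=0)
    return np.repeat(rows, patch_size, axis=1)


# A 2 x 2 grid of patch-level labels.
patch_labels = np.array([[1, 2],
                         [2, 1]], dtype=np.uint8)
pixel_labels = upscale_nearest(patch_labels)

print(pixel_labels.shape)  # (28, 28)
```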

## Selfpredict

We can also train and predict on the same image in a single step, which extracts the features only once.

In [51]:
# self_pred_image = image_to_train
# self_pred_labels = labels_to_train

# self_pred = selfpredict_dino_forest(self_pred_image, self_pred_labels, crop_to_patch=True, scale=1, upscale_order=1, dinov2_model='s', append_image_as_feature=image_as_feature, show_napari=True)