# Semantic Segmentation with convpaint and DINOv2

This notebook demonstrates how to run semantic segmentation on an image, using DINOv2 for feature extraction and a random forest for classification. It is based on the notebook provided by convpaint.


## Imports

In [None]:
%load_ext autoreload
%autoreload 2

import napari
import numpy as np
import skimage
from matplotlib import pyplot as plt
from dino_paint_utils import (train_dino_forest,
                              predict_dino_forest,
                              selfpredict_dino_forest,
                              test_dino_forest)

## Choose the model

1) Choose the **DINOv2 model** to be used (assign None to not use DINOv2):

|key | model| features|
|---|---|---|
|'s' | dinov2_vits14| 384|
|'b' | dinov2_vitb14| 768|
|'l' | dinov2_vitl14| 1024|
|'g' | dinov2_vitg14| 1536|
|+ '_r' | *base_model*_reg (not supported yet)| add registers|

2) Choose the **layers of DINOv2** to use as features (give a list of indices 0-11); each layer contributes the number of features listed for the model in the table above.

3) Choose the **layers of VGG16** to be attached as additional features (give a list of indices; only use Conv2d layers, shown in bold; assign None to not use VGG16):

|index|layer|
|---|---|
|**0**|**Conv2d(3, 64, kernel_size=3, stride=1, padding=1)**|
|1|ReLU(inplace=True)|
|**2**|**Conv2d(64, 64, kernel_size=3, stride=1, padding=1)**|
|3|ReLU(inplace=True)|
|4|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**5**|**Conv2d(64, 128, kernel_size=3, stride=1, padding=1)**|
|6|ReLU(inplace=True)|
|**7**|**Conv2d(128, 128, kernel_size=3, stride=1, padding=1)**|
|8|ReLU(inplace=True)|
|9|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**10**|**Conv2d(128, 256, kernel_size=3, stride=1, padding=1)**|
|11|ReLU(inplace=True)|
|**12**|**Conv2d(256, 256, kernel_size=3, stride=1, padding=1)**|
|13|ReLU(inplace=True)|
|**14**|**Conv2d(256, 256, kernel_size=3, stride=1, padding=1)**|
|15|ReLU(inplace=True)|
|16|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**17**|**Conv2d(256, 512, kernel_size=3, stride=1, padding=1)**|
|18|ReLU(inplace=True)|
|**19**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|20|ReLU(inplace=True)|
|**21**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|22|ReLU(inplace=True)|
|23|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|
|**24**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|25|ReLU(inplace=True)|
|**26**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|27|ReLU(inplace=True)|
|**28**|**Conv2d(512, 512, kernel_size=3, stride=1, padding=1)**|
|29|ReLU(inplace=True)|
|30|MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)|

4) Choose whether the **image itself** (3 RGB channels) should be added as features.

5) Choose the **scale factor** to use.
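
The selectable VGG16 indices in the table can be derived rather than memorized. The sketch below rebuilds the layer sequence from the standard VGG16 configuration (each convolution is followed by a ReLU, and `'M'` marks a max-pooling layer) and then estimates how many features per pixel a given choice yields. The names `VGG16_CFG`, `DINOV2_DIMS`, `vgg16_conv_layers` and `count_features` are hypothetical helpers, not part of `dino_paint_utils`, and the count assumes that the selected feature maps are simply concatenated.

```python
# Standard VGG16 layout: numbers are Conv2d output channels, 'M' is MaxPool2d.
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

# Feature dimension per DINOv2 model key, as listed in the table above.
DINOV2_DIMS = {'s': 384, 'b': 768, 'l': 1024, 'g': 1536}


def vgg16_conv_layers(cfg=VGG16_CFG):
    """Map the index of each Conv2d layer to its number of output channels."""
    conv_channels = {}
    index = 0
    for entry in cfg:
        if entry == 'M':
            index += 1  # MaxPool2d occupies one index
        else:
            conv_channels[index] = entry
            index += 2  # Conv2d plus the ReLU that follows it
    return conv_channels


def count_features(dinov2_model=None, dinov2_layers=(), vgg16_layers=None,
                   image_as_feature=False):
    """Estimate features per pixel, assuming plain concatenation (assumption)."""
    total = 0
    if dinov2_model is not None:
        total += DINOV2_DIMS[dinov2_model] * len(dinov2_layers)
    if vgg16_layers is not None:
        conv_channels = vgg16_conv_layers()
        total += sum(conv_channels[i] for i in vgg16_layers)
    if image_as_feature:
        total += 3  # RGB channels
    return total


print(sorted(vgg16_conv_layers()))
print(count_features(vgg16_layers=[2]))
```

Under these assumptions, selecting only VGG16 layer 2 without DINOv2 would give 64 features per pixel.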



In [63]:
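dinov2_model = None  # one of 's', 'b', 'l', 'g'; None disables DINOv2
dinov2_layers = [3]  # list of layer indices (0-11)
vgg16 = [2]  # Conv2d indices; all of them: [0,2,5,7,10,12,14,17,19,21,24,26,28]
image_as_feature = False
scale = 1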

## Train

Load an image and its annotation/labels to train the model on.

In [None]:
# image_to_train = skimage.data.cells3d()
# image_to_train = image_to_train[30, 1]
# from napari_convpaint.convpaint_sample import create_annotation_cell3d
# labels_to_train = create_annotation_cell3d()[0][0]
# image_to_train = image_to_train[:, :126]
# labels_to_train = labels_to_train[:, :126]

# crop = ((60,288), (0,178))
# crop = ((20,20+224), (0,224))
# image_to_train = image_to_train[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]
# labels_to_train = labels_to_train[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]

# LOAD ASTRONAUT IMAGE (RGB) AND ANNOTATION
image_to_train = skimage.data.astronaut()#[0:504,0:504,:]
labels_to_train = plt.imread('images_and_labels/astro_labels_2.tif')[:,:,0]#[0:504,0:504]

# LOAD HARDER CELL IMAGE AND ITS LABELS
# image_to_train = plt.imread('images_and_labels/00_00016.tiff')
# labels_to_train = plt.imread('images_and_labels/00_00016_labels.tiff')[:,:,0]

Extract the features using DINOv2 and/or VGG16 and use them to train a random forest classifier.
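
Only annotated pixels can serve as training samples. As a minimal sketch of that step, assuming the features form an array of shape `(height, width, n_features)` and that label `0` means "not annotated" (both are assumptions about the internals of `train_dino_forest`), the training table could be assembled like this. `build_training_table` is a hypothetical helper.

```python
import numpy as np


def build_training_table(features, labels):
    """Return (X, y) for annotated pixels only; label 0 is treated as unlabeled."""
    if features.shape[:2] != labels.shape:
        raise ValueError("features and labels must share height and width")
    annotated = labels > 0
    X = features[annotated]  # shape: (n_annotated_pixels, n_features)
    y = labels[annotated]    # shape: (n_annotated_pixels,)
    return X, y


# Toy example: a 4x4 image with 2 features per pixel and three annotated pixels.
features = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)
labels = np.zeros((4, 4), dtype=int)
labels[0, 0] = 1
labels[1, 2] = 2
labels[3, 3] = 2

X, y = build_training_table(features, labels)
print(X.shape, y)
```

Arrays in this form are what scikit-learn's `RandomForestClassifier.fit(X, y)` expects.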

In [64]:
train = train_dino_forest(image_to_train, labels_to_train,
                          crop_to_patch=True, scale=scale, upscale_order=1,
                          dinov2_model=dinov2_model, dinov2_layers=dinov2_layers, vgg16_layers=vgg16, append_image_as_feature=image_as_feature,
                          show_napari=True)
random_forest, image_train, labels_train, features_space_train = train

using VGG16 layers ['features.2 Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))']
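
The DINOv2 models listed above operate on 14x14 pixel patches (the `14` in `dinov2_vits14`), so the image height and width need to be multiples of 14. A plausible reading of `crop_to_patch=True` is that the image is cropped accordingly; the sketch below illustrates that idea and is not the actual implementation. `crop_to_multiple` is a hypothetical helper.

```python
import numpy as np

PATCH_SIZE = 14  # patch size of the dinov2_vit*14 models


def crop_to_multiple(image, patch_size=PATCH_SIZE):
    """Crop height and width (first two axes) down to multiples of patch_size."""
    height, width = image.shape[:2]
    new_height = (height // patch_size) * patch_size
    new_width = (width // patch_size) * patch_size
    return image[:new_height, :new_width]


# The astronaut image is 512x512 RGB; a zero array of that shape stands in for it.
image = np.zeros((512, 512, 3), dtype=np.uint8)
cropped = crop_to_multiple(image)
print(cropped.shape)  # (504, 504, 3)
```

This matches the commented-out `[0:504,0:504,:]` slices in the loading cell.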


## Predict

Load an image whose labels should be predicted with the model trained above.

In [None]:
# image_to_pred = skimage.data.cells3d()
# image_to_pred = image_to_pred[40, 1][:,125:251]
# ground_truth = plt.imread('images_and_labels/cells_cross_ground_truth.tif')[:,:,0]

# crop = ((20,248), (50,278))
# crop = ((20,20+224), (0,224))
# image_to_pred = image_to_pred[crop[0][0]:crop[0][1], crop[1][0]:crop[1][1]]

# LOAD AN IMAGE TO PREDICT BASED ON THE CLASSIFIER TRAINED ON THE ASTRONAUT IMAGE
image_to_pred = skimage.data.camera()
ground_truth = plt.imread('images_and_labels/cam_ground_truth.tif')[:,:,0]

# image_to_pred = skimage.data.cat()
# image_to_pred = skimage.data.horse().astype(np.int32)
# image_to_pred = skimage.data.binary_blobs().astype(np.int32)
# image_to_pred = skimage.data.coins()
# ground_truth = None


Extract the features and use them together with the trained classifier to predict the labels.

In [None]:
pred = predict_dino_forest(image_to_pred, random_forest, ground_truth=ground_truth,
                           crop_to_patch=True, scale=scale, upscale_order=1,
                           dinov2_model=dinov2_model, dinov2_layers=dinov2_layers, vgg16_layers=vgg16, append_image_as_feature=image_as_feature,
                           show_napari=True)
predicted_labels, image_pred, features_space_pred, acc = pred
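
When a ground truth is given, the returned `acc` can be used to judge the prediction. Assuming it is a plain pixel-wise accuracy (an assumption; the exact definition lives in `dino_paint_utils`), it could be computed as follows. `pixel_accuracy` is a hypothetical helper.

```python
import numpy as np


def pixel_accuracy(predicted, ground_truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    predicted = np.asarray(predicted)
    ground_truth = np.asarray(ground_truth)
    if predicted.shape != ground_truth.shape:
        raise ValueError("prediction and ground truth must have the same shape")
    return float(np.mean(predicted == ground_truth))


# Toy example: 5 of the 6 pixels agree.
predicted = np.array([[1, 1, 2],
                      [2, 2, 1]])
truth = np.array([[1, 2, 2],
                  [2, 2, 1]])
print(pixel_accuracy(predicted, truth))
```

If the image was cropped or rescaled before prediction, the ground truth has to be brought to the same shape first.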

## Selfpredict

We can also train and predict on the same image in a single step, which extracts the features only once.

In [None]:
# self_pred_image = skimage.data.astronaut()#[0:504,0:504,:]
# self_pred_labels = plt.imread('images_and_labels/astro_labels_2.tif')[:,:,0]#[0:504,0:504]
# ground_truth = plt.imread('images_and_labels/astro_ground_truth.tif')[:,:,0]
self_pred_image = image_to_train
self_pred_labels = labels_to_train
ground_truth = None

self_pred = selfpredict_dino_forest(self_pred_image, self_pred_labels, ground_truth,
                                    crop_to_patch=True, scale=scale, upscale_order=1,
                                    dinov2_model=dinov2_model, dinov2_layers=dinov2_layers, vgg16_layers=vgg16, append_image_as_feature=image_as_feature,
                                    show_napari=True)
predicted_labels, image_scaled, labels_scaled, feature_space, acc = self_pred

## Tests against ground truth

In [None]:
image_to_train = skimage.data.astronaut()#[0:504,0:504,:]
labels_to_train = plt.imread('images_and_labels/astro_labels_2.tif')[:,:,0]#[0:504,0:504]
ground_truth = plt.imread('images_and_labels/astro_ground_truth.tif')[:,:,0]
image_to_pred = None

# viewer = napari.Viewer()
# viewer.add_image(image_to_train)
# viewer.add_labels(labels_to_train)
# viewer_2 = napari.Viewer()
# viewer_2.add_image(image_to_pred)
# viewer_2.add_labels(ground_truth)

all_vggs = [0,2,5,7,10,12,14,17,19,21,24,26,28]
single_vggs = [[i] for i in all_vggs]
consecutive_vggs = [all_vggs[:s] for s in range(1,len(all_vggs))]
dual_vggs = [[all_vggs[i], all_vggs[j]] for i in range(len(all_vggs)) for j in range(i+1, len(all_vggs))]

dinos = [None, 's']#, 'b']
vggs = [None, [24,26,28], all_vggs]#[0], [10], [17], [24], [0, 10, 17, 24]]#
im_feats = [False]#, True]
scales = [1]#, 2]

In [None]:
test = test_dino_forest(image_to_train, labels_to_train, ground_truth, image_to_pred,
                        scales=scales, dinos=dinos, vggs=vggs, im_feats=im_feats,
                        print_avg=True, print_max=True)
accs, avg_accs, max_acc = test
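
The four lists presumably define a grid in which every combination is trained and evaluated. The sketch below only enumerates such a grid with `itertools.product`, using the values from the cell above, and flags combinations that select no feature source at all. How `test_dino_forest` actually iterates and how it handles that case is an assumption; `parameter_grid` is a hypothetical helper.

```python
from itertools import product

all_vggs = [0, 2, 5, 7, 10, 12, 14, 17, 19, 21, 24, 26, 28]
scales = [1]
dinos = [None, 's']
vggs = [None, [24, 26, 28], all_vggs]
im_feats = [False]


def parameter_grid(scales, dinos, vggs, im_feats):
    """Yield one dict per combination of the four parameter lists."""
    for scale, dino, vgg, im_feat in product(scales, dinos, vggs, im_feats):
        yield {
            'scale': scale,
            'dinov2_model': dino,
            'vgg16_layers': vgg,
            'image_as_feature': im_feat,
            # True if neither DINOv2, VGG16 nor the image would provide features
            'no_features': dino is None and vgg is None and not im_feat,
        }


grid = list(parameter_grid(scales, dinos, vggs, im_feats))
print(len(grid))  # 6
for combo in grid:
    if combo['no_features']:
        print('no feature source selected:', combo)
```

With these lists the grid has 1 x 2 x 3 x 1 = 6 combinations, one of which (no DINOv2, no VGG16, no image channels) selects no features.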