<a href="https://colab.research.google.com/github/Koushouu/Bioimage-Analysis-Workshop-Taipei/blob/main/Cellpose_2_0_in_colab_part_2_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Running cellpose 2.0 in colab - part 2

In this notebook, we will train a custom model tailored to our own images.

If your notebook is displayed in Chinese, open the "説明" tab above and select "查看英文版本".

There are three sections in this notebook:

* Section 1 - Setup: we will install cellpose 2.0 and OpenCV in our runtime and initialize the Colab cloud GPU, just as we did in the part 1 notebook.

* Section 2 - Train model on manual annotations: we will train a model on our manually annotated images.

* Section 3 - Validation: we will evaluate the model's performance by comparing its predictions with our manual annotations.

The content of this notebook is mostly modified from: https://colab.research.google.com/github/MouseLand/cellpose/blob/main/notebooks/run_cellpose_2.ipynb

# Setup
As in part 1, we will first install cellpose 2.0, check that the GPU is working, and mount Google Drive to access your models and images.

In [None]:
# Install Cellpose and OpenCV
!pip install "opencv-python-headless<4.3"
!pip install cellpose

# Check GPU status
!nvcc --version
!nvidia-smi

import os, shutil
import numpy as np
import matplotlib.pyplot as plt
from cellpose import core, utils, io, models, metrics
from glob import glob

use_GPU = core.use_gpu()
yn = ['NO', 'YES']
print(f'>>> GPU activated? {yn[use_GPU]}')

print("Connect Google Drive to Colab")
from google.colab import drive
drive.mount('/content/gdrive')

# Train model on manual annotations

Fill out the form below with the paths to your data and the parameters to start training.

## Training parameters

<font size = 4> **Paths for training, predictions and results**


<font size = 4>**`train_dir`, `test_dir`:** Paths to the training folder (training images and their masks) and the test folder (test images and their masks). You can leave `test_dir` blank, but having some test images is recommended so you can check the model's performance. To find a folder's path, open **Files** on the left of the notebook, navigate to the folder, right-click it, choose **Copy path**, and paste the path into the corresponding box below.

<font size = 4>**`initial_model`:** Choose a model from the cellpose [model zoo](https://cellpose.readthedocs.io/en/latest/models.html#model-zoo) to start from.

<font size = 4>**`model_name`:** Enter a name for your trained model. The model will be saved under this name in a `models` folder inside `train_dir`.

<font size = 4>**Training parameters**

<font size = 4>**`number_of_epochs`:** Enter the number of epochs (rounds) for which the network will be trained. At least 100 epochs are recommended, but 250 epochs are sometimes necessary, particularly when training from scratch. **Default value: 100**
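
Before filling in `train_dir` and `test_dir`, it can help to confirm that every image in a folder has a matching annotation file. The sketch below is one way to do that, not part of cellpose: `find_unpaired_images` is a hypothetical helper, and it assumes that the folder holds only raw images plus cellpose `_seg.npy` files named `<image name>_seg.npy`, and that the images use one of the listed extensions.

```python
import os

def find_unpaired_images(folder, mask_suffix="_seg.npy",
                         image_exts=(".tif", ".tiff", ".png", ".jpg", ".jpeg")):
    """Return the image files in `folder` that have no matching mask file.

    Hypothetical helper: assumes masks are named <image stem> + mask_suffix.
    """
    names = sorted(os.listdir(folder))
    present = set(names)
    missing = []
    for name in names:
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue  # skip mask files and anything that is not an image
        if stem + mask_suffix not in present:
            missing.append(name)
    return missing
```

If `find_unpaired_images(train_dir)` returns an empty list, every image in the folder has an annotation; any names it returns still need to be annotated or removed.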

In [None]:
#@markdown ###Path to images and masks:

train_dir = "/content/gdrive/MyDrive/Colab Notebooks/train" #@param {type:"string"}
test_dir = "/content/gdrive/MyDrive/Colab Notebooks/test" #@param {type:"string"}
#Define where the patch file will be saved
base = "/content"

# model name and path
#@markdown ###Name of the pretrained model to start from and new model name:
from cellpose import models
initial_model = "cyto2" #@param ['cyto','nuclei','tissuenet','livecell','cyto2','CP','CPx','TN1','TN2','TN3','LC1','LC2','LC3','LC4','scratch']
model_name = "demo_cyto2" #@param {type:"string"}

# other parameters for training.
#@markdown ###Training Parameters:
#@markdown Number of epochs:
n_epochs = 100 #@param {type:"number"}

Channel_to_use_for_training = "Grayscale" #@param ["Grayscale", "Blue", "Green", "Red"]

# @markdown ###If you have a secondary channel that can be used for training, for instance nuclei, choose it here:

Second_training_channel= "None" #@param ["None", "Blue", "Green", "Red"]


#@markdown ###Advanced Parameters

Use_Default_Advanced_Parameters = True #@param {type:"boolean"}
#@markdown ###If not, please input:
learning_rate = 0.1 #@param {type:"number"}
weight_decay = 0.0001 #@param {type:"number"}

if (Use_Default_Advanced_Parameters): 
  print("Default advanced parameters enabled")
  learning_rate = 0.1 
  weight_decay = 0.0001
  
# Check whether a model with the same name already exists; if so, warn that it will be overwritten
model_path = os.path.join(train_dir, 'models')
if os.path.exists(os.path.join(model_path, model_name)):
  print("!! WARNING: "+model_name+" already exists and will be overwritten in the following cell !!")
  
if len(test_dir) == 0:
  test_dir = None

# Here we match the channel to number
if Channel_to_use_for_training == "Grayscale":
  chan = 0
elif Channel_to_use_for_training == "Blue":
  chan = 3
elif Channel_to_use_for_training == "Green":
  chan = 2
elif Channel_to_use_for_training == "Red":
  chan = 1


if Second_training_channel == "Blue":
  chan2 = 3
elif Second_training_channel == "Green":
  chan2 = 2
elif Second_training_channel == "Red":
  chan2 = 1
elif Second_training_channel == "None":
  chan2 = 0

if initial_model=='scratch':
  initial_model = 'None'

Default advanced parameters enabled
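
The form above converts the channel names into the integer codes that cellpose expects (0 = grayscale or no second channel, 1 = red, 2 = green, 3 = blue). As a compact illustration of the same mapping, the sketch below uses a dictionary lookup instead of the `if`/`elif` chains; `CHANNEL_CODES` and `to_channels` are hypothetical names introduced here, not part of cellpose.

```python
# Channel codes as used in the form above:
# 0 = grayscale (or "None" for the second channel), 1 = red, 2 = green, 3 = blue
CHANNEL_CODES = {"Grayscale": 0, "None": 0, "Red": 1, "Green": 2, "Blue": 3}

def to_channels(primary, secondary="None"):
    """Return [chan, chan2] for the given channel names (hypothetical helper)."""
    return [CHANNEL_CODES[primary], CHANNEL_CODES[secondary]]

# Example: cytoplasm in green with nuclei in blue as the second channel
example_channels = to_channels("Green", "Blue")
```

A misspelled channel name raises a `KeyError` here, whereas the `if`/`elif` chain would silently leave `chan` undefined.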


## Train new model

Train the model in this notebook using the settings from the form above.

In [None]:
# start logger (to see training across epochs)
logger = io.logger_setup()

# DEFINE CELLPOSE MODEL (without size model)
model = models.CellposeModel(gpu=use_GPU, model_type=initial_model)

# set channels
channels = [chan, chan2]

# get files
output = io.load_train_test_data(train_dir, test_dir, mask_filter='_seg.npy')
train_data, train_labels, _, test_data, test_labels, _ = output

new_model_path = model.train(train_data, train_labels, 
                              test_data=test_data,
                              test_labels=test_labels,
                              channels=channels, 
                              save_path=train_dir, 
                              n_epochs=n_epochs,
                              learning_rate=learning_rate, 
                              weight_decay=weight_decay, 
                              nimg_per_epoch=8,
                              model_name=model_name)

# diameter of labels in training images
diam_labels = model.diam_labels.copy()

# Validation

Evaluate the model on the test data to check its performance.

In [None]:
# get files (during training, test_data is transformed so we will load it again)
output = io.load_train_test_data(test_dir, mask_filter='_seg.npy')
test_data, test_labels = output[:2]

# run model on test images
masks = model.eval(test_data, 
                   channels=[chan, chan2],
                   diameter=diam_labels)[0]

# check performance using ground truth labels
ap = metrics.average_precision(test_labels, masks)[0]
print('')
print(f'>>> average precision at iou threshold 0.5 = {ap[:,0].mean():.3f}')
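
To make the score above easier to interpret, here is a simplified NumPy sketch of the idea behind it. It assumes the definition AP = TP / (TP + FP + FN), where a predicted cell counts as a true positive when its intersection over union (IoU) with a ground-truth cell exceeds the threshold. The functions `iou_matrix` and `simple_average_precision` are hypothetical illustrations, not the code in `cellpose.metrics`, whose matching procedure may differ in detail.

```python
import numpy as np

def iou_matrix(true_mask, pred_mask):
    """IoU between every true label (rows) and predicted label (columns); 0 is background."""
    true_ids = np.unique(true_mask)
    true_ids = true_ids[true_ids > 0]
    pred_ids = np.unique(pred_mask)
    pred_ids = pred_ids[pred_ids > 0]
    iou = np.zeros((len(true_ids), len(pred_ids)))
    for i, t in enumerate(true_ids):
        t_pix = true_mask == t
        for j, p in enumerate(pred_ids):
            p_pix = pred_mask == p
            inter = np.logical_and(t_pix, p_pix).sum()
            union = np.logical_or(t_pix, p_pix).sum()
            iou[i, j] = inter / union
    return iou

def simple_average_precision(true_mask, pred_mask, threshold=0.5):
    """AP = TP / (TP + FP + FN) at a single IoU threshold.

    With a threshold of 0.5 or more and a strict comparison, each cell can
    match at most one partner, so counting pairs above the threshold is enough.
    """
    iou = iou_matrix(true_mask, pred_mask)
    n_true, n_pred = iou.shape
    tp = int((iou > threshold).sum())
    fp = n_pred - tp   # predicted cells with no match
    fn = n_true - tp   # true cells that were missed
    denom = tp + fp + fn
    return tp / denom if denom > 0 else 0.0

# Toy example: two true cells; one is predicted exactly, the other only partly
true_mask = np.zeros((4, 4), dtype=int)
true_mask[:2, :2] = 1
true_mask[2:, 2:] = 2
pred_mask = np.zeros((4, 4), dtype=int)
pred_mask[:2, :2] = 1
pred_mask[3, 3] = 2
toy_ap = simple_average_precision(true_mask, pred_mask)
```

In the toy example one cell is matched, one prediction is too small to count (IoU 0.25), and one true cell is therefore missed, so the score is 1 / (1 + 1 + 1).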


Plot results

In [None]:
plt.figure(figsize=(15,15))
for k, img in enumerate(test_data):
    # Show the original test image
    plt.subplot(3,len(test_data), k+1)
    plt.imshow(img)
    plt.axis('off')
    if k==0:
        plt.title('image')
    # Show the segmented result
    plt.subplot(3,len(test_data), len(test_data) + k+1)
    plt.imshow(test_data[k], cmap = 'gray')
    plt.imshow(np.ma.array(masks[k], mask=masks[k]==0), cmap = 'prism', alpha = 0.4)
    plt.axis('off')
    if k==0:
        plt.title('predicted labels')
    # Show ground truth
    plt.subplot(3,len(test_data), 2*len(test_data) + k+1)
    plt.imshow(test_data[k], cmap = 'gray')
    plt.imshow(np.ma.array(test_labels[k], mask=test_labels[k]==0), cmap = 'prism', alpha = 0.4)
    plt.axis('off')
    if k==0:
        plt.title('true labels')
plt.tight_layout()
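
The overlays above rely on a NumPy masked array: pixels where the label is 0 (background) are masked, so `imshow` leaves them transparent and only the cells are tinted. A minimal sketch of that step on a small made-up label array:

```python
import numpy as np

# Made-up 3x3 label image: 0 is background, 1-3 are cell labels
labels = np.array([[0, 1, 1],
                   [0, 0, 2],
                   [3, 0, 0]])

# Mask the background so only labeled pixels remain visible
overlay = np.ma.array(labels, mask=labels == 0)

n_hidden = int(overlay.mask.sum())       # number of background pixels hidden
visible = overlay.compressed().tolist()  # remaining label values, row by row
```

Passing `overlay` to `plt.imshow(..., alpha=0.4)` on top of the grayscale image gives the semi-transparent colored cells seen in the second and third rows of the figure.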