<a href="https://colab.research.google.com/github/MouseLand/cellpose/blob/master/notebooks/run_cellpose_GPU.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Running cellpose with/without training

(thanks to Matteo Carandini for setting this up)

UPDATED DEC 2020 for TORCH VERSION cellpose v0.6

UPDATED NOV 2021 for cellpose / omnipose v0.7

## installation

Install cellpose. In a Colab notebook, the torch GPU version is installed by default.

Note that cellpose uses the latest version of numpy, so please click the "Restart runtime" button once the install completes.

In [2]:
!pip install "opencv-python-headless<4.3"
!pip install cellpose

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting opencv-python-headless<4.3
  Downloading opencv_python_headless-3.4.18.65-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (45.7 MB)
     |████████████████████████████████| 45.7 MB 2.6 MB/s eta 0:00:01
Installing collected packages: opencv-python-headless
  Attempting uninstall: opencv-python-headless
    Found existing installation: opencv-python-headless 4.6.0.66
    Uninstalling opencv-python-headless-4.6.0.66:
      Successfully uninstalled opencv-python-headless-4.6.0.66
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
qudida 0.0.4 requires opencv-python-headless>=4.0.1, but you have opencv-python-headless 3.4.18.65 which is incompatible.
albumentations 1.3.0 requires opencv-python-headless>=4.1.1, but you have opencv-python-headless 3.4.18.65 which is incomp

Check CUDA version and GPU

In [3]:
!nvcc --version
!nvidia-smi

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
Sun Apr 16 16:27:57 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   55C    P5    38W / 350W |   3720MiB / 24576MiB |     30%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   

Import libraries and check the GPU (the first time you import cellpose, the models will download).

In [4]:
import numpy as np
import time, os, sys
import tifffile
from urllib.parse import urlparse
import skimage.io
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
mpl.rcParams['figure.dpi'] = 300

from cellpose import models, core

use_GPU = core.use_gpu()
print('>>> GPU activated? %d'%use_GPU)

# call logger_setup to have output of cellpose written
from cellpose.io import logger_setup
logger_setup();


>>> GPU activated? 1
2023-04-16 16:28:02,206 [INFO] WRITING LOG OUTPUT TO /home/xzhang/.cellpose/run.log
2023-04-16 16:28:02,207 [INFO] 
cellpose version: 	2.2 
platform:       	linux 
python version: 	3.9.12 
torch version:  	1.11.0


Load the sample images from a local folder (this cell reads files that are already on disk; nothing is downloaded here).

# load 3D Allen Cell dataset

In [5]:
from natsort import natsorted
import glob
import os 
data_folder = '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/'
image_files = natsorted(glob.glob(data_folder + 'images/*.tiff'))
seg_files = natsorted(glob.glob(data_folder + 'masks/*.tiff'))


# hold out every fifth stack (indices 0, 5, 10, ...) for validation
valid_img_files = image_files[::5]
valid_seg_files = seg_files[::5]

# train on the remaining stacks
train_img_files = [f for i,f in enumerate(image_files) if i%5 != 0]
train_seg_files = [f for i,f in enumerate(seg_files) if i%5 != 0]

print(len(train_img_files), len(train_seg_files), len(valid_img_files), len(valid_seg_files))
print(train_img_files,valid_img_files)

80 80 20 20
['/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/2.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/3.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/4.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/5.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/7.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/8.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/9.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/10.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/12.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/13.tiff', '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/images/14.tiff', '/data/download_data/quilt
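The split above keeps every fifth stack for validation and trains on the rest. A minimal sketch of the same indexing on placeholder file names, assuming 100 files named `1.tiff` to `100.tiff` (the helper `split_every_nth` is a hypothetical name, not part of cellpose):

```python
def split_every_nth(items, n=5):
    """Hold out items at indices 0, n, 2n, ... and keep the rest for training."""
    valid = items[::n]
    train = [x for i, x in enumerate(items) if i % n != 0]
    return train, valid


# placeholder names standing in for the naturally sorted image list
files = [f"{i}.tiff" for i in range(1, 101)]
train, valid = split_every_nth(files)

print(len(train), len(valid))  # 80 20
print(train[:5])               # ['2.tiff', '3.tiff', '4.tiff', '5.tiff', '7.tiff']
```

Because images and masks are split with the same index rule, the pairs stay aligned only if both sorted lists have the same length and order.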

# predicting with pre-trained (Generalist) Cellpose without finetuning

In [39]:

from cellpose import models, metrics
import tifffile
from stardist_matching import matching


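# DEFINE CELLPOSE MODEL
# model_type='cyto' or model_type='nuclei'
model = models.Cellpose(gpu=use_GPU, model_type='nuclei')

# with 3D you have to set the diameter manually (no auto detect).
# allen_cell_diameter is not defined in the cells shown here; 100 is an assumed value,
# taken from the commented-out setting in the post-training cell
allen_cell_diameter = 100

ap_algo1 = []
ap_algo2 = []
# test 3D stack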
for file,mask in zip(valid_img_files,valid_seg_files):
    img = tifffile.imread(file)
    gt = tifffile.imread(mask)
    file_name=os.path.basename(file)


    ### TWO WAYS TO RUN CELLPOSE IN 3D

    # 1. computes flows from 2D slices and combines into 3D flows to create masks
    print('running cellpose 2D slice flows => masks')
    masks, flows, styles, _ = model.eval(img, channels=[0,0], diameter=allen_cell_diameter, do_3D=True)

    # 2. computes masks in 2D slices and stitches masks in 3D based on mask overlap
    print('running cellpose 2D + stitching masks')
    masks_stitched, flows_stitched, styles_stitched, _ = model.eval(img, channels=[0,0], diameter=allen_cell_diameter, do_3D=False, stitch_threshold=0.5)
    #average_precision returns [AP,TP,FP,FN]
    # ap1 = metrics.average_precision(gt, masks,  threshold=[0.1,0.2,0.7])[0]
    # ap2 = metrics.average_precision(gt, masks_stitched,threshold=[0.1,0.2,0.7])[0]

    #calculate accuracy
    # ap1 = matching(gt, masks).precision
    # ap2 = matching(gt, masks_stitched).precision
    # ap_algo1.append(ap1)
    # ap_algo2.append(ap2)

    tifffile.imwrite(data_folder +'results_2d_cellpose_algo1/' + file_name.split('.')[0] + 'pred.tiff', masks)
    tifffile.imwrite(data_folder +'results_2d_cellpose_algo2/' + file_name.split('.')[0] + 'pred.tiff', masks_stitched)


2023-04-13 17:33:03,746 [INFO] ** TORCH CUDA version installed and working. **
2023-04-13 17:33:03,747 [INFO] >>>> using GPU
2023-04-13 17:33:03,747 [INFO] >> nuclei << model set to be used
2023-04-13 17:33:03,843 [INFO] >>>> model diam_mean =  17.000 (ROIs rescaled to this size during training)
running cellpose 2D slice flows => masks
2023-04-13 17:33:03,861 [INFO] ~~~ FINDING MASKS ~~~
2023-04-13 17:33:03,913 [INFO] multi-stack tiff read in as having 65 planes 1 channels
2023-04-13 17:33:05,588 [INFO] running YX: 65 planes of size (624, 924)
2023-04-13 17:33:05,592 [INFO] 0%|          | 0/3 [00:00<?, ?it/s]
2023-04-13 17:33:05,711 [INFO] 67%|######6   | 2/3 [00:00<00:00, 16.80it/s]
2023-04-13 17:33:05,759 [INFO] 100%|##########| 3/3 [00:00<00:00, 17.96it/s]
2023-04-13 17:33:05,963 [INFO] running ZY: 624 planes of size (65, 924)
2023-04-13 17:33:05,970 [INFO] 0%|          | 0/20 [00:00<?, ?it/s]
2023-04-13 17:33:06,075 [INFO] 25%|##5       | 5/20 [00:00<00:00, 47.97it/s]
2023-04-13 17

FileNotFoundError: [Errno 2] No such file or directory: '/data/download_data/quilt-data-access-tutorials-main/all_fov/allen100/masks/66_mask.tiff'
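The prediction cell above writes into `results_2d_cellpose_algo1/` and `results_2d_cellpose_algo2/`, which must exist before `tifffile.imwrite` is called. A small sketch of building the output path and creating the folder first; `prediction_path` is a hypothetical helper, and the demo uses a temporary directory rather than the real data folder:

```python
import os
import tempfile


def prediction_path(root, subfolder, image_file, suffix="pred"):
    """Return root/subfolder/<stem><suffix>.tiff, creating the folder if needed."""
    out_dir = os.path.join(root, subfolder)
    os.makedirs(out_dir, exist_ok=True)  # no error if the folder already exists
    stem = os.path.splitext(os.path.basename(image_file))[0]
    return os.path.join(out_dir, stem + suffix + ".tiff")


# demo in a throwaway directory; the input path is a placeholder
with tempfile.TemporaryDirectory() as root:
    path = prediction_path(root, "results_2d_cellpose_algo1", "/some/folder/6.tiff")
    demo_name = os.path.basename(path)
    demo_dir_created = os.path.isdir(os.path.dirname(path))

print(demo_name)         # 6pred.tiff
print(demo_dir_created)  # True
```

Using `os.path.splitext` instead of `split('.')[0]` keeps the full stem when a file name contains more than one dot.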

# training

In [22]:


# model name and path
#@markdown ###Name of the pretrained model to start from and new model name:
from stardist_matching import matching
from cellpose import models
initial_model = "nuclei" #@param ['cyto','nuclei','tissuenet','livecell','cyto2','CP','CPx','TN1','TN2','TN3','LC1','LC2','LC3','LC4','scratch']
model_name = "allen_2d" #@param {type:"string"}
train_dir ='./allen_train/'
# other parameters for training.
#@markdown ###Training Parameters:
#@markdown Number of epochs:
n_epochs =  100#@param {type:"number"}

Channel_to_use_for_training = "Grayscale" #@param ["Grayscale", "Blue", "Green", "Red"]

# @markdown ###If you have a secondary channel that can be used for training, for instance nuclei, choose it here:

# Second_training_channel= "Red" #@param ["None", "Blue", "Green", "Red"]


#@markdown ###Advanced Parameters

Use_Default_Advanced_Parameters = True #@param {type:"boolean"}
#@markdown ###If not, please input:
learning_rate = 0.1 #@param {type:"number"}
weight_decay = 0.0001 #@param {type:"number"}

if (Use_Default_Advanced_Parameters): 
  print("Default advanced parameters enabled")
  learning_rate = 0.1 
  weight_decay = 0.0001
  
# check whether a model with the same name already exists; if so, warn that it will be replaced
model_path = train_dir + 'models/'
if os.path.exists(model_path + model_name):
  print("!! WARNING: "+model_name+" already exists and will be deleted in the following cell !!")

# no separate test directory is used in this notebook
test_dir = None



if initial_model=='scratch':
  initial_model = 'None'





Default advanced parameters enabled


In [23]:
pwd

'/data/programs/cellpose-main/notebooks'

In [43]:
# start logger (to see training across epochs)
# logger = io.logger_setup()

# DEFINE CELLPOSE MODEL (without size model)
model = models.CellposeModel(gpu=use_GPU, model_type=initial_model)
# model = models.Cellpose(gpu=use_GPU, model_type='nuclei')

# get files
# cellpose loads all the training data into memory at once
train_data = []
train_labels = []
test_data = []
test_labels = []

# -------------- use 2D slices to train the model ----------------
# a slice is kept only if its mask is not empty (label sum > 1)
for img_file, seg_file in zip(train_img_files, train_seg_files):
    train_data_3d = tifffile.imread(img_file)
    train_labels_3d = tifffile.imread(seg_file)
    train_img_idx = [idx for idx, plane in enumerate(train_labels_3d) if np.sum(plane) > 1]
    # extend() adds one 2D plane per list entry
    train_data.extend(train_data_3d[train_img_idx])
    train_labels.extend(train_labels_3d[train_img_idx])

for img_file, seg_file in zip(valid_img_files, valid_seg_files):
    test_data_3d = tifffile.imread(img_file)
    test_labels_3d = tifffile.imread(seg_file)
    test_img_idx = [idx for idx, plane in enumerate(test_labels_3d) if np.sum(plane) > 1]
    test_data.extend(test_data_3d[test_img_idx])
    test_labels.extend(test_labels_3d[test_img_idx])
# ---------------------------------------------------------------

print(len(train_data), len(test_data))
print(train_data[0].shape)
new_model_path = model.train(train_data, train_labels, 
                              test_data=test_data,
                              test_labels=test_labels,
                              channels=[0,0], 
                              save_path='./allen_train', 
                              n_epochs=n_epochs,
                              learning_rate=learning_rate, 
                              weight_decay=weight_decay, 
                              nimg_per_epoch=8,
                              model_name=model_name)

# diameter of labels in training images
diam_labels = model.diam_labels.copy()

2023-04-14 10:04:09,944 [INFO] >> nuclei << model set to be used
2023-04-14 10:04:09,961 [INFO] ** TORCH CUDA version installed and working. **
2023-04-14 10:04:09,962 [INFO] >>>> using GPU
2023-04-14 10:04:10,064 [INFO] >>>> model diam_mean =  17.000 (ROIs rescaled to this size during training)
2701 675
(624, 924)
2023-04-14 10:05:37,087 [INFO] computing flows for labels


100%|██████████| 2701/2701 [04:31<00:00,  9.96it/s]


2023-04-14 10:10:19,693 [INFO] computing flows for labels


100%|██████████| 675/675 [01:09<00:00,  9.69it/s]


2023-04-14 10:11:49,141 [INFO] >>>> median diameter set to = 17
2023-04-14 10:11:49,142 [INFO] >>>> mean of training label mask diameters (saved to model) 77.215
2023-04-14 10:11:49,154 [INFO] >>>> training network with 2 channel input <<<<
2023-04-14 10:11:49,154 [INFO] >>>> LR: 0.10000, batch_size: 8, weight_decay: 0.00010
2023-04-14 10:11:49,155 [INFO] >>>> ntrain = 2234, ntest = 675
2023-04-14 10:11:49,155 [INFO] >>>> nimg_per_epoch = 2234
2023-04-14 10:12:29,329 [INFO] Epoch 0, Time 40.2s, Loss 0.4606, Loss Test 0.3860, LR 0.0000
2023-04-14 10:13:04,206 [INFO] saving network parameters to ./allen_train/models/allen_2d
2023-04-14 10:15:29,442 [INFO] Epoch 5, Time 220.3s, Loss 0.1983, Loss Test 0.1516, LR 0.0556
2023-04-14 10:18:33,428 [INFO] Epoch 10, Time 404.3s, Loss 0.1640, Loss Test 0.1517, LR 0.1000
2023-04-14 10:24:33,261 [INFO] Epoch 20, Time 764.1s, Loss 0.1554, Loss Test 0.1457, LR 0.1000
2023-04-14 10:30:30,643 [INFO] Epoch 30, Time 1121.5s, Loss 0.1506, Loss Test 0.1547,

KeyboardInterrupt: 
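The LR column in the training log above (0.0000 at epoch 0, 0.0556 at epoch 5, 0.1000 from epoch 10 on) is consistent with a linear warm-up over the first 10 epochs. A minimal sketch of such a schedule under that assumption; `warmup_lr` and `warmup_epochs` are hypothetical names rather than cellpose API, and any decay late in training is ignored:

```python
def warmup_lr(epoch, learning_rate=0.1, warmup_epochs=10):
    """Ramp linearly from 0 to learning_rate over the first warmup_epochs epochs."""
    if epoch >= warmup_epochs:
        return learning_rate
    # warmup_epochs evenly spaced values, the last one equal to learning_rate
    return learning_rate * epoch / (warmup_epochs - 1)


for epoch in (0, 5, 10, 20):
    print(epoch, f"{warmup_lr(epoch):.4f}")
# 0 0.0000
# 5 0.0556
# 10 0.1000
# 20 0.1000
```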

In [6]:
print(new_model_path)

NameError: name 'new_model_path' is not defined

## run cellpose 2D after training

There are two ways to run cellpose in 3D. The cell below contains both, with the first one commented out; choose whichever works best for your data.

First way: computes flows from 2D slices and combines into 3D flows to create masks



In [9]:
from cellpose import models, metrics
import tifffile
from stardist_matching import matching

new_model_path='./allen_train/models/allen_2d'
# DEFINE CELLPOSE MODEL
# model_type='cyto' or model_type='nuclei'
# model = models.Cellpose(gpu=use_GPU, model_type='nuclei')
model = models.CellposeModel(gpu=use_GPU, pretrained_model = new_model_path)

ap_algo1_after_training  = []
ap_algo2_after_training  = []
# test 3D stack
dia_list = [125,50,75,100,25]
thre_list = [0.5, 0.075,0.125, 0.25,0.375,0.625,0.75]
# allen_cell_diameter=100
for thre in thre_list:
    for dia in dia_list:
 
        for file,mask in zip(valid_img_files,valid_seg_files):
            img = tifffile.imread(file)
            gt = tifffile.imread(mask)
            file_name=os.path.basename(file)
            # img_3D = imgs[-1]

            # * with 3D you have to set the diameter manually (no auto detect) *

            ### TWO WAYS TO RUN CELLPOSE IN 3D

            # 1. computes flows from 2D slices and combines into 3D flows to create masks
            # print('running cellpose 2D slice flows => masks')
            # masks_after_training, flows, styles= model.eval(img, channels=[0,0], diameter=dia, do_3D=True)

            # 2. computes masks in 2D slices and stitches masks in 3D based on mask overlap
            print('running cellpose 2D + stitching masks')
            masks_stitched_after_training, flows_stitched, styles_stitched= model.eval(img, channels=[0,0], diameter=dia, do_3D=False, stitch_threshold=thre)
            #average_precision returns [AP,TP,FP,FN]
            # ap1 = metrics.average_precision(gt, masks_after_training)[0]
            # ap2 = metrics.average_precision(gt, masks_stitched_after_training)[0]
            
            #calculate accuracy for each algorithm
            # ap1 = matching(gt, masks_after_training).precision
            # ap2 = matching(gt, masks_stitched_after_training).precision
            # ap_algo1_after_training.append(ap1)
            # ap_algo2_after_training.append(ap2)

            # tifffile.imwrite(data_folder +'results_2d_cellpose_algo1_after_training/' + file_name.split('.')[0] +'_thre_'+str(thre)+ '_dia_' + str(dia) + '.tiff', masks_after_training)
            tifffile.imwrite(data_folder +'results_2d_cellpose_algo2_after_training/' + file_name.split('.')[0] +'_thre_'+str(thre)+ '_dia_' + str(dia) + '.tiff', masks_stitched_after_training)


2023-04-16 21:17:25,856 [INFO] >>>> loading model ./allen_train/models/allen_2d
2023-04-16 21:17:25,864 [INFO] ** TORCH CUDA version installed and working. **
2023-04-16 21:17:25,864 [INFO] >>>> using GPU
2023-04-16 21:17:25,962 [INFO] >>>> model diam_mean =  17.000 (ROIs rescaled to this size during training)
2023-04-16 21:17:25,963 [INFO] >>>> model diam_labels =  77.215 (mean diameter of training ROIs)
running cellpose 2D + stitching masks
2023-04-16 21:17:26,036 [INFO] multi-stack tiff read in as having 65 planes 1 channels
2023-04-16 21:17:26,120 [INFO] 0%|          | 0/65 [00:00<?, ?it/s]
2023-04-16 21:17:26,242 [INFO] 5%|4         | 3/65 [00:00<00:02, 24.74it/s]
2023-04-16 21:17:26,355 [INFO] 9%|9         | 6/65 [00:00<00:02, 25.83it/s]
2023-04-16 21:17:26,465 [INFO] 14%|#3        | 9/65 [00:00<00:02, 26.48it/s]
2023-04-16 21:17:26,582 [INFO] 18%|#8        | 12/65 [00:00<00:02, 26.15it/s]
2023-04-16 21:17:26,713 [INFO] 25%|##4       | 16/65 [00:00<00:01, 27.82it/s]
2023-04-16 21
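The log reports `diam_mean = 17.000` as the size ROIs were rescaled to during training, which is why the `diameter` argument matters at inference time. A rough sketch of the implied scaling, assuming the image is resized by `diam_mean / diameter` before the network runs (this is an assumption about cellpose internals, and `rescale_factor` is a hypothetical helper):

```python
DIAM_MEAN = 17.0  # value reported in the log above


def rescale_factor(diameter, diam_mean=DIAM_MEAN):
    """Factor by which an image would be resized so objects of `diameter` become `diam_mean`."""
    if diameter <= 0:
        raise ValueError("diameter must be positive")
    return diam_mean / diameter


# the diameters tried in the grid above
for dia in [125, 50, 75, 100, 25]:
    print(dia, round(rescale_factor(dia), 3))
```

Under this assumption, a larger `diameter` shrinks the image more, so planes are processed faster but small structures lose detail.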

# test which cell size / stitching threshold combination gives the best result
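One way to compare the combinations is to collect a score per validation stack for each (stitch threshold, diameter) pair and pick the pair with the highest mean. The sketch below only shows the bookkeeping: `best_combination` is a hypothetical helper, and the example scores are placeholder numbers, not measured results.

```python
from itertools import product

dia_list = [125, 50, 75, 100, 25]
thre_list = [0.5, 0.075, 0.125, 0.25, 0.375, 0.625, 0.75]

# same order as the nested loops above: threshold outer, diameter inner
grid = list(product(thre_list, dia_list))
print(len(grid))  # 35 combinations, each evaluated on every validation stack


def best_combination(scores):
    """Return the (threshold, diameter) key with the highest mean score."""
    means = {key: sum(vals) / len(vals) for key, vals in scores.items() if vals}
    return max(means, key=means.get)


# PLACEHOLDER numbers for illustration only, not results from this notebook
example_scores = {(0.5, 100): [0.2, 0.4], (0.25, 75): [0.5, 0.7]}
print(best_combination(example_scores))  # (0.25, 75)
```

In practice each list would be filled inside the evaluation loop, for example with the precision values from the commented-out `matching(...)` lines.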

# test metrics functions

In [33]:
# a block to test the metrics function
from cellpose import models, metrics
from skimage.segmentation import relabel_sequential

gt = tifffile.imread(valid_seg_files[0])
gt = np.array(gt)
print(gt.shape)

gt2 = gt.copy()
# print(gt.min(),gt.max())
# gt2[gt>0] = gt2[gt>0] +10
# the offset is added to every pixel, so the background (0) becomes label 10
gt2 = gt2 + 10
# gt2 = relabel_sequential(gt2)[0]
# print(gt2[gt2>0]== gt[gt>0])
print(gt.shape, gt2.shape, gt.min(), gt.max(), gt2.min(), gt2.max())
ap2 = metrics.average_precision(gt, gt2)[0]
print(ap2)

(65, 624, 924)
(65, 624, 924) (65, 624, 924) 0 34 10 44
[0.77272725 0.77272725 0.77272725]


In [27]:
aa = gt2[gt2==44]
bb = gt[gt==34]

print(len(aa), len(bb))

20035 20035


In [35]:
from stardist_matching import matching

ap2 = matching(gt, gt2)
print(ap2.precision)

0.9714285714285714
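The two scores above differ even though `gt2` is only `gt` shifted by 10. One reading that reproduces both numbers: all 34 ground-truth objects are matched, the shifted background counts as one extra predicted object for the stardist-style precision (34 of 35), and the cellpose score counts predicted objects by the maximum label value (44). This is an assumption about how each library counts objects, not something checked against their sources here; the function names below are hypothetical.

```python
def ap_counting_by_max_label(tp, true_max_label, pred_max_label):
    """AP = TP / (TP + FP + FN), with object counts taken from the maximum label."""
    fp = pred_max_label - tp
    fn = true_max_label - tp
    return tp / (tp + fp + fn)


def precision_counting_objects(tp, n_pred_objects):
    """Precision = TP / number of distinct predicted objects."""
    return tp / n_pred_objects


# gt has labels up to 34; gt2 has labels 10..44, where 10 is the former background
print(round(ap_counting_by_max_label(34, 34, 44), 4))  # 0.7727
print(round(precision_counting_objects(34, 35), 4))    # 0.9714
```

If this reading is right, shifting only the foreground (the commented-out `gt2[gt>0]` line) followed by sequential relabelling should bring both scores back to 1.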


Second way: computes masks in 2D slices and stitches masks in 3D based on mask overlap

Note that stitching (with stitch_threshold > 0) can also be used to track cells over time.
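To make the overlap idea concrete, here is a simplified sketch of stitching label planes by intersection over union (IoU). It is not cellpose's implementation: planes are flat lists of integer labels, a mask inherits the label of the best-overlapping mask in the previous plane when the IoU reaches `stitch_threshold`, and otherwise receives a new label. The names `iou_pairs` and `stitch_planes` are hypothetical.

```python
from collections import Counter


def iou_pairs(prev_plane, curr_plane):
    """IoU of every overlapping (previous label, current label) pair; 0 is background."""
    overlap = Counter(zip(prev_plane, curr_plane))
    prev_area = Counter(prev_plane)
    curr_area = Counter(curr_plane)
    return {
        (p, c): n / (prev_area[p] + curr_area[c] - n)
        for (p, c), n in overlap.items()
        if p != 0 and c != 0
    }


def stitch_planes(planes, stitch_threshold=0.5):
    """Relabel planes so masks that overlap enough share a label across planes."""
    stitched = [list(planes[0])]
    next_label = max(planes[0], default=0) + 1
    for curr in planes[1:]:
        ious = iou_pairs(stitched[-1], curr)
        mapping = {}
        for c in sorted(set(curr) - {0}):
            # best-overlapping label in the previous (already stitched) plane
            best_iou, best_prev = max(
                ((iou, p) for (p, cc), iou in ious.items() if cc == c),
                default=(0.0, None),
            )
            if best_iou >= stitch_threshold:
                mapping[c] = best_prev
            else:
                mapping[c] = next_label
                next_label += 1
        stitched.append([mapping.get(v, 0) for v in curr])
    return stitched


planes = [
    [1, 1, 1, 0, 0, 2, 2, 0],
    [0, 5, 5, 5, 0, 0, 0, 7],
]
print(stitch_planes(planes, stitch_threshold=0.5)[1])  # [0, 1, 1, 1, 0, 0, 0, 3]
print(stitch_planes(planes, stitch_threshold=0.6)[1])  # [0, 3, 3, 3, 0, 0, 0, 4]
```

The same relabelling applies when the planes are time points rather than z-slices, which is the sense in which stitching can track cells over time. This sketch does not prevent two masks in one plane from inheriting the same previous label.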