# Retraining Cellpose on Custom Data

<div class="custom-button-row">
    <a 
        class="custom-button custom-download-button" href="../../../notebooks/05_segmentation/deep_learning/cellpose_retraining_notebook.ipynb" download>
        <i class="fas fa-download"></i> Download this Notebook
    </a>
    <a
    class="custom-button custom-download-button" href="https://colab.research.google.com/github/HMS-IAC/bobiac/blob/gh-pages/colab_notebooks/05_segmentation/deep_learning/cellpose_retraining_notebook.ipynb" target="_blank">
        <img class="button-icon" src="../../../_static/logo/icon-google-colab.svg" alt="Open in Colab">
        Open in Colab
    </a>
</div>

In [1]:
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "matplotlib",
#     "tifffile",
#     "cellpose",
#     "numpy"
# ]
# ///

## Overview

In this section...

If you do not have an Apple Silicon Mac or a GPU, we suggest running this notebook in Google Colab.

...

You need pairs of raw images and their corresponding labels.
You can generate the labels in any way you want.
For example, you can follow the [Cellpose Documentation](https://cellpose.readthedocs.io/en/latest/gui.html#training-your-own-cellpose-model) and use the Cellpose GUI to manually create or update the labels.

The images we will use for this section can be downloaded from the <a href="../../../_static/data/05_segmentation_cellpose_training.zip" download> <i class="fas fa-download"></i> Cellpose Training Dataset</a> (it contains both training and test images).
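Because training needs every image to come with a label file, it can help to check the pairing before loading anything. The sketch below is a minimal example that assumes the labels sit next to the images and follow the `<image name>_seg.npy` naming produced by the Cellpose GUI. The helper name `find_unpaired_images` and the `.tif` extension are illustrative assumptions, not part of Cellpose.

```python
from pathlib import Path


def find_unpaired_images(folder, image_ext=".tif", mask_suffix="_seg.npy"):
    """Return the names of images in `folder` that have no matching label file.

    Hypothetical helper: assumes labels are stored as <image stem> + mask_suffix.
    """
    folder = Path(folder)
    unpaired = []
    for image_path in sorted(folder.glob(f"*{image_ext}")):
        # Expected label file, e.g. cell_01.tif -> cell_01_seg.npy
        label_path = folder / f"{image_path.stem}{mask_suffix}"
        if not label_path.exists():
            unpaired.append(image_path.name)
    return unpaired
```

Calling `find_unpaired_images("path/to/train")` on a folder where every image has a label should return an empty list.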

## Import Libraries

In [2]:
import tifffile
from cellpose import core, io
import matplotlib.pyplot as plt



Welcome to CellposeSAM, cellpose v
cellpose version: 	4.0.6 
platform:       	darwin 
python version: 	3.13.0 
torch version:  	2.7.1! The neural network component of
CPSAM is much larger than in previous versions and CPU excution is slow. 
We encourage users to use GPU/MPS if available. 




## Setup

In [3]:
io.logger_setup()  # run this to get printing of progress

print("GPU available:", core.use_gpu())

2025-07-11 09:27:00,441 [INFO] WRITING LOG OUTPUT TO /Users/ranit/.cellpose/run.log
2025-07-11 09:27:00,442 [INFO] 
cellpose version: 	4.0.6 
platform:       	darwin 
python version: 	3.13.0 
torch version:  	2.7.1
2025-07-11 09:27:00,502 [INFO] ** TORCH MPS version installed and working. **
GPU available: True


### Init the Model

...

In [4]:
from cellpose import core, models

# Warn (rather than fail) when no GPU is available: CPU execution works but is slow
if not core.use_gpu():
    print("No GPU access: consider changing your runtime (e.g. in Google Colab)")

# This run forces CPU execution; set gpu = core.use_gpu() to use a GPU/MPS device when available
gpu = False
# Initialize the Cellpose model
model = models.CellposeModel(gpu=gpu)

2025-07-11 09:27:02,987 [INFO] ** TORCH MPS version installed and working. **
2025-07-11 09:27:02,987 [INFO] >>>> using CPU
2025-07-11 09:27:02,987 [INFO] >>>> using CPU
2025-07-11 09:27:03,811 [INFO] >>>> loading model /Users/ranit/.cellpose/models/cpsam


## Data Handling

The dataset is split into a training set, used to retrain the model, and a test set, used to evaluate it.

In [5]:
import os


ROOT_FOLDER_PATH = "../../../_static/data/05_segmentation_cellpose_training/"

train_dir = os.path.join(ROOT_FOLDER_PATH, "train/")
test_dir = os.path.join(ROOT_FOLDER_PATH, "test/")

masks_ext = "_seg.npy"

# get files
train_data, train_labels, _, test_data, test_labels, _ = io.load_train_test_data(train_dir, test_dir, mask_filter=masks_ext)

2025-07-11 09:27:06,787 [INFO] not all flows are present, running flow generation for all images
2025-07-11 09:27:06,810 [INFO] 5 / 5 images in ../../../_static/data/05_segmentation_cellpose_training/train/ folder have labels
2025-07-11 09:27:06,812 [INFO] not all flows are present, running flow generation for all images
2025-07-11 09:27:06,829 [INFO] 3 / 3 images in ../../../_static/data/05_segmentation_cellpose_training/test/ folder have labels


In [6]:
import numpy as np

# Convert images to float32
train_data = [img.astype(np.float32) for img in train_data]
# Convert labels (masks) to int32
train_labels = [lbl.astype(np.int32) for lbl in train_labels]

# Convert test images to float32 and labels to int32
test_data = [img.astype(np.float32) for img in test_data]
test_labels = [lbl.astype(np.int32) for lbl in test_labels]
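After the conversion, a quick sanity check on the label images can catch empty or corrupted masks early. The sketch below assumes the usual label-image convention, where 0 is background and each object has its own positive integer ID. `count_objects` is a hypothetical helper written for this example.

```python
import numpy as np


def count_objects(label_image):
    """Count labelled objects in a mask, ignoring the background value 0."""
    ids = np.unique(label_image)
    # Every non-zero ID corresponds to one object
    return int(np.count_nonzero(ids))


# Toy 4x4 label image with two objects (IDs 1 and 2) on background 0
toy_labels = np.array(
    [
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [2, 0, 0, 0],
        [2, 2, 0, 0],
    ],
    dtype=np.int32,
)
print(count_objects(toy_labels))  # 2
```

In this notebook the same check could be run as `[count_objects(lbl) for lbl in train_labels]`.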


In [None]:
from cellpose import metrics

# run model on test images
masks = model.eval(test_data, batch_size=32)[0]

# check performance using ground truth labels
ap = metrics.average_precision(test_labels, masks)[0]
print("")
print(f">>> average precision at iou threshold 0.5 = {ap[:, 0].mean():.3f}")

2025-07-11 09:27:09,482 [INFO] 0%|          | 0/3 [00:00<?, ?it/s]
2025-07-11 09:27:49,597 [INFO] 33%|###3      | 1/3 [00:40<01:20, 40.11s/it]
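To make the reported number easier to interpret, here is a simplified sketch of average precision at a single IoU threshold, assuming the definition AP = TP / (TP + FP + FN). It is not Cellpose's implementation: the function names are illustrative, and matching is reduced to counting pairs whose IoU exceeds the threshold, which is only valid for thresholds of 0.5 or higher.

```python
import numpy as np


def iou_matrix(true_mask, pred_mask):
    """IoU between every true object (rows) and predicted object (columns)."""
    true_ids = np.unique(true_mask)
    true_ids = true_ids[true_ids > 0]  # drop background (0)
    pred_ids = np.unique(pred_mask)
    pred_ids = pred_ids[pred_ids > 0]
    iou = np.zeros((len(true_ids), len(pred_ids)))
    for i, t in enumerate(true_ids):
        t_pixels = true_mask == t
        for j, p in enumerate(pred_ids):
            p_pixels = pred_mask == p
            union = np.logical_or(t_pixels, p_pixels).sum()
            iou[i, j] = np.logical_and(t_pixels, p_pixels).sum() / union
    return iou


def average_precision_at(true_mask, pred_mask, threshold=0.5):
    """AP = TP / (TP + FP + FN) at one IoU threshold (assumed >= 0.5)."""
    iou = iou_matrix(true_mask, pred_mask)
    n_true, n_pred = iou.shape
    # Above IoU 0.5 an object can match at most one partner, so counting suffices
    tp = int((iou > threshold).sum())
    fp = n_pred - tp  # predictions without a matching true object
    fn = n_true - tp  # true objects that were missed
    total = tp + fp + fn
    return tp / total if total else 0.0


true_mask = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 2, 2], [0, 0, 2, 2]])
pred_mask = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 2]])
ap_toy = average_precision_at(true_mask, pred_mask)
```

In the toy example, object 1 is matched exactly (one true positive), while the prediction for object 2 covers only a quarter of it (IoU 0.25), which counts as both a false positive and a false negative.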


## Train new model

In [13]:
from cellpose import train

model_name = "new_model"

# Training params
n_epochs = 10
learning_rate = 1e-5
weight_decay = 0.1
batch_size = 1

# (not passing test data into function to speed up training)

new_model_path, train_losses, test_losses = train.train_seg(
    model.net,
    train_data=train_data,
    train_labels=train_labels,
    batch_size=batch_size,
    n_epochs=n_epochs,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    nimg_per_epoch=max(2, len(train_data)),  # can change this
    model_name=model_name,
)

2025-07-10 21:42:42,210 [INFO] computing flows for labels


100%|██████████| 5/5 [00:02<00:00,  2.14it/s]

2025-07-10 21:42:44,554 [INFO] >>> computing diameters



100%|██████████| 5/5 [00:00<00:00, 403.68it/s]

2025-07-10 21:42:44,569 [INFO] >>> normalizing {'lowhigh': None, 'percentile': None, 'normalize': True, 'norm3D': True, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
2025-07-10 21:42:44,607 [INFO] >>> n_epochs=10, n_train=5, n_test=None
2025-07-10 21:42:44,607 [INFO] >>> AdamW, learning_rate=0.00001, weight_decay=0.10000
2025-07-10 21:42:44,628 [INFO] >>> saving model to /Users/ranit/Research/github/bobiac/content/05_segmentation/deep_learning/models/new_model





RuntimeError: Expected scalar_type == ScalarType::Float || inputTensor.scalar_type() == ScalarType::Int || scalar_type == ScalarType::Bool to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)
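The training parameters interact: the number of images drawn per epoch and the batch size together set how many optimizer steps a run performs. The sketch below is back-of-the-envelope arithmetic, assuming each epoch draws `nimg_per_epoch` images and splits them into batches of `batch_size`. `training_steps` is a hypothetical helper, not a Cellpose function.

```python
import math


def training_steps(n_epochs, nimg_per_epoch, batch_size):
    """Return (batches per epoch, total batches) under the assumption above."""
    steps_per_epoch = math.ceil(nimg_per_epoch / batch_size)
    return steps_per_epoch, steps_per_epoch * n_epochs


# Values from the training cell: 5 training images, so nimg_per_epoch = max(2, 5) = 5
print(training_steps(n_epochs=10, nimg_per_epoch=5, batch_size=1))  # (5, 50)
```

With so few steps and a small learning rate, the retrained weights are expected to stay close to the pretrained ones; increasing `n_epochs` is the most direct way to train for longer.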

## Evaluate on test data

In [None]:
from cellpose import metrics

# Load the retrained model from the path returned by train.train_seg
model = models.CellposeModel(gpu=True, pretrained_model=new_model_path)

# run model on test images
masks = model.eval(test_data, batch_size=32)[0]

# check performance using ground truth labels
ap = metrics.average_precision(test_labels, masks)[0]
print("")
print(f">>> average precision at iou threshold 0.5 = {ap[:, 0].mean():.3f}")
