<a href="https://colab.research.google.com/github/wyldescience/Cellpose-batch-segmentation-and-counts/blob/main/Cellpose_segment_count.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Cellpose batch processing to segment, count, and save output to a CSV file

Here I use a pretrained Cellpose model (Stringer, C., Wang, T., Michaelos, M., & Pachitariu, M., 2021; see the paper and GitHub repository) to segment and count objects and to save the output (overlays of the masks on the original images). Members of the Image.sc forum suggested this as a good way to segment and count the Collembola egg images from my study. Template matching (OpenCV) was another option that worked quite well.

Here I apply this to mass counting of eggs of the springtail *Folsomia candida*, laid on black filter paper and imaged with a dissection microscope (fairly low resolution). The script outputs the original images with mask overlays as well as separate mask (label) files. It also pulls relevant information from the filenames and writes it to columns in the CSV file.

**Install packages and mount drive**

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
!pip install "opencv-python-headless<4.3"
!pip install cellpose

In [3]:
!nvcc --version
!nvidia-smi

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
Mon Jan 15 07:36:15 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla V100-SXM2-16GB           Off | 00000000:00:04.0 Off |                    0 |
| N/A   43C    P0              26W / 300W |      0MiB / 16384MiB |      0%      Default |
|                                      

In [2]:
import numpy as np
import time, os, sys
from urllib.parse import urlparse
from glob import glob
import skimage.io
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
mpl.rcParams['figure.dpi'] = 300
from cellpose import utils
import cv2
from tqdm import tqdm
import csv
from skimage.measure import find_contours
from skimage.measure import label


from cellpose import models, core

#use_GPU = models.use_gpu()
use_GPU = core.use_gpu()
print('>>> GPU activated? %d'%use_GPU)

>>> GPU activated? 1


Function for extracting useful information from filenames (the values are written to the CSV file in the next cell)


In [58]:
import re

def extract_info_from_filename(filename):
    iso, gen, age, treat, temp, rep = None, None, None, None, None, None

    # Temperature: the first '20' or '25' found anywhere in the filename
    # (note that this can also match digits inside a date or other number)
    temp_match = re.search(r'(20|25)', filename)

    if temp_match:
        temp = temp_match.group(1)

    # Define a regular expression pattern to find 'R' followed by a digit (1-5)
    rep_match = re.search(r'R([1-5])', filename)

    if rep_match:
        rep = rep_match.group(1)

    for part in filename.split("_"):
        if part.startswith('I') and part[1:].isdigit():
            iso = int(part[1:])
        elif part.startswith('F') and part[1:].isdigit():
            gen = int(part[1:])
        elif part.startswith(('O', 'Y')):
            age = part[0]
        elif part.lower() in ['swi', 'con']:
            treat = part.upper()

    print(f"Extracted Info: {iso}, {gen}, {age}, {treat}, {temp}, {rep}")
    return iso, gen, age, treat, temp, rep
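
Because the temperature pattern above searches the whole filename, it can pick up a `20` or `25` that belongs to something else, such as a date. The sketch below shows the effect and one possible stricter alternative that only accepts whole underscore-delimited tokens. The filename and the helper name `extract_temp_and_rep` are hypothetical, chosen for illustration; the real filenames in this study may be laid out differently.

```python
import os
import re

def extract_temp_and_rep(filename):
    """Hypothetical stricter variant: accept only whole underscore-delimited tokens."""
    stem = os.path.splitext(filename)[0]
    temp, rep = None, None
    for part in stem.split("_"):
        if part in ("20", "25"):
            temp = part
        elif re.fullmatch(r"R[1-5]", part):
            rep = part[1]
    return temp, rep

name = "I3_F2_O_SWI_12-20-23_25_R1.tif"  # made-up filename for illustration

# The loose pattern stops at the first hit, which here is the "20" inside the date
loose_temp = re.search(r"(20|25)", name).group(1)
print(loose_temp)

# The token-based variant only sees the standalone "25" and "R1"
print(extract_temp_and_rep(name))
```

If the filenames never contain a date or other number before the temperature, the original pattern behaves the same as this variant.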

In [None]:
import os
import time
import csv
import re
import numpy as np
from skimage import io, color
from tqdm import tqdm
from cellpose import models

# Path to the images folder to process
image_folder = "/content/drive/Othercomputers/ThinkPad/Desktop/Folsomia candida/Data/egg count images/cropped"

# Get a list of all image files with .tif or .jpg extension in the folder
image_paths = [os.path.join(image_folder, file) for file in os.listdir(image_folder) if file.lower().endswith(('.tif', '.jpg'))]

# Set the output folder path
output_folder = "/content/drive/MyDrive/folsomia candida experiments/cellpose output/output"

# Create the CSV file path
csv_file_path = os.path.join(output_folder, "results.csv")
fieldnames = ['image', 'date', 'iso', 'gen', 'age', 'temp', 'treat', 'rep', 'num_cells', 'false_pos', 'false_neg', 'final_count']

# Create a folder for saving label files (this also creates output_folder if it is missing)
label_folder = os.path.join(output_folder, "labels")
os.makedirs(label_folder, exist_ok=True)

# Load the chosen Cellpose model; use the GPU only if one was detected in the check above
model = models.Cellpose(gpu=bool(use_GPU), model_type='cyto2')

# Create the CSV file with its header row on the first run
if not os.path.exists(csv_file_path):
    with open(csv_file_path, 'w', newline='') as csv_file:
        csv.writer(csv_file).writerow(fieldnames)
    print("CSV file created.")

# Open the CSV file in append mode so rows from earlier runs are kept
with open(csv_file_path, 'a', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)

    for image_path in tqdm(image_paths):
        print(f"Processing image: {image_path}")
        base_name = os.path.splitext(os.path.basename(image_path))[0]

        # Extract information from the filename (the function prints what it found)
        isoline, generation, age, treatment, temp, rep = extract_info_from_filename(os.path.basename(image_path))

        # Extract the date from the filename (three two-digit groups separated by hyphens)
        date_match = re.search(r'(\d{2}-\d{2}-\d{2})', base_name)
        date = date_match.group(1) if date_match else None
        print(f"Extracted Date: {date}")

        t1 = time.time()
        img = io.imread(image_path)

        # Evaluate the model; only the label mask (first return value) is needed
        labels, _, _, _ = model.eval(
            img,
            diameter=8,
            channels=[1, 1],
            do_3D=False,
            flow_threshold=0.8,
            cellprob_threshold=-0.8,
            stitch_threshold=0.0
        )

        # Save the label file
        label_filename = os.path.join(label_folder, base_name + '_labels.tif')
        io.imsave(label_filename, labels.astype('uint16'))

        # Prepare the row in the same order as fieldnames
        csv_data = [
            base_name,
            date if date else "",  # empty string when no date was found
            str(isoline),
            str(generation),
            str(age),
            str(temp) if temp else "",  # empty string when no temperature was found
            str(treatment),
            str(rep) if rep else "",  # empty string when no replicate was found
            str(np.max(labels)),  # num_cells: highest label ID in the mask
            "",  # false_pos
            "",  # false_neg
            "",  # final_count
        ]

        # Write the data to the CSV
        csv_writer.writerow(csv_data)

        # Create an overlay of the labels on the original image (float values in [0, 1])
        overlay_image = color.label2rgb(labels, img, bg_label=0)

        # Construct the output overlay image path in the specified folder
        overlay_name = os.path.join(output_folder, base_name + '_overlay.png')

        # Convert the overlay image to uint8
        overlay_image_uint8 = (overlay_image * 255).astype(np.uint8)

        # Save the overlay image to the specified folder
        io.imsave(overlay_name, overlay_image_uint8)

        t2 = time.time()
        time_elapsed = (t2 - t1) / 60
        print(f'Time spent on current image: {round(time_elapsed, 1)} minutes')
        print('------')
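
The `num_cells` column above is taken from `np.max(labels)`, which equals the number of objects as long as the label IDs run consecutively from 1. As a minimal sketch (the toy array and the helper name `count_objects` are illustrative and not part of the notebook), counting the unique non-zero IDs gives the same number and stays correct if some labels are removed later, for example when filtering out debris:

```python
import numpy as np

def count_objects(label_image):
    """Count distinct non-zero label IDs (0 is treated as background)."""
    ids = np.unique(label_image)
    return int(np.count_nonzero(ids))

# Toy label image with three objects labelled 1, 2 and 3
labels = np.array([
    [0, 1, 1, 0],
    [0, 0, 0, 2],
    [3, 3, 0, 2],
])
print(int(labels.max()), count_objects(labels))

# Remove object 2: the highest ID is still 3, but only two objects remain
filtered = np.where(labels == 2, 0, labels)
print(int(filtered.max()), count_objects(filtered))
```

For masks used exactly as they come out of the segmentation step, the two approaches should agree; the difference only matters once masks are edited.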

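
The `false_pos`, `false_neg` and `final_count` columns are written empty, presumably so they can be filled in after checking the overlays by eye. One possible way to complete `final_count` afterwards is sketched below. The formula `num_cells - false_pos + false_neg` and the helper name `fill_final_count` are assumptions made for illustration; the notebook itself does not define how these columns are combined.

```python
import csv
import io

def fill_final_count(rows):
    """Return copies of rows with final_count = num_cells - false_pos + false_neg.

    Rows whose false_pos or false_neg is still blank are left unchanged.
    """
    completed = []
    for row in rows:
        row = dict(row)
        if row["false_pos"] != "" and row["false_neg"] != "":
            row["final_count"] = str(
                int(row["num_cells"]) - int(row["false_pos"]) + int(row["false_neg"])
            )
        completed.append(row)
    return completed

# Tiny in-memory stand-in for results.csv (values are made up)
sample = io.StringIO(
    "image,num_cells,false_pos,false_neg,final_count\n"
    "img_a,40,2,5,\n"
    "img_b,17,,,\n"
)
rows = fill_final_count(csv.DictReader(sample))
print([r["final_count"] for r in rows])
```

Only rows with both corrections entered get a value, so images that have not been checked yet stay blank.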