
# Cell Segmentation with Cellpose

This notebook shows **how to run Cellpose from the command line inside a notebook** to train, infer, and (optionally) visualize results.  
It assumes your dataset is organized under `data/` as described below.

---

## Quick Start

```bash
# 1) install (run once in your env)
pip install "cellpose[gpu]"

# 2) train with a test split for built-in evaluation
python -m cellpose --train \
  --dir data/train \
  --test_dir data/test \
  --pretrained_model cyto3 \
  --chan 0 --chan2 0 \
  --learning_rate 0.2 \
  --n_epochs 500 \
  --batch_size 8 \
  --use_gpu \
  --model_name my_cell_model

# 3) run inference and save predicted masks as .tif
python -m cellpose --dir data/test/images \
  --pretrained_model my_cell_model \
  --chan 0 --chan2 0 \
  --diameter 0 \
  --save_tif \
  --use_gpu
```



## 1) Installation
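Install Cellpose. GPU acceleration comes from PyTorch, so a GPU run also needs a CUDA-enabled PyTorch build in the same environment. If you're on CPU only, drop the `[gpu]` extra.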


```python
# Install Cellpose
# !pip install "cellpose[gpu]"
```



## 2) Dataset Layout

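This notebook assumes the following structure. Note that, by default, the Cellpose CLI looks for each image's labels in the same folder as the image, under the image name plus a `_masks` suffix (configurable with `--mask_filter`), so verify that your Cellpose version picks up this layout or adapt the folders accordingly: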

```
data/
├─ train/
│  ├─ images/
│  └─ masks/
└─ test/
   ├─ images/
   └─ masks/
```

- `images/` contains microscopy images (`.tif` or `.tiff`).
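- `masks/` contains the corresponding masks. For training, use instance-labeled masks (each cell a distinct integer, background `0`); a binary mask is read as a single object, so separate cells are not distinguished.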
- **Filenames must match** between `images/` and `masks/` (e.g., `A01.tif` in both).
- **Channel mapping note:** this project uses grayscale inputs → `--chan 0 --chan2 0`. If you have multi-channel (e.g., nuclei/cytoplasm), set them accordingly.
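
Before training, it can help to confirm that the filename-matching rule above actually holds. The following is a minimal sketch, assuming the `images/` and `masks/` layout shown earlier; the helper name `find_unpaired` is hypothetical and not part of Cellpose.

```python
from pathlib import Path

TIFF_EXTS = {".tif", ".tiff"}


def find_unpaired(split_dir):
    """Return (images without a mask, masks without an image) for one split.

    Files are compared by full filename, extension included, because the
    layout requires identical names in images/ and masks/.
    """
    split_dir = Path(split_dir)
    images = {p.name for p in (split_dir / "images").iterdir()
              if p.suffix.lower() in TIFF_EXTS}
    masks = {p.name for p in (split_dir / "masks").iterdir()
             if p.suffix.lower() in TIFF_EXTS}
    return sorted(images - masks), sorted(masks - images)
```

Calling `find_unpaired("data/train")` returns two lists; both should be empty before you start training.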



### Preprocessing (explained, not included)
The dataset here was prepared from a larger internal collection using a **CSV-driven copier** that:
- Reads `train_split.csv` and `test_split.csv` with columns `Image` and `Mask`,
- Copies files into the layout above,
- If a file is missing, retries an alternate extension (`.tif` ⇄ `.tiff`),
- Logs copied/missing counts.

We **omit** that script to keep this repo lean; any equivalent data mover is fine so long as the layout above is respected.
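
For readers who need a starting point, here is a minimal sketch of such a data mover. It is not the omitted script: the function names `resolve` and `copy_split` are hypothetical, and it assumes that the `Image` and `Mask` columns hold paths relative to a common source folder.

```python
import csv
import shutil
from pathlib import Path

# Alternate extension to try when the listed file does not exist.
ALT_EXT = {".tif": ".tiff", ".tiff": ".tif"}


def resolve(path):
    """Return an existing path, trying the alternate TIFF extension if needed."""
    path = Path(path)
    if path.exists():
        return path
    alt = ALT_EXT.get(path.suffix.lower())
    if alt is not None and path.with_suffix(alt).exists():
        return path.with_suffix(alt)
    return None


def copy_split(csv_path, source_root, dest_split):
    """Copy files listed in csv_path into dest_split/images and dest_split/masks."""
    source_root, dest_split = Path(source_root), Path(dest_split)
    counts = {"copied": 0, "missing": 0}
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            for column, subdir in (("Image", "images"), ("Mask", "masks")):
                src = resolve(source_root / row[column])
                if src is None:
                    counts["missing"] += 1
                    continue
                target_dir = dest_split / subdir
                target_dir.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, target_dir / src.name)
                counts["copied"] += 1
    # Log the copied/missing totals for this split.
    print(f"{csv_path}: copied {counts['copied']}, missing {counts['missing']}")
    return counts
```

For example, `copy_split("train_split.csv", "raw_data", "data/train")` would fill `data/train/images` and `data/train/masks` (here `raw_data` is a placeholder folder name). This sketch counts files individually; a stricter variant could skip a pair entirely when either half is missing.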



## 3) Training (with built‑in evaluation)

Passing `--test_dir` makes Cellpose evaluate on that split periodically during training and log the test loss alongside the training loss.  
This cell calls the **CLI** directly from the notebook.


```python
# Train with built-in evaluation
# !python -m cellpose --train \
#   --dir data/train \
#   --test_dir data/test \
#   --pretrained_model cyto3 \
#   --chan 0 --chan2 0 \
#   --learning_rate 0.2 \
#   --n_epochs 500 \
#   --batch_size 8 \
#   --use_gpu \
#   --model_name my_cell_model
```



### What metrics does Cellpose print?
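During training with `--test_dir`, Cellpose periodically logs the loss on the test split next to the training loss. **Instance-level** metrics, in which predicted and ground-truth objects are matched by IoU thresholds and summarized as **precision/recall/F-scores** or average precision, are computed after training on the predicted masks (for example with `cellpose.metrics.average_precision`). They complement pixel-level scores and are better at reflecting splits/merges of individual cells.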
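
To make instance-level matching concrete, here is a minimal sketch of IoU-based object matching. It is not Cellpose's implementation: the function name `instance_scores` is hypothetical, and it is written in pure Python for readability, so it is too slow for large images. It assumes both masks are 2D grids of integer labels with `0` as background.

```python
from collections import Counter


def instance_scores(true_mask, pred_mask, iou_threshold=0.5):
    """Match labeled objects by IoU and return counts plus precision/recall/F1.

    With a threshold of at least 0.5 and a strict comparison, an object can
    exceed the threshold with at most one partner, so counting the label
    pairs above the threshold yields a valid one-to-one matching.
    """
    if iou_threshold < 0.5:
        raise ValueError("this simple matching needs iou_threshold >= 0.5")
    true_area, pred_area, overlap = Counter(), Counter(), Counter()
    for true_row, pred_row in zip(true_mask, pred_mask):
        for t, p in zip(true_row, pred_row):
            t, p = int(t), int(p)
            if t:
                true_area[t] += 1
            if p:
                pred_area[p] += 1
            if t and p:
                overlap[(t, p)] += 1
    # A pair is a true positive when intersection / union exceeds the threshold.
    tp = sum(
        1
        for (t, p), inter in overlap.items()
        if inter / (true_area[t] + pred_area[p] - inter) > iou_threshold
    )
    fp = len(pred_area) - tp  # predicted objects with no match
    fn = len(true_area) - tp  # ground-truth objects with no match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}
```

A cell that is split into two predictions typically shows up as one false negative and two false positives, which a pixel-level score would barely register.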

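> Tip: If you run this notebook on Linux/macOS, you can capture logs by appending `2>&1 | tee results/logs/train_log.txt` to the command (create `results/logs/` first). In Windows PowerShell, pipe to `Tee-Object -FilePath results\logs\train_log.txt` instead.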
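
A platform-independent alternative is to launch the command from Python and copy its output to a file. The sketch below uses only the standard library; the helper name `run_and_log` is hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_and_log(command, log_path):
    """Run command, echo its output line by line, and save the same lines to log_path."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with open(log_path, "w", encoding="utf-8") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so warnings are logged too
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)
            log.write(line)
        return process.wait()  # exit code of the command
```

For training, the command would be passed as a list, e.g. `run_and_log([sys.executable, "-m", "cellpose", "--train", "--dir", "data/train", "--test_dir", "data/test"], "results/logs/train_log.txt")`, with the remaining flags from the training cell appended.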



## 4) Inference (segment new images)

Apply your trained model to a folder of images and save predicted masks as `.tif`.


```python
# Inference (uncomment to run)
# !python -m cellpose --dir data/test/images \
#   --pretrained_model my_cell_model \
#   --chan 0 --chan2 0 \
#   --diameter 0 \
#   --save_tif \
#   --use_gpu
```
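
After inference, a quick check that every input image received a mask can catch silent failures. This sketch assumes the masks are written next to the inputs as `<name>_cp_masks.tif`, which is the default naming I would expect from `--save_tif`; adjust `mask_suffix` if your version or output folder differs. The helper name `images_missing_masks` is hypothetical.

```python
from pathlib import Path

TIFF_EXTS = {".tif", ".tiff"}


def images_missing_masks(image_dir, mask_suffix="_cp_masks"):
    """List input images in image_dir that have no '<stem><mask_suffix>' TIFF beside them."""
    tiffs = [p for p in Path(image_dir).iterdir() if p.suffix.lower() in TIFF_EXTS]
    # Files whose stem ends with the suffix are predictions, the rest are inputs.
    mask_stems = {p.stem for p in tiffs if p.stem.endswith(mask_suffix)}
    inputs = [p for p in tiffs if not p.stem.endswith(mask_suffix)]
    return sorted(p.name for p in inputs if p.stem + mask_suffix not in mask_stems)
```

An empty list from `images_missing_masks("data/test/images")` means every image has a prediction file.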



## 5) Visual Sanity Check — Overlay prediction on the image

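This cell displays an image and its predicted mask as an overlay so you can quickly verify segmentation quality. Point `pr_path` at wherever your masks were written: with `--save_tif` and no `--savedir`, Cellpose saves them next to the input images as `<name>_cp_masks.tif`.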


```python
# Optional visualization (change filenames to an example that exists in your data)
# from skimage.io import imread
# from skimage.color import label2rgb
# import matplotlib.pyplot as plt
# from pathlib import Path
#
# img_path = Path('data/test/images') / 'example.tif'     # <-- change me
# pr_path  = Path('results/predictions') / 'example.tif'  # <-- change me
#
# img = imread(img_path)
# mask = imread(pr_path)
# overlay = label2rgb(mask, image=img, alpha=0.3, bg_label=0)
#
# plt.figure(figsize=(12,5))
# plt.subplot(1,2,1); plt.title('Image');   plt.axis('off'); plt.imshow(img, cmap='gray')
# plt.subplot(1,2,2); plt.title('Overlay'); plt.axis('off'); plt.imshow(overlay)
# plt.show()
```



## 6) Troubleshooting

- **Masks empty or under‑segmented?** Try setting a fixed size (`--diameter 20` as a starting guess) or tune thresholds (`--cellprob_threshold`, `--flow_threshold`).  
- **GPU not used?** Cellpose runs on PyTorch, so make sure a CUDA-enabled PyTorch build that matches your driver is installed and that `--use_gpu` is set. To run on CPU, omit `--use_gpu`.  
- **Channel confusion?** Double-check `--chan`/`--chan2`. For grayscale, `0/0` is fine. For RGB, the indices are `1` = red, `2` = green, `3` = blue, so green cytoplasm with blue nuclei would be `--chan 2 --chan2 3`; choose according to your staining.  
- **File pairing issues?** Make sure filenames in `images/` and `masks/` are identical (including extension).  
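
For the GPU item, one quick diagnostic is to ask PyTorch directly whether it can see a CUDA device. This is a minimal sketch; the helper name `cuda_visible_to_torch` is hypothetical, and it only reports what PyTorch detects in the current environment.

```python
def cuda_visible_to_torch():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return False
    return bool(torch.cuda.is_available())


print("CUDA visible to PyTorch:", cuda_visible_to_torch())
```

If this prints `False` on a machine with an NVIDIA GPU, the installed PyTorch build is likely CPU-only or mismatched with the driver.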



## 7) Using the Cellpose GUI (Optional)

Cellpose also provides a **graphical user interface (GUI)** which can be very useful for:

- Manually checking segmentation quality  
- Editing masks (add/remove cells)  
- Running inference interactively on images  
- Evaluating how well your trained model generalizes

To start the GUI, install Cellpose with the GUI extras (`pip install "cellpose[gui]"`) and run in a terminal:

```bash
cellpose
```

or, equivalently, by invoking the Python module:

```bash
python -m cellpose
```

This opens the GUI window where you can drag and drop images, apply your model, and visually inspect the results.
