# Enable few-shot learning for biomedical image segmentation 
## Building on prior knowledge & explainable frameworks

Peicheng Wu, Yequan Zhao, Yubin Deng, Zhixiong Chen, Liyan Tan

# Introduction

Semantic Segmentation (SS): Classifies each pixel in an image into a predefined class.

Instance Segmentation (IS): Identifies and delineates each object of interest in an image, distinguishing between instances of the same class.
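The difference between the two tasks can be illustrated with a toy sketch. It assumes a simple intensity threshold for the semantic step and 4-connected component labeling for the instance step; both are stand-ins chosen for illustration, not the methods discussed in this presentation (touching cells, for example, would be merged by this approach). The function names `semantic_mask` and `instance_labels` are hypothetical.

```python
from collections import deque

def semantic_mask(image, threshold):
    """Toy semantic segmentation: every pixel gets a class (1 = cell, 0 = background)."""
    return [[1 if px > threshold else 0 for px in row] for row in image]

def instance_labels(mask):
    """Toy instance segmentation: each 4-connected foreground blob gets its own id."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 1 and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: flood-fill its blob
                next_id += 1
                labels[r][c] = next_id
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id

# Two bright "cells" on a dark background (made-up intensities)
image = [
    [9, 9, 0, 0, 0],
    [9, 9, 0, 0, 0],
    [0, 0, 0, 8, 8],
    [0, 0, 0, 8, 8],
]
mask = semantic_mask(image, threshold=5)      # both cells share class 1
labels, n_instances = instance_labels(mask)   # each cell gets a distinct id
print(mask)
print(labels, n_instances)
```

In the semantic mask both cells carry the same class label, while the instance labels separate them into distinct objects.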

Applications in biomedical research:
* Segmentation of cell bodies, membranes, and nuclei from microscopy images

Deep-learning-based models have achieved great success in biomedical segmentation.

<center><img src="figs/instance_seg.png" width=500px alt="default"/></center>



## Problems of Deep-Learning-Based Segmentation:

* (Big) Data-driven
    - Needs a large amount (e.g., > 10,000) of expert-annotated training data
    - Not capable of few-shot learning (i.e., learning from small (< 100) training sets)
* Lacks explainability
    - Black-box nature of deep learning based methods
    - Trustworthy? 
* Huge computational overhead (high-end GPUs)
    - Sustainability?
    - Data privacy?

## Problem Formulation:

Root cause: previous biomedical segmentation frameworks are entirely data-driven and do not build on prior (human) knowledge or experience.

Target: incorporate prior human knowledge and experience into the biomedical image segmentation framework to

* Enable few-shot learning
* Enhance explainability / transparency

## Presentation Distribution:

* Background & Problem formulation: Yequan Zhao

* Cellpose[1]: Deep-learning-based method that can adapt to small datasets
    - High-level ideas & framework explained: Yubin Deng
    - Experiment results: Zhixiong Chen

* Kartezio[2]: Fully transparent and easily interpretable segmentation pipeline for few-shot learning
    - High-level ideas & framework explained: Liyan Tan
    - Experiment results: Peicheng Wu

[1]: Pachitariu, M. & Stringer, C. (2022). Cellpose 2.0: how to train your own model. Nature Methods, 1-8.

[2]: Cortacero, K., et al. (2023). Evolutionary design of explainable algorithms for biomedical image segmentation. Nature Communications, 14(1), 7112.

# Cellpose

## Motivation

We expect to save human labor via fully-automated methods (e.g., data-driven deep learning), but...

- They need huge amounts of data → huge amounts of expert-annotated training data
- They do not generalize well to new data sources → repeated annotation and training

Problem: Previous deep-learning-based methods lack generalizability / transferability, due to
- Limited model capacity
- Training only on specialized datasets

## Method:

  1. New model with greater expressive power

  <center><img src="figs/CellposeArch.jpg" width="500" height="500" alt="Description of image"></center>
  

## Method:

2. New training dataset:
  - Generalist dataset: to make the model generalize more widely and more robustly.
    - Existing specialized datasets for cell segmentation
    - Images from internet searches for keywords such as ‘cytoplasm’, ‘cellular microscopy’, ‘fluorescent cells’
    - Other types of microscopy images
    - Nonmicroscopy images: fruits, rocks, jellyfish

  - Specialist dataset: to benchmark expressive power
    - 100 images from the Cell Image Library
    - Large and visually uniform

## Further adapt to few-shot learning:

<center><img src="figs/humanloop.jpg" width=1000px alt="default"/></center>

With the Cellpose generalist pretrained model, we can quickly adapt to new data with far less training data:

- training from scratch: 200,000 user-annotated regions of interest (ROI)
- fine-tune from Cellpose pretrained model: only 500–1,000 ROI
- human-in-the-loop pipeline: reduced the required user annotation to 100–200 ROI
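As a quick worked calculation, the ROI counts above translate into the following reduction factors relative to training from scratch. This is only arithmetic on the numbers quoted in the list; the variable and function names are hypothetical.

```python
# ROI counts quoted above; ranges are stored as (low, high)
SCRATCH_ROI = 200_000
budgets = {
    "fine-tune from pretrained model": (500, 1_000),
    "human-in-the-loop pipeline": (100, 200),
}

def reduction_factor(baseline, roi_range):
    """Return (smallest, largest) reduction factor for a range of required ROIs."""
    low, high = roi_range
    # Needing more ROIs (high end) means a smaller reduction, and vice versa
    return baseline / high, baseline / low

for name, roi_range in budgets.items():
    min_factor, max_factor = reduction_factor(SCRATCH_ROI, roi_range)
    print(f"{name}: {min_factor:.0f}x to {max_factor:.0f}x fewer ROIs")
```

Fine-tuning therefore cuts the annotation budget by a factor of 200 to 400, and the human-in-the-loop pipeline by a factor of 1,000 to 2,000.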

# Kartezio

## Motivation:

Cellpose enables few-shot learning, but 
* only for cell segmentation tasks
* cannot guarantee explainability / transparency due to the "black box" nature of deep learning models

Need for Explainability:
* A growing demand for algorithms that are not only effective but also transparent and interpretable. 
* Decisions must be justifiable and comprehensible in medical and biological settings.

Kartezio is a fully explainable framework, with
* comparable performance to Cellpose on small datasets (100)
* better performance on tiny datasets (10–20)
* few-shot learning for general segmentation tasks, including cell segmentation, tumor segmentation, etc.


## Highlights of core methods:

Explainability and Transparency:
  * Kartezio is designed to create fully transparent and interpretable image processing pipelines.
  * Kartezio's pipelines can be easily understood and inspected by humans.

Few-Shot Learning:
  * The ability to perform effectively with much smaller training datasets.

Modularity and Flexibility:
  * The modular design is based on Cartesian Genetic Programming (CGP).
  * Adaptively assembles and parameterizes computer vision (CV) functions to create custom pipelines.
  * Integration of expert knowledge: builds decades of human expertise into its pipeline generation process.

Practical Utility & Complement to DL:
  * Effective across a variety of imaging types and scenarios in biomedicine.
  * A complementary tool for deep learning.
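To make the idea of assembling and parameterizing CV functions more concrete, here is a minimal sketch of how a CGP-style genome could be decoded into a readable pipeline. Everything in it is an illustrative assumption: the three-function library, the `(name, input_a, input_b, parameter)` node encoding, and the helper names are hypothetical and much simpler than Kartezio's actual function library and genome.

```python
# Toy function library operating on grayscale images stored as lists of lists
def threshold(a, p):
    return [[255 if px >= p else 0 for px in row] for row in a]

def invert(a, p):
    return [[255 - px for px in row] for row in a]

def pixel_max(a, b, p):
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# name -> (function, number of image inputs it reads)
LIBRARY = {"threshold": (threshold, 1), "invert": (invert, 1), "pixel_max": (pixel_max, 2)}

# Hypothetical genome: value 0 is the input image; the node at list position k
# produces value k + 1 and may only read values with a smaller index.
GENOME = [
    ("invert", 0, 0, 0),        # value 1
    ("threshold", 1, 0, 128),   # value 2
    ("pixel_max", 0, 0, 0),     # value 3: inactive, the output never reads it
]
OUTPUT = 2  # index of the value returned by the pipeline

def active_nodes(genome, output):
    """Trace back from the output to find the nodes that actually matter."""
    stack, active = [output], set()
    while stack:
        idx = stack.pop()
        if idx == 0 or idx in active:
            continue
        active.add(idx)
        name, a, b, _ = genome[idx - 1]
        stack.extend((a, b)[:LIBRARY[name][1]])
    return sorted(active)

def run(genome, output, image):
    """Execute only the active nodes, in feed-forward order."""
    values = {0: image}
    for idx in active_nodes(genome, output):
        name, a, b, p = genome[idx - 1]
        fn, arity = LIBRARY[name]
        values[idx] = fn(*[values[i] for i in (a, b)[:arity]], p)
    return values[output]

def describe(genome, output):
    """Render the active part of the genome as human-readable steps."""
    lines = []
    for idx in active_nodes(genome, output):
        name, a, b, p = genome[idx - 1]
        inputs = ", ".join("input" if i == 0 else f"v{i}" for i in (a, b)[:LIBRARY[name][1]])
        lines.append(f"v{idx} = {name}({inputs}, p={p})")
    return lines

image = [[10, 200], [130, 90]]
result = run(GENOME, OUTPUT, image)
for line in describe(GENOME, OUTPUT):
    print(line)
print(result)
```

Because the decoded pipeline is just a short list of named operations with explicit parameters, a human can read and inspect it directly, which is the sense in which such pipelines are transparent.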


## Framework

<center><img src="figs/kartezio_framework.png" width=1000px alt="default"/></center>

## Model example

<center><img src="figs/kartezio_example.png" width=1000px alt="default"/></center>

Hence, Kartezio can be seen as a method for finding a better way to preprocess specific images (cancer or other medical images). It is driven by CGP, which automatically searches for the combination of functions that works best.
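The search itself can be sketched as a toy evolutionary loop. The sketch assumes a (1+λ) evolution strategy, a common choice for Cartesian Genetic Programming, together with a four-function library, a dictionary genome, and an IoU fitness; all of these, including every name in the code, are illustrative assumptions rather than Kartezio's actual implementation.

```python
import random

def threshold(a, p):
    return [[255 if px >= p else 0 for px in row] for row in a]

def invert(a, p):
    return [[255 - px for px in row] for row in a]

def pixel_max(a, b, p):
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def pixel_min(a, b, p):
    return [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

LIBRARY = [(threshold, 1), (invert, 1), (pixel_max, 2), (pixel_min, 2)]  # (function, arity)
N_NODES = 4

def random_node(rng, position):
    # The node at list position k may only read values 0..k (feed-forward graph)
    return [rng.randrange(len(LIBRARY)), rng.randint(0, position),
            rng.randint(0, position), rng.randrange(256)]

def random_genome(rng):
    return {"nodes": [random_node(rng, k) for k in range(N_NODES)],
            "output": rng.randint(0, N_NODES)}

def run(genome, image):
    values = [image]  # value 0 is the input image
    for f_idx, a, b, p in genome["nodes"]:
        fn, arity = LIBRARY[f_idx]
        values.append(fn(*[values[a], values[b]][:arity], p))
    return values[genome["output"]]

def iou(pred, target):
    """Intersection over union between the binarized prediction and a 0/1 mask."""
    inter = union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += int(p >= 128 and t == 1)
            union += int(p >= 128 or t == 1)
    return inter / union if union else 1.0

def mutate(genome, rng, rate=0.3):
    child = {"nodes": [node[:] for node in genome["nodes"]], "output": genome["output"]}
    for k in range(N_NODES):
        if rng.random() < rate:
            child["nodes"][k] = random_node(rng, k)
    if rng.random() < rate:
        child["output"] = rng.randint(0, N_NODES)
    return child

def evolve(images, targets, generations=200, lam=4, seed=0):
    rng = random.Random(seed)
    def fitness(g):
        return sum(iou(run(g, im), tg) for im, tg in zip(images, targets)) / len(images)
    parent = random_genome(rng)
    parent_fit = fitness(parent)
    history = [parent_fit]
    for _ in range(generations):
        for child in [mutate(parent, rng) for _ in range(lam)]:
            fit = fitness(child)
            if fit >= parent_fit:  # accept ties so the search can drift neutrally
                parent, parent_fit = child, fit
        history.append(parent_fit)
    return parent, parent_fit, history

# Two tiny made-up training images in which the dark pixels are the objects
images = [[[20, 220, 210], [30, 200, 25], [215, 205, 40]],
          [[230, 35, 225], [210, 15, 45], [240, 235, 10]]]
# Stand-in for expert annotations: 1 where the pixel is dark
targets = [[[1 if px < 100 else 0 for px in row] for row in im] for im in images]

best, best_fit, history = evolve(images, targets)
print(f"best mean IoU: {best_fit:.2f}")
```

The loop only ever keeps a child that scores at least as well as its parent, so the best fitness never decreases, and the whole search runs on a handful of annotated images instead of a large training set.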
