
<div align="center">
    <img src="https://gitee.com/niov/STOCNSFs/raw/main/cs_benchmark/precision.png" width=80% height=80% alt="" />
    <h4>
      Cell Segmentation: How to choose tools that match your Stereo-seq data
    </h4>
</div>

# What Is Cell Segmentation?
Accurate, automated cell segmentation is difficult largely because cell shape, size, and density differ across tissue types. Deep learning algorithms for computer vision are increasingly used for biological image analysis tasks, including nuclear and cell segmentation. Against this background, we explore which segmentation algorithm is best suited to Stereo-seq image data.

**Journal**: bioRxiv<br>
**DOI**: https://doi.org/10.1101/2023.08.08.552402<br>
**Published Date**: Dec 27, 2023<br>
**GitHub**: -<br>
**Tutorial**: -<br>
**Environment (mirror)**: cs_benchmark (URL: https://cloud.stomics.tech/#/public/image)<br>

# Tutorial

## Input and Output

We have deployed the following 5 cell segmentation methods:

> [Cellpose](https://github.com/MouseLand/cellpose) is a generalist algorithm for cellular segmentation.<br>
> [DeepCell](https://github.com/vanvalenlab/deepcell-tf) is a deep learning library for single-cell analysis of biological images. Here, pre-trained DeepCell models are used for cell/nuclei segmentation from raw image data.<br><br>
> [StereoCell](https://github.com/STOmics/StereoCell/tree/dev) is an open-source software for measuring and analyzing cell images. Here, CellProfiler is used for object detection and region growth-based object segmentation.<br><br>
> [SAM](https://github.com/facebookresearch/segment-anything) (Segment Anything Model) is a promptable, general-purpose image segmentation model from Meta AI. Here, it is applied to cell segmentation.<br><br>
> [LT](https://github.com/BGI-DEV-REG/ARTISTA) is an open-source software for measuring and analyzing cell images. Here, CellProfiler is used for object detection and region growth-based object segmentation.<br><br>

The main parameters of the program are:
- Input

    - ```is_gpu```: whether to use the GPU<br>
    - ```method```: segmentation methods to run, chosen from ['deepcell', 'cellpose', 'stereocell', 'sam', 'lt']<br>
    - ```image_path```: path to the Stereo-seq image data<br>

- Output

    - ```output_path```: path where the cell segmentation results are saved

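As a minimal sketch of how these parameters fit together, the snippet below collects them in one object and checks them before a run. `SegmentationConfig` and `SUPPORTED_METHODS` are hypothetical names introduced for illustration only; they are not part of cs_benchmark.

```python
from dataclasses import dataclass

# Method names as listed for the `method` parameter above.
SUPPORTED_METHODS = ('deepcell', 'cellpose', 'stereocell', 'sam', 'lt')


@dataclass
class SegmentationConfig:  # hypothetical helper, not part of cs_benchmark
    image_path: str
    output_path: str
    method: list
    is_gpu: bool = False

    def validate(self):
        """Raise ValueError on an unknown method or an empty path."""
        unknown = [m for m in self.method if m not in SUPPORTED_METHODS]
        if unknown:
            raise ValueError('unsupported method(s): {}'.format(unknown))
        if not self.image_path or not self.output_path:
            raise ValueError('image_path and output_path must be set')
        return self


# Placeholder paths; replace them with your own data locations.
config = SegmentationConfig('images', 'results', ['cellpose', 'sam'], is_gpu=True).validate()
print(config)
```

Validating up front avoids launching a long segmentation run that fails late because of a misspelled method name or an empty path.
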
## Demo Data

Here, we present the StereoCell cell segmentation test dataset, which we use to compare the performance of different segmentation methods. Recent studies have shown that diverse data modalities, complex differences in image backgrounds, and variation in cell distribution and morphology pose great challenges to segmentation methods. We therefore built the test set from Stereo-seq imaging data covering 4 staining methods: ssDNA, HE, FB, and mIF. All 42 ROIs in the test set come from 11 animal sections and 1 plant tissue sample. The test dataset is available at https://datasets.deepcell.org/ for noncommercial use.

<div align="center">
    <img src="docs/slice.png" width=60% height=60% alt="Fig StereoCell benchmarking" />
    <h6>
      Fig 1 Benchmarking for Stereo-seq Image Data
    </h6>
</div>


## Time Estimates
<div align="center">
    <img src="docs/time.png" align="right" width=40% height=40% alt="" />
    <br>
</div>
The test data show that the traditional method is much faster than the deep learning methods. Among the deep learning methods, Cellpose and DeepCell take a similar amount of time, while StereoCell is faster. SAM, a large vision model, takes comparatively long.

| Method | CPU Cores | CPU Memory (GB) | GPU Memory (GB) | Running Time (min) |
|:--:|:--:|:--:|:--:|:--:|
|LT|32|32|0|5~10|
|SAM|32|32|0|5~10|
|StereoCell|32|32|0|5~10|
|Cellpose|32|32|0|5~10|
|Deepcell|32|32|0|5~10|

# Benchmarking

## Index
<div align="center">
    <img src="docs/seg.png" width=50% height=50% alt="Precision and recall for cell segmentation" />
    <h6>
      Fig 2 Precision and recall for cell segmentation
    </h6>
</div>

To evaluate the relative performance of the different approaches, we compared five methods: StereoCell (nucleus), DeepCell (whole-cell), Cellpose (whole-cell), SAM (whole-cell), and LT. All methods are evaluated on the StereoCell test set, and the following indices are reported:
 - n_pred: the number of predicted objects
 - n_true: the number of ground-truth objects
 - precision
 - recall
 - F1

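The relationship between these indices can be sketched as follows. This is an illustration under an assumption, not the evaluation script itself: the rule that decides when a predicted object counts as matched to a ground-truth object is not described here, so `n_matched` is a hypothetical input standing for the number of correctly detected objects.

```python
def detection_scores(n_true, n_pred, n_matched):
    """Return (precision, recall, F1) from object counts.

    n_true:    number of ground-truth objects
    n_pred:    number of predicted objects
    n_matched: number of predictions matched to a ground-truth object
               (hypothetical input; the matching rule is an assumption)
    """
    precision = n_matched / n_pred if n_pred else 0.0
    recall = n_matched / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up counts, for illustration only.
precision, recall, f1 = detection_scores(n_true=100, n_pred=80, n_matched=60)
print('precision={:.3f} recall={:.3f} F1={:.3f}'.format(precision, recall, f1))
```

F1 is the harmonic mean of precision and recall, so a method that over-segments (many predictions, low precision) and one that under-segments (few predictions, low recall) are both penalized.
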
## Result
<div align="center">
    <img src="docs/precision.png" width=80% height=80% alt="Benchmarking results for Stereo-seq image data" />
    <h6>
      Fig 3 Benchmarking for Stereo-seq Image Data
    </h6>
</div>

We evaluated 5 popular cell segmentation methods on the StereoCell test dataset, covering both segmentation level and algorithm performance. Based on the F1 score of each algorithm, we recommend the following:
 - Scenario 1: For **ssDNA**-stained data, we recommend the _StereoCell_ cell segmentation algorithm.

 - Scenario 2: For **HE/mIF**-stained data, we recommend the _DeepCell/Cellpose_ cell segmentation algorithms.

 - Scenario 3: For **FB**-stained data, we recommend the _DeepCell/StereoCell/Cellpose_ cell segmentation algorithms.

 - Scenario 4: If the segmentation results for **ssDNA/HE**-stained data need to be tuned toward an optimal result, we recommend the _LT algorithm_, which can be adjusted through its parameters.

Overall, **Cellpose** and **DeepCell** are the most broadly applicable.

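The first three scenarios above can be written down as a small lookup, shown here as a sketch. `RECOMMENDED` and `recommend` are hypothetical names, not part of cs_benchmark, and the fourth scenario (LT with manual parameter tuning) is left out because it depends on the user's own tuning goal.

```python
# Recommendations transcribed from scenarios 1 to 3 above.
RECOMMENDED = {
    'ssDNA': ['StereoCell'],
    'HE': ['DeepCell', 'Cellpose'],
    'mIF': ['DeepCell', 'Cellpose'],
    'FB': ['DeepCell', 'StereoCell', 'Cellpose'],
}


def recommend(stain):
    """Return the recommended methods for a stain name (case-insensitive)."""
    by_lower = {name.lower(): name for name in RECOMMENDED}
    key = by_lower.get(stain.strip().lower())
    if key is None:
        raise ValueError('no recommendation for stain: {!r}'.format(stain))
    return RECOMMENDED[key]


print(recommend('ssDNA'))
print(recommend('he'))
```
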
## Supplementary information

| Staining/Sample | StereoCell | DeepCell | Cellpose | SAM | LT |
|--------: | :---------:|:--------:| :---------:|:--------:|:--------:|
| HE/mouse_stomach | 0.14831 | 0.00965 |  0.00103 | 0.27114 | 0.02818 |
| HE/human_ovarian_cancer | 0.57156 | 0.0197 |  0.34593 | 0.29969 | 0.12209 |
| HE/mouse_large_intestine | 0.12297 | 0.00691 |  0.0155 | 0.31266 | 0.02818 |
| HE/human_stomach_cancer | 0.25906 | 0.00238 |  0.03362 | 0.31374 | 0.04236 |
| HE/mouse_brain | 0.48167 | 0.00483 |  0.32376 | 0.48882 | 0.24997 |
| HE/human_melanoma | 0.51569 | 0.01312 |  0.30805 | 0.44213 | 0.16007 |
| FB/arabidopsis_thaliana_seeds | 0.11188 | 0.03638 |  0.71235 | 0.20158 | 0.10615 |
| ssDNA/murine_kidney | 0.50257 | 0.48804 |  0.58297 | 0.30123 | 0.18944 |
| ssDNA/mouse_placenta | 0.34727 | 0.50867 |  0.56316 | 0.2437 | 0.22016 |
| ssDNA/mouse_brain | 0.41138 | 0.43785 |  0.54767 | 0.30734 | 0.23151 |
| ssDNA/human_liver | 0.44711 | 0.48498 |  0.5488 | 0.32185 | 0.17862 |
| ssDNA/murine_prostate | 0.58785 | 0.67761 |  0.72194 | 0.36294 | 0.19122 |
| mIF/mouse_liver | 0.02789 | 0.01132 |  0.67243 | 0.25676 | 0.13448 |

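To read the table programmatically, one option is to look up the highest-scoring method per sample, as in the sketch below. It copies three rows from the table and assumes that a higher value is better; the table does not name the metric, so treat that as an assumption. `SCORES` and `best_method` are hypothetical names.

```python
# Column order of the table above.
METHODS = ['StereoCell', 'DeepCell', 'Cellpose', 'SAM', 'LT']

# Three rows copied from the table above.
SCORES = {
    'HE/mouse_brain': [0.48167, 0.00483, 0.32376, 0.48882, 0.24997],
    'FB/arabidopsis_thaliana_seeds': [0.11188, 0.03638, 0.71235, 0.20158, 0.10615],
    'ssDNA/mouse_brain': [0.41138, 0.43785, 0.54767, 0.30734, 0.23151],
}


def best_method(sample):
    """Return the method with the highest value for a sample (assumes higher is better)."""
    row = SCORES[sample]
    return METHODS[row.index(max(row))]


for sample in SCORES:
    print(sample, '->', best_method(sample))
```
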
# Acknowledgements

We thank: 

- [Cellpose_Cell_Segmentation_Tutorial](https://cloud.stomics.tech/#/public/tool/app-detail/notebook/224/--)
- [DeepCell_Cell_Segmentation](https://cloud.stomics.tech/#/public/tool/app-detail/notebook/233/--)
- [StereoCell_Cell_Segmentation](https://cloud.stomics.tech/#/public/tool/app-detail/notebook/222/--)
- [SAM_Cell_Segmentation](https://cloud.stomics.tech/#/public/tool/app-detail/notebook/206/--)
- [LT_Cell_Segmentation](https://cloud.stomics.tech/#/public/tool/app-detail/notebook/79/--)

# Appendix

**cs_benchmark** uses one conda environment per method, so that several segmentation methods can be called side by side without dependency conflicts. The setup of each environment is listed below:

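With one environment per method, each method is run with the Python interpreter of its own environment. The sketch below shows one way to build such a command. It assumes the usual POSIX conda layout (`<conda root>/envs/<name>/bin/python`); `env_python` and `build_command` are hypothetical helpers, and the script name and paths are placeholders.

```python
import os


def env_python(conda_root, env_name):
    """Path to the interpreter of a conda environment (assumes the POSIX layout)."""
    return os.path.join(conda_root, 'envs', env_name, 'bin', 'python')


def build_command(conda_root, env_name, script, *args):
    """Build an argument list suitable for subprocess.run."""
    return [env_python(conda_root, env_name), script] + [str(a) for a in args]


# Placeholder root, script, and paths; adapt them to your installation.
cmd = build_command('anaconda3', 'cellpose', 'icellpose.py', '-i', 'images', '-o', 'results')
print(cmd)
```

Calling the interpreter by its full path avoids having to run `conda activate` inside a notebook cell or a subprocess.
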
<details close>
<summary>CellPose</summary>

```text
source activate 
conda deactivate
conda create -n cellpose python=3.8
conda activate cellpose
pip install torch
pip install torchvision
pip install cellpose
pip install tifffile
pip install patchify
```
</details>

<details close>
<summary>DeepCell</summary>

```text
source activate 
conda deactivate
conda create -n deepcell python=3.8
conda activate deepcell
pip install DeepCell==0.12.9
pip install "tifffile>=2023.2.3"
```

</details>

<details close>
<summary>StereoCell</summary>

```text
source activate 
conda deactivate
conda create -n stereocell python=3.8
conda activate stereocell
pip install "onnxruntime>=1.15.1"
pip install "tifffile>=2023.2.3"
pip install "scikit-image>=0.21.0"
pip install "opencv-python>=4.8.0.76"
pip install tqdm
pip install stio==0.1.0 --no-deps
pip install cell-bin==1.2.5 --no-deps
pip install requests
mkdir -p /home/weights
python -c "from cellbin.dnn.weights import auto_download_weights, WEIGHTS; auto_download_weights('/home/weights', WEIGHTS.keys())"

```
</details>


<details close>
<summary>SAM</summary>

```text

source activate 
conda deactivate
conda create -n sam python=3.8
conda activate sam
pip install git+https://ghproxy.com/https://github.com/facebookresearch/segment-anything.git
pip install opencv-python pycocotools matplotlib onnxruntime onnx
pip install torch torchvision scipy tifffile tqdm -i https://mirrors.ustc.edu.cn/pypi/web/simple/

```

</details>

<details close>
<summary>Eval</summary>

```text
pip install cython six openpyxl
cd eval
python setup.py install --user
```
    
</details>

# Running

Example output from a previous run: `sam ran for a total of 4769.733977794647 s`

## Cell Segmentation
Image file names must contain `_img`, and mask file names must contain `_mask`.

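The naming rule above can be checked before a run. The sketch below pairs each `_img` file with its `_mask` counterpart. It assumes that the marker sits at the end of the file stem and that image and mask share the same extension; `pair_images_and_masks` is a hypothetical helper, not part of the benchmark code.

```python
import os


def pair_images_and_masks(file_names):
    """Map each `_img` file to its `_mask` counterpart, or None if it is missing."""
    names = set(file_names)
    pairs = {}
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if stem.endswith('_img'):
            mask = stem[:-len('_img')] + '_mask' + ext
            pairs[name] = mask if mask in names else None
    return pairs


# Made-up file names, for illustration only.
files = ['a_img.tif', 'a_mask.tif', 'b_img.tif', 'notes.txt']
print(pair_images_and_masks(files))
```
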
In [None]:
import os
import time
import subprocess
os.environ['CUDA_VISIBLE_DEVICES'] = '2'

work_path = os.path.abspath('.')
# work_path = '/data/work/benchmark/benchmark'
__py__ = {
    'MEDIAR':'anaconda3/envs/MEDIAR/bin/python',
    'cellpose': 'anaconda3/envs/cellpose/bin/python',
    'cellpose3':'anaconda3/envs/cellpose3/bin/python',
    'deepcell': 'anaconda3/envs/deepcell/bin/python',
    'sam': 'anaconda3/envs/sam/bin/python',
    'stardist':'anaconda3/envs/stardist/bin/python',
}
# Every method that may appear in `method` below must be listed here.
__methods__ = ['MEDIAR', 'cellpose', 'cellpose3', 'deepcell', 'sam', 'stardist']


__script__ = {
    'MEDIAR':os.path.join(work_path,'src/methods/MEDIAR/MEDIAR/iMEDIAR.py'),
    'cellpose': os.path.join(work_path, 'src/methods/cellpose/icellpose.py'),
    'cellpose3':os.path.join(work_path,'src/methods/cellpose3/icellpose.py'),
    'deepcell': os.path.join(work_path, 'src/methods/deepcell/ideepcell2.py'),
    'sam': os.path.join(work_path, 'src/methods/sam/isam.py'),
    'stardist':os.path.join(work_path,'src/methods/stardist/istardist.py')
}
# ############################### Image file names must contain -img; mask file names must contain -mask

#method = ['sam','cellpose3','cellpose','MEDIAR']
#method= ['MEDIAR','cellpose','sam','cellpose3','v3']

# ###### Section you need to modify ######
is_gpu = True
method = ['cellpose','cellpose3','MEDIAR','sam','stardist','deepcell']
image_path = ''
output_path = ''
img_type = 'ss' # he or ss
###########################
print(os.listdir(image_path))
print(work_path)

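# Fail early, with a clear message, if a requested method is not configured.
unknown = [m for m in method if m not in __methods__]
assert not unknown, 'unsupported method(s): {}'.format(unknown)

for m in method:
    start = time.time()
    # Run each method with the interpreter of its own conda environment.
    cmd = [__py__[m], __script__[m], '-i', image_path,
           '-o', os.path.join(output_path, m), '-g', str(is_gpu), '-t', img_type]
    print(' '.join(cmd))
    subprocess.run(cmd)
    t = time.time() - start
    print('{} ran for a total of {} s'.format(m, t))
    print('{} result saved to {}'.format(m, os.path.join(output_path, m)))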
    

## Evaluation

In [None]:
# evaluation
import os

py = '/storeData/USER/data/01.CellBin/00.user/fanjinghong/home/anaconda3/envs/benchmark/bin/python'
script = '/storeData/USER/data/01.CellBin/00.user/fanjinghong/code/benchmark2/src/eval/cell_eval_multi.py'

gt_path = '/storeData/USER/data/01.CellBin/00.user/shican/benchmark_new/dataset/ssDNA/gt'
dt_path = '/storeData/USER/data/01.CellBin/00.user/fanjinghong/code/benchmark2/input/ssDNA/gaussian_blur_output_3'
eval_path = '/storeData/USER/data/01.CellBin/00.user/fanjinghong/code/benchmark2/input/ssDNA/gaussian_blur_eval_3'
os.makedirs(eval_path, exist_ok=True)

cmd = '{} {} -g {} -d {} -o {}'.format(py, script, gt_path, dt_path, eval_path)
print(cmd)
os.system(cmd)



# **Contact Information**
For questions about this notebook, please contact: _cloud@stomics.tech_.

# **Cite**
If you use STOmics/Stereo-seq data in your research, please consider citing us in your article:
> **Code available** The source code of this algorithm is available on GitHub (https://github.com/BGI-DEV-REG/ARTISTA). A visual and convenient way to run this algorithm is available on the STOmics Cloud Platform (https://cloud.stomics.tech/).

> **Acknowledgement** We express our gratitude to the computing platform STOmics Cloud (https://cloud.stomics.tech/) for enabling workflow automation and accelerating Stereo-seq data analysis.

In [None]:
pip install compute_overlap

In [None]:
cd src/eval


In [None]:
!python setup.py install --user