# Cell Correction

This section shows how to correct cells in Stereopy. There are two ways to do it:

1. correcting from BGEF and mask;
2. correcting from GEM and mask.

**Note:**

1. If you need to generate the mask file from an ssDNA image, several required modules must be installed beforehand; refer to Image Preparation.

2. We provide two versions of the algorithm: one is slower but more accurate, the other is faster but less accurate. Set the parameter `fast` to `True` to run the faster version; the default is `True`.

## Correcting from BGEF and Mask

In this approach, specify the path of the BGEF with `bgef_path`, the path of the mask with `mask_path`, and the output directory for the corrected result with `out_dir`.

Optionally, specify the number of processes with `process_count`, which defaults to 10. By default, cell correction returns a StereoExpData object; if `only_save_result` is set to `True`, it returns only the path of the corrected CGEF.

In [None]:
from stereo.tools.cell_correct import cell_correct

bgef_path = "SS200000135TL_D1.raw.gef"
mask_path = "SS200000135TL_D1_regist.tif"
out_dir = "cell_correct_result"

data = cell_correct(out_dir=out_dir,
                    bgef_path=bgef_path,
                    mask_path=mask_path,
                    process_count=10,
                    only_save_result=False,
                    fast=True
                    )

The output directory contains the following files:

1. `.raw.cellbin.gef` - the uncorrected CGEF, generated from the BGEF and mask;
2. `.adjusted.gem` - the GEM after correction;
3. `.adjusted.cellbin.gef` - the corrected CGEF, generated from the `.adjusted.gem`;
4. `err.log` - records the cells that could not be corrected and are therefore not contained in `.adjusted.gem` and `.adjusted.cellbin.gef`.
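
To confirm that a run produced the files listed above, a small check like the following can help. It is only a sketch: `summarize_outputs` and `missing_outputs` are hypothetical helpers, not part of Stereopy, and they assume the output file names end with exactly the suffixes in the list.

```python
from pathlib import Path

# Suffixes taken from the list above; the file name prefix is assumed to vary per sample.
EXPECTED_SUFFIXES = (
    ".raw.cellbin.gef",
    ".adjusted.gem",
    ".adjusted.cellbin.gef",
    "err.log",
)


def summarize_outputs(out_dir):
    """Map each expected suffix to the sorted file names in out_dir that end with it."""
    names = sorted(p.name for p in Path(out_dir).iterdir() if p.is_file())
    return {
        suffix: [name for name in names if name.endswith(suffix)]
        for suffix in EXPECTED_SUFFIXES
    }


def missing_outputs(out_dir):
    """Return the expected suffixes for which no file was found."""
    return [suffix for suffix, found in summarize_outputs(out_dir).items() if not found]
```

Calling `missing_outputs("cell_correct_result")` after a run would return an empty list if every expected file is present.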

In some cases, the mask file can be generated from an ssDNA image. The segmentation model type `model_type` can be set to `deep-learning` or `deep-cell`; see the Cell Segmentation section for details.

In [None]:
from stereo.tools.cell_correct import cell_correct

out_dir = './cell_correct_result'
bgef_path = './SS200000135TL_D1.raw.gef'
image_path = './SS200000135TL_D1_regist.tif'
model_path = './cell_segmentation/seg_model_20211210.pth'
model_type = 'deep-learning'
# model_path = 'cell_segmentation_deepcell'
# model_type = 'deep-cell'

data = cell_correct(out_dir=out_dir,
                    bgef_path=bgef_path,
                    image_path=image_path,
                    model_path=model_path,
                    model_type=model_type,
                    gpu=-1,
                    process_count=10,
                    only_save_result=False,
                    fast=True
                    )

## Correcting from GEM and Mask

In this approach, specify the path of the GEM with `gem_path`, the path of the mask with `mask_path`, and the output directory for the corrected result with `out_dir`.

In the output directory, the file named `*.bgef` is generated from the mask.

In [None]:
from stereo.tools.cell_correct import cell_correct

gem_path = "SS200000135TL_D1.cellbin.gem"
mask_path = "SS200000135TL_D1_mask.tif"
out_dir = "cell_correct_result"

data = cell_correct(out_dir=out_dir,
                    gem_path=gem_path,
                    mask_path=mask_path,
                    process_count=10,
                    only_save_result=False,
                    fast=True
                    )

As with BGEF and an ssDNA image, you can also correct cells from GEM and an ssDNA image.

In [None]:
from stereo.tools.cell_correct import cell_correct

out_dir = './cell_correct_result'
gem_path = './SS200000135TL_D1.cellbin.gem'
image_path = './SS200000135TL_D1_regist.tif'
model_path = './seg_model_20211210.pth'
model_type = 'deep-learning'
# model_path = './cell_segmentation_deepcell'
# model_type = 'deep-cell'

data = cell_correct(out_dir=out_dir,
                    gem_path=gem_path,
                    image_path=image_path,
                    model_path=model_path,
                    model_type=model_type,
                    gpu=-1,
                    process_count=10,
                    only_save_result=False,
                    fast=True
                    )
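
The four calls shown so far differ only in which inputs they pass: BGEF or GEM for the expression data, and either a ready-made mask or an ssDNA image plus a segmentation model. The sketch below collects those combinations in one place. `build_cell_correct_kwargs` is a hypothetical helper, not part of Stereopy; it only assembles and checks keyword arguments using the parameter names from the examples above, and its validation rules are assumptions drawn from those examples rather than from the library itself.

```python
def build_cell_correct_kwargs(out_dir, bgef_path=None, gem_path=None,
                              mask_path=None, image_path=None,
                              model_path=None, model_type=None, **options):
    """Assemble keyword arguments for one of the four input combinations."""
    # Expression input: exactly one of BGEF or GEM (assumption based on the examples).
    if (bgef_path is None) == (gem_path is None):
        raise ValueError("pass exactly one of bgef_path or gem_path")
    # Cell boundaries: either an existing mask or an ssDNA image to segment.
    if (mask_path is None) == (image_path is None):
        raise ValueError("pass exactly one of mask_path or image_path")

    kwargs = {"out_dir": out_dir}
    if bgef_path is not None:
        kwargs["bgef_path"] = bgef_path
    else:
        kwargs["gem_path"] = gem_path

    if mask_path is not None:
        kwargs["mask_path"] = mask_path
    else:
        if model_path is None or model_type not in ("deep-learning", "deep-cell"):
            raise ValueError(
                "image_path needs model_path and a model_type of "
                "'deep-learning' or 'deep-cell'"
            )
        kwargs.update(image_path=image_path, model_path=model_path,
                      model_type=model_type)

    # Remaining options (process_count, only_save_result, fast, gpu) pass through unchanged.
    kwargs.update(options)
    return kwargs


kwargs = build_cell_correct_kwargs(
    out_dir="cell_correct_result",
    gem_path="SS200000135TL_D1.cellbin.gem",
    mask_path="SS200000135TL_D1_mask.tif",
    fast=True,
)
# With Stereopy installed and the input files in place:
# data = cell_correct(**kwargs)
```

Checking the combination before the call makes a mistyped or missing argument fail immediately instead of partway through a long run.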

## Running on Jupyter Notebook

Jupyter Notebook does not support multiprocessing directly, so we recommend the following two steps to improve performance.

First, write the source code to a Python file with the `%%writefile` command.

In [None]:
%%writefile temp.py
from stereo.tools.cell_correct import cell_correct

bgef_path = "SS200000135TL_D1.raw.gef"
mask_path = "SS200000135TL_D1_regist.tif"
out_dir = "cell_correct_result"

data = cell_correct(out_dir=out_dir,
                    bgef_path=bgef_path,
                    mask_path=mask_path,
                    process_count=10,
                    only_save_result=False,
                    fast=True
                    )

Second, run the file with the `%run` command.

In [None]:
%run temp.py
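
Outside IPython, or when the script output should be captured programmatically, the same two steps can be reproduced with the standard library. This is a sketch, not an approach prescribed by Stereopy: `run_as_script` is a hypothetical helper, and the `if __name__ == "__main__":` guard in the script is an added precaution that multiprocessing code commonly needs on platforms that start worker processes with spawn. The script body is a stand-in that only uses `multiprocessing`, so it runs without Stereopy; in practice `main()` would contain the `cell_correct` call from `temp.py`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in script: replace the body of main() with the cell_correct call.
SCRIPT = '''
from multiprocessing import Pool


def square(x):
    return x * x


def main():
    with Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))


if __name__ == "__main__":
    main()
'''


def run_as_script(source, filename="temp.py"):
    """Write source to a temporary file and run it in a fresh Python interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        script_path = Path(tmp) / filename
        script_path.write_text(source)
        result = subprocess.run(
            [sys.executable, str(script_path)],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if the script fails
        )
    return result.stdout
```

`run_as_script(SCRIPT)` returns whatever the script printed. The child process inherits the current working directory, so relative input and output paths in the script resolve the same way they would with `%run`.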

## Performance

Take a GEF containing 55460 cells and 25546 genes as an example.

#### Machine configuration

|physical cores |logical cores |memory   |
| --- | --- | --- |
|12             |48          |250G     |

#### Comparison of performance

`fast=False`

|process  |memory(max) |cpu    |time   |
| --- | --- | --- | --- |
|10       |140G        |2330%  |2h13m  |

`fast=True` (only supports a single process)

|process  |memory(max) |cpu    |time   |
| --- | --- | --- | --- |
|1        |49G         |99%    |40m    |
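
The two tables can be compared directly with a little arithmetic. The snippet below only restates the figures above; it assumes `G` means gigabytes, and the derived "core-minutes" value is a rough estimate that assumes the reported CPU percentage held for the whole run.

```python
# Figures copied from the two tables above.
slow = {"minutes": 2 * 60 + 13, "memory_gb": 140, "cpu_percent": 2330}  # fast=False
fast = {"minutes": 40, "memory_gb": 49, "cpu_percent": 99}              # fast=True

speedup = slow["minutes"] / fast["minutes"]            # wall-clock ratio
memory_ratio = slow["memory_gb"] / fast["memory_gb"]   # peak-memory ratio


def core_minutes(run):
    """Approximate CPU work: utilisation (100% = one core) times wall time."""
    return run["cpu_percent"] / 100 * run["minutes"]


print(f"fast=True is about {speedup:.1f}x faster in wall time")
print(f"fast=False peaks at about {memory_ratio:.1f}x the memory")
print(f"core-minutes: {core_minutes(slow):.0f} vs {core_minutes(fast):.0f}")
```

On these figures, the fast version finishes in roughly 30% of the wall time and uses roughly 35% of the peak memory; the accuracy trade-off described in the note at the top still applies.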