# **CartoCell - inference workflow (Phase 5)**
___  
  
**CartoCell** is a deep learning-based image processing pipeline for the high-throughput segmentation of whole epithelial cysts acquired at low resolution with minimal human intervention. The official documentation of the workflow is in [CartoCell tutorial](https://biapy.readthedocs.io/en/latest/tutorials/cartocell.html). 

<figure>
<center>
<img src='https://biapy.readthedocs.io/en/latest/_images/cartocell_pipeline.png' width='800px'/>
<figcaption><b>Figure 1</b>: CartoCell processing phases (from Andrés-San Román et al., 2022).</figcaption></center>
</figure>


**This notebook replicates CartoCell's Phase 5**, i.e., it allows the segmentation of 3D epithelial cysts using a deep learning model trained on a large dataset of low-resolution cysts (see Figure 1, Phase 4, model M2).

___


**CartoCell** relies on the [BiaPy library](https://github.com/BiaPyX/BiaPy), freely available on GitHub: https://github.com/BiaPyX/BiaPy

Please note that **CartoCell** is based on a publication. If you use it successfully in your research, please cite our work:
 
*''CartoCell, a high-throughput pipeline for accurate 3D image analysis, unveils cell morphology patterns in epithelial cysts''. Jesús A. Andrés-San Román, Carmen Gordillo-Vázquez, Daniel Franco-Barranco, Laura Morato, Antonio Tagua, Pablo Vicente-Munuera, Ana M. Palacios, María P. Gavilán, Valentina Annese, Pedro Gómez-Gálvez, Ignacio Arganda-Carreras, Luis M. Escudero. [under revision]*


___


## **Expected inputs and outputs**
___
**Inputs**

This notebook expects two folders as input:
* An **input folder with 3D TIFF images** to be processed. A cyst per image is expected. For optimal performance, the voxel resolution should match that of the training images: 1.62 x 1.62 x 0.5 micron/voxel.
* An **output folder to store the segmentation results**.
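
If your images were acquired at a different voxel size, it can help to know how far they are from the training resolution. The snippet below is a minimal sketch, not part of CartoCell: `zoom_factors` is a hypothetical helper, and it assumes both resolutions are given in the same axis order as the value quoted above.

```python
def zoom_factors(source_res, target_res=(1.62, 1.62, 0.5)):
    """Return per-axis scale factors that resample `source_res` to `target_res`.

    Both tuples are in micron/voxel and must list the axes in the same order.
    A factor above 1 means that axis needs more voxels (upsampling),
    a factor below 1 means it needs fewer (downsampling).
    """
    if len(source_res) != len(target_res):
        raise ValueError("Resolutions must have the same number of axes")
    if any(r <= 0 for r in tuple(source_res) + tuple(target_res)):
        raise ValueError("Resolutions must be positive")
    return tuple(s / t for s, t in zip(source_res, target_res))


# Example: a hypothetical image acquired at 0.81 x 0.81 x 1.0 micron/voxel
factors = zoom_factors((0.81, 0.81, 1.0))
print(factors)
```

The resulting factors could then be passed to a resampling routine such as `scipy.ndimage.zoom`, after reordering them to match the axis order of the loaded array (TIFF stacks are often read as z, y, x).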

**Outputs**

If the execution is successful, two folders will be created, each containing one result per input image:
* A **TIFF image** with the cell instances before 3D Voronoi post-processing.
* A **TIFF image** with the cell instances after 3D Voronoi post-processing.



<figure>
<center>
<img src='https://biapy.readthedocs.io/en/latest/_images/cyst_sample.gif' width='300'/>
<img src='https://biapy.readthedocs.io/en/latest/_images/cyst_instance_prediction.gif' width='300'/>
<figcaption><b>Figure 2</b>: Example of input and output images. From left to right: 3D TIFF input image and the resulting TIFF image with the cell instances after Voronoi post-processing.</figcaption></center>
</figure>


<font color='red'><b>Note</b></font>: for testing purposes, you can also run this notebook with the sample images provided in *Manage file(s) source > Option 3*.



## **Prepare the environment**
___

Establish connection with Google services. You **must be logged in to Google** to continue.
Since this is not Google's own code, you will probably see a message warning you of the dangers of running unfamiliar code. This is completely normal.


## **Manage file(s) source**
---
The input folder can be provided using three different options: by directly uploading the folder (option 1), by using a folder stored in your own Google Drive (option 2) or by automatically downloading a few samples of our data (option 3).

Depending on the option chosen, different steps will have to be taken, as explained in the following cells.


### **Option 1: use your local files and upload them to the notebook**
---
You will be prompted to upload your files to Colab and they will be stored under `/content/input/`.

In [None]:
#@markdown ##Play the cell to upload local files
from google.colab import files
!mkdir -p /content/input 
%cd /content/input
uploaded = files.upload()
%cd /content

### **Option 2: mount your Google Drive**
---
To use this notebook on your own data from Google Drive, you need to mount Google Drive first.

Play the cell below to mount your Google Drive and follow the link that will be shown. In the new browser window, select your drive, click 'Allow', copy the code, paste it into the cell and press Enter. This will give Colab access to the data on the drive.

Once this is done, your data will be available in the **Files** tab on the top left of the notebook.

In [None]:
#@markdown ##Play the cell to connect your Google Drive to Colab

#@markdown * Click on the URL. 

#@markdown * Sign in to your Google Account. 

#@markdown * Copy the authorization code. 

#@markdown * Enter the authorization code. 

#@markdown * Click on the "Files" tab on the left and refresh it. Your Google Drive folder should now be available there as "gdrive". 

# mount user's Google Drive to Google Colab.
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


### **Option 3: download a few of our samples**
---
If you do not have data at hand but would like to test the notebook, no worries! You can run the following cell to download three of our cyst samples and continue with the rest of the notebook.
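
Outside Colab, where the `unzip` shell command may not be available, the extraction step can also be done with Python's standard library. The snippet below is only a sketch under that assumption: `extract_archive` is a hypothetical helper, and the demo archive it builds is a placeholder, not the real sample data.

```python
import os
import tempfile
import zipfile


def extract_archive(zip_path, dest_dir):
    """Extract every member of `zip_path` into `dest_dir` and return their names."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


# Demo on a throwaway archive containing one placeholder file
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("sample_1.tif", b"placeholder")

    extracted = extract_archive(demo_zip, os.path.join(tmp, "input"))
    extracted_exists = os.path.exists(os.path.join(tmp, "input", "sample_1.tif"))

print(extracted)
```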

In [None]:
#@markdown ##Play to download our data samples
import os 

# path where gdown stores the downloaded archive
fname = "/content/test_x.zip"

!mkdir -p /content/input 

%cd /content 
if not os.path.exists(fname):
    !pip install --upgrade --no-cache-dir gdown &> /dev/null
    !gdown --id 1KKNBqIZ7NiWwxlRM8ctdgkx83JRNyBjq &> /dev/null
    %cd /content/input
    !unzip {fname} &> /dev/null
    !rm {fname}
%cd /content

print( 'Input images successfully downloaded and stored under /content/input/' )

/content
/content/input
/content
Input images successfully downloaded and stored under /content/input/



## **Check for GPU access**
---

By default, the session should be using Python 3 and GPU acceleration, but it is possible to ensure that these are set properly by doing the following:

Go to **Runtime -> Change the Runtime type**

**Runtime type: Python 3** *(Python 3 is the programming language in which this program is written)*

**Accelerator: GPU** *(Graphics processing unit)*
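
To confirm from code that a GPU is actually attached to the session, one option is to query the NVIDIA driver tool. This is a minimal sketch that assumes an NVIDIA GPU and that `nvidia-smi` is on the `PATH` when a GPU is present (as is usual on Colab GPU runtimes); `gpu_visible` is a hypothetical helper name.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU available:", gpu_visible())
```

If this prints `False`, change the runtime type as described above before running the inference cell.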

## **Paths to load input images and save output files**
___

If option 1 (uploading your own folder) or option 3 (downloading our prepared data samples) was chosen, define data_path as '/content/input' and output_path as '/content/output'. Please make sure you download the results from the '/content/output' folder later!

If option 2 was chosen, enter here the paths to your input files and to the folder where you want to store the results, e.g. '/content/gdrive/MyDrive/...'.

If you have trouble finding the path to your folders, use the small folder icon at the top left of this notebook. Browse until you find the folder, then right-click it and select "Copy path".
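
Before launching the workflow, it can be worth checking that the input folder really contains TIFF files. The snippet below is a sketch of such a check, not part of the notebook: `list_tiff_files` is a hypothetical helper, and filtering by the `.tif`/`.tiff` extensions is an assumption (the notebook itself takes every file in the folder).

```python
import os
import tempfile

TIFF_EXTENSIONS = (".tif", ".tiff")


def list_tiff_files(folder):
    """Return the sorted TIFF file names found directly inside `folder`."""
    if not os.path.isdir(folder):
        raise FileNotFoundError("Input folder not found: {}".format(folder))
    names = sorted(
        f for f in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, f))
        and f.lower().endswith(TIFF_EXTENSIONS)
    )
    if not names:
        raise ValueError("No TIFF images found in {}".format(folder))
    return names


# Demo on a temporary folder with two empty placeholder images and one other file
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("cyst_b.tif", "cyst_a.TIFF", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    found = list_tiff_files(demo_dir)

print(found)
```

In the notebook, the same call would be `list_tiff_files(data_path)` once `data_path` has been set in the cell that follows.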

In [None]:
#@markdown #####Path to images
data_path = '/content/input' #@param {type:"string"}

#@markdown #####Path to store the resulting images (it'll be created if not existing):
output_path = '/content/output' #@param {type:"string"}

## **Install BiaPy library**


In [None]:
#@markdown ##Play to install BiaPy and its dependencies

import os
import sys
import numpy as np
from tqdm.notebook import tqdm
from skimage.io import imread
from skimage.exposure import match_histograms                                                                           

# Clone the repo
%cd /content/ 
if not os.path.exists('BiaPy'):
    !git clone https://github.com/BiaPyX/BiaPy.git
    %cd /content/BiaPy
    !git checkout 2bfa7508c36694e0977fdf2c828e3b424011e4b1
    %cd /content/
    !pip install --upgrade --no-cache-dir gdown &> /dev/null
    sys.path.insert(0, 'BiaPy')
    %cd /content/BiaPy
    
    # Install dependencies 
    !pip install git+https://github.com/aleju/imgaug.git &> /dev/null
    !pip install numpy_indexed yacs fill_voids edt &> /dev/null

/content
Cloning into 'BiaPy'...
remote: Enumerating objects: 15895, done.
remote: Counting objects: 100% (1964/1964), done.
remote: Compressing objects: 100% (660/660), done.
remote: Total 15895 (delta 1381), reused 1862 (delta 1284), pack-reused 13931
Receiving objects: 100% (15895/15895), 824.84 MiB | 16.32 MiB/s, done.
Resolving deltas: 100% (8650/8650), done.
/content/BiaPy


## **Download pretrained model and apply 3D segmentation workflow (Phase 5)**


In [None]:
#@markdown ##Play to download pretrained model M2 (from Phase 4)
import errno

job_name = "cartocell_inference"

# remove the template file if it exists (absolute path, so the check does not depend on the current directory)
template_file = '/content/{}.yaml'.format(job_name)
if os.path.exists( template_file ):
    os.remove( template_file )

# Download .yaml file and model weights 
%cd /content/
if not os.path.exists("cartocell_inference.yaml"):
    !wget https://raw.githubusercontent.com/BiaPyX/BiaPy/master/templates/instance_segmentation/CartoCell_paper/cartocell_inference.yaml &> /dev/null
    print("\nCartocell yaml configuration file downloaded!")

if not os.path.exists("model_weights_cartocell.h5"):
    # Download the model weights from Google Drive
    !gdown --id 1rnTls60MrndQHyyLgwwKr5UNZbjytnCF &> /dev/null
    print( '\nM2 model weights successfully downloaded!')

# Check folders before modifying the .yaml file
if not os.path.exists(data_path):
    raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), data_path)
ids = sorted(next(os.walk(data_path))[2])
if len(ids) == 0:
    raise ValueError("No images found in dir {}".format(data_path))


# open template configuration file
import yaml
with open( template_file, 'r') as stream:
    try:
        biapy_config = yaml.safe_load(stream)
    except yaml.YAMLError as exc:
        print(exc)
        raise  # the configuration cannot be updated if it failed to parse

# update paths to data
biapy_config['DATA']['TEST']['PATH'] = data_path
biapy_config['DATA']['TEST']['LOAD_GT'] = False
biapy_config['PATHS']['CHECKPOINT_FILE'] = "/content/model_weights_cartocell.h5"

# save file
with open( template_file, 'w') as outfile:
    yaml.dump(biapy_config, outfile, default_flow_style=False)

print( "Inference configuration finished.")

/content

Cartocell yaml configuration file downloaded!

M2 model weights successfully downloaded!
Inference configuration finished.


In [None]:
#@markdown ##Play to pass images through the model
import os
import errno

# Run the code 
%cd '/content/BiaPy'
!python -u main.py --config '/content/'{job_name}'.yaml' --result_dir {output_path} --name {job_name} --run_id 1 --gpu 0



/content/BiaPy
2023-02-11 23:14:06.961340: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-11 23:14:07.934335: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-02-11 23:14:07.934475: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
Date: 2023-02-11 23:14:12
Arguments: Namespace(config='/content/cartocell_infer

## **Visualize 3D instance segmentation results**


In [None]:
#@markdown ###Play to visualize results in 3D

%matplotlib inline
import matplotlib
import numpy as np
from numpy.random import randint, seed
from matplotlib import pyplot as plt
from ipywidgets import interact, fixed
import ipywidgets as widgets
from google.colab import output
output.enable_custom_widget_manager()

final_results = os.path.join(output_path, job_name, 'results', job_name+"_1")
instance_results = os.path.join(final_results, "per_image_instances")
instance_post_results = os.path.join(final_results, "per_image_post_processing")

# Show a few examples to check that they have been stored correctly 
ids_input = sorted(next(os.walk(data_path))[2])
ids_pred = sorted(next(os.walk(instance_results))[2])
ids_pred_pos = sorted(next(os.walk(instance_post_results))[2])

# fix the random seed first so that the colors and the chosen samples are reproducible
seed(1)

# create random color map
vals = np.linspace(0,1,256)
np.random.shuffle(vals)
cmap = plt.cm.colors.ListedColormap(plt.cm.gist_rainbow(vals))
cmap.colors[0] = [0., 0., 0., 1.] # set background to black

samples_to_show = min(len(ids_input), 3)
chosen_images = np.random.choice(len(ids_input), samples_to_show, replace=False)

test_samples = []
test_sample_preds = []
test_sample_preds_post = []

# read 3D images again
for i in range(len(chosen_images)):
    aux = imread(os.path.join(data_path, ids_input[chosen_images[i]]))
    test_samples.append(aux)
    
    aux = imread(os.path.join(instance_results, ids_pred[chosen_images[i]])).astype(np.uint16)
    test_sample_preds.append(aux)
    
    aux = imread(os.path.join(instance_post_results, ids_pred_pos[chosen_images[i]])).astype(np.uint16)
    test_sample_preds_post.append(aux)

# function to show results in 3D within a widget
def scroll_in_z(z, j):

    plt.figure(figsize=(25,5))
    # Source
    plt.subplot(1,3,1)
    plt.axis('off')
    plt.imshow(test_samples[j][z-1], cmap='gray')
    plt.title('Source (z = ' + str(z) + ')', fontsize=15)

    # Prediction
    plt.subplot(1,3,2)
    plt.axis('off')
    plt.imshow(test_sample_preds[j][z-1], cmap=cmap, interpolation='nearest')
    plt.title('3D segmentation (z = ' + str(z) + ')', fontsize=15)
    
    # Voronoi
    plt.subplot(1,3,3)
    plt.axis('off')
    plt.imshow(test_sample_preds_post[j][z-1], cmap=cmap, interpolation='nearest')
    plt.title('3D segmentation with Voronoi (z = ' + str(z) + ')', fontsize=15)

for j in range(samples_to_show):
    interact(scroll_in_z, z=widgets.IntSlider(min=1, max=test_samples[j].shape[0], step=1, value=test_samples[j].shape[0]//2), j=fixed(j))

interactive(children=(IntSlider(value=66, description='z', max=133, min=1), Output()), _dom_classes=('widget-i…

interactive(children=(IntSlider(value=66, description='z', max=133, min=1), Output()), _dom_classes=('widget-i…

interactive(children=(IntSlider(value=63, description='z', max=126, min=1), Output()), _dom_classes=('widget-i…

In [None]:
#@markdown ###Play to display the path to the output files.

final_results = os.path.join(output_path, job_name, 'results', job_name+"_1")

# same folders that the visualization cell reads from
instance_results = os.path.join(final_results, "per_image_instances")
voronoi_results = os.path.join(final_results, "per_image_post_processing")

print("Output paths:")
print("    Instance segmentation files before Voronoi post-processing are in {}".format(instance_results))
print("    Instance segmentation files after Voronoi post-processing are in {}".format(voronoi_results))

Output paths:
    Instance segmentation files before Voronoi post-processing are in /content/output/cartocell_inference/results/cartocell_inference_1/per_image_instances
    Instance segmentation files after Voronoi post-processing are in /content/output/cartocell_inference/results/cartocell_inference_1/per_image_post_processing
