## CellProfiler-OMERO demo
We have set up an example notebook to guide you through the steps to analyse images stored in OMERO using CellProfiler.

Let's start by importing the libraries we are going to need.

In [1]:
import warnings
warnings.filterwarnings('ignore')

import ezomero

import cellprofiler_core.preferences as cp_preferences
import cellprofiler_core.pipeline as cp_pipeline
import cellprofiler_core.measurement as cp_measurement
from cellprofiler_core.modules.injectimage import InjectImage

import pandas as pd
import tempfile



In [2]:
# Make CellProfiler run without a GUI
cp_preferences.set_headless()

# Tell CellProfiler to get input from and save output in a temp directory
output_dir = tempfile.TemporaryDirectory()
input_dir = tempfile.TemporaryDirectory()
cp_preferences.set_default_output_directory(output_dir.name)
cp_preferences.set_default_image_directory(input_dir.name)
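
Keep in mind that a `tempfile.TemporaryDirectory` deletes the directory and everything in it once `cleanup()` is called, so anything downloaded into `input_dir` or written to `output_dir` must be copied elsewhere first if it is to be kept. The following is a minimal, standard-library-only sketch of that lifecycle; it does not involve CellProfiler, and the file name is made up for illustration.

```python
import os
import tempfile

# Create a temporary directory the same way the notebook does.
work_dir = tempfile.TemporaryDirectory()

# Write a placeholder file into it (hypothetical name, for illustration only).
example_path = os.path.join(work_dir.name, "example.cppipe")
with open(example_path, "w") as handle:
    handle.write("placeholder")

exists_before = os.path.exists(example_path)

# cleanup() deletes the directory together with everything inside it.
work_dir.cleanup()
exists_after = os.path.exists(work_dir.name)

print(exists_before, exists_after)
```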

Let's connect to OMERO. When we connect we get a connection object that we will have to use in every interaction with OMERO.

In [3]:
# Creating a connection object
host = "omero.mri.cnrs.fr"
port = 4064
conn = ezomero.connect(host=host, port=port)

# Connecting
conn.connect()
# The connection will time out after a period of inactivity. To avoid that, we ask it to send a keep-alive signal every 60 seconds
conn.c.enableKeepAlive(60)
# Let's verify that we are connected
conn.isConnected()

True

Time to grab a Dataset from OMERO and download the CellProfiler pipeline that is attached to it. Go to the browser, select a dataset and copy its ID.

In [4]:
dataset_id = int(input("Dataset id: "))
dataset = conn.getObject("Dataset", dataset_id)

file_ann_ids = ezomero.get_file_annotation_ids(conn, "Dataset", dataset_id)
for file_ann_id in file_ann_ids:
    if conn.getObject("FileAnnotation", file_ann_id).getFile().getName().endswith(".cppipe"):
        cp_pipeline_path = ezomero.get_file_annotation(conn, file_ann_id, input_dir.name)
        print(f"Downloaded {cp_pipeline_path}")
        break

Downloaded /tmp/tmp691gubaa/Megane_SpotInNuclei-encours_test_noOUT.cppipe
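
If no attachment ends in `.cppipe`, the loop above finishes without ever assigning `cp_pipeline_path`, and the next cell then fails with a `NameError`. One way to make that case explicit is to do the selection in a small helper that returns `None` when nothing matches. The sketch below works on plain file names only; `first_pipeline_name` and the attachment names are hypothetical and not part of ezomero.

```python
from typing import Iterable, Optional


def first_pipeline_name(file_names: Iterable[str]) -> Optional[str]:
    """Return the first name ending in .cppipe, or None if there is none."""
    return next((name for name in file_names if name.endswith(".cppipe")), None)


# Hypothetical attachment names, standing in for the names returned by OMERO.
attached = ["notes.txt", "analysis.cppipe", "older.cppipe"]
selected = first_pipeline_name(attached)
if selected is None:
    raise ValueError("No .cppipe file is attached to this dataset")
print(selected)
```

As in the notebook, only the first match is used when several pipelines are attached.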


We create a new pipeline from that file and remove its first 4 modules. These modules are in charge of preparing the image data when it is loaded from disk. We don't need them here because the images come from OMERO.

In [5]:
pipeline = cp_pipeline.Pipeline()
pipeline.load(cp_pipeline_path)

for i in range(4):
    print('Remove module: ', pipeline.modules()[0].module_name)
    pipeline.remove_module(1)

# TODO: Enable modules
print('Pipeline modules:')
for module in pipeline.modules(False):
    print(module.module_num, module.module_name)

Remove module:  Images
Remove module:  Metadata
Remove module:  NamesAndTypes
Remove module:  Groups
Pipeline modules:
1 IdentifyPrimaryObjects
2 IdentifyPrimaryObjects
3 RelateObjects
4 MaskObjects
5 MeasureObjectIntensity
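
The loop calls `remove_module(1)` four times rather than removing modules 1 to 4. Judging from the output above, module numbers start at 1 and the remaining modules are renumbered after each removal, so position 1 always holds the next module to drop. The toy model below illustrates this with a plain Python list. It is only a sketch of the assumed numbering convention, not CellProfiler's implementation, and the `remove_module` function here is a stand-in.

```python
# Toy stand-in for the module list of the pipeline shown above.
modules = ["Images", "Metadata", "NamesAndTypes", "Groups",
           "IdentifyPrimaryObjects", "IdentifyPrimaryObjects",
           "RelateObjects", "MaskObjects", "MeasureObjectIntensity"]


def remove_module(module_list, module_num):
    """Remove the module at 1-based position module_num (assumed convention)."""
    del module_list[module_num - 1]


removed = []
for _ in range(4):
    removed.append(modules[0])   # the module currently numbered 1
    remove_module(modules, 1)    # later modules shift down by one

print(removed)
print(list(enumerate(modules, start=1)))
```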


We can now start feeding images into the pipeline

In [14]:
# Let's create some dataframes to store data on the experiment, the images and the different objects measured by CellProfiler
measurement_dfs = {}
for column in pipeline.get_measurement_columns():
    if column[0] not in measurement_dfs.keys():
        measurement_dfs[column[0]] = pd.DataFrame()

# Let's collect all images in the dataset and feed them one at a time into the pipeline.
for image_id in ezomero.get_image_ids(conn=conn, dataset=dataset_id, across_groups=False):
    image, image_pixels = ezomero.get_image(conn, image_id)

    pipeline_copy = pipeline.copy()

    for c in range(image.getSizeC()):
        inject_image_module = InjectImage(f"ch{c}", image_pixels[...,c].squeeze())
        inject_image_module.set_module_num(1)
        pipeline_copy.add_module(inject_image_module)

    measurements = pipeline_copy.run()

    for object_name in measurement_dfs:
        # Experiment- and Image-level measurements are skipped; only per-object tables are collected
        if object_name in ("Experiment", "Image"):
            continue
        data = {f: measurements.get_measurement(object_name, f) for f in measurements.get_feature_names(object_name)}
        object_df = pd.DataFrame.from_dict(data)
        measurement_dfs[object_name] = pd.concat([measurement_dfs[object_name], object_df], ignore_index=True)




In [15]:
for k, v in measurement_dfs.items():
    print(k)
    # v.describe()
    print(v.head())



Experiment
Empty DataFrame
Columns: []
Index: []
Image
Empty DataFrame
Columns: []
Index: []
Nuclei
   Children_Spot_Count  Location_Center_X  Location_Center_Y  \
0                    8        1708.305485         129.838988   
1                   12        1323.383832         161.267437   
2                    8        1572.371668         172.805566   
3                   86         613.909757         205.582901   
4                   13        1653.183031         304.096250   

   Location_Center_Z  Number_Object_Number  
0                  0                     1  
1                  0                     2  
2                  0                     3  
3                  0                     4  
4                  0                     5  
Spot
   Children_SpotInNuclei_Count  Location_Center_X  Location_Center_Y  \
0                            0        1631.166667           3.333333   
1                            0         398.500000           4.500000   
2                       
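
Because the per-object tables are concatenated with `ignore_index=True`, the resulting rows no longer say which image they came from. One possible remedy is to tag every row with its image ID before accumulating. The sketch below shows the idea with plain dictionaries and made-up values; the helper `rows_for_image`, the `ImageID` column name and the IDs 101 and 102 are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def rows_for_image(image_id: int, columns: Dict[str, Sequence]) -> List[dict]:
    """Turn per-feature columns into row dicts tagged with the source image ID."""
    feature_names = list(columns)
    n_objects = len(columns[feature_names[0]]) if feature_names else 0
    return [
        {"ImageID": image_id, **{name: columns[name][i] for name in feature_names}}
        for i in range(n_objects)
    ]


# Made-up measurements for two images (illustrative values only).
all_rows = []
all_rows += rows_for_image(101, {"Number_Object_Number": [1, 2], "Children_Spot_Count": [8, 12]})
all_rows += rows_for_image(102, {"Number_Object_Number": [1], "Children_Spot_Count": [5]})
print(all_rows)
```

A list of row dictionaries like `all_rows` can be passed to `pd.DataFrame` once at the end, which also avoids calling `pd.concat` on every iteration.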

In [48]:
# remove the temporary input and output directories
output_dir.cleanup()
input_dir.cleanup()

# and close the connection to the OMERO server
conn.close()
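
If an earlier cell raises an error, this cleanup cell may never run, leaving the temporary directories on disk and the session open. Outside a notebook, one option is to let `contextlib.ExitStack` handle the cleanup so that it happens even on failure. The sketch below uses only the standard library; `FakeConnection` is a hypothetical stand-in for the OMERO connection and models nothing but `close()`.

```python
import contextlib
import os
import tempfile


class FakeConnection:
    """Hypothetical stand-in for the OMERO connection; only close() is modelled."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake_conn = FakeConnection()
try:
    with contextlib.ExitStack() as stack:
        # closing() calls fake_conn.close() when the block exits, even on error.
        stack.enter_context(contextlib.closing(fake_conn))
        # TemporaryDirectory used as a context manager yields the directory path.
        in_dir = stack.enter_context(tempfile.TemporaryDirectory())
        out_dir = stack.enter_context(tempfile.TemporaryDirectory())
        raise RuntimeError("simulated failure in the middle of the analysis")
except RuntimeError:
    pass

print(fake_conn.closed, os.path.exists(in_dir), os.path.exists(out_dir))
```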