# Part 2:  Segmentation

# Setup

### Our usual imports and initializing napari

In [1]:
import numpy as np
import pandas as pd
import napari
import tifffile
import skimage as ski
import scipy.ndimage as ndi
import glob
import plotly.express as px
import cellpose.models as models
import matplotlib.pyplot as plt
import cv2
import dask

In [2]:
viewer = napari.Viewer()

### Support functions for this notebook

Skimage's rolling ball background subtraction (ski.restoration.rolling_ball) is slow and does not work as well as ImageJ's. This function, built on an OpenCV top-hat filter, more closely matches ImageJ's rolling ball background subtraction.

In [3]:
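def backsub_2D(inp, radius=60):
    # Elliptical structuring element, `radius` pixels across in each direction.
    filterSize = (radius, radius)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, filterSize)
    # Light blur so single noisy pixels do not drive the background estimate.
    blurred = cv2.GaussianBlur(inp, (5, 5), 0)
    # Top-hat = blurred - opening(blurred): keeps features smaller than the kernel.
    tophat_img = cv2.morphologyEx(blurred, cv2.MORPH_TOPHAT, kernel)
    # (blurred - tophat_img) is the estimated background; subtract it from the raw image.
    rtn = inp.astype(np.single) - (blurred - tophat_img)
    # Clip negative values to zero.
    rtn = np.clip(rtn, 0, np.inf)
    return rtn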
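
The principle behind this function is a morphological top-hat: opening the image with a structuring element larger than the objects of interest leaves only the slowly varying background, which is then subtracted. As a rough sketch of that idea (not a replacement for backsub_2D), the example below applies scipy.ndimage.white_tophat to a synthetic image. The image, the spot size, and the 15-pixel window are made-up values chosen for illustration.

```python
import numpy as np
import scipy.ndimage as ndi

# Synthetic image (illustrative values): a horizontal intensity ramp as
# background plus one small Gaussian spot in the middle.
yy, xx = np.mgrid[:64, :64]
toy_background = 100 + 0.5 * xx
toy_spot = 50 * np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 2.0 ** 2))
toy_img = toy_background + toy_spot

# White top-hat = image - opening(image). The 15x15 window is wider than
# the spot, so the opening removes the spot and keeps the ramp.
toy_tophat = ndi.white_tophat(toy_img, size=15)

print("raw background pixel:     ", toy_img[5, 32])
print("top-hat background pixel: ", toy_tophat[5, 32])
print("top-hat spot center:      ", toy_tophat[32, 32])
```

The background pixel drops to essentially zero while the spot keeps most of its height, which is the behaviour we want from a background subtraction.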

In [4]:
def backsub_3D(inp, radius=60):
    # Background-subtract each z-slice independently, in parallel via dask.
    process = [dask.delayed(backsub_2D)(inp[i], radius) for i in range(inp.shape[0])]
    rtn = np.stack(dask.compute(*process))
    return rtn

### Loading the image for this notebook

In [5]:
img = tifffile.imread('files/C-hela-cells.tif')
img.shape

(512, 672, 3)

And visualize in napari with the appropriate names and colors

In [6]:
viewer.layers.clear()
viewer.add_image(img, name=['lysosomes', 'mitocondria', 'nucleii'], colormap=['red', 'green', 'blue'], channel_axis=2)

[<Image layer 'lysosomes' at 0x15f076c2410>,
 <Image layer 'mitocondria' at 0x15f076c1db0>,
 <Image layer 'nucleii' at 0x15f07753cd0>]

# Pre-processing

### Standard pipeline:  subtract background, gaussian blur, threshold

To make things easier on ourselves, we split the 3 channels into 3 separate variables. The structures in the 3 channels look very different: the lysosomes are small puncta, the mitochondria form large networks with holes in them, and the nuclei are very large blobs. We would not want to use the same rolling ball radius for all 3 channels.

In [8]:
lyso = img[:,:,0]
mitos = img[:,:,1]
nucleii = img[:,:,2]

In [10]:
lyso_backsub = backsub_2D(lyso, radius=20)
mito_backsub = backsub_2D(mitos, radius=20)
nucleii_backsub = backsub_2D(nucleii, radius=200)

viewer.add_image(lyso_backsub, name='lysosomes_backsubbed', colormap='red', blending='additive')
viewer.add_image(mito_backsub, name='mito_backsubbed', colormap='green', blending='additive')
viewer.add_image(nucleii_backsub, name='nucleii_backsubbed', colormap='blue', blending='additive')


<Image layer 'nucleii_backsubbed' at 0x15f795be560>

For segmentation we will primarily use the nuclei. We'll apply some blurring first to avoid holes and small puncta.

In [11]:
blurred = ndi.gaussian_filter(nucleii_backsub, 10)
viewer.add_image(blurred, name='blurred', colormap='gray', blending='additive')

<Image layer 'blurred' at 0x15f79b66e90>

# Simple Segmentation

## Thresholding

Mousing over the image (make sure you have the "blurred" layer selected), we can see that the nuclei have pixel intensities > 300, so we will use that as our threshold and visualize the binary image.

In [12]:
thresholded = blurred > 300
viewer.add_image(thresholded, name='thresholded', colormap='gray', blending='additive')

<Image layer 'thresholded' at 0x15f79e730d0>

With the "thresholded" layer selected, try mousing over the pixels. When NumPy evaluates a comparison such as A > B on an array, it returns a boolean array of "True" and "False" values. Napari is smart enough to display these as 1 and 0.
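
As a minimal illustration of how a comparison produces a boolean mask, here is a tiny made-up array (the values are not taken from the image) thresholded at 300:

```python
import numpy as np

# Made-up 2x2 "image" for illustration only.
toy_pixels = np.array([[100, 350],
                       [299, 301]])

# The comparison is applied element by element and yields a boolean array.
toy_mask = toy_pixels > 300

print(toy_mask.dtype)             # boolean dtype
print(toy_mask)
print(toy_mask.astype(np.uint8))  # the same mask written as 1s and 0s
```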

## Label images

scipy.ndimage has a function called label that takes a binary image and returns a "labeled" image. A label image is an image where each pixel is assigned an integer: background pixels are 0, and all pixels that share a nonzero number belong to the same connected object. This is exactly what we want for segmentation. The function actually returns two values: the label image and the number of objects it found.

In [13]:
label_img, number_objects = ndi.label(thresholded)
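
To make the idea concrete, here is a small sketch on a hand-written mask (the array is invented for illustration). With the default settings, ndi.label treats pixels as connected only when they share an edge, not when they touch diagonally.

```python
import numpy as np
import scipy.ndimage as ndi

# Hand-written toy mask with three separate groups of True pixels.
toy_mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
], dtype=bool)

# Each connected group of True pixels gets its own integer; background stays 0.
toy_labels, toy_count = ndi.label(toy_mask)

print(toy_labels)
print("objects found:", toy_count)
```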

If we add the label_img to napari, we can see each individual object is a different intensity

In [14]:
viewer.add_image(label_img, name='label_img', colormap='gray', blending='additive')

<Image layer 'label_img' at 0x15f871ea380>

...but this is not very conducive to seeing separation between objects if they have a label value that is very similar.  Napari has a nice feature where you can instead .add_labels(label_img) and it will automatically assign a random color to each label.

In [21]:
viewer.layers.remove('label_img')
viewer.add_labels(label_img, name='label_img')

<Labels layer 'label_img' at 0x15f87614eb0>

## Regionprops (analyze particles)

### Regionprops (quantifying labels)

We have a binary image (thresholded) that ImageJ would normally use for Analyze Particles, but we have improved on it with a label image (label_img). Label images are superior: if two objects touch in a binary image, ImageJ lumps them together into a single object, whereas in a label image (as we shall see with cellpose) touching objects stay separate because each one has its own integer value. Now we want to quantify each object, and we can do that with skimage.measure.regionprops_table. We have to specify which properties to collect. Computing all of them can be expensive, so we will collect only the ones we need.

The available properties are:  https://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.regionprops

The most useful ones are:  

label (the index of the object in the image), 

area (in number of pixels), 

centroid (the z/y/x position of the center of the object), 

mean_intensity, max_intensity, min_intensity, 

perimeter,  

eccentricity (how elongated the object is, ranging from 0 to 1, with a circle having ecc=0), 

orientation (the angle between the row axis and the object's major axis, in radians, ranging from -pi/2 to pi/2), 

axis_major_length, axis_minor_length

Unfortunately regionprops_table returns a dictionary instead of a simple table, so we have to do some extra work to get it into a table.  We'll use pandas.DataFrame.from_dict to convert the dictionary into a table.

In [27]:
results = pd.DataFrame.from_dict(ski.measure.regionprops_table(label_img, properties=['label', 'area', 'centroid', 'orientation', 'eccentricity']))
results

| | label | area | centroid-0 | centroid-1 | orientation | eccentricity |
|---|---|---|---|---|---|---|
| 0 | 1 | 14404.0 | 179.556651 | 463.463968 | 0.109041 | 0.787659 |
| 1 | 2 | 15433.0 | 171.781572 | 298.301173 | -1.419267 | 0.549767 |
| 2 | 3 | 14937.0 | 279.70804 | 140.079199 | -0.609245 | 0.726105 |
| 3 | 4 | 14258.0 | 408.910296 | 333.062141 | -0.593316 | 0.736625 |


'centroid-0' is the y position, 'centroid-1' is the x position.  Orientation in radians is not intuitive, so let's fix that.

In [28]:
results['orientation'] = (90+results['orientation']/np.pi*180)
results

| | label | area | centroid-0 | centroid-1 | orientation | eccentricity |
|---|---|---|---|---|---|---|
| 0 | 1 | 14404.0 | 179.556651 | 463.463968 | 96.247604 | 0.787659 |
| 1 | 2 | 15433.0 | 171.781572 | 298.301173 | 8.681978 | 0.549767 |
| 2 | 3 | 14937.0 | 279.70804 | 140.079199 | 55.092811 | 0.726105 |
| 3 | 4 | 14258.0 | 408.910296 | 333.062141 | 56.005484 | 0.736625 |


Comparing to the image, we should see that label 2 (centered at 171/298) is pretty flat (8 degrees), and label 1 (centered at 180/463) is mostly vertical (96 degrees).
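
The same conversion can also be written with np.degrees, which some readers find easier to scan. The short sketch below applies it to the first two radian values from the table above; the variable names are only for illustration.

```python
import numpy as np

# Orientation values (radians) of labels 1 and 2, copied from the table above.
toy_orientation_rad = np.array([0.109041, -1.419267])

# Convert to degrees and add 90, so that 0 means horizontal and 90 means vertical.
toy_orientation_deg = 90 + np.degrees(toy_orientation_rad)

print(toy_orientation_deg)  # approximately 96.25 and 8.68
```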

### Regionprops (quantifying intensities)

Note that the only argument we gave to ski.measure.regionprops_table that had an image in it was label_img which does not include any information about the raw intensities of our original image:  just shape information.  

What if we want to quantify the intensities of the lysosome or mitochondrial channel? ski.measure.regionprops_table lets us give an intensity image as an argument.

In [30]:
mito_results = pd.DataFrame.from_dict(ski.measure.regionprops_table(label_img, mito_backsub, properties=['label', 'area', 'centroid', 'mean_intensity']))
mito_results

| | label | area | centroid-0 | centroid-1 | mean_intensity |
|---|---|---|---|---|---|
| 0 | 1 | 14404.0 | 179.556651 | 463.463968 | 65.834702 |
| 1 | 2 | 15433.0 | 171.781572 | 298.301173 | 66.851036 |
| 2 | 3 | 14937.0 | 279.70804 | 140.079199 | 73.726715 |
| 3 | 4 | 14258.0 | 408.910296 | 333.062141 | 82.320663 |


Compare these values to the original image and the mito_backsubbed layer: do they make sense? Note a fun feature of napari label layers: you can change the "contour" argument to 1, and it will show just the outline of the labels.

Regionprops is a very powerful tool: we can write our own custom functions to perform some kind of analysis on each object. For instance, you could write a function that finds the 90th and 10th percentiles of intensity in each object, which is useful for judging how punctate the signal is.
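
A minimal sketch of that percentile idea is shown below, assuming a scikit-image version whose regionprops_table accepts the extra_properties argument. The function names intensity_p90 and intensity_p10, and the toy label and intensity arrays, are hypothetical and only serve the example.

```python
import numpy as np
from skimage import measure

# Hypothetical custom properties. Each receives the boolean mask of one
# object and the intensity image cropped to that object's bounding box.
def intensity_p90(regionmask, intensity_image):
    return np.percentile(intensity_image[regionmask], 90)

def intensity_p10(regionmask, intensity_image):
    return np.percentile(intensity_image[regionmask], 10)

# Toy data: two square objects on a 6x6 image with increasing intensities.
toy_labels = np.zeros((6, 6), dtype=int)
toy_labels[:3, :3] = 1
toy_labels[3:, 3:] = 2
toy_intensity = np.arange(36, dtype=float).reshape(6, 6)

toy_props = measure.regionprops_table(
    toy_labels,
    toy_intensity,
    properties=('label',),
    extra_properties=(intensity_p90, intensity_p10),
)
print(toy_props)
```

The custom columns are named after the functions, so the resulting dictionary can be passed to pandas.DataFrame in the same way as before.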

# Watershed segmentation

## Masking cells vs background

This is fine for finding the nuclei, but what if we wanted to find whole cells and keep them separate? First we need a binary image of the cells. The raw lyso channel (without background subtraction) seems like a reasonable place to start. First we'll blur it a little bit.

In [44]:
smoothed_lyso = ndi.gaussian_filter(lyso, 10)
viewer.add_image(smoothed_lyso, name='smoothed_lyso', colormap='gray', blending='additive')

<Image layer 'smoothed_lyso' at 0x15f0af17760>

If we adjust the lower contrast limit to about 350 we can see the cells, so we'll use that as our threshold.

In [45]:
masked_lyso = smoothed_lyso>350
viewer.add_image(masked_lyso, name='masked_lyso', colormap='gray', blending='additive')

<Image layer 'masked_lyso' at 0x15f07792080>

## Separating the cells

Now we have a binary image of where there is cell vs no cell, but we have not separated them into individual cells.  To do so we are going to combine our nuclear mask with a little trick:  we are going to use the distance transform of the cell mask.  The distance transform is a measure of how far each pixel is from the nearest "edge" of the cell mask.  We can use this to "push" the cells apart from each other.

The distance transform is actually a pretty simple concept: for each foreground pixel, measure the distance (in pixels) to the nearest background (False) pixel, and use that as the new intensity. Background pixels get a value of 0.
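
A small sketch on a made-up rectangular mask shows the pattern: values are 0 in the background, 1 along the inner edge of the object, and largest in the middle.

```python
import numpy as np
import scipy.ndimage as ndi

# Toy mask: a 3x5 rectangle of True pixels surrounded by background.
toy_mask = np.zeros((5, 7), dtype=bool)
toy_mask[1:4, 1:6] = True

# Euclidean distance from each True pixel to the nearest False pixel.
toy_edt = ndi.distance_transform_edt(toy_mask)

print(toy_edt)
```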

In [53]:
edt = ndi.distance_transform_edt(masked_lyso)
viewer.add_image(edt, name='edt', colormap='gray', blending='additive')

<Image layer 'edt' at 0x15f7ed0e290>

ski.segmentation.watershed takes an image that decides where to split two objects apart (here the negated edt), our seed image with the nuclei labeled (label_img), and a mask that limits how far the cells can grow (masked_lyso).

In [54]:
watershedded = ski.segmentation.watershed(-edt, label_img, mask=masked_lyso)
viewer.add_labels(watershedded, name='watershedded', blending='additive')

<Labels layer 'watershedded' at 0x15f7e819c00>

If we set the watershedded layer to contour=1 and then adjust the contrast on the EDT layer, we can see roughly what it is doing. It places the boundary between two cells where the distance transform is smallest.
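
The same recipe can be tried on synthetic data. The sketch below builds two overlapping disks as a stand-in for touching cells, places one seed pixel at each disk center as a stand-in for the nuclei, and splits the merged mask with the watershed. All sizes and positions are invented for illustration.

```python
import numpy as np
import scipy.ndimage as ndi
from skimage import segmentation

# Two overlapping disks form one connected "cell mask".
yy, xx = np.mgrid[:40, :70]
disk_a = (yy - 20) ** 2 + (xx - 22) ** 2 < 15 ** 2
disk_b = (yy - 20) ** 2 + (xx - 46) ** 2 < 15 ** 2
toy_mask = disk_a | disk_b

# One seed per disk center, playing the role of the labeled nuclei.
toy_seeds = np.zeros(toy_mask.shape, dtype=int)
toy_seeds[20, 22] = 1
toy_seeds[20, 46] = 2

# Flood the negated distance transform from the seeds, restricted to the mask.
toy_edt = ndi.distance_transform_edt(toy_mask)
toy_cells = segmentation.watershed(-toy_edt, toy_seeds, mask=toy_mask)

print("labels present:", np.unique(toy_cells))
print("pixels per label:", np.bincount(toy_cells.ravel()))
```

The split falls along the narrow neck where the two disks overlap, which is where the distance transform is lowest between the two seeds.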

## Quantifying the cells

watershedded is also a label image, but each label now covers a whole cell. We can use the same regionprops_table function to quantify the cells.

In [58]:
results = pd.DataFrame(ski.measure.regionprops_table(watershedded, lyso_backsub, properties=('label', 'area', 'centroid', 'mean_intensity')))
results

| | label | area | centroid-0 | centroid-1 | mean_intensity |
|---|---|---|---|---|---|
| 0 | 1 | 44235.0 | 188.152933 | 499.062462 | 71.550331 |
| 1 | 2 | 50295.0 | 131.96952 | 319.039407 | 54.592186 |
| 2 | 3 | 42868.0 | 243.536437 | 136.338947 | 73.122444 |
| 3 | 4 | 38548.0 | 397.226497 | 366.374883 | 94.249893 |


Let's say we also want to quantify the mitos. We will create a new table from the mito channel, then take its mean_intensity column and add it to the results table we already have as a new column, 'mito_mean_intensity'.

In [60]:
mito_cell_results = pd.DataFrame(ski.measure.regionprops_table(watershedded, mito_backsub, properties=('label', 'area', 'centroid', 'mean_intensity')))
results['mito_mean_intensity'] = mito_cell_results['mean_intensity']
results

| | label | area | centroid-0 | centroid-1 | mean_intensity | mito_mean_intensity |
|---|---|---|---|---|---|---|
| 0 | 1 | 44235.0 | 188.152933 | 499.062462 | 71.550331 | 101.695312 |
| 1 | 2 | 50295.0 | 131.96952 | 319.039407 | 54.592186 | 108.867981 |
| 2 | 3 | 42868.0 | 243.536437 | 136.338947 | 73.122444 | 99.088104 |
| 3 | 4 | 38548.0 | 397.226497 | 366.374883 | 94.249893 | 106.858253 |
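
Copying a column across works here because both tables come from the same label image and therefore list the labels in the same row order. When that is not guaranteed, merging on the 'label' column is a safer alternative. The sketch below uses two small made-up tables (the numbers are placeholders, not measurements) whose rows are deliberately in a different order.

```python
import pandas as pd

# Placeholder tables; the values are invented for illustration.
toy_lyso = pd.DataFrame({'label': [1, 2, 3],
                         'mean_intensity': [1.0, 2.0, 3.0]})
toy_mito = pd.DataFrame({'label': [3, 1, 2],
                         'mean_intensity': [10.0, 20.0, 30.0]})

# Rename the mito column first so the two intensity columns do not clash,
# then match rows by label rather than by position.
toy_combined = toy_lyso.merge(
    toy_mito.rename(columns={'mean_intensity': 'mito_mean_intensity'}),
    on='label',
)
print(toy_combined)
```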


# Visualizing results

### Plotting

We can plot the results using plotly. For those who don't know: plotly express functions take a DataFrame, and you specify which columns to plot on which axis. You can also specify which column sets the color of the points and which column sets their size.

In [67]:
px.bar(results, x='label', y='area', width=400)

In [66]:
px.bar(results, x='label', y='mean_intensity', title='Lysosome mean intensity', width=400)

### With images

But this is boring!  We can visualize this in a much more exciting way using napari.

The ski.util.map_array() function lets us take a label image (in this case watershedded), and map intensities onto each object.  We can use this to visualize the lysosome intensity of each cell.
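
On a small scale, the mapping looks like this. The label array and the per-label values below are invented for illustration, and the background label 0 is mapped explicitly to 0.0 so that every value in the label array has an entry.

```python
import numpy as np
from skimage import util

# Toy label image with background (0) and two objects (1 and 2).
toy_labels = np.array([[0, 1, 1],
                       [2, 2, 0]])

# One measurement per label, as it might come out of a results table.
toy_label_ids = np.array([0, 1, 2])
toy_measurements = np.array([0.0, 0.5, 2.5])

# Every pixel with label k is replaced by the measurement for label k.
toy_value_img = util.map_array(toy_labels, toy_label_ids, toy_measurements)

print(toy_value_img)
```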

In [71]:
intensity_img = ski.util.map_array(watershedded, results['label'].values, results['mean_intensity'].values)
viewer.add_image(intensity_img, name='intensity_img', colormap='blue', blending='additive')

<Image layer 'intensity_img [1]' at 0x15f7e346bc0>

We can do this for any quantity.

In [72]:
intensity_img = ski.util.map_array(watershedded, results['label'].values, results['area'].values)
viewer.add_image(intensity_img, name='area_img', colormap='blue', blending='additive')

<Image layer 'area_img' at 0x15f7e1f9510>