In [3]:
import tifffile
try:
    import napari
    napari_available = True
except ImportError:
    print("Napari is not installed. Some parts of the notebook will not be available.")
    napari_available = False

from tapenade.preprocessing import (
    change_arrays_pixelsize,
    compute_mask,
    global_image_equalization,
    local_image_equalization,
    align_array_major_axis,
    crop_array_using_mask
)
from tapenade.preprocessing.segmentation_postprocessing import (
    remove_labels_outside_of_mask
)

### <font color='red'> After clicking on a cell, press "Shift+Enter" to run the code, or click on the "Run" button in the toolbar above.<br>

### Replace "..." signs with the appropriate path to your data.
</font>

# Preprocessing and Postprocessing Cookbook

This notebook presents typical use cases of the preprocessing toolbox provided in the `preprocessing.py` and `segmentation_postprocessing.py` scripts, and illustrates how to use each function's parameters.

All functions work on 3D (ZYX convention) or 4D (TZYX convention) images/labels/masks.

From a raw image, a classical preprocessing pipeline would go through the following steps:
1. **correcting for anisotropy**: resample the image so that its voxels are isotropic.
2. **computing the mask**: compute a boolean (0/1) mask of background/foreground voxels.
3. **image equalization**: equalize the image intensity either globally or in local regions of the image to make it more homogeneous.
4. **image segmentation**: extract the objects of interest from the image, e.g. with StarDist3D. ***NOT COVERED IN THIS NOTEBOOK***
5. **spatio-temporal registration**: correct for object drift, or fuse two images spatially. ***NOT COVERED IN THIS NOTEBOOK***
6. **forcing axis alignment**: rotate the arrays so that the major axis of the object is aligned with a chosen axis.
7. **cropping array to mask**: crop any array (image, labels, or mask) to the smallest bounding box containing the mask.

We also provide segmentation postprocessing functions:
1. **removing labels outside of mask**: remove labels that are not fully contained in the mask

Before starting, specify if you wish to display each intermediate result in a Napari viewer:

In [None]:
display_in_napari = True

# The final value of display_in_napari depends on whether napari is installed
display_in_napari = display_in_napari and napari_available

# Preprocessing

## 0. Loading the data

In [3]:
path_to_data = ...

data = tifffile.imread(path_to_data)

if display_in_napari:
    viewer = napari.view_image(data, name='raw data')

## 1. Correcting for anisotropy

The function `change_arrays_pixelsize` resamples arrays from `input_pixelsize` to `output_pixelsize`, typically to make the voxels isotropic. It is useful when the resolution along the Z axis differs from the resolution in the XY plane.

In [4]:
help(change_arrays_pixelsize)

Help on function change_arrays_pixelsize in module tapenade.preprocessing._preprocessing:

change_arrays_pixelsize(mask: numpy.ndarray = None, image: numpy.ndarray = None, labels: numpy.ndarray = None, input_pixelsize: tuple[float, float, float] = (1, 1, 1), output_pixelsize: tuple[float, float, float] = (1, 1, 1), order: int = 1, n_jobs: int = -1) -> numpy.ndarray
    Resizes an input image to have isotropic voxel dimensions.
    
    Parameters:
    - mask: numpy array, input mask
    - image: numpy array, input image
    - labels: numpy array, input labels
    - input_pixelsize: tuple of floats, input pixel dimensions (e.g. in microns)
    - output_pixelsize: tuple of floats, output pixel dimensions (e.g. in microns)
    - order: int, order of interpolation for resizing (defaults to 1 for
      linear interpolation). Choose 0 for nearest-neighbor interpolation
      (e.g. for label images)
    - n_jobs: int, optional number of parallel jobs for resizing (default: -1)
    
    Return

Here, the pixel size of the data is (1, 0.62, 0.62) µm/pix, and we want to change it to (0.62, 0.62, 0.62) µm/pix to make the image isotropic:

In [5]:
isotropic_data = change_arrays_pixelsize(image=data, input_pixelsize=(1, 0.62, 0.62),
                                         output_pixelsize=(0.62,0.62,0.62))

if display_in_napari:
    viewer.add_image(isotropic_data, name='isotropic data')

After making the data isotropic, it is usually easier to visually identify a typical object size in the image, which is useful for setting parameters in the following steps.

In [6]:
object_size = 8
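
As a rough illustration of what the resampling in step 1 implies, the sketch below computes per-axis zoom factors as the ratio of input to output pixel size, and the array shape they would produce. The helper names and the example shape `(100, 512, 512)` are hypothetical, and the rounding used internally by `change_arrays_pixelsize` may differ.

```python
def zoom_factors(input_pixelsize, output_pixelsize):
    """Per-axis scaling (ZYX) needed to go from input to output pixel size."""
    return tuple(i / o for i, o in zip(input_pixelsize, output_pixelsize))


def resampled_shape(shape, factors):
    """Approximate array shape after resampling (rounding is an assumption)."""
    return tuple(int(round(s * f)) for s, f in zip(shape, factors))


# Pixel sizes from this notebook; the array shape is a made-up example
factors = zoom_factors((1, 0.62, 0.62), (0.62, 0.62, 0.62))
new_shape = resampled_shape((100, 512, 512), factors)

print(factors)    # only the Z axis is stretched (factor of about 1.61)
print(new_shape)
```

Only the Z axis is resampled here, because the XY pixel size already matches the target.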

## 2. Computing the mask

The function `compute_mask` computes a boolean (0/1) mask of background/foreground voxels. It is useful to remove background noise and to define the region of interest for the following steps.

In [5]:
help(compute_mask)

Help on function compute_mask in module tapenade.preprocessing._preprocessing:

compute_mask(image: numpy.ndarray, method: str, sigma_blur: float, threshold_factor: float = 1, compute_convex_hull: bool = False, registered_image: bool = False, n_jobs: int = -1) -> numpy.ndarray
    Compute the mask for the given image using the specified method.
    
    Parameters:
    - image: numpy array, input image
    - method: str, method to use for thresholding. Can be 'snp otsu' for Signal-Noise Product thresholding,
      or 'otsu' for Otsu's thresholding.
    - sigma_blur: float, standard deviation of the Gaussian blur. Should typically be
      around 1/3 of the typical object diameter.
    - threshold_factor: float, factor to multiply the threshold (default: 1)
    - compute_convex_hull: bool, set to True to compute the convex hull of the mask. If set to
      False, a hole-filling operation will be performed instead.
    - n_jobs: int, number of parallel jobs to run (-1 for using all avail

The parameter `method` can be set to the following values:
1. `otsu` computes the mask by first blurring the image with a Gaussian filter of standard deviation `sigma_blur` (which should scale with the typical object size if it is known; the docstring suggests about 1/3 of the typical object diameter) and then applying Otsu's thresholding method.
2. `snp otsu` computes a local Signal-and-Noise Product map from the image using a Gaussian filter of standard deviation `sigma_blur`, and then applies Otsu's thresholding method to the map.

`snp otsu` is usually the more robust method, but it is also the slower one. `otsu` is faster but can be sensitive to noise and to large intensity variations among foreground objects. In case of doubt, it is recommended to try both methods and visually inspect the results. `compute_mask` also has a parameter `threshold_factor` that multiplies the threshold value given by the chosen method.

`compute_mask` can also be called with the parameter `compute_convex_hull` set to True to return the convex hull of the mask. This is particularly useful when artifactual holes remain in the mask, but it leads to a less precise mask and takes considerably longer to compute. When set to False (default), a simple hole-filling operation is performed on the mask instead.
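
To make the roles of Otsu's threshold and `threshold_factor` more concrete, here is a minimal NumPy sketch of the plain `otsu` idea applied to synthetic intensities. It omits the Gaussian blur and the hole filling, and `otsu_threshold` is a hypothetical helper written for illustration, not the code used by `compute_mask`.

```python
import numpy as np


def otsu_threshold(values, nbins=256):
    """Return the histogram bin center that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=nbins)
    hist = hist.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2

    # Class sizes and mean intensities on each side of every candidate cut
    weight_low = np.cumsum(hist)
    weight_high = np.cumsum(hist[::-1])[::-1]
    mean_low = np.cumsum(hist * centers) / np.maximum(weight_low, 1)
    mean_high = np.cumsum((hist * centers)[::-1])[::-1] / np.maximum(weight_high, 1)

    between_var = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(between_var)]


# Synthetic intensities: a dim background and a bright foreground
background = np.linspace(5, 15, 500)
foreground = np.linspace(90, 110, 500)
intensities = np.concatenate([background, foreground])

threshold = otsu_threshold(intensities)
mask = intensities > threshold
# A factor below 1 lowers the threshold, so more voxels are kept
mask_permissive = intensities > 0.6 * threshold
```

With a factor below 1 the mask grows to include dimmer voxels, which is why `threshold_factor=0.6` is paired with `otsu` in the cell that follows.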

In [17]:
mask_otsu = compute_mask(isotropic_data, method='otsu', sigma_blur=object_size, threshold_factor=0.6)
mask_snp = compute_mask(isotropic_data, method='snp otsu', sigma_blur=object_size/2)

if display_in_napari:
    viewer.add_image(mask_otsu, name='mask otsu')
    viewer.add_image(mask_snp, name='mask snp')

## 3. Image equalization

To correct for intensity variations in the image, we provide the functions `global_image_equalization` and `local_image_equalization`.

`global_image_equalization` equalizes the image by mapping the `perc_low` percentile to 0 and the `perc_high` percentile to 1, then clipping the result to the range [0, 1]. This function is useful when the intensity histogram has long tails, i.e. when a few very bright or very dark objects lead to poor contrast. Choosing `perc_low` slightly above 0 and `perc_high` slightly below 100 (e.g. 1 and 99) discards these outliers.

`local_image_equalization` computes the intensity histogram in boxes of size `box_size` (which should be set to the typical object size if it is known) centered on the vertices of a uniform 3D grid spanning the image array. For each vertex, the `perc_low` and `perc_high` percentiles of the histogram are computed, and these values are then interpolated at every voxel of the image. The image is equalized by mapping the local `perc_low` percentile to 0 and the local `perc_high` percentile to 1, and is finally clipped to the range [0, 1]. This function is useful when the illumination is non-uniform, i.e. when the intensity varies across the image, which leads to contrast variations across space.

The functions have an optional parameter `mask` to specify a mask of the background/foreground voxels. If a mask is provided, values outside the mask are set to 0.
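
The global mapping described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the implementation of `global_image_equalization`: the helper name `percentile_rescale` is hypothetical, and computing the percentiles only inside the mask is an assumption.

```python
import numpy as np


def percentile_rescale(image, perc_low, perc_high, mask=None):
    """Map the perc_low percentile to 0 and perc_high to 1, then clip to [0, 1]."""
    # Assumption: when a mask is given, percentiles are taken inside it only
    values = image[mask] if mask is not None else image
    low, high = np.percentile(values, [perc_low, perc_high])
    rescaled = np.clip((image - low) / (high - low), 0, 1)
    if mask is not None:
        # Values outside the mask are set to 0
        rescaled = np.where(mask, rescaled, 0.0)
    return rescaled


# Toy 1D "image" with intensities 0, 1, ..., 100
image = np.arange(101, dtype=float)
rescaled = percentile_rescale(image, perc_low=1, perc_high=99)

# Same call with a mask covering the upper half of the intensities
mask = image >= 50
rescaled_masked = percentile_rescale(image, perc_low=1, perc_high=99, mask=mask)
```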

In [None]:
help(global_image_equalization)

In [6]:
help(local_image_equalization)

Help on function local_image_equalization in module tapenade.preprocessing._preprocessing:

local_image_equalization(image: numpy.ndarray, box_size: int, perc_low: float, perc_high: float, mask: numpy.ndarray = None, n_jobs: int = -1) -> numpy.ndarray
    Performs local image equalization on either a single image or a temporal stack of images.
    Stretches the image histogram in local neighborhoods by remapping intesities in the range
    [perc_low, perc_high] to the range [0, 1].
    This helps to enhance the contrast and improve the visibility of structures in the image.
    
    Parameters:
    - image: numpy array, input image or temporal stack of images
    - box_size: int, size of the local neighborhood for equalization
    - perc_low: float, lower percentile for intensity equalization
    - perc_high: float, upper percentile for intensity equalization
    - mask: numpy array, binary mask used to set the background to zero (optional)
    - n_jobs: int, number of parallel jobs to

In [None]:
equalized_data_global = global_image_equalization(isotropic_data, mask=mask_snp, 
                                                  perc_low=1, perc_high=99)

# in the rest of the notebook, we will use the locally equalized data
equalized_data = local_image_equalization(isotropic_data, mask=mask_snp,
                                            box_size=object_size,
                                            perc_low=1, perc_high=99)

if display_in_napari:
    viewer.add_image(equalized_data_global, name='equalized data global')
    viewer.add_image(equalized_data, name='equalized data')
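
To see why local equalization helps with non-uniform illumination, the following 1D sketch mimics the procedure described for `local_image_equalization`: percentiles are computed in windows centered on a regular grid, interpolated at every sample, and used to rescale the signal. It is a simplified illustration (1D, linear interpolation with `np.interp`, hypothetical helper name), not the 3D implementation used by the package.

```python
import numpy as np


def local_equalize_1d(signal, box_size, perc_low, perc_high):
    """1D analogue of local equalization using percentiles on a regular grid."""
    n = signal.size
    centers = np.arange(0, n, box_size)
    half = box_size // 2
    lows = np.empty(centers.size)
    highs = np.empty(centers.size)
    for k, c in enumerate(centers):
        # Window of about box_size samples centered on the grid point
        window = signal[max(c - half, 0): c + half + 1]
        lows[k], highs[k] = np.percentile(window, [perc_low, perc_high])

    # Interpolate the local percentiles at every sample position
    positions = np.arange(n)
    low = np.interp(positions, centers, lows)
    high = np.interp(positions, centers, highs)
    scale = np.maximum(high - low, 1e-12)  # avoid division by zero
    return np.clip((signal - low) / scale, 0, 1)


# Alternating dark/bright samples under a 3x illumination gradient
pattern = np.tile([0.0, 1.0], 100)
gain = np.linspace(1, 3, 200)
signal = pattern * gain

equalized = local_equalize_1d(signal, box_size=20, perc_low=0, perc_high=100)
globally_rescaled = signal / signal.max()
```

After local equalization the bright samples have similar values along the whole signal, whereas a single global rescaling leaves the dim end at about one third of the bright end.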

## 4. Image segmentation

As stated above, we do not cover the image segmentation step in this notebook. We refer the reader to the `segmentation` notebook provided with this package, which uses StarDist3D to detect nuclei.

For the purpose of this notebook, we will directly load a pre-segmented array.

In [11]:
path_to_labels = ...

labels = tifffile.imread(path_to_labels)

if display_in_napari:
    viewer.add_labels(labels, name='labels')

## 5. Spatio-temporal registration

As stated above, we do not cover the spatio-temporal registration step in this notebook. We refer the reader to the `registration` notebook provided with this package.

## 6. Forcing axis alignment

When the object of interest (e.g. a gastruloid) has a preferential orientation, it can be useful to align its major axis with a specific axis of the array. We provide the function `align_array_major_axis` to do so. It computes the principal axes of the mask and rotates the image, labels, and mask so that the major axis is aligned with the specified axis.

The mask is required; the image and labels are optional, so the function can be called with the mask alone or together with the image, the labels, or both. The major axis is aligned with axis `target_axis` (can be 'X', 'Y', or 'Z') by rotating the arrays in the plane `rotation_plane` (can be 'XY', 'XZ', or 'YZ').
If the data is temporal (i.e. 4D), the major axis is computed on a mask obtained by summing the 3D masks along the time axis. To compute the major axis from a specific time range only, use the parameter `temporal_slice`.
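
The principal-axis computation can be sketched with NumPy alone. The example below estimates the major axis of a toy mask by PCA on the voxel coordinates and measures its angle to the X axis in the XY plane. The helper name is hypothetical, and the way `align_array_major_axis` performs the rotation itself is not shown here.

```python
import numpy as np


def major_axis_zyx(mask):
    """Unit vector (z, y, x) along the direction of largest spread of the mask."""
    coords = np.argwhere(mask).astype(float)  # one (z, y, x) row per voxel
    coords -= coords.mean(axis=0)
    covariance = np.cov(coords, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    return eigenvectors[:, np.argmax(eigenvalues)]


# Toy mask elongated along Y
mask = np.zeros((10, 20, 20), dtype=bool)
mask[4:6, 2:18, 9:11] = True

# For 4D (TZYX) data, the 3D masks are first combined along the time axis
mask_4d = np.stack([mask, mask])
aggregated_mask = mask_4d.sum(axis=0) > 0

axis = major_axis_zyx(aggregated_mask)
# Angle between the major axis and the X axis, measured in the XY plane
angle_to_x = np.degrees(np.arctan2(axis[1], axis[2]))
```

The sign of a principal axis is arbitrary, so the angle is defined up to 180 degrees.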

In [7]:
help(align_array_major_axis)

Help on function align_array_major_axis in module tapenade.preprocessing._preprocessing:

align_array_major_axis(target_axis: str, rotation_plane: str, mask: numpy.ndarray, image: Optional[numpy.ndarray] = None, labels: Optional[numpy.ndarray] = None, order: int = 1, temporal_slice: Optional[int] = None, n_jobs: int = -1) -> Union[numpy.ndarray, tuple[numpy.ndarray, numpy.ndarray], tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]]
    Aligns the major axis of an array to a target axis in a specified rotation plane.
    This function uses Principal Component Analysis (PCA) to determine the major axis of the array,
    and then rotates the array to align the major axis with the target axis.
    
    Parameters:
    - target_axis: str, the target axis to align the major axis with ('X', 'Y', or 'Z')
    - rotation_plane: str, the rotation plane to perform the rotation in ('XY', 'XZ', or 'YZ')
    - mask: numpy array, binary mask indicating the region of interest
    - image: numpy array,

In [13]:
aligned_mask, aligned_data, align_labels = align_array_major_axis(
    target_axis='X', rotation_plane='XY', # -> align the major axis with the X axis
    mask=mask_snp, image=equalized_data, labels=labels,
    temporal_slice=slice(2, 10) # -> use frames 2 up to (but excluding) 10 to compute the major axis
)

if display_in_napari:
    viewer.add_image(aligned_mask, name='aligned mask')
    viewer.add_image(aligned_data, name='aligned data')
    viewer.add_labels(align_labels, name='aligned labels')

## 7. Cropping array to mask

The function `crop_array_using_mask` crops any array (image, labels, or mask) to the smallest bounding box containing the mask. It has an optional parameter `margin` to add a margin around the bounding box.

This function can be used to drastically reduce the size of the data to process at each stage by removing useless background voxels. **Though presented at the very end of the pipeline, it can be used at any stage of the pipeline to reduce the size of the data to process.**
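
As an illustration of the cropping logic, the sketch below computes the bounding box of a mask with an optional margin and applies it to the array. The helper `bounding_box_slices` is hypothetical, and clamping the margin at the array borders is an assumption about how `crop_array_using_mask` behaves.

```python
import numpy as np


def bounding_box_slices(mask, margin=0):
    """Slices of the smallest box containing the mask, enlarged by `margin`."""
    coords = np.argwhere(mask)
    # Assumption: the margin is clamped so the box stays inside the array
    starts = np.maximum(coords.min(axis=0) - margin, 0)
    stops = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(starts, stops))


# Toy 3D mask (ZYX) with a small foreground block
mask = np.zeros((10, 20, 30), dtype=bool)
mask[2:5, 4:10, 6:16] = True

tight = mask[bounding_box_slices(mask)]
padded = mask[bounding_box_slices(mask, margin=2)]

print(tight.shape)   # (3, 6, 10)
print(padded.shape)  # (7, 10, 14): the Z margin is clamped at index 0
```

The same slices can be applied to the image and the labels so that all arrays stay aligned.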

In [8]:
help(crop_array_using_mask)

Help on function crop_array_using_mask in module tapenade.preprocessing._preprocessing:

crop_array_using_mask(mask: numpy.ndarray, image: Optional[numpy.ndarray] = None, labels: Optional[numpy.ndarray] = None, margin: int = 0, n_jobs: int = -1) -> Union[numpy.ndarray, tuple[numpy.ndarray, numpy.ndarray], tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]]
    Crop an array using a binary mask. If the array is temporal, the cropping
    slice is computed by aggregating mask instances at all times.
    
    Parameters:
    - mask: numpy array, binary mask indicating the region of interest
    - image: numpy array, input image or temporal stack of images (optional)
    - labels: numpy array, labels corresponding to the mask (optional)
    - margin: int, optional margin to add around the mask (default: 0)
    - n_jobs: int, number of parallel jobs to use (not used currently as the function is not computationally intensive)
    
    Returns:
    - cropped_array: numpy array, cropped array 

In [15]:
cropped_mask, cropped_data, cropped_labels = crop_array_using_mask(
    mask=aligned_mask, image=aligned_data, labels=align_labels, margin=0
)

if display_in_napari:
    viewer.add_image(cropped_mask, name='cropped mask')
    viewer.add_image(cropped_data, name='cropped data')
    viewer.add_labels(cropped_labels, name='cropped labels')

# Segmentation postprocessing

## 1. Removing labels outside of mask

Due to the presence of noise in the image, the segmentation can sometimes produce labels that are not fully contained in the mask. We provide the function `remove_labels_outside_of_mask` to remove these labels. It takes as input the labels and the mask, and removes the labels that are not fully contained in the mask.
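
The filtering rule described above can be sketched as follows: any label that has at least one voxel outside the mask is removed entirely. The helper name is hypothetical, and the sketch follows the description in this notebook rather than the source of `remove_labels_outside_of_mask`.

```python
import numpy as np


def keep_labels_inside_mask(labels, mask):
    """Set to 0 every label that is not fully contained in the mask."""
    mask = mask.astype(bool)
    # Labels with at least one voxel outside the mask (background 0 excluded)
    outside_ids = np.unique(labels[~mask])
    outside_ids = outside_ids[outside_ids != 0]
    cleaned = labels.copy()
    cleaned[np.isin(labels, outside_ids)] = 0
    return cleaned


# Toy 2D example: label 1 lies inside the mask, label 2 crosses its border
labels = np.array([[0, 1, 1, 0, 2, 2],
                   [0, 1, 1, 0, 2, 2]])
mask = np.array([[1, 1, 1, 1, 1, 0],
                 [1, 1, 1, 1, 1, 0]], dtype=bool)

cleaned = keep_labels_inside_mask(labels, mask)
print(np.unique(cleaned))  # [0 1]
```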

In [16]:
labels_filtered = remove_labels_outside_of_mask(cropped_labels, cropped_mask)

if display_in_napari:
    viewer.add_labels(labels_filtered, name='labels filtered')
