## To-do
- Now that import and metadata extraction are working, we need to start on preprocessing (mostly interpolating timepoints for z-slices if the scope records them on a frame-by-frame basis) and on a scheme for identifying a nuclear channel and a spot channel that is compatible with switching between the two (e.g. using mCherry to segment nuclei during the cycles but not at division).
- Makes sense to use dask for visualization (e.g. choosing a threshold).
- Write DoG/segmentation function so that it can take either 2D or 3D data - give the option to segment off of a projection, or off of raw 3D data.
    - Write in options for DoG and LoG segmentation algorithm with standard nuclear sizes vs box DoG/LoG vs watershed.
        - Actually, box filtering might not be very helpful if we're cutting off part of the nucleus in z - the BP filtering will project it into a distorted Gaussian if we're not right in the middle of the nucleus, then misplace the centroid and botch the diameter estimation from $\sigma$. For 3D segmentation, it might be better to use a single filter to find markers and then perform a watershed.
- 3D DoG notes:
    - $\sigma_{x, y} = 8$ works perfectly to segment out nuclei during nc 13.
    - $\sigma_z$ is BP-filtered with (1, 9), where 9 is the z-sigma corresponding to the whole nucleus. This allows the BP to be very permissive in z and filter out the nuclei in x and y.
- Proposed procedure for local peak finding:
    - Run box DoG as below with permissive BP in z and LoG approximation in (x, y), only varying $\sigma$ in the latter.
    - Peak-finding on standard image (e.g. $\sigma_{x, y} = 8$), then use coordinates as initial guess for next sigma values.
- Simple BP filter + peak finding does a good job finding markers. Then give the option to watershed-segment directly off of the image, off of the distance-transformed Otsu-thresholded image, or off of edge-finding.
    - For data with the mid-nuclear plane on the boundary of our z-stack, might be useful to give the option to segment in 2D, then threshold each nuclear column locally to identify the nucleus.
    - Need to write loop over timepoints, clean up small objects at each step, then commit segmentation to file.
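
The 3D DoG and peak-finding notes above can be sketched on synthetic data. This is a minimal illustration, not the pipeline itself: the stack, the blob sizes and positions, the box-shaped footprint, and the 10% response cutoff are all assumptions made up for the example, and the sigmas only loosely follow the values quoted above. It uses `scipy.ndimage` directly so that it runs without the rest of the notebook.

```python
import numpy as np
from scipy import ndimage as ndi

# Synthetic (z, y, x) stack: two Gaussian "nuclei" on a flat background.
# Sizes and positions are illustrative, not measured values.
shape = (15, 96, 96)
centers = [(7, 30, 30), (7, 62, 66)]
zz, yy, xx = np.indices(shape)
stack = np.zeros(shape, dtype=float)
for cz, cy, cx in centers:
    stack += np.exp(
        -((zz - cz) ** 2) / (2 * 4.0**2)
        - ((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 8.0**2)
    )

# Difference of Gaussians with per-axis sigmas: a permissive band-pass in z,
# and a band tuned to the nuclear size in (y, x).
low_sigma = (1, 6, 6)
high_sigma = (9, 10, 10)
dog = ndi.gaussian_filter(stack, low_sigma) - ndi.gaussian_filter(stack, high_sigma)

# A voxel is kept as a marker if it equals the maximum of its neighbourhood
# and its response clears a permissive cutoff (assumed here: 10% of the max).
footprint = np.ones((5, 15, 15), dtype=bool)
is_peak = dog == ndi.maximum_filter(dog, footprint=footprint)
is_peak &= dog > 0.1 * dog.max()
marker_coordinates = np.argwhere(is_peak)
print(marker_coordinates)
```

The coordinates come back in (z, y, x) order, which is the layout napari's `add_points` expects for a stack indexed the same way.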

In [1]:
import preprocessing.import_data as im

trim_series = True
lif_test_name = "test_data/2021-06-14/p2pdpwt"
lsm_test_name = "test_data/2023-04-07/p2pdp_zld-sites-ctrl_fwd_1"

(
    channels_full_dataset,
    original_global_metadata,
    original_frame_metadata,
    export_global_metadata,
    export_frame_metadata,
) = im.import_save_dataset(lsm_test_name, trim_series)

  warn('Due to an issue with JPype 0.6.0, reading is slower. '


In [2]:
import napari
import matplotlib.pyplot as plt
import numpy as np

In [2]:
nuclear_channel = channels_full_dataset[1]

In [3]:
import dask.array as da

image = nuclear_channel[40:50]
# nuclear_test = da.from_array(image, chunks=image.shape)

In [4]:
test_stack = image[4]

In [6]:
viewer = napari.view_image(test_stack, name="Nuclear Channel")
napari.run()

In [8]:
import warnings
import numpy as np
from skimage.filters import difference_of_gaussians
from skimage.filters import rank
from skimage.filters import gaussian
from skimage.filters import threshold_otsu
from skimage.filters import threshold_li
from skimage.filters import sobel
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
from skimage.morphology import binary_closing
from skimage.morphology import remove_small_objects
from skimage.util import img_as_ubyte
from scipy import ndimage as ndi
from multiprocess import Pool


def ellipsoid(diameter, height):
    """
    Constructs an ellipsoid footprint for morphological operations - this is usually
    better than built-in skimage.morphology footprints because the voxel dimensions
    in our images are typically anisotropic.

    :param int diameter: Diameter in xy-plane of ellipsoid footprint.
    :param int height: Height in z-axis of ellipsoid footprint.
    :return: Ellipsoid footprint.
    :rtype: Numpy array of booleans.
    """
    # Coerce diameter and height to odd integers (skimage requires footprints to be
    # odd in size).
    if diameter < 3 or height < 3:
        raise Exception(
            " ".join(
                [
                    "Setting diameter or height below 3 results in an",
                    "empty or improperly dimensioned footprint.",
                ]
            )
        )

    round_odd = lambda x: int(((x + 1) // 2) * 2 - 1)
    diameter = round_odd(diameter)
    height = round_odd(height)

    # Generate coordinate arrays
    x = np.arange(-diameter // 2 + 1, diameter // 2 + 1)
    y = np.copy(x)
    z = np.arange(-height // 2 + 1, height // 2 + 1)

    zz, yy, xx = np.meshgrid(z, y, x, indexing="ij")
    ellipsoid_eqn = (
        (xx / (diameter / 2)) ** 2
        + (yy / (diameter / 2)) ** 2
        + (zz / (height / 2)) ** 2
    )
    ellipsoid_footprint = ellipsoid_eqn < 1

    return ellipsoid_footprint


def iterative_peak_local_max(image, footprint):
    """
    Find peaks in an image as coordinate list.

    :param image: 2D (projected) or 3D image of a nuclear marker.
    :type image: Numpy array.
    :param footprint: Footprint used during maximum dilation. This sets the minimum
        distance between peaks, with a single maximum within the net footprint of
        iterated maximum dilations. Can be given as Tuple(num_iter, footprint) where
        the footprint is used for maximum dilation num_iter times, or as a Numpy array
        of booleans that gets used as a footprint for a single maximum dilation.
    :type footprint: {Tuple(int, ndarray), ndarray}
    :return: Coordinates of the local maxima.
    :rtype: Numpy array.
    """
    # Check footprint type to decide number of iterations
    if type(footprint) is tuple:
        num_iter = footprint[0]
        footprint = footprint[1]
    elif isinstance(footprint, np.ndarray):
        num_iter = 1
    else:
        raise TypeError("footprint parameter must be either a tuple or numpy array.")

    # We apply a maximum dilation to the image, then compare to the original image
    # such that the only points that are selected correspond to maxima within the net
    # footprint of the dilation after the iterated application of the max filter.
    image_max = np.copy(image)
    for i in range(num_iter):
        image_max = ndi.maximum_filter(image_max, footprint=footprint)

    peak_mask = image == image_max
    coords = np.transpose(np.nonzero(peak_mask))

    return coords


def mark_nuclei(stack, *, low_sigma, high_sigma, max_footprint):
    """
    Uses a difference of gaussians bandpass filter to enhance nuclei, then a local
    maximum to find markers for each nucleus. Being permissive with the filtering at
    this stage is recommended, since further filtering of the nuclear localization can
    be done post-segmentation using the size and morphology of the segmented objects.

    :param stack: 2D (projected) or 3D image of a nuclear marker.
    :type stack: Numpy array.
    :param low_sigma: Sigma to use as the low-pass filter (mainly filters out
        noise). Can be given as float (assumes isotropic sigma) or as sequence/array
        (each element corresponding to the sigma along one of the image axes).
    :param high_sigma: Sigma to use as the high-pass filter (removes structured
        background and dims down areas where nuclei are close together that might
        start to coalesce under other morphological operations). Can be given as float
        (assumes isotropic sigma) or as sequence/array (each element corresponding to
        the sigma along one of the image axes).
    :param max_footprint: Footprint used by :func:`~iterative_peak_local_max`
        during maximum dilation. This sets the minimum distance between peaks.
    :type max_footprint: Numpy array of booleans.
    :return: Tuple(dog, marker_coordinates, markers) where dog is the
        bandpass-filtered image, marker_coordinates is an array of the nuclear
        locations in the image indexed as per the image (this can be used for
        visualization) and markers is an integer array of the same shape as image,
        with each marker assigned a unique positive label and 0 everywhere else.
    :rtype: Tuple of numpy arrays.
    """
    # Band-pass filter image using difference of gaussians - this seems to work
    # better than trying to do blob detection by varying sigma on an approximation of
    # a Laplacian of Gaussian filter.
    dog = difference_of_gaussians(stack, low_sigma=low_sigma, high_sigma=high_sigma)

    # Find local maxima of the bandpass-filtered image to localize nuclei
    marker_coordinates = iterative_peak_local_max(
        dog,
        footprint=max_footprint,
    )

    # Generate marker mask for segmentation downstream
    mask = np.zeros(dog.shape, dtype=bool)
    mask[tuple(marker_coordinates.T)] = True
    markers, _ = ndi.label(mask)

    return (dog, marker_coordinates, markers)


def segment_nuclei_stack(
    stack,
    markers,
    *,
    denoising,
    thresholding,
    closing_footprint,
    watershed_method,
    **kwargs
):
    """
    Segments nuclei in a z-stack using watershed method.

    :param stack: 2D (projected) or 3D image of a nuclear marker.
    :type stack: Numpy array.
    :param markers: Integer array of dimensions matching stack, with each nucleus
        containing (ideally) a single labeled marker, and all other values being 0
        (e.g. the markers returned by :func:`~mark_nuclei`). This is used to seed
        the watershed segmentation.
    :type markers: Numpy array of integers.
    :param denoising: Determines which method to use for initial denoising of the
        image (before any filtering or morphological operations) between a gaussian
        filter and a median filter.
        * ``gaussian``: requires a ``denoising_sigma`` keyword argument to determine
        the sigma parameter for the gaussian filter.
        * ``median``: requires a ``median_footprint`` keyword argument to determine
        the footprint used for the median filter.
    :type denoising: {'gaussian', 'median'}
    :param thresholding: Determines which method to use to determine a threshold
        for binarizing the stack, between global and local Otsu thresholding, and
        Li's cross-entropy minimization method.
        * ``local_otsu``: requires a ``otsu_footprint`` keyword argument to determine
        the footprint used for the local Otsu thresholding.
    :type thresholding: {'global_otsu', 'local_otsu', 'li'}
    :param closing_footprint: Footprint used for closing operation.
    :type closing_footprint: Numpy array of booleans.
    :param watershed_method: Determines what to use as basins for the watershed
        segmentation, between the inverted denoised image itself (works well for
        bright nuclear markers), the distance-transformed binarized image, and the
        sobel gradient of the image.
    :type watershed_method: {'raw', 'distance_transform', 'sobel'}
    :param denoising_sigma: Sigma used for gaussian filter denoising of the image
        prior to any morphological operations or other filtering. If given as a scalar,
        sigma is assumed to be isotropic. Can also be given as a sequence of scalars
        matching the dimensions of the image, where each element sets the sigma in the
        corresponding image axis
    :type denoising_sigma: scalar or sequence of scalars, only required if using
        ``denoising='gaussian'``.
    :param median_footprint: Footprint used for median filter denoising of the image
        prior to any morphological operations or other filtering.
    :type median_footprint: Numpy array of booleans, only required if using
        ``denoising='median'``.
    :param otsu_footprint: Footprint used for local (rank) Otsu thresholding of the
        image for binarization.
    :type otsu_footprint: Numpy array of booleans, only required if using
        ``thresholding='local_otsu'``.
    :param min_size: Smallest allowable object size.
    :type min_size: int, optional
    :return: A labeled matrix of the same type and shape as markers, with each label
        corresponding to a mask for a single nucleus, assigned to an integer value.
    :rtype: Numpy array.
    """
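    # Normalize image out of place, so that the caller's array is left untouched
    # and integer-typed stacks are promoted to float instead of raising an error
    stack = stack / np.max(np.abs(stack))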

    # Denoising step
    if denoising == "gaussian":
        try:
            denoising_sigma = kwargs["denoising_sigma"]
        except KeyError:
            raise Exception("Gaussian denoising requires a denoising_sigma parameter.")
        denoised_stack = gaussian(stack, sigma=denoising_sigma)

    elif denoising == "median":
        try:
            median_footprint = kwargs["median_footprint"]
        except KeyError:
            raise Exception("Median denoising requires a median_footprint parameter.")
        stack = img_as_ubyte(stack)
        denoised_stack = rank.median(stack, footprint=median_footprint)

    else:
        raise Exception("Unrecognized denoising parameter.")

    # Thresholding step
    if thresholding == "global_otsu":
        threshold = threshold_otsu(denoised_stack)

    elif thresholding == "local_otsu":
        try:
            otsu_footprint = kwargs["otsu_footprint"]
        except KeyError:
            raise Exception(
                "Local Otsu thresholding requires an otsu_footprint parameter."
            )
        # Convert denoised stack to uint8 for rank operation
        denoised_stack = img_as_ubyte(denoised_stack)
        threshold = rank.otsu(denoised_stack, otsu_footprint)

    elif thresholding == "li":
        threshold_guess = threshold_otsu(denoised_stack)
        threshold = threshold_li(denoised_stack, initial_guess=threshold_guess)

    else:
        raise Exception("Unrecognized thresholding parameter.")

    # Binarize stack by thresholding
    binarized_stack = denoised_stack >= threshold

    # Clean up binarized image with a closing operation
    binarized_stack = binary_closing(binarized_stack, closing_footprint)

    # Segmentation step
    if watershed_method == "raw":
        watershed_landscape = -denoised_stack

    elif watershed_method == "distance_transform":
        watershed_landscape = -(ndi.distance_transform_edt(binarized_stack))

    elif watershed_method == "sobel":
        watershed_landscape = sobel(denoised_stack)

    else:
        raise Exception("Unrecognized watershed_method parameter")

    labels = watershed(watershed_landscape, markers=markers, mask=binarized_stack)

    # Remove small objects if a min_size parameter is provided
    try:
        min_size = kwargs["min_size"]
        remove_small_objects(labels, min_size=min_size, out=labels)
    except KeyError:
        pass

    return labels


def segment_nuclei(
    movie,
    *,
    low_sigma,
    high_sigma,
    max_footprint,
    denoising,
    thresholding,
    closing_footprint,
    watershed_method,
    num_processes=1,
    **kwargs
):
    """
    Segments nuclei in a movie using watershed method.

    :param movie: 2D (projected) or 3D movie of a nuclear marker, with time along the
        first axis.
    :type movie: Numpy array.
    :param low_sigma: Sigma to use as the low-pass filter (mainly filters out
        noise). Can be given as float (assumes isotropic sigma) or as sequence/array
        (each element corresponding to the sigma along one of the image axes).
    :param high_sigma: Sigma to use as the high-pass filter (removes structured
        background and dims down areas where nuclei are close together that might
        start to coalesce under other morphological operations). Can be given as float
        (assumes isotropic sigma) or as sequence/array (each element corresponding to
        the sigma along one of the image axes).
    :param max_footprint: Footprint used by :func:`~iterative_peak_local_max`
        during maximum dilation. This sets the minimum distance between peaks.
    :type max_footprint: Numpy array of booleans.
    :param denoising: Determines which method to use for initial denoising of the
        image (before any filtering or morphological operations) between a gaussian
        filter and a median filter.
        * ``gaussian``: requires a ``denoising_sigma`` keyword argument to determine
        the sigma parameter for the gaussian filter.
        * ``median``: requires a ``median_footprint`` keyword argument to determine
        the footprint used for the median filter.
    :type denoising: {'gaussian', 'median'}
    :param thresholding: Determines which method to use to determine a threshold
        for binarizing the stack, between global and local Otsu thresholding, and
        Li's cross-entropy minimization method.
        * ``local_otsu``: requires a ``otsu_footprint`` keyword argument to determine
        the footprint used for the local Otsu thresholding.
    :type thresholding: {'global_otsu', 'local_otsu', 'li'}
    :param closing_footprint: Footprint used for closing operation.
    :type closing_footprint: Numpy array of booleans.
    :param watershed_method: Determines what to use as basins for the watershed
        segmentation, between the inverted denoised image itself (works well for
        bright nuclear markers), the distance-transformed binarized image, and the
        sobel gradient of the image.
    :type watershed_method: {'raw', 'distance_transform', 'sobel'}
    :param int num_processes: Number of worker processes used in parallel loop over
        frames of movie.
    :param denoising_sigma: Sigma used for gaussian filter denoising of the image
        prior to any morphological operations or other filtering. If given as a scalar,
        sigma is assumed to be isotropic. Can also be given as a sequence of scalars
        matching the dimensions of the image, where each element sets the sigma in the
        corresponding image axis
    :type denoising_sigma: scalar or sequence of scalars, only required if using
        ``denoising='gaussian'``.
    :param median_footprint: Footprint used for median filter denoising of the image
        prior to any morphological operations or other filtering.
    :type median_footprint: Numpy array of booleans, only required if using
        ``denoising='median'``.
    :param otsu_footprint: Footprint used for local (rank) Otsu thresholding of the
        image for binarization.
    :type otsu_footprint: Numpy array of booleans, only required if using
        ``thresholding='local_otsu'``.
    :param min_size: Smallest allowable object size.
    :type min_size: int, optional
    :return: A labeled integer array of the same shape as movie, with each label in a
        frame corresponding to a mask for a single nucleus.
    :rtype: Numpy array.
    """
    # Initialize segmentation array
    movie_shape = movie.shape
    num_timepoints = movie_shape[0]

    segmented_movie = np.empty(movie_shape, dtype=int)

    # Define inner function to iterate segmentation over frame number i
    def segment_frame(i):
        _, _, markers = mark_nuclei(
            movie[i],
            low_sigma=low_sigma,
            high_sigma=high_sigma,
            max_footprint=max_footprint,
        )
        labels = segment_nuclei_stack(
            movie[i],
            markers,
            denoising=denoising,
            thresholding=thresholding,
            closing_footprint=closing_footprint,
            watershed_method=watershed_method,
            **kwargs
        )
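        return labels

    # Loop over frames of movie in parallel. Worker processes operate on their own
    # copies of the data, so writes to segmented_movie made inside a worker never
    # reach the parent process. Each frame's labels are returned instead and copied
    # into the output array here.
    with Pool(processes=num_processes) as pool:
        frame_labels = pool.map(segment_frame, range(num_timepoints))

    for i, labels in enumerate(frame_labels):
        segmented_movie[i] = labels

    return segmented_movie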

In [29]:
dog, marker_coords, marker_bool = mark_nuclei(
    test_stack,
    low_sigma=[1, 6, 6],
    high_sigma=[15, 10, 10],
    max_footprint=(6, ellipsoid(5, 3)),
)
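
Passing `max_footprint=(6, ellipsoid(5, 3))` applies the small footprint six times, so the neighbourhood in which a peak must be the maximum is much larger than the footprint itself. The sketch below measures that net neighbourhood by dilating a single bright voxel. The footprint is rebuilt inline as a stand-in for `ellipsoid(5, 3)` so the block runs on its own, and the array size is an arbitrary choice that keeps the dilation away from the borders.

```python
import numpy as np
from scipy import ndimage as ndi

# Ellipsoid-like footprint of shape (3, 5, 5) in (z, y, x), built with the same
# inequality as the notebook's ellipsoid(5, 3).
z, y, x = np.ogrid[-1:2, -2:3, -2:3]
footprint = (x / 2.5) ** 2 + (y / 2.5) ** 2 + (z / 1.5) ** 2 < 1

num_iter = 6

# Dilate a single bright voxel num_iter times. The voxels that end up bright
# trace out the net footprint of the iterated maximum filter.
impulse = np.zeros((21, 41, 41), dtype=np.uint8)
impulse[10, 20, 20] = 1
net = impulse
for _ in range(num_iter):
    net = ndi.maximum_filter(net, footprint=footprint)

occupied = np.argwhere(net > 0)
extent = tuple(int(v) for v in occupied.max(axis=0) - occupied.min(axis=0) + 1)
print("single footprint shape:", footprint.shape)
print("net footprint extent (z, y, x):", extent)
```

For this footprint each pass extends the reach by one voxel in z and two voxels in y and x, so the number of iterations is a convenient knob for the minimum spacing between markers without building a large footprint explicitly.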

In [11]:
viewer.add_points(marker_coords)

<Points layer 'marker_coords' at 0x7f014c063a00>

In [None]:
labels_test = segment_nuclei_stack(
    test_stack,
    marker_bool,
    denoising="gaussian",
    denoising_sigma=3,
    thresholding="global_otsu",
    closing_footprint=ellipsoid(3, 3),
    watershed_method="raw",
    min_size=8000,
)
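
The `watershed_method` argument selects the landscape that the watershed floods from the markers. As a rough illustration of the `'distance_transform'` option, the sketch below splits two overlapping discs that stand in for touching nuclei. The mask, the disc geometry, and the hand-placed seeds are assumptions made for the example; in the notebook the mask comes from thresholding and the seeds from `mark_nuclei`.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

# Binary mask of two overlapping discs (illustrative geometry).
yy, xx = np.indices((80, 80))
centers = [(40, 28), (40, 52)]
radius = 16
mask = np.zeros((80, 80), dtype=bool)
for cy, cx in centers:
    mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= radius**2

# One seed per disc, labeled 1 and 2.
markers = np.zeros(mask.shape, dtype=int)
for label, (cy, cx) in enumerate(centers, start=1):
    markers[cy, cx] = label

# Negate the distance transform so that disc centres become basins and the
# narrow neck between the discs becomes a ridge.
landscape = -ndi.distance_transform_edt(mask)
labels = watershed(landscape, markers=markers, mask=mask)
print(np.unique(labels))
```

Swapping `landscape` for the negated denoised image or for its Sobel gradient gives the `'raw'` and `'sobel'` variants, with the same markers and mask.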

In [19]:
viewer.add_image(labels_test, name="gaussian_global-otsu_raw_minsize")

<Image layer 'gaussian_global-otsu_raw_minsize' at 0x7f03adff5f30>

In [7]:
labels_test_movie = segment_nuclei(
    image,
    low_sigma=[1, 6, 6],
    high_sigma=[15, 10, 10],
    max_footprint=(6, ellipsoid(5, 3)),
    denoising="gaussian",
    denoising_sigma=3,
    thresholding="global_otsu",
    closing_footprint=ellipsoid(3, 3),
    watershed_method="raw",
    min_size=8000,
    num_processes=4,
)