# Infer ***cellmask***, ***nucleus***, and ***cytoplasm*** from a composite image of cytoplasmic organelles - 3️⃣
### Alternative workflow: ***"B"*** (for images with only cytoplasmic organelle markers, NO **nuclei** or **cell membrane** markers, and more than one cell per field of view)
--------------

## OVERVIEW
We will start by segmenting the different cell regions (the nucleus, cell, and cytoplasm) since they are necessary for determining which organelles are in which cell. This is integral to our single-cell analysis approach.

This notebook goes through the workflow steps to segment the ***cytoplasm*** from a composite image of cytoplasmic organelles, uses the inverse of the cytoplasm to identify the nucleus, and then combines the two segmentations to produce the ***cellmask***.

`NOTE: this workflow is optimized for images with multiple fluorescent cells in the field of view.`


## OBJECTIVE: 
### ✅ Infer sub-cellular component #1: ***cytoplasm***
Segment the ***cytoplasm*** of all cells in the image using a composite of multiple organelle markers. This mask should be specific to the cytoplasmic area, but it is only a semantic segmentation: it does not distinguish individual cells.

> ***Biological relevance:***
> The composite image used for the cytoplasm segmentation relies on the assumption that the combined organelle labels "fill up" the entire cytoplasm (excluding the nucleus). This is NOT the most accurate way to determine the cell area, but it is required when membrane and nuclei markers cannot be, or are not, included. How well it works depends largely on the organelle labels used and the cell type.
>
> *It is important to consider specifics of your system as the cell type and labeling method may differ from the example above.*


### ✅ Infer sub-cellular component #2: ***nucleus***
Segment all ***nuclei*** from the inverse of the cytoplasm mask. Because the organelles used for the composite are cytoplasmic, the nuclei should remain "empty".


### ✅ Infer sub-cellular component #3: ***cellmask***
Segment the ***cellmask*** by combining the ***cytoplasm*** and ***nucleus*** masks. To create an instance segmentation of the cellmask, the nuclei are used as the seeds for the watershed operation. The cell with the highest combined fluorescence intensity is considered the main cell for analysis and everything else is discarded. The nuclei and cytoplasm associated with that cell are then selected by masking.



### IMPORTS

In [31]:
# top level imports
from pathlib import Path
import os, sys
from collections import defaultdict
from typing import Union, Tuple, List

import numpy as np

from scipy import ndimage as ndi
from aicssegmentation.core.pre_processing_utils import ( intensity_normalization, 
                                                         image_smoothing_gaussian_slice_by_slice )
from aicssegmentation.core.MO_threshold import MO
from aicssegmentation.core.utils import hole_filling

from skimage import filters
from skimage.segmentation import watershed, clear_border
from skimage.morphology import remove_small_holes, binary_opening, binary_erosion   # function for post-processing (size filter)
from skimage.measure import label

# # package for io 
from aicsimageio import AICSImage

import napari

### import local python functions in ../infer_subc
sys.path.append(os.path.abspath((os.path.join(os.getcwd(), '..'))))


from infer_subc.core.file_io import (read_czi_image,
                                     read_ome_image,
                                     import_inferred_organelle,
                                     export_inferred_organelle,
                                     list_image_files)

                                             
from infer_subc.core.img import *
from infer_subc.organelles import (get_nuclei, 
                                   non_linear_cellmask_transform,
                                   choose_max_label_cellmask_union_nucleus)


%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Get and load Image for processing

In [2]:
test_img_n = 2

data_root_path = Path(os.path.expanduser("~")) / "Desktop/test_astrocyte_images"
in_data_path = data_root_path / "deconvolved-astro-images"
im_type = ".tiff"

img_file_list = list_image_files(in_data_path,im_type)
test_img_name = img_file_list[test_img_n]

out_data_path = data_root_path / "20230919_test-segmentation"
if not Path.exists(out_data_path):
    Path.mkdir(out_data_path)
    print(f"making {out_data_path}")

In [3]:
img_data,meta_dict = read_czi_image(test_img_name)

channel_names = meta_dict['name']
img = meta_dict['metadata']['aicsimage']
scale = meta_dict['scale']
channel_axis = meta_dict['channel_axis']

---------
## infer ***cytoplasm*** from composite image

### summary of steps

➡️ INPUT
- create composite image from multiple organelle channels

PRE-PROCESSING
- rescale image intensities: 
    - min=0, max=1
- smooth image:
    - median filter (median size = user input)
    - gaussian filter (sigma = user input)
- log transform image
- apply scharr edge detection filter 
- combine log image + scharr edge filtered intensity

CORE PROCESSING
- apply MO thresholding method from the Allen Cell [aicssegmentation](https://github.com/AllenCell/aics-segmentation) package (threshold options = user input)

POST-PROCESSING
- fill holes (hole size = user input)
- remove small objects (object size = user input)

OUTPUT ➡️ 
- save single ***cellmask*** (cell, CM) as unsigned integer 8-bit tif files


### EXTRACTION prototype - cytoplasm

In [4]:
###################
# INPUT
###################
# Creating a composite image
weights = [4,3,1,1,6,6]
struct_img_raw = weighted_aggregate(img_data, *weights)
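
The composite step can be pictured as a weighted sum over the channel axis. The block below is a minimal sketch of that idea only, assuming `weighted_aggregate` multiplies each channel by its weight and sums the results; `toy_weighted_sum` and the toy array are hypothetical and are not part of `infer_subc`.

```python
import numpy as np

def toy_weighted_sum(img_czyx: np.ndarray, weights) -> np.ndarray:
    """Hypothetical helper: weighted sum of the channels of a CZYX array."""
    w = np.asarray(weights, dtype=float)
    if w.shape[0] != img_czyx.shape[0]:
        raise ValueError("need exactly one weight per channel")
    # contract the channel axis: sum over c of w[c] * img[c]
    return np.tensordot(w, img_czyx.astype(float), axes=(0, 0))

rng = np.random.default_rng(0)
toy_img = rng.random((6, 2, 4, 4))                 # 6 channels, tiny ZYX volume
toy_weights = [4, 3, 1, 1, 6, 6]                   # same weights as the cell above
toy_composite = toy_weighted_sum(toy_img, toy_weights)
print(toy_composite.shape)                         # (2, 4, 4)
```

Channels with larger weights dominate the composite, so the weights should favor the markers that best fill the cytoplasm in your system.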

### PRE-PROCESSING prototype - cytoplasm

> **NOTE**: No smoothing was done here because these test images were already pre-processed.

In [5]:
###################
# PRE_PROCESSING
###################
med_filter_size = 0
gaussian_smoothing_sigma = 0

structure_img_smooth = scale_and_smooth(struct_img_raw,
                                        median_size = med_filter_size, 
                                        gauss_sigma = gaussian_smoothing_sigma)
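
For readers who want to see what rescaling and smoothing look like outside the helper, here is a rough sketch using `scipy.ndimage`. It assumes the smoothing is applied within each Z slice and that a value of 0 skips the corresponding filter; `toy_scale_and_smooth` is a hypothetical stand-in, and `scale_and_smooth` itself may differ in its details.

```python
import numpy as np
from scipy import ndimage as ndi

def toy_scale_and_smooth(img_zyx, median_size=0, gauss_sigma=0.0):
    """Hypothetical stand-in: min-max rescale, then optional per-slice smoothing."""
    img = img_zyx.astype(float)
    span = img.max() - img.min()
    out = (img - img.min()) / span if span > 0 else np.zeros_like(img)
    if median_size > 0:
        # a window of 1 along Z keeps the median filter inside each slice
        out = ndi.median_filter(out, size=(1, median_size, median_size))
    if gauss_sigma > 0:
        # sigma of 0 along Z means no blurring across slices
        out = ndi.gaussian_filter(out, sigma=(0, gauss_sigma, gauss_sigma))
    return out

rng = np.random.default_rng(1)
toy_raw = rng.integers(0, 4096, size=(3, 16, 16))
toy_smooth = toy_scale_and_smooth(toy_raw, median_size=3, gauss_sigma=1.0)
toy_unsmoothed = toy_scale_and_smooth(toy_raw)     # only rescaled to the 0-1 range
```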

In [6]:
# log scale the image, apply the scharr edge detection filter to logged image, add the two images together
composite_cytomask = non_linear_cellmask_transform(structure_img_smooth)
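
The effect of adding an edge image to a log-scaled image can be checked on a toy 2D example. This is a sketch under assumptions: it uses `np.log1p` and min-max normalization of both terms, which may not match what `non_linear_cellmask_transform` does internally; `toy_minmax` and `toy_log_plus_edges` are hypothetical names.

```python
import numpy as np
from skimage.filters import scharr

def toy_minmax(img):
    """Rescale an array to the 0-1 range (returns zeros for a constant image)."""
    span = img.max() - img.min()
    return (img - img.min()) / span if span > 0 else np.zeros_like(img, dtype=float)

def toy_log_plus_edges(img_2d):
    # log1p compresses the dynamic range and avoids log(0)
    log_img = toy_minmax(np.log1p(img_2d.astype(float)))
    # Scharr edge magnitude of the log image, rescaled to 0-1
    edges = toy_minmax(scharr(log_img))
    return log_img + edges

toy_cell = np.ones((16, 16))
toy_cell[4:12, 4:12] = 100.0                       # bright "cell" on a dim background
toy_transformed = toy_log_plus_edges(toy_cell)
```

In the toy result, pixels along the border of the bright square score higher than its interior, which is the boost that helps the subsequent threshold follow the cell boundary.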

## CORE PROCESSING prototype

In [12]:
###################
# CORE_PROCESSING
###################
# threshold the composite image (after log/edge detection) using the MO filter function from aicssegmentation
# this applies a global threshold, then a local threshold, to produce a semantic segmentation
thresh_method = 'med'
cutoff_size =  200
thresh_adj = 0.001

bw_cyto = masked_object_thresh(composite_cytomask, 
                          global_method=thresh_method, 
                          cutoff_size=cutoff_size, 
                          local_adjust=thresh_adj)
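
The "global threshold, then local threshold" idea can be sketched as follows. This is a simplified illustration, not the aicssegmentation or `infer_subc` implementation: it assumes the `'med'` option means a cut at the image median and that the local step is an Otsu threshold computed per object and scaled by the adjustment factor. `toy_masked_object_threshold` is a hypothetical name, and any other steps the library performs are omitted.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu

def toy_masked_object_threshold(img, cutoff_size, local_adjust=1.0):
    # stage 1: permissive global cut at the image median
    coarse = img > np.median(img)
    labels, n_objs = ndi.label(coarse)
    refined = np.zeros(img.shape, dtype=bool)
    for obj_id in range(1, n_objs + 1):
        obj = labels == obj_id
        if obj.sum() < cutoff_size:
            continue                                # skip objects below the size cutoff
        # stage 2: Otsu threshold computed only from this object's pixels
        local_t = threshold_otsu(img[obj])
        refined |= obj & (img > local_t * local_adjust)
    return refined

toy = np.zeros((32, 32))
toy[6:22, 6:22] = 0.3                               # dim halo around the object
toy[10:18, 10:18] = 1.0                             # bright object (64 pixels)
toy[28:30, 28] = 0.5                                # 2-pixel speck, below the cutoff
toy_mask = toy_masked_object_threshold(toy, cutoff_size=50)
```

On this toy image the global cut keeps the halo and the object together, and the local threshold then trims the halo away. Lowering `local_adjust` makes the local step more permissive.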

In [13]:
# open a napari viewer (one must exist before layers can be added)
viewer = napari.Viewer()
viewer.add_image(bw_cyto)

<Image layer 'bw_cyto' at 0x1db47e96ad0>

## POST-PROCESSING prototype

In [14]:
###################
# POST_PROCESSING
###################
hole_min_width = 0
hole_max_width = 30

small_object_width = 50

fill_filter_method = "slice_by_slice"

cleaned_cyto = fill_and_filter_linear_size(bw_cyto, 
                                           hole_min=hole_min_width, 
                                           hole_max=hole_max_width, 
                                           min_size= small_object_width,
                                           method=fill_filter_method)
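
A rough picture of slice-by-slice cleanup, using only `scipy.ndimage`. The sketch assumes a linear width is converted to an area by squaring it, and it fills every enclosed hole regardless of size, whereas `fill_and_filter_linear_size` takes explicit minimum and maximum hole widths. `toy_fill_and_filter_2d` is a hypothetical helper.

```python
import numpy as np
from scipy import ndimage as ndi

def toy_fill_and_filter_2d(mask_zyx, min_width):
    """Hypothetical helper: per-slice hole filling, then drop objects under min_width**2 pixels."""
    out = np.zeros(mask_zyx.shape, dtype=bool)
    for z, plane in enumerate(mask_zyx.astype(bool)):
        filled = ndi.binary_fill_holes(plane)
        labels, _ = ndi.label(filled)
        sizes = np.bincount(labels.ravel())         # sizes[0] counts the background
        keep = sizes >= min_width ** 2
        keep[0] = False                             # background is never kept
        out[z] = keep[labels]
    return out

toy_mask = np.zeros((1, 20, 20), dtype=bool)
toy_mask[0, 2:12, 2:12] = True                      # 10 x 10 object ...
toy_mask[0, 5:9, 5:9] = False                       # ... with a 4 x 4 hole
toy_mask[0, 16:18, 16:18] = True                    # 2 x 2 speck
toy_cleaned = toy_fill_and_filter_2d(toy_mask, min_width=3)
```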

In [15]:
viewer.add_image(cleaned_cyto)

<Image layer 'cleaned_cyto' at 0x1db476d3ca0>

## POST POST-PROCESSING prototype

In [16]:
cytoplasm_multiple = cleaned_cyto.astype(bool)

# save this until the end when only one cytoplasm is saved as a file.
# cytoplasm_mask = label_bool_as_uint16(cleaned_cyto)

### Cytoplasm "CORE" processing function for plugin workflow:

In [7]:
def _segment_cytoplasm_area(in_img: np.ndarray, 
                           MO_method: str,
                           MO_cutoff: int,
                           MO_adjust: float,
                           holefill_min: int,
                           holefill_max: int,
                           obj_min_size: int,
                           fill_filter_method: str):
    """ 
    Function for segmenting the cytoplasmic area from a fluorescent image

    Parameters:
    ----------
    in_img: np.ndarray, 
        fluorescence image (single channel, ZYX array) of the cytoplasm to get segmented
    MO_method: str,
        masked object threshold method; options: 'med', 'tri', 'ave'
    MO_cutoff: int,
        object cutoff size for the MO threshold method
    MO_adjust: float,
        adjustment value for the MO threshold method
    holefill_min: int,
        smallest sized hole to fill in the final mask
    holefill_max: int,
        largest sized hole to fill in the final mask
    obj_min_size: int,
        size of the smallest object to be included in the mask; small objects are removed
    fill_filter_method: str
        fill holes and remove small objects in '3D' or 'slice_by_slice'


    """
    # create cytoplasm mask
    bw_cyto = masked_object_thresh(in_img, 
                            global_method=MO_method, 
                            cutoff_size=MO_cutoff, 
                            local_adjust=MO_adjust)
    
    # fill holes and filter small objects from the raw mask
    cleaned_cyto = fill_and_filter_linear_size(bw_cyto, 
                                            hole_min=holefill_min, 
                                            hole_max=holefill_max, 
                                            min_size= obj_min_size,
                                            method=fill_filter_method)
    
    # create a boolean mask
    cyto_semantic_seg = cleaned_cyto.astype(bool)

    return cyto_semantic_seg

In [9]:
thresh_method = 'med'
cutoff_size =  200
thresh_adj = 0.001
hole_min_width = 0
hole_max_width = 30

small_object_width = 50

fill_filter_method = "slice_by_slice"

test_cyto_masks = _segment_cytoplasm_area(composite_cytomask, MO_method='med', MO_cutoff=200, MO_adjust=0.001, holefill_max=30, holefill_min=0, obj_min_size=50, fill_filter_method='slice_by_slice')

# np.array_equal(test_cyto_masks, cytoplasm_multiple)

---------
## infer ***nucleus*** from composite image

### summary of steps

➡️ INPUT
- segmented cytoplasm object (from [01a_infer_cytoplasm_from-composite.ipynb](./01a_infer_cytoplasm_from-composite.ipynb))

PRE-PROCESSING
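- invert the cytoplasm mask
- binary opening (to help separate the nuclei from the background)

CORE-PROCESSING
- remove objects smaller than the maximum nucleus width (width = user input), leaving only the background
- logical **XOR** of the opened inverse and the background-only image, resulting in the nuclei and any small artifacts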


POST-PROCESSING
  - remove small objects (object size = user input)

OUTPUT ➡️ 
- save labeled ***nuclei*** (nucleus, NU) as unsigned integer 16-bit tif files

## EXTRACT prototype

In [None]:
###################
# INPUT
###################
# cytoplasm_mask = import_inferred_organelle("cyto-seg",meta_dict, out_data_path)

loaded  inferred 3D `cyto-seg`  from C:\Users\Shannon\Desktop\test_astrocyte_images\20230919_test-segmentation 


## PRE-PROCESSING prototype


In [12]:
###################
# PRE_PROCESSING
###################           
cytoplasm_inverse = 1 - test_cyto_masks

cytoplasm_inv_opened = binary_opening(cytoplasm_inverse, footprint=np.ones([3,3,3]))

## CORE PROCESSING prototype

In [13]:
###################
# CORE_PROCESSING
###################
max_nuc_width = 350

nuc_removed = fill_and_filter_linear_size(cytoplasm_inv_opened, 
                                          hole_max=0, 
                                          hole_min=0, 
                                          min_size=max_nuc_width, 
                                          method='3D')

nuc_objs = np.logical_xor(cytoplasm_inv_opened, nuc_removed)
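
Why the XOR isolates the nuclei is easiest to see on a toy 2D mask: the size filter removes everything nucleus-sized from the inverted mask, so XOR-ing the filtered and unfiltered images returns exactly what was removed. The sketch below uses a pixel count rather than a linear width, and `toy_remove_small` is a hypothetical helper.

```python
import numpy as np
from scipy import ndimage as ndi

def toy_remove_small(mask, min_pixels):
    """Hypothetical helper: keep only connected objects with at least min_pixels pixels."""
    labels, _ = ndi.label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_pixels
    keep[0] = False
    return keep[labels]

toy_cyto = np.zeros((40, 40), dtype=bool)
toy_cyto[5:35, 5:35] = True                         # "cytoplasm"
toy_cyto[15:25, 15:25] = False                      # unlabeled "nucleus" (100 pixels)

toy_inverse = ~toy_cyto                             # background plus the nucleus hole
toy_background = toy_remove_small(toy_inverse, min_pixels=200)  # the hole is dropped
toy_nucleus = np.logical_xor(toy_inverse, toy_background)       # what was dropped
```

Anything else smaller than the size limit (for example gaps between organelles) also survives the XOR, which is why a second small-object filter follows.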

## POST PROCESSING prototype

In [14]:
###################
# POST_PROCESSING
###################
hole_max = 0
hole_min = 0
min_size = 10

nuc_cleaned = fill_and_filter_linear_size(nuc_objs, 
                                          hole_max=hole_max, 
                                          hole_min=hole_min, 
                                          min_size=min_size, 
                                          method='3D')

## LABELING prototype

In [15]:
###################
# LABELING
###################
# create instance segmentation based on connectivity
nuc_labels = label(nuc_cleaned).astype(np.uint16)

### Nuclei "CORE" processing step for plugin workflow:

In [17]:
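def _segment_nuclei_seeds(cyto_seg: np.ndarray,
                          max_nuclei_width: int,
                          filter_small_objs: int):
    """ 
    Function for creating nuclei seeds from the inverse of the cytoplasm segmentation

    Parameters:
    ----------
    cyto_seg: np.ndarray,
        semantic (binary) segmentation of the cytoplasm
    max_nuclei_width: int,
        width of the largest nucleus; larger objects in the inverted mask (e.g., the background) are excluded
    filter_small_objs: int,
        size of the smallest object to keep; smaller debris is removed

    Output
    ----------
    labeled nuclei (np.uint16) with one ID per connected object
    """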
    # create the inverse of the cytoplasm and increase likelihood for object separation with binary opening
    cytoplasm_inverse = 1 - cyto_seg
    cytoplasm_inv_opened = binary_opening(cytoplasm_inverse, footprint=np.ones([3,3,3]))

    # isolate the nuclei objects that will be used as seeds for watershed
    # these aren't exactly the inverse of the cytoplasm because of the binary opening
    nuc_removed = fill_and_filter_linear_size(cytoplasm_inv_opened, 
                                            hole_max=0, 
                                            hole_min=0, 
                                            min_size=max_nuclei_width, 
                                            method='3D')

    nuc_objs = np.logical_xor(cytoplasm_inv_opened, nuc_removed)

    # remove any small debris left over that isn't the correct size for nuclei
    nuc_cleaned = fill_and_filter_linear_size(nuc_objs, 
                                            hole_max=0, 
                                            hole_min=0, 
                                            min_size=filter_small_objs, 
                                            method='3D')
    

    return label(nuc_cleaned).astype(np.uint16)

In [18]:
test_nuc_labels = _segment_nuclei_seeds(test_cyto_masks, max_nuclei_width=350, filter_small_objs=10)

np.array_equal(nuc_labels, test_nuc_labels)

True

---------
## infer ***cellmask*** from cytoplasm mask

### summary of steps

➡️ INPUT
- segmented cytoplasm object (from [01a_infer_cytoplasm_from-composite.ipynb](./01a_infer_cytoplasm_from-composite.ipynb))
- segmented nucleus object (from [02a_infer_nucleus_from-cytoplasm.ipynb](./02a_infer_nucleus_from-cytoplasm.ipynb))

PRE-PROCESSING

CORE-PROCESSING
- logical **OR** of the nucleus and cytoplasm

POST-PROCESSING
- fill small holes (hole size = user input)

OUTPUT ➡️ 
- save labeled ***cellmask*** (cell, CM) as unsigned integer 16-bit tif files

## EXTRACT prototype

## PRE-PROCESSING prototype

No preprocessing steps are required.

## CORE PROCESSING prototype

In [19]:
###################
# CORE_PROCESSING
###################
cells = np.logical_or(test_nuc_labels, test_cyto_masks)

cell_multiple = fill_and_filter_linear_size(cells, 
                                            hole_min=0,
                                            hole_max=20,
                                            min_size=0,
                                            method='3D')

### Cellmask "CORE" function for plugin workflow:

In [26]:
def _combine_cytoplasm_and_nuclei(cyto_seg: np.ndarray,
                                  nuc_seg: np.ndarray,
                                  fillhole_max: int):
    """
    Function to combine the cytoplasm and nuclei segmentations to produce the entire cell mask.

    Parameters:
    ----------
    cyto_seg: np.ndarray,
        image containing the cytoplasm segmentation
    nuc_seg: np.ndarray,
        image containing the nuclei segmentation
    fillhole_max: int
        size of the gaps between the nuclei and cytoplasm (usually small)
    """ 
    
    cells = np.logical_or(cyto_seg.astype(bool), nuc_seg.astype(bool))

    cell_multiple = fill_and_filter_linear_size(cells, 
                                                hole_min=0,
                                                hole_max=fillhole_max,
                                                min_size=0,
                                                method='3D')
    
    cell_area = cell_multiple.astype(bool)

    return cell_area

In [28]:
test_cell_area = _combine_cytoplasm_and_nuclei(test_cyto_masks, test_nuc_labels, 20)

np.array_equal(cell_multiple, test_cell_area)

True

## POST PROCESSING prototype

In [20]:
###################
# POST_PROCESSING
###################
cell_labels = masked_inverted_watershed(test_cell_area, markers=nuc_labels, mask=test_cell_area, method='3D')
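
To show how nuclei seeds split a merged cell area into instances, here is a toy 2D sketch that calls `skimage.segmentation.watershed` directly. It assumes the helper floods an inverted copy of the input inside the mask, so that a binary input gives a flat landscape and the area is divided roughly by distance to each seed. The internals of `masked_inverted_watershed` may differ.

```python
import numpy as np
from skimage.segmentation import watershed

toy_cell_area = np.zeros((20, 40), dtype=bool)
toy_cell_area[5:15, 2:38] = True                    # two touching "cells" merged into one blob

toy_seeds = np.zeros(toy_cell_area.shape, dtype=np.int32)
toy_seeds[8:12, 8:12] = 1                           # nucleus of the left cell
toy_seeds[8:12, 28:32] = 2                          # nucleus of the right cell

# inverted binary image: flat (all zeros) inside the mask
toy_landscape = 1.0 - toy_cell_area.astype(float)
toy_cell_labels = watershed(toy_landscape, markers=toy_seeds, mask=toy_cell_area)
```

Each seed claims the part of the mask nearest to it, so every cell needs exactly one seed. A cell without a detected nucleus is absorbed by its neighbors, and a fragmented nucleus splits its cell.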

In [71]:
# # determine the largest cell
# cell_IDs = np.unique(cell_labels)[1:]

# dict = {}
# for obj in cell_IDs:
#     pxlcnt = np.sum(cell_labels==obj)
#     dict[obj] = pxlcnt

# largest_ID = max(dict, key=dict.get)
# largest_cell = cell_labels == largest_ID

# largest_ID, dict

(1, {1: 1801077, 2: 684129, 3: 486997, 4: 1122844})

In [21]:
# determine the brightest cell
target_labels = None
labels_in = cell_labels

if target_labels is None:
    all_labels = np.unique(cell_labels)[1:]
else:
    all_labels = np.unique(target_labels)[1:]

all_labels

array([1, 2, 3, 4], dtype=uint16)

In [22]:
# create a composite from each intensity channel after they have been min-max normalized
norm_channels = [(min_max_intensity_normalization(img_data[c])) for c in range(len(img_data))]
normed_signal = np.stack(norm_channels, axis=0)

normed_composite = normed_signal.sum(axis=0)

np.max(normed_signal), normed_signal.shape, normed_composite.shape

(1.0, (6, 20, 1688, 1688), (20, 1688, 1688))

In [23]:
# find the cell mask that has the highest intensity
total_signal = [normed_composite[labels_in == label].sum() for label in all_labels]

keep_label = all_labels[np.argmax(total_signal)]

good_cell = cell_labels == keep_label
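
The same selection can be written without a Python loop by letting `np.bincount` sum the intensity per label. This is an alternative sketch on toy data, not the code used by the workflow; `toy_brightest_label` is a hypothetical helper.

```python
import numpy as np

def toy_brightest_label(intensity, labels):
    """Hypothetical helper: return the non-zero label with the largest summed intensity."""
    totals = np.bincount(labels.ravel(), weights=intensity.ravel())
    totals[0] = -np.inf                             # never select the background
    return int(np.argmax(totals))

toy_labels = np.zeros((4, 6), dtype=np.uint16)
toy_labels[:, :2] = 1                               # cell 1, 8 pixels
toy_labels[:, 4:] = 2                               # cell 2, 8 pixels
toy_signal = np.ones((4, 6))
toy_signal[:, 4:] = 3.0                             # cell 2 is brighter

toy_best = toy_brightest_label(toy_signal, toy_labels)
toy_good_cell = toy_labels == toy_best
```

Because the score is a summed intensity, it favors large cells as well as bright ones; a mean intensity per label would remove the size dependence if that is not wanted.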

In [None]:
def _select_highest_intensity_cell(raw_image: np.ndarray,
                                   cell_seg: np.ndarray,
                                   nuc_seg: np.ndarray,
                                   labels_to_consider: Union[list, None] = None):
    """ 
    Create an instance segmentation of the cell area using a watershed operation based on nuclei seeds.
    Then, select the cell with the highest combined organelle intensity.

    Parameters:
    ----------
    raw_image: np.ndarray,
        gray scale 3D multi-channel numpy array (CZYX)
    cell_seg: np.ndarray,
        binary cell segmentation with multiple cells in the FOV
    nuc_seg: np.ndarray,
        labeled nuclei segmentation with each nuclei having a different ID number (e.g., the result of the skimage label() function)
    labels_to_consider: Union[list, None]
        a list of labels that should be considered when determining the highest intensity. Default is None which utilizes all possible labels in the cell image
        
    Output
    ----------
    good_cell: np.ndarray  
        a binary image of the single cell with the highest total fluorescence intensity
    """
    # instance segmentation of cell area with watershed function
    cell_labels = masked_inverted_watershed(cell_seg, markers=nuc_seg, mask=cell_seg, method='3D')

    # create composite of all fluorescence channels after min-max normalization
    norm_channels = [(min_max_intensity_normalization(raw_image[c])) for c in range(len(raw_image))]
    normed_signal = np.stack(norm_channels, axis=0)
    normed_composite = normed_signal.sum(axis=0)

    # list of cell IDs to measure intensity of
    if labels_to_consider is None:
        all_labels = np.unique(cell_labels)[1:]
    else:
        all_labels = np.unique(labels_to_consider)[1:]

    # measure total intensity in each cell from the ID list
    total_signal = [normed_composite[cell_labels == label].sum() for label in all_labels]

    # select the cell with the highest total intensity
    keep_label = all_labels[np.argmax(total_signal)]
    good_cell = cell_labels == keep_label

    return good_cell

In [30]:
test_good_cell = _select_highest_intensity_cell(img_data, test_cell_area, test_nuc_labels)

np.array_equal(good_cell, test_good_cell)

True

## LABELING prototype

In [35]:
good_cyto = apply_mask(test_cyto_masks, good_cell).astype(bool)

In [51]:
good_cyto_inverse = 1 - good_cyto

nuc_single = clear_border(good_cyto_inverse)

good_nuc = fill_and_filter_linear_size(nuc_single,
                                       hole_min=0,
                                       hole_max=0,
                                       min_size=10,
                                       method='3D').astype(bool)

In [32]:
# viewer = napari.Viewer()



In [43]:
# nuc_masked = apply_mask(nuc_labels, test_good_cell)

# nuc_seed = binary_erosion(nuc_masked.astype(bool), footprint=np.ones([3,3,3]))

# test_good_nuc = watershed(good_cyto, nuc_seed)

In [44]:
# viewer.add_image(good_cyto)
# viewer.add_image(nuc_seed)
# viewer.add_image(test_good_nuc)

<Image layer 'test_good_nuc [1]' at 0x260951a9e10>

#### Create final step - nucleus and cytoplasm instance segmentation using good cell mask

In [50]:
def _mask_cytoplasm_nuclei(cellmask: np.ndarray,
                           cyto_seg: np.ndarray,
                           small_obj_size: int):
    """ 
    Mask the cytoplasm with the cell mask to isolate the cytoplasmic area of interest.
    Create a single nucleus segmentation from the inverse of the cytoplasm (no binary opening).

    Parameters:
    ----------
    cellmask: 
        binary segmentation of a single cell
    cyto_seg:
        semantic segmentation of cytoplasm from multiple cells in an image
    small_obj_size:
        size of small objects to be removed from the final nucleus segmentation image
    """

    good_cyto = apply_mask(cyto_seg, cellmask).astype(bool)

    good_cyto_inverse = 1 - good_cyto

    nuc_single = clear_border(good_cyto_inverse)

    good_nuc = fill_and_filter_linear_size(nuc_single,
                                        hole_min=0,
                                        hole_max=0, 
                                        min_size=small_obj_size,
                                        method='3D')
    
    # return the two masks separately; they are stacked with the cellmask in the LABELING cell
    return good_cyto, good_nuc

In [52]:
test_good_cyto, test_good_nuc = _mask_cytoplasm_nuclei(good_cell, test_cyto_masks, 10)

np.array_equal(test_good_cyto, good_cyto), np.array_equal(test_good_nuc, good_nuc)

(True, True)

In [53]:
###################
# LABELING
###################
stack = stack_masks(nuc_mask=test_good_nuc, cellmask=test_good_cell, cyto_mask=test_good_cyto)

In [99]:
out_file_n = export_inferred_organelle(stack, "masks", meta_dict, out_data_path)

saved file: 04282022_astro_arsenite50uM_4_Linear unmixing_0_cmle.ome-masks
