# SCohenLab Volumetric Image Processing notebook (Simplified MCZ)

--------------
# PIPELINE OVERVIEW
## 1. GOAL SETTING

### GOAL:  Infer sub-cellular components in order to understand interactome 

To measure the shape, position, size, and interactions of eight organelles/cellular components (Nuclei (NU), Lysosomes (LS), Mitochondria (MT), Golgi (GL), Peroxisomes (PO), Endoplasmic Reticulum (ER), Lipid Droplets (LD), and SOMA) during differentiation of iPSCs, in order to understand the interactome and its spatiotemporal coordination.

### summary of _OBJECTIVES_
- Infer subcellular objects:
  -  #### #1. [Infer NUCLEI](#image-processing-objective-1-infer-nucleii) - NU, ch 0
  -  #### #2. [Infer SOMA](#image-processing-objective-2-infer-soma) (extent of single brightest cell) - SO, composite including ch 1, ch 4, ch 5, and ch 7 "residuals"
  -  #### #3. [Infer CYTOSOL](#image-processing-objective-3-infer-cytosol) - CY
  -  #### #4. [Infer LYSOSOMES](#image-processing-objective-4-infer-lysosomes) - LS, ch 1
  -  #### #5. [Infer MITOCHONDRIA](#image-processing-objective-5-infer-mitochondria) - MT, ch 2
  -  #### #6. [Infer GOLGI complex](#image-processing-objective-6-infer-golgi-complex) - GL, ch 3
  -  #### #7. [Infer PEROXISOMES](#image-processing-objective-7-infer-peroxisomes) - PO, ch 4
  -  #### #8. [Infer ENDOPLASMIC RETICULUM](#image-processing-objective-8-infer-endoplasmic-reticulum) - ER, ch 5
  -  #### #9. [Infer LIPID BODIES](#image-processing-objective-9-infer-lipid-bodies-droplet) - LD, ch 6





## 2. DATA CREATION
> METHODS
> iPSC lines were prepared and visualized on Zeiss microscopes. 32-channel multispectral images were collected.  Linear unmixing in ZEN Blue software with target emission spectra yields 8-channel image outputs.  Channels correspond to: Nuclei (NU), Lysosomes (LS), Mitochondria (MT), Golgi (GL), Peroxisomes (PO), Endoplasmic Reticulum (ER), Lipid Droplet (LD), and a “residual” signal.

> Meta-DATA
> - Microscope settings
> - OME scheme
> - Experimenter observations
> - Sample, bio-replicate, image numbers, condition values, etc.
> - Dates
> - File structure, naming conventions
> - etc.





## 3. IMAGE PROCESSING
### INFERENCE OF SUB-CELLULAR OBJECTS
The imported images have already been pre-processed to transform the 32-channel spectral measurements into "linearly unmixed" images which estimate independently labeled sub-cellular components.  These 7 channels (plus a residual "non-linear" signal) will be used to infer the shapes and extents of these sub-cellular components.
We will perform computational image analysis on the images (in 2D and 3D) to _segment_ the components of interest for measurement.  In other procedures we can use these segmentations as "ground truth" labels to train machine learning models to automatically perform the inference of these objects.
Each of the 9 objectives stated above is achieved by pseudo-independent processing of the imported multi-channel image, i.e. inferring: NUCLEI, SOMA, CYTOSOL, LYSOSOMES, MITOCHONDRIA, GOLGI COMPLEX, PEROXISOMES, ENDOPLASMIC RETICULUM, and LIPID BODIES.
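
To make the idea of "linear unmixing" concrete, here is a minimal sketch on synthetic data. It is not the ZEN Blue implementation, whose details are not described here; it assumes each measured spectrum is a non-negative linear mixture of known reference emission spectra and solves for the mixture weights by ordinary least squares. The function name `unmix_linear` and all array shapes are illustrative assumptions.

```python
import numpy as np


def unmix_linear(spectral_img, reference_spectra):
    """Least-squares linear unmixing (illustrative sketch, not ZEN Blue's method).

    spectral_img      : array of shape (n_bands, ...), measured intensities
    reference_spectra : array of shape (n_bands, n_components), one spectrum per column
    returns           : array of shape (n_components, ...), estimated abundances
    """
    n_bands = spectral_img.shape[0]
    n_components = reference_spectra.shape[1]
    # flatten all spatial axes so every pixel becomes one column of measurements
    flat = spectral_img.reshape(n_bands, -1)
    abundances, *_ = np.linalg.lstsq(reference_spectra, flat, rcond=None)
    # negative abundances are not physical, so clip them to zero
    abundances = np.clip(abundances, 0, None)
    return abundances.reshape((n_components,) + spectral_img.shape[1:])


# synthetic example: 32 spectral bands, 8 components, a small ZYX volume
rng = np.random.default_rng(0)
spectra = rng.random((32, 8))                # hypothetical reference spectra
true_abund = rng.random((8, 4, 16, 16))      # hypothetical ground-truth abundances
measured = np.tensordot(spectra, true_abund, axes=([1], [0]))  # shape (32, 4, 16, 16)

unmixed = unmix_linear(measured, spectra)
print(unmixed.shape)
```

On noise-free synthetic data the abundances are recovered essentially exactly; on real data the fit leaves a residual, which is presumably what the eighth "residual" channel captures.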

### General flow for inferring objects via segmentation
- Pre-processing
- Core-processing (thresholding)
- Post-processing 
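
The three stages can be sketched with `scipy` and `skimage` alone. This is a simplified stand-in, not the `aicssegmentation` workflow: it assumes min-max normalization plus Gaussian smoothing for pre-processing, a global Otsu threshold for core processing (in place of the `MO` threshold imported later in this notebook), and a size filter for post-processing. The function name `segment_channel` and its parameter defaults are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import threshold_otsu
from skimage.measure import label


def segment_channel(raw, sigma=1.34, min_size=10):
    """Pre-process, threshold, and post-process one ZYX channel (sketch)."""
    # pre-processing: min-max normalization followed by Gaussian smoothing
    img = raw.astype(np.float32)
    img = (img - img.min()) / max(float(img.max() - img.min()), 1e-12)
    smooth = gaussian_filter(img, sigma=sigma, mode="nearest")

    # core processing: global Otsu threshold (assumption: a simple stand-in)
    mask = smooth > threshold_otsu(smooth)

    # post-processing: drop connected components smaller than min_size voxels
    labels = label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_size
    keep[0] = False  # label 0 is background
    return label(keep[labels])


# synthetic ZYX volume with two bright blocks on a dark background
vol = np.zeros((8, 64, 64), dtype=np.uint16)
vol[2:6, 10:20, 10:20] = 1000
vol[2:6, 40:52, 40:52] = 1000

labels = segment_channel(vol)
print(labels.shape, labels.max())
```

Each organelle will need its own choice of filters and thresholds; the point here is only the pre / core / post structure that every objective shares.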

### QC




## 4. QUANTIFICATION

SUBCELLULAR COMPONENT METRICS
-  extent 
-  size
-  shape
-  position
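
As a rough illustration of how these metrics could be read off a label image, the sketch below uses `skimage.measure.regionprops`. The helper name `measure_objects`, the dictionary keys, and the choice of bounding box for "extent" and fill fraction as a crude shape descriptor are assumptions, not the pipeline's final definitions.

```python
import numpy as np
from skimage.measure import label, regionprops


def measure_objects(label_img, scale=(1.0, 1.0, 1.0)):
    """Collect simple per-object metrics from a ZYX label image (sketch)."""
    voxel_volume = float(np.prod(scale))
    rows = []
    for region in regionprops(label_img):
        rows.append(
            {
                "label": region.label,
                # size: voxel count and physical volume
                "size_voxels": int(region.area),
                "size_physical": float(region.area) * voxel_volume,
                # position: centroid in physical units
                "position": tuple(c * s for c, s in zip(region.centroid, scale)),
                # extent: bounding box as (min_z, min_y, min_x, max_z, max_y, max_x)
                "bbox": region.bbox,
                # shape (crude): fraction of the bounding box filled by the object
                "fill_fraction": float(region.extent),
            }
        )
    return rows


# a single 2 x 3 x 4 block in a small volume
mask = np.zeros((5, 8, 10), dtype=bool)
mask[1:3, 2:5, 3:7] = True

metrics = measure_objects(label(mask), scale=(2.0, 0.5, 0.5))
print(metrics[0])
```

Interaction metrics (for example overlap or contact between two organelle masks) would need pairs of label images and are not covered by this sketch.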



### NOTE: PIPELINE TOOL AND DESIGN CHOICES?
We want to leverage the Allen Cell & Structure Segmenter.  It has been wrapped as a [napari-plugin](https://www.napari-hub.org/plugins/napari-allencell-segmenter), but for the workflow we are proving out here we will want to call the `aicssegmentation` [package](https://github.com/AllenCell/aics-segmentation) directly.

#### The Allen Cell & Structure Segmenter
The Allen Cell & Structure Segmenter is a Python-based open source toolkit developed at the Allen Institute for Cell Science for 3D segmentation of intracellular structures in fluorescence microscope images. This toolkit brings together classic image segmentation and iterative deep learning workflows first to generate initial high-quality 3D intracellular structure segmentations and then to easily curate these results to generate the ground truths for building robust and accurate deep learning models. The toolkit takes advantage of the high replicate 3D live cell image data collected at the Allen Institute for Cell Science of over 30 endogenous fluorescently tagged human induced pluripotent stem cell (hiPSC) lines. Each cell line represents a different intracellular structure with one or more distinct localization patterns within undifferentiated hiPS cells and hiPSC-derived cardiomyocytes.

More details about Segmenter can be found at https://allencell.org/segmenter
In order to leverage the A
# IMPORTS

Import all necessary packages.

We are using `napari` for visualization, and `scipy` `ndimage` and `skimage` for analyzing the image files.  The underlying data format is the `numpy` `ndarray`, supplemented by tools from the Allen Institute for Cell Science.


In [1]:
# top level imports
from pathlib import Path
import os, sys
from collections import defaultdict

import numpy as np
import scipy

# # function for core algorithm
from scipy import ndimage as ndi
import aicssegmentation
from aicssegmentation.core.seg_dot import dot_3d_wrapper, dot_slice_by_slice, dot_2d_slice_by_slice_wrapper, dot_3d
from aicssegmentation.core.pre_processing_utils import ( intensity_normalization, 
                                                         image_smoothing_gaussian_3d,  
                                                         image_smoothing_gaussian_slice_by_slice )
from aicssegmentation.core.utils import topology_preserving_thinning
from aicssegmentation.core.MO_threshold import MO
from aicssegmentation.core.utils import hole_filling
from aicssegmentation.core.vessel import filament_2d_wrapper, vesselnessSliceBySlice
from aicssegmentation.core.output_utils import   save_segmentation,  generate_segmentation_contour
                                                 
from skimage import filters
from skimage import morphology
from skimage.segmentation import watershed
from skimage.feature import peak_local_max
from skimage.morphology import remove_small_objects, binary_closing, ball , dilation   # function for post-processing (size filter)
from skimage.measure import label

# # package for io 
from aicsimageio import AICSImage

import napari

### import local python functions in ../infer_subc
sys.path.append(os.path.abspath((os.path.join(os.getcwd(), '..'))))

from infer_subc.utils.file_io import read_input_image, list_image_files

%load_ext autoreload
%autoreload 2

viewer = None
# # from infer_subc.bioim.object import BioImObject
# # from infer_subc.transforms.transform import BioImTransform
# # from infer_subc.transforms.pipeline import BioImPipeline
# from infer_subc.utils.file_io import read_input_image, get_raw_meta_data

In [14]:

from aicssegmentation.workflow import WorkflowEngine, WorkflowStep, WorkflowDefinition


# from dataclasses import dataclass
# from typing import List
# from aicssegmentation.workflow import Workflow


# @dataclass
# class Channel:
#     index: int
#     name: str = None

#     @property
#     def display_name(self):
#         if self.name is None or self.name.strip().isspace():
#             return f"Channel {self.index}"

#         return f"Ch{self.index}.  {self.name}"

# @dataclass
# class SegmenterModel:
#     """
#     Main Segmenter plugin model
#     """

#     layers: List[str] = None
#     selected_layer: Layer = None
#     channels: List[str] = None
#     selected_channel: Channel = None
#     workflows: List[str] = None
#     active_workflow: Workflow = None

#     def reset(self):
#         """
#         Reset model state
#         """
#         self.layers = None
#         self.selected_layer = None
#         self.channels = None
#         self.selected_channel = None
#         self.workflows = None
#         self.active_workflow = None

[autoreload of infer_subc.utils.file_io failed: Traceback (most recent call last):
  File "/opt/anaconda3/envs/napariNEW/lib/python3.9/site-packages/IPython/extensions/autoreload.py", line 257, in check
    superreload(m, reload, self.old_objects)
  File "/opt/anaconda3/envs/napariNEW/lib/python3.9/site-packages/IPython/extensions/autoreload.py", line 455, in superreload
    module = reload(module)
  File "/opt/anaconda3/envs/napariNEW/lib/python3.9/importlib/__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 613, in _exec
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/Users/ahenrie/Projects/Imaging/infer-subc/infer_subc/utils/file_io.py", line 15, in <module>
    class AICSImageReaderWrap:
  File "/Users/ahenrie/Projects/Imaging/infer-subc/infer_subc/utils/file_io.py", line 23, in AICSImageReaderWrap
    meta:

In [15]:
#sys.path.append(os.path.abspath((os.path.join(os.getcwd(), '..'))))
# viewer = napari.Viewer()


# Get and load Image for processing
For testing purposes... TODO: build a nice wrapper for this.



Read the data into memory from the `.czi` files.  (Note: there is also a 2D slice `.tif` file, read for later comparison.)  We will also collect metadata.

> the `data_path` variable should have the full path to the set of images wrapped in a `Path()`.   Below the path is built in 3 stages
> 1. my user directory "~" plus
> 2. general imaging data directory "Projects/Imaging/data" plus
> 3. "raw" where the linearly unmixed zstacks are

The image "type" is also set by `im_type = ".czi"`
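
For readers without the `infer_subc` package at hand, the file listing step can be approximated with `pathlib` alone. The helper below, `list_files_by_type`, is a hypothetical stand-in and may differ from the actual `list_image_files` in `infer_subc.utils.file_io`; it assumes a flat folder and an exact, case-sensitive suffix match. The demo uses a temporary directory with empty placeholder files rather than real images.

```python
import tempfile
from pathlib import Path


def list_files_by_type(img_folder, f_type):
    """Return the sorted files in img_folder whose suffix equals f_type (sketch)."""
    folder = Path(img_folder)
    return sorted(p for p in folder.iterdir() if p.is_file() and p.suffix == f_type)


# demo on a throwaway directory containing empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.czi", "a.czi", "c.tif"):
        (Path(tmp) / name).touch()
    found_names = [p.name for p in list_files_by_type(tmp, ".czi")]

print(found_names)  # → ['a.czi', 'b.czi']
```

Sorting the result keeps indexing such as `img_file_list[5]` reproducible across machines, since directory listing order is not guaranteed.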


In [16]:
# build the datapath
# all the imaging data goes here.
data_root_path = Path(os.path.expanduser("~")) / "Projects/Imaging/data"

# linearly unmixed ".czi" files are here
data_path = data_root_path / "raw"
im_type = ".czi"

# deprecate this
# list_img_files = lambda img_folder,f_type: [os.path.join(img_folder,f_name) for f_name in os.listdir(img_folder) if f_name.endswith(f_type)]
img_file_list = list_image_files(data_path,im_type)

test_img_name = img_file_list[5]
test_img_name

In [20]:
bioim_image = read_input_image(test_img_name)

img_data = bioim_image.image
raw_meta_data = bioim_image.raw_meta
ome_types = []
meta_dict = bioim_image.meta

# get some top-level info about the RAW data
channel_names = meta_dict['name']
img = meta_dict['metadata']['aicsimage']
scale = meta_dict['scale']
channel_axis = meta_dict['channel_axis']


---------------------------
Please proceed to 01_infer_nuclei.ipynb


Everything below just tests the speed of different libraries.

In [29]:
max_img = img_data.max(axis=1)
max_img.shape

(8, 768, 768)

In [31]:
viewer = napari.Viewer()

for i,ch in enumerate(channel_names):
    viewer.add_image(max_img[i,:,:],
                                name = ch)

In [28]:
i,ch

(0, '0 :: None :: Nuclei_Jan22')

In [25]:
from scipy.ndimage import gaussian_filter, median_filter

sigma = 1.34
truncate_range = 3.0
f = lambda x: gaussian_filter(x,sigma=sigma, mode="nearest", truncate=truncate_range)

size = 3
f2 = lambda x: median_filter(x,size=size)


# %timeit np.vectorize(f, signature='(n,m)->(n,m)' )(dd)
# 143 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# %timeit np.vectorize(f2, signature='(n,m)->(n,m)' )(dd)
# 378 ms ± 16.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


378 ms ± 16.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [26]:
def _slice_by_slice(struct_img, sigma=1.34, truncate_range=3.0):
    """
    wrapper for applying 2D Gaussian smoothing slice by slice on a 3D image
    """
    structure_img_smooth = np.zeros_like(struct_img)

    for zz in range(struct_img.shape[0]):
        structure_img_smooth[zz, :, :] = gaussian_filter(
            struct_img[zz, :, :], sigma=sigma, mode="nearest", truncate=truncate_range
        )

    return structure_img_smooth    

def _med_slice_by_slice(struct_img, size=3):
    """
    wrapper for applying 2D median filtering slice by slice on a 3D image
    """
    structure_img_smooth = np.zeros_like(struct_img)

    for zz in range(struct_img.shape[0]):
        structure_img_smooth[zz, :, :] = median_filter(
            struct_img[zz, :, :], size=size
        )
    return structure_img_smooth     
    # this might be faster:  scipy.signal.medfilt2d()

# %timeit _slice_by_slice(dd, sigma=1.34, truncate_range=3.0)
# 145 ms ± 4.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
#median
# %timeit _med_slice_by_slice(dd, size=3)
# 354 ms ± 7.66 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


354 ms ± 7.66 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [27]:

v1 = np.vectorize(f2, signature='(n,m)->(n,m)' )(dd)
v2 = _med_slice_by_slice(dd, size=3)

(v1!=v2).any()

False

In [24]:
img_data.shape[-3:]
dd = im.get_image_dask_data()
dd

| | Array | Chunk |
|---|---|---|
| Bytes | 270.00 MiB | 67.50 MiB |
| Shape | (8, 15, 768, 768) | (8, 15, 384, 384) |
| Count | 1 Graph Layer | 4 Chunks |
| Type | float32 | numpy.ndarray |


Visualize the raw data file with [napari](https://napari.org)

In [26]:

viewer = napari.view_image(
    img_data,
    channel_axis=0,
    name=channel_names,
    scale=scale
)
viewer.scale_bar.visible = True


In [None]:
viewer.dims.ndisplay = 3
viewer.camera.angles = (-30, 25, 120)

In [None]:
##  need to save: 

# output_path, list_of_files
viewer.dims.ndisplay = 2

In [59]:
image = img_data[2,:,:,:]


img_ = np.asanyarray(image)
img_

array([[[1045,    0,  239, ...,  222,    0,  157],
        [1279,    0,    0, ...,    0,  188,    0],
        [  52,    0,    0, ...,  289,  444,  113],
        ...,
        [   0,   57,  366, ...,   72,   85,  340],
        [ 218,   45,  415, ...,    0,    0,    0],
        [   0,  382,    0, ...,  212,  283,   17]],

       [[ 593,    0,    0, ...,  443,    0,   36],
        [ 417, 2778,  452, ...,  243,    0,  199],
        [1170,    0,    0, ...,  420,   60,    0],
        ...,
        [   0,  123,  279, ...,  439,    6,   74],
        [  90,  166,  139, ...,  372,    0,   57],
        [ 330,    0,  117, ...,  306,    0,    0]],

       [[1092,    0,    0, ...,  107,    0,  126],
        [1765, 1469,  603, ...,  244,    0,  499],
        [   0,    0,    0, ...,   55,  295,    0],
        ...,
        [  65,  178,    0, ...,   96,  308,    0],
        [ 177,    0,    0, ...,    0,   71,  148],
        [ 141,    0,  108, ...,   11,    0,  109]],

       ...,

       [[1079,  432,  27

In [25]:
im_interpolated = HistogramEqualization().F(im_cls.image)

In [27]:
viewer.add_image(
    im_interpolated,
    scale = scale
)

<Image layer 'im_interpolated' at 0x17686b640>

In [29]:
MedianBlur().apply(im_cls)

error: OpenCV(4.6.0) /Users/runner/work/opencv-python/opencv-python/opencv/modules/imgproc/src/median_blur.dispatch.cpp:285: error: (-215:Assertion failed) (ksize % 2 == 1) && (_src0.dims() <= 2 ) in function 'medianBlur'
