# Image Processing Pipeline Setup

SCohenLab 3D Image Processing notebook 00.1 - Pipeline Setup


## OVERVIEW

The first thing we need to be able to do is access the data files and interact with them.





## IMPORTS

The convention with notebooks (and Python in general) is to import the necessary packages first.

We are using `napari` for visualization, and `scipy.ndimage` and `skimage` for analyzing the image files. The underlying data format is the `numpy` `ndarray`, and we also use tools from the Allen Institute for Cell Science's `aicssegmentation` package.

### NOTES: 
There are a few conventions used here worth explaining. Note the `imports.py` and `constants.py` files at the base level of the `infer_subc` module. These provide shortcuts for keeping track of imports and constants (cf. the bottom of the imports below). A second thing to note is the use of the "magics" ([[ link to magics info %%]]) `%load_ext autoreload` and `%autoreload 2`, which tell the notebook to reload the module whenever its source code changes, so the imports do not need to be re-executed.
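
To make the effect of the autoreload magics concrete, the sketch below does by hand roughly what they automate: it imports a module, edits its source on disk, and reloads it with the standard library's `importlib.reload`. The module name `demo_constants` and its contents are hypothetical stand-ins written to a temporary directory; nothing here is part of `infer_subc`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny throwaway module (hypothetical name) to a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
module_file = tmp_dir / "demo_constants.py"
module_file.write_text("TEST_IMG_N = 5\n")

# Make the temporary directory importable and import the module once.
sys.path.insert(0, str(tmp_dir))
demo_constants = importlib.import_module("demo_constants")
first_value = demo_constants.TEST_IMG_N

# Edit the source on disk, as you would while developing the module.
module_file.write_text("TEST_IMG_N = 12345\n")

# Without a reload the already-imported module keeps its old value;
# reloading re-executes the edited source in place.
importlib.reload(demo_constants)
second_value = demo_constants.TEST_IMG_N

print(first_value, second_value)
```

With `%autoreload 2` active, the notebook performs this kind of reload automatically before running each cell, which is why the import cell does not need to be re-run after editing the source.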


In [None]:
# top level imports
from pathlib import Path
import os, sys

import numpy as np
import scipy

import tifffile

from typing import Union, List, Tuple, Any
# TODO:  prune the imports.. this is the big set for almost all organelles
# # function for core algorithm
from scipy import ndimage as ndi
import aicssegmentation
from aicssegmentation.core.pre_processing_utils import (intensity_normalization, 
                                                        image_smoothing_gaussian_slice_by_slice )

# # package for io 
from aicsimageio import AICSImage

import napari

### import local python functions in ../infer_subc
sys.path.append(os.path.abspath((os.path.join(os.getcwd(), '..'))))

from infer_subc.core.file_io import (read_czi_image,
                                        list_image_files,
                                        get_raw_meta_data,
                                        read_input_image)
from infer_subc.utils._aicsimage_reader import reader_function, _get_meta
from infer_subc.core.img import *
from infer_subc.organelles import fixed_get_optimal_Z_image, get_optimal_Z_image
from infer_subc.constants import (TEST_IMG_N,
                                     NUC_CH ,
                                     LYSO_CH ,
                                     MITO_CH ,
                                     GOLGI_CH ,
                                     PEROX_CH ,
                                     ER_CH ,
                                     LD_CH ,
                                     RESIDUAL_CH , 
                                     ALL_CHANNELS)

%load_ext autoreload
%autoreload 2


#### Get and load an image - specifically for __multichannel "raw"__ images



Read the data into memory from the `.czi` files. (Note: there is also a 2D slice `.tif` file, read for later comparison.) We will also collect metadata here.

> The `data_path` variable should hold the full path to the set of images, wrapped in a `Path()`. Below, the path is built in 3 stages:
> 1. my user directory "~" plus
> 2. general imaging data directory "Projects/Imaging/data" plus
> 3. "raw" where the linearly unmixed zstacks are

The image "type" is also set here, by `im_type = ".czi"`.
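
As a rough illustration of the three-stage path and the file listing, here is a minimal sketch that uses only the standard library. The helper `list_files_by_suffix` is a hypothetical stand-in for `list_image_files` (whose actual implementation is not shown here), a temporary directory stands in for the user directory "~", and the file names are invented placeholders.

```python
import tempfile
from pathlib import Path
from typing import List


def list_files_by_suffix(data_path: Path, im_type: str) -> List[str]:
    """Hypothetical stand-in for list_image_files: sorted paths ending in im_type."""
    return sorted(
        str(p) for p in data_path.iterdir() if p.is_file() and p.name.endswith(im_type)
    )


# Stage 1: a temporary directory stands in for the user directory "~".
user_dir = Path(tempfile.mkdtemp())
# Stage 2: the general imaging data directory.
data_root_path = user_dir / "Projects" / "Imaging" / "data"
# Stage 3: the "raw" directory holding the z-stacks.
data_path = data_root_path / "raw"
data_path.mkdir(parents=True)

# Create empty placeholder files so there is something to list.
for file_name in ("cell_1.czi", "cell_2.czi", "cell_1_slice.tif"):
    (data_path / file_name).touch()

im_type = ".czi"
img_file_list = list_files_by_suffix(data_path, im_type)
print([Path(f).name for f in img_file_list])
```

Joining path segments with `/` keeps the code independent of the operating system's path separator.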


In [None]:
# # this will be the example for testing the pipeline below
# test_img_n = TEST_IMG_N

# # build the datapath
# # all the imaging data goes here.
# data_root_path = Path(os.path.expanduser("~")) / "Documents" / "Python Scripts" / "infer-subc"

# # linearly unmixed ".czi" files are here
# data_path = data_root_path / "raw"
# im_type = ".czi"

# # get the list of all files in "raw"
# img_file_list = list_image_files(data_path,im_type)
# test_img_name = img_file_list[test_img_n]

# test_img_name

In [None]:
# # isolate image as an ndarray and metadata as a dictionary
# img_data, meta_dict = read_czi_image(test_img_name)

# # get some top-level info about the RAW data
# channel_names = meta_dict['name']
# img = meta_dict['metadata']['aicsimage']
# scale = meta_dict['scale']
# channel_axis = meta_dict['channel_axis']

# print(img_data.shape)
# print(meta_dict)


### Get and load Image for processing - specifically for __pre-processed__ images (.tif 16-bit single channel images)

> #### Preprocessing:
> 
> In this instance, we are using [Huygens Essential Software](https://svi.nl/Homepage) to deconvolve 3D fluorescence confocal images. The output is one 3-dimensional .tif file for each channel in the original image.

The basic steps here include:
1. creating a separate list of image names for each channel
2. using `reader_function` to isolate the image and associated metadata from one image (from the list of your choice)
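
The first step can also be sketched with a single dictionary keyed by channel number. This is only an illustration: the helper `group_by_channel` is hypothetical, the example file names are invented, and the pattern assumes the `_cmle_chNN.tif` suffix convention that the notebook's own cell checks for.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed naming pattern: "<prefix>_cmle_chNN.tif", where NN is the channel number.
CHANNEL_SUFFIX = re.compile(r"_cmle_ch(\d{2})\.tif$")


def group_by_channel(file_names: List[str]) -> Dict[int, List[str]]:
    """Hypothetical helper: map channel number -> file names for that channel."""
    by_channel: Dict[int, List[str]] = defaultdict(list)
    for name in file_names:
        match = CHANNEL_SUFFIX.search(name)
        if match:  # files without the channel suffix are skipped
            by_channel[int(match.group(1))].append(name)
    return dict(by_channel)


# Invented example names, for illustration only.
example_files = [
    "cell_1_cmle_ch00.tif",
    "cell_1_cmle_ch01.tif",
    "cell_2_cmle_ch00.tif",
    "notes.txt",
]
channel_lists = group_by_channel(example_files)
print(channel_lists)
```

A dictionary avoids hard-coding one variable per channel, so the same code works if an acquisition has more or fewer channels.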

In [None]:
# this will be the example for testing the pipeline below
test_img_n = TEST_IMG_N

# build the datapath
# all the imaging data goes here.
data_root_path = Path(os.path.expanduser("~")) / "Documents" / "Python Scripts" / "infer-subc"

# deconvolved single-channel ".tif" files are here
data_path = data_root_path / "neuron_raw"
im_type = ".tif"

# get the list of all files in "neuron_raw"
img_file_list = list_image_files(data_path,im_type)
# test_img_name = img_file_list[test_img_n]
# test_img_name

In [None]:
# This creates a separate list of names for each channel type (defined by the suffix of the file name)
# These lists will be used to read in one channel's worth of image data at a time during each subsequent analysis step - using reader_function which is a wrapper to read in any image and get the image and metadata out
ch0 = []
ch1 = []
ch2 = []
ch3 = []
ch4 = []
ch5 = []
for name in img_file_list:
    if name.endswith('_cmle_ch00.tif'):
        ch0.append(name)
    if name.endswith('_cmle_ch01.tif'):
        ch1.append(name)
    if name.endswith('_cmle_ch02.tif'):
        ch2.append(name)
    if name.endswith('_cmle_ch03.tif'):
        ch3.append(name)
    if name.endswith('_cmle_ch04.tif'):
        ch4.append(name)
    if name.endswith('_cmle_ch05.tif'):
        ch5.append(name)


In [None]:
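# Determine all unique name prefixes in the file list
# NOTE: the "_cmle_chNN.tif" suffix is 14 characters long, so [:-16] also drops the last 2 characters before it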
pref_list = []
for name in img_file_list:
    if name[:-16] not in pref_list:
        pref_list.append(name[:-16])        

# For each unique prefix, the file names in the file list that start with that prefix are added to a new list.
# All the files with the same prefix, and their metadata, are then read into memory using read_czi_image().
# Goal: export the images as multichannel .tif files maintaining the metadata
for unique in pref_list:
    channels = []
    for name in img_file_list:
        if name.startswith(unique):
            channels.append(name)
    image = []
    metadata = []
    for channel in channels:
        img_data, meta_dict = read_czi_image(channel)
        image.append(img_data)
        metadata.append(meta_dict)
    # checking my work below
    print(metadata)
    print(np.shape(image))
    print(image)
    break
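
The prefix grouping in the cell above can also be sketched by stripping the channel suffix with a regular expression and matching on the exact prefix. The helper `group_by_prefix` and the example names are hypothetical, and the `_cmle_chNN.tif` suffix is assumed from the file names used earlier in this notebook.

```python
import re
from typing import Dict, List

# Assumed naming pattern: "<prefix>_cmle_chNN.tif".
CHANNEL_SUFFIX = re.compile(r"_cmle_ch\d{2}\.tif$")


def group_by_prefix(file_names: List[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: map each image prefix -> its channel files, sorted by name."""
    groups: Dict[str, List[str]] = {}
    for name in file_names:
        match = CHANNEL_SUFFIX.search(name)
        if match is None:
            continue  # not a per-channel file
        prefix = name[: match.start()]  # everything before "_cmle_chNN.tif"
        groups.setdefault(prefix, []).append(name)
    # Sorting puts the channels in ch00, ch01, ... order within each image.
    return {prefix: sorted(names) for prefix, names in groups.items()}


# Invented example names, for illustration only.
example_files = [
    "cell_1_cmle_ch01.tif",
    "cell_10_cmle_ch00.tif",
    "cell_1_cmle_ch00.tif",
]
groups = group_by_prefix(example_files)
print(groups)
```

Keying on the exact prefix keeps images such as `cell_1` and `cell_10` apart, which a `startswith` test on a shorter prefix would not.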

In [None]:
#select one image
test_img = img_file_list[0]

# isolate image as an ndarray and metadata as a dictionary
img_data, meta_dict = read_czi_image(test_img)


# # get some top-level info about the RAW data
channel_names = meta_dict['name']
img = meta_dict['metadata']['aicsimage']
# scale = meta_dict['scale'] #this can't be read from the .tif file
# channel_axis = meta_dict['channel_axis'] #this can't be read from the .tif file
huygens_meta = meta_dict['metadata']['raw_image_metadata']

img_data, meta_dict, channel_names, img, huygens_meta
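
Because `scale` and `channel_axis` are not available for these `.tif` files, downstream code may want to fall back to a default instead of failing. The sketch below assumes the keys are simply absent from the dictionary in that case, which may not match how `read_czi_image` actually behaves. The helper `get_scale_and_axis` and all values in the placeholder dictionaries are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def get_scale_and_axis(
    meta_dict: Dict[str, Any],
    default_scale: Optional[Tuple[float, ...]] = None,
) -> Tuple[Optional[Tuple[float, ...]], Optional[int]]:
    """Hypothetical helper: return (scale, channel_axis), tolerating missing keys."""
    scale = meta_dict.get("scale", default_scale)
    channel_axis = meta_dict.get("channel_axis")  # None when the key is absent
    return scale, channel_axis


# Placeholder dictionaries; every value here is invented for illustration.
tif_meta = {"name": ["ch00"], "metadata": {"raw_image_metadata": {}}}
czi_meta = {
    "name": ["ch00", "ch01"],
    "scale": (0.4, 0.08, 0.08),
    "channel_axis": 0,
    "metadata": {},
}

print(get_scale_and_axis(tif_meta))
print(get_scale_and_axis(tif_meta, default_scale=(1.0, 1.0, 1.0)))
print(get_scale_and_axis(czi_meta))
```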


### Get and load Image for processing - specifically for __pre-processed__ images (OME-TIFF format)

> #### Preprocessing:
> 
> In this instance, we are using [Huygens Essential Software](https://svi.nl/Homepage) to deconvolve 3D fluorescence confocal images. The output is one 3-dimensional .tif file for each channel in the original image.

The basic steps here include:
1. creating a separate list of image names for each channel
2. using `reader_function` to isolate the image and associated metadata from one image (from the list of your choice)

In [102]:
# this will be the example for testing the pipeline below
test_img_n = TEST_IMG_N

# build the datapath
# all the imaging data goes here.
data_root_path = Path(os.path.expanduser("~")) / "Documents" / "Python Scripts" / "infer-subc"

# deconvolved OME-TIFF files are here
data_path = data_root_path / "neuron_raw_OME"
im_type = ".tiff"

# get the list of all files in "neuron_raw_OME"
img_file_list = list_image_files(data_path,im_type)
# test_img_name = img_file_list[test_img_n]
# test_img_name

img_file_list

['C:\\Users\\Shannon\\Documents\\Python Scripts\\infer-subc\\neuron_raw_OME\\20221027_C2-107_well_1_cell_1_untreated_Linear_unmixing_decon.ome.tiff']

In [125]:
#select one image
test_img = img_file_list[0]

# isolate image as an ndarray and metadata as a dictionary
img_data, meta_dict = read_czi_image(test_img)

# # get some top-level info about the RAW data
channel_names = meta_dict['name']
img = meta_dict['metadata']['aicsimage']
scale = meta_dict['scale']
channel_axis = meta_dict['channel_axis']
huygens_meta = meta_dict['metadata']['raw_image_metadata']





<bound method Array.max of dask.array<getitem, shape=(49, 1688, 1688), dtype=float32, chunksize=(49, 1688, 1688), chunktype=numpy.ndarray>>

In [130]:
viewer = napari.Viewer()



In [131]:
viewer.add_image(img_data,
                 scale=scale)

<Image layer 'img_data' at 0x24d3ce5f790>

--------------

## SUMMARY

The above shows the general procedure for importing the relevant modules, setting up the file I/O, and finally reading in the `img_data` multichannel 3D fluorescence image.

### NEXT:  CHOOSE Z-SLICE

Proceed to [01_infer_cellmask_fromaggr_3D.ipynb](./01_infer_cellmask_fromaggr_3D.ipynb)