# **Measure Organelle Morphology**

***Prior to this notebook, you should have already run through [2.0_quantification_setup](2.0_quantification_setup.ipynb).***

In notebooks 2.1 through 2.4, we will go over the implementation of `infer-subc` quantification methods (explained in detail in the `method_...` notebooks) to assess the morphology, interactions, and distribution of organelles at the single-cell level. 

### 📍 **Purpose**
This notebook can be used to measure the `morphology` -- the amount, size, and shape -- of one or more `organelles` from one or more cells. It includes options to:
1. 🦠 Quantify the morphology of *one or more organelle(s)* from <ins>ONE CELL</ins>
2. 🧪 Batch process the morphology of *one or more organelle(s)* from *multiple cells* for a <ins>SINGLE EXPERIMENT</ins>
3. 🧮 Summarize morphology metrics *per cell* across <INS>ONE OR MORE EXPERIMENTS</ins>

### 🍃 **Biological Relevance - Organelle Morphology**
Measurements of organelle morphology are included as part of the organelle signature analysis. These metrics can provide information about the physiology of a cell and its constituent organelles. 

Organelle amounts have been demonstrated to differ between cell types. The cytoplasms of some specialized cell types, like adipocytes, are composed almost entirely of a single large lipid droplet for fat storage[[1]](https://doi.org/10.3390/biom11121906), while other cell types, like muscle cells, have large and elaborate mitochdondrial networks for effective metabolite diffusion during muscle contraction[[2]](https://doi.org/10.1038/nature14614).

Additionally, different organelle morphologies are important to their function. For example, fission and fusion maintain mitochondrial homeostasis by modulating the size and connectedness of the mitochondrial network. A recent study demonstrated that asymmetric fission events resulted in morphologically distinct daughter mitochondria with different fates; the larger mitochondria continued to grow and divide, while the smaller, spherical fragments containing high reactive oxygen species (ROS) were destined for autophagic degradation[[3]](https://doi.org/10.1038/s41586-021-03510-6).

The following morphological measurements are included for each organelle:
- `label`: the unique ID number for the object being measured
- `centroid`: centroid coordinate tuple (row, col, Z)
- `bbox`: bounding box coordinates (min_row, min_col, max_row, max_col); pixels/voxels belonging to the bounding box are in the half-open interval [min_row; max_row) and [min_col; max_col).
- `area`: (or `volume` for 3D z-stack images) area of the region i.e. number of pixels of the region scaled by pixel-area; this metric has the option to be converted into "real world" units using the scale from the metadata.
- `surface_area`: the surface area of the region. For 3D, surface area of a 2D surface mesh of the region (skimage.measure.marching_cubes) using skimage.measure.mesh_surface_area; this metric has the option to be converted into "real world" units using the scale from the metadata.
- `SA_to_volume`: surface area / area (or volume); this metric has the option to be converted into "real world" units using the scale from the metadata.
- `equivalent_diameter`: the diameter of a circle with the same area as the region; this metric has the option to be converted into "real world" units using the scale from the metadata.
- `extent`: ratio of pixels/voxels in the region to pixels/voxels in the total bounding box. Computed as area / (rows * cols)
- `euler_number`: Euler characteristic of the set of non-zero pixels. Computed as number of connected components subtracted by number of holes (input.ndim connectivity). In 3D, number of connected components plus number of holes subtracted by number of tunnels.
- `solidity`: ratio of pixels/voxels in the region to pixels/voxels of the convex hull image.
- `axis_major_length`: the length of the major axis of the ellipse that has the same normalized second central moments as the region; this metric has the option to be converted into "real world" units using the scale from the metadata.
- `mask_volume`: the volume of the mask used to define the area of analysis; usually this will be the cell mask since the analysis is intended to be done at the single-cell level.

The following measures of the intensity images are also included:
- `min_intensity`: value with the least intensity in the region.
- `max_intensity`: value with the greatest intensity in the region.
- `mean_intensity`: value with the mean intensity in the region.
- `standard_deviation_intensity`: the standard deviation of the intensity in the region.

These measurements and definitions are derived from the [`skimage.measure.regionprops()`](https://scikit-image.org/docs/stable/api/skimage.measure.html#skimage.measure.regionprops) function. More in depth information about each measurement can be found there.

*You can learn more about the implementation of regionprops within infer-subc in the [method_morphology](method_morphology.ipynb) notebook.*

-----

## 🗂️ **Table of Contents**
The following sections are included in this notebook:

**IMPORTS AND LOAD IMAGE**

**EXPLANATION OF STEPS** - This section serves as *expository examples* of the functions used to quantify, batch process, and summarize organelle morphology.

🦠 **Quantify the morphology of *one or more organelles* from <ins>ONE CELL</ins>**
- **`STEP 1`** - Apply cell mask for single cell analysis
- **`STEP 2`** - Loop through the list of organelles to quantify the morphology of each
- **`STEP 3`** - Combine all of the tables together and add column
- **`DEFINE`** - The get_organelle_morph() function

🧪 **Batch process *multiple cells* from a <ins>SINGLE EXPERIMENT</ins>**
- **`STEP 1`** - List images and segmentations to be collected for each
- **`STEP 2`** - Loop through the list of images and perform the morphology quantification on all organelles
- **`STEP 3`** - Combine all of the tables together and create/store the csv file
- **`DEFINE`** - The batch_process_org_morph() function

🧮 **Summarize metrics *per cell* across <INS>ONE OR MORE EXPERIMENTS</ins>**
- **`STEP 1`** - Get the orgnaelle morphology .csv files
- **`STEP 2`** - Summarize the mean, median, and standard deviation of each feature per cell
- **`STEP 3`** - Calculate additional metrics
- **`STEP 4`** - Unstack the organelle names, fill NA values with 0, and save file
- **`DEFINE`** - The batch_org_morph_summary_stats() function

**EXECUTE QUANTIFICATION** - Once you understand how the functions work, this section can be used to quantify your data in a quick and easy way.
- **`STEP 1`:** 🧪 **Batch process *multiple cells* from a <ins>SINGLE EXPERIMENT</ins>**
- **`STEP 2`:** 🧮 **Summarize metrics *per cell* across <INS>ONE OR MORE EXPERIMENTS</ins>**

-----
---------------------
## **IMPORTS AND LOAD IMAGE**
Details about the functions included in this subsection are outlined in the [`2.0_quantification_setup`](2.0_quantification_setup.ipynb) notebook. Please visit that notebook first if you are confused about any of the code included here.

In [1]:
from typing import List, Union
from pathlib import Path
import os
import time
import warnings

from infer_subc.core.img import *

import numpy as np
import pandas as pd
import napari
from napari.utils.notebook_display import nbscreenshot

from infer_subc.utils.stats import get_morphology_metrics, get_org_morphology, batch_process_org_morph, batch_org_morph_summary_stats
from infer_subc.utils.batch import list_image_files, find_segmentation_tiff_files
from infer_subc.core.file_io import read_czi_image, read_tiff_image

pd.set_option('display.max_columns', None)

#### &#x1F3C3; **Run code; no user input required**

#### &#x1F6D1; &#x270D; **User Input Required:**

Please specify the following information about your data: `raw_img_type`, `data_root_path`, `raw_data_path`, `seg_data_path`, and `quant_data_path`.

In [2]:
#### USER INPUT REQUIRED ###
raw_img_type = ".czi"
data_root_path = Path(os.path.expanduser("~")) / "Documents/Python_Scripts/Infer-subc"
raw_data_path = data_root_path / "raw_two"
seg_data_path = data_root_path / "out_two"
quant_data_path = data_root_path / "quant_two"

#### &#x1F3C3; **Run code; no user input required**

In [3]:
# Create the output directory to save the segmentation outputs in.
if not Path.exists(quant_data_path):
    Path.mkdir(quant_data_path)
    print(f"making {quant_data_path}")

# Create a list of the file paths for each image in the input folder. Select test image path.
raw_img_file_list = list_image_files(raw_data_path,raw_img_type)
pd.set_option('display.max_colwidth', None)
pd.DataFrame({"Image Name":raw_img_file_list})

Unnamed: 0,Image Name
0,C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi
1,C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a48hrs-Ctrl + oleic acid_01_Unmixing.czi


#### &#x1F6D1; &#x270D; **User Input Required:**

Use the list above to specify which image you wish to analyze based on its index: `test_img_n`

In [4]:
#### USER INPUT REQUIRED ###
test_img_n = 0

#### &#x1F3C3; **Run code; no user input required**

In [5]:
# Read in the image and metadata as an ndarray and dictionary from the test image selected above. 
test_img_name = raw_img_file_list[test_img_n]
img_data,meta_dict = read_czi_image(test_img_name)

# Define some of the metadata features.
channel_names = meta_dict['name']
meta = meta_dict['metadata']['aicsimage']
scale = meta_dict['scale']
channel_axis = meta_dict['channel_axis']
file_path = meta_dict['file_name']

print("Metadata information")
print(f"File path: {file_path}")
for i in list(range(len(channel_names))):
    print(f"Channel {i} name: {channel_names[i]}")
print(f"Scale (ZYX): {scale}")
print(f"Channel axis: {channel_axis}")

Metadata information
File path: C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi
Channel 0 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Nuclei_Jan22
Channel 1 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Lyso+405_Jan22
Channel 2 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Mito+405_Jan22
Channel 3 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Golgi+405_Jan22
Channel 4 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Peroxy+405_Jan22
Channel 5 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: ER+405_Jan22
Channel 6 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: BODIPY+405low_Jan22
Channel 7 name: 0 :: a24hrs-Ctrl_14_Unmixing-0 :: Residuals
Scale (ZYX): (0.3891184878080979, 0.07987165184837317, 0.07987165184837318)
Channel axis: 0


#### &#x1F6D1; &#x270D; **User Input Required:**

Specify the following information about the segmentation files: - `org_file_names`, `org_channels_ordered`, `regions_file_names`, `suffix_separator`, and `mask_name`.

In [6]:
#### USER INPUT REQUIRED ###
org_file_names = ["lyso", "mito", "golgi", "perox", "ER", "LD"]
org_channels_ordered = [1, 2, 3, 4, 5, 6]
regions_file_names = ["cell", "nuc"]
suffix_separator = "-20230426_test_"
mask_name = "cell"

#### &#x1F3C3; **Run code; no user input required**

In [7]:
# find file paths for segmentations
all_suffixes = org_file_names + regions_file_names
filez = find_segmentation_tiff_files(file_path, all_suffixes, seg_data_path, suffix_separator)

# read the segmentation and masks/regions files into memory
organelles = [read_tiff_image(filez[org]) for org in org_file_names]
regions = [] 
for m in regions_file_names:
    mfile = read_tiff_image(filez[m])
    regions.append(mfile)

# match the intensity channels to the segmentation files
intensities = [img_data[ch] for ch in org_channels_ordered]

# open viewer and add images
viewer = napari.Viewer()
for r, reg in enumerate(regions_file_names):
    viewer.add_image(regions[r],
                     scale=scale,
                     name=f"{reg} mask")

# colors = ["red", "bop orange", "yellow", "green", "blue", "cyan", "magenta", "bop purple"]
for o, org in enumerate(org_file_names):
    viewer.add_image(intensities[o],
                     scale=scale,
                     name=f"{org} intensity channel")
    viewer.add_labels(organelles[o],
                      scale=scale,
                      name=f"{org} segmentation")
viewer.grid.enabled = True
viewer.reset_view()

print("The following matching files were found and can now be viewed in Napari:")
filez



The following matching files were found and can now be viewed in Napari:


{'raw': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/raw_two/a24hrs-Ctrl_14_Unmixing.czi'),
 'lyso': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_lyso.tiff'),
 'mito': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_mito.tiff'),
 'golgi': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_golgi.tiff'),
 'perox': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_perox.tiff'),
 'ER': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_ER.tiff'),
 'LD': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24hrs-Ctrl_14_Unmixing-20230426_test_LD.tiff'),
 'cell': WindowsPath('C:/Users/Shannon/Documents/Python_Scripts/Infer-subc/out_two/a24h

------
-----
## **EXPLANATION OF STEPS**

-----
### 🦠 **Quantify One or More Organelles from <ins>ONE CELL</ins>**

#### **`STEP 1` - Apply cell mask for single cell analysis**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** To ensure we are performing single cell analysis, we will apply the cell segmentation as a mask to the segmentation file. This will exclude any objects outside of the mask area from the analysis. The mask file is selected from the list of regions and added to Napari for visual inspection if desirer.

In [8]:
# select the mask from the region list
mask = regions[regions_file_names.index(mask_name)]

# add mask to napari for visual inspection
viewer.layers.clear()
viewer.add_image(img_data, scale=scale, name="Intensity Image")
viewer.add_labels(mask, scale=scale, name="Mask")
viewer.grid.enabled = False
viewer.reset_view()

#### **`STEP 2` - Loop through the list of organelles to quantify the morphology of each**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** The block of code below loops through the list of organelles to:
1) Select the intensity image that organelle segmentation was derived from
2) Select the organelle segmentation image
3) Assure the segmentation files are formatted corrected (e.g., the ER should only include on object)

> ***IMPORTANT**: The solidity measurement may cause an error for objects that are very small. It depends on the convex hull measurement which is computed as 0 when the object volume is very small; the solidity value is them output in the table as `inf` (infinity). The following error message will be included:*
> ```python
> UserWarning: Failed to get convex hull image. Returning empty image, see error message below:
> ```

In [9]:
# empty list to collect a morphology data for each organelle
org_tabs = []

# loop through the list of organelles and run the get_morphology_metrics function
for j, target in enumerate(org_file_names):
    # select intensity image
    org_img = intensities[j]  
    
    # select segmentation and if ER, ensure it is only one object
    if target == 'ER':
        org_obj = (organelles[j] > 0).astype(np.uint16)  
    else:
        org_obj = organelles[j]
    
    # run get_morphology_metrics function to output a table of measurements
    org_metrics = get_morphology_metrics(segmentation_img=org_obj, 
                                        seg_name=target,
                                        intensity_img=org_img, 
                                        mask=mask,
                                        mask_name=mask_name,
                                        scale=scale)

    # add table to list above
    org_tabs.append(org_metrics)

# print each table separately
for i, org in enumerate(org_file_names):
    print(f"{org} morphology metrics table:")
    display(org_tabs[i])

lyso morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
96,lyso,114,"(0.3891, 0.0799, 0.0799)",5.447659,27.045770,21.408675,14,336,265,15,343,271,0.064542,1.562071,24.202503,0.497677,0.619048,1,inf,0.736380,0.0,6637.0,2528.307692,1649.847307,3835.846084
97,lyso,115,"(0.3891, 0.0799, 0.0799)",5.447659,27.201289,22.413982,14,339,278,15,344,284,0.039718,1.174476,29.570387,0.423314,0.533333,1,inf,0.537688,0.0,6273.0,2528.750000,1597.274417,3835.846084
98,lyso,116,"(0.3891, 0.0799, 0.0799)",5.836777,16.665719,16.179001,15,206,200,16,212,206,0.079436,1.651287,20.787649,0.533342,0.888889,1,inf,0.641275,0.0,5771.0,2326.687500,1525.191616,3835.846084
99,lyso,117,"(0.3891, 0.0799, 0.0799)",5.836777,20.234984,16.713143,15,247,207,16,260,214,0.079436,2.329911,29.330681,0.533342,0.351648,1,inf,1.372233,0.0,5825.0,2116.250000,1546.793255,3835.846084


mito morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,mito,2,"(0.3891, 0.0799, 0.0799)",2.956764,22.682883,17.49335,0,149,133,16,415,348,203.783039,1280.213295,6.282237,7.301125,0.089714,-103,0.185256,19.25954,0.0,43784.0,10056.840861,6935.683498,3835.846084
1,mito,4,"(0.3891, 0.0799, 0.0799)",1.774606,24.773455,30.612095,0,296,370,10,325,398,4.282095,25.249848,5.896611,2.014741,0.212438,1,0.633725,4.512618,0.0,31822.0,7089.010435,5592.276475,3835.846084
2,mito,5,"(0.3891, 0.0799, 0.0799)",1.4989,38.567764,21.885549,0,433,253,10,541,303,14.109813,85.110914,6.032037,2.998064,0.105259,-2,0.271352,10.437631,0.0,34851.0,9333.687544,6706.927624,3835.846084
3,mito,6,"(0.3891, 0.0799, 0.0799)",1.439738,37.566488,37.621423,0,444,459,10,489,491,5.287456,33.380718,6.31319,2.161471,0.147917,1,0.415934,4.767055,0.0,33315.0,7625.987793,5527.341506,3835.846084
4,mito,7,"(0.3891, 0.0799, 0.0799)",2.336648,40.365703,34.997394,0,478,364,15,531,494,21.939221,130.494378,5.947995,3.473288,0.085515,0,0.245411,11.472257,0.0,40697.0,9515.990156,6749.381493,3835.846084
5,mito,8,"(0.3891, 0.0799, 0.0799)",5.525747,2.539321,3.886951,14,0,16,16,91,78,1.459636,18.677485,12.795989,1.407393,0.052109,4,0.165215,13.556272,0.0,16678.0,4454.433673,3108.94257,3835.846084
6,mito,9,"(0.3891, 0.0799, 0.0799)",2.823502,33.392245,19.500704,4,389,230,11,447,256,5.145961,32.731384,6.360597,2.142016,0.196381,1,0.525742,5.780187,0.0,33082.0,8540.91944,6358.932369,3835.846084


golgi morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,golgi,1,"(0.3891, 0.0799, 0.0799)",2.96114,25.181697,18.171747,0,276,182,15,370,279,26.514236,135.579215,5.11345,3.699646,0.078095,1,0.248136,9.16419,0.0,65535.0,20510.154105,10766.252252,3835.846084
1,golgi,3,"(0.3891, 0.0799, 0.0799)",1.362499,25.808138,14.335402,3,308,172,5,342,190,0.826631,9.399228,11.370531,1.164406,0.272059,1,0.7891,2.934292,1239.0,39152.0,16865.594595,6811.403022,3835.846084
2,golgi,4,"(0.3891, 0.0799, 0.0799)",1.167355,26.161862,19.042084,3,321,231,4,335,246,0.407109,4.895599,12.02527,0.91954,0.780952,1,inf,1.358344,2270.0,37253.0,21760.97561,7230.485604,3835.846084
3,golgi,5,"(0.3891, 0.0799, 0.0799)",3.567122,19.622267,18.042716,4,223,208,15,270,254,11.95263,53.883197,4.508062,2.836754,0.202464,1,0.496801,5.283393,0.0,65535.0,25973.866251,14454.151672,3835.846084
4,golgi,6,"(0.3891, 0.0799, 0.0799)",1.956617,22.657591,16.101593,4,268,191,7,298,214,1.489424,13.351535,8.964225,1.416902,0.289855,0,0.723764,2.815886,0.0,50751.0,21227.326667,10388.062045,3835.846084
5,golgi,7,"(0.3891, 0.0799, 0.0799)",1.948433,29.835657,18.349492,4,364,224,7,385,237,1.020256,8.612446,8.441459,1.249022,0.501832,1,0.805882,1.847332,3565.0,57651.0,24923.238443,10744.754757,3835.846084
6,golgi,9,"(0.3891, 0.0799, 0.0799)",3.485415,22.369099,10.471821,8,265,124,11,294,140,1.102174,10.974168,9.956838,1.281594,0.318966,1,0.716129,2.440131,1555.0,65229.0,29881.112613,13897.834826,3835.846084
7,golgi,10,"(0.3891, 0.0799, 0.0799)",4.363178,21.707823,22.104662,9,253,243,14,285,310,5.431434,37.351765,6.876962,2.180914,0.204104,1,0.403616,6.706301,0.0,65535.0,21733.930987,10786.142354,3835.846084
8,golgi,12,"(0.3891, 0.0799, 0.0799)",4.080419,27.772298,14.146867,10,336,170,12,361,186,1.179128,9.107892,7.724263,1.310753,0.59375,1,0.889513,1.956575,1878.0,40019.0,21569.0,7052.847649,3835.846084
9,golgi,13,"(0.3891, 0.0799, 0.0799)",4.490122,21.838828,15.4027,11,261,179,13,284,207,1.012809,10.176071,10.047379,1.245976,0.31677,1,0.766917,3.167001,1090.0,37787.0,17091.517157,6942.699076,3835.846084


perox morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,perox,1,"(0.3891, 0.0799, 0.0799)",0.628576,34.350954,18.966445,1,429,236,3,432,240,0.032271,1.01954,31.593203,0.395006,0.541667,1,0.928571,0.861873,442.0,20765.0,8222.461538,5993.759211,3835.846084
1,perox,2,"(0.3891, 0.0799, 0.0799)",0.972796,23.362458,21.223039,2,291,265,4,295,268,0.034753,1.039502,29.910952,0.404885,0.583333,1,1.0,0.890054,2075.0,25484.0,9120.285714,6566.167151,3835.846084
2,perox,5,"(0.3891, 0.0799, 0.0799)",1.167355,23.362458,24.241046,3,291,302,4,295,306,0.029788,0.876097,29.410583,0.384606,0.75,1,inf,0.34199,0.0,20059.0,9315.916667,6322.844474,3835.846084
3,perox,6,"(0.3891, 0.0799, 0.0799)",1.167355,24.320918,12.979143,3,304,162,4,306,164,0.009929,0.444234,44.738801,0.266671,1.0,1,inf,0.178598,3917.0,14763.0,10761.75,4355.440126,3835.846084
4,perox,7,"(0.3891, 0.0799, 0.0799)",1.167355,33.884012,20.533158,3,421,256,4,428,259,0.032271,1.148861,35.600575,0.395006,0.619048,1,inf,0.697408,1304.0,13503.0,7845.461538,3067.641605,3835.846084
5,perox,11,"(0.3891, 0.0799, 0.0799)",1.556474,32.228212,15.375293,4,402,191,5,406,195,0.029788,0.876097,29.410583,0.384606,0.75,1,inf,0.34199,0.0,21069.0,11390.916667,5963.191126,3835.846084
6,perox,12,"(0.3891, 0.0799, 0.0799)",2.016341,34.027744,18.237361,4,424,226,7,429,232,0.081918,1.754738,21.420573,0.538841,0.366667,1,0.868421,1.105154,227.0,15322.0,9024.424242,3865.203809,3835.846084
7,perox,13,"(0.3891, 0.0799, 0.0799)",1.556474,38.278489,24.280982,4,479,303,5,481,306,0.009929,0.494978,49.849288,0.266671,0.666667,1,inf,0.252576,0.0,9770.0,6366.25,3796.89573,3835.846084
8,perox,14,"(0.3891, 0.0799, 0.0799)",1.556474,40.623951,35.032935,4,507,437,5,511,441,0.032271,0.920559,28.526013,0.395006,0.8125,1,inf,0.385654,0.0,21009.0,10204.923077,6200.073405,3835.846084
9,perox,16,"(0.3891, 0.0799, 0.0799)",1.945592,24.407837,19.577951,5,304,243,6,308,248,0.0422,1.085746,25.72836,0.431956,0.85,1,inf,0.46274,200.0,22497.0,9510.176471,5398.077612,3835.846084


ER morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,ER,1,"(0.3891, 0.0799, 0.0799)",2.698696,28.102147,22.608693,0,0,41,16,628,562,229.631999,2259.962127,9.841669,7.597626,0.01767,-11,0.066378,47.058328,0.0,45278.0,8918.054516,4392.976511,3835.846084


LD morphology metrics table:


Unnamed: 0,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,LD,1,"(0.3891, 0.0799, 0.0799)",3.558773,23.621466,16.592543,6,284,196,13,309,221,4.565086,19.802354,4.337784,2.05818,0.420343,1,0.814438,2.575421,5479.0,43756.0,23792.611746,6275.688489,3835.846084


#### **`STEP 3` - Combine all of the tables together and add column**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** This code block combines the above tables together so that each organelle object is listed as a separate column in a single table. A new column is then added to specify which image the data is from.

In [10]:
# combine the lists for each organelle into one table
final_org_tab = pd.concat(org_tabs, ignore_index=True)

# add a new column to list the name of the image these data are derived from 
final_org_tab.insert(loc=0,column='image_name',value=file_path.stem)

# print table for inspection
display(final_org_tab)

Unnamed: 0,image_name,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,a24hrs-Ctrl_14_Unmixing,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,a24hrs-Ctrl_14_Unmixing,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,a24hrs-Ctrl_14_Unmixing,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,a24hrs-Ctrl_14_Unmixing,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,a24hrs-Ctrl_14_Unmixing,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
142,a24hrs-Ctrl_14_Unmixing,perox,42,"(0.3891, 0.0799, 0.0799)",5.058540,36.221794,35.702628,13,452,446,14,456,449,0.024824,0.799834,32.220540,0.361929,0.833333,1,inf,0.391290,3727.0,9813.0,7179.700000,2008.598370,3835.846084
143,a24hrs-Ctrl_14_Unmixing,perox,43,"(0.3891, 0.0799, 0.0799)",5.058540,37.891112,34.041298,13,474,425,14,476,428,0.012412,0.539441,43.461666,0.287263,0.833333,1,inf,0.276684,5396.0,14727.0,10237.000000,4098.921663,3835.846084
144,a24hrs-Ctrl_14_Unmixing,perox,44,"(0.3891, 0.0799, 0.0799)",5.447659,24.001431,21.685153,14,300,271,15,302,273,0.009929,0.444234,44.738801,0.266671,1.000000,1,inf,0.178598,7855.0,14096.0,12243.000000,2576.774631,3835.846084
145,a24hrs-Ctrl_14_Unmixing,ER,1,"(0.3891, 0.0799, 0.0799)",2.698696,28.102147,22.608693,0,0,41,16,628,562,229.631999,2259.962127,9.841669,7.597626,0.017670,-11,0.066378,47.058328,0.0,45278.0,8918.054516,4392.976511,3835.846084


#### **`DEFINE` - The get_organelle_morph() function**

> ***IMPORTANT**: The solidity measurement included in `get_morphology_metrics()` function may cause an error. It has been suppressed to reduce the lenght of the output. If `"Warning(s) suppressed while quantifying lyso. See 'method_morphology.ipynb' notebook for more details."` is included in the output, see [method_morphology](method_morphology.ipynb) notebook for more details.

In [16]:
def _get_org_morphology(source_file_path: str,
                         list_obj_names: List[str],
                         list_obj_segs: List[np.ndarray],
                         list_intensity_img: List[np.ndarray],
                         list_region_names: List[str],
                         list_region_segs: List[np.ndarray],
                         mask_name: str,
                         scale: Union[tuple,None] = None):
    """
    Measure the amount, size, and shape of multiple organelles from a single cell

    Parameters:
    ----------
    source_file: str
        file path; this is used for recorder keeping of the file name in the output data tables
    list_obj_names: List[str]
        a list of object names (strings) that will be measured; this should match the order in list_obj_segs
    list_obj_segs: List[np.ndarray]
        a list of 3D (ZYX) segmentation np.ndarrays that will be measured per cell; the order should match the list_obj_names 
    list_intensity_img: List[np.ndarray]
        a list of 3D (ZYX) grayscale np.ndarrays that will be used to measure fluoresence intensity in each region and object
    list_region_names: List[str]
        a list of region names (strings); these should include the mask (entire region being measured - usually the cell) 
        and other sub-mask regions from which we can meausure the objects in (ex - nucleus, neurites, soma, etc.). It should 
        also include the centering object used when created the XY distribution bins.
        The order should match the list_region_segs
    list_region_segs: List[np.ndarray]
        a list of 3D (ZYX) binary np.ndarrays of the region masks; the order should match the list_region_names.
    mask: str
        a str of which region name (contained in the list_region_names list) should be used as the main mask (e.g., cell mask)
    scale: Union[tuple,None] = None
        a tuple that contains the real world dimensions for each dimension in the image (Z, Y, X)

    Returns:
    ----------
    Dataframe of measurements of organelle morphology

    """
    print(f"Quantifying organelle morphology from {source_file_path}.")

    # select the mask from the region list
    mask = list_region_segs[list_region_names.index(mask_name)]
    
    # empty list to collect a morphology data for each organelle
    org_tabs = []

    # loop through the list of organelles and run the get_morphology_metrics function
    for j, target in enumerate(list_obj_names):
        # select intensity image
        org_img = list_intensity_img[j]  
        
        # select segmentation and if ER, ensure it is only one object
        if target == 'ER':
            org_obj = (list_obj_segs[j] > 0).astype(np.uint16)  
        else:
            org_obj = list_obj_segs[j]
        
        # run get_morphology_metrics function to output a table of measurements
        org_metrics = get_morphology_metrics(segmentation_img=org_obj, 
                                            seg_name=target,
                                            intensity_img=org_img, 
                                            mask=mask,
                                            mask_name=mask_name,
                                            scale=scale)

        # add table to list above
        org_tabs.append(org_metrics)

    # combine the lists for each organelle into one table
    final_org_tab = pd.concat(org_tabs, ignore_index=True)

    # add a new column to list the name of the image these data are derived from 
    final_org_tab.insert(loc=0,column='image_name',value=source_file_path.stem)
    

    return final_org_tab

In [17]:
org_morph_tab = _get_org_morphology(source_file_path = file_path,
                                     list_obj_names = org_file_names,
                                     list_obj_segs = organelles,
                                     list_intensity_img = intensities, 
                                     list_region_names = regions_file_names,
                                     list_region_segs = regions,
                                     mask_name=mask_name,
                                     scale=scale)
org_morph_tab

Quantifying organelle morphology from C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi.


Unnamed: 0,image_name,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,a24hrs-Ctrl_14_Unmixing,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,a24hrs-Ctrl_14_Unmixing,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,a24hrs-Ctrl_14_Unmixing,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,a24hrs-Ctrl_14_Unmixing,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,a24hrs-Ctrl_14_Unmixing,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
142,a24hrs-Ctrl_14_Unmixing,perox,42,"(0.3891, 0.0799, 0.0799)",5.058540,36.221794,35.702628,13,452,446,14,456,449,0.024824,0.799834,32.220540,0.361929,0.833333,1,inf,0.391290,3727.0,9813.0,7179.700000,2008.598370,3835.846084
143,a24hrs-Ctrl_14_Unmixing,perox,43,"(0.3891, 0.0799, 0.0799)",5.058540,37.891112,34.041298,13,474,425,14,476,428,0.012412,0.539441,43.461666,0.287263,0.833333,1,inf,0.276684,5396.0,14727.0,10237.000000,4098.921663,3835.846084
144,a24hrs-Ctrl_14_Unmixing,perox,44,"(0.3891, 0.0799, 0.0799)",5.447659,24.001431,21.685153,14,300,271,15,302,273,0.009929,0.444234,44.738801,0.266671,1.000000,1,inf,0.178598,7855.0,14096.0,12243.000000,2576.774631,3835.846084
145,a24hrs-Ctrl_14_Unmixing,ER,1,"(0.3891, 0.0799, 0.0799)",2.698696,28.102147,22.608693,0,0,41,16,628,562,229.631999,2259.962127,9.841669,7.597626,0.017670,-11,0.066378,47.058328,0.0,45278.0,8918.054516,4392.976511,3835.846084


##### &#x1F453; **FYI:** This function has been added to `infer_subc.utils.stats` and can be imported with the following:
> ```python
> from infer_subc.utils.stats import get_org_morphology
> ```

-----
### 🧪 **Batch process *multiple cells* from a <ins>SINGLE EXPERIMENT</ins>**

#### **`STEP 1` - List images and segmentations to be collected for each**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** These steps collect a list of the images included in your "raw" (intensity image) data folder. Then, the masks suffixes and organelle suffixes are combined into one list.

In [18]:
# reading list of files from the raw path
img_file_list = list_image_files(raw_data_path, raw_img_type)

# list of organelle segmentation and masks files to collect from each image
segs_to_collect = org_file_names + regions_file_names

#### **`STEP 2` - Loop through the list of images and perform the morphology quantification on all organelles**

##### &#x1F6D1; &#x270D; **User Input Required:**

Determine if the quantification should be carried out with or without the scale:
- `scale`: True indicates that the function will use the scale metadata to produce "real world" metrics (e.g., microns, etc.). False will produce quantification results in pixel/voxel units.

In [19]:
scale = True

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** The block of code below loops through the list of files and runs the `get_org_morphology()` function on each one. The loop utilizes the following sequence of steps:
1) Find the paths for all of the organelle and mask segmentation files.
2) Collect the intensity channels and organelle segmentation files in the same order. Store them as lists.
3) Collect all of the region segmentation files in another list.
4) Determine the scale from the metadata if it is being used.
5) Run the get_org_morphology() function and add the resulting data table to the org_tab list.
6) Repeat the loop above for each image in the raw image list and add them sequentially to the org_tab list.

In [22]:
# containers to collect data tabels
org_tabs = []

# loop through list of cell analyzing each and appending the data to the empty list
for img_f in img_file_list:
    filez = find_segmentation_tiff_files(img_f, segs_to_collect, seg_data_path, suffix_separator)

    # read in raw file and metadata
    img_data, meta_dict = read_czi_image(filez["raw"])

    # create intensities from raw file as list based on the channel order provided
    intensities = [img_data[ch] for ch in org_channels_ordered]

    # store organelle images as list
    organelles = [read_tiff_image(filez[org]) for org in org_file_names]

    # load regions as a list based on order in list (should match order in "masks" file)
    regions = [read_tiff_image(filez[r]) for r in regions_file_names] 

    # define the scale
    if scale is True:
        scale_tup = meta_dict['scale']
    else:
        scale_tup = None

    org_metrics = _get_org_morphology(source_file_path=img_f,
                                        list_obj_names=org_file_names,
                                        list_obj_segs=organelles,
                                        list_intensity_img=intensities, 
                                        list_region_names=regions_file_names,
                                        list_region_segs=regions, 
                                        mask_name=mask_name,
                                        scale=scale_tup)

    org_tabs.append(org_metrics)

print(f"You've collected quantification data from: {len(org_tabs)} images.")

Quantifying organelle morphology from C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi.
Quantifying organelle morphology from C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a48hrs-Ctrl + oleic acid_01_Unmixing.czi.
You've collected quantification data from: 2 images.


#### **`STEP 3` - Combine all of the tables together and create/store the csv file**

##### &#x1F6D1; &#x270D; **User Input Required:**

Select what file name you'd like to use for the output data table:
- `out_file_name`: the prefix you wish to include in the output file name. An underscore will automatically be added to the end of this string before the word "organelles" to indicat this is the results of the organelle morphology analysis.

In [23]:
#### USER INPUT REQUIRED ###
out_file_name = "20241204_test"

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** This code block combines the data tables from each image and combines them into one large data table. Then, the file is named using the "*{out_file_name}*_organelles.csv".

In [25]:
# combine all of the image tables together into one sheet
final_org = pd.concat(org_tabs, ignore_index=True)

# write the new file path include the file name
org_csv_path = quant_data_path / f"{out_file_name}_org_morph.csv"

# save the csv file
final_org.to_csv(org_csv_path)
print(f"The following quantification results have been save to {org_csv_path}.")
final_org

The following quantification results have been save to C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\quant_two\20241204_test_org_morph.csv.


Unnamed: 0,image_name,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,a24hrs-Ctrl_14_Unmixing,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,a24hrs-Ctrl_14_Unmixing,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,a24hrs-Ctrl_14_Unmixing,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,a24hrs-Ctrl_14_Unmixing,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,a24hrs-Ctrl_14_Unmixing,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
324,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,15,"(0.3891, 0.0799, 0.0799)",2.334711,45.420346,29.521450,6,567,367,7,572,373,0.044683,1.263401,28.274919,0.440265,0.600000,1,inf,0.655341,3132.0,8665.0,6092.611111,1622.816795,2302.843664
325,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,22,"(0.3891, 0.0799, 0.0799)",3.502066,37.290077,19.463723,9,465,242,10,470,247,0.039718,1.092028,27.494555,0.423314,0.640000,1,inf,0.420816,4455.0,9287.0,6694.187500,1300.366449,2302.843664
326,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,30,"(0.3891, 0.0799, 0.0799)",4.669422,11.821004,33.952397,12,145,422,13,152,430,0.114189,2.045553,17.913714,0.601925,0.821429,1,inf,0.698000,4160.0,9825.0,7358.869565,1327.667874,2302.843664
327,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,31,"(0.3891, 0.0799, 0.0799)",4.669422,15.008609,38.890233,12,186,485,13,191,490,0.054612,1.263692,23.139358,0.470721,0.880000,1,inf,0.491397,3791.0,9553.0,6445.000000,1654.984427,2302.843664


#### **`DEFINE` - The batch_process_org_morph() function**

In [26]:
def _batch_process_org_morph(out_file_name: str,
                            seg_path: Union[Path,str],
                            out_path: Union[Path, str], 
                            raw_path: Union[Path,str], 
                            raw_file_type: str,
                            organelle_names: List[str],
                            organelle_channels: List[int],
                            region_names: List[str],
                            mask_name: str,
                            scale:bool=True,
                            seg_suffix:Union[str, None]=None) -> int :
    """  
    batch process segmentation quantification (morphology, distribution, contacts); this function is currently optimized to process images from one file folder per image type (e.g., raw, segmentation)
    the output csv files are saved to the indicated out_path folder

    Parameters:
    ----------
    out_file_name: str
        The prefix to use when naming the output datatable. Do not add a separator; "_" will be added between your prefix and the base name given in the function below.
    seg_path: Union[Path,str]
        Path or str to the folder that contains the segmentation tiff files
    out_path: Union[Path, str]
        Path or str to the folder that the output datatables will be saved to
    raw_path: Union[Path,str]
        Path or str to the folder that contains the raw image files
    raw_file_type: str
        The file type of the raw data; ex - ".tiff", ".czi"
    organelle_names: List[str]
        a list of all organelle names that will be analyzed; the names should be the same as the suffix used to name each of the tiff segmentation files
        Note: the intensity measurements collect per region (from get_region_morphology_3D function) will only be from channels associated to these organelles 
    organelle_channels: List[int]
        a list of channel indices associated to respective organelle staining in the raw image; the indices should listed in same order in which the respective segmentation name is listed in organelle_names
    region_names: List[str]
        a list of regions, or masks, to measure; the order should correlate to the order of the channels in the "masks" output segmentation file
    mask: str
        the name of the region to use as the mask when measuring the organelles; this should be one of the names listed in regions list; usually this will be the "cell" mask
    scale:bool=True
        a tuple that contains the real world dimensions for each dimension in the image (Z, Y, X)
    seg_suffix:Union[str, None]=None
        any additional text that is included in the segmentation tiff files between the file stem and the segmentation suffix, not including the initial "-"

    Returns:
    ----------
    count: int
        the number of images processed
        
    """
    start = time.time()
    count = 0

    # create path objects if inputs are strings
    if isinstance(raw_path, str): raw_path = Path(raw_path)
    if isinstance(seg_path, str): seg_path = Path(seg_path)
    if isinstance(out_path, str): out_path = Path(out_path)
    
    # create directory is it doesn't exist
    if not Path.exists(out_path):
        Path.mkdir(out_path)
        print(f"Output file path not found. Making {out_path}.")
    

    # reading list of files from the raw path
    img_file_list = list_image_files(raw_path, raw_file_type)
    len_file_list = len(img_file_list)

    # list of organelle segmentation and masks files to collect from each image
    segs_to_collect = organelle_names + region_names

    # containers to collect data tabels
    org_tabs = []

    # loop through list of cell analyzing each and appending the data to the empty list
    for img_f in img_file_list:
        img_start = time.time()
        count = count + 1
        filez = find_segmentation_tiff_files(img_f, segs_to_collect, seg_path, seg_suffix)

        # read in raw file and metadata
        img_data, meta_dict = read_czi_image(filez["raw"])

        # create intensities from raw file as list based on the channel order provided
        intensities = [img_data[ch] for ch in organelle_channels]

        # store organelle images as list
        organelles = [read_tiff_image(filez[org]) for org in organelle_names]

        # load regions as a list based on order in list (should match order in "masks" file)
        regions = [read_tiff_image(filez[r]) for r in region_names] 

        # define the scale
        if scale is True:
            scale_tup = meta_dict['scale']
        else:
            scale_tup = None

        org_metrics = _get_org_morphology(source_file_path=img_f,
                                            list_obj_names=organelle_names,
                                            list_obj_segs=organelles,
                                            list_intensity_img=intensities, 
                                            list_region_names=region_names,
                                            list_region_segs=regions, 
                                            mask_name=mask_name,
                                            scale=scale_tup)

        org_tabs.append(org_metrics)
        end2 = time.time()
        print(f"Completed quantification of {meta_dict['file_name']} in {(end2-img_start)/60} mins.")
        print(f"{count}/{len_file_list} images have been processed.")
        print(f"Time elapsed: {(end2-start)/60} mins")

    final_org = pd.concat(org_tabs, ignore_index=True)

    org_csv_path = out_path / f"{out_file_name}_org_morph.csv"
    final_org.to_csv(org_csv_path)

    end = time.time()
    print(f"Quantification for {count} files is COMPLETE! Files saved to '{out_path}'.")
    print(f"It took {(end - start)/60} minutes to quantify these files.")
    return final_org

> ***IMPORTANT**: The solidity measurement included in `get_morphology_metrics()` function may cause an error. It has been suppressed to reduce the lenght of the output. If `"Warning(s) suppressed while quantifying lyso. See 'method_morphology.ipynb' notebook for more details."` is included in the output, see [method_morphology](method_morphology.ipynb) notebook for more details.

In [28]:
batch_org_morph_table = _batch_process_org_morph(out_file_name = "20241204_test",
                                                seg_path = seg_data_path,
                                                out_path = quant_data_path, 
                                                raw_path = raw_data_path, 
                                                raw_file_type = raw_img_type,
                                                organelle_names = org_file_names,
                                                organelle_channels = org_channels_ordered,
                                                region_names = regions_file_names,
                                                mask_name = mask_name,
                                                scale = True,
                                                seg_suffix = suffix_separator)

batch_org_morph_table

Quantifying organelle morphology from C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi.
Completed quantification of C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a24hrs-Ctrl_14_Unmixing.czi in 0.5593964695930481 mins.
1/2 images have been processed.
Time elapsed: 0.5594122568766277 mins
Quantifying organelle morphology from C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a48hrs-Ctrl + oleic acid_01_Unmixing.czi.
Completed quantification of C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\raw_two\a48hrs-Ctrl + oleic acid_01_Unmixing.czi in 0.2897226889928182 mins.
2/2 images have been processed.
Time elapsed: 0.8491349458694458 mins
Quantification for 2 files is COMPLETE! Files saved to 'C:\Users\Shannon\Documents\Python_Scripts\Infer-subc\quant_two'.
It took 0.8493348439534505 minutes to quantify these files.


Unnamed: 0,image_name,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,a24hrs-Ctrl_14_Unmixing,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,a24hrs-Ctrl_14_Unmixing,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,a24hrs-Ctrl_14_Unmixing,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,a24hrs-Ctrl_14_Unmixing,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,a24hrs-Ctrl_14_Unmixing,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
324,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,15,"(0.3891, 0.0799, 0.0799)",2.334711,45.420346,29.521450,6,567,367,7,572,373,0.044683,1.263401,28.274919,0.440265,0.600000,1,inf,0.655341,3132.0,8665.0,6092.611111,1622.816795,2302.843664
325,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,22,"(0.3891, 0.0799, 0.0799)",3.502066,37.290077,19.463723,9,465,242,10,470,247,0.039718,1.092028,27.494555,0.423314,0.640000,1,inf,0.420816,4455.0,9287.0,6694.187500,1300.366449,2302.843664
326,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,30,"(0.3891, 0.0799, 0.0799)",4.669422,11.821004,33.952397,12,145,422,13,152,430,0.114189,2.045553,17.913714,0.601925,0.821429,1,inf,0.698000,4160.0,9825.0,7358.869565,1327.667874,2302.843664
327,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,31,"(0.3891, 0.0799, 0.0799)",4.669422,15.008609,38.890233,12,186,485,13,191,490,0.054612,1.263692,23.139358,0.470721,0.880000,1,inf,0.491397,3791.0,9553.0,6445.000000,1654.984427,2302.843664


##### &#x1F453; **FYI:** This function has been added to `infer_subc.utils.stats` and can be imported with the following:
> ```python
> from infer_subc.utils.stats import batch_process_org_morph
> ```

-----
### 🧮 **Summarize metrics *per cell* across <INS>ONE OR MORE EXPERIMENTS</ins>**

#### **`STEP 1` - Get the orgnaelle morphology .csv files**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** These steps collect all of the organelle morphology .csv quantification files from a list locations. Multiple paths can be listed in `csv_path_list` to combine multiple experimental replicates of information.

In [30]:
# create path list from the inputs given above; if desired, more than one input location can be included here when more than one experimental replicate is included
csv_path_list = [quant_data_path]

ds_count = 0
fl_count = 0
###################
# Read in the csv files and combine them into one of each type
###################
# create empty list to hold the morphology tables from different experiments
org_tabs = []
org = "_org_morph"

# loop through all of the locations listed above and find the _org_morph files; append them to the list above
for loc in csv_path_list:
    ds_count = ds_count + 1
    files_store = sorted(loc.glob("*.csv"))
    for file in files_store:
        fl_count = fl_count + 1
        stem = file.stem

        if org in stem:
            test_orgs = pd.read_csv(file, index_col=0)
            test_orgs.insert(0, "dataset", stem[:-11])
            org_tabs.append(test_orgs)

# combine the org_morph lists found above into one table
org_df = pd.concat(org_tabs,axis=0, join='outer')

# print table
org_df

Unnamed: 0,dataset,image_name,object,label,scale,centroid-0,centroid-1,centroid-2,bbox-0,bbox-1,bbox-2,bbox-3,bbox-4,bbox-5,volume,surface_area,SA_to_volume_ratio,equivalent_diameter,extent,euler_number,solidity,axis_major_length,min_intensity,max_intensity,mean_intensity,standard_deviation_intensity,cell_volume
0,20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,1,"(0.3891, 0.0799, 0.0799)",0.000000,21.855788,15.930764,0,271,197,1,278,202,0.054612,0.698442,12.789117,0.470721,0.628571,1,inf,0.636832,0.0,6740.0,2750.772727,1982.010364,3835.846084
1,20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,2,"(0.3891, 0.0799, 0.0799)",0.000000,22.349087,19.423787,0,278,241,1,283,246,0.039718,0.561866,14.146382,0.423314,0.640000,1,inf,0.592344,174.0,4247.0,2239.250000,1050.838980,3835.846084
2,20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,3,"(0.3891, 0.0799, 0.0799)",0.000000,22.670667,16.512820,0,279,203,1,289,210,0.076954,1.025384,13.324701,0.527728,0.442857,1,inf,0.792587,315.0,4582.0,2201.580645,1087.198583,3835.846084
3,20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,4,"(0.3891, 0.0799, 0.0799)",0.190035,28.081387,22.904589,0,348,283,2,357,290,0.106742,1.638994,15.354710,0.588544,0.341270,1,0.704918,0.924011,0.0,6208.0,2170.000000,1508.550745,3835.846084
4,20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,5,"(0.3891, 0.0799, 0.0799)",0.440619,28.385563,24.257490,0,352,299,3,360,310,0.337603,3.146287,9.319491,0.863911,0.515152,1,0.839506,1.333907,0.0,8111.0,2645.058824,1895.408771,3835.846084
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
324,20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,15,"(0.3891, 0.0799, 0.0799)",2.334711,45.420346,29.521450,6,567,367,7,572,373,0.044683,1.263401,28.274919,0.440265,0.600000,1,inf,0.655341,3132.0,8665.0,6092.611111,1622.816795,2302.843664
325,20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,22,"(0.3891, 0.0799, 0.0799)",3.502066,37.290077,19.463723,9,465,242,10,470,247,0.039718,1.092028,27.494555,0.423314,0.640000,1,inf,0.420816,4455.0,9287.0,6694.187500,1300.366449,2302.843664
326,20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,30,"(0.3891, 0.0799, 0.0799)",4.669422,11.821004,33.952397,12,145,422,13,152,430,0.114189,2.045553,17.913714,0.601925,0.821429,1,inf,0.698000,4160.0,9825.0,7358.869565,1327.667874,2302.843664
327,20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,31,"(0.3891, 0.0799, 0.0799)",4.669422,15.008609,38.890233,12,186,485,13,191,490,0.054612,1.263692,23.139358,0.470721,0.880000,1,inf,0.491397,3791.0,9553.0,6445.000000,1654.984427,2302.843664


#### **`STEP 2` - Summarize the mean, median, and standard deviation of each feature per cell**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** The block of code below summarizes the metrics per cell. The following are summarized:
- The number and total volume of each organelle per cell
- The total surface area of each organelle per cell
- The mean, median and standard deviation of the "SA_to_volume_ratio", "equivalent_diameter", "extent", "euler_number", "solidity", and "axis_major_length" metrics for each organelle per cell
- The total cell volume

In [45]:
###################
# summary stat group
###################
group_by = ['dataset', 'image_name', 'object']
sharedcolumns = ["SA_to_volume_ratio", "equivalent_diameter", "extent", "euler_number", "solidity", "axis_major_length"]
ag_func_standard = ['mean', 'median', 'std']

###################
# summarize shared measurements between org_df and contacts_df
###################
tab1 = org_df[group_by + ['volume']].groupby(group_by).agg(['count', 'sum'] + ag_func_standard)
tab2 = org_df[group_by + ['surface_area']].groupby(group_by).agg(['sum'] + ag_func_standard)
tab3 = org_df[group_by + sharedcolumns].groupby(group_by).agg(ag_func_standard)
org_summary = pd.merge(tab1, tab2, 'outer', on=group_by)
org_summary = pd.merge(org_summary, tab3, 'outer', on=group_by)
org_summary[('volume', 'cell')] = org_df[group_by + ['cell_volume']].groupby(group_by).first()

org_summary

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,volume,volume,volume,volume,volume,surface_area,surface_area,surface_area,surface_area,SA_to_volume_ratio,SA_to_volume_ratio,SA_to_volume_ratio,equivalent_diameter,equivalent_diameter,equivalent_diameter,extent,extent,extent,euler_number,euler_number,euler_number,solidity,solidity,solidity,axis_major_length,axis_major_length,axis_major_length,volume
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count,sum,mean,median,std,sum,mean,median,std,mean,median,std,mean,median,std,mean,median,std,mean,median,std,mean,median,std,mean,median,std,cell
dataset,image_name,object,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2,Unnamed: 23_level_2,Unnamed: 24_level_2,Unnamed: 25_level_2,Unnamed: 26_level_2,Unnamed: 27_level_2,Unnamed: 28_level_2,Unnamed: 29_level_2,Unnamed: 30_level_2
20241204_tes,a24hrs-Ctrl_14_Unmixing,ER,1,229.631999,229.631999,229.631999,,2259.962127,2259.962127,2259.962127,,9.841669,9.841669,,7.597626,7.597626,,0.01767,0.01767,,-11.0,-11.0,,0.066378,0.066378,,47.058328,47.058328,,3835.846084
20241204_tes,a24hrs-Ctrl_14_Unmixing,LD,1,4.565086,4.565086,4.565086,,19.802354,19.802354,19.802354,,4.337784,4.337784,,2.05818,2.05818,,0.420343,0.420343,,1.0,1.0,,0.814438,0.814438,,2.575421,2.575421,,3835.846084
20241204_tes,a24hrs-Ctrl_14_Unmixing,golgi,10,50.93583,5.093583,1.140651,8.324421,293.331117,29.333112,10.57512,40.479969,8.502844,8.702842,2.495405,1.730551,1.296173,0.895976,0.355885,0.303313,0.209917,0.9,1.0,0.316228,inf,0.74534,,3.767344,2.875089,2.506693,3835.846084
20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,101,72.170057,0.714555,0.119154,3.438198,667.458093,6.608496,2.381141,26.169206,17.953252,18.080466,6.300097,0.76425,0.610525,0.469278,0.521579,0.541667,0.198854,0.693069,1.0,2.791926,inf,0.901961,,1.276805,0.972288,1.161458,3835.846084
20241204_tes,a24hrs-Ctrl_14_Unmixing,mito,7,256.007221,36.57246,5.287456,74.072076,1605.858021,229.408289,33.380718,465.115328,7.089808,6.282237,2.523024,3.071157,2.161471,1.984718,0.127048,0.105259,0.060199,-14.0,1.0,39.285281,0.348948,0.271352,0.179877,9.969366,10.437631,5.416649,3835.846084
20241204_tes,a24hrs-Ctrl_14_Unmixing,perox,27,0.83656,0.030984,0.029788,0.020382,24.288398,0.89957,0.876097,0.368926,33.584855,32.22054,8.127136,0.373113,0.384606,0.080771,0.737516,0.75,0.173332,1.0,1.0,0.0,inf,inf,,0.470311,0.357197,0.278454,3835.846084
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,ER,1,343.371889,343.371889,343.371889,,2921.039992,2921.039992,2921.039992,,8.506928,8.506928,,8.688045,8.688045,,0.060494,0.060494,,-172.0,-172.0,,0.134462,0.134462,,43.279091,43.279091,,2302.843664
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,17,20.37036,1.198256,0.148942,3.252644,162.071696,9.533629,2.471522,21.385807,18.507938,17.913714,7.229472,0.86255,0.657668,0.664604,0.572319,0.68,0.267411,0.882353,1.0,0.485071,inf,inf,,1.93945,0.8566,3.373528,2302.843664
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,golgi,1,25.238296,25.238296,25.238296,,135.444425,135.444425,135.444425,,5.366623,5.366623,,3.639322,3.639322,,0.144418,0.144418,,-1.0,-1.0,,0.285863,0.285863,,9.412951,9.412951,,2302.843664
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,lyso,67,29.507979,0.440418,0.114189,1.052381,372.63267,5.561682,2.178454,9.770319,21.362361,21.654694,7.559035,0.74287,0.601925,0.381499,0.463294,0.488889,0.185996,0.865672,1.0,0.48914,inf,inf,,1.372835,1.000435,1.033211,2302.843664


#### **`STEP 3` - Calculate additional metrics**

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** The block of code below includes additional metrics that are normalized to the cell size:
- `volume fraction`: the portion of the cell volume taken up by each organelle; calculated as (total organelle volume / total cell volume)

In [47]:
###################
# add normalization
###################
# organelle area fraction
org_summary[('volume', 'fraction')] = org_summary[('volume', 'sum')]/org_summary[('volume', 'cell')]
org_summary = org_summary.sort_index(axis=1, level=0, ascending=False)

org_summary

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,volume,volume,volume,volume,volume,volume,volume,surface_area,surface_area,surface_area,surface_area,solidity,solidity,solidity,extent,extent,extent,euler_number,euler_number,euler_number,equivalent_diameter,equivalent_diameter,equivalent_diameter,axis_major_length,axis_major_length,axis_major_length,SA_to_volume_ratio,SA_to_volume_ratio,SA_to_volume_ratio
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,sum,std,median,mean,fraction,count,cell,sum,std,median,mean,std,median,mean,std,median,mean,std,median,mean,std,median,mean,std,median,mean,std,median,mean
dataset,image_name,object,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2,Unnamed: 23_level_2,Unnamed: 24_level_2,Unnamed: 25_level_2,Unnamed: 26_level_2,Unnamed: 27_level_2,Unnamed: 28_level_2,Unnamed: 29_level_2,Unnamed: 30_level_2,Unnamed: 31_level_2
20241204_tes,a24hrs-Ctrl_14_Unmixing,ER,229.631999,,229.631999,229.631999,0.059865,1,3835.846084,2259.962127,,2259.962127,2259.962127,,0.066378,0.066378,,0.01767,0.01767,,-11.0,-11.0,,7.597626,7.597626,,47.058328,47.058328,,9.841669,9.841669
20241204_tes,a24hrs-Ctrl_14_Unmixing,LD,4.565086,,4.565086,4.565086,0.00119,1,3835.846084,19.802354,,19.802354,19.802354,,0.814438,0.814438,,0.420343,0.420343,,1.0,1.0,,2.05818,2.05818,,2.575421,2.575421,,4.337784,4.337784
20241204_tes,a24hrs-Ctrl_14_Unmixing,golgi,50.93583,8.324421,1.140651,5.093583,0.013279,10,3835.846084,293.331117,40.479969,10.57512,29.333112,,0.74534,inf,0.209917,0.303313,0.355885,0.316228,1.0,0.9,0.895976,1.296173,1.730551,2.506693,2.875089,3.767344,2.495405,8.702842,8.502844
20241204_tes,a24hrs-Ctrl_14_Unmixing,lyso,72.170057,3.438198,0.119154,0.714555,0.018815,101,3835.846084,667.458093,26.169206,2.381141,6.608496,,0.901961,inf,0.198854,0.541667,0.521579,2.791926,1.0,0.693069,0.469278,0.610525,0.76425,1.161458,0.972288,1.276805,6.300097,18.080466,17.953252
20241204_tes,a24hrs-Ctrl_14_Unmixing,mito,256.007221,74.072076,5.287456,36.57246,0.066741,7,3835.846084,1605.858021,465.115328,33.380718,229.408289,0.179877,0.271352,0.348948,0.060199,0.105259,0.127048,39.285281,1.0,-14.0,1.984718,2.161471,3.071157,5.416649,10.437631,9.969366,2.523024,6.282237,7.089808
20241204_tes,a24hrs-Ctrl_14_Unmixing,perox,0.83656,0.020382,0.029788,0.030984,0.000218,27,3835.846084,24.288398,0.368926,0.876097,0.89957,,inf,inf,0.173332,0.75,0.737516,0.0,1.0,1.0,0.080771,0.384606,0.373113,0.278454,0.357197,0.470311,8.127136,32.22054,33.584855
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,ER,343.371889,,343.371889,343.371889,0.149108,1,2302.843664,2921.039992,,2921.039992,2921.039992,,0.134462,0.134462,,0.060494,0.060494,,-172.0,-172.0,,8.688045,8.688045,,43.279091,43.279091,,8.506928,8.506928
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,LD,20.37036,3.252644,0.148942,1.198256,0.008846,17,2302.843664,162.071696,21.385807,2.471522,9.533629,,inf,inf,0.267411,0.68,0.572319,0.485071,1.0,0.882353,0.664604,0.657668,0.86255,3.373528,0.8566,1.93945,7.229472,17.913714,18.507938
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,golgi,25.238296,,25.238296,25.238296,0.01096,1,2302.843664,135.444425,,135.444425,135.444425,,0.285863,0.285863,,0.144418,0.144418,,-1.0,-1.0,,3.639322,3.639322,,9.412951,9.412951,,5.366623,5.366623
20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,lyso,29.507979,1.052381,0.114189,0.440418,0.012814,67,2302.843664,372.63267,9.770319,2.178454,5.561682,,inf,inf,0.185996,0.488889,0.463294,0.48914,1.0,0.865672,0.381499,0.601925,0.74287,1.033211,1.000435,1.372835,7.559035,21.654694,21.362361


#### **`STEP 4` - Unstack the organelle names, fill NA values with 0, and save file**

##### &#x1F6D1; &#x270D; **User Input Required:**

Select what file name you'd like to use for the output data table:
- `out_file_name`: the prefix you wish to include in the output file name. An underscore will automatically be added to the end of this string before the word "organelles" to indicat this is the results of the organelle morphology analysis. Note the separator (e.g., " _ " or "-") should be excluded from this object. The script below addes a " _ " between the out name and the suffix.

In [51]:
#### USER INPUT REQUIRED ###
summary_out_file_name = "20241204_test"

##### &#x1F3C3; **Run code; no user input required**

&#x1F453; **FYI:** The block of code below the following action occur:
1. The "organelle" column was unstacked resulting in a dataframe where each column if a different summary statistic and row represents the values for each cell. 
2. For specific values, such as organelle count and total/mean/median volume, NA values were filled with 0.
3. The standard deviation, mean, and median values for the ER (forced to be only one object) were removed.
4. The file was saved.

In [54]:
###################
# flatten datasheets and combine
###################
# org flattening
org_final = org_summary.unstack(-1)
new_col_order = ['dataset', 'image_name', 'object', 'volume', 'surface_area', 'SA_to_volume_ratio', 
                'equivalent_diameter', 'extent', 'euler_number', 'solidity', 'axis_major_length']
new_cols = org_final.columns.reindex(new_col_order, level=0)
org_final = org_final.reindex(columns=new_cols[0])
org_final.columns = ["_".join((col_name[-1], col_name[1], col_name[0])) for col_name in org_final.columns.to_flat_index()]


#renaming, filling "NaN" with 0 when needed, and removing ER_std columns
for col in org_final.columns:
    if col.endswith(("_count_volume","_sum_volume", "_mean_volume", "_median_volume")):
        org_final[col] = org_final[col].fillna(0)
    if col.endswith("_count_volume"):
        org_final.rename(columns={col:col.split("_")[0]+"_count"}, inplace=True)
    if col.startswith(("ER_std_", "ER_mean_", "ER_median_")):
        org_final.drop(columns=[col], inplace=True)
org_final = org_final.reset_index()


###################
# export summary sheets
###################
org_summary.to_csv(str(quant_data_path) + f"/{summary_out_file_name}_per_org_summarystats.csv")

print(f"Organelle morphology summary for {fl_count} files from {ds_count} dataset(s) is complete.")
org_final

Organelle morphology summary for 2 files from 1 dataset(s) is complete.


Unnamed: 0,dataset,image_name,ER_sum_volume,LD_sum_volume,golgi_sum_volume,lyso_sum_volume,mito_sum_volume,perox_sum_volume,LD_std_volume,golgi_std_volume,lyso_std_volume,mito_std_volume,perox_std_volume,LD_median_volume,golgi_median_volume,lyso_median_volume,mito_median_volume,perox_median_volume,LD_mean_volume,golgi_mean_volume,lyso_mean_volume,mito_mean_volume,perox_mean_volume,ER_fraction_volume,LD_fraction_volume,golgi_fraction_volume,lyso_fraction_volume,mito_fraction_volume,perox_fraction_volume,ER_count,LD_count,golgi_count,lyso_count,mito_count,perox_count,ER_cell_volume,LD_cell_volume,golgi_cell_volume,lyso_cell_volume,mito_cell_volume,perox_cell_volume,ER_sum_surface_area,LD_sum_surface_area,golgi_sum_surface_area,lyso_sum_surface_area,mito_sum_surface_area,perox_sum_surface_area,LD_std_surface_area,golgi_std_surface_area,lyso_std_surface_area,mito_std_surface_area,perox_std_surface_area,LD_median_surface_area,golgi_median_surface_area,lyso_median_surface_area,mito_median_surface_area,perox_median_surface_area,LD_mean_surface_area,golgi_mean_surface_area,lyso_mean_surface_area,mito_mean_surface_area,perox_mean_surface_area,LD_std_SA_to_volume_ratio,golgi_std_SA_to_volume_ratio,lyso_std_SA_to_volume_ratio,mito_std_SA_to_volume_ratio,perox_std_SA_to_volume_ratio,LD_median_SA_to_volume_ratio,golgi_median_SA_to_volume_ratio,lyso_median_SA_to_volume_ratio,mito_median_SA_to_volume_ratio,perox_median_SA_to_volume_ratio,LD_mean_SA_to_volume_ratio,golgi_mean_SA_to_volume_ratio,lyso_mean_SA_to_volume_ratio,mito_mean_SA_to_volume_ratio,perox_mean_SA_to_volume_ratio,LD_std_equivalent_diameter,golgi_std_equivalent_diameter,lyso_std_equivalent_diameter,mito_std_equivalent_diameter,perox_std_equivalent_diameter,LD_median_equivalent_diameter,golgi_median_equivalent_diameter,lyso_median_equivalent_diameter,mito_median_equivalent_diameter,perox_median_equivalent_diameter,LD_mean_equivalent_diameter,golgi_mean_equivalent_diameter,lyso_mean_equivalent_diameter,mito_mean_equivalent_diameter,perox_mean_equivalent_diameter,LD_std_extent,golgi_std_extent,lyso_std_extent,mito_std_extent,perox_std_extent,LD_median_extent,golgi_median_extent,lyso_median_extent,mito_median_extent,perox_median_extent,LD_mean_extent,golgi_mean_extent,lyso_mean_extent,mito_mean_extent,perox_mean_extent,LD_std_euler_number,golgi_std_euler_number,lyso_std_euler_number,mito_std_euler_number,perox_std_euler_number,LD_median_euler_number,golgi_median_euler_number,lyso_median_euler_number,mito_median_euler_number,perox_median_euler_number,LD_mean_euler_number,golgi_mean_euler_number,lyso_mean_euler_number,mito_mean_euler_number,perox_mean_euler_number,LD_std_solidity,golgi_std_solidity,lyso_std_solidity,mito_std_solidity,perox_std_solidity,LD_median_solidity,golgi_median_solidity,lyso_median_solidity,mito_median_solidity,perox_median_solidity,LD_mean_solidity,golgi_mean_solidity,lyso_mean_solidity,mito_mean_solidity,perox_mean_solidity,LD_std_axis_major_length,golgi_std_axis_major_length,lyso_std_axis_major_length,mito_std_axis_major_length,perox_std_axis_major_length,LD_median_axis_major_length,golgi_median_axis_major_length,lyso_median_axis_major_length,mito_median_axis_major_length,perox_median_axis_major_length,LD_mean_axis_major_length,golgi_mean_axis_major_length,lyso_mean_axis_major_length,mito_mean_axis_major_length,perox_mean_axis_major_length
0,20241204_tes,a24hrs-Ctrl_14_Unmixing,229.631999,4.565086,50.93583,72.170057,256.007221,0.83656,,8.324421,3.438198,74.072076,0.020382,4.565086,1.140651,0.119154,5.287456,0.029788,4.565086,5.093583,0.714555,36.57246,0.030984,0.059865,0.00119,0.013279,0.018815,0.066741,0.000218,1,1,10,101,7,27,3835.846084,3835.846084,3835.846084,3835.846084,3835.846084,3835.846084,2259.962127,19.802354,293.331117,667.458093,1605.858021,24.288398,,40.479969,26.169206,465.115328,0.368926,19.802354,10.57512,2.381141,33.380718,0.876097,19.802354,29.333112,6.608496,229.408289,0.89957,,2.495405,6.300097,2.523024,8.127136,4.337784,8.702842,18.080466,6.282237,32.22054,4.337784,8.502844,17.953252,7.089808,33.584855,,0.895976,0.469278,1.984718,0.080771,2.05818,1.296173,0.610525,2.161471,0.384606,2.05818,1.730551,0.76425,3.071157,0.373113,,0.209917,0.198854,0.060199,0.173332,0.420343,0.303313,0.541667,0.105259,0.75,0.420343,0.355885,0.521579,0.127048,0.737516,,0.316228,2.791926,39.285281,0.0,1.0,1.0,1.0,1.0,1.0,1.0,0.9,0.693069,-14.0,1.0,,,,0.179877,,0.814438,0.74534,0.901961,0.271352,inf,0.814438,inf,inf,0.348948,inf,,2.506693,1.161458,5.416649,0.278454,2.575421,2.875089,0.972288,10.437631,0.357197,2.575421,3.767344,1.276805,9.969366,0.470311
1,20241204_tes,a48hrs-Ctrl + oleic acid_01_Unmixing,343.371889,20.37036,25.238296,29.507979,259.907031,8.740439,3.252644,,1.052381,143.620215,0.108903,0.148942,25.238296,0.114189,5.006948,0.057095,1.198256,25.238296,0.440418,86.635677,0.093983,0.149108,0.008846,0.01096,0.012814,0.112864,0.003795,1,17,1,67,3,93,2302.843664,2302.843664,2302.843664,2302.843664,2302.843664,2302.843664,2921.039992,162.071696,135.444425,372.63267,1870.605852,163.678774,21.385807,,9.770319,998.996768,1.295802,2.471522,135.444425,2.178454,55.101645,1.371658,9.533629,135.444425,5.561682,623.535284,1.759987,7.229472,,7.559035,4.393478,6.810681,17.913714,5.366623,21.654694,11.005036,23.139358,18.507938,5.366623,21.362361,11.285312,23.879402,0.664604,,0.381499,3.440393,0.152014,0.657668,3.639322,0.601925,2.122551,0.477748,0.86255,3.639322,0.74287,3.877585,0.518395,0.267411,,0.185996,0.030385,0.198614,0.68,0.144418,0.488889,0.064962,0.65,0.572319,0.144418,0.463294,0.074557,0.635834,0.485071,,0.48914,60.753601,0.0,1.0,-1.0,1.0,-2.0,1.0,0.882353,-1.0,0.865672,-33.0,1.0,,,,0.049627,,inf,0.285863,inf,0.167378,1.0,inf,0.285863,inf,0.169366,inf,3.373528,,1.033211,20.887634,0.480941,0.8566,9.412951,1.000435,9.735887,0.730335,1.93945,9.412951,1.372835,20.545981,0.787345


#### **`DEFINE` - The batch_org_morph_summary_stats() function**

In [55]:
def _batch_org_morph_summary_stats(csv_path_list: List[str],
                                    out_path: str,
                                    out_preffix: str):
    """" 
    csv_path_list: List[str],
        A list of path strings where .csv files to analyze are located.
    out_path: str,
        A path string where the summary data file will be output to
    out_preffix: str
        The prefix used to name the output file. An "_" will be included between this prefix and the file suffix.
    """
    ds_count = 0
    fl_count = 0
    ###################
    # Read in the csv files and combine them into one of each type
    ###################
    # create empty list to hold the morphology tables from different experiments
    org_tabs = []
    org = "_org_morph"

    # loop through all of the locations listed above and find the _org_morph files; append them to the list above
    for loc in csv_path_list:
        ds_count = ds_count + 1
        files_store = sorted(loc.glob("*.csv"))
        for file in files_store:
            fl_count = fl_count + 1
            stem = file.stem

            if org in stem:
                test_orgs = pd.read_csv(file, index_col=0)
                test_orgs.insert(0, "dataset", stem[:-11])
                org_tabs.append(test_orgs)

    # combine the org_morph lists found above into one table
    org_df = pd.concat(org_tabs,axis=0, join='outer')


    ###################
    # summary stat group
    ###################
    group_by = ['dataset', 'image_name', 'object']
    sharedcolumns = ["SA_to_volume_ratio", "equivalent_diameter", "extent", "euler_number", "solidity", "axis_major_length"]
    ag_func_standard = ['mean', 'median', 'std']

    ###################
    # summarize shared measurements between org_df and contacts_df
    ###################
    tab1 = org_df[group_by + ['volume']].groupby(group_by).agg(['count', 'sum'] + ag_func_standard)
    tab2 = org_df[group_by + ['surface_area']].groupby(group_by).agg(['sum'] + ag_func_standard)
    tab3 = org_df[group_by + sharedcolumns].groupby(group_by).agg(ag_func_standard)
    org_summary = pd.merge(tab1, tab2, 'outer', on=group_by)
    org_summary = pd.merge(org_summary, tab3, 'outer', on=group_by)
    org_summary[('volume', 'cell')] = org_df[group_by + ['cell_volume']].groupby(group_by).first()


    ###################
    # add normalization
    ###################
    # organelle area fraction
    org_summary[('volume', 'fraction')] = org_summary[('volume', 'sum')]/org_summary[('volume', 'cell')]
    org_summary = org_summary.sort_index(axis=1, level=0, ascending=False)


    ###################
    # flatten datasheets and combine
    ###################
    # org flattening
    org_final = org_summary.unstack(-1)
    new_col_order = ['dataset', 'image_name', 'object', 'volume', 'surface_area', 'SA_to_volume_ratio', 
                    'equivalent_diameter', 'extent', 'euler_number', 'solidity', 'axis_major_length']
    new_cols = org_final.columns.reindex(new_col_order, level=0)
    org_final = org_final.reindex(columns=new_cols[0])
    org_final.columns = ["_".join((col_name[-1], col_name[1], col_name[0])) for col_name in org_final.columns.to_flat_index()]

    #renaming, filling "NaN" with 0 when needed, and removing ER_std columns
    for col in org_final.columns:
        if col.endswith(("_count_volume","_sum_volume", "_mean_volume", "_median_volume")):
            org_final[col] = org_final[col].fillna(0)
        if col.endswith("_count_volume"):
            org_final.rename(columns={col:col.split("_")[0]+"_count"}, inplace=True)
        if col.startswith(("ER_std_", "ER_mean_", "ER_median_")):
            org_final.drop(columns=[col], inplace=True)
    org_final = org_final.reset_index()


    ###################
    # export summary sheets
    ###################
    org_summary.to_csv(str(out_path) + f"/{out_preffix}_per_org_summarystats.csv")


    print(f"Organelle morphology summary for {fl_count} files from {ds_count} dataset(s) is complete.")
    return org_summary

In [56]:
test_org_summary = _batch_org_morph_summary_stats(csv_path_list = csv_path_list,
                                                  out_path = quant_data_path,
                                                  out_preffix = summary_out_file_name)

print(f"The table above matches the table created with the function here: {org_summary.equals(test_org_summary)}")

Organelle morphology summary for 3 files from 1 dataset(s) is complete.
The table above matches the table created with the function here: True


##### &#x1F453; **FYI:** This function has been added to `infer_subc.utils.stats` and can be imported with the following:
> ```python
> from infer_subc.utils.stats import batch_org_morph_summary_stats
> ```

-----
-----

## **EXECUTE QUANTIFICATION**

### **`STEP 1` - 🧪 Batch process *multiple cells* from a <ins>SINGLE EXPERIMENT</ins>**

#### &#x1F6D1; &#x270D; **User Input Required:**

Please specify the following information about your data: 
- `out_file_name`: The prefix to use when naming the output datatable. Do not add a separator; "_" will be added between your prefix and the base name given in the function below.
- `seg_path`: Path or str to the folder that contains the segmentation tiff files
- `out_path`: Path or str to the folder that the output datatables will be saved to
- `raw_path`: Path or str to the folder that contains the raw image files
- `raw_file_type`: The file type of the raw data; ex - ".tiff", ".czi"
- `organelle_names`: A list of all organelle names that will be analyzed; the names should be the same as the suffix used to name each of the tiff segmentation files. Note: the intensity measurements collect per region (from get_region_morphology_3D function) will only be from channels associated to these organelles 
- `organelle_channels`: A list of channel indices associated to respective organelle staining in the raw image; the indices should listed in same order in which the respective segmentation name is listed in organelle_names
- `region_names`: A list of regions, or masks, to measure; the order should correlate to the order of the channels in the "masks" output segmentation file
- `mask`: The name of the region to use as the mask when measuring the organelles; this should be one of the names listed in regions list; usually this will be the "cell" mask
- `scale`: A tuple that contains the real world dimensions for each dimension in the image (Z, Y, X)
- `seg_suffix`: Any additional text that is included in the segmentation tiff files between the file stem and the segmentation suffix, not including the initial "-"

The defaults below utilize the user input from the `IMPORTS` section.

In [57]:
out_file_name = "20241204_test"
seg_path = seg_data_path
out_path = quant_data_path
raw_path = raw_data_path
raw_file_type = raw_img_type
organelle_names = org_file_names
organelle_channels = org_channels_ordered
region_names = regions_file_names
mask_name = mask_name
scale = True
seg_suffix = suffix_separator



#### &#x1F3C3; **Run code; no user input required**
&#x1F453; **FYI:** This code block uses the inputs provided above to run the batch processing. The table that is saved to you files is also printed below for easy access.

In [None]:
batch_org_morph_table = batch_process_org_morph(out_file_name,
                                                seg_path,
                                                out_path, 
                                                raw_path, 
                                                raw_file_type,
                                                organelle_names,
                                                organelle_channels,
                                                region_names,
                                                mask_name,
                                                scale,
                                                seg_suffix)

batch_org_morph_table

### **`STEP 2` - 🧮 Summarize metrics *per cell* across <INS>ONE OR MORE EXPERIMENTS</ins>**

#### &#x1F6D1; &#x270D; **User Input Required:**

Please specify the following information about your data: 
- `csv_path_list`: A list of path strings where .csv files to analyze are located.
- `out_path`: A path string where the summary data file will be output to
- `out_preffix`: The prefix used to name the output file. An "_" will be included between this prefix and the file suffix.

In [None]:
csv_path_list = csv_path_list
out_path = quant_data_path
out_preffix = summary_out_file_name

#### &#x1F3C3; **Run code; no user input required**
&#x1F453; **FYI:** This code block uses the inputs provided above to run the batch processing. The table that is saved to you files is also printed below for easy access.

In [None]:
test_org_summary = batch_org_morph_summary_stats(csv_path_list = csv_path_list,
                                                  out_path = quant_data_path,
                                                  out_preffix = summary_out_file_name)

test_org_summary

-----
-----
# 🎉 **CONGRATULATIONS!!**
### **You've successfully quantified organelle morphology using the modular `2.1._organelle_morphology` notebook.**

Continue on to other quantification notebooks as needed:
- [2.2 Organelle interactions]() (amounts, size, shape)
- [2.3 Subcellular distribution]() in XY and Z separately (of organelles and interaction sites)
- [2.4 Cell morphology]() (size, shape)