# Tutorial Notebook: Real dataset tutorial 

## 1. Overview and Biological Question

**Documentation**
The real dataset used to answer the biological question comprises fluorescence microscopy images of neuroepithelial organoids expressing ZO1-EGFP, a tight junction marker crucial for identifying apical membrane regions. The dataset includes approximately 120 images captured using confocal microscopy, with each image representing individual organoids. Images are approximately 1024 × 1024 pixels, are stored in TIFF format, and display varying configurations of apical polarity, specifically the "apical-in" and "apical-out" phenotypes. Basic statistics indicate that approximately 60% of organoids show an "apical-in" orientation, while around 40% exhibit "apical-out" polarity under standard conditions.

**Justification**
This dataset is well suited to the image analysis pipeline because it contains clearly distinguishable morphologies for automated segmentation and classification. The ZO1-EGFP marker provides high-contrast labeling of cell boundaries, which facilitates precise delineation and robust quantitative analysis.

**Biological Question**
Does exposure of neuroepithelial organoids to lysophosphatidic acid (LPA) reliably induce a shift from apical-in to apical-out polarization?

**Expected Results**
The pipeline is expected to quantitatively confirm an increase in "apical-out" organoid configurations upon LPA treatment compared to control conditions. Specifically, images of LPA-treated organoids are anticipated to show a significant increase in the percentage of organoids classified as "apical-out," supported by increased intensity and altered distribution patterns of ZO1-EGFP fluorescence along the organoid surface.

**Expected Answer**
Exposure to LPA significantly enhances the incidence of "apical-out" orientation in neuroepithelial organoids, validating LPA as a potent regulator of epithelial cell polarity via the GPCR/Rho/ROCK/F-actin signaling pathway.
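
One way to test the expected shift is a one-sided Fisher's exact test on per-condition counts of "apical-out" and "apical-in" organoids. The sketch below is only an illustration: the function name `compare_apical_out` is hypothetical, and the counts are placeholders (the control split merely mirrors the roughly 40% apical-out baseline quoted above), not measured results.

```python
from scipy.stats import fisher_exact


def compare_apical_out(control_out, control_in, lpa_out, lpa_in):
    """One-sided Fisher's exact test: is apical-out more frequent with LPA?"""
    # Rows: condition (LPA, control); columns: phenotype (apical-out, apical-in).
    table = [[lpa_out, lpa_in], [control_out, control_in]]
    # alternative="greater" tests whether the odds of apical-out are higher in the first row (LPA).
    odds_ratio, p_value = fisher_exact(table, alternative="greater")
    return {
        "control_fraction_out": control_out / (control_out + control_in),
        "lpa_fraction_out": lpa_out / (lpa_out + lpa_in),
        "odds_ratio": odds_ratio,
        "p_value": p_value,
    }


# Placeholder counts for illustration only; these are NOT measured results.
result = compare_apical_out(control_out=24, control_in=36, lpa_out=45, lpa_in=15)
print(result)
```

Fisher's exact test is chosen here because it makes no large-sample assumption; with around 120 images a chi-squared test would likely be a reasonable alternative.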

---

## 2. Data Description

The analysis uses microscopy images of cells. The key details about the data are as follows:

  
- **Format and Metadata:**  
  Each image is expected to be a TIFF file with color or grayscale intensities. Additional metadata such as experimental conditions may be recorded elsewhere (e.g., in associated spreadsheets or design documents).

- **Data Source Documentation:**  
  Further details about the dataset (like experimental design and acquisition settings) are documented in the `datasets.md` file within the design documents folder.
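
Because the TIFF files may hold either color or grayscale intensities, it can help to normalize every image to a single 2-D floating-point array before analysis. The following is a minimal sketch of that step; `to_grayscale_float` and `load_tiff` are hypothetical helper names, not functions from `Otsu.py`, and the sketch assumes channels are stored last (height × width × channels).

```python
import numpy as np
from skimage import io, util
from skimage.color import rgb2gray


def to_grayscale_float(image):
    """Return a 2-D float image scaled to [0, 1] from a grayscale, RGB, or RGBA array."""
    image = np.asarray(image)
    if image.ndim == 3 and image.shape[-1] == 4:
        image = image[..., :3]  # drop the alpha channel
    if image.ndim == 3 and image.shape[-1] == 3:
        return rgb2gray(image)  # luminance-weighted conversion, float output
    if image.ndim == 2:
        return util.img_as_float(image)  # rescales integer dtypes to [0, 1]
    raise ValueError(f"Unsupported image shape: {image.shape}")


def load_tiff(path):
    """Read a TIFF from disk and convert it to a grayscale float image."""
    return to_grayscale_float(io.imread(path))


# Quick check on synthetic arrays (no file needed).
gray_demo = to_grayscale_float(np.full((4, 5), 255, dtype=np.uint8))
rgb_demo = to_grayscale_float(np.full((4, 5, 3), 255, dtype=np.uint8))
print(gray_demo.shape, rgb_demo.shape)
```

Multi-channel confocal stacks saved channel-first would need an explicit channel selection before this conversion; the sketch raises an error rather than guessing.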

---

## 3. Installation and Environment Setup

To run the analysis, a few Python libraries must be installed. This notebook assumes that you are working locally or on a cluster (e.g., Great Lakes) with the appropriate Python environment.

### 3.1 Installing Required Packages

In a notebook cell or terminal, install the following dependencies:

```python
!pip install numpy pandas matplotlib scikit-image scipy
```

> **Note for Cluster Users:**  
> When running on a cluster environment, ensure you load the appropriate Python module or virtual environment before installing packages.
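
After installation, a quick check that each dependency can be imported can save time, especially on a cluster where the active environment may differ from the one you installed into. This is a small sketch using only the standard library; `find_missing` is a hypothetical helper name.

```python
import importlib.util

# pip package name -> import name (note: scikit-image is imported as "skimage").
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-image": "skimage",
    "scipy": "scipy",
}


def find_missing(packages):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = find_missing(REQUIRED)
print("Missing packages:", missing or "none")
```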

### 3.2 Importing Packages and the Analysis Tool

The core analysis is implemented in the provided `Otsu.py` script. In the notebook, you can import the necessary modules (or simply run the script as a standalone program). Here’s an example of importing the required packages and the analysis module:

```python
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from skimage import io, filters, morphology, measure, color, util
from skimage.color import rgb2gray
from scipy import ndimage as ndi
from scipy.spatial import ConvexHull

# Assuming Otsu.py is in the current working directory, import it as a module.
import Otsu  # exposes the functions otsu_threshold, process_image, and main
```

---

## 4. Running the Analysis on Sample Images

The analysis pipeline performs the following steps for each input image:

1. **Image Loading and Preprocessing:**  
   The image is loaded (as color or grayscale) and then converted to grayscale if necessary.

2. **Otsu Thresholding and Binary Segmentation:**  
   The `otsu_threshold()` function computes Otsu’s threshold on the image and creates a binary mask with holes filled.

3. **Cell Segmentation and Filtering:**  
   For a series of minimum cell area thresholds (from large to small), the script removes small objects, performs morphological closing/dilation, labels connected regions, and applies criteria (such as a minimum area and intensity ratio) to filter out unwanted cells.

4. **Classification and Recording:**  
   Cells passing the filtering conditions are classified (here, as “Apical-out”) and their metrics are stored.

5. **Convex Hull Analysis:**  
   After processing cells, a convex hull is computed over the union of segmented areas. The ratio between the union cell area and the convex hull area is calculated.

6. **Visualization and Output:**  
   An overlay image is generated, displaying the segmented cells (in blue) along with the convex hull (outlined in yellow). Finally, cell data and convex hull summaries are saved to an Excel file for further inspection.
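
The thresholding, segmentation, and filtering steps above can be sketched roughly as follows. This is not the code in `Otsu.py`: the function name `segment_regions`, the single `min_area` value (the script iterates over a series of thresholds), and the use of `scipy.ndimage` for hole filling and closing are all assumptions made for illustration. The intensity-ratio criterion is left out because its definition is not given here.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import filters, measure


def segment_regions(gray, min_area=50):
    """Otsu threshold -> fill holes -> closing -> label -> area filter."""
    threshold = filters.threshold_otsu(gray)
    mask = ndi.binary_fill_holes(gray > threshold)
    mask = ndi.binary_closing(mask)  # bridge small gaps in the mask
    labels = measure.label(mask)     # connected-component labeling
    records = []
    for region in measure.regionprops(labels):
        if region.area < min_area:
            continue  # discard regions smaller than the area threshold
        region_pixels = gray[labels == region.label]
        records.append({
            "Cell ID": int(region.label),
            "Total Area": int(region.area),
            "Mean Intensity": float(region_pixels.mean()),
        })
    return threshold, labels, records


# Synthetic test image: dark background with one large and one tiny bright square.
demo = np.full((64, 64), 0.1)
demo[10:30, 10:30] = 0.9  # 400-pixel region, kept
demo[45:48, 45:48] = 0.9  # 9-pixel region, removed by the area filter
threshold, labels, records = segment_regions(demo)
print(records)
```

Running the function once per value in a descending list of `min_area` thresholds would approximate the multi-threshold behavior described in step 3.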

### 4.1 Example Notebook Code for Running the Analysis

Below is a notebook cell that encapsulates the key steps of the analysis using the functions from `Otsu.py`:

```python
# Example: Running analysis on a sample image
# (Update the image path as needed)
sample_image_name = "Sample_A.tif"
sample_image_path = "path/to/your/images/Sample_A.tif"

# Process the sample image and retrieve analysis results
cell_data_list, convex_hull_summary = Otsu.process_image(sample_image_name, sample_image_path)

# Display output summary for the sample image
print("Cell Data for", sample_image_name)
for cell in cell_data_list:
    print(cell)

print("\nConvex Hull Summary for", sample_image_name)
print(convex_hull_summary)
```

*Example output (from a run on image `WIP011G11_F1`; titles and values will differ for your own images):*

```
WIP011G11_F1 - Otsu Threshold: 0.45
Cell Data for WIP011G11_F1
{'Image Title': 'WIP011G11_F1', 'Cell ID': 1, 'Min Cell Area Threshold': 15, 'Total Area': 120, 'Classification': 'Apical-out', 'Mean Intensity Ratio': 0.78}
{'Image Title': 'WIP011G11_F1', 'Cell ID': 2, 'Min Cell Area Threshold': 10, 'Total Area': 95, 'Classification': 'Apical-out', 'Mean Intensity Ratio': 0.65}
...

Convex Hull Summary for WIP011G11_F1
{'Image Title': 'WIP011G11_F1', 'Union Cell Area': 1500, 'Convex Hull Area': 2500, 'Union Cell Area / Convex Hull Area': 0.60}
```

### 4.2 Running the Full Batch Analysis

If you have multiple images (as defined in the `image_files` dictionary within the script), you can run the entire batch analysis using the main function:

```python
!python Otsu.py
```

This will process all images listed in the script, print threshold values to the terminal, generate overlay images (saved on the Desktop), and export the analysis results into an Excel file (named `updated_cell_analysis.xlsx`) with two sheets:
- **Cell Data:** Details each segmented cell and its computed parameters.
- **Convex Hull Summary:** Summarizes the overall cell segmentation by computing the ratio of the cell union area to its convex hull area.

### 4.3 Example Output Summaries

Typical convex hull summary outputs, taken from the accompanying slide summary (the "Apical-out _ Convex Hull Area" image file), might look like:

| Image Title  | Union Cell Area | Convex Hull Area | Union Cell Area / Convex Hull Area |
|--------------|-----------------|------------------|------------------------------------|
| WIP006_G10A  | 1               | 0.5961           | 0.60                               |
| WIP006_G10B  | 1               | 0.2448           | 0.24                               |
| WIP006_G10C  | 1               | 0.6504           | 0.65                               |
| WIP006_G10D  | 0.66            | 0.5321           | 1.00                               |
| WIP006_G11A  | 0.98            | 0.8977           | 0.98                               |

These outputs help assess how tightly the segmented cell regions pack together within the overall convex hull.
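
To make the ratio concrete, the sketch below computes it for a binary mask. It measures both areas in pixels and uses `skimage.morphology.convex_hull_image`; this is an assumption for illustration and may differ from how `Otsu.py` computes the hull (for example, via `scipy.spatial.ConvexHull` on boundary coordinates). The name `union_to_hull_ratio` is hypothetical.

```python
import numpy as np
from skimage.morphology import convex_hull_image


def union_to_hull_ratio(mask):
    """Return union area, convex hull area (both in pixels), and their ratio."""
    mask = np.asarray(mask, dtype=bool)
    union_area = int(mask.sum())
    if union_area == 0:
        # No segmented pixels: the ratio is undefined.
        return {"Union Cell Area": 0, "Convex Hull Area": 0,
                "Union Cell Area / Convex Hull Area": float("nan")}
    hull_area = int(convex_hull_image(mask).sum())
    return {
        "Union Cell Area": union_area,
        "Convex Hull Area": hull_area,
        "Union Cell Area / Convex Hull Area": union_area / hull_area,
    }


# One compact region fills its hull; two separated regions leave a gap inside it.
compact = np.zeros((50, 60), dtype=bool)
compact[10:20, 10:20] = True
spread = np.zeros((50, 60), dtype=bool)
spread[10:20, 5:15] = True
spread[10:20, 35:45] = True

compact_summary = union_to_hull_ratio(compact)
spread_summary = union_to_hull_ratio(spread)
print(compact_summary)
print(spread_summary)
```

A ratio near 1 means the segmented regions nearly fill their hull, while lower values indicate gaps between or around the regions.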

---

## 5. Results Summary and Biological Interpretation

### 5.1 Computing and Displaying a Summary

In the notebook, you can load the Excel file and compute a summary of the cell data. For example:

```python
# Read the summary Excel file generated by the analysis
summary_file = os.path.expanduser("~/Desktop/updated_cell_analysis.xlsx")
df_cells = pd.read_excel(summary_file, sheet_name="Cell Data")
df_convex = pd.read_excel(summary_file, sheet_name="Convex Hull Summary")

# Display summary statistics for the cell data
cell_summary = df_cells.describe()
convex_summary = df_convex.describe()

print("Summary Statistics - Cell Data:")
print(cell_summary)

print("\nSummary Statistics - Convex Hull Summary:")
print(convex_summary)
```

*This code will output summary statistics (e.g., count, mean, std, min, and max) that provide insight into the distributions of cell areas and intensity ratios.*
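
Beyond `describe()`, a per-image aggregation is often easier to interpret. The sketch below builds a small DataFrame from the two example rows shown in Section 4.1 and groups it by image; the output column names (`n_cells`, `total_area`, `mean_intensity_ratio`) are chosen here for illustration. The same `groupby` call should work on `df_cells` if the sheet uses these column headers.

```python
import pandas as pd

# Rows copied from the example output in Section 4.1.
cell_rows = [
    {"Image Title": "WIP011G11_F1", "Cell ID": 1, "Min Cell Area Threshold": 15,
     "Total Area": 120, "Classification": "Apical-out", "Mean Intensity Ratio": 0.78},
    {"Image Title": "WIP011G11_F1", "Cell ID": 2, "Min Cell Area Threshold": 10,
     "Total Area": 95, "Classification": "Apical-out", "Mean Intensity Ratio": 0.65},
]
df_example = pd.DataFrame(cell_rows)

# One row per image: number of cells, summed area, and mean intensity ratio.
per_image = (
    df_example.groupby("Image Title")
    .agg(
        n_cells=("Cell ID", "count"),
        total_area=("Total Area", "sum"),
        mean_intensity_ratio=("Mean Intensity Ratio", "mean"),
    )
    .reset_index()
)
print(per_image)
```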

### 5.2 Answering the Biological Question

The analysis shows that cells classified as “Apical-out” differ noticeably in their segmentation features. For instance:

- **Key Observations:**  
  - Specific images show relatively high union-to-convex hull area ratios (e.g., 0.60 to 0.98), indicating that the segmented regions (cells) closely fill their convex hull, which may suggest a compact spatial arrangement.
  - The consistent classification of cells as “Apical-out” across thresholds implies that the tool is robust in capturing the structural features of cells with distinct apical (outer) characteristics.

- **Biological Conclusion:**  
  These findings suggest that the spatial distributions and morphological features captured by the segmentation tool are valuable for identifying cells with an apical-out configuration. This configuration could be correlated with specific biological phenomena such as cell polarization, tissue organization, or differential responses in development and disease. Answering the LPA question directly also requires comparing these metrics between LPA-treated and control images. More broadly, the tool is applicable to quantitative biological data analysis where automated, reproducible segmentation is critical.
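
To relate these metrics to the LPA question, each image needs a condition label, which (as noted in Section 2) may live in a separate spreadsheet. The sketch below shows one way to join such labels to the convex hull summary and summarize the area ratio per condition. The `Condition` column, the image titles, and all values are placeholders invented for illustration; they are not results from this dataset.

```python
import pandas as pd

RATIO_COL = "Union Cell Area / Convex Hull Area"


def summarize_by_condition(df_hull, conditions):
    """Attach a condition label to each image and summarize the area ratio."""
    merged = df_hull.merge(conditions, on="Image Title", how="left",
                           validate="one_to_one")
    unlabeled = merged.loc[merged["Condition"].isna(), "Image Title"].tolist()
    if unlabeled:
        raise ValueError(f"No condition recorded for: {unlabeled}")
    return merged.groupby("Condition")[RATIO_COL].agg(["count", "mean", "median"])


# Placeholder data for illustration only; NOT measured results.
df_hull_demo = pd.DataFrame({
    "Image Title": ["img_01", "img_02", "img_03", "img_04"],
    RATIO_COL: [0.50, 0.70, 0.90, 0.80],
})
conditions_demo = pd.DataFrame({
    "Image Title": ["img_01", "img_02", "img_03", "img_04"],
    "Condition": ["control", "control", "LPA", "LPA"],
})

summary = summarize_by_condition(df_hull_demo, conditions_demo)
print(summary)
```

Raising an error for unlabeled images is a deliberate choice in this sketch, so that a missing metadata row cannot silently shrink one of the groups.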

---

## 6. Running the Analysis on a Cluster (Great Lakes)

For users running the analysis on the Great Lakes cluster, you can create a shell script (e.g., `run_analysis.sh`) to ensure the analysis is executed properly and outputs are saved for review:

```bash
#!/bin/bash
# Load required modules (if needed)
module load python/3.8

# Execute the analysis script and redirect output to a summary file.
python Otsu.py > ~/Desktop/summary_output.txt

# The script saves the summary Excel file and the overlay images on the Desktop.
```

Place this script in the working directory, make it executable (`chmod +x run_analysis.sh`), and run it from the cluster environment.

---

## 7. Conclusion

This tutorial notebook demonstrated a complete workflow to:

- Address the biological question regarding spatial cell arrangements and apical-out configurations.
- Detail the underlying data (microscopy images and associated metadata).
- Provide installation instructions and detailed code examples from the `Otsu.py` analysis script.
- Compute and display summary statistics from the differential segmentation and convex hull analysis.
- Conclude that the tool is effective for automated biological image analysis, offering reproducible segmentation and useful quantification that can support further biological insights.
 
