<table>
  <tr>
    <td><p style="font-size:45px; color: #55BBD2">Analysis of light microscopy images in Python</p></td>
    <td><img src="../ressources/lmb_logo.svg" alt="LMB Logo" width="500" height="600" align="right"></td>
  </tr>
</table>
<table>
  <tr>
    <td><p style="font-size:15px; color: #55BBD2">Version: September 2025</p></td>
  </tr>
</table>

# Part 3 Cargo protein tethering

<b>Problem:</b> In wild-type (WT) cells, vesicles containing the cargo syntaxin-16 tether to the Golgi. In this example, we are interested in quantifying the tethering of these cargo-containing vesicles to the mitochondria when the protein TBC1D23 is relocated to the mitochondria. We want to quantify the colocalization between the mitochondria and the cargo at the single-cell level, in order to measure the effect of TBC1D23 mutants on the effectiveness of cargo tethering. Note that not all cells are transfected.

<b>Dataset 1:</b> Zeiss1344.lsm

![title](../ressources/data2.png)

<b>Credit:</b> Alison Gillingham from Sean Munro's group at the MRC-LMB. Reference: Jérôme Cattin-Ortolá et al., [Cargo selective vesicle tethering: The structural basis for binding of specific cargo proteins by the Golgi tether component TBC1D23](https://www.science.org/doi/10.1126/sciadv.adl0608). Sci. Adv. 10, eadl0608 (2024).

<b>Workflow:</b>

<img src="../ressources/workflow/workflow2.png" alt="drawing" width="800"/>

<b>Objectives:</b>
- Get to know Cellpose and use it to perform cellular segmentation (models.CellposeModel, model.eval)
- Use the Pearson correlation coefficient (PCC) to evaluate colocalization (scipy.stats.pearsonr)
- Run a simple statistical analysis (scipy.stats.mannwhitneyu, sns.violinplot, sns.stripplot, statannotations.Annotator, plt.savefig)

## Load data for colocalization analysis

In [None]:
from pathlib import Path
from bioio import BioImage

data_folder = Path("../data")
image_path = data_folder / "Zeiss1344.lsm"
img = BioImage(image_path).data
print(img.shape)
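
The array returned by `BioImage(...).data` is 5D, with axes ordered `TCZYX` (time, channel, Z, Y, X). As a minimal sketch of how a single 2D channel can be pulled out of such an array, the example below uses a synthetic array in place of `Zeiss1344.lsm`; its shape and the name `img_demo` are assumptions made for illustration, not the real image dimensions.

```python
import numpy as np

# Synthetic stand-in for BioImage(...).data; the shape is hypothetical.
# Axis order assumed to be TCZYX: 1 time point, 4 channels, 1 Z plane.
rng = np.random.default_rng(0)
img_demo = rng.integers(0, 255, size=(1, 4, 1, 64, 48), dtype=np.uint8)

# Select channel 2 and drop the singleton T and Z axes to get a 2D (Y, X) image.
channel2 = img_demo[0, 2, 0]

# Slicing all axes and then squeezing the singleton ones gives the same 2D image.
channel2_squeezed = img_demo[:, 2, :, :, :].squeeze()

print(channel2.shape)
```

Direct indexing is explicit about which time point and Z plane are used, whereas `squeeze()` only works as expected when those axes have length 1.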

## Check for GPU Availability
Check whether a CUDA-compatible GPU is available. A GPU significantly accelerates deep learning model training and inference.

In [None]:
import torch

print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())

use_gpu = torch.cuda.is_available()
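
If the notebook may also run on machines without PyTorch or without a GPU, the check can be wrapped in a small helper that falls back to the CPU. This is only a sketch: `pick_device` is a hypothetical helper name, not part of PyTorch or Cellpose.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed: only the CPU can be used.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

Segmentation still works on the CPU; it is simply slower.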

## Segment the cell using CellPose (Deep Learning approach)

[Cellpose](https://cellpose.readthedocs.io/en/latest/) is a generalist cellular segmentation algorithm trained on diverse datasets, which works on a wide range of data types.

Key features: 
- Instance segmentation
- Easy-to-use GUI
- A “zoo” of pre-trained models
- Well-documented Python API
- Simple fine-tuning of models via Human-in-the-Loop retraining


<img src="../ressources/cellpose_w.png" alt="drawing" width="800"/>


In [None]:
import matplotlib.pyplot as plt
from cellpose import models

# Get the pretrained model
model = models.CellposeModel(gpu=use_gpu, model_type="cyto2")

# To segment the cells in the image, we use channels 2 (mitochondria) and 3 (nuclei) for guidance.
mask, flows, diams = model.eval(
    img, channels=[2, 3], diameter=400, cellprob_threshold=1.0
)
"""
Parameters:
-----------
img : np.ndarray
    Input image to segment.

channels : list of int
    Specifies which channels to use from the image:
    - channels[0]: main input channel (e.g. cytoplasmic)
    - channels[1]: optional auxiliary channel (e.g. nuclear)
    Example: [2, 3] uses channel 2 and 3 as input.

diameter : float or int
    Estimated object diameter (in pixels). Used to rescale the image internally to match the training conditions of the model
    - A good estimate improves segmentation accuracy.

cellprob_threshold : float
    Threshold applied to the internal "cell probability" map output by the model.
    - Higher values result in stricter segmentation (fewer pixels are kept as cells).
    - Default is 0.0; the Cellpose documentation describes values between -6 and 6.

Returns:
--------
mask : np.ndarray
    Segmentation mask with labeled regions.

flows : list of np.ndarray
    List of flow-related outputs used in Cellpose for segmentation and visualization.

    - flows[0]: HSV-encoded XY flow image (H, W, 3), for visualization only.
    - flows[1]: Raw XY flow vectors (2, H, W), directing pixel movement toward cell centers.
    - flows[2]: Cell probability map (H, W); used to threshold foreground pixels.
    - flows[3]: Final pixel coordinates after Euler integration (2, H, W); used to define cell masks.


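diams : np.ndarray
    Third value returned by CellposeModel.eval. According to the Cellpose API
    documentation, CellposeModel.eval returns (masks, flows, styles), so this is
    the style vector summarizing the image rather than a diameter estimate.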
"""


# Display the results
fig, ax = plt.subplots(1, 7, figsize=(15, 5))
for k in range(4):
    ax[k].imshow(img[:, k, :, :, :].squeeze())
    ax[k].set_axis_off()
    ax[k].set_title("Ch0" + str(k))
ax[4].imshow(mask)
ax[4].set_axis_off()
ax[4].set_title("Labels")
ax[5].imshow(flows[0])
ax[5].set_axis_off()
ax[5].set_title("Flows")
ax[6].imshow(flows[2])
ax[6].set_axis_off()
ax[6].set_title("Cell Probability")
fig.tight_layout()
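
The label mask returned by Cellpose is an integer image in which 0 is the background and each segmented cell carries its own positive label. As a minimal sketch of how such a mask can be inspected, the example below counts the cells and their pixel areas on a small hand-made mask (`mask_demo` is hypothetical data, not the Cellpose output).

```python
import numpy as np

# Hypothetical 2D label mask: 0 = background, 1 and 2 = two segmented cells.
mask_demo = np.zeros((6, 8), dtype=np.int32)
mask_demo[1:3, 1:4] = 1  # cell 1 covers 2 x 3 = 6 pixels
mask_demo[3:6, 5:8] = 2  # cell 2 covers 3 x 3 = 9 pixels

# Number of cells = number of distinct non-zero labels.
labels = np.unique(mask_demo)
labels = labels[labels != 0]
n_cells = labels.size

# Pixel area of every label (index 0 is the background).
areas = np.bincount(mask_demo.ravel())

print(n_cells, areas[1:])
```

A quick check like this helps to judge whether a change of `diameter` or `cellprob_threshold` merged or split cells.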

## Mask overlay in Napari

In [None]:
import napari

viewer = napari.Viewer()
viewer.add_image(
    img,
    channel_axis=1,
    name=["Golgi", "Cargo", "Mitochondria", "Nuclei"],
    colormap=["green", "magenta", "cyan", "blue"],  # Assign colors to each channel
    blending="additive",  # Better for multichannel visualization
)

# Add segmentation mask as labels layer
viewer.add_labels(
    mask,
    name="Segmentation Mask",
    opacity=0.7,  # Make mask semi-transparent
    blending="translucent",  # Good blending mode for overlays
)
# viewer.show()

<div class="alert alert-success">

#### Exercise       

1. The cell segmentation above was obtained with `diameter` set to 400 pixels and `cellprob_threshold` set to 1.0. <br>
2. Observe how the segmented regions change when you modify `diameter` or `cellprob_threshold`.
</div>

## Measure the colocalization coefficient 

Pearson's correlation coefficient (PCC) is widely used to quantify the colocalization of one object $A$ with another object $B$. It can be calculated using the following formula [1]:

$$\text{PCC} = \frac{\sum_i\left( A_i - A_{aver} \right) \cdot \left( B_i - B_{aver} \right)}{\sqrt{\left[ \sum_i \left( A_i - A_{aver} \right)^2 \cdot \sum_i \left( B_i - B_{aver} \right)^2 \right]}}$$

$A$: cargo protein, with $A_i$ being the intensity at pixel $i$ and $A_{aver}$ the average intensity \
$B$: organelles (mitochondria or Golgi), with $B_i$ being the intensity at pixel $i$ and $B_{aver}$ the average intensity

PCC values range from –1 to 1, where 1 indicates perfect colocalization, 0 indicates no linear relationship between the two signals, and –1 indicates complete exclusion. <br>

Reference [1]: Manders, E.M., Verbeek, F.J. and Aten, J.A., 1993. Measurement of co‐localization of objects in dual‐colour confocal images. Journal of microscopy, 169(3), pp.375-382.

We will calculate the colocalization between the cargo protein and either mitochondria or Golgi using scipy.stats.pearsonr, to quantify cargo tethering with the following notations:
- PCC1: PCC between the cargo protein and Golgi channels
- PCC2: PCC between the cargo protein and mitochondria channels

At the single-cell level, cells are classified as "transfected" or "non-transfected" by thresholding the integrated intensity in the mitochondria channel. PCC values are then compared between cells with high (transfected) and low (non-transfected) levels of TBC1D23 in the mitochondria channel.
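
To make the formula above concrete, here is a minimal sketch that computes the PCC term by term with NumPy and compares it with `scipy.stats.pearsonr`. The intensity vectors are synthetic, and the helper name `pcc` is introduced only for illustration.

```python
import numpy as np
import scipy.stats


def pcc(a, b):
    """Pearson correlation coefficient computed directly from the formula."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()  # A_i - A_aver
    db = b - b.mean()  # B_i - B_aver
    return (da * db).sum() / np.sqrt((da**2).sum() * (db**2).sum())


rng = np.random.default_rng(0)
cargo = rng.random(500)  # synthetic pixel intensities
organelle_coloc = 2.0 * cargo + 0.1 * rng.random(500)  # follows the cargo signal
organelle_excl = -cargo  # exactly anti-correlated
organelle_indep = rng.random(500)  # unrelated to the cargo signal

r_manual = pcc(cargo, organelle_coloc)
r_scipy = scipy.stats.pearsonr(cargo, organelle_coloc)[0]

print(r_manual, r_scipy)
print(pcc(cargo, organelle_excl), pcc(cargo, organelle_indep))
```

The three synthetic organelle signals illustrate the three regimes: a PCC close to 1, a PCC of –1, and a PCC close to 0.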

In [None]:
import scipy.stats
import pandas as pd
from skimage import measure


# Extract region properties from a labelled mask
props = measure.regionprops(mask)

colcoeff = []  # List to store PCC results and associated measurements for each labelled region
intensity_threshold = 1e6  # Integrated-intensity threshold above which a cell is considered transfected

# Extract intensity values for each channel at the coordinates of each labelled region
for p in props:
    ch00 = img[:, 0, :, :].squeeze()[p.coords[:, 0], p.coords[:, 1]]  # Channel 0: Golgi marker
    ch01 = img[:, 1, :, :].squeeze()[p.coords[:, 0], p.coords[:, 1]]  # Channel 1: Cargo protein
    ch02 = img[:, 2, :, :].squeeze()[p.coords[:, 0], p.coords[:, 1]]  # Channel 2: Mitochondria

    # Determine transfection condition based on the intensity in the mitochondria channel
    sum_mitochondria = ch02.sum()
    if sum_mitochondria >= intensity_threshold:
        c = "Transfected"
    else:
        c = "Not transfected"

    # Compute the Pearson correlation coefficient between:
    # - Golgi and cargo protein
    r1, pv1 = scipy.stats.pearsonr(ch00, ch01)
    # - cargo protein and mitochondria
    r2, pv2 = scipy.stats.pearsonr(ch01, ch02)
    # pv1 and pv2 are the two-sided p-values for the null hypothesis of no correlation

    # Append results as a dictionary to the list
    colcoeff.append(
        {
            "label": p.label,  # Object label
            "PCC1": r1,  # Pearson correlation: Golgi vs Cargo
            "PCC2": r2,  # Pearson correlation: Cargo vs Mitochondria
            "Area": p.area,  # Area of the object
            "Integrated intensity": sum_mitochondria,  # Total mitochondria intensity
            "Condition": c,  # Transfection status
        }
    )

# Convert the list of dictionaries to a pandas DataFrame for easier downstream analysis
colcoeff = pd.DataFrame.from_records(colcoeff)
# Display the DataFrame
colcoeff
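
The per-cell logic can be tried out on synthetic data before running it on the real image. The sketch below uses a boolean mask (`mask_demo == label`) in place of `regionprops` coordinates to collect the pixels of each cell, then applies the same two steps: PCC computation and thresholding of the integrated intensity. All arrays and the value of `demo_threshold` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical label mask with two cells covering the image.
mask_demo = np.zeros((20, 40), dtype=np.int32)
mask_demo[:, :20] = 1
mask_demo[:, 20:] = 2

# Synthetic cargo and mitochondria channels (not real data).
cargo = rng.random((20, 40))
mito = np.empty_like(cargo)
# Cell 1: bright mitochondria channel that follows the cargo signal.
mito[:, :20] = 10.0 * cargo[:, :20] + rng.random((20, 20))
# Cell 2: dim mitochondria channel unrelated to the cargo signal.
mito[:, 20:] = 0.1 * rng.random((20, 20))

demo_threshold = 500.0  # arbitrary cut-off chosen for this synthetic example

results = {}
for label in (1, 2):
    inside = mask_demo == label  # boolean mask of the cell
    a, b = cargo[inside], mito[inside]  # 1D arrays of pixel intensities
    r = np.corrcoef(a, b)[0, 1]  # Pearson correlation coefficient
    status = "Transfected" if b.sum() >= demo_threshold else "Not transfected"
    results[label] = (r, status)
    print(label, round(r, 3), status)
```

Because the threshold is applied to a sum over pixels, it depends on cell area as well as brightness, which is worth keeping in mind when choosing its value.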

## Statistical analysis
We would like to analyze the tethering of the cargo to the mitochondria caused by the relocalization of the protein TBC1D23. To do so, we compare the PCC between the cargo and mitochondria signals (PCC2) in two groups of cells: transfected (with high TBC1D23 content) and non-transfected (with low TBC1D23 content).

### P-value
The p-value is obtained from a Mann–Whitney U test (scipy.stats.mannwhitneyu) comparing PCC2 values between transfected and non-transfected cells. It is the probability of observing a difference at least as large as the one measured if there were no actual relationship between mitochondrial content (TBC1D23) and cargo–mitochondria colocalization. A smaller p-value indicates stronger evidence that the two groups differ in terms of PCC2.

In [None]:
# Filter the DataFrame to separate objects based on their transfection condition
grp1 = colcoeff[colcoeff["Condition"] == "Transfected"]  # Subset of transfected objects
grp2 = colcoeff[colcoeff["Condition"] == "Not transfected"]  # Subset of non-transfected objects

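# Perform a Mann-Whitney U test (non-parametric test) to compare the PCC2 values
# between transfected and non-transfected groups.
# Note: despite its name, `pvalue` holds the full test result (U statistic and p-value);
# the p-value itself is accessed as `pvalue.pvalue`.
pvalue = scipy.stats.mannwhitneyu(grp1["PCC2"], grp2["PCC2"])
print(pvalue)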
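
To get a feel for the test, it can be run on two small synthetic groups whose values do not overlap. The numbers below are made up for illustration and are not measurements from the dataset.

```python
import numpy as np
import scipy.stats

# Synthetic PCC values (not real measurements): the two groups do not overlap.
low = np.linspace(0.00, 0.18, 10)  # e.g. non-transfected cells
high = np.linspace(0.60, 0.78, 10)  # e.g. transfected cells

result = scipy.stats.mannwhitneyu(high, low)

# The result bundles the U statistic and the p-value.
print(result.statistic, result.pvalue)
```

The Mann-Whitney U test only uses the ranks of the values, so it makes no assumption that the PCC values are normally distributed.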

## Display of the statistical result
- Plot the PCC values using sns.violinplot and sns.stripplot. A violin plot shows the distribution of a numeric variable across categories. A strip plot is a scatter plot in which individual data points are plotted along a categorical axis (like "Condition"), with some jitter (random horizontal spreading) to prevent overlap.
- Annotate the statistical result using statannotations.Annotator
- Save the result to a PDF file using plt.savefig

In [None]:
import seaborn as sns
from statannotations.Annotator import Annotator

# Define the two experimental conditions in desired plotting order
conditions_list = ["Not transfected", "Transfected"]

# Create a dictionary of plotting parameters for reuse
plotting_parameters = {
    "data": colcoeff,  # DataFrame containing PCC values
    "x": "Condition",  # X-axis will show the condition (transfection status)
    "y": "PCC2",  # Y-axis will show the PCC between mitochondria and Cargo
    "order": conditions_list,  # Force consistent order of conditions
}

# Plot the PCC values
ax = sns.violinplot(**plotting_parameters, color="0.9") # Unpacks the dictionary into individual keyword arguments
sns.stripplot(**plotting_parameters, jitter=True, size=2) 

# Set up the Annotator to add statistical result on the plot
# Pairs to compare are given as a list of tuples; here there is a single pair
annotator = Annotator(ax, [tuple(conditions_list)], **plotting_parameters)
annotator.set_pvalues([pvalue.pvalue])
annotator.configure(loc="outside") # Configure annotation display
annotator.annotate() # Draw the annotation on the plot

ax.set_title(
    "Relocation of the \nCargo to the Mitochondria",
    y=1.0, # Vertical position (1.0 = top of axes)
    pad=-10, # Move title closer to plot area (negative = down)
    c="red", # Title text color
    horizontalalignment="center", # Center the title horizontally
)

# Save the figure to PDF with tight layout to avoid cutting off elements
plt.savefig("PCC_Cargo_to_Mitochondria.pdf", format="pdf", bbox_inches="tight")
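
Plot annotations often summarize a p-value with stars rather than a number. The sketch below shows one common mapping; the thresholds are a widespread convention assumed here, and `pvalue_to_stars` is a hypothetical helper, not a statannotations function. Check the legend printed by the annotator for the thresholds actually used on the figure.

```python
def pvalue_to_stars(p):
    """Map a p-value to a star label (conventional thresholds, assumed here)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p <= 1e-4:
        return "****"
    if p <= 1e-3:
        return "***"
    if p <= 1e-2:
        return "**"
    if p <= 5e-2:
        return "*"
    return "ns"  # not significant


for p in (0.2, 0.03, 0.005, 0.0005, 0.00001):
    print(p, pvalue_to_stars(p))
```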


<div class="alert alert-success">

#### Exercise       

Display the result of the statistical analysis for the tethering of the cargo to the Golgi (PCC1). <br>
Make sure to update both the figure title and the file name under which the figure is saved.
   
</div>