<h2>3D stack - Single image - Marker+ based on APOC Object Classifier</h2>

This notebook processes a single 3D stack (.czi or .nd2 file) and lets you:

1. Inspect your images in Napari.
2. Define regions of interest (ROIs) using labels in Napari. Store said ROIs as .tiff files if needed.
3. Predict nuclei labels and store them as .tiff files for further processing.
4. Count the cells positive for a marker using pre-trained APOC object classifiers (scikit-learn random forest).
5. Display positive cells in Napari.
6. Extract and save number of positive cells in a .csv file (SP_marker_+_label_obj_class).

In [None]:
from pathlib import Path
import tifffile
import napari
import os
from tqdm import tqdm
import numpy as np
import pandas as pd
import pyclesperanto_prototype as cle
import apoc
from skimage import measure
from utils_stardist import get_gpu_details, list_images, read_image, extract_nuclei_stack, get_stardist_model, maximum_intensity_projection, save_rois, simulate_cytoplasm_chunked_3d, simulate_cell_chunked_3d, simulate_cytoplasm, simulate_cell, segment_nuclei, remove_labels_touching_roi_edge

get_gpu_details()
cle.select_device("RTX")

<h3>Define the directory where your images are stored (.nd2 or .czi files)</h3>

In [None]:
# Copy the path where your images are stored, you can use absolute or relative paths to point at other disk locations
directory_path = Path("../raw_data/test_data")

# Define the channels you want to analyze using the following structure:
# markers = [(channel_name, channel_nr, cellular_location),(..., ..., ...)]
# cellular locations can be "nucleus", "cytoplasm" or "cell" (cell being the sum volume of nucleus and cytoplasm)
# Remember in Python one starts counting from 0, so your first channel will be 0
# e.g. markers = [("ki67", 0, "nucleus"), ("neun", 1, "cell"), ("calbindin", 2, "cytoplasm")]
markers = [("ki67", 0, "nucleus"), ("neun", 1, "cell"), ("calbindin", 2, "cytoplasm")]

# List the .czi and .nd2 files in the directory
images = list_images(directory_path)

images

<h3>Open each image in the directory</h3>
Select an image by changing the index in <code>image = images[0]</code> below. Setting <code>slicing_factor_xy</code> or <code>slicing_factor_z</code> lowers the resolution but speeds up processing, so check the results after downsampling.

If you have not generated nuclei predictions before, set the <code>nuclei_channel</code>, <code>n_tiles</code>, <code>segmentation_type</code> and <code>model_name</code> values.
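
To illustrate the resolution/speed trade-off, here is a minimal sketch of downsampling by strided slicing on a synthetic stack. It assumes a (channel, z, y, x) axis order and that <code>read_image</code> simply keeps one out of every n pixels or slices; neither detail is shown in this notebook, and <code>downsample_stack</code> is a hypothetical helper, not part of <code>utils_stardist</code>.

```python
import numpy as np

def downsample_stack(img, slicing_factor_xy=None, slicing_factor_z=None):
    """Hypothetical helper: keep every n-th pixel/slice of a (C, Z, Y, X) stack."""
    step_xy = slicing_factor_xy or 1  # None means no downsampling in xy
    step_z = slicing_factor_z or 1    # None means no downsampling in z
    return img[:, ::step_z, ::step_xy, ::step_xy]

# Synthetic stack: 4 channels, 10 z-slices, 64 x 64 pixels
img = np.zeros((4, 10, 64, 64), dtype=np.uint16)
small = downsample_stack(img, slicing_factor_xy=2, slicing_factor_z=2)
print(img.shape, "->", small.shape)
```

With both factors set to 2 the stack keeps one eighth of the original voxels, which is where the speed-up comes from and why fine detail is discarded.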

In [None]:
# Explore each image to analyze (0 defines the first image in the directory)
image = images[3]

# Image size reduction (downsampling) to improve processing times (slicing, not lossless compression)
# Now, in addition to xy, you can downsample across your z-stack
slicing_factor_xy = None # Use 2 or 4 for downsampling in xy (None for lossless)
slicing_factor_z = None # Use 2 to select 1 out of every 2 z-slices

# Define the nuclei channel index (remember that Python starts counting from 0)
nuclei_channel = 3

# The n_tiles parameter defines the number of tiles the input volume/image will be divided into along each dimension (z, y, x) during prediction. 
# This is useful for processing large images that may not fit into memory at once.
# While tiling can handle memory limitations, chopping the image into smaller chunks increases
# the processing time for stitching the predictions back together. 
# Use n_tiles=(1, 1, 1) if the input volume fits in memory without tiling to minimize processing overhead.
n_tiles=(1,4,4)

# Segmentation type ("2D" or "3D"). 
# 2D takes a z-stack as input, performs MIP (Maximum Intensity Projection) and predicts nuclei from the resulting projection (faster, useful for single layers of cells)
# 3D is more computationally expensive. Predicts 3D nuclear volumes, useful for multilayered structures
segmentation_type = "3D"

# Nuclear segmentation model type ("Stardist")
# Choose your Stardist fine-tuned model (model_name) from stardist_models folder
# If no custom model is present, type "test" and a standard pre-trained model will be loaded
model_name = "MEC0.1" # Type "test" if you don't have a custom model trained

# Read image, apply slicing if needed and return filename and img as a np array
img, filename = read_image(image, slicing_factor_xy, slicing_factor_z)

# Slice the nuclei stack
nuclei_img = extract_nuclei_stack(img, nuclei_channel)

# Generate maximum intensity projection 
img_mip = maximum_intensity_projection(img)

# Show image in Napari
viewer = napari.Viewer(ndisplay=2)
viewer.add_image(img_mip)

<h3>Label your regions of interest in Napari and explore the signal of your marker of interest</h3>

Make sure to set <code>n edit dim = 3</code> so the label propagates across all channels. Name your regions of interest, e.g. <code>DG</code>, <code>CA1</code>, <code>CA3</code> or <code>HIPPO</code>. If you do not draw any ROI, the entire image will be analyzed.

Finally, select the <code>img_mip</code> layer and adjust the contrast limits to find the min_max range of intensities within which cells will be considered positive for that marker.

<video controls>
  <source src="../assets/napari_labels.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>
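
The sketch below shows, on synthetic data, how a 2D ROI label relates to the 3D stack: a full-image label is used when no ROI was drawn, and a drawn ROI is repeated along z to mask the nuclei channel. The array sizes, the (channel, z, y, x) axis order and the square ROI are assumptions made for illustration.

```python
import numpy as np

# Synthetic (C, Z, Y, X) stack standing in for `img`
rng = np.random.default_rng(0)
img = rng.integers(0, 255, size=(4, 6, 32, 32), dtype=np.uint8)

# Fallback when no ROI was drawn: one label (1) covering the whole xy plane
full_image_roi = np.ones(img.shape[-2:], dtype=np.uint8)

# A drawn ROI is nonzero only inside the region (here, a made-up square)
drawn_roi = np.zeros(img.shape[-2:], dtype=np.uint8)
drawn_roi[8:24, 8:24] = 1

# Repeat the 2D mask along z and set everything outside the ROI to 0
nuclei_img = img[3]
mask_3d = np.tile(drawn_roi >= 1, (nuclei_img.shape[0], 1, 1))
masked = np.where(mask_3d, nuclei_img, 0)
print(masked.shape)
```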

<h3>Save user-defined label ROIs as .tiff files</h3>

In [None]:
save_rois(viewer, directory_path, filename)

<h3>Mask the input image with the user defined ROIs, apply object classifiers and extract data</h3>

In [None]:
# Add the 3D-stack into Napari
if segmentation_type == "3D":
    # Remove the 'img_mip' layer if it exists
    if 'img_mip' in viewer.layers:
        viewer.layers.remove('img_mip')
    # Add the 'img' stack
    viewer.add_image(img)

# Construct ROI and nuclei predictions paths from directory_path above
roi_path = directory_path / "ROIs"
nuclei_preds_path = directory_path / "nuclei_preds" / segmentation_type / model_name

# Extract the experiment name from the data directory path
experiment_id = directory_path.name

# Construct the object classifier path
obj_class_path = Path("./APOC_ObjectClassifiers") / experiment_id

# List of subfolder names
try:
    roi_names = [folder.name for folder in roi_path.iterdir() if folder.is_dir()]

except FileNotFoundError:
    roi_names = ["full_image"]
        
print(f"The following regions of interest will be analyzed: {roi_names}")

# Create an empty list to store all stats extracted from each image
stats = []

for roi_name in tqdm(roi_names):
    print(f"\nAnalyzing ROI: {roi_name}")

    # Read the user defined ROIs, in case of full image analysis generate a label covering the entire image
    try:
        # Read previously defined ROIs
        user_roi = tifffile.imread(roi_path / roi_name / f"{filename}.tiff")

    except FileNotFoundError:
        # Extract the xy dimensions of the input image 
        img_shape = img_mip.shape
        img_xy_dims = img_shape[-2:]

        # Create a label covering the entire image
        user_roi = np.ones(img_xy_dims).astype(np.uint8)

    # Read previously predicted nuclei labels, if not present generate nuclei predictions and save them
    try:
        # Read the nuclei predictions per ROI
        labels = tifffile.imread(nuclei_preds_path / roi_name / f"{filename}.tiff")
        print(f"Pre-computed nuclei labels found for {filename}")
        # Remove labels touching ROI edge (kept for nuclei predictions generated before "remove_labels_touching_roi_edge" was implemented)
        print("Removing nuclei labels touching ROI edge")
        labels = remove_labels_touching_roi_edge(labels, user_roi)

    except FileNotFoundError:
        print(f"Generating nuclei labels for {filename}")

        # Slice the nuclei stack (used as is for 3D segmentation)
        nuclei_img = extract_nuclei_stack(img, nuclei_channel)

        # For 2D segmentation, collapse the stack into a maximum intensity projection
        if segmentation_type == "2D":
            nuclei_img = np.max(nuclei_img, axis=0)

        # We will create a mask where roi is greater than or equal to 1
        mask = (user_roi >= 1).astype(np.uint8)

        # 3D segmentation logic, extend 2D mask across the entire stack volume
        if segmentation_type == "3D":
            # Extract the number of z-slices to extend the mask
            slice_nr = img.shape[1]
            # Extend the mask across the entire volume
            mask = np.tile(mask, (slice_nr, 1, 1))
            # Apply the mask to nuclei_img, setting all other pixels to 0
            masked_nuclei_img = np.where(mask, nuclei_img, 0)
        elif segmentation_type == "2D":
            # Apply the mask to nuclei_img, setting all other pixels to 0
            masked_nuclei_img = np.where(mask, nuclei_img, 0)

        # Model loading 
        model = get_stardist_model(segmentation_type, name=model_name, basedir='stardist_models')

        # Segment nuclei and return labels
        labels = segment_nuclei(masked_nuclei_img, segmentation_type, model, n_tiles)

        # Remove labels touching ROI edge
        print("Removing nuclei labels touching ROI edge")
        labels = remove_labels_touching_roi_edge(labels, user_roi)

        # Save nuclei labels as .tiff files to reuse them later
        try:
            os.makedirs(nuclei_preds_path / roi_name, exist_ok=True)
        except Exception as e:
            print(f"Error creating directory {nuclei_preds_path / roi_name}: {e}")

        # Construct path to store
        path_to_store = nuclei_preds_path / roi_name / f"{filename}.tiff"
        print(f"Saving nuclei labels to {path_to_store}")
        try:
            tifffile.imwrite(path_to_store, labels)
        except Exception as e:
            print(f"Error saving file {path_to_store}: {e}")

    # Add the ROIs as labels into Napari
    viewer.add_labels(user_roi, name=f"{roi_name}_ROI", opacity=0.4)

    # Keep the unmodified nuclei labels so that every marker starts from them
    nuclei_labels = labels

    # Loop through each marker, unpacking its name, channel and cellular location
    for marker_name, marker_channel, location in markers:

        # Reset to the nuclei labels: the cytoplasm/cell simulation below reassigns `labels`,
        # which would otherwise carry over into the next marker
        labels = nuclei_labels

        if location == "cytoplasm":
            if segmentation_type == "3D":
                print(f"Generating {segmentation_type} cytoplasm labels for: {marker_name}")
                # Simulate a cytoplasm by dilating the nuclei and subtracting the nuclei mask afterwards
                labels = simulate_cytoplasm_chunked_3d(labels, dilation_radius=2, erosion_radius=0, chunk_size=(labels.shape[0], 1024, 1024))

            elif segmentation_type == "2D":
                print(f"Generating {segmentation_type} cytoplasm labels for: {marker_name}")
                # Simulate a cytoplasm by dilating the nuclei and subtracting the nuclei mask afterwards
                labels = simulate_cytoplasm(labels, dilation_radius=2, erosion_radius=0)

        elif location == "cell":
            if segmentation_type == "3D":
                print(f"Generating {segmentation_type} cell labels for: {marker_name}")
                # Simulate a cell volume by dilating the nuclei 
                labels = simulate_cell_chunked_3d(labels, dilation_radius=2, erosion_radius=0, chunk_size=(labels.shape[0], 1024, 1024))

            elif segmentation_type == "2D":
                print(f"Generating {segmentation_type} cell labels for: {marker_name}")
                # Simulate a cell area by dilating the nuclei
                labels = simulate_cell(labels, dilation_radius=2, erosion_radius=0)

        viewer.add_labels(labels, opacity=0.3, name=f'{location}_in_{roi_name}')

        # Classify labels based on their corresponding object classifier
        cl_filename = f"./{obj_class_path}/ObjClass_{segmentation_type}_ch{marker_channel}.cl"

        # Load the classifier from disk to use the latest version
        classifier = apoc.ObjectClassifier(cl_filename)

        # If 3D-segmentation input marker_img is a 3D-stack
        if segmentation_type == "3D":
            # Slice the img stack
            marker_img = img[marker_channel]

        # If 2D-segmentation input marker_img is a max intensity projection of said 3D-stack
        elif segmentation_type == "2D":
            # Slice the img stack
            marker_img = img_mip[marker_channel]

        # Determine object classification
        print(f"Classifying labels according to {marker_name} intensities...")
        result = classifier.predict(labels, marker_img)

        # Show the result
        viewer.add_labels(result, name=f"{marker_name}_{location}_classes_in_{roi_name}")

        # Extract unique class values from result and loop through them
        unique_classes = np.unique(result)
        unique_classes = unique_classes[unique_classes != 0]  # Exclude background label

        for label in unique_classes:

            # Retrieve class and transform into a string
            subpopulation = str(label)

            # Create a boolean array (mask) where values match the label (class) in result
            class_mask = cle.pull(result) == label
            class_mask = class_mask.astype(bool) # Convert into boolean to allow for indexing later on

            # Find nuclei labels that colocalize with said class (mask) using Numpy indexing
            positive_labels = np.unique(labels[class_mask])
            positive_labels = positive_labels[positive_labels != 0]  # Remove background label

            # Display positive labels for each class
            positive_labels_mask = np.isin(labels, positive_labels) # Find which positive_labels are contained in labels and create a boolean mask
            filtered_labels = np.where(positive_labels_mask, labels, 0) # Use the mask to set values in 'labels' that are not in 'positive_labels' to 0
            viewer.add_labels(filtered_labels, name=f'{marker_name}_{subpopulation}_labels_in_{roi_name}')

            # Extract your information of interest
            total_cells = len(np.unique(labels)) - 1
            marker_pos_cells = len(np.unique(positive_labels))

            # Calculate "%_marker+_cells" and avoid division by zero errors
            try:
                perc_marker_pos_cells = (marker_pos_cells * 100) / total_cells
            except ZeroDivisionError:
                perc_marker_pos_cells = 0

            # Create a dictionary containing all extracted info per masked image
            stats_dict = {
                        "filename": filename,
                        "ROI": roi_name,
                        "population": f'{marker_name}_{subpopulation}',
                        "marker": marker_name,
                        "marker_location":location,
                        "total_cells": total_cells,
                        "marker+_cells": marker_pos_cells,
                        "%_marker+_cells": perc_marker_pos_cells,
                        "nuclei_ch": nuclei_channel,
                        "marker_ch": marker_channel,
                        "slicing_factor_xy": slicing_factor_xy,
                        "slicing_factor_z": slicing_factor_z
                        }

            # Append the current data point to the stats list
            stats.append(stats_dict)
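
The counting logic in the loop above can be checked on a toy example. The arrays below are made up for illustration: <code>labels</code> stands in for the nuclei labels and <code>result</code> for the class map returned by the classifier, assumed here to be a NumPy array holding one class value per object and 0 for background.

```python
import numpy as np

# Toy label image with three objects (1, 2, 3); 0 is background
labels = np.array([
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0],
    [3, 3, 0, 0, 0],
])

# Toy class map: objects 1 and 3 belong to class 2, object 2 to class 1
result = np.array([
    [2, 2, 0, 1, 1],
    [2, 2, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [2, 2, 0, 0, 0],
])

counts = {}
total_cells = len(np.unique(labels[labels != 0]))
for cls in np.unique(result[result != 0]):
    # Labels that overlap with the current class, background excluded
    positive_labels = np.unique(labels[result == cls])
    positive_labels = positive_labels[positive_labels != 0]
    percentage = 100 * len(positive_labels) / total_cells if total_cells else 0
    counts[int(cls)] = (len(positive_labels), percentage)

print(counts)
```

Counting the unique values of <code>labels[labels != 0]</code> instead of subtracting 1 from the number of unique values also gives the right total when an image contains no background pixels.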

<h3>Data saving</h3>


In [None]:
# Define output folder for results
results_folder = Path("results") / experiment_id / segmentation_type / model_name

# Create the necessary folder structure if it does not exist
try:
    os.makedirs(str(results_folder))
    print(f"Output folder created: {results_folder}")
except FileExistsError:
    print(f"Output folder already exists: {results_folder}")

# Transform into a dataframe to store it as .csv later
df = pd.DataFrame(stats)

# Define the .csv path
csv_path = results_folder / "SP_marker_+_label_obj_class.csv"

# Append to the .csv with new data points each round
df.to_csv(csv_path, mode="a", index=False, header=not os.path.isfile(csv_path))

# Show the updated .csv 
csv_df = pd.read_csv(csv_path)

csv_df
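
Because the cell above appends to the same file on every run, the header has to be written only once. The sketch below reproduces that pattern with the standard-library <code>csv</code> module (not pandas) and a temporary folder, so it runs without the notebook's data. The <code>append_rows</code> helper and the row values are made up for illustration.

```python
import csv
import os
import tempfile

def append_rows(csv_path, rows, fieldnames):
    """Append rows, writing the header only if the file does not exist yet."""
    write_header = not os.path.isfile(csv_path)
    with open(csv_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)

fieldnames = ["filename", "ROI", "marker", "marker+_cells"]
rows = [{"filename": "img_01", "ROI": "full_image", "marker": "ki67", "marker+_cells": 12}]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "example.csv")
    append_rows(csv_path, rows, fieldnames)  # first run: header plus one row
    append_rows(csv_path, rows, fieldnames)  # second run: one more row, no header
    with open(csv_path, newline="") as handle:
        lines = handle.read().splitlines()

print(lines)
```

Note that running the saving cell twice for the same image appends the same rows twice, so duplicates may need to be dropped before analysis.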