## Tutorial Steps:

1. **Download Example ISS Dataset:** Obtain the provided ISS dataset to work with.

2. **Optional: Deconvolution and Maximum Intensity Projection:** You have the option to apply deconvolution and create maximum intensity projections from the raw image data.

3. **Stitching Image Data:** Combine the image data using stitching techniques.

4. **Decode Image Data:** Decode the stitched image data.

5. **Quality Control and Visualization:** Evaluate the results through quality control measures and visualize them.


### Step 1. Download ISS Data
To begin, download the ISS toy dataset by clicking on the following link: [ISS Toy Dataset](https://drive.google.com/drive/folders/1AmNFyTtnl3i1QOuFRs_4u2StVnj7FkZK?usp=drive_link).


Once the dataset is downloaded, take a moment to examine the file names. They follow the pattern `stage{stage}_round{round}_z{z}_channel{channel}.tif`, where the placeholders are the numerical identifiers for the stage position, staining round, z level, and channel.
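
To make the naming convention concrete, here is a minimal sketch of how such a pattern could be turned into a regular expression that extracts the numerical identifiers from a file name. The helpers `pattern_to_regex` and `parse_filename` are hypothetical names used for illustration; they are not part of `imaging_utils`, and `ISSDataContainer` may parse file names differently.

```python
import re

def pattern_to_regex(pattern):
    """Convert a pattern such as 'stage{stage}_z{z}.tif' into a compiled regex
    in which every {name} placeholder becomes a named group matching digits."""
    # re.split with a capture group alternates: literal, name, literal, name, ...
    parts = re.split(r'\{(\w+)\}', pattern)
    regex = ''
    for i, part in enumerate(parts):
        regex += re.escape(part) if i % 2 == 0 else f'(?P<{part}>\\d+)'
    return re.compile(regex)

def parse_filename(filename, regex):
    """Return the placeholders as integers, or None if the name does not match."""
    match = regex.fullmatch(filename)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}

regex = pattern_to_regex('stage{stage}_round{round}_z{z}_channel{channel}.tif')
parsed = parse_filename('stage1_round0_z17_channel3.tif', regex)
print(parsed)  # {'stage': 1, 'round': 0, 'z': 17, 'channel': 3}
```

A file name that does not follow the pattern returns `None`, which makes it easy to spot stray files in the download folder.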


Next, we load the dataset into an `ISSDataContainer`. This class manages the dataset without loading all of the images into memory at once.

In [1]:
from imaging_utils import ISSDataContainer

# Create the container
issdata = ISSDataContainer()

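# Add images. The placeholders in the pattern must match your file names, e.g.
# pattern = join('downloads', 'stage{stage}_round{round}_z{z}_channel{channel}.tif')
# Adjust the path below to your own download location.
pattern = 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R{round}_C{channel}_Z{z}_L{stage}.tif'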
issdata.add_images_from_filepattern(pattern)

Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C0_Z17_L0.tif. Stage: 0, Round: 0, Channel: 0
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C1_Z17_L0.tif. Stage: 0, Round: 0, Channel: 1
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C2_Z17_L0.tif. Stage: 0, Round: 0, Channel: 2
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C3_Z17_L0.tif. Stage: 0, Round: 0, Channel: 3
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C4_Z17_L0.tif. Stage: 0, Round: 0, Channel: 4
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C0_Z17_L1.tif. Stage: 1, Round: 0, Channel: 0
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C1_Z17_L1.tif. Stage: 1, Round: 0, Channel: 1
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C2_Z17_L1.tif. Stage: 1, Round: 0, Channel: 2
Added C:\Users\Axel\Documents\ISTDECO\downloads\liver_3d\R0_C3_Z17_L1.tif. Stage: 1, Round: 0, Channel: 3
Added C:\Users\Axel\Documents\ISTDECO\download

For verification, you can print out the size of the dataset.


In [None]:
num_stages, num_rounds, num_channels = issdata.get_dataset_shape()
print(f'There are {num_stages} number of stages')
print(f'There are {num_rounds} number of rounds')
print(f'There are {num_channels} number of channels')

There are 2 number of stages
There are 5 number of rounds
There are 5 number of channels


We can also verify that there is an equal number of images for each stage, round, and channel.

In [None]:
issdata.is_dataset_complete()

True
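
As an illustration of what such a completeness check involves, the sketch below counts images per (stage, round, channel) combination and requires that every combination is present with the same number of images. This is an assumption about the logic, not the actual implementation of `is_dataset_complete`; the function `is_complete` and the record layout are hypothetical.

```python
from collections import Counter

def is_complete(records):
    """records: iterable of dicts with 'stage', 'round' and 'channel' keys,
    one dict per image file (for example, one per z level)."""
    counts = Counter((r['stage'], r['round'], r['channel']) for r in records)
    stages = {key[0] for key in counts}
    rounds = {key[1] for key in counts}
    channels = {key[2] for key in counts}
    # Every combination of stage, round and channel must occur ...
    all_present = len(counts) == len(stages) * len(rounds) * len(channels)
    # ... and each combination must hold the same number of images.
    equal_counts = len(set(counts.values())) <= 1
    return all_present and equal_counts

# Toy example: 2 stages x 2 rounds x 2 channels x 3 z levels
records = [
    {'stage': s, 'round': r, 'channel': c, 'z': z}
    for s in range(2) for r in range(2) for c in range(2) for z in range(3)
]
complete_result = is_complete(records)       # all 24 images present
missing_result = is_complete(records[:-1])   # one z level missing
```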

(Optional) Let's take a look at the data using Napari.

In [3]:
import napari

# Select small piece of the data
small_data = issdata.select(stage=0, round=0)

# Load images into memory
small_data.load()

# Run Napari
viewer = napari.Viewer()
viewer.add_image(small_data.data.squeeze())
napari.run()

# Free memory
small_data.unload()

{'image_files': ['C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z0_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z1_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z2_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z3_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z4_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z5_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z6_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z7_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z8_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z9_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z10_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d\\R0_C0_Z11_L0.tif', 'C:\\Users\\Axel\\Documents\\ISTDECO\\downloads\\liver_3d

### Step 2. 2D Projection

In this step, we project the data to 2D using maximum intensity projection (MIP): for each pixel, we keep the maximum value across the z-planes. Deconvolution can sharpen the resulting images and can be applied either before or after the projection. It is computationally intensive, however, and in practice often needs a CUDA-capable GPU, especially for large stacks of 3D multiplexed images. This tutorial therefore skips deconvolution; the necessary functions can be found in the `deconvolution.py` file.
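
To see what a maximum intensity projection does, consider this toy example. It assumes the z axis is the first axis of the array, which matches the `max(axis=0)` call used in this step; the array values are made up for illustration.

```python
import numpy as np

# A toy z-stack with shape (z, y, x): three planes of 2x2 pixels.
stack = np.array([
    [[1, 5], [0, 2]],
    [[4, 2], [7, 1]],
    [[3, 3], [6, 9]],
])

# For every (y, x) position, keep the brightest value found along z.
mip_image = stack.max(axis=0)
print(mip_image)
# [[4 5]
#  [7 9]]
```

The projection discards depth information, so it suits thin samples where signals rarely overlap along z.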


In [2]:
# iterate_dataset lets us loop over the dataset by stage, round and channel.
import numpy as np
from imaging_utils import imwrite
from os.path import join

for index, small_dataset in issdata.iterate_dataset(iter_stages=True, iter_rounds=True, iter_channels=True):
    # Load the small dataset
    small_dataset.load()
    # Get the image data
    data = small_dataset.data
    # Maximum intensity projection along z (axis 0 after squeezing)
    data = np.squeeze(data).max(axis=0)
    print(data.shape)
    # Save the projected image
    imwrite(join('MIP','stage{stage}_round{round}_channel{channel}.tif'.format(**index)), data)
    # Finally, unload the images (otherwise we might run out of memory)
    small_dataset.unload()

# Or equivalently ...
# from ISSDataset import mip
# mip(join('mip','stage{stage}_round{round}_channel{channel}.tif'), issdata)



### Step 3. Stitching

In this step, we stitch the data using ASHLAR. This is done with the `stitch_ashlar` function.


In [4]:
from imaging_utils import stitch_ashlar

# First, load the maximum-intensity-projected (MIP) data
iss_data_miped = ISSDataContainer()
iss_data_miped.add_images_from_filepattern(join('MIP','stage{stage}_round{round}_channel{channel}.tif'))


Added MIP\stage0_round0_channel0.tif. Stage: 0, Round: 0, Channel: 0
Added MIP\stage0_round0_channel1.tif. Stage: 0, Round: 0, Channel: 1
Added MIP\stage0_round0_channel2.tif. Stage: 0, Round: 0, Channel: 2
Added MIP\stage0_round0_channel3.tif. Stage: 0, Round: 0, Channel: 3
Added MIP\stage0_round0_channel4.tif. Stage: 0, Round: 0, Channel: 4
Added MIP\stage1_round0_channel0.tif. Stage: 1, Round: 0, Channel: 0
Added MIP\stage1_round0_channel1.tif. Stage: 1, Round: 0, Channel: 1
Added MIP\stage1_round0_channel2.tif. Stage: 1, Round: 0, Channel: 2
Added MIP\stage1_round0_channel3.tif. Stage: 1, Round: 0, Channel: 3
Added MIP\stage1_round0_channel4.tif. Stage: 1, Round: 0, Channel: 4
Added MIP\stage0_round1_channel0.tif. Stage: 0, Round: 1, Channel: 0
Added MIP\stage0_round1_channel1.tif. Stage: 0, Round: 1, Channel: 1
Added MIP\stage0_round1_channel2.tif. Stage: 0, Round: 1, Channel: 2
Added MIP\stage0_round1_channel3.tif. Stage: 0, Round: 1, Channel: 3
Added MIP\stage0_round1_channel4.t

To successfully register and stitch the image data, it's crucial to have access to the initial position of each stage in pixel coordinates. This information can typically be extracted from the microscope software.

In [None]:
stage_locations = {
    0: (0, 0), 
    1: (0, 1843), 
    2: (0, 3686), 
    3: (0, 5529), 
    4: (1843, 0), 
    5: (1843, 1843), 
    6: (1843, 3686), 
    7: (1843, 5529), 
    8: (3686, 0), 
    9: (3686, 1843), 
    10: (3686, 3686), 
    11: (3686, 5529), 
    12: (5529, 0), 
    13: (5529, 1843), 
    14: (5529, 3686), 
    15: (5529, 5529)
}

# Stitch using ASHLAR
stitch_ashlar(join('stitched','round{round}_channel{channel}.tif'), iss_data_miped, stage_locations)
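
The `stage_locations` dictionary above follows a regular 4x4 grid in row-major order with a step of 1843 pixels between neighbouring stages. For such a layout, the dictionary can be generated rather than typed by hand. The helper `grid_stage_locations` below is a hypothetical sketch; it does not say which tuple element is x and which is y, so check that against your microscope metadata.

```python
def grid_stage_locations(n_rows, n_cols, step):
    """Build {stage_index: (first_offset, second_offset)} for a regular grid.

    Stage indices run row by row (row-major), and neighbouring stages
    are `step` pixels apart along each axis.
    """
    return {
        row * n_cols + col: (row * step, col * step)
        for row in range(n_rows)
        for col in range(n_cols)
    }

generated_locations = grid_stage_locations(n_rows=4, n_cols=4, step=1843)
print(generated_locations[5])   # (1843, 1843)
print(generated_locations[12])  # (5529, 0)
```

If the stages were acquired in a different order (for example, in a serpentine scan), the index arithmetic has to be adapted accordingly.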

### Step 4. Decoding

In this step, we decode the stitched image data.

In [None]:
from os.path import join
from imaging_utils import ISSDataContainer
from decoding import istdeco_decode, Codebook, estimate_fdr
import json
import pandas as pd

# Load the stitched data
issdata = ISSDataContainer()
issdata.add_images_from_filepattern(join('stitched','round{round}_channel{channel}.tif'))

# Load the data into memory
issdata.load()

# Load combinatorial labels
## This is available in the Google Drive with the example data
# Assumes metadata.json is a plain JSON file, as its extension suggests
with open('metadata.json') as f:
    metadata = json.load(f)

# Create the codebook
codebook = Codebook(num_rounds=5, num_channels=4)
for gene, attributes in metadata['codebook'].items():
    codebook.add_code(gene, round_indices=attributes['round_index'], channel_indices=attributes['channel_indices'])


# Run the decoding
results = []
for tile_coords, tile in issdata.iterate_tiles(tile_height=512, tile_width=512):
    # Tile coords [y_start, y_end, x_start, x_end]
    origin = tile_coords[0], tile_coords[2]

    # Decode the data using matrix factorization

    # Depending on your data, you might want to adjust the parameter min_integrated_intensity
    # or quality ...
    # Usually a quality threshold between 0.5 and 0.85 works fine. 
    decoded_table = istdeco_decode(tile, codebook, psf_sigma=(2.0, 2.0), origin=origin)

    results.append(decoded_table)

# Stack the per-tile tables row-wise into one table
results = pd.concat(results, axis=0, ignore_index=True)
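
The decoding loop works tile by tile and passes each tile's origin so that detections can be reported in global image coordinates. The sketch below illustrates that idea with a hypothetical `iterate_tiles_sketch` generator over a NumPy array. It assumes non-overlapping tiles and a `(rounds, channels, y, x)` layout, which may differ from what `ISSDataContainer.iterate_tiles` actually does.

```python
import numpy as np

def iterate_tiles_sketch(image, tile_height, tile_width):
    """Yield ([y_start, y_end, x_start, x_end], tile) over the last two axes."""
    height, width = image.shape[-2:]
    for y0 in range(0, height, tile_height):
        for x0 in range(0, width, tile_width):
            # Edge tiles are clipped to the image border.
            y1 = min(y0 + tile_height, height)
            x1 = min(x0 + tile_width, width)
            yield [y0, y1, x0, x1], image[..., y0:y1, x0:x1]

# Hypothetical image: 2 rounds, 2 channels, 1000 x 1200 pixels
image = np.zeros((2, 2, 1000, 1200), dtype=np.uint8)
tiles = list(iterate_tiles_sketch(image, tile_height=512, tile_width=512))

# Convert a position found inside the last tile to global coordinates.
tile_coords, tile = tiles[-1]
origin = tile_coords[0], tile_coords[2]
local_yx = (10, 20)
global_yx = (local_yx[0] + origin[0], local_yx[1] + origin[1])
```

Because every tile reports positions relative to its own top-left corner, adding the origin is what makes the per-tile tables comparable once they are combined.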

Some of the genes are marked as `Negative` in the codebook. These correspond to non-biological labels that we do not expect to find in the data. Treating detections of these negative labels as false positives allows us to estimate a false discovery rate (FDR), which is useful for quality control.

In [None]:
negative_labels = {label for label in results['gene_name'].unique() if 'Negative' in label}
positive_labels = {label for label in results['gene_name'].unique() if 'Negative' not in label}

fdr = estimate_fdr(results['gene_name'], negative_labels, positive_labels)
print(fdr)

In [None]:
# We can also compute the FDR after applying a quality threshold
fdr = estimate_fdr(results.query('quality > 0.80')['gene_name'], negative_labels, positive_labels)
print(fdr)

As a rule of thumb, an FDR below 1% is acceptable.
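
To illustrate how negative labels can yield an FDR estimate, here is a minimal sketch. It assumes that false detections land on every codeword equally often, so the per-code false rate observed on the negative labels is extrapolated to the positive labels. This is one common convention and not necessarily the formula used by `estimate_fdr`; the function `estimate_fdr_sketch` and the toy table are hypothetical.

```python
import pandas as pd

def estimate_fdr_sketch(gene_names, negative_labels, positive_labels):
    """Naive FDR estimate from detections of negative-control labels."""
    names = pd.Series(list(gene_names))
    n_neg = int(names.isin(list(negative_labels)).sum())
    n_pos = int(names.isin(list(positive_labels)).sum())
    if n_pos == 0 or len(negative_labels) == 0:
        return float('nan')
    # False detections per codeword, measured on the negative labels ...
    false_per_code = n_neg / len(negative_labels)
    # ... extrapolated to the number of positive codewords.
    expected_false = false_per_code * len(positive_labels)
    return min(1.0, expected_false / n_pos)

# Made-up detections for illustration only
table = pd.DataFrame({
    'gene_name': ['A', 'A', 'B', 'B', 'B', 'Negative1', 'A', 'B', 'Negative2', 'A'],
    'quality':   [0.9, 0.95, 0.85, 0.7, 0.9, 0.4, 0.88, 0.92, 0.82, 0.6],
})
negatives = {'Negative1', 'Negative2'}
positives = {'A', 'B'}

# The estimated FDR typically drops as the quality threshold increases.
fdr_by_threshold = {}
for threshold in (0.0, 0.5, 0.85):
    kept = table[table['quality'] > threshold]
    fdr_by_threshold[threshold] = estimate_fdr_sketch(kept['gene_name'], negatives, positives)
print(fdr_by_threshold)
```

Raising the threshold lowers the FDR at the cost of discarding true detections, so the threshold is a trade-off between sensitivity and specificity.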