## Test decoding of simulated MERFISH data generated with different axial spacings.
**Contributors**: Maxwell Schweiger, Steve Presse, Douglas Shepherd^  
^douglas.shepherd@asu.edu

The goal of this notebook is to show the performance of [`merfish3d-analysis`](https://github.com/QI2lab/merfish3d-analysis) on simulated MERFISH data. The output metric is the [F1-score](https://en.wikipedia.org/wiki/F-score), which measures how well `merfish3d-analysis` recovers the ground truth location and identity of the RNA molecules used to generate the simulation. We will use a single field of view (FOV) with uniformly distributed RNA molecules. **Note:** `merfish3d-analysis` requires a GPU runtime and will not run without one.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/QI2lab/merfish3d-analysis/blob/main/examples/notebooks/Simulation_example.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/QI2lab/merfish3d-analysis/blob/main/examples/notebooks/Simulation_example.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>
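
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it from match counts; the function name and the example counts are hypothetical and are not taken from `merfish3d-analysis` or from the results of this notebook.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from match counts."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from this notebook).
score = f1_score(true_positives=90, false_positives=10, false_negatives=30)
print(f"F1 = {score:.3f}")
```

A score of 1.0 means every ground truth molecule was recovered with no spurious detections; missed molecules and false detections both lower it.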

## Install `merfish3d-analysis`
This is a modified version of the library installation that allows it to run on Google Colab servers. It is missing the visualization tools and the ability to automatically stitch tiled data.  
  
**Note:** This installation can take 5-10 minutes, because the versions of the RAPIDS.AI, PyTorch, ONNX, CuPy, and CUDA libraries all have to be checked for compatibility.

In [None]:
%%capture
!git clone https://github.com/qi2lab/merfish3d-analysis/
%cd merfish3d-analysis
%pip install -e .
!setup-colab
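
Because the library needs a GPU runtime, it can save time to confirm that one is attached before waiting on the installation. The following is a minimal sketch that assumes an NVIDIA GPU and relies only on the standard `nvidia-smi -L` command; the helper name `gpu_available` is hypothetical and not part of `merfish3d-analysis`.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if `nvidia-smi` is on the PATH and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, e.g. "GPU 0: <name> (UUID: ...)".
    return result.returncode == 0 and "GPU" in result.stdout


if not gpu_available():
    print("No NVIDIA GPU detected. In Colab, select a GPU under Runtime > Change runtime type.")
```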

## Download simulation data
The simulation contains roughly 1200 individual RNA molecules that are randomly distributed in space within a 41.6 𝜇m by 41.6 𝜇m by 15.0 𝜇m volume (x, y, z). The individual RNA molecules are imaged using a 16-bit Hamming weight 4, Hamming distance 4 codebook and a simulated microscope with realistic parameters and noise. The imaging simulation is performed in 8 rounds, each with 3 channels containing 2 MERFISH bits and a fiducial marker. The same RNA molecules are simulated with three different axial step sizes (0.315 𝜇m, 1.0 𝜇m, 1.5 𝜇m) to explore the impact of sufficient axial sampling when imaging.
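
To make the codebook constraints concrete: 8 rounds with 2 MERFISH bits each give 16 bits, every codeword has exactly four bits set to 1, and any two codewords differ in at least four positions. The sketch below checks these properties for a small toy codebook. The codewords and function names are invented for illustration and are not the codebook used in the simulation.

```python
from itertools import combinations


def hamming_weight(codeword: str) -> int:
    """Number of 'on' bits in a codeword written as a string of 0s and 1s."""
    return codeword.count("1")


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def check_codebook(codewords, n_bits=16, weight=4, min_distance=4) -> bool:
    """True if every codeword has the given length and weight, and every
    pair of codewords is separated by at least `min_distance`."""
    if any(len(c) != n_bits or hamming_weight(c) != weight for c in codewords):
        return False
    return all(
        hamming_distance(a, b) >= min_distance for a, b in combinations(codewords, 2)
    )


# Toy codewords, invented for illustration; NOT the simulation's codebook.
toy_codebook = [
    "1111000000000000",
    "0000111100000000",
    "1100110000000000",
    "0011001100000000",
]
print(check_codebook(toy_codebook))
```

A minimum distance of 4 means a single flipped bit still leaves a measured word closer to its true codeword than to any other, which is what allows single-bit error correction during decoding.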

In [None]:
%%capture
import zipfile
import os

# Download data from Zenodo
%cd /content/
!wget "https://zenodo.org/records/17274305/files/merfish3d_analysis-simulation.zip?download=1" -O synthetic_data.zip

# Destination path for the unzipped content
unzip_destination = '/content/synthetic_data'

# Create the destination directory if it doesn't exist
os.makedirs(unzip_destination, exist_ok=True)

# Unzip the file
try:
    with zipfile.ZipFile("/content/synthetic_data.zip", 'r') as zip_ref:
        zip_ref.extractall(unzip_destination)
    print(f"File unzipped successfully to {unzip_destination}")
except zipfile.BadZipFile:
    print("Error: The downloaded file is not a valid zip file.")
except FileNotFoundError:
    print("Error: The file /content/synthetic_data.zip was not found.")
except Exception as e:
    print(f"An error occurred during unzipping: {e}")
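
After unzipping, a quick check that the three spacing folders used by the later cells are present can catch a failed or partial download early. This is a small sketch; the folder names are taken from the paths used in the cells of this notebook, and the helper `missing_spacing_folders` is hypothetical.

```python
from pathlib import Path


def missing_spacing_folders(root, spacings=("0.315", "1.0", "1.5")):
    """Return the spacing folder names that do not exist under `root`."""
    root = Path(root)
    return [s for s in spacings if not (root / s).is_dir()]


root = "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat"
missing = missing_spacing_folders(root)
print("All spacing folders found." if not missing else f"Missing folders: {missing}")
```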

## Test merfish3d-analysis on randomly distributed RNA with 𝚫z=0.315 𝞵m.
Because an axial spacing of 𝚫z=0.315 𝞵m satisfies Shannon-Nyquist sampling for the objective (NA=1.35), here we decode in 3D.
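
As a rough illustration of the sampling argument, the axial Nyquist step for widefield detection can be estimated from the axial bandwidth of the optical transfer function, n(1 - cos α)/λ with sin α = NA/n. Only NA=1.35 comes from this notebook. The emission wavelength (0.67 𝜇m) and the immersion refractive index (1.518) below are assumptions chosen for illustration, and the simulated microscope may use different values.

```python
import math


def widefield_axial_nyquist_um(wavelength_um, numerical_aperture, refractive_index):
    """Estimate the largest z step (in um) that satisfies Nyquist sampling,
    using the widefield axial cutoff k_z = n * (1 - cos(alpha)) / wavelength,
    where sin(alpha) = NA / n."""
    alpha = math.asin(numerical_aperture / refractive_index)
    k_z_max = refractive_index * (1.0 - math.cos(alpha)) / wavelength_um
    return 1.0 / (2.0 * k_z_max)


# NA is from the text; wavelength and refractive index are ASSUMED values.
limit = widefield_axial_nyquist_um(
    wavelength_um=0.67, numerical_aperture=1.35, refractive_index=1.518
)
for dz in (0.315, 1.0, 1.5):
    status = "is within" if dz <= limit else "exceeds"
    print(f"dz = {dz} um {status} the estimated Nyquist step of {limit:.2f} um")
```

Under these assumptions only the 0.315 𝜇m spacing falls within the estimated limit, consistent with decoding that dataset in 3D and the coarser ones plane-by-plane.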
  
The steps are:  
1. Convert simulation data format to our (qi2lab) experimental format.
2. Convert qi2lab format to `merfish3d-analysis` datastore.
3. 3D deconvolution and 3D prediction of "spot-like" features in every bit.
4. Self-optimize decoding parameters.
5. 3D decoding to find RNA molecules and filter to limit blank codewords as necessary.
6. Calculate F1-score using ground truth RNA molecule locations.
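
Step 6 requires deciding which decoded molecules correspond to which ground truth molecules. One simple way to do this, sketched below, is greedy one-to-one matching of molecules that share a gene identity and lie within a distance cutoff. The 1.0 𝜇m radius, the greedy strategy, and the function name are assumptions for illustration; `sim-f1score` may use a different procedure.

```python
import math


def match_to_ground_truth(decoded, truth, radius_um=1.0):
    """Greedily match decoded molecules to ground truth molecules.

    Each molecule is a tuple (gene, x_um, y_um, z_um). A decoded molecule can
    match a ground truth molecule only if the genes agree and the 3D distance
    is within `radius_um`. Closest pairs are matched first, one-to-one.
    Returns (true_positives, false_positives, false_negatives).
    """
    candidates = []
    for i, (gene_d, *pos_d) in enumerate(decoded):
        for j, (gene_t, *pos_t) in enumerate(truth):
            if gene_d == gene_t:
                dist = math.dist(pos_d, pos_t)
                if dist <= radius_um:
                    candidates.append((dist, i, j))
    candidates.sort()

    used_decoded, used_truth = set(), set()
    for _, i, j in candidates:
        if i not in used_decoded and j not in used_truth:
            used_decoded.add(i)
            used_truth.add(j)

    tp = len(used_truth)
    return tp, len(decoded) - tp, len(truth) - tp


# Hypothetical molecules, for illustration only.
truth = [("A", 1.0, 1.0, 1.0), ("B", 5.0, 5.0, 2.0)]
decoded = [("A", 1.1, 1.0, 1.2), ("B", 9.0, 9.0, 2.0), ("A", 1.0, 1.1, 0.9)]
print(match_to_ground_truth(decoded, truth))
```

In this toy case the two nearby "A" detections compete for one ground truth molecule, so one of them counts as a false positive, and the distant "B" detection leaves the true "B" molecule unmatched.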

In [None]:
!sim-convert "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/0.315"
!sim-datastore "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/0.315/sim_acquisition"
!sim-preprocess "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/0.315/sim_acquisition"
!sim-decode "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/0.315/sim_acquisition"
!sim-f1score "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/0.315"

## Test merfish3d-analysis on randomly distributed RNA with 𝚫z=1.0 𝞵m.

Because an axial spacing of 𝚫z=1.0 𝞵m is larger than Shannon-Nyquist sampling for the objective (NA=1.35), here we decode plane-by-plane and then collapse spots in adjacent z-planes.
  
The steps are:  
1. Convert simulation data format to our (qi2lab) experimental format.
2. Convert qi2lab format to `merfish3d-analysis` datastore.
3. 2D deconvolution and 2D prediction of "spot-like" features plane-by-plane in every bit.
4. Self-optimize decoding parameters.
5. 2D decoding to find RNA molecules plane-by-plane, then collapse identical molecules in adjacent z-planes into one decoded molecule, and filter to limit blank codewords as necessary.
6. Calculate F1-score using ground truth RNA molecule locations.
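
The collapse in step 5 can be pictured as grouping detections that share a gene identity, sit in neighbouring z-planes, and nearly coincide in xy, then replacing each group with a single molecule. The sketch below does this with a small union-find. It is not the `merfish3d-analysis` implementation; the 0.5 𝜇m lateral radius and the function name are assumptions.

```python
import math
from collections import defaultdict


def collapse_adjacent_planes(detections, xy_radius_um=0.5):
    """Merge same-gene detections found in adjacent z-planes.

    `detections` is a list of (gene, x_um, y_um, z_index) tuples. Two
    detections are linked if they share a gene, their plane indices differ by
    exactly 1, and their xy distance is within `xy_radius_um`. Linked chains
    are merged into one molecule at the mean position of the group.
    """
    parent = list(range(len(detections)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (gene_i, x_i, y_i, z_i) in enumerate(detections):
        for j in range(i + 1, len(detections)):
            gene_j, x_j, y_j, z_j = detections[j]
            if (
                gene_i == gene_j
                and abs(z_i - z_j) == 1
                and math.hypot(x_i - x_j, y_i - y_j) <= xy_radius_um
            ):
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for i, det in enumerate(detections):
        groups[find(i)].append(det)

    merged = []
    for members in groups.values():
        n = len(members)
        merged.append((
            members[0][0],
            sum(m[1] for m in members) / n,
            sum(m[2] for m in members) / n,
            sum(m[3] for m in members) / n,
        ))
    return merged


# Hypothetical detections, for illustration only.
detections = [
    ("A", 10.0, 10.0, 3), ("A", 10.1, 10.0, 4), ("A", 10.0, 10.1, 5),
    ("B", 10.0, 10.0, 4), ("A", 30.0, 30.0, 4),
]
for molecule in collapse_adjacent_planes(detections):
    print(molecule)
```

The pairwise loop is quadratic in the number of detections, which is acceptable for a sketch at this scale; a spatial index would be the natural choice for larger datasets.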


In [None]:
!sim-convert "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.0"
!sim-datastore "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.0/sim_acquisition"
!sim-preprocess "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.0/sim_acquisition"
!sim-decode "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.0/sim_acquisition"
!sim-f1score "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.0"

## Test merfish3d-analysis on randomly distributed RNA with 𝚫z=1.5 𝞵m.
Because an axial spacing of 𝚫z=1.5 𝞵m is larger than Shannon-Nyquist sampling for the objective (NA=1.35), here we decode plane-by-plane and then collapse spots in adjacent z-planes.
  
The steps are:  
1. Convert simulation data format to our (qi2lab) experimental format.
2. Convert qi2lab format to `merfish3d-analysis` datastore.
3. 2D deconvolution and 2D prediction of "spot-like" features plane-by-plane in every bit.
4. Self-optimize decoding parameters.
5. 2D decoding to find RNA molecules plane-by-plane, then collapse identical molecules in adjacent z-planes into one decoded molecule, and filter to limit blank codewords as necessary.
6. Calculate F1-score using ground truth RNA molecule locations.

In [None]:
!sim-convert "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.5"
!sim-datastore "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.5/sim_acquisition"
!sim-preprocess "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.5/sim_acquisition"
!sim-decode "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.5/sim_acquisition"
!sim-f1score "/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat/1.5"
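
The three runs above differ only in the spacing folder, so the command sequence can also be generated programmatically. The sketch below builds the same five commands for each spacing using the CLI names and paths from the cells above; the helper `pipeline_commands` is hypothetical, and the commands are printed rather than executed.

```python
from pathlib import Path

ROOT = Path("/content/synthetic_data/merfish3d_analysis-simulation/example_16bit_flat")


def pipeline_commands(spacing: str, root: Path = ROOT) -> list[list[str]]:
    """Build the five command lines used in this notebook for one axial spacing."""
    sim_dir = root / spacing
    acq_dir = sim_dir / "sim_acquisition"
    return [
        ["sim-convert", str(sim_dir)],
        ["sim-datastore", str(acq_dir)],
        ["sim-preprocess", str(acq_dir)],
        ["sim-decode", str(acq_dir)],
        ["sim-f1score", str(sim_dir)],
    ]


for spacing in ("0.315", "1.0", "1.5"):
    for command in pipeline_commands(spacing):
        print(" ".join(command))
        # To execute instead of printing (needs the installed CLI and a GPU):
        #   import subprocess; subprocess.run(command, check=True)
```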