## Reproduce published results with starfish

This notebook walks through a workflow that reproduces a MERFISH result for one field of view using the starfish package.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
%matplotlib inline

import pprint
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from showit import image as show_image

import starfish.display
from starfish import data, FieldOfView, IntensityTable
from starfish.types import Features, Axes

In [None]:
# load the data from cloudfront
use_test_data = os.getenv("USE_TEST_DATA") is not None
experiment = data.MERFISH(use_test_data=use_test_data)

Individual imaging rounds and channels can also be visualized.

In [None]:
intensities = list()
for primary_image in experiment.fov().iterate_image_type(FieldOfView.PRIMARY_IMAGES):

## Compare to results from paper

The plot below aggregates gene copy number across single cells in the field of view and compares the results to the published counts in the MERFISH paper.

To make this match perfectly, run deconvolution 15 times instead of 14. As presented below, starfish displays a lower detection rate than the published benchmark.

In [None]:
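# load the published benchmark results; keep barcodes as strings
bench = pd.read_csv('https://d2nhj9g34unfro.cloudfront.net/MERFISH/benchmark_results.csv',
                    dtype={'barcode': object})

# count benchmark spots per gene
benchmark_counts = bench.groupby('gene')['gene'].count()

# count spots decoded by starfish per gene
genes, counts = np.unique(spot_intensities[Features.AXIS][Features.TARGET], return_counts=True)
result_counts = pd.Series(counts, index=genes)

# keep only genes present in both results; column 0 is starfish, column 1 is the benchmark
tmp = pd.concat([result_counts, benchmark_counts], join='inner', axis=1).values

# Pearson correlation between benchmark and starfish counts
r = np.corrcoef(tmp[:, 1], tmp[:, 0])[0, 1]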
x = np.linspace(50, 2000)
f, ax = plt.subplots(figsize=(6, 6))
ax.scatter(tmp[:, 1], tmp[:, 0], 50, zorder=2)
ax.plot(x, x, '-k', zorder=1)

plt.xlabel('Gene copy number Benchmark')
plt.ylabel('Gene copy number Starfish')
plt.xscale('log')
plt.yscale('log')
plt.title(f'r = {r}');
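
The comparison above relies on an inner join to align the two sets of per-gene counts before computing the correlation. The following is a minimal sketch of that alignment on made-up data. The gene names and counts are hypothetical, and the `detection_ratio` at the end is just one possible way to summarize a detection rate, not a quantity taken from the paper.

```python
import numpy as np
import pandas as pd

# Hypothetical per-gene counts; these numbers are invented for illustration only.
result_counts = pd.Series({"GENE_A": 90, "GENE_B": 400, "GENE_C": 1500, "GENE_D": 12})
benchmark_counts = pd.Series({"GENE_A": 100, "GENE_B": 450, "GENE_C": 1600, "GENE_E": 30})

# An inner join keeps only genes present in both series,
# so GENE_D and GENE_E are dropped from the comparison.
paired = pd.concat([result_counts, benchmark_counts], join="inner", axis=1)
paired.columns = ["starfish", "benchmark"]

# Pearson correlation of the raw counts for the shared genes.
r = np.corrcoef(paired["benchmark"], paired["starfish"])[0, 1]

# Assumed summary statistic: total detected counts relative to the benchmark.
detection_ratio = paired["starfish"].sum() / paired["benchmark"].sum()

print(paired)
print(f"r = {r:.4f}, detection ratio = {detection_ratio:.3f}")
```

Genes that appear in only one of the two results are silently excluded by the inner join, so it is worth checking how many genes survive the join before interpreting `r`.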

## Visualize results

This image applies a pseudo-color to each gene channel to visualize the position and size of all called spots in a subset of the test image.
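
As a rough illustration of pseudo-coloring, the sketch below maps an integer label image to RGB with a lookup table. It assumes a hypothetical `labels` array in which 0 is background and each positive integer marks pixels of spots called for one gene. It is not the starfish implementation, only one simple way to produce this kind of image with NumPy.

```python
import numpy as np

# Hypothetical label image: 0 is background, 1..n_genes mark pixels of called spots.
labels = np.zeros((6, 6), dtype=int)
labels[1:3, 1:3] = 1
labels[4:6, 3:5] = 2

n_genes = labels.max()
rng = np.random.default_rng(0)

# Lookup table with one RGB color per label; row 0 (background) stays black.
lut = np.zeros((n_genes + 1, 3))
lut[1:] = rng.uniform(0.2, 1.0, size=(n_genes, 3))

# Integer-array indexing maps every pixel label to its color,
# producing an image of shape (height, width, 3).
pseudo_color = lut[labels]
```

The resulting `pseudo_color` array can be passed to `plt.imshow` to display each gene in its own color.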