# Setup
First, we import all of the modules required for analysis.

Note that we also use the `%gui qt5` magic. This sets the GUI backend to Qt5, which is required for the image/results viewer we will use to visualize our results (`starfish.display()`).

In [1]:
%gui qt5
import numpy as np
from skimage import io
from starfish import Experiment, display, Codebook, ExpressionMatrix, BinaryMaskCollection, LabelImage
from starfish.image import Filter
from starfish.spots import FindSpots, DecodeSpots, AssignTargets
from starfish.types import Axes, Coordinates, Features, FunctionSource, TraceBuildingStrategies

# Load the data
First, we load the experiment from experiment files written in the spacetx format. The experiment file contains the locations of the image files as well as relevant metadata (e.g., codebook, dataset shape).

`./find_spots/make_find_spots_exp.py` was used to generate the experiment files. For details on constructing the experiment files, see the `make_experiment_file.ipynb` notebook.

In [2]:
# Load the data
exp = Experiment.from_json("./find_spots/experiment.json")

We then get the first field of view. We select the images tagged `primary`, which are the images of the spots. Similarly, we can access other types of images (e.g., nuclei or GFP) as defined by the experiment files. The image is returned as an `ImageStack` object, which contains the array data as well as metadata (e.g., coordinates).

In [None]:
im = exp.fov().get_image('primary')

# Viewing data
We can view the image data in a [`napari`](https://www.github.com/napari/napari) viewer with the `display()` command. This will open a new window that allows us to explore the image. You can pan/zoom as you would in Google Maps. You can adjust the display parameters in the palette on the left side of the window. You can scroll through different dimensions using the slider at the bottom of the window.

In [3]:
# View the image stack
display(im)

100%|██████████| 25/25 [00:07<00:00,  3.26it/s]


<napari.viewer.Viewer at 0x13ebdb630>

# Filtering the data
Next, to improve the spot contrast, we filter the image. The general pattern when using starfish filters is to first instantiate the filter object with the filter properties as parameters. The `Filter` object can then be used to filter `ImageStack` objects, and the same filter object can be reused on multiple `ImageStack` objects. For example, to create a Gaussian high pass filter with sigma=3, we do the following:

```
ghp = Filter.GaussianHighPass(sigma=3)
```
We can then filter the image using the `run()` method, which takes the `ImageStack` object to be filtered (first positional argument), a flag for verbose output (e.g., progress bars), and a flag for in-place computation (i.e., if set to `True`, the filtered image replaces the original image). In this example, we apply the Gaussian high pass filter we instantiated above to the `im` `ImageStack` with the progress bars on (`verbose=True`) and return the result as a new `ImageStack` object (`in_place=False`):
```
high_passed = ghp.run(im, verbose=True, in_place=False)
```

If we had a second `ImageStack` called `im2`, we could then filter `im2` with the same `ghp` object we previously used.
```
high_passed2 = ghp.run(im2, verbose=True, in_place=False)
```
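
To build intuition for what a Gaussian high pass filter does, the sketch below subtracts a Gaussian-blurred copy of an image from the original using `scipy.ndimage`, so that smooth background is removed and small bright features remain. This is an illustrative approximation and not necessarily how `Filter.GaussianHighPass` is implemented; the helper name `gaussian_high_pass`, the clipping of negative values, and the synthetic image are all assumptions made for the example.

```python
import numpy as np
from scipy import ndimage as ndi


def gaussian_high_pass(image, sigma):
    """Subtract a Gaussian-blurred copy of `image` to suppress smooth background."""
    image = np.asarray(image, dtype=float)
    blurred = ndi.gaussian_filter(image, sigma=sigma)
    # Assumption: negative residuals are clipped to zero so intensities stay non-negative.
    return np.clip(image - blurred, 0, None)


# Synthetic 2D image: a flat background with one bright pixel standing in for a spot.
image = np.full((32, 32), 0.5)
image[16, 16] = 1.0

filtered = gaussian_high_pass(image, sigma=3)

# The spot survives the filter while the flat background is removed.
print(filtered[16, 16] > filtered[0, 0])
```

A larger `sigma` removes only very smooth background, while a smaller `sigma` also suppresses larger structures, so the value is usually chosen relative to the expected spot size.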

In [4]:
# Filter the image
ghp = Filter.GaussianHighPass(sigma=3)
high_passed = ghp.run(im, verbose=True, in_place=False)


glp = Filter.GaussianLowPass(sigma=1)
low_passed = glp.run(high_passed, in_place=False, verbose=True)

100%|██████████| 25/25 [00:00<00:00, 463.68it/s]
100%|██████████| 25/25 [00:00<00:00, 300.20it/s]


In [5]:
# Max intensity project the spots image
mproj = Filter.Reduce((Axes.ZPLANE,), func='max', module=FunctionSource.np)
mip = mproj.run(low_passed)
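
Conceptually, a maximum intensity projection keeps, for every (y, x) position, the brightest value found along the z axis. The NumPy sketch below shows the operation on a small synthetic array; the array values and the axis order `(z, y, x)` are assumptions for illustration and are not taken from the `ImageStack` used in this notebook.

```python
import numpy as np

# Hypothetical stack with 3 z-planes of 2 x 2 pixels, ordered (z, y, x).
stack = np.array([
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.9, 0.1], [0.2, 0.5]],
    [[0.0, 0.7], [0.6, 0.1]],
])

# Collapse the z axis (axis 0), keeping the maximum value at each (y, x) position.
max_projection = stack.max(axis=0)
print(max_projection)
```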

In [6]:
display(mip)

100%|██████████| 1/1 [00:00<00:00, 70.95it/s]


<napari.viewer.Viewer at 0x10478e7b8>

# Detecting spots
To detect spots, we will use the starfish `FindSpots.BlobDetector` detector. This detector wraps the `skimage` `blob_log()` detector. For details on the parameters, see the docs [here](https://scikit-image.org/docs/dev/api/skimage.feature.html#skimage.feature.blob_log). The detector components in `starfish` use the same pattern as the `Filter` components: (1) instantiate the component object with the detector parameters, then (2) use the object to detect spots on an `ImageStack`.

In [7]:
# Find the spots
p = FindSpots.BlobDetector(
    min_sigma=1,
    max_sigma=10,
    num_sigma=10,
    threshold=0.001,
    measurement_type='mean',
)

spots = p.run(mip)
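
Because the detector wraps `skimage.feature.blob_log()`, it can help to try the underlying function on a synthetic image to see how the sigma parameters relate to spot size. The sketch below is only an illustration: the image, the spot position, and the spot width are made up, and the call goes directly to scikit-image rather than through `starfish`.

```python
import numpy as np
from skimage.feature import blob_log

# Synthetic image: one Gaussian-shaped spot (standard deviation 3 px) centred at (32, 32).
yy, xx = np.mgrid[0:64, 0:64]
spot_sigma = 3.0
image = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * spot_sigma ** 2))

# Same scale-space parameters as the BlobDetector call above.
blobs = blob_log(image, min_sigma=1, max_sigma=10, num_sigma=10, threshold=0.001)

# Each row is (y, x, sigma), where sigma is the scale at which the blob was detected.
for y, x, sigma in blobs:
    print(f"spot at y={y:.0f}, x={x:.0f}, sigma={sigma:.1f}")
```

The range from `min_sigma` to `max_sigma` should bracket the expected spot width, and `num_sigma` controls how finely that range is sampled.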


# Assign genes to spots
Next, we assign a gene (target) to each detected spot. The `Codebook` object contains the mapping of (`round`, `channel`) combinations to target.

In [8]:
# Assign the spots
codebook = exp.codebook
decoder = DecodeSpots.PerRoundMaxChannel(
        codebook=codebook,
        trace_building_strategy=TraceBuildingStrategies.SEQUENTIAL
)

decoded_intensities = decoder.run(spots=spots)
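
The idea behind per-round max-channel decoding can be sketched with plain NumPy: for each spot, take the brightest channel in every round, then look up the resulting channel sequence in the codebook. The toy codebook, the gene names `GENE_A` and `GENE_B`, and the intensity values below are hypothetical, and the sketch leaves out everything else the `starfish` decoder does.

```python
import numpy as np

# Hypothetical codebook: the brightest channel expected in each round -> target name.
toy_codebook = {
    (0, 1): "GENE_A",  # round 0 -> channel 0, round 1 -> channel 1
    (1, 0): "GENE_B",  # round 0 -> channel 1, round 1 -> channel 0
}

# Hypothetical spot intensities with shape (spots, rounds, channels).
intensities = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # brightest channels: (0, 1)
    [[0.1, 0.7], [0.6, 0.3]],  # brightest channels: (1, 0)
    [[0.8, 0.2], [0.9, 0.1]],  # brightest channels: (0, 0), not in the codebook
])

# For every spot and round, find the index of the brightest channel.
max_channels = intensities.argmax(axis=2)

# Look up each per-round channel sequence; unmatched sequences get None.
targets = [toy_codebook.get(tuple(int(c) for c in row)) for row in max_channels]
print(targets)
```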

In [9]:
display(mip, decoded_intensities)

<napari.viewer.Viewer at 0x1040f4b70>

# Cell segmentation
To assign detected targets to individual cells, we must first segment the cells to determine their borders. `starfish` contains some basic cell segmentation tools, but we often use an external pipeline for segmentation (e.g., [CellProfiler](https://cellprofiler.org/) or [ilastik](https://www.ilastik.org/)).

### Segmentation in external pipelines
To segment cells in an external pipeline, we first write the relevant images to disk for processing. In this case, the cells expressed GFP throughout the cell body, so we will use that to segment the cells using CellProfiler. For details on the [CellProfiler](https://cellprofiler.org/) segmentation pipeline, see `./find_spots/segment_gfp.cpproj`. The pipeline outputs a label image in which each segmented cell body is assigned a unique integer label. If you wish to inspect the image, see `./find_spots/gfp_segmentation.tiff`.


In [10]:
from InSituToolkit.analysis import save_stack

# Save the GFP stack
gfp = exp.fov().get_image('stain')
gfp_mip = mproj.run(gfp)
save_stack(gfp_mip, './find_spots/gfp.tif')

100%|██████████| 25/25 [00:08<00:00,  2.90it/s]
100%|██████████| 1/1 [00:00<00:00, 80.05it/s]


### Loading segmentation results from external pipelines
Note: this part of the `starfish` API is a work in progress and these steps will be consolidated in the near future.

Once we generate a label image, we can load it into starfish and then use it to assign spots to cells. To load the image we:

1. Load the label image generated by CellProfiler as a `np.ndarray` using `skimage.io` 
2. All images in `starfish` must have physical and pixel coordinates. Therefore, we get the coordinates from the source GFP image (i.e., the one that was saved to disk) to be attached to the resulting `LabelImage` object.
3. We instantiate a `LabelImage` object containing the label image generated by CellProfiler and the physical/pixel coordinates of the image.
4. Finally, we create the `BinaryMaskCollection` object which can then be used to assign spots to cells.

In [11]:
# Load the label image generated in CellProfiler
label_image = io.imread('./find_spots/gfp_segmentation.tiff')

# Get the physical ticks from the original GFP image
yc = gfp.xarray.yc.values
xc = gfp.xarray.xc.values
physical_ticks = {Coordinates.Y: yc, Coordinates.X:xc}

# Get the pixel values from the original GFP image
y = gfp.xarray.y.values
x = gfp.xarray.x.values
pixel_coords = {Axes.Y: y, Axes.X:x}

# Create the label image
label_im = LabelImage.from_label_array_and_ticks(
    label_image,
    pixel_coordinates=pixel_coords,
    physical_coordinates=physical_ticks,
    log=gfp_mip.log
)
# Create the mask collection
masks = BinaryMaskCollection.from_label_image(label_im)
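
To make the relationship between a label image and per-cell masks concrete, the following NumPy sketch splits a small label array into one boolean mask per label. It assumes that label 0 marks background, and it only illustrates the idea: the `starfish` objects used above also carry the pixel and physical coordinates, which this sketch ignores.

```python
import numpy as np

# Hypothetical label image: 0 is background, 1 and 2 are two segmented cells.
toy_labels = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
])

# One boolean mask per non-background label.
cell_labels = [int(label) for label in np.unique(toy_labels) if label != 0]
toy_masks = {label: toy_labels == label for label in cell_labels}

# Report the area (in pixels) covered by each cell mask.
for label, mask in toy_masks.items():
    print(label, int(mask.sum()))
```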

# Creating cell x gene tables
We assign spots to cells using `AssignTargets`. `AssignTargets` uses the spot coordinates in `decoded_intensities` (`IntensityTable`) and the cell masks in `masks` (`BinaryMaskCollection`) to determine the cell membership of each detected spot/target. The output is an `ExpressionMatrix` object, a matrix in which each row is a cell, each column is a gene/target, and each element is the number of counts of a given target in a given cell.

Generally, the `ExpressionMatrix` is the interface between `starfish` analysis and other statistical analyses. The coordinates are for the centroid of the cell. The `cell_id` is the label index in the segmentation label image.

In [12]:
al = AssignTargets.Label()
labeled = al.run(masks, decoded_intensities)
cg = labeled.to_expression_matrix()
cg

<xarray.ExpressionMatrix 'expression_matrix' (cells: 6, genes: 1)>
array([[ 2],
       [30],
       [ 1],
       [ 6],
       [ 8],
       [ 3]])
Coordinates:
    x        (cells) float64 596.0 523.5 306.0 363.5 812.0 994.0
    y        (cells) float64 603.5 710.0 740.0 757.0 742.5 550.0
    z        (cells) float64 0.0 0.0 0.0 0.0 0.0 0.0
    xc       (cells) float64 1.007e+05 1.007e+05 ... 1.007e+05 1.007e+05
    yc       (cells) float64 4.324e+04 4.324e+04 ... 4.324e+04 4.324e+04
    zc       (cells) float64 1.771e+03 1.771e+03 ... 1.771e+03 1.771e+03
    area     (cells) float64 nan nan nan nan nan nan
  * genes    (genes) object 'nan'
    cell_id  (cells) object '0' '1' '2' '3' '4' 'nan'
Dimensions without coordinates: cells
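
The counting step can be sketched in NumPy as a lookup followed by a tally: each spot takes the label under its (y, x) pixel, and spots are counted per (cell, gene) pair. All data here are hypothetical, and the sketch ignores the coordinate handling that `AssignTargets` and `ExpressionMatrix` provide.

```python
import numpy as np

# Hypothetical label image (0 = background) and decoded spots.
toy_labels = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 2, 2],
])
spot_y = np.array([0, 1, 2, 1, 0])
spot_x = np.array([0, 1, 3, 3, 2])
spot_gene = np.array(["GENE_A", "GENE_B", "GENE_A", "GENE_A", "GENE_B"])

# Look up the cell label under each spot; the last spot falls on background.
spot_cell = toy_labels[spot_y, spot_x]

cells = np.unique(toy_labels[toy_labels != 0])
genes = np.unique(spot_gene)

# Tally spots into a cells x genes count matrix, skipping background spots.
counts = np.zeros((len(cells), len(genes)), dtype=int)
for cell, gene in zip(spot_cell, spot_gene):
    if cell != 0:
        row = np.searchsorted(cells, cell)
        col = np.searchsorted(genes, gene)
        counts[row, col] += 1

print(counts)
```

Rows follow the sorted cell labels and columns follow the sorted gene names, which mirrors the cells x genes layout of the matrix shown above.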

# Viewing the results
We can view the segmentation masks and the spots all overlaid on the image as shown below.

In [13]:
viewer = display(stack=mip, spots=decoded_intensities)
viewer.add_labels(label_image);