## RUN: smFISH Spot Detection

This notebook implements the spot detection for the pea3 smFISH data.

### Notes

Spots are detected and counted using `skimage.feature.blob_log` from scikit-image. However, since my production version of scikit-image (`0.13.0`) does not support 3D blob detection, I copied and adapted the newer version (`0.15.dev0`) from GitHub (see `katachi\utilities\skimage_blob.py`).

Note that this approach is quite sensitive to image quality and to the detection parameters, so the parameters will need to be adjusted for different smFISH experiments!
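
When adjusting the parameters for a new experiment, it can help to check which spot sizes a given sigma range actually covers. The sketch below assumes the linear sigma spacing that `blob_log` uses when `log_scale=False`, and the rule of thumb from the scikit-image documentation that a blob's radius in a 3D image is roughly `sqrt(3) * sigma`. The helper name `sigma_to_radius` is hypothetical, the values are the ones used in the detection cell of this notebook, and isotropic voxels are assumed.

```python
import numpy as np

def sigma_to_radius(sigmas, ndim=3):
    """Approximate blob radius (in voxels) for LoG sigmas in an ndim image."""
    # Rule of thumb: radius ~ sqrt(ndim) * sigma (assumes isotropic voxels)
    return np.sqrt(ndim) * np.asarray(sigmas, dtype=float)

# Sigma values scanned for min_sigma=1, max_sigma=4, num_sigma=10 (linear steps)
sigmas = np.linspace(1, 4, 10)
radii = sigma_to_radius(sigmas, ndim=3)

print("Smallest spot radius covered: %.2f voxels" % radii[0])
print("Largest spot radius covered:  %.2f voxels" % radii[-1])
```

If the spots in a new dataset are clearly smaller or larger than this range, `min_sigma` and `max_sigma` are probably the first parameters to revisit.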

### Prep

In [None]:
### Imports

# Generic
from __future__ import division
import os, sys, pickle
import numpy as np
import matplotlib.pyplot as plt

# smFISH
import katachi.utilities.loading as ld
import katachi.utilities.skimage_blob as blob

In [None]:
### Load data

# Prep loader
dirpath = r'data\experimentB\image_data'
loader = ld.DataLoader(dirpath, recurse=True)

# Load smFISH image data
smf_stacks, prim_IDs, _ = loader.load_dataset(r"pea3smFISH.tif")

# Load segmentations
seg_stacks, _, _ = loader.load_dataset(r"lynEGFP_seg.tif", IDs=prim_IDs)

# Get number of cells
n_cells = [len(np.unique(seg_stacks[prim_ID]))-1 for prim_ID in prim_IDs]
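
The cell count above subtracts 1 on the assumption that every segmentation contains the background label `0`. Here is a small sketch on a made-up label array (not real data) that makes this assumption explicit. The helper `count_cells` is hypothetical and not part of `katachi`.

```python
import numpy as np

def count_cells(seg, background=0):
    """Number of distinct labels in `seg`, excluding the background label."""
    labels = np.unique(seg)
    return int(np.sum(labels != background))

# Toy 2D "segmentation": background 0 and three cells labeled 1, 2 and 5
toy_seg = np.array([[0, 1, 1],
                    [2, 2, 0],
                    [5, 5, 5]])
n_toy_cells = count_cells(toy_seg)  # 3
```

Unlike `len(np.unique(seg)) - 1`, this variant does not undercount when a stack happens to contain no background voxels.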

### Some Testing

### Running Spot Detection

In [None]:
### Run spot detection

# Relevant parameters [see help(blob_log) for details]
min_sigma      = 1     # Minimum standard deviation for Gaussian kernel
max_sigma      = 4     # Maximum standard deviation for Gaussian kernel
num_sigma      = 10    # Number of standard deviation steps to run
spot_threshold = 0.22  # Determines which spots to keep (lower is more inclusive, higher is more stringent)
overlap        = 0.50  # Smaller blob eliminated when area of two blobs overlaps by this much (separation resolution)

# For each prim...
print("Running spot detection...")
for prim_ID in prim_IDs:
    
    # Report
    print("  Working on prim " + str(prim_ID))
    
    # Run spot detection
    blobs = blob.blob_log(smf_stacks[prim_ID], 
                          min_sigma=min_sigma, max_sigma=max_sigma, num_sigma=num_sigma, 
                          threshold=spot_threshold, overlap=overlap, log_scale=False)
    
    # Get corresponding cell label for each spot
    # (all columns of `blobs` except the last are coordinates; the last is sigma)
    blob_coords = blobs[:, :-1].astype(int)
    hit_labels  = seg_stacks[prim_ID][blob_coords[:,0], 
                                      blob_coords[:,1], 
                                      blob_coords[:,2]]
    
    # Get counts
    # (indexing by label assumes consecutive integer labels 0..N, with 0 as background)
    cell_counts = np.zeros(np.unique(seg_stacks[prim_ID]).size, dtype=int)
    hit_labels_id, hit_counts  = np.unique(hit_labels, return_counts=True)
    cell_counts[hit_labels_id] = hit_counts
    cell_counts = cell_counts[1:]  # Drop the background entry
    
    # Save spots file
    base_path  = [p for p in loader.data[prim_ID] 
                  if p.endswith(r"pea3smFISH.tif")][0]
    spots_path = base_path[:-4] + '_RNAspots.npy'
    np.save(spots_path, blobs)
    
    # Save counts file
    counts_path = base_path[:-4] + '_RNAcounts.npy'
    np.save(counts_path, cell_counts)
    
    # Report
    print("    Detected %d spots!" % blobs.shape[0])
    
# Report
print("Processing complete!")
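
The counting step in the loop indexes `cell_counts` directly by label value, which only works if the segmentation labels are consecutive integers starting at 0. Below is a minimal sketch of an alternative that also handles gaps in the labels, demonstrated on a made-up 2D array rather than real data. The helper name `count_spots_per_cell` is hypothetical, and this is one possible approach rather than the method used above.

```python
import numpy as np

def count_spots_per_cell(seg, spot_coords, background=0):
    """Return (cell_labels, counts): spots per non-background label in `seg`.

    seg         : integer label image of any dimensionality
    spot_coords : (n_spots, seg.ndim) integer array of spot positions
    """
    labels = np.unique(seg)
    labels = labels[labels != background]

    # Label under each spot (fancy indexing with one index array per axis)
    hits = seg[tuple(np.asarray(spot_coords, dtype=int).T)]
    hits = hits[hits != background]

    # Position of each hit in the sorted label list, then count per position
    idx = np.searchsorted(labels, hits)
    counts = np.bincount(idx, minlength=labels.size)
    return labels, counts

# Toy example: labels 1, 2 and 5 (note the gap) and four spots,
# one of which falls on the background
toy_seg = np.array([[0, 1, 1],
                    [2, 2, 0],
                    [5, 5, 5]])
toy_spots = np.array([[0, 1], [0, 2], [2, 0], [0, 0]])
labels, counts = count_spots_per_cell(toy_seg, toy_spots)
# labels -> [1 2 5], counts -> [2 0 1]
```

Returning the labels alongside the counts keeps the mapping between count and cell explicit, at the cost of a slightly different output format from the `_RNAcounts.npy` files saved above.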