Fluorescence datasets ideally reflect a relationship between the pixels in an image and the location and local density of your fluorescent molecule in a sample. However, properties of the detectors, optics, or even the samples can confound direct interpretation of this data. Here we will present some operations that can mitigate these effects to achieve robust hypothesis testing. 

Hypothesis: Treatment with drug Y will cause a decrease in the total amount of protein Y. You have saved the control dataset as "no_drug.tif" and the drugged cell dataset as "drug.tif". 

First some boilerplate code to make it easier to access useful libraries, and to make it easier to visualize data in the notebook.

In [163]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [164]:
sns.set_style('dark', rc={'image.cmap':'inferno'})

Import an image file and associated metadata as we learnt yesterday.

In [181]:
from skimage.io import imread

# The real data to be imported in class is commented out for now because the files are currently too large for GitHub

data_drug = imread("../data/confocal_drug_panel/drugA.tif")
# data_nodrug = imread("../data/confocal_drug_panel/DMSO.tif")
data_nodrug = imread("../data/confocal_drug_panel/temp_DMSO.tif")

In [182]:
import json
with open('../data/confocal_drug_panel/DMSO_metadata.json', mode='r') as f_nodrug:
    meta_nodrug = json.load(f_nodrug)

drug_slice = {}
nodrug_slice = {}
for idx, channel in enumerate(meta_nodrug['channels']):
    drug_slice[channel] = data_drug[4,:,:,idx]
    nodrug_slice[channel] = data_nodrug #[4,:,:,idx]    #add in the indexing when read in full dataset
    print(channel)

Display the images to make sure everything worked as expected...

In [153]:
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
ax[0].imshow(nodrug_slice["actin"])
ax[1].imshow(nodrug_slice['nucleus'])
ax[2].imshow(nodrug_slice["your_fav_protein"])

fig, ax = plt.subplots(1, 3, figsize=(16, 4))
ax[0].imshow(drug_slice["actin"])
ax[1].imshow(drug_slice['nucleus'])
ax[2].imshow(drug_slice["your_fav_protein"])

## Image pre-processing: a motivating example

So here we have images of fixed cells in three channels -- cell bodies labeled with an actin stain, nuclei labeled with DAPI, and a third protein "your_fav_protein" that responds to drug treatment. Just by looking at the images it seems like the protein is shifting from the nuclei to the cell body once the drug is applied (always visualize your intermediates!), but it is unclear if the drug treatment changes the total amount of protein per cell as well.

To address these questions, you will need to do the following --

1. Make a mask of the actin channel to identify pixels within the cell bodies
2. Make a mask of the nuclear channel to identify pixels within the nucleus
3. Determine the signal coming from *your favorite protein* within these regions of interest. 

Today, we will define the ROIs in the image. We'll find that the quality of masks can be improved by preprocessing the images by quantitatively determining thresholds and filtering to remove noise. We will then cover how to clean up the mask and turn it into an accurate ROI using morphological image processing. 

Tomorrow (Day 4), we will cover how to design your image processing pipeline to deal with some trickier problems, such as quantifying fluorescence in the ROIs determined here. 

**Preprocessing missteps are a good way to get a paper retracted. We argue that it's easier to make these missteps when doing things manually, but it's not *impossible* to make them computationally. In either case, if you don't check the intermediate steps of your analysis, it's no good. Always visualize your intermediates!**

### Making masks to localize cell bodies

In [169]:
nodrug_slice['actin'].dtype
data = nodrug_slice['actin']

**Find an appropriate threshold that defines the cell bodies accurately across the image using the sliding bar.**

First, view the image more closely.

In [155]:
#parameters to adjust
minX1 = 700 #crop edges for a cell in the center of field of view
minY1 = 900
minX2 = 1200 #crop edges for cell at the edge of the field of view
minY2 = 1800
crop_size = 200 #pix
image_view_thresh = 0.1

#run
maxX1 = minX1 + crop_size
maxY1 = minY1 + crop_size
maxX2 = minX2 + crop_size
maxY2 = minY2 + crop_size

top = data.max() * image_view_thresh

fig, ax = plt.subplots(1, 3, figsize=(16, 4))
ax[1].imshow(data[minY1 : maxY1 , minX1 : maxX1], vmin=0, vmax=top)
ax[0].imshow(data, vmin=0, vmax=top)
ax[2].imshow(data[minY2 : maxY2, minX2: maxX2], vmin=0, vmax=top)

Determine using the sliding bar which threshold gives the best mask across the image.

In [156]:
from ipywidgets import interactive
@interactive
def show_masks(thresh=(0, data.max() * 0.1, 20)):
    fig, ax = plt.subplots(1, 3, figsize=(16, 4))
    mask = np.zeros(nodrug_slice["actin"].shape)
    mask[nodrug_slice["actin"] >=thresh] = 1
    mask_zoom_center = mask[minY1 : maxY1 , minX1 : maxX1]
    mask_zoom_edge = mask[minY2 : maxY2 , minX2 : maxX2]
    ax[0].imshow(mask, vmin=0, vmax=1)
    ax[1].imshow(mask_zoom_center, vmin=0, vmax=1)
    ax[2].imshow(mask_zoom_edge, vmin=0, vmax=1)
show_masks

### Automated detection of foreground using Otsu's method

Nobuyuki Otsu proposed a method (now very widely used) to choose a threshold automatically. Simply put, the idea is to assume that background pixels (unwanted) and foreground pixels (your signal) follow a bimodal distribution, i.e. that the background pixels form one well-defined group on a histogram and the brighter signal pixels form another. The method then picks the threshold that best separates the two groups, which it does by minimizing the variance of pixel intensities within each group.
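
As a rough illustration of that idea, the sketch below scans every candidate threshold on a histogram and keeps the one that maximizes the variance between the two groups (equivalent to minimizing the variance within them). The function name `otsu_threshold` and the synthetic pixel values are assumptions made for this example; for real data, use `filters.threshold_otsu` as in the next cell.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Illustrative helper: threshold that maximizes between-class variance."""
    counts, edges = np.histogram(np.ravel(image), bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    counts = counts.astype(float)

    # Number of pixels in the dark class (left) and bright class (right)
    w0 = np.cumsum(counts)
    w1 = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of each class at every candidate split
    m0 = np.cumsum(counts * centers) / np.maximum(w0, 1)
    m1 = (np.cumsum((counts * centers)[::-1]) / np.maximum(w1[::-1], 1))[::-1]

    # Between-class variance for a split between bin i and bin i + 1
    between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(between)]

# Synthetic bimodal "image": dim background pixels and bright signal pixels
rng = np.random.default_rng(0)
background = rng.normal(100, 10, size=5000)
foreground = rng.normal(200, 10, size=5000)
toy_pixels = np.concatenate([background, foreground])

toy_thresh = otsu_threshold(toy_pixels)
print("toy Otsu threshold:", toy_thresh)
```

When the two groups are this well separated, the threshold lands in the gap between them. The method becomes less reliable as the two histogram peaks start to overlap.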

In [103]:
from skimage import filters

thresh = filters.threshold_otsu(data)
print("the objective masking threshold for this dataset is:", thresh)

In [157]:
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
mask = np.zeros(nodrug_slice["actin"].shape)
mask[nodrug_slice["actin"] >=thresh] = 1
mask_zoom_center = mask[minY1 : maxY1 , minX1 : maxX1]
mask_zoom_edge = mask[minY2 : maxY2 , minX2 : maxX2]
ax[0].imshow(mask, vmin=0, vmax=1)
ax[1].imshow(mask_zoom_center, vmin=0, vmax=1)
ax[2].imshow(mask_zoom_edge, vmin=0, vmax=1)

Note that the global threshold produces masks of different quality at the edges and the center of the image because of the uneven illumination throughout the sample. Observe the histogram of pixel intensities to see why this might be the case.

In [158]:
# sns.distplot is deprecated in recent seaborn releases; plot the log-scaled histogram with matplotlib
plt.hist(data.flatten(), bins=50, log=True)
plt.axvline(thresh, ls='--', lw=2, c='r')

How might the unevenness of the illumination compromise the algorithm?

## Rank filters: local image manipulations

### Steal Noah's intro module on rank filters to explain how to flatten the field (or, alternatively subtract local background)

In [162]:
#flatten the field to show that it improves Otsu's method--show the new histogram
    #Methods: rolling ball, subtract Gaussian blur, large min filter
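
One of the methods listed above, subtracting a Gaussian blur, can be sketched on synthetic data. Everything in this example is an assumption made for illustration: the grid of square "cells", the additive background ramp, the noise level, and `sigma=25`. It is not the course dataset, and real images may need a different background model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import filters

rng = np.random.default_rng(1)

# Synthetic ground truth: a grid of 10x10 pixel "cells" (illustrative only)
truth = np.zeros((200, 200), dtype=bool)
for cy in range(20, 200, 40):
    for cx in range(20, 200, 40):
        truth[cy - 5:cy + 5, cx - 5:cx + 5] = True

# Additive background that ramps from 0 to 200 across the field, plus noise
ramp = np.tile(np.linspace(0, 200, 200), (200, 1))
raw = ramp + 100 * truth + rng.normal(0, 5, size=truth.shape)

# Estimate the slowly varying background with a blur much wider than a cell,
# then subtract it (sigma=25 is an assumed value; tune it for real data)
background = gaussian_filter(raw, sigma=25)
flattened = raw - background

def iou(mask, reference):
    """Intersection over union between two boolean masks."""
    return (mask & reference).sum() / (mask | reference).sum()

mask_raw = raw >= filters.threshold_otsu(raw)
mask_flat = flattened >= filters.threshold_otsu(flattened)
iou_raw = iou(mask_raw, truth)
iou_flat = iou(mask_flat, truth)
print(f"overlap with truth before flattening: {iou_raw:.2f}")
print(f"overlap with truth after flattening:  {iou_flat:.2f}")
```

The blur width matters: it should be much larger than the objects of interest, otherwise the "background" estimate absorbs the signal you are trying to keep.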

In [None]:
#show new masks-- they are improved but still have holes and rogue noisy pixels
#removing shot noise: median filtering (shown below)
#morphological processing (From Morphological Image Processing module)

### Removing Shot Noise from your Image -- Median Filtering

To get better at object detection, we can leverage various properties of the pixels. With time, you will be able to leverage pretty much any property you can articulate, but for now, let's use the idea that the pixels that are hanging out in the wrong places are surrounded by other pixels that are properly classified. Let's make them listen to their neighbours. There are many ways to do this. One useful method to know is called Median filtering. It goes pixel by pixel, and replaces each pixel with the median of its surroundings. Let's load our image slice...

In [None]:
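from scipy.ndimage import median_filter
# Assumption: original_slice is the central crop of the actin channel viewed above
original_slice = data[minY1:maxY1, minX1:maxX1]
plt.imshow(original_slice)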
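
To see what "listening to the neighbours" means at the pixel level, here is a small worked example on toy arrays (the arrays `toy` and `step` are made up for illustration). A single rogue pixel is outvoted by its eight neighbours, while a straight edge between two regions survives a 3x3 median filter.

```python
import numpy as np
from scipy.ndimage import median_filter

# A flat patch with one rogue bright pixel in the middle
toy = np.full((5, 5), 10)
toy[2, 2] = 255
smoothed = median_filter(toy, size=3)
print(smoothed)

# A clean vertical edge: left half dark, right half bright
step = np.zeros((5, 6))
step[:, 3:] = 100
step_filtered = median_filter(step, size=3)
print(np.array_equal(step, step_filtered))
```

This is the main reason the median is preferred over a mean filter for this job: a mean would smear the rogue pixel into its neighbours and blur the edge.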

Tech tip: These images were taken with a confocal microscope, which uses a PMT (photomultiplier tube) with high sensitivity. However, because this detector operates in a low-photon regime, shot noise (Poisson distributed) can add substantial deviation of pixel values from the local fluorescence intensities they represent. Shot noise is commonly removed with the median filter, although other rank filters exist.
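
A minimal simulation can make the shot-noise point concrete. It assumes made-up photon rates (20 for background, 80 for a square object) and draws each pixel from a Poisson distribution, whose standard deviation is the square root of its mean. It then compares how far the noisy and the median-filtered images sit from the known truth.

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(0)

# Assumed "true" photon rates: dim background with one bright square object
expected = np.full((100, 100), 20.0)
expected[30:70, 30:70] = 80.0

# Each pixel records a Poisson-distributed photon count around its true rate
noisy = rng.poisson(expected).astype(float)
denoised = median_filter(noisy, size=3)

def rmse(image, reference):
    """Root-mean-square deviation from the reference image."""
    return np.sqrt(np.mean((image - reference) ** 2))

rmse_noisy = rmse(noisy, expected)
rmse_denoised = rmse(denoised, expected)
print(f"RMS error before filtering: {rmse_noisy:.2f}")
print(f"RMS error after 3x3 median: {rmse_denoised:.2f}")
```

With real data there is no known truth to compare against, which is why the interactive cell below shows the difference image instead.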

In [None]:
from ipywidgets import interactive

@interactive
def apply_filter(size=(1, 21)):
    fig, ax = plt.subplots(1, 3, figsize=(10, 5))
    
    # Here we implement the median filtering
    filtered = median_filter(original_slice, size=size)
                             
    ax[0].imshow(original_slice)
    ax[1].imshow(filtered)
    dif_img = filtered.astype('int') - original_slice.astype('int')
    
    extreme = 10000
    im = ax[2].imshow(dif_img, vmin=-extreme, vmax=extreme, cmap='coolwarm')
    
    print("mean difference in image = " + str(np.mean(dif_img)) + " arbitrary units")
    print("percent change = " + str(100 * np.mean(dif_img) / np.mean(original_slice)) + "%")
apply_filter



Note that the filter size sets how many neighbours contribute to each output pixel: the larger the filter, the more neighbours it examines before deciding what the new pixel value should be. A good rule of thumb is to choose the smallest filter that sufficiently flattens the visible noise in the background. Many of these operations do not have well-accepted statistical tests for determining the appropriate parameters, so take care to record the parameters you use and to reproduce processing steps with the same values.
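
The trade-off behind that rule of thumb can be sketched with a synthetic image (the photon rates and the 3-pixel-wide feature are assumptions for illustration). Larger filters flatten the background more, but once the filter is wider than about twice the feature, the feature itself is voted out.

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(2)

# Dim background with a thin bright feature, 3 pixels wide
truth = np.full((64, 64), 20.0)
truth[:, 30:33] = 100.0
noisy = rng.poisson(truth).astype(float)

results = {}
for size in (1, 3, 5, 9):
    filtered = median_filter(noisy, size=size)
    background_std = filtered[:, :20].std()   # noise left in a feature-free region
    feature_mean = filtered[:, 31].mean()     # brightness at the feature's center
    results[size] = (background_std, feature_mean)
    print(f"size={size}: background std = {background_std:.2f}, "
          f"feature mean = {feature_mean:.1f}")
```

In this toy case, a 3x3 filter reduces the background noise while keeping the feature, whereas a 9x9 filter removes the feature along with the noise.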

Let's choose a filter size of 3x3.

In [None]:
# Assumption: whole_image is the full actin channel stored in `data` above
whole_image = data
filtered_slice = median_filter(original_slice, size=3)
filtered_image = median_filter(whole_image, size=3)

fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(filtered_image)
ax[1].imshow(filtered_slice)

Now let's see how the filtering affects our mask, and compare to the mask we made earlier.
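
The next cell calls `threshold_image`, which is not defined anywhere in this section. A minimal sketch of what it might look like, assuming it mirrors the `>=` masking used for the actin channel earlier (the name comes from the cell below; the body is a guess):

```python
import numpy as np

def threshold_image(image, thresh):
    """Hypothetical helper: 1 where image >= thresh, 0 elsewhere."""
    return (np.asarray(image) >= thresh).astype(np.uint8)

# Tiny check on a made-up array
demo = np.array([[0, 5, 10], [20, 30, 40]])
demo_mask = threshold_image(demo, 20)
print(demo_mask)
```

Returning an unsigned integer mask keeps it compact, but remember to cast to a signed type before subtracting one mask from another.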

In [None]:
masked_filtered_slice = threshold_image(filtered_slice, filters.threshold_otsu(filtered_slice))
masked_filtered_image = threshold_image(filtered_image, filters.threshold_otsu(filtered_image))

fig, ax = plt.subplots(1, 3, figsize=(10, 5))
ax[0].imshow(masked_filtered_image)
ax[1].imshow(masked_slice)
ax[2].imshow(masked_filtered_slice)

Great, but still not perfect! What do you think would happen if we tried to apply the mask before the filter? (this could be an exercise)

In [None]:
filtered_masked_slice = median_filter(masked_slice, size=3)
filtered_masked_image = median_filter(masked_whole_image, size=3)

fig, ax = plt.subplots(1, 3, figsize=(10, 5))
ax[0].imshow(masked_filtered_slice)
ax[1].imshow(filtered_masked_slice)
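# Cast to int so the difference of two masks can be negative without wrapping or raising an error
ax[2].imshow(masked_filtered_slice.astype(int) - filtered_masked_slice.astype(int), cmap='coolwarm')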

Why does doing this not make sense? (this leads to discussion of morphological operations...)