**Run this cell if the dataset/variables are not present from running Preprocessing part1 - thresholding**

In [225]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from skimage.io import imread
sns.set_style('dark', rc={'image.cmap':'inferno'})
import matplotlib.axes as ax

# The full dataset is currently too large for GitHub, so the image file must be present locally.

data_nodrug = imread("../data/confocal_drug_panel/DMSO.tif")

import json
with open('../data/confocal_drug_panel/DMSO_metadata.json', mode='r') as f_nodrug:
    meta_nodrug = json.load(f_nodrug)

nodrug_slice = {}
for idx, channel in enumerate(meta_nodrug['channels']):
    nodrug_slice[channel] = data_nodrug[4,:,:,idx]    #add in the indexing when read in full dataset

data = nodrug_slice['actin']

Visualize the raw data and the masks generated from the global Otsu's threshold in Preprocessing part 1 - global thresholding

In [229]:
#parameters to adjust
minX1 = 300 #crop edges for a cell in the center of field of view
minY1 = 400
minX2 = 450 #crop edges for cell at the edge of the field of view
minY2 = 1
crop_size = 200 #pix
image_view_thresh = 0.1

#run
maxX1 = minX1 + crop_size
maxY1 = minY1 + crop_size
maxX2 = minX2 + crop_size
maxY2 = minY2 + crop_size

top = data.max() * image_view_thresh

fig, ax = plt.subplots(1, 3, figsize=(16, 4))
ax[1].imshow(data[minY1 : maxY1 , minX1 : maxX1], vmin=0, vmax=top, interpolation = 'nearest')
ax[0].imshow(data, vmin=0, vmax=top, interpolation = 'nearest')
ax[2].imshow(data[minY2 : maxY2, minX2: maxX2], vmin=0, vmax=top, interpolation = 'nearest')

from skimage import filters
thresh = filters.threshold_otsu(data)
print("the Otsu masking threshold for this dataset is:", thresh)
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
mask = np.zeros(nodrug_slice["actin"].shape)
mask[nodrug_slice["actin"] >=thresh] = 1
mask_zoom_center = mask[minY1 : maxY1 , minX1 : maxX1]
mask_zoom_edge = mask[minY2 : maxY2 , minX2 : maxX2]
ax[0].imshow(mask, vmin=0, vmax=1)
ax[1].imshow(mask_zoom_center, vmin=0, vmax=1)
ax[2].imshow(mask_zoom_edge, vmin=0, vmax=1)

crop_edge = data[minY2 : maxY2, minX2: maxX2]
crop_mid = data[minY1 : maxY1 , minX1 : maxX1]

### Introduction to rank filters: denoising with the median filter

Note the problems with the dataset that produce low-quality ROIs:
1) Noise
2) Uneven illumination

Rank filters are a subset of common image processing tools that modify images pixel-by-pixel by including information about the surrounding pixels. They do the heavy lifting in many algorithms for denoising (here the median filter), flattening illumination or background, and morphological manipulations. 

The skimage piece on rank filters (http://scikit-image.org/docs/dev/auto_examples/applications/plot_rank_filters.html) explains different types of filters and has sample code.
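As a rough illustration of the "rank" idea, the NumPy-only sketch below sorts the values in one neighborhood and picks a value by its rank. It is not the skimage implementation; the helper name `rank_of_neighborhood` and the toy pixel values are made up for this example.

```python
import numpy as np

# Toy 3x3 neighborhood with one aberrantly dark center pixel (illustrative values).
neighborhood = np.array([[12, 14, 13],
                         [15,  0, 16],
                         [11, 17, 18]])

def rank_of_neighborhood(values, rank):
    """Return the value at position `rank` in the sorted neighborhood (hypothetical helper)."""
    ranked = np.sort(values, axis=None)  # flatten, then sort ascending
    return ranked[rank]

n = neighborhood.size
minimum_value = rank_of_neighborhood(neighborhood, 0)      # what a minimum filter would output
median_value = rank_of_neighborhood(neighborhood, n // 2)  # what a median filter would output
maximum_value = rank_of_neighborhood(neighborhood, n - 1)  # what a maximum filter would output

print(minimum_value, median_value, maximum_value)
```

The three outputs differ only in which rank is picked from the same sorted list. The median (14 here) is unaffected by the single dark outlier, whereas the minimum is the outlier itself.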

In [230]:
from skimage.filters.rank import minimum as min_filter
from skimage.morphology import disk
import matplotlib.patches as patches

Because these filters work pixel-by-pixel, let's first zoom in on a small portion of our original dataset. Note that we specify 'nearest' interpolation so that imshow does not blur our image.

In [239]:
zoom = crop_edge[1:21 , 90:110]
plt.imshow(zoom, interpolation='nearest')

Let's simulate an aberrantly dark pixel and an aberrantly bright pixel at known locations (so that the lesson does not depend on variation in the input dataset).

In [240]:
dark_pix_x = 10
dark_pix_y = 5
bright_pix_x = 10
bright_pix_y = 15

zoom_noised = zoom.copy()
zoom_noised[bright_pix_y, bright_pix_x] = zoom.max()*2
zoom_noised[dark_pix_y, dark_pix_x] = 0
plt.imshow(zoom_noised, interpolation='nearest', vmin = 0, vmax = zoom.max())

These aberrant pixel values do not accurately reflect the local concentration of your fluorescent protein; instead they reflect a faulty detector element on your camera (for CCDs or sCMOS cameras) or noise. Rank filters use the information from the pixels in the neighborhood of such a pixel to assign it a new value in the "filtered" image.

Let's take a look at the dark pixel in the image and the pixels immediately surrounding it. 

In [241]:
plt.imshow(zoom_noised, interpolation='nearest')
nhood = 3 #works best with an odd number

#plt.gca().add_patch(plt.Rectangle((bright_pix_x-nhood/2, bright_pix_y-nhood/2), nhood, nhood, fill=None, color='y', lw=2))

plt.gca().add_patch(plt.Rectangle((dark_pix_x-nhood/2, dark_pix_y-nhood/2), nhood, nhood, fill=None, color='y', lw=2))

In [242]:
print("There is a lot of information about what that pixel value could be based on its neighbors. ",
      "Their values are:")
NHOOD = zoom_noised[int(dark_pix_y - (nhood-1)/2 ) : int(dark_pix_y + (nhood-1)/2) +1 , int(dark_pix_x - (nhood-1)/2) : int(dark_pix_x + (nhood-1)/2 )+1]
print( NHOOD)

print("ranked, the values are" , sorted(NHOOD.flatten()) )
print("with a median value of ", np.median(NHOOD))

print("If this were one step in a median filter, the dark pixel would be replaced by the neighborhood median,", np.median(NHOOD), "in the new, filtered image")

Of course, performing these manipulations on choice pixels is not a reproducible approach, and fortunately there are good built-in 2D median filters that will process each pixel in the image by considering its neighbors. A median filter applied to each pixel in the image above results in noise reduction. 
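To make "process each pixel by considering its neighbors" concrete, here is a slow, hand-written sketch of a square median filter in plain NumPy. The function name `naive_median_filter`, the reflected border handling, and the toy image are assumptions made for illustration; library filters are much faster and may treat image borders differently.

```python
import numpy as np

def naive_median_filter(image, size=3):
    """Slow, loop-based median filter over a size x size square (illustrative only)."""
    half = size // 2
    # Reflect the image at its borders so edge pixels also get a full neighborhood.
    # (This border handling is an assumption, not necessarily what a library does.)
    padded = np.pad(image, half, mode='reflect')
    filtered = np.empty_like(image)
    for row in range(image.shape[0]):
        for col in range(image.shape[1]):
            # The window is centered on (row, col) in the original image coordinates.
            window = padded[row:row + size, col:col + size]
            filtered[row, col] = np.median(window)
    return filtered

# Toy image: flat background of 10 with one bright and one dark outlier.
toy = np.full((7, 7), 10, dtype=np.uint16)
toy[2, 2] = 1000   # aberrantly bright pixel
toy[4, 5] = 0      # aberrantly dark pixel

cleaned = naive_median_filter(toy, size=3)
print(cleaned)
```

Each output pixel is computed from the unfiltered input, never from already-filtered neighbors, which is why the result is written to a separate array.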

Apply a median filter with the parameters in the example above using the skimage rank filters and skimage morphology libraries:

In [243]:
from skimage.filters.rank import median as median_filter
from skimage.morphology import square

plt.imshow(median_filter(zoom_noised, square(nhood) ) , interpolation='nearest')

This filter was applied to the entire image (pixel by pixel) and removed the aberrant bright and dark pixels. What else do you notice about the image?

### Structuring elements: determining the neighborhood for rank filters

In the example above, the neighborhood considered for determining the output for each pixel was the set of pixels immediately adjacent to our pixel of interest in a square (the pixels boxed in the example above). The shape and size of the neighborhood used in the algorithm is called the structuring element.

You have total freedom to choose any structuring element you want, but generally simple symmetric shapes are used because their effects are intuitive. In fact, we can let `scikit-image` generate reasonable structuring elements for us! This is a good idea to maximize repeatability.

See: http://scikit-image.org/docs/dev/api/skimage.morphology.html for some options.

"Disk," which approximates a circle around the filtered pixel, is a common choice for structuring element. The skimage.morphology library structuring elements can be viewed directly.

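A structuring element is just a small array that marks which neighbors count. The sketch below builds a disk and a square by hand in NumPy, similar in spirit to what `disk` and `square` return; the helpers `make_disk` and `make_square` are hypothetical names used only here.

```python
import numpy as np

def make_disk(radius):
    """Boolean disk-shaped footprint: True where the pixel lies within `radius` of the center."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x**2 + y**2) <= radius**2

def make_square(width):
    """Boolean square footprint of side `width`."""
    return np.ones((width, width), dtype=bool)

disk_fp = make_disk(2)
square_fp = make_square(5)

print(disk_fp.astype(int))
print("pixels in disk:", disk_fp.sum(), "| pixels in square:", square_fp.sum())
```

Both footprints span a 5x5 window, but the disk leaves out the corners, so the filter weighs neighbors more evenly in every direction.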
#### Influence of structuring element size on image output: median filter example

View the size and shape of the "disk" structuring element included in the skimage morphology package.

In [244]:
from skimage.morphology import disk

with sns.axes_style('white'):
    N = 5
    fig, axes = plt.subplots(1, N, figsize=(16, 3))
    for n, ax in enumerate(axes):
        np1 = n + 1
        ax.imshow(np.pad(disk(np1), N-n, 'constant'), interpolation='nearest')
        c = plt.Circle((np1 + N - n, np1 + N - n), radius=np1, fill=False, lw=4, color='b')
        ax.add_artist(c)
        ax.set_xlim(0, 2 * N + 2)
        ax.set_ylim(0, 2 * N + 2)

Visualize the effect of changing the size of the structuring element used to median filter the image.

In [245]:
extreme = 1000

from skimage.filters.rank import median as median_filter
from ipywidgets import interactive

im_filter = zoom_noised

@interactive
def apply_filter(s=(1, 10)):
    # Here we implement the median filtering
    fig, ax = plt.subplots(1, 3, figsize=(10, 5))
    ax[0].imshow(im_filter, interpolation = 'nearest')
    filtered = median_filter(im_filter, disk(s))
    dif_img = filtered.astype('int') - im_filter.astype('int')
    print("mean difference per pixel = " + str(np.mean(dif_img)) + " arbitrary units")
    print("percent change = " + str(100 * np.mean(dif_img) / np.mean(im_filter)) + "%")
    ax[1].imshow(filtered, interpolation = 'nearest')
    ax[2].imshow(dif_img, vmin=-extreme, vmax=extreme, cmap='coolwarm',interpolation = 'nearest')
apply_filter

Note that the size of the filter determines the pixel values in the output as well as how much the output is blurred. The larger the filter, the more neighbors it considers when deciding each new pixel value. A good rule of thumb is to choose the smallest filter that sufficiently flattens the visible noise in the background. Many of these operations do not have well-accepted statistical tests for determining appropriate parameters, so take care to record the processing steps and reproduce them with the same parameters.
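The trade-off between denoising and blurring can be seen in a one-dimensional toy signal. The sketch below is a minimal NumPy illustration; the helper `median_filter_1d`, the edge padding, and the signal values are assumptions chosen for the example, not part of the dataset.

```python
import numpy as np

def median_filter_1d(signal, width):
    """Sliding-window median with edge padding (illustrative helper; use an odd width)."""
    half = width // 2
    padded = np.pad(signal, half, mode='edge')
    return np.array([np.median(padded[i:i + width]) for i in range(len(signal))])

signal = np.zeros(20)
signal[4] = 5.0       # single-sample noise spike
signal[10:13] = 5.0   # genuine feature, 3 samples wide

small = median_filter_1d(signal, 3)   # small window
large = median_filter_1d(signal, 7)   # large window

print("original:", signal)
print("width 3: ", small)
print("width 7: ", large)
```

With a width of 3 the spike disappears while the 3-sample feature survives. With a width of 7 the feature is removed too, because it never fills more than half of the window.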

Let's choose a disk-shaped structuring element with a radius of 2 pixels (a 5x5 footprint).

In [246]:
f_size = 2
filtered_im = median_filter(data, disk(f_size))
filtered_crop_edge = median_filter(crop_edge, disk(f_size))
filtered_crop_mid = median_filter(crop_mid, disk(f_size))

fig, ax = plt.subplots(1, 3, figsize=(10, 5))
ax[0].imshow(filtered_im, interpolation = 'nearest')
ax[1].imshow(filtered_crop_edge, interpolation = 'nearest')
ax[2].imshow(filtered_crop_mid, interpolation = 'nearest')

The median filter removed much of the noise!

Now let's see how the filtering affects our mask, and compare to the mask we made earlier.

**Exercise**
1) Look up the documentation on the filters.
2) Fix the bugs in the hand-written filters.
3) Change the type of filter (median to mean, etc.).
4) Matching: match each output image to the operation that produced it (from a single input image).

In [249]:
thresh = filters.threshold_otsu(filtered_im)

masked_filtered_im = np.zeros(filtered_im.shape)
masked_filtered_crop_edge = np.zeros(filtered_crop_edge.shape)
masked_filtered_crop_mid = np.zeros(filtered_crop_mid.shape)

masked_filtered_im[filtered_im > thresh] = 1
masked_filtered_crop_edge[filtered_crop_edge > thresh] = 1
masked_filtered_crop_mid[filtered_crop_mid > thresh] = 1

fig, ax = plt.subplots(1, 3, figsize=(10, 5))
ax[0].imshow(mask, vmin=0, vmax=1)
ax[1].imshow(mask_zoom_center, vmin=0, vmax=1)
ax[2].imshow(mask_zoom_edge, vmin=0, vmax=1)

fig, ax = plt.subplots(1, 3, figsize=(10, 5))
ax[0].imshow(masked_filtered_im)
ax[2].imshow(masked_filtered_crop_edge)
ax[1].imshow(masked_filtered_crop_mid)

In this case, denoising creates smoother masks, but the resulting global threshold from Otsu's method still does not handle cells at the edges of the image as well as cells in the center. To understand why, see the histogram.

In [250]:
sns.distplot(filtered_im.flatten(), hist_kws={'log': False}, kde=False)
plt.axvline(thresh, ls='--', lw=2, c='r')
plt.gca().set_ylim([0, 200000])
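One way to see why a single global threshold struggles under uneven illumination is a toy image with a bright "center" cell and a dimmer "edge" cell. The sketch below is a NumPy approximation of Otsu's method (picking the threshold that maximizes between-class variance); the notebook itself uses `filters.threshold_otsu`. The function `otsu_threshold` and all pixel values are assumptions for illustration only.

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Histogram-based threshold maximizing between-class variance (sketch of Otsu's method)."""
    counts, edges = np.histogram(values, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    counts = counts.astype(float)
    # Cumulative class weights and means for every candidate split of the histogram.
    weight_low = np.cumsum(counts)
    weight_high = np.cumsum(counts[::-1])[::-1]
    mean_low = np.cumsum(counts * centers) / weight_low
    mean_high = (np.cumsum((counts * centers)[::-1]) / weight_high[::-1])[::-1]
    variance_between = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(variance_between)]

# Toy image (made-up values): dark background, one bright cell, one dim cell.
image = np.full((100, 100), 10.0)
image[40:60, 40:60] = 200.0   # well-illuminated cell in the center
image[0:20, 80:100] = 60.0    # same kind of cell, dimmer at the edge of the field

threshold = otsu_threshold(image)
toy_mask = image > threshold

center_found = toy_mask[40:60, 40:60].all()
edge_found = toy_mask[0:20, 80:100].any()
print("threshold:", threshold, "| center cell found:", center_found, "| edge cell found:", edge_found)
```

In this toy case the best split separates the bright cell from everything else, so the dim cell falls below the threshold along with the background. A local (adaptive) threshold or a background correction step is the usual way to address this.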