# Last Lesson: Image Quantification

Two of the most common questions people ask of a fluorescence dataset are (1) where their favorite fluorescent molecule is located in a sample and (2) what its local fluorescence intensity is. Today, we will first review the previously introduced preprocessing steps that allowed us to detect regions of interest (ROIs) in a fluorescent image by creating a binary mask with an intensity threshold. Then we will learn some new steps, including:

* Morphological operations to optimize a mask
* Quantification of the change in localization and amount of your favorite protein

## Download files:

We will work with data from an experiment to determine whether treatment with a drug causes a shift in the localization of a protein.

**Hypothesis: Treatment with drug A will cause a decrease in the total amount of protein Y.** 

Make sure you have downloaded the following files in your working directory:

* DMSO.tif
* DMSO_metadata.json
* drugA.tif
* drugA_metadata.json

Our end goal is to quantify the change in nuclear localization and amount of your favorite protein (**yfp**) with drug treatment. We would like to be able to answer two questions: 

1) Does the *total* amount of yfp per cell change with drug treatment?

2) How does the localization of yfp change between the nucleus and the cytoplasm? 

Addressing these questions requires care when choosing the preprocessing algorithms to apply and their ordering, as well as batch processing across datasets.

## Pipeline design: ordering steps for fluorescence quantification

    (1) Read image data and metadata files
    (2) For a confocal z-stack (as in this case), extract a single slice to preprocess
    (3) View your sample image and decide what preprocessing algorithms to use
    (4) Filter out noise in the background of the sample image
    (5) Get an intensity threshold using the filtered image to create a binary mask
    (6) Optimize the mask using morphological operations
    (7) Identify and quantify the fluorescence intensity of ROIs using the optimized mask

## Load libraries

In [1]:
# Load all of the useful libraries
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from skimage.io import imread
import json

# make sure our plots are shown inline
%matplotlib inline

# set the plot display format to your favorite settings
sns.set_style('dark', rc={'image.cmap':'inferno'})

## Step 1: Read images and metadata.json files

In [2]:
# set image path, make sure to add a forward slash at the end!
imagepath = "../data/confocal_drug_panel/"

# set image file names
imagename_drug = "drugA.tif"
imagename_nodrug = "DMSO.tif"

# set metadata file name (we only need to load one file for our purpose here)
metaname = "DMSO_metadata.json"

In [3]:
# import images
data_drug = imread(imagepath + imagename_drug)
data_nodrug = imread(imagepath + imagename_nodrug)

In [4]:
# check out the image dimension/shape
# what do you think each dimension means?
data_drug.shape

In [5]:
# the answer to above question
# Want to know how the function "format" works? Check out this link: https://pyformat.info/
print("The first dimension {0:d} represents number of slices in this confocal z-stack".format(data_drug.shape[0]))
print("The second dimension {0:d} represents number of rows (i.e., y) in each slice of image".format(data_drug.shape[1]))
print("The third dimension {0:d} represents number of columns (i.e., x) in each slice of image".format(data_drug.shape[2]))
print("The fourth dimension {0:d} represents number of fluorescent channels".format(data_drug.shape[3]))

# Challenge: write code in the cell provided below to display slice 3, channel 1 in data_drug

In [6]:
# display slice 3 and channel 1 in data_drug
plt.imshow(data_drug[2, :, :, 0])
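
The indexing in the challenge above can be tried out on a small synthetic array first. The sketch below assumes the same axis order as our data, (slice, row, column, channel); the array `stack` and its sizes are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for a confocal z-stack: (slice, row, column, channel)
stack = np.zeros((5, 4, 6, 3), dtype=np.uint16)

# Mark slice 3, channel 1 so that we can recognize it later.
# Python counts from zero, so this is index 2 on the first axis
# and index 0 on the last axis.
stack[2, :, :, 0] = 100

n_slices, n_rows, n_cols, n_channels = stack.shape
print(n_slices, n_rows, n_cols, n_channels)

# Fixing the slice and the channel leaves a single 2D image (rows, columns)
plane = stack[2, :, :, 0]
print(plane.shape)
print(plane.min(), plane.max())
```

Because indexing starts at zero, "slice 3" is index 2, which is why the cell above uses `data_drug[2, :, :, 0]`.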

### How does a metadata file help us?

In [7]:
# import metadata
# The "with" statement makes sure the metadata file is closed at the end of the block
# mode 'r' stands for read only, which prevents any alteration to the original file
with open(imagepath + metaname, mode='r') as f_nodrug:
    meta_nodrug = json.load(f_nodrug)

In [8]:
# Recall metadata.json file is a dictionary with keys and corresponding values
# Let's print out each key and its corresponding value
# Note: the built-in dict method items() returns a view of the dict's (key, value) pairs
for key, value in meta_nodrug.items():
    print("This is the key: ", key)
    print("This is the value: ", value)
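
To see how a metadata dictionary can connect channel names to array indices, here is a minimal sketch. The JSON text is hypothetical: only the `channels` key and the three channel names appear in this lesson, while their order and the other keys (`pixel_size_um`, `cell_type`) are assumptions made up for illustration.

```python
import json

# Hypothetical metadata content (see the assumptions stated above)
metadata_text = """
{
    "channels": ["actin", "nucleus", "your_fav_protein"],
    "pixel_size_um": 0.1,
    "cell_type": "example"
}
"""

# json.loads parses a string; json.load (used above) parses an open file
meta = json.loads(metadata_text)

# Map each channel name to its position along the channel axis of the image
channel_index = {name: idx for idx, name in enumerate(meta["channels"])}

for name, idx in channel_index.items():
    print(name, "is stored at channel index", idx)
```

With such a mapping we can ask for a channel by name instead of remembering which index it occupies.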

## Step 2: Extract a single slice of image from this confocal z-stack

In [9]:
# Initialize two empty dicts to store the slice of z-stack we want to analyze (curly brackets!)
# Because there are three channels, our final dictionary will have keys = channel name,
# and value = corresponding image in the channel
drug_slice = {}
nodrug_slice = {}

# Initialize slice number we want to extract
slicenum = 3

# Use your favorite for loop to fill in the dicts!
# Remember that meta_nodrug['channels'] is a list, the "enumerate" function allows you to index the list
for idx, channel in enumerate(meta_nodrug["channels"]):
    
    # recall the dimension/shape of the z-stack is (slice, row, column, channel)
    drug_slice[channel] = data_drug[slicenum,:,:,idx]
    nodrug_slice[channel] = data_nodrug[slicenum,:,:,idx]
    
    # print out the channel to make sure we fill in the dict with correct keys
    print(channel)

## Step 3: View your images (always!)

In [10]:
# Settings to increase the image contrast to be used in the "imshow" function
# minimum intensity to display:
low =  0
# maximum intensity to display:
# 40% of the maximum intensity in the actin channel (this percentage is arbitrary)
# recall that lowering the percentage of max intensity to display makes the contrast brighter
top = nodrug_slice["actin"].max()*0.4

# initialize a new figure, then imshow no drug data in a subplot
# ax is a handle to each subplot that allows us to control each subplot
fig, ax = plt.subplots(1, 3, figsize = (20, 5))
ax[0].imshow(nodrug_slice["actin"], vmin = low, vmax = top)
ax[1].imshow(nodrug_slice['nucleus'], vmin = low, vmax = top)
ax[2].imshow(nodrug_slice["your_fav_protein"], vmin = low, vmax = top)
# add title to the whole figure, note that is "suPtitle" not "subtitle"!
fig.suptitle('Before Drug', fontsize = 30)

# initialize another new figure, then imshow drug data in a subplot
fig, ax = plt.subplots(1, 3, figsize = (20, 5))
ax[0].imshow(drug_slice["actin"], vmin = low, vmax = top)
ax[1].imshow(drug_slice['nucleus'], vmin = low, vmax = top)
ax[2].imshow(drug_slice["your_fav_protein"], vmin = low, vmax = top)
# add title to the whole figure
fig.suptitle('After Drug', fontsize = 30)

# Challenge: How does drug treatment affect yfp intensity and nuclear localization? What measurements can answer these questions? What imperfections can get in the way?

Because we viewed the images with the SAME contrast settings, we can intuitively conclude that (1) the nuclear intensity of yfp decreases and (2) yfp moves out of the nucleus into the cytoplasm after drug treatment. To justify these conclusions, we need to quantify the yfp intensity in both the nucleus and the cytoplasm. First, however, we need to preprocess the images to remove the dead pixels and the high background fluorescence caused by uneven illumination.

## Step 4: Remove background noise in the image by applying filters

In [11]:
# let's zoom in on one of the images
# set arbitrary figure size in inches
plt.figure(figsize = (10, 10))
# imshow image
plt.imshow(nodrug_slice["actin"][200:400, 300:500], vmin = low, vmax = top)

There are at least two problems visible in the zoomed-in image:

    (1) Non-uniform illumination resulting in high background fluorescence
    (2) Black spots that most likely are caused by dead pixels in camera

To resolve these two problems, we will apply a preprocessing algorithm we learned previously, filtering, to remove this noise. With the filtered image, we can easily identify an intensity threshold and create a binary (i.e., black/white) mask for further image quantification.

### What is a filter and how does it work?

To filter an image, we need two things: (1) an image to be filtered and (2) a structuring element (a square, disk, or any other shape) of a certain size. For example, say we have an image of 100 x 100 pixels and a 3 x 3 square as the structuring element. To "filter" the image, we move the square across the whole image. At each position, we apply a mathematical operation to the pixels that fall under the square and replace the intensity of the pixel under the center of the square with the result. The kind of mathematical operation depends on the type of filter function you call.

There are two types of filter functions: linear and non-linear. The ones we commonly use are non-linear, including (1) the **median filter**, which is good for removing "salt and pepper" noise such as dead pixels from a camera, and (2) the **minimum filter** (typically used in rolling ball background subtraction), which is good for removing non-uniform illumination.

The size of the structuring element determines the value of the replaced pixel as well as how much the original image will be blurred by the filter. The bigger the structuring element, the more neighboring pixels are included when calculating the replaced value. A good rule of thumb is to choose the smallest filter that sufficiently flattens the visible noise in the background. Many of these operations do not have well-accepted statistical tests for determining the appropriate parameters, so take care to record your processing steps and to reproduce them with the same parameters.
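
To make the sliding-window idea concrete, here is a minimal NumPy sketch of a 3 x 3 median filter. The function name `median_filter_3x3` is made up for illustration, and repeating the edge values at the border is an assumption; library functions such as `scipy.ndimage.median_filter` offer several border modes and are what we use in the lesson.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_3x3(image):
    """Replace each pixel by the median of its 3 x 3 neighborhood."""
    # Pad by repeating the edge values so the output keeps the input shape
    padded = np.pad(image, 1, mode="edge")
    # windows[r, c] is the 3 x 3 neighborhood centered on pixel (r, c)
    windows = sliding_window_view(padded, (3, 3))
    # Take the median over the two window axes
    return np.median(windows, axis=(2, 3))

# A flat image of intensity 10 with one "dead" pixel of intensity 0
image = np.full((5, 5), 10.0)
image[2, 2] = 0.0

filtered = median_filter_3x3(image)
print("dead pixel before:", image[2, 2])
print("dead pixel after: ", filtered[2, 2])
```

The dead pixel is an outlier in every window that contains it, so the median ignores it. A mean filter would instead smear the low value over its neighbors.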

In [12]:
# load more useful libraries
# note: the scipy.ndimage.filters namespace is deprecated; import directly from scipy.ndimage
from scipy.ndimage import median_filter
from scipy.ndimage import minimum_filter
from skimage import filters

In [13]:
# we will work on both drug and no drug images, and in both the actin and nucleus channels
channels_of_interest = ['actin', 'nucleus']

# WARNING: Do NOT do anything to the yfp channel as you don't want to manipulate the real intensity data
# filtering is ONLY for creating a mask to identify ROIs!

# filter parameters
# dead pixels are small so a square with a size of 2 x 2 is good enough
median_filter_size = 2
# the filter size for a minimum filter should be set to at least the size of the largest object 
# that is not part of the background
min_filter_size = 101

# initialize empty dictionary to store processed images
clean_drug = {}
clean_nodrug = {}

# loop through both channels to filter image in each channel
for channel in channels_of_interest:
    
    # copy image so that raw is kept raw!
    original_drug = drug_slice[channel].copy()
    original_nodrug = nodrug_slice[channel].copy()
    
    # apply median filter to remove dead pixels
    filtered_drug = median_filter(original_drug, size = median_filter_size)
    filtered_nodrug = median_filter(original_nodrug, size = median_filter_size)
    
    # apply minimum filter to remove high background fluorescence (i.e., rolling ball background subtraction)
    # drug
    background_drug = minimum_filter(filtered_drug, size = min_filter_size)
    bgs_drug = filtered_drug - background_drug
    # no drug
    background_nodrug = minimum_filter(filtered_nodrug, size = min_filter_size)
    bgs_nodrug = filtered_nodrug - background_nodrug
    
    # store in dictionary
    clean_drug[channel] = bgs_drug
    clean_nodrug[channel] = bgs_nodrug

In [14]:
# get a feel for what the minimum filter is doing by visualizing it
# note: these variables hold the last channel processed in the loop above (nucleus)
fig, ax = plt.subplots(1, 3, figsize = (12, 6))
# view original image
ax[0].imshow(original_nodrug, vmin = low, vmax = top)
ax[0].set_title('Original', fontsize = 20)
# view image after applying a minimum filter, i.e., the background
ax[1].imshow(background_nodrug, vmin = low, vmax = top)
ax[1].set_title('Minimum filtered', fontsize = 20)
# view image after subtracting the background
ax[2].imshow(bgs_nodrug, vmin = low, vmax = top)
ax[2].set_title('Background subtracted', fontsize = 20)
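
The effect of minimum-filter background subtraction is easiest to see on synthetic data. The sketch below builds a made-up image with a left-to-right illumination gradient and one bright object, then subtracts a minimum-filtered copy. The helper `minimum_filter_square` is a hypothetical NumPy stand-in for `scipy.ndimage.minimum_filter`, and all sizes and intensities are arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def minimum_filter_square(image, size):
    """Minimum over a size x size neighborhood (size must be odd)."""
    half = size // 2
    padded = np.pad(image, half, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return windows.min(axis=(2, 3))

# Synthetic image: illumination gradient from 0 to 390 plus one bright object
rows, cols = 20, 40
image = np.tile(np.arange(cols, dtype=np.uint16) * 10, (rows, 1))
image[8:12, 18:22] += 500          # a 4 x 4 "cell"

# The window must be larger than the object, otherwise the object itself
# would be estimated as background and subtracted away
background = minimum_filter_square(image, size=9)

# Safe for unsigned integers: the local minimum never exceeds the pixel itself
subtracted = image - background

print("background range before:", image[0].min(), "to", image[0].max())
print("background range after: ", subtracted[0].min(), "to", subtracted[0].max())
print("object pixel after:     ", subtracted[9, 19])
```

The gradient is reduced to a small residual while the object keeps most of its intensity above the local background.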

In [15]:
# let's now view our original and processed images side by side
# set channel to view
# ch = "nucleus"
ch = "actin"
fig, ax = plt.subplots(1, 2, figsize = (12, 6))
ax[0].imshow(nodrug_slice[ch], vmin = low, vmax = top)
ax[0].set_title('Original', fontsize = 20)
ax[1].imshow(clean_nodrug[ch], vmin = low, vmax = top)
ax[1].set_title('Filtered', fontsize = 20)
fig.suptitle('Before Drug', fontsize = 30)

fig, ax = plt.subplots(1, 2, figsize=(12, 6))
ax[0].imshow(drug_slice[ch], vmin = low, vmax = top)
ax[0].set_title('Original', fontsize = 20)
ax[1].imshow(clean_drug[ch], vmin = low, vmax = top)
ax[1].set_title('Filtered', fontsize = 20)
fig.suptitle('After Drug', fontsize = 30)

In [16]:
# let's zoom in again to see whether we removed the dead pixels
fig, ax = plt.subplots(1, 2, figsize = (14, 7))
ax[0].imshow(nodrug_slice[ch][200:400, 300:500], vmin = low, vmax = top)
ax[0].set_title('Original', fontsize = 30)
ax[1].imshow(clean_nodrug[ch][200:400, 300:500], vmin = low, vmax = top)
ax[1].set_title('Filtered', fontsize = 30)

## Step 5: Create a binary mask using intensity threshold

With these filtered images, we can now figure out an intensity threshold to create a binary mask.

### Method 1: Manually figure out an intensity threshold

In [17]:
# let's use one of the filtered images as an example
sample = clean_nodrug['actin'].copy()
# collapse all pixels in this 2D image into a 1D array
sample_array = sample.flatten()
# plot a histogram
plt.figure(figsize = (12, 6))
# kde is an optional argument for displaying a fitted distribution curve, by default it's True (i.e., on)
# we want to turn it off (i.e., set to False) here
sns.distplot(sample_array, kde = False)
# change x limit to zoom in on the plot
plt.gca().set_xlim([0, 21000])
# label axes
plt.xlabel('Intensity per pixel', fontsize = 20)
plt.ylabel('Counts', fontsize = 20)

In [18]:
# manually determined intensity threshold
manual_thresh = 2500

# create a binary mask (black = 0; white = 1)
# here we are using a logical operator:
# anything bigger than the compared value will be true or set to 1, anything else will be false or set to 0
sample_manual_mask = sample > manual_thresh

# visualize the original image and mask side by side
fig, ax = plt.subplots(1, 2, figsize = (12, 6))
ax[0].imshow(sample, vmin = low, vmax = top)
# note that the mask is a binary image with only 0's and 1's, so no need to add vmin or vmax
ax[1].imshow(sample_manual_mask)
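
The comparison used to build the mask can be inspected on a tiny array. This is a minimal sketch with made-up intensities; `image` and `thresh` are illustrative names only.

```python
import numpy as np

image = np.array([[100, 3000, 2600],
                  [2400, 2500, 5000]], dtype=np.uint16)
thresh = 2500

# Element-wise comparison gives a boolean array of the same shape
mask = image > thresh
print(mask)

# True counts as 1, so summing the mask counts the foreground pixels
print("foreground pixels:", mask.sum())

# Indexing with the mask returns the selected intensities as a 1D array
print("selected intensities:", image[mask])
```

Note that the comparison is strict: a pixel whose intensity equals the threshold is not included in the mask.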

### Method 2: Automatically figure out an intensity threshold using built-in functions

As you can imagine, manually determining an intensity threshold for hundreds of images in a time-lapse or z-stack would be extremely inefficient! Fortunately, scikit-image provides built-in functions that determine a threshold automatically. Here, we will demonstrate one of these methods, called Otsu's method. To check out other thresholding methods, see: http://scikit-image.org/docs/dev/api/skimage.filters.html

In [19]:
# Use Otsu's method to determine the intensity threshold
otsu_thresh = filters.threshold_otsu(sample)
print("The Otsu's intensity threshold is: ", otsu_thresh)
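
For intuition about what Otsu's method computes, here is a minimal NumPy sketch: it tries every split of the intensity histogram and keeps the one that maximizes the variance between the two resulting classes. The function `otsu_threshold` and the synthetic intensities are made up for illustration; the library function may differ in details such as binning.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the histogram split that maximizes the between-class variance."""
    counts, bin_edges = np.histogram(image.ravel(), bins=nbins)
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2

    # Number of pixels at or below (bg) and at or above (fg) each bin
    weight_bg = np.cumsum(counts)
    weight_fg = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of each class for every candidate split
    mean_bg = np.cumsum(counts * centers) / weight_bg
    mean_fg = (np.cumsum((counts * centers)[::-1]) / weight_fg[::-1])[::-1]

    # Between-class variance for a split between bin i and bin i + 1
    variance = weight_bg[:-1] * weight_fg[1:] * (mean_bg[:-1] - mean_fg[1:]) ** 2
    return centers[np.argmax(variance)]

# Synthetic intensities: a dim background and a bright foreground population
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(100, 10, 5000), rng.normal(200, 10, 5000)])

thresh = otsu_threshold(pixels)
mask = pixels > thresh
print("threshold:", thresh)
print("fraction of pixels above threshold:", mask.mean())
```

When the two populations are well separated, the threshold falls in the gap between them. When they overlap heavily or differ greatly in size, the result can be less meaningful, which is one reason to compare several methods.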

In [20]:
# Does Otsu's threshold make sense? We can check it on the histogram
# plot histogram
plt.figure(figsize = (12, 6))
sns.distplot(sample_array, kde = False)
# overlay the Otsu's threshold
plt.axvline(otsu_thresh, ls = '--', lw = 2, c = 'r', label = 'Otsu threshold')
# overlay the manual threshold
plt.axvline(manual_thresh, ls = '--', lw = 2, c = 'g', label = 'Manual threshold')
# change x limit to zoom in on the plot
plt.gca().set_xlim([0, 21000])
# label axes
plt.xlabel('Intensity per pixel', fontsize = 20)
plt.ylabel('Counts', fontsize = 20)
# plot legend
plt.legend(fontsize = 30)

In [21]:
# create a binary mask using Otsu's threshold
sample_otsu_mask = sample > otsu_thresh

# visualize the original image, manual mask, and otsu mask side by side
fig, ax = plt.subplots(1, 3, figsize = (18, 6))
ax[0].imshow(sample, vmin = low, vmax = top)
ax[0].set_title('original', fontsize = 30)
ax[1].imshow(sample_manual_mask)
ax[1].set_title('manual', fontsize = 30)
ax[2].imshow(sample_otsu_mask)
ax[2].set_title('otsu', fontsize = 30)

In [22]:
# After trying out a few methods, Yen's thresholding method turns out to work better for this image
# Use Yen's method to determine the intensity threshold
yen_thresh = filters.threshold_yen(sample)
print("The Yen's intensity threshold is: ", yen_thresh)

# Check Yen's thresholding value on the histogram
# plot histogram
plt.figure(figsize = (12, 6))
sns.distplot(sample_array, kde = False)
# overlay the manual threshold
plt.axvline(manual_thresh, ls = '--', lw = 2, c = 'g', label = 'Manual threshold')
# overlay the Otsu's threshold
plt.axvline(otsu_thresh, ls = '--', lw = 2, c = 'r', label = 'Otsu threshold')
# overlay the Yen's threshold
plt.axvline(yen_thresh, ls = '--', lw = 2, c = 'b', label = 'Yen threshold')
# change x limit to zoom in on the plot
plt.gca().set_xlim([0, 21000])
# label axes
plt.xlabel('Intensity per pixel', fontsize = 20)
plt.ylabel('Counts', fontsize = 20)
# plot legend
plt.legend(fontsize = 30)

In [23]:
# create a binary mask using Yen's threshold
sample_yen_mask = sample > yen_thresh

# visualize the original image and mask side by side
fig, ax = plt.subplots(1, 3, figsize = (18, 6))
ax[0].imshow(sample, vmin = low, vmax = top)
ax[0].set_title('original', fontsize = 30)
ax[1].imshow(sample_otsu_mask)
ax[1].set_title('otsu', fontsize = 30)
ax[2].imshow(sample_yen_mask)
ax[2].set_title('yen', fontsize = 30)

### Apply built-in intensity thresholding methods to the whole image set

In [24]:
# initialize an empty dictionary to store masks for different channels
drug_masks = {}
nodrug_masks = {}

# loop through each channel
for channel in channels_of_interest:
    
    # copy filtered image
    filtered_drug = clean_drug[channel].copy()
    filtered_nodrug = clean_nodrug[channel].copy()
    
    # autothreshold using Yen's method
    yen_thresh_drug = filters.threshold_yen(filtered_drug)
    yen_thresh_nodrug = filters.threshold_yen(filtered_nodrug)
    
    # create a binary mask
    masked_drug = filtered_drug > yen_thresh_drug
    masked_nodrug = filtered_nodrug > yen_thresh_nodrug

    # fill in dict
    drug_masks[channel] = masked_drug
    nodrug_masks[channel] = masked_nodrug

In [25]:
# let's now view our processed images and masks side by side
# set channel to view
# ch = "nucleus"
ch = "actin"
fig, ax = plt.subplots(1, 2, figsize = (12, 6))
ax[0].imshow(clean_nodrug[ch], vmin = low, vmax = top)
ax[1].imshow(nodrug_masks[ch])
fig.suptitle('Before Drug', fontsize = 30)

fig, ax = plt.subplots(1, 2, figsize=(12, 6))
ax[0].imshow(clean_drug[ch], vmin = low, vmax = top)
ax[1].imshow(drug_masks[ch])
fig.suptitle('After Drug', fontsize = 30)

## Step 6: Optimize the mask using morphological operations

## Reminder:
**Hypothesis: Treatment with drug A will cause a decrease in the total amount of protein Y.** 

Our end goal is to quantify yfp before and after treatment in the nucleus vs. the cytoplasm separately. 
Questions: 

1) Does the *total* amount of yfp per cell change with drug treatment?

2) How does the localization of yfp change between the nucleus and the cytoplasm? 

**Morphological Operations and Why:**

Background - https://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/ImageProcessing-html/topic4.htm

Documentation: http://scikit-image.org/docs/dev/api/skimage.morphology.html

Morphological operations are used to clean up an image, minimize noise, and create well-fitted masks.

**What are we doing:** 
Let's try to isolate the cells with a cleaned-up mask

In [26]:
# import more useful libraries
import skimage.morphology as sm

In [27]:
# initialize empty dicts to store refined masks
refined_drug_masks = {}
refined_nodrug_masks = {}

# loop through each channel
for channel in channels_of_interest:
    
    # copy original mask
    drug_mask = drug_masks[channel].copy()
    nodrug_mask = nodrug_masks[channel].copy()
    
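    # closing fills small holes, opening then removes small specks,
    # and finally we drop any remaining objects smaller than 300 pixels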
    # drug
    drug_morph1 = sm.binary_closing(drug_mask,sm.disk(3))
    drug_morph2 = sm.binary_opening(drug_morph1,sm.disk(3))
    drug_morph3 = sm.remove_small_objects(drug_morph2, 300)
    # no drug
    nodrug_morph1 = sm.binary_closing(nodrug_mask,sm.disk(3))
    nodrug_morph2 = sm.binary_opening(nodrug_morph1,sm.disk(3))
    nodrug_morph3 = sm.remove_small_objects(nodrug_morph2, 300)
    
    # fill in dicts
    refined_drug_masks[channel] = drug_morph3
    refined_nodrug_masks[channel] = nodrug_morph3
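
To see what closing and opening do to a mask, here is a minimal NumPy sketch built from the two basic operations, dilation and erosion. It uses a 3 x 3 square instead of the disk used above and treats everything outside the image as background; the helper names are made up, and in practice we rely on `skimage.morphology`.

```python
import numpy as np

def dilate(mask):
    """A pixel turns on if any pixel in its 3 x 3 neighborhood is on."""
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

def erode(mask):
    """A pixel stays on only if every pixel in its 3 x 3 neighborhood is on."""
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.ones_like(mask)
    for dr in range(3):
        for dc in range(3):
            out &= padded[dr:dr + rows, dc:dc + cols]
    return out

def closing(mask):
    return erode(dilate(mask))    # fills small holes and gaps

def opening(mask):
    return dilate(erode(mask))    # removes small isolated specks

# Toy mask: a 5 x 5 object with a one-pixel hole, plus a one-pixel speck
mask = np.zeros((11, 11), dtype=bool)
mask[4:9, 4:9] = True
mask[6, 6] = False                # hole inside the object
mask[1, 1] = True                 # speck in the background

closed = closing(mask)
refined = opening(closed)
print("hole filled after closing:  ", closed[6, 6])
print("speck removed after opening:", not refined[1, 1])
print("object pixels in the end:   ", refined.sum())
```

Closing first and opening second repairs the object before small debris is removed. The size of the structuring element sets the largest hole or speck that gets cleaned up.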

In [28]:
# let's now view our processed images, original masks, and improved masks side by side
# set channel to view
# ch = "nucleus"
ch = "actin"
fig, ax = plt.subplots(1, 3, figsize = (18, 6))
ax[0].imshow(clean_nodrug[ch], vmin = low, vmax = top)
ax[0].set_title('original image', fontsize = 20)
ax[1].imshow(nodrug_masks[ch])
ax[1].set_title('original mask', fontsize = 20)
ax[2].imshow(refined_nodrug_masks[ch])
ax[2].set_title('improved mask', fontsize = 20)
fig.suptitle('Before Drug', fontsize = 30)

fig, ax = plt.subplots(1, 3, figsize=(18, 6))
ax[0].imshow(clean_drug[ch], vmin = low, vmax = top)
ax[0].set_title('original image', fontsize = 20)
ax[1].imshow(drug_masks[ch])
ax[1].set_title('original mask', fontsize = 20)
ax[2].imshow(refined_drug_masks[ch])
ax[2].set_title('improved mask', fontsize = 20)
fig.suptitle('After Drug', fontsize = 30)

## Step 7: Quantify intensity of yfp both in nucleus and cytoplasm

### What are we doing:
1. Isolate nucleus 
2. Average signal intensity in cytoplasm vs. nucleus.

In [29]:
# Add another key in the mask dict for a "cell body" mask that represents the cytoplasm
# "& ~" keeps the pixels that are in the actin mask but NOT in the nucleus mask
# (note: "^" is exclusive or, which would also keep nucleus pixels lying outside the actin mask)
refined_drug_masks['cell_body'] = refined_drug_masks['actin'] & ~refined_drug_masks['nucleus']
refined_nodrug_masks['cell_body'] = refined_nodrug_masks['actin'] & ~refined_nodrug_masks['nucleus']
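
Two boolean operations are commonly used to subtract one mask from another: exclusive or (`^`) and and-not (`& ~`). They give the same result only when the second mask lies entirely inside the first. The toy 1D masks below are made up to show the difference.

```python
import numpy as np

# The nucleus mask pokes one pixel outside the actin mask (index 5)
actin   = np.array([0, 1, 1, 1, 1, 0, 0], dtype=bool)
nucleus = np.array([0, 0, 0, 1, 1, 1, 0], dtype=bool)

xor_result = actin ^ nucleus      # in exactly one of the two masks
difference = actin & ~nucleus     # in actin but not in nucleus

print("actin:     ", actin.astype(int))
print("nucleus:   ", nucleus.astype(int))
print("actin ^ nucleus: ", xor_result.astype(int))
print("actin & ~nucleus:", difference.astype(int))
```

The exclusive or keeps the stray nucleus pixel at index 5, while and-not does not. For a cytoplasm mask, and-not is therefore the safer choice whenever the nucleus mask might extend beyond the cell mask.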

In [30]:
# let's visualize all of the channels in the masks
fig, ax = plt.subplots(1, 3, figsize = (18, 6))
ax[0].imshow(refined_nodrug_masks['nucleus'])
ax[0].set_title('nucleus', fontsize = 20)
ax[1].imshow(refined_nodrug_masks['actin'])
ax[1].set_title('actin', fontsize = 20)
ax[2].imshow(refined_nodrug_masks['cell_body'])
ax[2].set_title('cell body', fontsize = 20)
fig.suptitle('Before Drug', fontsize = 30)

fig, ax = plt.subplots(1, 3, figsize = (18, 6))
ax[0].imshow(refined_drug_masks['nucleus'])
ax[0].set_title('nucleus', fontsize = 20)
ax[1].imshow(refined_drug_masks['actin'])
ax[1].set_title('actin', fontsize = 20)
ax[2].imshow(refined_drug_masks['cell_body'])
ax[2].set_title('cell body', fontsize = 20)
fig.suptitle('After Drug', fontsize = 30)

In [31]:
# let's calculate the mean nuclear and cytoplasmic intensities of yfp. For this, we'll apply our masks to the image of interest.
yfp_drug = drug_slice['your_fav_protein']
yfp_nodrug = nodrug_slice['your_fav_protein']

# extract nucleus and cell body masks
nucleus_drug = refined_drug_masks['nucleus'].copy()
cell_body_drug = refined_drug_masks['cell_body'].copy()
cell_whole_drug = refined_drug_masks['actin'].copy()

nucleus_nodrug = refined_nodrug_masks['nucleus'].copy()
cell_body_nodrug = refined_nodrug_masks['cell_body'].copy()
cell_whole_nodrug = refined_nodrug_masks['actin'].copy()

# get nuclear intensities
# drug
nuclear_intensities_drug = yfp_drug.copy()
nuclear_intensities_drug[~nucleus_drug] = 0
# no drug
nuclear_intensities_nodrug = yfp_nodrug.copy()
nuclear_intensities_nodrug[~nucleus_nodrug] = 0

# get cytoplasmic intensities
# drug
cytoplasmic_intensities_drug = yfp_drug.copy()
cytoplasmic_intensities_drug[~cell_body_drug] = 0
# no drug
cytoplasmic_intensities_nodrug = yfp_nodrug.copy()
cytoplasmic_intensities_nodrug[~cell_body_nodrug] = 0

# get whole cell intensities
# drug
cell_intensities_drug = yfp_drug.copy()
cell_intensities_drug[~cell_whole_drug] = 0
# no drug
cell_intensities_nodrug = yfp_nodrug.copy()
cell_intensities_nodrug[~cell_whole_nodrug] = 0
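
Our two questions call for two different measurements: the summed intensity inside a mask is a proxy for the total amount of protein, while the ratio of mean nuclear to mean cytoplasmic intensity describes localization. The sketch below illustrates both on a tiny synthetic image; all values, and the helper `summarize`, are made up for illustration.

```python
import numpy as np

# Synthetic 4 x 4 intensity image (made-up values)
yfp = np.array([[0,  0,  0, 0],
                [0, 10, 10, 0],
                [0, 10, 50, 0],
                [0,  0,  0, 0]], dtype=np.uint16)

# Stand-ins for the whole-cell, nucleus, and cytoplasm masks
cell = yfp > 0
nucleus = np.zeros_like(cell)
nucleus[2, 2] = True
cytoplasm = cell & ~nucleus

def summarize(image, mask):
    """Mean and summed intensity of the pixels selected by a boolean mask."""
    values = image[mask]
    return values.mean(), values.sum()

nuc_mean, nuc_total = summarize(yfp, nucleus)
cyto_mean, cyto_total = summarize(yfp, cytoplasm)
cell_mean, cell_total = summarize(yfp, cell)

print("total intensity in the cell:", cell_total)
print("nuclear / cytoplasmic mean ratio:", nuc_mean / cyto_mean)
```

A ratio above 1 indicates enrichment in the nucleus. Comparing this ratio and the total intensity before and after treatment separates a change in localization from a change in amount.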

In [32]:
# visualize the intensities
fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(nuclear_intensities_nodrug)
ax[0].set_title('nucleus', fontsize = 20)
ax[1].imshow(cytoplasmic_intensities_nodrug)
ax[1].set_title('cytoplasm', fontsize = 20)
ax[2].imshow(cell_intensities_nodrug)
ax[2].set_title('cell', fontsize = 20)
plt.suptitle('Before Drug', fontsize = 30)

fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(nuclear_intensities_drug)
ax[0].set_title('nucleus', fontsize = 20)
ax[1].imshow(cytoplasmic_intensities_drug)
ax[1].set_title('cytoplasm', fontsize = 20)
ax[2].imshow(cell_intensities_drug)
ax[2].set_title('cell', fontsize = 20)
plt.suptitle('After Drug', fontsize = 30)

In [33]:
# We can now easily calculate the mean nuclear and cytoplasmic intensities before and after drug treatment.

# select the yfp pixels inside each mask (boolean indexing returns a 1D array)
# indexing with the mask keeps pixels whose true intensity is 0, unlike filtering on "> 0"
nuclear_array_nodrug = yfp_nodrug[nucleus_nodrug]
nuclear_array_drug = yfp_drug[nucleus_drug]
cytoplasmic_array_nodrug = yfp_nodrug[cell_body_nodrug]
cytoplasmic_array_drug = yfp_drug[cell_body_drug]
cell_array_nodrug = yfp_nodrug[cell_whole_nodrug]
cell_array_drug = yfp_drug[cell_whole_drug]

# plot histogram before drug
plt.figure(figsize = (12, 6))
sns.distplot(nuclear_array_nodrug, kde=True, label='nuclear')
sns.distplot(cytoplasmic_array_nodrug, kde=True, label='cytoplasmic')
# sns.distplot(cell_array_nodrug, kde=True, label='cell')

plt.legend(fontsize = 20)
plt.title('Before Drug', fontsize = 30)

# plot histogram after drug
plt.figure(figsize = (12, 6))
sns.distplot(nuclear_array_drug, kde=True, label='nuclear')
sns.distplot(cytoplasmic_array_drug, kde=True, label='cytoplasmic')
# sns.distplot(cell_array_drug, kde=True, label='cell')

plt.legend(fontsize = 20)
plt.title('After Drug', fontsize = 30)

print("The average nuclear intensity before drug treatment is: {0:.2f} AU.".format(np.mean(nuclear_array_nodrug)))
print("The average cytoplasmic intensity before drug treatment is: {0:.2f} AU.".format(np.mean(cytoplasmic_array_nodrug)))
print("The average cell intensity before drug treatment is: {0:.2f} AU.".format(np.mean(cell_array_nodrug)))

print("The average nuclear intensity after drug treatment is: {0:.2f} AU.".format(np.mean(nuclear_array_drug)))
print("The average cytoplasmic intensity after drug treatment is: {0:.2f} AU.".format(np.mean(cytoplasmic_array_drug)))
print("The average cell intensity after drug treatment is: {0:.2f} AU.".format(np.mean(cell_array_drug)))

# Challenge: Is there a change in the intensities after drug treatment? Was our hypothesis about the drug true or false? Recall our hypothesis: Treatment with drug A will cause a decrease in the total amount of protein Y.

Interestingly, the nuclear intensities before drug treatment follow a bimodal distribution. Caution: the mean of a bimodal distribution does not correspond to either of its two peaks, so the corresponding mean intensity printed by the previous cell should be interpreted with care!
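
The caution about bimodal distributions can be illustrated with synthetic numbers. The sketch below mixes two made-up populations of pixel intensities and shows that their overall mean lands in a range where no pixel actually lies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two made-up populations of pixel intensities with well-separated peaks
dim = rng.normal(loc=1000, scale=50, size=5000)
bright = rng.normal(loc=3000, scale=50, size=5000)
mixed = np.concatenate([dim, bright])

overall_mean = mixed.mean()
# Fraction of pixels within 500 intensity units of the overall mean
near_mean = np.mean(np.abs(mixed - overall_mean) < 500)

print("mean of the dim population:   ", dim.mean())
print("mean of the bright population:", bright.mean())
print("overall mean:                 ", overall_mean)
print("fraction of pixels near the overall mean:", near_mean)
```

In such a case it is more informative to look at the histogram, or to report the two populations separately, than to rely on a single mean.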

Both the mean nuclear intensity and the mean cytoplasmic intensity of the protein decreased after drug treatment, which is consistent with our hypothesis.