# Bioimage Handling in Python for Beginners
*Author: Vladislav Kim*
* [Introduction](#intro)
* [Bioimage formats and loading](#load)
* [Viewing Images in Jupyter](#view)
* [Handling multichannel images with $z$-stack](#compleximages)
* [Applications to high-content screening](#hcs)
* [Writing images](#writeimg)


<a id="intro"></a> 
## Introduction
Before applying segmentation or a machine learning model to an imaging data set, we often need preprocessing steps that transform the image and adjust parameters such as brightness, contrast and noise. We may also want to combine or split color channels, or apply filters that enhance or suppress certain image features.


In this notebook we show how to load a microscopy image in Python using `python-bioformats`. We load images as `numpy.array` objects and show how these arrays can be preprocessed using transformations available in the `scikit-image` library before downstream analysis, such as segmentation, is run.


The first step before running this notebook is to set up a conda environment with all the dependencies (see README for instructions). Once the environment is set up, activate it and start the Jupyter server in that environment.


In [None]:
# load third-party Python modules
import javabridge
import bioformats as bf
import skimage
import numpy as np
import matplotlib.pyplot as plt

import sys
sys.path.append('..')

javabridge.start_vm(class_path=bf.JARS)

<a id="load"></a> 
## Bioimage formats and data loading
The Open Microscopy Environment (OME), a consortium of research institutes and universities, supports more than 150 bioimage formats. Their Java library `bioformats` is available in Python through the `python-bioformats` package. The supported formats include common ones such as TIFF, JPG and PNG, as well as proprietary formats such as Zeiss CZI, Leica LCF, Canon DNG, etc.


We provide an example of an image stack (download here [insert link] and unzip in the same directory as Jupyter notebook). The first step is to load microscopy images. The images that we will be working with are in TIFF format.

To read in a basic TIFF file, we initialize a bioformats reader object and provide the file name. The local module `transform.basic` wraps this step in the function `read_tiff`, which reads in the image and returns a `numpy.array` object:

In [None]:
from transform.basic import read_tiff
img_ho = read_tiff(fname='data/CLL-coculture/r01c02f01-Hoechst.tiff')

Two-dimensional images can be represented as 2D numerical arrays (`np.array`) or matrices:

In [None]:
print(type(img_ho))
print(img_ho.shape)
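
As a small illustration of this representation, the sketch below builds a synthetic 2D image and shows that pixels are addressed as `[row, column]`. It does not use the Hoechst image loaded above. The name `toy_img`, the pixel values and the `uint16` data type are made up for this example.

```python
import numpy as np

# synthetic "image" with 4 rows (y) and 6 columns (x); uint16 is an assumption
toy_img = np.zeros((4, 6), dtype=np.uint16)
toy_img[1, 2] = 1000       # a single bright pixel at row 1, column 2
toy_img[2:4, 4:6] = 500    # a 2x2 patch in the lower-right corner

print(toy_img.shape)       # (rows, columns)
print(toy_img.dtype)
print(toy_img.max(), toy_img.sum())
```

The same indexing and slicing applies to `img_ho`, since it is also a `numpy.array`.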

<a id="view"></a> 
## Viewing Images in Jupyter
We can plot the image arrays using `matplotlib` as grey-scale images:

In [None]:
plt.figure(figsize=(7,7))
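# without an explicit colormap, imshow renders 2D arrays with the default (non-grey) colormap
plt.imshow(img_ho, cmap='gray')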
plt.axis('off')

If a microscopy image has several channels, these can be plotted individually as grey-scale images side by side. Load another channel of the same image:

In [None]:
img_ly = read_tiff(fname='data/CLL-coculture/r01c02f01-Ly.tiff')

Side by side view:

In [None]:
from visualize.plot_static import plot_channels
plot_channels([img_ho, img_ly], titles=['Nuclei', 'Lysosomes'], nrow=1, ncol=2)

Or combined as an RGB-overlaid image:

In [None]:
from visualize.plot_static import combine_channels
# here we use gamma correction for 'img_ho'
img_overlay = combine_channels([img_ho**0.5, img_ly],
                               colors=['blue', 'white'],
                               blend = [1.5, 0.7])

In [None]:
plt.figure(figsize=(10,10))
plt.imshow(img_overlay)
plt.axis('off')
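
The local `combine_channels` function is not shown in this notebook, so the following is only a rough sketch of what gamma correction and a color overlay can look like in plain NumPy. The helpers `rescale01` and `overlay_sketch`, their arguments and the color table `COLORS` are hypothetical and need not match the behaviour of `combine_channels`.

```python
import numpy as np

# hypothetical color table: RGB weights for each named color
COLORS = {'blue': (0.0, 0.0, 1.0), 'white': (1.0, 1.0, 1.0)}

def rescale01(img):
    """Linearly rescale an image to the range [0, 1]."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)

def overlay_sketch(channels, colors, blend, gamma=None):
    """Blend grey-scale channels into one RGB image (illustrative only)."""
    gamma = gamma if gamma is not None else [1.0] * len(channels)
    rgb = np.zeros(channels[0].shape + (3,))
    for img, color, weight, g in zip(channels, colors, blend, gamma):
        corrected = rescale01(img) ** g   # gamma < 1 brightens dim pixels
        rgb += weight * corrected[..., None] * np.array(COLORS[color])
    return np.clip(rgb, 0.0, 1.0)         # keep floats in [0, 1] for display

# random arrays stand in for two image channels
rng = np.random.default_rng(0)
ch_a = rng.random((8, 8))
ch_b = rng.random((8, 8))
rgb = overlay_sketch([ch_a, ch_b], colors=['blue', 'white'],
                     blend=[1.5, 0.7], gamma=[0.5, 1.0])
print(rgb.shape, rgb.min(), rgb.max())
```

Rescaling to [0, 1] before raising to a power keeps the result in the same range, which is why gamma correction such as `img_ho**0.5` is usually applied to normalized intensities.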

<a id="compleximages"></a> 
## Handling multichannel images with $z$-stack
In addition to color information, microscopy images may have optical sections along the $z$-axis. Handling 3D multichannel data is straightforward in Python, as such images can be represented as a four-dimensional (3D + color) `np.array`. We can load one such image using the `load_imgstack` function:

In [None]:
from transform.basic import load_imgstack
imgstack = load_imgstack(fname="data/BiTE/Tag2-r04c02f1.tiff")

The convention is that the first dimension is reserved for optical sections ($z$-stack), the next two dimensions describe image coordinates ($xy$-plane) and the last dimension is for channel information. 

First we can use maximum intensity projection (MIP) to aggregate the images along the $z$-direction and make them two-dimensional (+ color):

In [None]:
mip = np.amax(imgstack, axis=0)
print(mip.shape)
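
To make the effect of the projection concrete, here is a minimal sketch on a synthetic stack that follows the axis convention described above (sections, image plane, channels). The array `stack` and its values are made up for illustration.

```python
import numpy as np

# synthetic stack: 3 optical sections, 2x2 pixels, 1 channel -> (z, y, x, c)
stack = np.zeros((3, 2, 2, 1))
stack[0, 0, 0, 0] = 5.0   # brightest value of pixel (0, 0) lies in section 0
stack[2, 1, 1, 0] = 9.0   # brightest value of pixel (1, 1) lies in section 2

# take the maximum over the z-axis; the remaining axes are (y, x, c)
mip_toy = np.amax(stack, axis=0)
print(mip_toy.shape)
print(mip_toy[..., 0])
```

Each pixel of the projection holds the largest intensity found at that position across all optical sections, regardless of which section it came from.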

As mentioned before, channels are stored in the last array axis (dimension). We can split the color channels and plot them side by side:

In [None]:
# split individual color channels and place them in a list
mip_split = [mip[:,:,i] for i in range(mip.shape[2])]
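
An alternative to the list comprehension is to move the channel axis to the front and iterate over it. The sketch below checks on a made-up array (`toy_mip`) that both approaches give the same result.

```python
import numpy as np

# made-up projection with 2x3 pixels and 4 channels -> (y, x, channel)
toy_mip = np.arange(2 * 3 * 4).reshape((2, 3, 4))

# list comprehension over the last axis
split_a = [toy_mip[:, :, i] for i in range(toy_mip.shape[2])]

# equivalent: move the channel axis to the front, then iterate over it
split_b = list(np.moveaxis(toy_mip, -1, 0))

print(len(split_a), split_a[0].shape)
print(all(np.array_equal(a, b) for a, b in zip(split_a, split_b)))
```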

In [None]:
plot_channels(mip_split,
              nrow=1, ncol=4,
              titles=['CD20+', 'Calcein',
                      'Nuclei', 'CD8+'])

In [None]:
mip_color = combine_channels(mip_split, 
                             colors=['red', 'green',
                                     'blue','orange'],
                             # these are optional (see documentation)
                             blend = [0.8, 0.8, 2, 0.8],
                             gamma = [0.3, 0.3, 0.4, 0.3])

In [None]:
plt.figure(figsize=(10,10))
plt.imshow(mip_color)
plt.axis('off')

*You can skip 'Applications to High-Content Screening' upon first reading*
<a id="hcs"></a> 
## Applications to High-Content Screening
Some microscopes output a series of images instead of a single image stack. We can use the function `transform.basic.load_image_series` to load all the color channels and the $z$-stack into a single `numpy.array`.


Here we will load a series of images from a high-content screen. Wells of a 384-well plate are numbered by row and column (r01 = row 1, c16 = column 16). We would like to load a single well that has:
+ 3 color channels
+ 7 optical sections ($z$-stack)
+ 3 fields of view (sampled positions in the $xy$-plane at which the well was imaged)

Suppose we want to load well 'r01c02' and only the first field of view ('f01'):

In [None]:
# list files
import os
files = os.listdir('data/AML_screen')
print(files[:5])

In [None]:
# well r01c02, position 1 (f01)
import re
wellfiles = [re.search('r01c02f01.+', f).group() for f in files
                if re.search('r01c02f01', f)]
# sort them lexicographically
wellfiles.sort()

print(wellfiles[:5])
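
The same selection can be written with capture groups, which makes the row, column and field explicit. This is only a sketch: the file names in `files_toy` are made up and the real naming scheme in `data/AML_screen` may differ, and the helper `in_well` is hypothetical.

```python
import re

# made-up file names; the real naming scheme may differ
files_toy = ['r01c02f02p01-ch1.tiff', 'r01c02f01p02-ch1.tiff',
             'r01c02f01p01-ch2.tiff', 'r01c02f01p01-ch1.tiff',
             'r01c03f01p01-ch1.tiff']

# row, column and field as two-digit numbers after 'r', 'c' and 'f'
pattern = re.compile(r'r(\d{2})c(\d{2})f(\d{2})')

def in_well(fname, row, col, field):
    """Return True if the file name belongs to the given well and field."""
    m = pattern.search(fname)
    return m is not None and tuple(map(int, m.groups())) == (row, col, field)

# keep well r01c02, field 1, and sort the names lexicographically
selected = sorted(f for f in files_toy if in_well(f, row=1, col=2, field=1))
print(selected)
```

Sorting matters because the order of the file names determines the order of the planes in the loaded array.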

In [None]:
from transform.basic import load_image_series
imgseries = load_image_series(path='data/AML_screen', imgfiles=wellfiles)

In [None]:
# reshape to (z, channel, y, x); -1 lets NumPy infer the number of optical sections
imgseries = imgseries.reshape((-1, 3, 2160, 2160))
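
The reshape step relies on the order in which the planes were loaded. The toy example below assumes that the sorted file list runs through all channels of one optical section before moving on to the next section. That ordering is an assumption and should be checked against the actual file names. The sizes are made up and much smaller than those of the screen.

```python
import numpy as np

n_z, n_c, height, width = 2, 3, 4, 4   # toy sizes, not those of the screen

# each plane is filled with a code for (section, channel): 10 * z + c
planes = [np.full((height, width), 10 * z + c)
          for z in range(n_z) for c in range(n_c)]
series = np.stack(planes)              # shape (n_z * n_c, height, width)

# -1 lets NumPy infer the number of optical sections
series4d = series.reshape((-1, n_c, height, width))
print(series4d.shape)
print(series4d[1, 2, 0, 0])            # plane of section 1, channel 2

# projecting along axis 0 then leaves one 2D image per channel
mip_toy_series = np.amax(series4d, axis=0)
print(mip_toy_series.shape)
```

If the files were ordered by channel first, the reshape would still run without error but would mix sections and channels, so it is worth verifying a few planes by eye.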

In [None]:
mipseries = np.amax(imgseries, axis=0)

In [None]:
# plot 3 channels side by side
plot_channels([mipseries[i] for i in range(3)], nrow=1, ncol=3)

In [None]:
rgbseries = combine_channels([mipseries[i] for i in range(3)],
                            colors=['blue', 'red', 'green'],
                            blend=[1.5,1.5,2],
                            gamma=[0.6, 0.6,0.6])

In [None]:
plt.figure(figsize=(10,10))
plt.imshow(rgbseries)
plt.axis('off')

<a id="writeimg"></a> 
## Writing images

In [None]:
javabridge.kill_vm()