## Analyse bacteria colony sizes

**You can solve this exercise in pairs and/or discuss with your colleagues as much as you like**

The directory ```data/bacteria/``` contains "microscope images" from experiments done with three different bacteria, b1, b2 and b3, each treated with three different drugs, d1, d2 and d3, plus a control group for each bacterium.
The images are named according to bacteria and drug.
We will process and analyse the images to see how the drugs have affected the size of the colonies the bacteria form. We will measure size as the area of each colony.

Then we will create a data table with all the collected data.

Look at the images and see if you can figure out what preprocessing needs to be done before labeling the colonies.

Basic algorithmic layout:

- import needed libraries (start with the ones you know and add more as you go along)

- use a for loop to loop over the needed files in the directory and do the following for each file
    (you will probably need the index as well):

- open the tiff image and fetch the user defined metadata and read the image data.

- preprocess
- apply filters

- remove items touching border

- connected component analysis (labeling)

- print how many colonies are found
- plot the image

- Store the loaded images, the labeled images and the metadata in lists for later use!
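
The core of the outline above (threshold, remove border objects, label) can be tried on a small synthetic array before touching the real data. The following is only a minimal sketch: the array `toy` and its three squares are made up for illustration and are not part of the exercise data.

```python
import numpy as np
from skimage import measure
from skimage.filters import threshold_otsu
from skimage.segmentation import clear_border

# Synthetic 20x20 "image" (an assumption for illustration):
# dark background with three bright squares.
toy = np.zeros((20, 20), dtype=float)
toy[0:4, 0:4] = 1.0      # touches the image border
toy[6:10, 6:10] = 1.0    # interior square, 4 x 4 pixels
toy[13:18, 12:17] = 1.0  # interior square, 5 x 5 pixels

# Otsu's method picks a threshold between the two intensity levels.
binary = toy > threshold_otsu(toy)

# Drop objects that are connected to the image border.
interior = clear_border(binary)

# Give every remaining connected component its own integer label.
toy_labels = measure.label(interior, background=0)
toy_regions = measure.regionprops(toy_labels)

print("Number of colonies detected:", len(toy_regions))
print("Areas:", sorted(int(r.area) for r in toy_regions))
```

The square in the corner is discarded by `clear_border`, so two regions with areas of 16 and 25 pixels remain. Real images will additionally need the noise and background handling listed in the outline.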

In [None]:

import matplotlib.pyplot as plt
from skimage import color,measure
from skimage.filters import gaussian, median, threshold_otsu
from skimage.segmentation import clear_border
from skimage.morphology import area_opening, disk
import glob
import tifffile


In [None]:

#image directory
img_dir = "../data/bacteria/"  
labeled = []
images = []
metadata = []


In [None]:
#write the rest of your code here


In [None]:

for i, img_name in enumerate(sorted(glob.glob(img_dir + "*.tiff"))):

    # 1. Load the image and its user-defined metadata
    with tifffile.TiffFile(img_name) as tif:
        page = tif.pages[0]
        desc = page.tags["ImageDescription"].value
        img = tif.asarray()

    metadata.append(desc)
    images.append(img)
    print(f"Image {i}: {desc}")

    # 2. Preprocess: convert to grayscale
    gray = color.rgb2gray(img)

    # use median filter to remove salt & pepper noise
    median_bg = median(gray, disk(4))

    # use gaussian filter to estimate uneven background illumination
    gaussian_bg = gaussian(gray, sigma=200, preserve_range=True)

    # subtract the estimated background from the median-filtered image
    gray = median_bg - gaussian_bg

    # 3. Threshold the image to separate colonies from background
    thresh = threshold_otsu(gray)
    binary = gray > thresh   # invert if colonies appear dark

    # remove segmentations that touch the border of the image
    binary = clear_border(binary)

    # remove small objects
    cleaned = area_opening(binary, area_threshold=70)

    # 4. Label connected regions (colonies). You might want to set the background parameter. Look at the documentation.
    labeled_img = measure.label(cleaned, background=0)
    labeled.append(labeled_img)
    regions = measure.regionprops(labeled_img)

    # 5. Print the number of colonies found
    print("Number of colonies detected:", len(regions))

    # 6. Visualize results
    fig, ax = plt.subplots(1, 2, figsize=(14, 7))
    ax[0].imshow(cleaned, cmap="gray")
    ax[0].set_title("Cleaned Binary Mask")
    ax[1].imshow(labeled_img, cmap="nipy_spectral")
    ax[1].set_title("Detected Colonies")
    plt.show()


Now, let's collect the data and store it in a data frame for statistical analysis later.

Create tables of regions of all the loaded images and labeled images by using ```skimage.measure.regionprops_table(...)```

Create pandas data frames from each table by adding the bacteria, the drug and the area for each colony.
Bacteria and drug are stored in the metadata.

Use the json package to read out bacteria and drug for each image/metadata.

Concatenate all frames to a single frame and save it as csv.
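
Reading the metadata with the json package could look roughly like the sketch below. The string `example_desc` is a hypothetical stand-in for the description stored in a TIFF file; only the keys `bacteria` and `drug` are taken from the exercise, the values are assumed.

```python
import json

# Hypothetical description string; in the exercise it comes from
# the "ImageDescription" tag of each TIFF file.
example_desc = '{"bacteria": "b1", "drug": "d2"}'

# json.loads turns the JSON text into a Python dictionary.
info = json.loads(example_desc)

print(info["bacteria"], info["drug"])
```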


Now we will gather the data we are interested in from the labeled images and store it in a table.
We will also add information to the tables to identify the bacteria and drug used.

Algorithm outline:

- import the needed libraries as before

- create a dataframe to store all other frames

- loop over all the labeled images (the list you created)

- create the properties table using ```skimage.measure.regionprops_table(labeled image, original image, properties=['area','feret_diameter_max'])```

- create the dataframe from the properties table

- read the bacteria and drug metadata from the list you filled before

- add the columns 'bacteria' and 'drug' to the dataframe and set the respective data in it

- add the frame to the total frame

- after the for loop, save the total frame and print some information about the total frame
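
The table-building steps above can be sketched on a toy label image. This is a minimal illustration under stated assumptions: the array `lab` and the values "b1" and "d2" are made up, and only `label` and `area` are requested to keep the output small.

```python
import numpy as np
import pandas as pd
from skimage import measure

# Toy labeled image with two "colonies" (labels 1 and 2); assumed data.
lab = np.zeros((10, 10), dtype=int)
lab[1:3, 1:4] = 1   # 2 x 3 = 6 pixels
lab[5:9, 5:9] = 2   # 4 x 4 = 16 pixels

# regionprops_table returns a dict of columns, one row per region.
props = measure.regionprops_table(lab, properties=["label", "area"])
toy_frame = pd.DataFrame(props)

# Assigning a scalar fills the whole column with that value.
toy_frame["bacteria"] = "b1"   # hypothetical value
toy_frame["drug"] = "d2"       # hypothetical value

# Stack per-image frames into one; the same frame is used twice here
# only to show the concatenation.
total = pd.concat([toy_frame, toy_frame], ignore_index=True)
print(total)
```

With `ignore_index=True` the combined frame gets a fresh running index instead of repeating the row numbers of each per-image frame.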

In [None]:

import pandas as pd
import json

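# frame that collects the results of all images
frame_total = pd.DataFrame()
for i in range(len(labeled)):

    # measure area and maximum Feret diameter of every labeled colony
    props = measure.regionprops_table(labeled[i], images[i],
                                      properties=['area', 'feret_diameter_max'])

    frame = pd.DataFrame(props)

    # the metadata is a JSON string holding bacteria and drug
    info = json.loads(metadata[i])
    bacteria = info["bacteria"]
    drug = info["drug"]

    frame["bacteria"] = bacteria
    frame["drug"] = drug

    # append to the total frame; ignore_index avoids repeated row indices
    frame_total = pd.concat([frame_total, frame], ignore_index=True)

frame_total.to_csv('../data/bacteria_results_total.csv')
frame_total.describe()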
#frame_total.head(40)
