# NESM Python Workshop Part 3


## Overview

Goals:
We can't cover every use case in detail, so the main goals here are:
1. Demonstrate that you can do microscopy analysis completely from Python
2. Give a quick tour of some of the many open source libraries (to set you up for future success)
3. Go over some domain-specific examples
    - Reinforce the numpy we learned in part 1
    - Show the power of the open source ecosystem
    - The building blocks are all here, and numpy is the common language of these tools

- Microscopy Image IO
- Analysis Pipeline
    - Structure: two motivating examples for which we illustrate the entire pipeline of analysis
    - Napari
    - Many examples (see email suggestions)
    - Skimage
    - Scipy ndimage
    - Frame-wise operations with np.vectorize
    - Resource list


## IO

Not always a fully solved problem due to proprietary formats.

Libraries exist where people have put in the work to read all the possible types of images. 


- Rock solid base: https://github.com/cgohlke/tifffile
- https://github.com/tlambert03/ome-types
- The future (and maybe also the now?): https://allencellmodeling.github.io/aicsimageio/


Per https://ngff.openmicroscopy.org/latest/ [zarr](https://zarr.readthedocs.io/en/stable/) will be the basis of the next generation file format.

## [Napari](https://napari.org/)
    
    
The future of microscopy image visualization. 

Tools like Matplotlib will always have a place in the workflow, but Napari is a best-in-class image viewer that is also:

- GPU accelerated
- Developed by core developers who work in bioimaging
- Open source with a strong community of contributors
vs hyperslicer - 


https://www.youtube.com/watch?v=VXdFOcBCto4


Lets you work anywhere on the spectrum from pure GUI to pure script with no interactivity.

In [None]:
# Show a basic napari demo

## Image analysis tools

Python has a rich ecosystem of libraries:
- [scipy.ndimage](https://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html)
- [skimage](https://scikit-image.org/docs/dev/api/skimage.html)
- [sklearn](https://scikit-learn.org/stable/)
- [xarray](http://xarray.pydata.org/en/stable/)
- [pandas](https://pandas.pydata.org/)


Domain Specific tools:

- [hyperspy](https://hyperspy.org/hyperspy-doc/current/user_guide/intro.html)
- [microutil](https://github.com/Hekstra-Lab/microutil)

**Machine Learning:**
The two world class libraries are both primarily Python:

- https://pytorch.org/
- https://www.tensorflow.org/  (we'll see an example later)



In [None]:
import tifffile
CHO = tifffile.imread("data/Fluo-N3DH-CHO/01/*")

In [None]:
import matplotlib.pyplot as plt
import xarray as xr
from mpl_interactions import hyperslicer

plt.figure()
hyperslicer(CHO)

In [None]:
%matplotlib widget
import tifffile
import numpy as np
import matplotlib.pyplot as plt


In [None]:
particles_raw = tifffile.imread('data/Particle.tif')


In [None]:
plt.figure()
plt.imshow(particles_raw)

### Extracting Scale Bar

I zoomed in on the image, figured out which pixels make up the scale bar, and then used them to calibrate the pixel size.

In [None]:
# subset the array - I already looked and know that these values are good

arr = particles_raw[960:, 850:900]
plt.figure()
plt.imshow(arr)
plt.figure()
bar = arr[10,:]
plt.plot(bar,'o-')


In [None]:
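# Indices of the bright scale-bar pixels. Multiplying the indices by the mask
# would leave zeros for the dark pixels, so idxs.min() would always be 0.
idxs = np.flatnonzero(bar > 10)
scale_bar_length_pixels = idxs.max() - idxs.min() + 1  # inclusive pixel count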
scale_bar_length_micron = 10
pixels_per_micron = scale_bar_length_pixels / scale_bar_length_micron
microns_per_pixel = 1/pixels_per_micron
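
The same calibration can be checked on a synthetic intensity profile where the answer is known in advance. This is only a minimal sketch: the profile values, the threshold of 10, and the 10 micron bar length are made-up stand-ins for the real image, and the bar is counted inclusively from its first to its last bright pixel.

```python
import numpy as np

# Hypothetical 1D intensity profile across a scale bar:
# dark background (value 2) with a bright bar covering pixels 12 through 41.
profile = np.full(50, 2)
profile[12:42] = 200

# Indices of the pixels that are brighter than the (assumed) threshold
bright = np.flatnonzero(profile > 10)

# Inclusive pixel count between the first and last bright pixel
bar_length_px = bright.max() - bright.min() + 1

bar_length_um = 10  # assumed physical length printed next to the bar
px_per_um = bar_length_px / bar_length_um
um_per_px = 1 / px_per_um

print(bar_length_px, px_per_um, um_per_px)
```

Because the bar in this toy profile is 30 pixels wide by construction, it is easy to confirm that the indexing logic recovers the right length before trusting it on the real image.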

In [None]:
plt.figure()
plt.imshow(particles_raw[:960])
dat = particles_raw[:960]


## Example - SEM image

Goals:
Compute a histogram of bead areas and descriptive statistics for bead sizes.

To accomplish this we will string together existing tools with a bit of custom numpy.


In [None]:
import matplotlib.pyplot as plt
from skimage import data
from skimage.filters import threshold_otsu

## Thresholding


### Interactively


Sometimes it's nice to make a human judgement. This is easy to do using existing tools in the python ecosystem.  Here we use code taken nearly verbatim from an example on https://mpl-interactions.readthedocs.io/en/stable/examples/range-sliders.html#Using-a-RangeSlider-for-Scalar-arguments---Thresholding-an-Image


To make it easy to use we've also wrapped it up into a function that we can call easily on an array.

In [None]:
import mpl_interactions.ipyplot as iplt
from nesm_utils import interactive_threshold
controls, axes = interactive_threshold(dat, bins=np.arange(0,255))


In [None]:
controls.params

## Breakout Exercise

Interactively choosing thresholds does not scale to many images, so here we automate the process.

1. Make a plot comparing multiple thresholding methods (https://scikit-image.org/docs/dev/auto_examples/segmentation/plot_thresholding.html)
2. Segment the blobs - Use google!
3. Compute the area of all the blobs
     - `np.unique`
     - `ndi.sum_labels`
     - `skimage.measure.regionprops`
4. Remove any small objects using indexing and `np.unique(..., return_counts=True)`
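
As a rough illustration of step 3, the sketch below computes blob areas on a tiny hand-written label image in two of the suggested ways. The label array is hypothetical, and the two approaches should agree as long as label 0 is the background.

```python
import numpy as np
import scipy.ndimage as ndi

# Hypothetical label image: 0 is background, 1 and 2 are two blobs
labels = np.array([
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 2, 2],
])

# Option 1: count how often each label value occurs
ids, counts = np.unique(labels, return_counts=True)
blob_ids = ids[ids != 0]
areas_unique = counts[ids != 0]  # drop the background entry

# Option 2: sum an image of ones within each labeled region
areas_ndi = ndi.sum_labels(np.ones_like(labels), labels=labels, index=blob_ids)

print(blob_ids, areas_unique, areas_ndi)
```

Both results are in pixels; multiplying by the squared pixel size converts them to physical areas.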

In [None]:
from skimage.filters import try_all_threshold

fig, ax = try_all_threshold(dat, figsize=(10, 8), verbose=False)
plt.show()
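
To get a feel for what one of these methods does, here is a hedged numpy sketch of the idea behind Otsu's method: try every split of the histogram and keep the one that maximizes the between-class variance. The function name `otsu_threshold` and the synthetic bimodal data are assumptions made for illustration; in practice `skimage.filters.threshold_otsu` is the tool to use.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the bin center that maximizes the between-class variance."""
    counts, edges = np.histogram(image.ravel(), bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2

    # Number of pixels in the lower and upper class for every candidate split
    w0 = np.cumsum(counts)
    w1 = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of the lower and upper class for every candidate split
    m0 = np.cumsum(counts * centers) / w0
    m1 = (np.cumsum((counts * centers)[::-1]) / w1[::-1])[::-1]

    # Between-class variance when splitting between bin i and bin i + 1
    variance = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(variance)]

# Synthetic bimodal "image": 1000 dark pixels and 1000 bright pixels
rng = np.random.default_rng(0)
image = np.concatenate([rng.normal(50, 5, 1000), rng.normal(200, 5, 1000)])

t = otsu_threshold(image)
n_bright = int((image > t).sum())
print(t, n_bright)
```

On well-separated data like this, the threshold lands in the gap between the two populations and splits them almost exactly in half.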


In [None]:
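import scipy.ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

# Binarize with Otsu's threshold
threshold = threshold_otsu(dat)
thresholded = dat > threshold
plt.figure()
plt.imshow(thresholded)

# Distance of every foreground pixel to the nearest background pixel
plt.figure()
distance = ndi.distance_transform_edt(thresholded)
plt.imshow(distance)

# Use the local maxima of the distance map as watershed seeds
coords = peak_local_max(distance, min_distance=10)
mask = np.zeros(distance.shape, dtype=bool)
mask[tuple(coords.T)] = True
markers, _ = ndi.label(mask)
labels = watershed(-distance, markers, mask=thresholded)
plt.figure()
plt.imshow(labels)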
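
The distance-transform and watershed recipe is easier to follow on a synthetic image where the expected answer is obvious. The sketch below is loosely modeled on the scikit-image watershed example: two overlapping disks (made-up centers and radii) form a single connected blob that a plain threshold cannot separate.

```python
import numpy as np
import scipy.ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

# Two overlapping disks with hypothetical centers and radii
x, y = np.indices((80, 80))
disk1 = (x - 28) ** 2 + (y - 28) ** 2 < 16 ** 2
disk2 = (x - 44) ** 2 + (y - 52) ** 2 < 20 ** 2
image = disk1 | disk2

# A plain connected-component labeling sees only one object
_, n_connected = ndi.label(image)

# The distance map peaks near each disk center; those peaks seed the watershed
distance = ndi.distance_transform_edt(image)
coords = peak_local_max(distance, min_distance=10)
seeds = np.zeros(distance.shape, dtype=bool)
seeds[tuple(coords.T)] = True
markers, _ = ndi.label(seeds)
labels = watershed(-distance, markers, mask=image)

print(n_connected, labels.max())
```

The `min_distance` argument controls how close two seeds may be, so it is the main knob to turn when real beads are over-split or under-split.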

In [None]:
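from skimage.segmentation import relabel_sequential

def remove_small_objects(labels, min_count=10):
    """Zero out labels smaller than `min_count` pixels, then renumber the rest."""
    labels = labels.copy()  # leave the caller's array untouched
    ids, counts = np.unique(labels, return_counts=True)
    for small_id in ids[counts < min_count]:
        labels[labels == small_id] = 0
    return relabel_sequential(labels)[0]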
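
The loop over small labels can also be written with a single boolean index. The following sketch uses a hand-written label image and an arbitrary cutoff of 3 pixels, both hypothetical, and renumbers the surviving labels with plain numpy instead of `relabel_sequential`.

```python
import numpy as np

# Hypothetical label image: label 1 has 6 pixels, label 2 has 1, label 3 has 4
labels = np.array([
    [1, 1, 1, 0, 2],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 3, 3],
])

min_count = 3  # assumed size cutoff in pixels

ids, counts = np.unique(labels, return_counts=True)
small_ids = ids[counts < min_count]

# Boolean indexing: zero out every pixel that belongs to a small label
cleaned = labels.copy()
cleaned[np.isin(cleaned, small_ids)] = 0

# Renumber the survivors as 0, 1, 2, ... with no gaps.
# This keeps background at 0 because 0 is the smallest value present.
_, inverse = np.unique(cleaned, return_inverse=True)
relabeled = inverse.reshape(cleaned.shape)

print(relabeled)
```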

In [None]:
from skimage.morphology import remove_small_objects
from skimage.segmentation import relabel_sequential

In [None]:
labels = relabel_sequential(remove_small_objects(labels, 10))[0]

In [None]:
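ids, counts = np.unique(labels, return_counts=True)
counts[1:] * microns_per_pixel**2  # areas in square microns; label 0 (background) skipped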

In [None]:
ids, counts = np.unique(labels, return_counts=True)
fig, ax = plt.subplots()
# Convert pixel counts to areas in square microns; skip label 0 (background)
ax.hist(counts[1:] * microns_per_pixel**2, bins=np.arange(0,80,2))
ax.set_xlim([0,80])
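
Alongside the histogram, the descriptive statistics named in the goals can be computed directly with numpy. This is a sketch on made-up numbers: the five pixel counts and the calibration of 0.5 microns per pixel are hypothetical, so substitute `counts[1:]` and the measured `microns_per_pixel` for real data.

```python
import numpy as np

# Hypothetical pixel counts for five beads and an assumed calibration
pixel_counts = np.array([120, 135, 128, 300, 131])
microns_per_pixel = 0.5

# Area scales with the square of the pixel size
areas_um2 = pixel_counts * microns_per_pixel**2

stats = {
    "n": areas_um2.size,
    "mean": areas_um2.mean(),
    "median": np.median(areas_um2),
    "std": areas_um2.std(ddof=1),  # sample standard deviation
    "min": areas_um2.min(),
    "max": areas_um2.max(),
}

# Diameter of a circle with the same area as each bead
diameters_um = 2 * np.sqrt(areas_um2 / np.pi)

print(stats)
```

The median is less sensitive than the mean to a few merged beads, such as the 300 pixel outlier in this toy data.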

In [None]:
from skimage import measure

In [None]:
import pandas as pd

df = pd.DataFrame(
    measure.regionprops_table(
        labels,
        properties=[
            "eccentricity",
            "filled_area",
            "equivalent_diameter",
            "orientation",
            "solidity",
            "perimeter",
            'area',
        ],
    )
)
df

In [None]:
df['circularity'] = (4 * np.pi * df['area']) /df['perimeter'] **2
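
To interpret this number, it helps to evaluate the same formula on ideal shapes. In the sketch below, the helper name `circularity` and the shape sizes are chosen only for illustration. Values measured on pixelated images can deviate from these ideal ones because the perimeter is only estimated.

```python
import numpy as np

def circularity(area, perimeter):
    """4 * pi * area / perimeter**2, which equals 1 for a perfect circle."""
    return 4 * np.pi * area / perimeter**2

r = 5.0  # hypothetical circle radius
circle = circularity(np.pi * r**2, 2 * np.pi * r)

s = 4.0  # hypothetical square side length
square = circularity(s**2, 4 * s)

print(circle, square)
```

A perfect circle gives 1 and a square gives pi / 4, so lower values indicate shapes that are less round.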

In [None]:
df

In [None]:
df.hist('circularity', bins=100)

In [None]:
plt.figure()
plt.scatter('circularity', 'eccentricity', data=df)
plt.ylabel('eccentricity')
plt.xlabel('circularity')

### Cell Tracking

In [None]:
import glob
files = sorted(glob.glob('data/Fluo-N3DH-CHO/01/*'))
cho_data = tifffile.imread(files)
CHO = xr.DataArray(
    cho_data,
    dims = ('T','Z','Y','X'),
    coords = {
        "T": 9.5 * np.arange(cho_data.shape[0]),
        "Z": 1.0 * np.arange(cho_data.shape[1]),
        "Y": 0.202 * np.arange(cho_data.shape[2]),
        "X": 0.202 * np.arange(cho_data.shape[3]),
    })
sq = CHO.sel(Z=3)

In [None]:
xr.Dataset({"images":sq})


In [None]:
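import scipy.ndimage as ndi
# Assumption: segment the first time point of the selected Z slice
# (`arr` previously held the scale-bar crop from the SEM example)
arr = sq.isel(T=0).values
thresholded = arr > 40
plt.figure()
plt.imshow(ndi.binary_fill_holes(thresholded))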

In [None]:
from skimage.measure import label
labels = label(ndi.binary_fill_holes(thresholded))

In [None]:
try_all_threshold(CHO[0,0].values)

In [None]:
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

image = ndi.binary_fill_holes(thresholded)
distance = ndi.distance_transform_edt(image)
coords = peak_local_max(distance, min_distance=50, footprint=np.ones((3, 3)), labels=image)
mask = np.zeros(distance.shape, dtype=bool)
mask[tuple(coords.T)] = True
markers, _ = ndi.label(mask)
labels = watershed(-distance, markers, mask=image)


In [None]:
plt.figure()
plt.imshow(labels)

In [None]:
from skimage.segmentation import relabel_sequential

In [None]:
cutoff = 10
ids, counts = np.unique(labels, return_counts=True)
# Avoid naming the loop variable `label`, which would shadow skimage.measure.label
for small_id in ids[counts < cutoff]:
    labels[labels==small_id] = 0


In [None]:
ids, counts = np.unique(labels, return_counts=True)

In [None]:
labels = relabel_sequential(remove_small_objects(labels,10))[0]

In [None]:
plt.figure()
plt.imshow(labels)


Fixing Watershed using Human in the loop with Napari

Tracking cells through time

### Breakout exercise


Using indexing, make a plot showing the evolution of each cell's area over time.

Hint: If cells are born later, then it can be tricky to plot them. So start with the easy part and just plot the cells that we know exist in the first frame.
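
One possible starting point, sketched on a tiny synthetic stack: count the pixels of every label in every frame with `np.bincount`, then index out the columns for the cells present in the first frame. The array `tracked` is hypothetical, and the approach assumes that tracking has already given each cell the same integer ID in every frame.

```python
import numpy as np

# Hypothetical tracked label stack with shape (T, Y, X)
tracked = np.zeros((3, 6, 6), dtype=int)
tracked[0, 0:2, 0:2] = 1  # cell 1 grows from 4 ...
tracked[1, 0:3, 0:2] = 1  # ... to 6 ...
tracked[2, 0:3, 0:3] = 1  # ... to 9 pixels
tracked[0, 4:6, 4:6] = 2  # cell 2 grows from 4 ...
tracked[1, 4:6, 3:6] = 2  # ... to 6 pixels ...
tracked[2, 4:6, 3:6] = 2  # ... and then stays at 6

n_cells = tracked.max()

# areas[t, i] is the pixel count of label i in frame t (column 0 is background)
areas = np.stack(
    [np.bincount(frame.ravel(), minlength=n_cells + 1) for frame in tracked]
)

# Keep only the cells that exist in the first frame
first_frame_ids = np.unique(tracked[0])
first_frame_ids = first_frame_ids[first_frame_ids != 0]
area_over_time = areas[:, first_frame_ids]

print(area_over_time)
```

Passing `area_over_time` to `plt.plot` draws one line per cell, with the frame index on the x axis.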



### Extra Breakout Activity


## Closing Thoughts on this section


1. GUI vs Scripting
2. If you make scripts, make them available to others!
    - Put them up on GitHub with a name
3. If you're at a company, consider open sourcing at least part of your software (see TensorFlow)

## How to get help!

1. Always always always google a phrase that basically says what you want.

Here are some of the things that I googled when making this notebook:

> how to analyze EDS python

> Cell tracking python

> remove small object skimage


In general people are friendly and want to know how you are using their software and what doesn't work for you:

- https://forum.image.sc/
- https://discourse.matplotlib.org/
- https://stackoverflow.com/
- https://discourse.jupyter.org/
- https://gitter.im/hyperspy/hyperspy
- Opening issues!
