Scale Analysis Example
====
Using scales from Rabia Sevil

We'll read the LIF files, segment the scales (either by thresholding or with a pretrained model), and then run Elliptical Fourier Analysis (EFA) to summarise their shape variation.

Read in the scales
----

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
plot = True

In [None]:
import pathlib

parent_dir = pathlib.Path("~/zebrafish_rdsf/Rabia/SOST scales").expanduser()
assert parent_dir.exists()

scale_dirs = tuple(d for d in parent_dir.glob("*") if d.stem not in {".DS_Store", "TIFs"})

In [None]:
from scale_morphology.scales import read

scale_dir = scale_dirs[1]
scale_paths = scale_dir.glob("*.lif")
path = next(scale_paths)
print(path)

names, images = zip(*read.read_lif(path))

In [None]:
import math
import textwrap
import matplotlib.pyplot as plt


def factor_int(n):
    """Return two integer factors of n that are as close to square as possible."""
    val = math.ceil(math.sqrt(n))
    val2 = n // val
    while val2 * val != n:
        val -= 1
        val2 = n // val
    return val, val2


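def plot_imgs(images, **plot_kw):
    """Plot images on a near-square grid, using the module-level `titles` for labels."""
    n_rows, n_cols = factor_int(len(images))

    # figsize is (width, height); squeeze=False keeps `axes` 2D even for a single image
    fig, axes = plt.subplots(
        n_rows, n_cols, figsize=(3 * n_cols, 3 * n_rows), squeeze=False
    )
    for axis, img, title in zip(axes.flat, images, titles):
        axis.imshow(img, **plot_kw)
        axis.set_title(title)
        axis.set_axis_off()
    fig.tight_layout()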


titles = ["\n".join(textwrap.wrap(name, width=10)) for name in names]
if plot:
    plot_imgs(images)

Segment them
----
Now that we have read the scales into memory, we want to segment them from the background by thresholding.
There aren't many images, so we could probably do this by hand, but I don't have a mouse right now so I'm going to try to automate it.

In [None]:
"""
Otsu threshold

"""

from tqdm.notebook import tqdm
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu

greyscale = [rgb2gray(img) for img in tqdm(images)]
thresholds = [threshold_otsu(img) for img in tqdm(greyscale)]

In [None]:
import numpy as np

if plot:
    n_rows, n_cols = factor_int(len(greyscale))
    # figsize is (width, height), so width scales with the number of columns
    fig, axes = plt.subplots(
        n_rows,
        n_cols,
        figsize=(5 * n_cols, 3 * n_rows),
        sharex=True,
        sharey=True,
        squeeze=False,
    )
    for axis, img, title, threshold in zip(axes.flat, greyscale, tqdm(titles), thresholds):
        axis.hist(img.ravel(), bins=np.linspace(0, 1, 100))
        axis.axvline(threshold, color="r")
        axis.set_title(title)
        axis.set_ylim(0, 400000)

    plt.show()

In [None]:
# Keep pixels darker than the Otsu threshold (assumes the scales are darker than the background)
thresholded = [i < t for i, t in zip(greyscale, thresholds)]
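
To check the thresholding step on data where the answer is known, here is a minimal sketch on a synthetic image: a dark disc on a bright, noisy background. The image and all variable names are made up for illustration, and it assumes, like the `i < t` comparison in this notebook, that the object is darker than the background.

```python
import numpy as np
from skimage.filters import threshold_otsu

rng = np.random.default_rng(0)

# Ground-truth object: a disc of radius 30 in a 128x128 image
yy, xx = np.mgrid[:128, :128]
disc = (yy - 64) ** 2 + (xx - 64) ** 2 < 30**2

# Dark object (about 0.2) on a bright background (about 0.8), plus noise
synthetic = np.where(disc, 0.2, 0.8) + rng.normal(0, 0.05, size=disc.shape)

threshold = threshold_otsu(synthetic)
mask = synthetic < threshold  # foreground = pixels darker than the threshold

# Fraction of pixels where the mask agrees with the ground truth
agreement = np.mean(mask == disc)
print(f"threshold={threshold:.2f}, agreement={agreement:.4f}")
```

If the scales were brighter than the background, the comparison would need to be flipped to `>`.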

In [None]:
if plot:
    plot_imgs(thresholded, cmap="binary")

In [None]:
"""
Remove objects touching the border
"""

from skimage.segmentation import clear_border

cleared_border = [clear_border(t) for t in tqdm(thresholded)]

if plot:
    plot_imgs(cleared_border, cmap="binary")

In [None]:
from scipy import ndimage


def largest_connected_component(binary_array):
    """
    Return the largest connected component of a 2D binary array, as a binary array,
    using 8-connectivity (pixels that share an edge or a corner are connected).

    :param binary_array: Binary array.
    :returns: Largest connected component.

    """
    labelled, _ = ndimage.label(binary_array, np.ones((3, 3)))

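    # Find the size of each component, ignoring the background (label 0)
    sizes = np.bincount(labelled.ravel())
    sizes[0] = 0

    retval = labelled == np.argmax(sizes)
    # Report how many foreground pixels were discarded with the smaller components
    print(f"{np.sum(binary_array) - np.sum(retval)}", end=" ")
    return retval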


single_obj = [largest_connected_component(img) for img in tqdm(cleared_border)]

In [None]:
if plot:
    plot_imgs(single_obj, cmap="binary")
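
The two clean-up steps above (dropping objects that touch the border, then keeping only the largest component) can be sanity-checked on a small hand-made mask. This is a sketch with made-up blob positions, not part of the analysis itself.

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import clear_border

mask = np.zeros((20, 20), dtype=bool)
mask[0:3, 0:3] = True  # 9-pixel blob touching the image border
mask[5:15, 5:15] = True  # 100-pixel blob in the centre
mask[16:18, 16:18] = True  # 4-pixel speck, not touching the border

# Step 1: remove anything connected to the border
cleared = clear_border(mask)

# Step 2: label with 8-connectivity and keep the largest component
labelled, n_components = ndimage.label(cleared, structure=np.ones((3, 3)))
sizes = np.bincount(labelled.ravel())
sizes[0] = 0  # ignore the background
largest = labelled == np.argmax(sizes)

print(n_components, int(cleared.sum()), int(largest.sum()))
```

Only the central blob survives: the corner blob is removed by `clear_border` and the speck is dropped for being smaller.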

Elliptical Fourier Analysis
----
We'll summarise their shapes using Elliptical Fourier Analysis (EFA)
<a name="cite_ref-1"></a><sup>[1]</sup>
<a name="cite_ref-2"></a><sup>[2]</sup>,
which decomposes the closed boundary into a sum of ellipses (harmonics) of increasing frequency.
The coefficients of each harmonic (which set the size, orientation and phase of its ellipse) describe the shape of the object.
There's a demonstration of how this works [here](https://reinvantveer.github.io/2019/07/12/elliptical_fourier_analysis.html).
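
EFA needs the boundary as an ordered, closed sequence of points. One way to get this from a binary mask is `skimage.measure.find_contours`, followed by rescaling the cumulative arc length to $t \in [0, 2\pi]$. This is an assumption for illustration: the notebook does not show which boundary extraction is actually used. A minimal sketch on a synthetic disc:

```python
import numpy as np
from skimage.measure import find_contours

# Synthetic mask: a disc of radius 20 that does not touch the image border
yy, xx = np.mgrid[:64, :64]
mask = (yy - 32) ** 2 + (xx - 32) ** 2 < 20**2

# Contours are returned as (row, col) coordinates; keep the longest one
contours = find_contours(mask.astype(float), 0.5)
boundary = max(contours, key=len)

# Parameterise by cumulative arc length, rescaled so that t runs from 0 to 2*pi
step = np.linalg.norm(np.diff(boundary, axis=0), axis=1)
arc = np.concatenate([[0.0], np.cumsum(step)])
t = 2 * np.pi * arc / arc[-1]

print(len(boundary), arc[-1])
```

For an object that does not touch the border, the first and last points of the contour coincide, so the curve is closed.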

The boundary is written as a truncated Fourier series in each coordinate:

\begin{aligned}
x(t) &= a_0 + \sum_{n=1}^{N} \big[a_n \cos(n t) + b_n \sin(n t)\big],\\
y(t) &= c_0 + \sum_{n=1}^{N} \big[c_n \cos(n t) + d_n \sin(n t)\big],
\qquad t \in [0, 2\pi].
\end{aligned}

with:

\begin{aligned}
a_0 = \frac{1}{2\pi}\int_{0}^{2\pi} x(t)\,dt,\qquad
c_0 = \frac{1}{2\pi}\int_{0}^{2\pi} y(t)\,dt.
\end{aligned}

\begin{aligned}
a_n &= \frac{1}{\pi}\int_{0}^{2\pi} x(t)\cos(n t)\,dt, &
b_n &= \frac{1}{\pi}\int_{0}^{2\pi} x(t)\sin(n t)\,dt,\\
c_n &= \frac{1}{\pi}\int_{0}^{2\pi} y(t)\cos(n t)\,dt, &
d_n &= \frac{1}{\pi}\int_{0}^{2\pi} y(t)\sin(n t)\,dt.
\end{aligned}

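These prefactors are the ones that go with the parameterisation $t \in [0, 2\pi]$ used here. Other conventions differ by constant factors: for example, parameterising by arc length over a period $T$, as in Kuhl and Giardina, gives $2/T$ in place of $1/\pi$.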
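
As a numerical check of the coefficient formulas above, the sketch below samples an ellipse with known coefficients uniformly in $t$, approximates each integral by a mean over the samples, and rebuilds the outline from the result. The function names `efa_coefficients` and `reconstruct` are made up for this illustration and are not part of `scale_morphology`. Real boundaries are not sampled uniformly in $t$, so treat this only as a check of the normalisation.

```python
import numpy as np


def efa_coefficients(x, y, t, n_harmonics):
    """Estimate a_0, c_0 and (a_n, b_n, c_n, d_n) from samples uniform in t."""
    a0, c0 = x.mean(), y.mean()
    coeffs = np.empty((n_harmonics, 4))
    for n in range(1, n_harmonics + 1):
        cos_nt, sin_nt = np.cos(n * t), np.sin(n * t)
        # (1/pi) * integral over [0, 2*pi] equals 2 * mean for uniform samples
        coeffs[n - 1] = [
            2 * np.mean(x * cos_nt),
            2 * np.mean(x * sin_nt),
            2 * np.mean(y * cos_nt),
            2 * np.mean(y * sin_nt),
        ]
    return a0, c0, coeffs


def reconstruct(a0, c0, coeffs, t):
    """Evaluate x(t), y(t) from the truncated series."""
    x = np.full_like(t, a0)
    y = np.full_like(t, c0)
    for n, (a, b, c, d) in enumerate(coeffs, start=1):
        x += a * np.cos(n * t) + b * np.sin(n * t)
        y += c * np.cos(n * t) + d * np.sin(n * t)
    return x, y


# Ellipse centred at (1, -1) with semi-axes 3 and 2
t = np.linspace(0, 2 * np.pi, 512, endpoint=False)
x = 1.0 + 3.0 * np.cos(t)
y = -1.0 + 2.0 * np.sin(t)

a0, c0, coeffs = efa_coefficients(x, y, t, n_harmonics=3)
x_hat, y_hat = reconstruct(a0, c0, coeffs, t)
print(a0, c0)
print(np.round(coeffs, 6))
```

For this ellipse the first harmonic comes back as $(a_1, b_1, c_1, d_1) = (3, 0, 0, 2)$ with $a_0 = 1$ and $c_0 = -1$, and the higher harmonics vanish to floating-point precision.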

[^1](#cite_ref-1): F. P. Kuhl and C. R. Giardina, 'Elliptic Fourier features of a closed contour', Computer Graphics and Image Processing, vol. 18, no. 3, pp. 236–258, Mar. 1982, doi: 10.1016/0146-664x(82)90034-x.

[^2](#cite_ref-2): N. MacLeod, 'PalaeoMath 101 part 25: the centre cannot hold II: elliptic Fourier analysis', Palaeontol. Assoc. Newslett. 79, 29–43, 2012, http://go.palass.org/65a.