# Analysis Capabilities

## 1. Cell Area

Cell area is estimated with a two-component Gaussian Mixture Model. Two Gaussian curves are fit to the pixel intensities of an image: one fits the "background" pixels and the other fits the "foreground" pixels. Cell area is then computed by thresholding pixel intensities at

$$ \mu_{\text{foreground}} + \gamma \times \sigma_{\text{foreground}} , $$

where $\gamma$ is a multiplier applied to the foreground standard deviation.

1. $\gamma = 0$: Pixels with intensities greater than $ \mu_{\text{foreground}} $ pass the threshold
2. $\gamma > 0$: Stricter than (1). Larger $ \gamma \Longrightarrow $ fewer pixels pass the threshold
3. $\gamma < 0$: Less strict than (1). More negative $ \gamma \Longrightarrow $ more pixels pass the threshold

**Notes**: The cell area script assumes that the brightest regions of your images are the cells. If this isn't the case, consider adding a preprocessing step so that the cells become the brightest regions.
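
The thresholding rule above can be sketched as follows. This is a minimal illustration, not the script's actual code: a small NumPy EM loop stands in for whichever mixture-model implementation is really used, and the function names (`fit_two_gaussians`, `cell_area_mask`), the initialisation, and the iteration count are all assumptions made for the example.

```python
import numpy as np

def fit_two_gaussians(pixels, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture with plain EM (illustrative).

    Returns (means, stds, weights), sorted so that index 1 is the brighter
    ("foreground") component.
    """
    x = np.asarray(pixels, dtype=float).ravel()
    # Assumed initialisation: start the means at the lower and upper quartiles.
    mu = np.percentile(x, [25, 75]).astype(float)
    sigma = np.full(2, x.std() + 1e-6)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        pdf = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = w * pdf
        resp /= resp.sum(axis=1, keepdims=True) + 1e-300
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    order = np.argsort(mu)
    return mu[order], sigma[order], w[order]

def cell_area_mask(image, gamma=0.0):
    """Threshold at mu_foreground + gamma * sigma_foreground."""
    image = np.asarray(image, dtype=float)
    mu, sigma, _ = fit_two_gaussians(image)
    threshold = mu[1] + gamma * sigma[1]
    return image > threshold, threshold
```

With this sketch, the cell area in pixels would be `mask.sum()`. Raising `gamma` raises the threshold, so the mask can only shrink.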

## 2. Z Projection

The Z projection of input Z stacks can be computed using several methods:

* Minimum pixel intensity
* Maximum pixel intensity
* Median pixel intensity
* Average pixel intensity
* Focus stacking (pixel-wise Laplacian)

A Z stack is a 3D collection of grayscale images. The Z projection is the 2D image that results when images are collapsed along the $z$-axis.
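
As a rough sketch, the four statistical projections reduce to a single NumPy reduction along the $z$-axis. The example assumes the stack is stored as an array of shape `(z, y, x)`. The function name `z_project` and the method keys are illustrative, not the tool's actual interface.

```python
import numpy as np

# Map each (assumed) method name to the NumPy reduction that implements it.
PROJECTIONS = {
    "min": np.min,
    "max": np.max,
    "median": np.median,
    "mean": np.mean,
}

def z_project(stack, method="max"):
    """Collapse a (z, y, x) stack into a 2D (y, x) image."""
    stack = np.asarray(stack)
    if stack.ndim != 3:
        raise ValueError("expected a 3D stack with shape (z, y, x)")
    return PROJECTIONS[method](stack, axis=0)
```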

### Minimum pixel intensity

Set pixel $ (x, y) $ to $ (x, y, z_{\text{min}}) $, the minimum pixel intensity along the $z$-axis at location $ (x, y) $.

### Maximum pixel intensity

Set pixel $ (x, y) $ to $ (x, y, z_{\text{max}}) $, the maximum pixel intensity along the $z$-axis at location $ (x, y) $.

### Median pixel intensity

Set pixel $ (x, y) $ to $ (x, y, z_{\text{med}}) $, the median pixel intensity along the $z$-axis at location $ (x, y) $.

### Average pixel intensity

Set pixel $ (x, y) $ to $ (x, y, z_{\text{avg}}) $, the average pixel intensity along the $z$-axis at location $ (x, y) $.

### Focus stacking

Set pixel $ (x, y) $ to $ (x, y, z_{\text{max foc}}) $, the value of the most "in-focus" pixel along the $z$-axis at location $ (x, y) $.
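
One way to sketch pixel-wise Laplacian focus stacking is shown below. It assumes that "most in-focus" means the largest absolute response to a 4-neighbour Laplacian, and that the stack has shape `(z, y, x)`. The actual implementation may smooth the images or use a different kernel. The helper names are hypothetical.

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian with edge-replicated borders."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def focus_stack(stack):
    """Pick, per pixel, the value from the slice with the sharpest response.

    Returns the projected 2D image and the index map of the chosen slices.
    """
    stack = np.asarray(stack)
    # Sharpness measure per slice: absolute Laplacian response.
    sharpness = np.stack([np.abs(laplacian(s)) for s in stack])
    best_z = sharpness.argmax(axis=0)
    projected = np.take_along_axis(stack, best_z[None, ...], axis=0)[0]
    return projected, best_z
```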

## 3. Invasion Depth

Invasion depth (within a given Z stack) is estimated using a deep neural network binary classifier (based on the ResNet50 architecture).

Given a Z stack of $ k $ slices (Z positions), the underlying classifier determines whether each slice contains enough in-focus cell area to be considered to demonstrate "invasion".

In symbols, a Z stack (here shown in *descending* order)

$$ \mathbf{Z} = (z_k, z_{k-1}, ..., z_0),$$

is fed into the invasion depth analysis system, which outputs two results:

1. A collection of "probabilities", $ \hat{\mathbf{p}} $, showing the model's confidence that invasion has been identified in each slice,
$$ \hat{\mathbf{p}} = (p_k, p_{k-1}, ..., p_0),$$
2. A collection of classifications, $ \hat{\mathbf{y}} $, obtained by thresholding each probability at a given value (typically $p_i > 0.5$),
$$ \hat{\mathbf{y}} = (\hat{y}_{k}, \hat{y}_{k-1}, ..., \hat{y}_0) $$
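
The thresholding step that turns $ \hat{\mathbf{p}} $ into $ \hat{\mathbf{y}} $ can be sketched as below. The function name `classify_slices` and the example probabilities are made up for illustration. The classifier that produces the probabilities is not shown.

```python
import numpy as np

def classify_slices(probabilities, threshold=0.5):
    """Return 1 for slices with p_i > threshold, 0 otherwise."""
    p = np.asarray(probabilities, dtype=float)
    return (p > threshold).astype(int)

# Hypothetical per-slice probabilities, in the same order as the Z stack.
p_hat = [0.91, 0.74, 0.50, 0.12]
y_hat = classify_slices(p_hat)
```

Because the comparison is strict, a slice with a probability of exactly 0.5 is not classified as invasion in this sketch.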

## 4. Microvessel Formation

Microvessel formation is analyzed using a two-step process: semantic binary segmentation and topological data analysis.

### Semantic Binary Segmentation
A trained U-Net Xception-style model segments microvessels from the image background. The model outputs a probabilistic segmentation, indicating the likelihood of each pixel being part of a vessel.

Training of the model involves:

1. Data Preparation: Dataset of fifty high-resolution, manually annotated images, split into training, validation, and test sets.
2. Data Augmentation: Application of transformations such as rotation, cropping, flipping, brightness and contrast alteration, noise addition, Gaussian blur, and elastic deformations.
3. Model Training: Conducted with a grid search to determine optimal hyperparameters. The model is trained for fifty epochs.
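
For segmentation, geometric augmentations must be applied identically to the image and its annotation mask, while photometric ones apply to the image only. The sketch below illustrates that idea for a subset of the listed transformations (flips, 90-degree rotations, brightness/contrast, and noise). It assumes images scaled to $[0, 1]$. The parameter ranges and the function name `augment` are assumptions, not the values used in training.

```python
import numpy as np

def augment(image, mask, rng):
    """Apply a random subset of simple augmentations to an image/mask pair."""
    # Geometric transforms: applied to both image and mask.
    if rng.random() < 0.5:
        image, mask = np.fliplr(image), np.fliplr(mask)
    if rng.random() < 0.5:
        image, mask = np.flipud(image), np.flipud(mask)
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image, mask = np.rot90(image, k), np.rot90(mask, k)

    # Photometric transforms: applied to the image only.
    contrast = rng.uniform(0.8, 1.2)      # assumed range
    brightness = rng.uniform(-0.1, 0.1)   # assumed range
    image = image * contrast + brightness
    image = image + rng.normal(0.0, 0.02, size=image.shape)  # additive noise

    return np.clip(image, 0.0, 1.0), mask.copy()
```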

Inference on a whole image involves two steps:
1. Segment individual, overlapping tiles of the image, using a crop size similar to the one used during training.
2. Blend the overlapping tiles together using code from [Smoothly-Blend-Image-Patches](https://github.com/Vooban/Smoothly-Blend-Image-Patches), producing a realistic, smooth segmentation of the whole image.
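
The general idea of tiled inference with blending can be sketched as a window-weighted average of overlapping tile predictions. This is a simplified stand-in and not the Smoothly-Blend-Image-Patches code or its API. `predict_fn` is a hypothetical callable that maps a 2D tile to a same-sized probability map, and the tile size, stride, and Hann window are assumptions. The image is assumed to be at least one tile in each dimension.

```python
import numpy as np

def predict_tiled(image, predict_fn, tile=64, stride=32):
    """Predict on overlapping tiles and blend them with a smooth window."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    # 2D weighting window; the small offset keeps border weights non-zero.
    win1d = np.hanning(tile) + 1e-3
    window = np.outer(win1d, win1d)

    acc = np.zeros((h, w))
    weight = np.zeros((h, w))

    ys = list(range(0, h - tile + 1, stride))
    xs = list(range(0, w - tile + 1, stride))
    # Make sure the last row/column of tiles reaches the image border.
    if ys[-1] != h - tile:
        ys.append(h - tile)
    if xs[-1] != w - tile:
        xs.append(w - tile)

    for y in ys:
        for x in xs:
            prob = predict_fn(image[y:y + tile, x:x + tile])
            acc[y:y + tile, x:x + tile] += prob * window
            weight[y:y + tile, x:x + tile] += window

    # Weighted average of all tile predictions covering each pixel.
    return acc / weight
```

Pixels near a tile's centre receive more weight than pixels near its edge, which is what suppresses visible seams between neighbouring tiles.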

### Topological Data Analysis
After segmentation, the microvessel network is analyzed using the DisPerSE algorithm to extract a graph representation, referred to as the Morse skeleton. This step involves:

1. Preprocessing: Images are preprocessed to remove background noise and isolate the microvessel network.
2. Graph Extraction: The DisPerSE algorithm extracts a graph representation from the segmented images.
3. Network Simplification: Branches shorter than a threshold are removed and branch trajectories are smoothed. At this point, a color-coded visualization of the network can be extracted.
4. Persistent Homology: Persistent homology is used to characterize the branching structure of the microvessels, resulting in a persistence barcode. The basic metrics extracted are the average branch length and the number of branches.
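
The two basic metrics can be illustrated with the sketch below. It assumes the skeleton has already been exported as a list of branches, each a polyline of `(x, y)` points. It does not call the graph-extraction tool itself. The names `branch_length` and `summarize_branches` and the `min_length` value are hypothetical.

```python
import numpy as np

def branch_length(points):
    """Length of a polyline given as a sequence of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    return float(np.linalg.norm(np.diff(pts, axis=0), axis=1).sum())

def summarize_branches(branches, min_length=10.0):
    """Drop branches shorter than min_length, then summarize the rest."""
    lengths = [branch_length(b) for b in branches]
    kept = [length for length in lengths if length >= min_length]
    return {
        "num_branches": len(kept),
        "avg_branch_length": float(np.mean(kept)) if kept else 0.0,
    }

# Hypothetical branches with lengths 10, 2, and 50 pixels.
branches = [
    [(0, 0), (3, 4), (6, 8)],
    [(0, 0), (0, 2)],
    [(0, 0), (30, 40)],
]
summary = summarize_branches(branches, min_length=10.0)
```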

**Notes**:
- Like the cell area script, the microvessel analysis script assumes that the brightest regions of your images are the cells. If this isn't the case, consider adding a preprocessing step so that the cells become the brightest regions.
- The script requires a trained model. See notebooks/microvessels_segmentation_training. Recommended: Try using the pretrained model first. If the pretrained model does not work well, use an interactive segmentation tool to annotate 50 or so of your images and train a new model.