# Image compression for functional multiphoton imaging

Multiphoton images are *noisy*: the signal is dominated by the quantum noise arising from the limited number of photons detected by the photomultiplier tubes. The noise contributes to the high entropy of the images, making them difficult to compress *losslessly*. Our goal is to accurately encode the underlying biological signals, not the recorded noise.

It is possible to re-quantize the images through a grayscale transformation optimized for preserving the underlying signal while reducing the total number of grayscale levels. This results in a significant reduction of signal entropy. 

## Grayscale quantization

Let $x$ represent the underlying biological signal, which is a function of time and space. 
Let $\varepsilon$ represent the noise introduced by the data acquisition process.
We will assume that the noise is unbiased, $\mathop{\mathbb{E}}[\varepsilon]=0$, with variance $\sigma^2 = \mathop{\mathbb{E}}[\varepsilon^2]$.

Let $d$ be the offset introduced by the grayscale quantization with step $\delta$.

Thus the recorded signal is $$\dot x = x + \varepsilon + d$$

Let $\beta = \delta/\sigma$ be the quantization step-to-noise ratio. 


For $\beta < 1.0$, the effect of grayscale quantization can be approximated as an independent random variable distributed uniformly on the interval $\left[-\delta/2, +\delta/2\right]$.
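The uniform approximation can be checked with a short simulation. The sketch below is illustrative only: the noise level, the value of $\beta$, and the stand-in signal are assumed values, and the variable names simply mirror the symbols above.

```python
import math
import random
import statistics

rng = random.Random(0)

sigma = 1.0           # assumed noise standard deviation
beta = 0.5            # assumed step-to-noise ratio
delta = beta * sigma  # quantization step
n = 100_000

eps = [rng.gauss(0.0, sigma) for _ in range(n)]  # unbiased acquisition noise
x = [rng.uniform(10.0, 20.0) for _ in range(n)]  # arbitrary stand-in for the signal

d = []
for xi, ei in zip(x, eps):
    noisy = xi + ei
    quantized = delta * round(noisy / delta)  # nearest gray level
    d.append(quantized - noisy)               # offset introduced by quantization

mean_d = statistics.fmean(d)
var_d = statistics.pvariance(d)
mean_e = statistics.fmean(eps)
cov = statistics.fmean((di - mean_d) * (ei - mean_e) for di, ei in zip(d, eps))
corr = cov / math.sqrt(var_d * statistics.pvariance(eps))

print(f"max |d| / delta : {max(abs(v) for v in d) / delta:.3f}")  # bounded by 0.5
print(f"mean(d) / delta : {mean_d / delta:+.4f}")                 # close to 0
print(f"var(d) / delta^2: {var_d / delta**2:.4f}")                # uniform of width delta: 1/12
print(f"corr(d, eps)    : {corr:+.4f}")                           # close to 0
```

For steps this fine, the offset $d$ stays within half a step, averages to zero, and is essentially uncorrelated with the noise, which is what the independence assumption requires.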

Then the mean and the variance of the combined noise and quantization error are $$\mathop{\mathbb E}[\varepsilon + d] = 0$$ and $$\mathop{\mathbb E}\left[(\varepsilon + d)^2\right] = \mathop{\mathbb E}\left[\varepsilon^2\right] + \mathop{\mathbb E}\left[d^2\right] = \sigma^2 + \frac 1 {12} \delta^2 =  \left(1 + \frac 1 {12} \beta^2\right) \sigma^2$$ Here $\mathop{\mathbb E}\left[d^2\right] = \delta^2/12$ is the variance of a uniform distribution on an interval of width $\delta$, and the cross term vanishes because $d$ is independent of $\varepsilon$.

This means that quantization increases the standard deviation of the noise by a factor of $\sqrt{1 + \frac 1 {12} \beta^2}$. This factor expresses the relative drop in signal-to-noise ratio (SNR) introduced by quantization. The degrading effect of quantization on the variance of the estimate does not become significant until $\beta$ exceeds 0.5:

| $\beta$ | SNR change |
| ------- | ---------- |
| 0.01 | -0.00042% |
| 0.1 | -0.042% |
| 0.2 | -0.17% |
| 0.5 | -1.0% |
| 0.7 | -2.0% |
| 1.0 | -4.1% |

In general, we will assert that quantization finer than $\beta=0.5$ provides no substantial improvement in signal estimation.

In [2]:
print('β\t SNR change')
for β in 0.01, 0.1, 0.2, 0.5, 0.7, 1.0:
    # relative SNR change, 1 - sqrt(1 + β²/12), as a percentage
    print(f'{β} \t {(1 - (1 + β*β/12) ** .5)*100:0.2}%')

β	 SNR change
0.01 	 -0.00042%
0.1 	 -0.042%
0.2 	 -0.17%
0.5 	 -1.0%
0.7 	 -2.0%
1.0 	 -4.1%


## Variance equalization

Since multiphoton imaging is dominated by quantum shot noise, the variance of the noise is not constant. Following the Poisson distribution, the variance $\sigma^2$ of the noise scales linearly with the intensity of the fluorescent signal $x$.

This can be addressed by a variance-equalizing transformation for Poisson noise such as the *Anscombe transform*:

$$\dot x \mapsto 2\sqrt{\frac 1 q  \dot x + \frac 3 8}$$ where $q$ is the intensity induced by a single photon. 

This will result in an image with a noise variance of approximately 1.0 everywhere, provided that the mean signal is at least a few photons per pixel.
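The variance-equalizing property can be verified without simulation by summing over the Poisson distribution directly. The sketch below assumes an ideal detector with no offset; `anscombe` and `transformed_variance` are illustrative names, and the photon rates and the value of `q` are arbitrary.

```python
import math

def anscombe(x, q=1.0):
    """Anscombe transform of a recorded intensity x with single-photon size q."""
    return 2.0 * math.sqrt(x / q + 3 / 8)

def transformed_variance(lam, q=1.0):
    """Exact variance of the transformed signal for a mean of `lam` photons."""
    kmax = int(lam + 12 * math.sqrt(lam) + 20)  # truncate the negligible tail
    p = math.exp(-lam)                          # Poisson probability of k = 0
    m1 = m2 = 0.0
    for k in range(kmax + 1):
        y = anscombe(q * k, q)                  # recorded intensity is q * k
        m1 += p * y
        m2 += p * y * y
        p *= lam / (k + 1)                      # recurrence for P(k + 1)
    return m2 - m1 * m1

for lam in (1, 3, 10, 30, 100):
    print(f"mean photons = {lam:>3}: variance after transform = "
          f"{transformed_variance(lam, q=20.0):.3f}")
```

The variance is close to 1 once the mean reaches several photons and falls below 1 at very low photon counts, where the approximation is weakest.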

In practice, A/D converters may add an offset and the single-photon magnitude $q$ is unknown. Therefore, the variance normalization is derived by robust parametric fitting of the noise variance as a function of pixel intensity. A useful output of this fit is the NEQ (noise-equivalent quantum) size $q$.
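One way to carry out such a fit is sketched below on synthetic data. It assumes the recorded value is `offset + q * photons`, so that the temporal variance of a pixel equals `q * (mean - offset)`: the slope of variance against mean gives $q$ and the zero crossing gives the offset. The Theil-Sen estimator is used here only as a simple robust line fit and is not necessarily the method used in practice; `true_q`, `true_offset`, and all function names are hypothetical.

```python
import math
import random
import statistics
from itertools import combinations

def poisson(lam, rng):
    """Draw one Poisson sample (Knuth's method, adequate for small rates)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def theil_sen(xs, ys):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]
    slope = statistics.median(slopes)
    intercept = statistics.median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

rng = random.Random(0)
true_q, true_offset = 20.0, 100.0  # hypothetical detector parameters
n_pixels, n_frames = 50, 1000
rates = [1 + 39 * i / (n_pixels - 1) for i in range(n_pixels)]  # photons per frame

means, variances = [], []
for lam in rates:
    trace = [true_offset + true_q * poisson(lam, rng) for _ in range(n_frames)]
    means.append(statistics.fmean(trace))
    variances.append(statistics.variance(trace))

q_est, intercept = theil_sen(means, variances)
offset_est = -intercept / q_est  # intensity at which the variance reaches zero

print(f"estimated q      = {q_est:.1f} (simulated with {true_q})")
print(f"estimated offset = {offset_est:.1f} (simulated with {true_offset})")
```

On real recordings the per-pixel statistics would also include signal fluctuations from activity and motion, which is the reason for preferring a robust fit over ordinary least squares.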

This transformation is stable with respect to the imaging settings and can be estimated accurately from a larger stretch of data. 

Together, these two steps, variance equalization followed by grayscale quantization with $\beta=0.5$, produce a grayscale transformation that significantly reduces the image bit depth without losing biologically relevant signals.
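The combined transformation can be expressed as a lookup table from raw gray levels to re-quantized levels. The following is a minimal sketch under assumed values: a 12-bit raw range, a hypothetical offset of 100, and a hypothetical single-photon size of 20. Clipping values below the offset to zero photons is also an assumption of this sketch.

```python
import math

def build_lut(q, offset, beta=0.5, n_raw_levels=4096):
    """Map each raw gray level to a variance-equalized, re-quantized level."""
    lut = []
    for raw in range(n_raw_levels):
        photons = max(raw - offset, 0.0) / q           # clip values below the offset
        equalized = 2.0 * math.sqrt(photons + 3 / 8)   # Anscombe transform
        lut.append(round(equalized / beta))            # step of beta noise std units
    lowest = lut[0]                                    # table is non-decreasing
    return [level - lowest for level in lut]

lut = build_lut(q=20.0, offset=100.0)  # hypothetical detector parameters
n_levels = len(set(lut))
bits = math.ceil(math.log2(n_levels))
print(f"{len(lut)} raw levels -> {n_levels} levels ({bits} bits per pixel)")
```

Applying the table is a single indexing operation per pixel, and the table itself is all that needs to be stored alongside the data to map levels back to approximate intensities.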

## Frequency-based lossless compression

Subtracting a running average from the current frame yields difference images whose pixel values are roughly normally distributed. In regions with little activity, their entropy is $H = \frac 1 2 \log_2\frac{2 \pi e}{\beta^2}$ bits per pixel. For $\beta = 0.5$, this is about 3 bits per pixel.

In [8]:
import math
β, e, π = 0.5, math.e, math.pi
# entropy of unit-variance Gaussian noise quantized with step β
print(f"{0.5 * math.log2(2*π*e/β**2):0.2} bits per pixel")

3.0 bits per pixel
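The closed-form rate can be compared against the empirical entropy of simulated data. This sketch assumes ideal unit-variance Gaussian noise (as after variance equalization) and ignores any extra noise contributed by the running average; the sample size and seed are arbitrary.

```python
import math
import random
from collections import Counter

rng = random.Random(0)
beta = 0.5
n = 200_000

# Quantize unit-variance Gaussian noise with step beta
levels = [round(rng.gauss(0.0, 1.0) / beta) for _ in range(n)]

# Plug-in entropy estimate from the observed level frequencies
counts = Counter(levels)
h_empirical = -sum(c / n * math.log2(c / n) for c in counts.values())
h_formula = 0.5 * math.log2(2 * math.pi * math.e / beta**2)

print(f"distinct levels  : {len(counts)}")
print(f"empirical entropy: {h_empirical:.2f} bits per pixel")
print(f"closed form      : {h_formula:.2f} bits per pixel")
```

The two values should agree closely for $\beta = 0.5$; the closed form is a fine-quantization approximation and becomes less accurate as $\beta$ grows.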


This rate predicts the size of the images after efficient lossless compression in regions of bright fluorescence with no motion or activity. In regions with calcium activity or motion, the bit rate will be higher. In regions with no fluorescence (below the single-photon threshold), the bit rate will be lower. Therefore, it is reasonable to estimate an average bit rate of about 2.5 bits per pixel. Considering that most current systems store the data unpacked in 16-bit values, this will lead to a 6.4:1 compression ratio in an effectively lossless scheme. Some formats pack 12-bit or 10-bit data, in which case the compression ratio would be closer to 4.8:1 or 4:1, respectively.
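The ratios follow from dividing the stored bit depth by the estimated bit rate. A quick check, taking the estimate of 2.5 bits per pixel as given:

```python
estimated_bits_per_pixel = 2.5  # estimate from the text, not a measurement

ratios = {bits: bits / estimated_bits_per_pixel for bits in (16, 12, 10)}
for stored_bits, ratio in ratios.items():
    print(f"{stored_bits}-bit storage -> {ratio:.1f}:1")
```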

This compression is based purely on the frequencies of single-pixel intensity values in the temporal difference images. Additional gains may be derived from spatial dependencies. 
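As one concrete instance of coding by single-pixel value frequencies, a Huffman code can be built from the level probabilities of unit-variance Gaussian noise quantized with step $\beta$. This is a sketch of the principle and not necessarily the coder that would be used in practice; the function names and the truncation of the distribution at six standard deviations are assumptions.

```python
import heapq
import math

def level_probabilities(beta, span=6.0):
    """Probabilities of quantization levels for unit-variance Gaussian noise."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    kmax = math.ceil(span / beta)
    probs = {k: cdf((k + 0.5) * beta) - cdf((k - 0.5) * beta)
             for k in range(-kmax, kmax + 1)}
    total = sum(probs.values())
    return {k: p / total for k, p in probs.items() if p > 0.0}

def huffman_code_lengths(probs):
    """Code length in bits assigned to each symbol by a Huffman code."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree)
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:   # merged symbols move one level deeper
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths

probs = level_probabilities(beta=0.5)
lengths = huffman_code_lengths(probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
mean_length = sum(p * lengths[k] for k, p in probs.items())

print(f"entropy          : {entropy:.2f} bits per pixel")
print(f"Huffman code rate: {mean_length:.2f} bits per pixel")
```

A Huffman code is guaranteed to come within one bit of the entropy; arithmetic or range coders can get closer, and context models that exploit spatial dependencies could go below the single-pixel entropy.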

We must define data compression utilities that are configurable and replaceable to allow for iterative improvements.
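One possible shape for such utilities is a small registry of interchangeable lossless codecs that share a common interface. The sketch below is purely illustrative: `Codec`, `CODECS`, `register_codec`, and `compressed_bits_per_pixel` are hypothetical names, and the general-purpose standard-library compressors stand in for whatever codec is eventually chosen.

```python
import bz2
import lzma
import random
import zlib
from typing import Callable, Dict, NamedTuple

class Codec(NamedTuple):
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]

CODECS: Dict[str, Codec] = {}

def register_codec(name: str, compress, decompress) -> None:
    """Add or replace a codec under the given name."""
    CODECS[name] = Codec(compress, decompress)

register_codec("zlib", lambda data: zlib.compress(data, 9), zlib.decompress)
register_codec("bz2", bz2.compress, bz2.decompress)
register_codec("lzma", lzma.compress, lzma.decompress)

def compressed_bits_per_pixel(pixels: bytes, codec_name: str) -> float:
    """Compress one-byte-per-pixel data and report the resulting bit rate."""
    codec = CODECS[codec_name]
    packed = codec.compress(pixels)
    if codec.decompress(packed) != pixels:
        raise ValueError(f"codec {codec_name!r} is not lossless")
    return 8 * len(packed) / len(pixels)

# Synthetic test data: unit-variance Gaussian noise quantized with beta = 0.5
rng = random.Random(0)
pixels = bytes(round(rng.gauss(0.0, 1.0) / 0.5) + 128 for _ in range(100_000))

rates = {name: compressed_bits_per_pixel(pixels, name) for name in CODECS}
for name, rate in rates.items():
    print(f"{name:>5}: {rate:.2f} bits per pixel")
```

Because every codec is looked up by name, a configuration value can select the codec, and an improved coder can be registered later without changing the calling code.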