# Tutorial about analyzing blink statistics

Single-molecule localization microscopy (SMLM) depends critically on fluorescence intermittency, in other words the blinking of fluorescent dyes. To characterize blinking properties, you can compute on- and off-periods from clustered localizations, assuming that they originate from the same fluorophore.

In [None]:
from pathlib import Path

%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

import locan as lc

In [None]:
lc.show_versions(system=False, dependencies=False, verbose=False)

## Synthetic data

We use synthetic data that represents localizations from a single fluorophore, normally distributed in space and emitted at a constant intensity. We assume that the on- and off-times, in units of frames, follow a geometric distribution with a mean on-period `mean_on` and a mean off-period `mean_off`. A geometric distribution is typically parameterized by a success probability `p` with `p = 1 / mean`.
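
The relation `p = 1 / mean` can be checked directly. The following is a minimal sketch, assuming the `scipy.stats.geom` convention in which the support starts at 1 (so the shortest possible period is one frame); the names `p_on` and `check_samples` are illustrative and not part of the tutorial.

```python
import numpy as np
import scipy.stats as stats

mean_on = 5          # desired mean on-period in frames (value used in this tutorial)
p_on = 1 / mean_on   # corresponding success probability (illustrative name)

# scipy.stats.geom has support k = 1, 2, 3, ... and mean 1 / p.
theoretical_mean = stats.geom.mean(p_on)

# Draw samples to confirm that the empirical mean is close to mean_on.
rng = np.random.default_rng(seed=1)
check_samples = stats.geom.rvs(p=p_on, size=100_000, random_state=rng)

print(theoretical_mean)
print(check_samples.mean(), check_samples.min())
```

Because the support starts at 1, no sampled period has length zero, which is what the frame construction in the next step relies on.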

In [None]:
rng = np.random.default_rng(seed=1)

In [None]:
n_samples = 10_000
mean_on = 5
mean_off = 20

on_periods = stats.geom.rvs(p=1/mean_on, size=n_samples, random_state=rng)
off_periods = stats.geom.rvs(p=1/mean_off, size=n_samples, random_state=rng)

On- and off-times are converted into a series of frame numbers at which a localization was detected.

In [None]:
def periods_to_frames(on_periods, off_periods):
    """
    Convert on- and off-periods into a series of increasing frame values.
    """
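    # Index of every on-frame as if there were no off-periods in between.
    on_frames = np.arange(np.sum(on_periods))
    # Number of off-frames accumulated before each on-period (none before the first).
    cumsums = np.r_[0, np.cumsum(off_periods)[:-1]]
    # Shift each on-frame by the off-frames that precede its on-period.
    add_on = np.repeat(cumsums, on_periods)
    frames = on_frames + add_on
    # Keep one frame number per simulated localization.
    return frames[:len(on_periods)]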

frames = periods_to_frames(on_periods, off_periods)
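
To see what the conversion does, it helps to run the same arithmetic on a tiny input. The sketch below uses a hypothetical helper, `expand_periods`, that repeats the steps of `periods_to_frames` but skips the final truncation so that every on-frame is visible.

```python
import numpy as np

def expand_periods(on_periods, off_periods):
    """Hypothetical helper: frame numbers of all on-frames, without truncation."""
    on_periods = np.asarray(on_periods)
    off_periods = np.asarray(off_periods)
    # Index of every on-frame as if there were no off-periods in between.
    on_frames = np.arange(on_periods.sum())
    # Number of off-frames accumulated before each on-period (none before the first).
    preceding_off = np.r_[0, np.cumsum(off_periods)[:-1]]
    # Shift each on-frame by the off-frames that precede its on-period.
    return on_frames + np.repeat(preceding_off, on_periods)

# Two on-periods (2 and 3 frames) separated by an off-period of 4 frames.
example_frames = expand_periods(on_periods=[2, 3], off_periods=[4, 1])
print(example_frames)  # [0 1 6 7 8]
```

The emitter is on in frames 0 and 1, dark in frames 2 to 5, and on again in frames 6 to 8. The last off-period is never used because no on-period follows it.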

In [None]:
offspring = [rng.normal(loc=0, scale=10, size=(n_samples, 2))]
locdata = lc.simulate_cluster(centers=[(50, 50)], region=[(0, 100), (0, 100)], offspring=offspring, clip=False, shuffle=False, seed=rng)
locdata.dataframe['intensity'] = 1
locdata.dataframe['frame'] = frames

locdata = lc.LocData.from_dataframe(dataframe=locdata.data)

print('Data head:')
print(locdata.data.head(), '\n')
print('Summary:')
locdata.print_summary()
print('Properties:')
print(locdata.properties)

In [None]:
lc.render_2d(locdata, bin_size=5);

## Blinking statistics

To determine on- and off-times for the observed blink events, use the analysis class `BlinkStatistics`.

In [None]:
bs = lc.BlinkStatistics(memory=0, remove_heading_off_periods=False).compute(locdata)
bs
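
Conceptually, on- and off-periods can be read off a sorted list of frame numbers by looking at the gaps between consecutive frames. The following is a minimal numpy sketch of that idea, not locan's implementation. It assumes that any missing frame ends an on-period, which is presumably what `memory=0` requests; the variable names are illustrative.

```python
import numpy as np

# Sorted frame numbers at which a single emitter was detected (toy data).
toy_frames = np.array([0, 1, 6, 7, 8, 10])

# A difference larger than 1 between consecutive frames marks an off-period.
gaps = np.diff(toy_frames)
breaks = np.flatnonzero(gaps > 1)

# Off-period length: number of missing frames inside each gap.
toy_off_periods = gaps[breaks] - 1

# On-period length: size of each run of consecutive frames.
runs = np.split(toy_frames, breaks + 1)
toy_on_periods = np.array([len(run) for run in runs])

print(toy_on_periods)   # [2 3 1]
print(toy_off_periods)  # [4 1]
```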

In [None]:
bs.results.keys()

When plotting the histogram, an exponential distribution is fitted by default.

In [None]:
bs.hist(data_identifier='on_periods');

In [None]:
bs.hist(data_identifier='off_periods');

In [None]:
bs.distribution_statistics

The fit results provide the `loc` and `scale` parameters (see the `scipy.stats` documentation). For `loc = 0`, `scale` is the mean of the distribution.

In [None]:
bs.distribution_statistics['on_periods'].parameter_dict()

In [None]:
bs.distribution_statistics['off_periods'].parameter_dict()

With the default treatment of the location parameter `loc`, the mean on-period is `on_periods_scale + on_periods_loc`, in agreement with our input value.
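
The role of `loc` and `scale` can be reproduced with `scipy.stats` alone. The sketch below fits `scipy.stats.expon` to freshly drawn geometric samples; it only illustrates the `loc + scale` relation and makes no claim about how `BlinkStatistics` performs its fit internally. The names `samples` and `fitted_mean` are illustrative.

```python
import numpy as np
import scipy.stats as stats

# Geometric samples with mean 5, standing in for measured on-periods.
rng = np.random.default_rng(seed=1)
samples = stats.geom.rvs(p=1 / 5, size=10_000, random_state=rng)

# Maximum-likelihood fit of a continuous exponential distribution.
loc, scale = stats.expon.fit(samples)

# For scipy.stats.expon the mean is loc + scale.
fitted_mean = stats.expon.mean(loc=loc, scale=scale)

print(loc, scale)
print(fitted_mean, samples.mean())
```

Since the shortest geometric sample is one frame, the fitted `loc` is not zero here, so `scale` alone would underestimate the mean period.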

## Geometric distribution

We can compare this with a geometric distribution whose parameter is estimated from the observed mean on-period `on_periods_mean`.

In [None]:
on_periods_mean = bs.results['on_periods'].mean()
on_periods_mean.round(2)

In [None]:
off_periods_mean = bs.results['off_periods'].mean()
off_periods_mean.round(2)

In [None]:
# Compare the on-period histogram with the exponential fit
# and a geometric pmf with p = 1 / on_periods_mean.
x = np.arange(stats.geom.ppf(0.01, 1/on_periods_mean), stats.geom.ppf(0.9999, 1/on_periods_mean))
y = stats.geom.pmf(x, 1/on_periods_mean)
fig, ax = plt.subplots()
bs.hist(data_identifier='on_periods', fit=False, label='data')
bs.distribution_statistics['on_periods'].plot(label='exponential')
ax.plot(x, y, '-go', label='geometric')
ax.set_yscale('log')
ax.legend(loc='best')
plt.show()