# [PyBroMo](http://tritemio.github.io/PyBroMo/) - 2.3 Generate smFRET data - Mixture

<small>
*This notebook is part of [PyBroMo](http://tritemio.github.io/PyBroMo/), a
Python-based single-molecule Brownian motion diffusion simulator
for confocal [smFRET](http://en.wikipedia.org/wiki/Single-molecule_FRET)
experiments. You can find the full list of notebooks [here](http://nbviewer.ipython.org/github/tritemio/PyBroMo/tree/master/notebooks/).*
</small>

## *Overview*

*In this notebook we show how to generate smFRET data files from raw timestamps*.

## Loading the software

Import all the relevant libraries:

In [None]:
%matplotlib inline
from pathlib import Path
import numpy as np
import tables
import matplotlib.pyplot as plt
import seaborn as sns
import pybromo as pbm
print('Numpy version:', np.__version__)
print('PyTables version:', tables.__version__)
print('PyBroMo version:', pbm.__version__)

# Create smFRET data-files

## Create a file for a single FRET efficiency

In this section we show how to save a single smFRET data file. In the next section we will perform the same steps in a loop to generate a sequence of smFRET data files.

### Step 1: Create the timestamp array

We start by loading the timestamps for the donor and acceptor channels.
The FRET efficiency is determined by the **max emission rate ratio (*k*)**. We also need to choose the background rate.

As a memo, let's write some formulas related to the FRET efficiency:

$$ k = \frac{F_a}{F_d} \quad,\qquad E = \frac{k}{k+1}  \qquad\Rightarrow\qquad k = \frac{E}{1-E}$$
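As a quick numerical check of the formulas above, the following minimal sketch converts between $E$ and $k$. The helper names `k_from_E` and `E_from_k` are illustrative and are not part of PyBroMo.

```python
def k_from_E(E):
    """Emission rate ratio k = F_a / F_d for a FRET efficiency E (E < 1)."""
    return E / (1 - E)

def E_from_k(k):
    """FRET efficiency E for an emission rate ratio k."""
    return k / (k + 1)

# Round trip for the two efficiencies used later in this notebook
for E in (0.2, 0.55):
    k = k_from_E(E)
    print('E = %.2f  ->  k = %.3f  ->  E = %.2f' % (E, k, E_from_k(k)))
```

Note that $k$ diverges as $E \to 1$, which is why the simulation below is parametrized with a total emission rate and $E$ rather than with $k$ directly.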

In [None]:
S = pbm.ParticlesSimulation.from_datafile('0168', mode='w')

In [None]:
#S = pbm.ParticlesSimulation.from_datafile('0168')

## Simulate timestamps

In [None]:
def em_rates_DA_from_E(em_rate_tot, E_values):
    E_values = np.asarray(E_values)
    em_rates_a = E_values * em_rate_tot
    em_rates_d = em_rate_tot - em_rates_a
    return em_rates_d, em_rates_a

def em_rates_from_E(em_rate_tot, E_values):
    em_rates_d, em_rates_a = em_rates_DA_from_E(em_rate_tot, E_values)
    return np.unique(np.hstack([em_rates_d, em_rates_a]))

def em_rates_DA_from_E_mix(em_rate_tot1, em_rate_tot2, E_values1, E_values2):
    em_rates_d1, em_rates_a1 = em_rates_DA_from_E(em_rate_tot1, E_values1)
    em_rates_d2, em_rates_a2 = em_rates_DA_from_E(em_rate_tot2, E_values2)
    return em_rates_d1, em_rates_a1, em_rates_d2, em_rates_a2

In [None]:
em_rates_DA_from_E_mix(200e3, 150e3, 0.55, 0.8)

In [None]:
em_rate_tot1 = 250e3
em_rate_tot2 = 500e3
E1 = 0.55
E2 = 0.2

em_rate_d1, em_rate_a1, em_rate_d2, em_rate_a2 = \
    em_rates_DA_from_E_mix(em_rate_tot1, em_rate_tot2, E1, E2)

In [None]:
em_rate_d1, em_rate_a1, em_rate_d2, em_rate_a2

In [None]:
bg_rate_d, bg_rate_a = 900, 700
bg_rates = [bg_rate_a, bg_rate_d]

In [None]:
def populations_slices(particles):
    D_counts1, D_counts2 = particles.diffusion_coeff_counts
    return slice(0, D_counts1[1]), slice(D_counts1[1], D_counts1[1] + D_counts2[1])

In [None]:
populations_slices(S.particles)
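To illustrate what `populations_slices` returns without opening a simulation file, here is a minimal sketch that uses a hypothetical stand-in for `S.particles`. It assumes, as the unpacking above suggests, that `diffusion_coeff_counts` is a sequence of `(diffusion coefficient, number of particles)` pairs; the values 15 and 5 are arbitrary.

```python
from collections import namedtuple

# Hypothetical stand-in for `S.particles`: it only provides the attribute
# read by `populations_slices`, as (diffusion coefficient, count) pairs.
FakeParticles = namedtuple('FakeParticles', ['diffusion_coeff_counts'])
fake_particles = FakeParticles(
    diffusion_coeff_counts=[(1.2e-11, 15), (1.2e-11, 5)])  # arbitrary values

def populations_slices_sketch(particles):
    """Same logic as `populations_slices`, written with explicit names."""
    (_, num_pop1), (_, num_pop2) = particles.diffusion_coeff_counts
    return slice(0, num_pop1), slice(num_pop1, num_pop1 + num_pop2)

pop1, pop2 = populations_slices_sketch(fake_particles)
print(pop1, pop2)
```

The two slices are contiguous and non-overlapping, so each particle index belongs to exactly one population.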

In [None]:
rs = np.random.RandomState(123)

s = """
Timestamp simulation: Mixture
-----------------------------

Population1:
    Peak emission rate: %d kcps
    FRET efficiency:    %.2f
    
Population2:
    Peak emission rate: %d kcps
    FRET efficiency:    %.2f    
""" % (em_rate_tot1*1e-3, E1, em_rate_tot2*1e-3, E2)
print(s)

S.simulate_timestamps_mix(
    max_rates=(em_rate_d1, em_rate_d2),
    populations = populations_slices(S.particles),
    bg_rate=bg_rate_d, 
    rs=rs, 
    overwrite=True)

S.simulate_timestamps_mix(
    max_rates=(em_rate_a1, em_rate_a2),
    populations = populations_slices(S.particles),
    bg_rate=bg_rate_a, 
    rs=rs, 
    overwrite=True)

In [None]:
S.timestamp_names

## Compose timestamps for FRET

Now, we need to create a single array with donor + acceptor timestamps:

In [None]:
ts_d, ts_par_d = S.get_timestamps(hash_='87f1')
ts_a, ts_par_a = S.get_timestamps(hash_='9261')

In [None]:
ts, a_ch, part = pbm.timestamps.merge_da(ts_d, ts_par_d, ts_a, ts_par_a)
ts.shape, a_ch.shape, part
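Conceptually, the merge concatenates the two timestamp arrays, sorts them in time, and keeps a boolean mask that flags acceptor photons. The sketch below shows that idea with plain NumPy on toy data. It is not the implementation of `pbm.timestamps.merge_da`, which also merges the particle arrays; the function name `merge_da_sketch` is hypothetical.

```python
import numpy as np

def merge_da_sketch(ts_d, ts_a):
    """Merge donor and acceptor timestamps into one time-ordered array.

    Returns the merged timestamps and a boolean mask that is True for
    acceptor photons.
    """
    ts = np.hstack([ts_d, ts_a])
    a_ch = np.hstack([np.zeros(ts_d.size, dtype=bool),
                      np.ones(ts_a.size, dtype=bool)])
    # A stable sort keeps simultaneous photons in donor, acceptor order
    order = ts.argsort(kind='mergesort')
    return ts[order], a_ch[order]

ts_d_demo = np.array([10, 40, 70])
ts_a_demo = np.array([20, 40, 90, 95])
ts_demo, a_ch_demo = merge_da_sketch(ts_d_demo, ts_a_demo)
print(ts_demo)
print(a_ch_demo)
```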

Perform some safety checks and plot:

In [None]:
assert a_ch.sum() == ts_a.shape[0]
assert (~a_ch).sum() == ts_d.shape[0]  # `~` inverts a boolean mask; unary minus is not defined for NumPy boolean arrays
assert a_ch.size == ts_a.shape[0] + ts_d.shape[0]

In [None]:
plt.plot(ts)

In [None]:
bins = np.arange(0, 1, 1e-3)
plt.hist(ts*ts_d.attrs['clk_p'], bins=bins, histtype='step');

In [None]:
bins = np.arange(0, 1, 1e-3)
counts_d, _ = np.histogram(ts[~a_ch]*ts_d.attrs['clk_p'], bins=bins)
counts_a, _ = np.histogram(ts[a_ch]*ts_d.attrs['clk_p'], bins=bins)
plt.plot(bins[:-1], counts_d)
plt.plot(bins[:-1], -counts_a)
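The plots above rely on two conversions: timestamps are stored as integer clock ticks and become seconds when multiplied by the clock period `clk_p`, and the counts in each 1 ms bin become a rate when divided by the bin width. The sketch below reproduces this on synthetic data; the clock period of 50 ns and the uniformly distributed timestamps are assumptions made only for illustration.

```python
import numpy as np

clk_p_demo = 50e-9   # assumed clock period in seconds (illustrative value)
rng = np.random.RandomState(0)
# Synthetic timestamps in clock ticks, spread uniformly over about 1 s
ts_ticks = np.sort(rng.randint(0, int(1 / clk_p_demo), size=5000))

bin_width = 1e-3                       # 1 ms bins
bins_demo = np.linspace(0, 1, 1001)    # 1000 bins covering [0, 1] s
counts_demo, _ = np.histogram(ts_ticks * clk_p_demo, bins=bins_demo)
rates_cps = counts_demo / bin_width    # counts per second in each bin
print(counts_demo.sum(), rates_cps.mean())
```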

### Step 2: saving to Photon-HDF5 format

To save the data in [Photon-HDF5 format](http://photon-hdf5.org) we use 
the library [**phconvert**](http://photon-hdf5.github.io/phconvert/):

In [None]:
import phconvert as phc
print('Phconvert version: ', phc.__version__)

We need a file name that encodes the FRET simulation parameters:

In [None]:
fret_string = '_E1_%03d_D1Em%dk_A1Em%03dk_E2_%03d_D2Em%dk_A2Em%03dk_BgD%d_BgA%d' %\
        (E1*100, em_rate_d1*1e-3, em_rate_a1*1e-3,
         E2*100, em_rate_d2*1e-3, em_rate_a2*1e-3, 
         bg_rate_d, bg_rate_a)
fret_string

In [None]:
filename_smfret = S.store.filepath.stem.replace('pybromo', 'smFRET') + fret_string + '.hdf5'
filename_smfret

In [None]:
fret_sim_fname = Path(filename_smfret)
fret_sim_fname

In [None]:
(D1, num_pop1), (D2, num_pop2) = S.particles.diffusion_coeff_counts
(D1, num_pop1), (D2, num_pop2)

In [None]:
def make_photon_hdf5(ts, a_ch, clk_p, E1, E2):
    # globals: S.ts_store.filename, S.t_max
    photon_data = dict(
        timestamps = ts,
        timestamps_specs = dict(timestamps_unit=clk_p),
        detectors = a_ch,
        measurement_specs = dict(
            measurement_type = 'smFRET',
            detectors_specs = dict(spectral_ch1 = np.atleast_1d(False),
                                   spectral_ch2 = np.atleast_1d(True))))

    setup = dict(
        num_pixels = 2,
        num_spots = 1,
        num_spectral_ch = 2,
        num_polarization_ch = 1,
        num_split_ch = 1,
        modulated_excitation = False,
        lifetime = False)

    provenance = dict(filename=S.ts_store.filename, 
                      software='PyBroMo', software_version=pbm.__version__)

    identity = dict(
        author='Antonino Ingargiola',
        author_affiliation='UCLA')

    (D1, num_pop1), (D2, num_pop2) = S.particles.diffusion_coeff_counts
    description = ('Simulated freely-diffusing smFRET experiment 2-population mixture: '
                   'D1 = %.1e m^2/s, D2 = %.1e m^2/s, E1 = %.2f, E2 = %.2f, '  # E values are fractions, not percentages
                   '#Pop1 = %d, #Pop2 = %d') % (
                   D1, D2, E1, E2, num_pop1, num_pop2)
    acquisition_duration = S.t_max
    data = dict(
        acquisition_duration = round(acquisition_duration),
        description = description,
        photon_data = photon_data,
        setup=setup,
        provenance=provenance,
        identity=identity)
    return data

In [None]:
data = make_photon_hdf5(ts, a_ch, ts_d.attrs['clk_p'], E1, E2)

In [None]:
phc.hdf5.save_photon_hdf5(data, h5_fname=str(fret_sim_fname), overwrite=True)

In [None]:
h5file = tables.open_file(str(fret_sim_fname))

In [None]:
phc.hdf5.print_children(h5file.root.photon_data)

In [None]:
h5file.close()

# Burst analysis

As a final check, we analyze the created file with 
[FRETBursts](https://github.com/tritemio/FRETBursts/), 
an smFRET burst analysis program.

In [None]:
import fretbursts as fb

In [None]:
filepath = list(Path('./').glob('smFRET_016*E1*'))[0]

In [None]:
str(filepath)

In [None]:
d = fb.loader.photon_hdf5(str(filepath))

In [None]:
d

In [None]:
d.A_em

In [None]:
fb.dplot(d, fb.timetrace);

In [None]:
d.calc_bg(fun=fb.bg.exp_fit, tail_min_us='auto', F_bg=1.7)

In [None]:
d.bg_dd, d.bg_ad

In [None]:
d.burst_search(F=7)

In [None]:
d.num_bursts

In [None]:
ds = d.select_bursts(fb.select_bursts.size, th1=200)

In [None]:
ds.num_bursts

In [None]:
fb.dplot(ds, fb.hist_fret)
plt.axvline(0.2);

In [None]:
fb.dplot(ds, fb.timetrace, bursts=True);
plt.ylim(-100, 250);
plt.xlim(0., 1);

In [None]:
fb.bext.burst_data(ds)