# usALEX - Corrections - Direct excitation physical parameter

> *This notebook estimates the direct excitation coefficient $d_{dirT}$ from μs-ALEX data.*

# Definitions Memo

$$n_d = I_{D_{ex}} \, \sigma_{D_{ex}}^D \,
\phi_D \, \eta_{D_{det}}^{D_{em}} \, (1-E)$$

$$n_a = I_{D_{ex}} \, \sigma_{D_{ex}}^D \,
\phi_A \, \eta_{A_{det}}^{A_{em}} \, E$$

$$ n_{aa} = I_{A_{ex}} \, \sigma_{A_{ex}}^A \,
\phi_A \, \eta_{A_{det}}^{A_{em}}$$

$$n_a^* = n_a + Lk + Dir$$

where

$$Lk = I_{D_{ex}} \, \sigma_{D_{ex}}^D \,
\phi_D \, \eta_{A_{det}}^{D_{em}} \, (1-E)$$

$$Dir = I_{D_{ex}} \, \sigma_{D_{ex}}^A \,
\phi_A \, \eta_{A_{det}}^{A_{em}}$$

$$\gamma = \frac{\phi_A\,\eta_{A_{det}}^{A_{em}}}{\phi_D\,\eta_{D_{det}}^{D_{em}}}$$

# Aim of this notebook

## What is already computed?

We previously fitted the **leakage** and **gamma** coefficients from the RAW PR values
for the 5 dsDNA measurements. We also fitted the
direct excitation coefficient (`dir_ex_aa`), expressed as a function of the
A-signal during A-excitation (`naa`). In symbols, `dir_ex_aa` is defined as:

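$$ d_{dirAA} = \frac{n_{a}^*}{n_{aa}}$$

for an A-only population, where the acceptor signal during donor excitation
is entirely due to direct excitation ($n_a^* = Dir$).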

## What do we want to compute?

Alternatively, we can express the direct excitation contribution ($Dir$)
as a function of the total corrected burst size:

$$ Dir = d_{dirT}\, (n_a + \gamma n_d)$$

With this definition, expressing $d_{dirT}$ as a function 
of the physical parameters we obtain:

$$d_{dirT} = \frac{\sigma_{D_{ex}}^A}{\sigma_{D_{ex}}^D} $$

where $\sigma_{D_{ex}}^A$ and $\sigma_{D_{ex}}^D$ are the absorption cross-sections
of the acceptor and donor dyes at the wavelength of the donor laser.
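
As a quick numerical check of this identity, the sketch below plugs arbitrary illustrative values (all numbers are assumptions, not measured quantities) into the definitions from the memo and confirms that $Dir / (n_a + \gamma n_d)$ equals the cross-section ratio regardless of $E$. It assumes $\gamma = \phi_A\,\eta_{A_{det}}^{A_{em}} / (\phi_D\,\eta_{D_{det}}^{D_{em}})$.

```python
# Illustrative values only (assumptions, not measured quantities).
I_Dex = 1.0                   # donor-laser excitation intensity
sigma_Dex_D = 1.0             # donor cross-section at the donor-laser wavelength
sigma_Dex_A = 0.08            # acceptor cross-section at the donor-laser wavelength
phi_D, phi_A = 0.80, 0.65     # quantum yields
eta_DD, eta_AA = 0.30, 0.45   # detection efficiencies (D-em in D-ch, A-em in A-ch)

gamma = (phi_A * eta_AA) / (phi_D * eta_DD)


def dir_over_total(E):
    """Return Dir / (n_a + gamma * n_d) for a FRET efficiency E."""
    n_d = I_Dex * sigma_Dex_D * phi_D * eta_DD * (1 - E)
    n_a = I_Dex * sigma_Dex_D * phi_A * eta_AA * E
    Dir = I_Dex * sigma_Dex_A * phi_A * eta_AA
    return Dir / (n_a + gamma * n_d)


ratios = [dir_over_total(E) for E in (0.1, 0.5, 0.9)]
print(ratios)  # each entry equals sigma_Dex_A / sigma_Dex_D up to rounding
```

The ratio does not depend on $E$ because $n_a + \gamma n_d$ collapses to $I_{D_{ex}}\,\sigma_{D_{ex}}^D\,\phi_A\,\eta_{A_{det}}^{A_{em}}$.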

Finally, remembering the definition of $\beta$:

$$ \beta = \frac{I_{A_{ex}}\sigma_{A_{ex}}^A}{I_{D_{ex}}\sigma_{D_{ex}}^D}$$

we can express $d_{dirT}$ as the product of $\beta$ and $d_{dirAA}$:

$$ d_{dirT} = \beta \, d_{dirAA}$$
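
The intermediate step can be written out explicitly. For an A-only population, the acceptor signal during donor excitation is purely direct excitation, so from the definitions of $Dir$ and $n_{aa}$ in the memo:

$$ d_{dirAA} = \frac{Dir}{n_{aa}} =
\frac{I_{D_{ex}} \, \sigma_{D_{ex}}^A \, \phi_A \, \eta_{A_{det}}^{A_{em}}}
{I_{A_{ex}} \, \sigma_{A_{ex}}^A \, \phi_A \, \eta_{A_{det}}^{A_{em}}} =
\frac{I_{D_{ex}} \, \sigma_{D_{ex}}^A}{I_{A_{ex}} \, \sigma_{A_{ex}}^A}$$

Multiplying by $\beta$ cancels the laser intensities and the acceptor cross-section at the acceptor-laser wavelength:

$$ \beta \, d_{dirAA} =
\frac{I_{A_{ex}} \, \sigma_{A_{ex}}^A}{I_{D_{ex}} \, \sigma_{D_{ex}}^D} \,
\frac{I_{D_{ex}} \, \sigma_{D_{ex}}^A}{I_{A_{ex}} \, \sigma_{A_{ex}}^A} =
\frac{\sigma_{D_{ex}}^A}{\sigma_{D_{ex}}^D} = d_{dirT}$$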

Note that $d_{dirT}$ is a property of the donor-acceptor dye pair and of the
donor excitation wavelength. As such, unlike $d_{dirAA}$, the
$d_{dirT}$ coefficient is valid for the same sample in any setup using the same
donor excitation wavelength, such as the single-spot μs-ALEX and the multi-spot
system. Additionally, $d_{dirT}$ allows correcting for direct acceptor
excitation using only donor-excitation quantities. Therefore, the same
correction formula can be used in both two-laser (e.g. single-spot μs-ALEX)
and single-laser systems (e.g. 8-spot system).

## How do we compute it?

We use two different procedures, both yielding
an estimate of $d_{dirT}$. Apart from numerical
accuracy, the two procedures are equivalent.

### Procedure 1: Using beta

From the previous relation $d_{dirT} = \beta \,d_{dirAA}$ it is possible to
estimate $d_{dirT}$ directly from the values of $\beta$ and $d_{dirAA}$
we already fitted in previous notebooks.

### Procedure 2: Correction formula

It is possible to go from the raw $E_R$ (only background correction;
no leakage, direct excitation or gamma correction) to the fully-corrected $E$
using the formula:

$$ E = f(E_R,\, \gamma,\, L_k,\, d_{dirT}) = 
\frac{E_{R} \left(L_{k} + d_{dirT} \gamma + 1\right) - L_{k} - d_{dirT} \gamma}
{E_{R} \left(L_{k} - \gamma + 1\right) - L_{k} + \gamma}$$

* See [Derivation of FRET and S correction formulas](http://nbviewer.jupyter.org/github/tritemio/notebooks/blob/master/Derivation%20of%20FRET%20and%20S%20correction%20formulas.ipynb) for derivation.
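
A minimal pure-Python sketch of the correction formula follows, with a round-trip check on synthetic values. The names `correct_E` and `raw_proximity_ratio` and all numbers are illustrative assumptions; in this notebook the formula itself is provided by FRETBursts as `fretmath.correct_E_gamma_leak_dir`. The forward model assumes the leakage coefficient is the ratio $L_k = Lk / n_d$.

```python
def correct_E(E_raw, gamma=1.0, leakage=0.0, dir_ex_t=0.0):
    """Fully-corrected E from the raw proximity ratio E_raw (formula above)."""
    num = E_raw * (leakage + dir_ex_t * gamma + 1) - leakage - dir_ex_t * gamma
    den = E_raw * (leakage - gamma + 1) - leakage + gamma
    return num / den


def raw_proximity_ratio(E, gamma, leakage, dir_ex_t):
    """Forward model: E_R expected for a true E.

    Works in units where I_Dex * sigma_Dex^D * phi_D * eta_Ddet^Dem = 1.
    """
    n_d = 1 - E
    n_a = gamma * E
    Lk = leakage * n_d                     # donor leakage into the acceptor channel
    Dir = dir_ex_t * (n_a + gamma * n_d)   # direct acceptor excitation
    n_a_star = n_a + Lk + Dir
    return n_a_star / (n_a_star + n_d)


# Illustrative coefficients (assumed, not the fitted values of this notebook).
gamma, leakage, dir_ex_t = 0.9, 0.10, 0.05
E_true = 0.4
E_R = raw_proximity_ratio(E_true, gamma, leakage, dir_ex_t)
E_back = correct_E(E_R, gamma=gamma, leakage=leakage, dir_ex_t=dir_ex_t)
print(E_R, E_back)  # E_back recovers E_true up to rounding
```

With the default arguments (`gamma=1`, no leakage, no direct excitation) the function returns its input unchanged, which is a convenient sanity check.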

We can compute the corrected $E$ for the 5 dsDNA samples by fitting
the fully-corrected histograms (histograms with γ, leakage and 
direct excitation corrections). We can also fit the 5 $E_R$ values 
for the same samples from the proximity ratio histograms 
(only background correction).

Therefore, using the previous formula we can fit $d_{dirT}$ (`dir_ex_t`)
by minimizing the error between the 5 $E$ values fitted from
corrected histograms and the 5 $E$ values obtained correcting
the 5 $E_R$ values from the fit of the proximity ratio histograms.

# Loading data

We load the needed libraries and FRETBursts which includes the FRET
correction formulas ($E = f(E_R,\, \gamma,\, L_k,\, d_{dirT})$).

In [None]:
from __future__ import division
import os
import numpy as np
import pandas as pd
import lmfit
from fretbursts import *
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
%config InlineBackend.figure_format='retina'  # for hi-dpi displays
sns.set_style('whitegrid')

## Load Raw PR

In [None]:
#bsearch_ph_sel = 'AND-gate'
bsearch_ph_sel = 'Dex'

data_file = 'results/usALEX-5samples-PR-raw-%s.csv' % bsearch_ph_sel

These are the **RAW proximity ratios** for the 5 samples (only background correction, no leakage nor direct excitation):

In [None]:
data_raw = pd.read_csv(data_file).set_index('sample')
data_raw[['E_gauss_w', 'E_kde_w']]

## Load Corrected E

And these are the **FRET efficiencies** fitted from corrected histograms for the same 5 samples: 

In [None]:
data_file = 'results/usALEX-5samples-E-corrected-all-ph.csv'
data_corr = pd.read_csv(data_file).set_index('sample')
data_corr[['E_gauss_w', 'E_kde_w']]

## Load SNA Data

In [None]:
data_file_sna = 'results/usALEX-5samples-E-SNA.csv'

In [None]:
sna = pd.read_csv(data_file_sna, index_col=0)
sna

## Load μs-ALEX corrections

In [None]:
leakage_coeff_fname = 'results/usALEX - leakage coefficient DexDem.csv'
leakage = np.loadtxt(leakage_coeff_fname)

print('Leakage coefficient:', leakage)

In [None]:
dir_ex_coeff_fname = 'results/usALEX - direct excitation coefficient dir_ex_aa.csv'
dir_ex_aa = np.loadtxt(dir_ex_coeff_fname)

print('Dir. excitation AA:', dir_ex_aa)

In [None]:
dir_ex_t_datasheet_fname = 'results/Dyes - ATT0647N-ATTO550 abs X-section ratio at 532nm.csv'
dir_ex_t_datasheet = np.loadtxt(dir_ex_t_datasheet_fname)

print('Direct excitation (dir_ex_t) from datasheet:', dir_ex_t_datasheet)

In [None]:
gamma_coeff_fname = 'results/usALEX - gamma factor - all-ph.csv'
gamma = np.loadtxt(gamma_coeff_fname)

print('Gamma factor:', gamma)

In [None]:
beta_coeff_fname = 'results/usALEX - beta factor - all-ph.csv'
beta = np.loadtxt(beta_coeff_fname)

print('Beta factor:', beta)

# Procedure 1

Compute $d_{dirT}$ using $\beta$ and $d_{dirAA}$:

In [None]:
dir_ex_t_beta = dir_ex_aa * beta
'%.5f' % dir_ex_t_beta

In [None]:
with open('results/usALEX - direct excitation coefficient dir_ex_t beta.csv', 'w') as f:
    f.write('%.5f' % dir_ex_t_beta)

With this coefficient, computing the corrected $E$ for the 5 dsDNA samples
we obtain:

In [None]:
PR_corr_kde = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w, 
                                                leakage=leakage, 
                                                dir_ex_t=dir_ex_t_beta,
                                                gamma=gamma)*100
PR_corr_kde

In [None]:
PR_corr_gauss = fretmath.correct_E_gamma_leak_dir(data_raw.E_gauss_w, 
                                                  leakage=leakage, 
                                                  dir_ex_t=dir_ex_t_beta,
                                                  gamma=gamma)*100
PR_corr_gauss

# Procedure 2

## Datasheet-based direct excitation

The coefficient $d_{dirT}$ can be estimated from datasheet values of
$\sigma_{D_{ex}}^A$ and $\sigma_{D_{ex}}^D$.

Using the [datasheet values](dyes/Absorption ATTO550-ATTO647N.ipynb)
provided by ATTO-TEC (in PBS buffer) we obtain a $d_{dirT}$ estimate
close to 10%:

In [None]:
dir_ex_t_datasheet

With this coefficient, the corrected $E$ values for the 5 dsDNA samples are:

In [None]:
E_datasheet = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w, 
                                                leakage=leakage, 
                                                dir_ex_t=dir_ex_t_datasheet,
                                                gamma=gamma)*100
E_datasheet

Comparing these values with the ones obtained fitting the
corrected E histograms we observe a significant discrepancy:

In [None]:
out = data_corr[['E_kde_w']].copy()*100
out.columns = ['E_alex']
out['E_datasheet'] = E_datasheet
out

In [None]:
out.plot(alpha=0.4, lw=3, style=dict(E_alex='-ob', E_datasheet='-sr'));

> **NOTE:** The corrected FRET efficiencies using the datasheet and 
> μs-ALEX-based direct excitation do not match well.

## Fitting direct excitation $d_{dirT}$

In [None]:
def residuals_absolute(params, E_raw, E_ref):
    dir_ex_t = params['dir_ex_t'].value
    return E_ref - fretmath.correct_E_gamma_leak_dir(E_raw, 
                                                     leakage=leakage, 
                                                     gamma=gamma, 
                                                     dir_ex_t=dir_ex_t)

In [None]:
def residuals_relative(params, E_raw, E_ref):
    dir_ex_t = params['dir_ex_t'].value
    return (E_ref - fretmath.correct_E_gamma_leak_dir(E_raw, 
                                                      leakage=leakage, 
                                                      gamma=gamma, 
                                                      dir_ex_t=dir_ex_t))/E_ref

In [None]:
params = lmfit.Parameters()
params.add('dir_ex_t', value=0.05) 

In [None]:
m = lmfit.minimize(residuals_absolute, params, args=(data_raw.E_kde_w, data_corr.E_kde_w))
lmfit.report_fit(m.params, show_correl=False)

In [None]:
m = lmfit.minimize(residuals_relative, params, args=(data_raw.E_kde_w, data_corr.E_kde_w))
lmfit.report_fit(m.params, show_correl=False)

> **NOTE:** The fitted `dir_ex_t` is 4.5%, as opposed to the 10.6% expected from the [absorption spectra of ATTO550 and ATTO647N](dyes/Absorption ATTO550-ATTO647N.ipynb) at 532nm.

In [None]:
'%.5f' % m.params['dir_ex_t'].value

In [None]:
with open('results/usALEX - direct excitation coefficient dir_ex_t fit.csv', 'w') as f:
    f.write('%.5f' % m.params['dir_ex_t'].value)

# Corrected E

In [None]:
data_raw

In [None]:
PR_corr_kde_dfit = fretmath.correct_E_gamma_leak_dir(data_raw.E_kde_w, 
                                                     leakage=leakage, 
                                                     dir_ex_t=m.params['dir_ex_t'].value,
                                                     gamma=gamma)*100
PR_corr_kde_dfit.name = 'PR_corr_kde_dfit'
PR_corr_kde_dfit

In [None]:
E = pd.concat([data_corr[['E_kde_w', 'E_gauss_w']]*100, PR_corr_kde, PR_corr_gauss, sna*100], axis=1)
E.columns = ['E KDE', 'E Gauss', 'PR KDE', 'PR Gauss', 'SNA Epr mean', 'SNA Epr max']
E

In [None]:
E.plot.bar(table=np.round(E, 2).T)
plt.ylabel('FRET (%)')
plt.gca().xaxis.set_visible(False)
#plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.);

In [None]:
E[['PR KDE', 'PR Gauss', 'E KDE']].plot(kind='bar')
E[['PR KDE', 'PR Gauss', 'E KDE']].plot(lw=3);
# Labels match the column names of `E` used in each difference
print('Max error E KDE vs PR KDE:    %.2f' % (E['E KDE'] - E['PR KDE']).abs().max())
print('Max error E KDE vs PR Gauss:  %.2f' % (E['E KDE'] - E['PR Gauss']).abs().max())
print('Max error PR Gauss vs PR KDE: %.2f' % (E['PR Gauss'] - E['PR KDE']).abs().max())

In [None]:
x = [int(idx[:-1]) for idx in out.index]
plt.plot(x, 'E KDE', data=E)
plt.plot(x, 'PR KDE', data=E)
plt.plot(x, 'PR Gauss', data=E)
plt.xlabel('Distance (base pairs)')
plt.ylabel('FRET');

In [None]:
E['E KDE'] - E['PR KDE']

> **NOTE:** Fitting $d_{dirT}$ to match $E$ from corrected histograms with $E$ from the PR correction formula produces a max difference of 1% for the 12d sample. This difference is well below the fitting accuracy (> 2%).

# Save

In [None]:
E.to_csv('results/usALEX-5samples-E-all-methods.csv', float_format='%.3f')

In [None]:
E.round(3)