# Sample Jupyter Notebook for Using `eat.factor()`

Load the required Python modules.

The `eat` module is under active development. Install it in development mode with `cd eat && pip install -e .` (note the `-e` flag). Together with the `autoreload` Jupyter extension, this gives a smooth development workflow: edits to the module take effect without reinstalling it.

In [33]:
%load_ext autoreload
%autoreload 2

import pandas as pd
import numpy  as np
import eat

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


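Randomly generate a dictionary of site-based rates/delays. The values are drawn uniformly from [-1, 1), so they are zero-mean only in expectation; the sample mean printed below is generally nonzero.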

In [40]:
sites = "abcde"

r  = 2 * np.random.rand(len(sites)) - 1
sb = {s:r[i] for i, s in enumerate(sites)}
for s in sites:
    print(s, sb[s])
print("mean =", np.mean(r))

a 0.282787052382
b -0.921133999129
c -0.730399741837
d -0.462579422502
e 0.340920589588
mean = -0.2980811043
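
If exactly zero-mean site values are wanted, the sample mean can be subtracted after drawing. The following is a minimal sketch, not part of the original notebook; the seeded generator and the names `r0` and `sb0` are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)       # seeded generator (assumption, for reproducibility)
sites = "abcde"

r = 2 * rng.random(len(sites)) - 1   # uniform on [-1, 1): zero mean only in expectation
r0 = r - r.mean()                    # subtract the sample mean: exactly zero-mean
sb0 = {s: r0[i] for i, s in enumerate(sites)}  # hypothetical zero-mean counterpart of `sb`

print("sample mean before:", r.mean())
print("sample mean after: ", r0.mean())
```

With values prepared this way, a zero-mean solution would be expected to match the input directly rather than up to an offset.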


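Generate baseline-based rates/delays using `sb`. Each baseline value is the difference `sb[ref] - sb[rem]`, and each pair of sites appears once (`rem > ref`).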

In [41]:
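# One record per baseline: reference site, remote site, and the site-value difference.
# Note: 'f16' requests a 16-byte float, which is not available on every platform.
bb = np.array([(ref, rem, sb[ref] - sb[rem]) for ref in sites for rem in sites if rem > ref],
              dtype=[('ref', 'U3'), ('rem', 'U2'), ('val', 'f16')])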
print(bb)

[('a', 'b',  1.2039211) ('a', 'c',  1.0131868) ('a', 'd',  0.74536647)
 ('a', 'e', -0.058133537) ('b', 'c', -0.19073426) ('b', 'd', -0.45855458)
 ('b', 'e', -1.2620546) ('c', 'd', -0.26782032) ('c', 'e', -1.0713203)
 ('d', 'e', -0.80350001)]


Use `eat.factor()` to factor out site-based delays/rates from baseline-based delays/rates

In [49]:
sol = eat.factor(bb, regularizer='mean', weight=10.0)
for s in sites:
    print(s, sol[s])

a 0.580868156682
b -0.623052894829
c -0.432318637537
d -0.164498318203
e 0.639001693887
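
To make the idea of factoring concrete, here is a minimal least-squares sketch written with plain NumPy. It is not necessarily how `eat.factor()` is implemented: the helper name `factor_lstsq`, the interpretation of `weight` as the weight of a zero-sum constraint row, the `'f8'` value type, and the illustrative site values are all assumptions.

```python
import numpy as np

def factor_lstsq(bb, weight=10.0):
    """Hypothetical helper: least-squares site-based values from baseline-based ones."""
    sites = sorted(set(bb['ref']) | set(bb['rem']))
    index = {s: i for i, s in enumerate(sites)}

    # One row per baseline (+1 for the reference site, -1 for the remote site),
    # plus one extra row for the zero-sum constraint.
    A = np.zeros((len(bb) + 1, len(sites)))
    b = np.zeros(len(bb) + 1)
    for k, (ref, rem, val) in enumerate(zip(bb['ref'], bb['rem'], bb['val'])):
        A[k, index[ref]] = 1.0
        A[k, index[rem]] = -1.0
        b[k] = val

    # Differences alone leave a common offset undetermined; this weighted row
    # asks the site values to sum to zero, which fixes that offset.
    A[-1, :] = weight

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dict(zip(sites, x))

# Illustrative site values (assumption), with mean -0.3
sites = "abcde"
sb = {'a': 0.3, 'b': -0.9, 'c': -0.7, 'd': -0.5, 'e': 0.3}
bb = np.array([(ref, rem, sb[ref] - sb[rem]) for ref in sites for rem in sites if rem > ref],
              dtype=[('ref', 'U2'), ('rem', 'U2'), ('val', 'f8')])

sol = factor_lstsq(bb)
for s in sites:
    print(s, sol[s], sol[s] - sb[s])
```

In this noise-free case every site comes back shifted by the same amount, the negative of the mean of the input values, and any positive `weight` gives the same answer.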


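The solution generally differs from the original rates/delays by a constant: baseline-based values depend only on differences between sites, so adding the same offset to every site leaves them unchanged. Here the offset is 0.2980811043, the negative of the mean of `sb` printed earlier, so the recovered values sum to zero, consistent with the `regularizer='mean'` option.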

In [50]:
for s in sites:
    print(s, sol[s]-sb[s])

a 0.2980811043
b 0.2980811043
c 0.2980811043
d 0.2980811043
e 0.2980811043
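
The reason a constant offset cannot be recovered can be checked directly: shifting every site value by the same amount leaves all baseline differences unchanged. The sketch below uses illustrative values and a hypothetical helper `baselines`; neither comes from the `eat` module.

```python
import numpy as np

sites = "abcde"
sb = {'a': 0.3, 'b': -0.9, 'c': -0.7, 'd': -0.5, 'e': 0.3}  # illustrative values (assumption)
offset = 0.3
shifted = {s: v + offset for s, v in sb.items()}             # same offset added to every site

def baselines(d):
    """Hypothetical helper: baseline differences d[ref] - d[rem] for each pair with rem > ref."""
    return np.array([d[ref] - d[rem] for ref in sites for rem in sites if rem > ref])

# The two sets of site values produce the same baseline-based values.
print(np.allclose(baselines(sb), baselines(shifted)))  # prints True
```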
