# Handling different types of data

This example shows which types of experimental data BME can handle, and how each type is treated.
The following data types are currently supported: NOE, chemical shifts (CS), scalar couplings (JCOUPLINGS), SAXS and RDC. In practice, this means that the keyword DATA in the experimental data file must be one of the following: "NOE", "JCOUPLINGS", "CS", "SAXS", "RDC".


## Chemical shifts, 3J couplings and other *plain averaged* data 

Data such as chemical shifts are calculated as simple weighted averages over the frames $x_j$, each with weight $w_j$, i.e.

$<F_{calc}> = \sum w_j F_{calc}(x_j)$

In this case, BME will try to find the weights such that $<F_{calc}> \approx F_{exp}$
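
As a minimal sketch of the plain average above (this is not BME code; the toy array `f_calc`, the helper `weighted_average` and the weight values are all hypothetical), the weighted average is a dot product between the weights and the per-frame calculated values:

```python
import numpy as np

# Hypothetical toy data: 4 frames (rows) x 3 observables (columns).
f_calc = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 1.0],
    [3.0, 4.0, 2.0],
    [2.0, 0.0, 2.0],
])

def weighted_average(f_calc, weights):
    """Return sum_j w_j F_calc(x_j) for every observable."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # make sure the weights are normalized
    return w @ f_calc

# Uniform weights (the unrefined ensemble) versus a non-uniform set.
avg_uniform = weighted_average(f_calc, np.full(4, 0.25))
avg_reweighted = weighted_average(f_calc, [0.4, 0.3, 0.2, 0.1])

print("uniform   :", avg_uniform)
print("reweighted:", avg_reweighted)
```

Changing the weights shifts the averages, which is the degree of freedom that the reweighting exploits.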

## RDC: rescaling the dataset

For RDC the situation is slightly more complex, since

$<F_{calc}> \approx \alpha F_{exp}$,  where $\alpha$ is a scaling parameter calculated by performing a linear regression (with intercept=0). The linear regression is weighted using the inverc
When using RDC, it is essential to specify this when loading the data file, by passing fit="scale" to the load function.


In [16]:
import sys,os
import numpy as np
bme_dir = os.getcwd().split("notebook")[0]
sys.path.append(bme_dir)
import BME as BME

# define input file names
exp_file = "%s/data/RDC_TL.exp.dat" % bme_dir
calc_file = "%s/data/RDC_TL.calc.dat" % bme_dir

rew = BME.Reweight("example_03_scale")
# load the experimental and calculated datasets; note the fit="scale" argument
rew.load(exp_file,calc_file,fit="scale")
results = rew.fit(theta=100)

print("CHI2  original: %6.2f" % results[0])
print("CHI2 optimized: %6.2f" % results[1])

CHI2  original:  15.60
CHI2 optimized:   8.05
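
To illustrate the kind of zero-intercept regression described above, here is a minimal sketch. It is not BME's implementation: the helper `scale_factor` and the toy numbers are hypothetical, and the choice of $1/\sigma^2$ as regression weights is an assumption made only for this example. Minimizing $\sum_i w_i (\langle F_{calc}\rangle_i - \alpha F_{exp,i})^2$ gives a closed-form $\alpha$:

```python
import numpy as np

def scale_factor(f_exp, f_calc_avg, sigma):
    """Closed-form slope of a weighted linear fit with intercept fixed to 0.

    Fits f_calc_avg ~ alpha * f_exp, following the convention in the text.
    ASSUMPTION: the regression weights are 1/sigma**2.
    """
    f_exp = np.asarray(f_exp, dtype=float)
    f_calc_avg = np.asarray(f_calc_avg, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    return np.sum(w * f_exp * f_calc_avg) / np.sum(w * f_exp ** 2)

# Hypothetical toy data: the calculated averages are exactly 2.5x the experiment.
f_exp = np.array([1.0, -2.0, 3.0, 0.5])
sigma = np.array([0.1, 0.2, 0.1, 0.3])
f_calc_avg = 2.5 * f_exp

alpha = scale_factor(f_exp, f_calc_avg, sigma)
print("alpha = %.3f" % alpha)
```

Because the toy data are perfectly proportional, the recovered slope equals the factor used to build them, regardless of the weights.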


## SAXS: rescaled and shifted dataset

For SAXS data we need to scale and shift the dataset. This means that 
$<F_{calc}> \approx \alpha F_{exp} + \beta$,  where $\alpha$ is a scaling parameter and $\beta$ is an offset.
These parameters are calculated by performing a linear regression, which is requested by passing fit="scale+offset" to the load function.

In [17]:
exp_file = "%s/data/saxs310k_bme.txt" % bme_dir
calc_file = "%s/data/calc_saxs.txt" % bme_dir


# initialize. A name must be specified 
rew = BME.Reweight("example_03_scale_offset")

# load the experimental and calculated datasets
rew.load(exp_file,calc_file,fit="scale+offset")

results = rew.fit(theta=100)

print("CHI2  original: %6.2f" % results[0])
print("CHI2 optimized: %6.2f" % results[1])

CHI2  original:   5.67
CHI2 optimized:   2.45
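
The scale-and-offset fit can be sketched in the same spirit. This is an illustrative, unweighted example using `numpy.polyfit`; it is not BME's internal code, and the toy intensities below are hypothetical:

```python
import numpy as np

# Hypothetical toy data: experimental intensities at five scattering angles.
f_exp = np.array([10.0, 8.0, 5.0, 3.0, 1.0])
# Calculated averages built as 0.8 * f_exp + 0.05, so the fit is exact.
f_calc_avg = 0.8 * f_exp + 0.05

# Degree-1 polynomial fit of f_calc_avg against f_exp.
# polyfit returns coefficients from highest power to lowest: [slope, intercept].
alpha, beta = np.polyfit(f_exp, f_calc_avg, 1)

print("alpha = %.3f, beta = %.3f" % (alpha, beta))
```

Here `alpha` plays the role of the scaling parameter and `beta` the offset, in the convention $<F_{calc}> \approx \alpha F_{exp} + \beta$ used above.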


## NOE: non-linearly averaged data

The back-calculation of NOE involves averaging $r^{-p}$ distances, where $r$ is the distance between a proton pair and the exponent $p$ depends on the timescale of the internal motion. Internally, BME reads all distances $r_j$ from the calculated data file and minimizes the difference between $\sum_j w_j r_j^{-p}$ and $r_{EXP}^{-p}$. By default, BME performs this transformation automatically (with $p=6$) whenever NOE data are loaded. This behavior can be changed by passing the argument "averaging" to the load function:


In [18]:
exp_noe_file = "../data/NOE_exp.dat"
calc_noe_file = "../data/NOE_calc.dat"

rew = BME.Reweight("example_03_noe")


rew.load(exp_noe_file,calc_noe_file)

results = rew.fit(theta=100)
stats_noe = rew.predict(exp_noe_file,calc_noe_file,"example_03_noe")
print("CHI2  original: %6.2f" % results[0])
print("CHI2 optimized: %6.2f" % results[1])

CHI2  original:   1.15
CHI2 optimized:   0.77


Allowed values for the argument "averaging" are "power_6", "power_4", "power_3" or "linear".

In [23]:
rew = BME.Reweight("example_03_noe_4")

# use p=4 instead of the default p=6
rew.load(exp_noe_file,calc_noe_file,averaging="power_4")

results = rew.fit(theta=100)
stats_noe = rew.predict(exp_noe_file,calc_noe_file,"example_03_noe4")
print("CHI2  original: %6.2f" % results[0])
print("CHI2 optimized: %6.2f" % results[1])



CHI2  original:   2.98
CHI2 optimized:   1.69
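
The $r^{-p}$ averaging described in this section can be illustrated with a short sketch. The helper `noe_average` and the toy distances are hypothetical and only mirror the formula $\sum_j w_j r_j^{-p}$; they are not taken from BME:

```python
import numpy as np

def noe_average(distances, weights, p=6):
    """Weighted average of r**(-p) over frames, for each proton pair."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize the weights
    return w @ np.asarray(distances, dtype=float) ** (-p)

# Hypothetical toy data: 2 frames (rows) x 2 proton pairs (columns).
r = np.array([
    [2.0, 4.0],
    [4.0, 4.0],
])
w = [0.5, 0.5]

avg6 = noe_average(r, w, p=6)
# Effective distance: transform the average back to distance units.
r_eff = avg6 ** (-1.0 / 6.0)

print("<r^-6>      :", avg6)
print("r_eff (p=6) :", r_eff)
print("linear mean :", np.average(r, axis=0, weights=w))
```

For the first pair the effective distance is shorter than the linear mean of 3.0, because short distances dominate the $r^{-6}$ average. For the second pair, where both frames agree, the two coincide.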


## Inequality restraints: upper and lower bounds

Sometimes experimental data come in the form of upper or lower bounds (e.g. NOE upper bounds or unobserved NOE). Such information can be specified in BME by adding the keyword BOUND=UPPER or BOUND=LOWER in the header of the experimental data file.
For example, when BOUND=LOWER, BME will try to move all the calculated averages above the value specified in the experimental data file. 
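
The general idea of an inequality restraint can be sketched as a one-sided penalty: only averages that violate the bound contribute to the discrepancy. The function `bounded_chi2` below is a hypothetical illustration of that idea, not BME's internal implementation, and the toy values are made up:

```python
import numpy as np

def bounded_chi2(avg_calc, bound, sigma, kind):
    """Mean squared violation of a bound, in units of sigma.

    kind="UPPER": penalize averages above the bound.
    kind="LOWER": penalize averages below the bound.
    """
    avg_calc = np.asarray(avg_calc, dtype=float)
    bound = np.asarray(bound, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    diff = avg_calc - bound
    if kind == "UPPER":
        violation = np.maximum(diff, 0.0)  # keep only positive excess
    elif kind == "LOWER":
        violation = np.minimum(diff, 0.0)  # keep only negative shortfall
    else:
        raise ValueError("kind must be 'UPPER' or 'LOWER'")
    return np.mean((violation / sigma) ** 2)

# Hypothetical toy data: the first average is below the bound, the second above.
avg_calc = [1.0, 3.0]
bound = [2.0, 2.0]
sigma = [0.5, 1.0]

chi2_upper = bounded_chi2(avg_calc, bound, sigma, "UPPER")
chi2_lower = bounded_chi2(avg_calc, bound, sigma, "LOWER")
print("upper-bound penalty: %.2f" % chi2_upper)
print("lower-bound penalty: %.2f" % chi2_lower)
```

With an upper bound only the second data point is penalized, and with a lower bound only the first, so satisfied restraints exert no pull on the weights.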

