This tutorial shows how to use the functions in the `Preparation` class to prepare input files for BICePs sampling. You are welcome to use it as a template and modify it to prepare your own scripts.
The example used here is the data set from this [work](https://onlinelibrary.wiley.com/doi/full/10.1002/jcc.23738).

--Yunhui Ge 02/2019

In [1]:
# import source code
import sys, os, glob
sys.path.append('biceps')
from Preparation import *
from toolbox import *

In this tutorial, we have two experimental observables: (1) [J couplings](https://en.wikipedia.org/wiki/J-coupling) for small molecules and (2) [NMR nuclear Overhauser effect (NOE)](https://en.wikipedia.org/wiki/Nuclear_Overhauser_effect) (pairwise distances).
First, we need to perform conformational clustering on our MD simulation data. In this case, the conformations were clustered into 100 metastable states. Next, we need to prepare the prior knowledge learned from the simulations. Normally, we use the relative free energy of each metastable state. In the [original work](https://onlinelibrary.wiley.com/doi/full/10.1002/jcc.23738), Zhou et al. performed quantum mechanical (QM) calculations to refine each state, and the B3LYP energies were used as priors in the BICePs calculation. Instructions for QM calculations are beyond the scope of this tutorial. Alternatively, we can estimate the energy using U = -ln(P), where P is the normalized population of each state. There are also built-in functions in `toolbox.py` to perform this conversion. You can find tutorials on the functions in `toolbox.py` [here](https://github.com/vvoelz/biceps/blob/master/BICePs_2.0/tutorials/Tools/toolbox.ipynb).
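To make the U = -ln(P) conversion concrete, here is a minimal NumPy sketch. It is not the implementation in `toolbox.py`; the function name `populations_to_energies`, the example counts, and the optional shift of the lowest energy to zero are assumptions made for illustration.

```python
import numpy as np

def populations_to_energies(counts):
    """Convert per-state counts (or populations) to reduced energies U = -ln(P)."""
    counts = np.asarray(counts, dtype=float)
    if np.any(counts <= 0):
        raise ValueError("every state needs a positive population")
    populations = counts / counts.sum()   # normalize so that sum(P) = 1
    energies = -np.log(populations)       # U = -ln(P), in units of kT
    # Optional (assumption): shift so the most populated state has U = 0
    return energies - energies.min()

# Hypothetical frame counts for four clustered states
energies = populations_to_energies([500, 250, 200, 50])
print(energies)
```

Only energy differences between states matter for a relative free energy, so the shift does not change the information carried by the prior. A state with zero population would give an infinite energy, which is why the sketch rejects it explicitly.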


Next, we need to compute pairwise distances and J coupling constants for each clustered state.
To compute pairwise distances, we recommend [MDTraj](http://mdtraj.org), which is free to download.
To compare simulated conformational ensembles to experimental NOE measurements, we normally compute <r^-6>^(-1/6). For convenience in this tutorial, we assume that the cluster center of each state is representative enough and simply compute pairwise distances for the cluster-center conformation. In practice, we still recommend computing ensemble-averaged distances.
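For readers who want the ensemble average rather than the cluster-center shortcut, the following sketch shows the <r^-6>^(-1/6) average over the frames of one state. The function name `noe_average` and the example numbers are hypothetical. The input is assumed to be an array shaped (n_frames, n_pairs), which is the shape `md.compute_distances` returns for a multi-frame trajectory (in nm, so multiply by 10 for Å).

```python
import numpy as np

def noe_average(distances):
    """Return <r^-6>^(-1/6) over frames for an array shaped (n_frames, n_pairs)."""
    distances = np.asarray(distances, dtype=float)
    # Average r^-6 over the frame axis, then take the -1/6 power
    return np.mean(distances ** -6, axis=0) ** (-1.0 / 6.0)

# Hypothetical distances in Å: 3 frames, 2 atom pairs
frames = np.array([[2.0, 4.0],
                   [2.0, 5.0],
                   [4.0, 6.0]])
r_eff = noe_average(frames)
print(r_eff)
```

Because short distances dominate the r^-6 average, the result is always less than or equal to the plain arithmetic mean of the distances.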

In [2]:
import mdtraj as md
import numpy as np

# atom indices of pairwise distances
ind=np.loadtxt('atom_indice_noe.txt')
print "indices", ind

# make a new folder of computed distances for later 
os.system('mkdir NOE')

# compute pairwise distances using MDTraj
for i in range(100):    # 100 clustered states
    print 'state', i
    t = md.load('cineromycinB_pdbs/%d.fixed.pdb'%i)
    d=md.compute_distances(t,ind)*10.     # convert nm to Å 
    np.savetxt('NOE/%d.txt'%i,d)
print "Done!"

indices [[ 23.  25.]
 [ 23.  27.]
 [ 25.  27.]
 [ 24.  26.]
 [ 25.  26.]
 [ 26.  27.]
 [ 27.  32.]
 [ 27.  30.]
 [ 27.  31.]
 [ 28.  38.]
 [ 28.  39.]
 [ 28.  40.]
 [ 29.  38.]
 [ 29.  39.]
 [ 29.  40.]
 [ 25.  41.]
 [ 25.  42.]
 [ 25.  43.]
 [ 26.  44.]
 [ 26.  45.]
 [ 26.  46.]
 [ 33.  38.]
 [ 33.  39.]
 [ 33.  40.]
 [ 35.  38.]
 [ 35.  39.]
 [ 35.  40.]
 [ 36.  38.]
 [ 36.  39.]
 [ 36.  40.]
 [ 37.  38.]
 [ 37.  39.]
 [ 37.  40.]]
state 0
state 1
state 2
state 3
state 4
state 5
state 6
state 7
state 8
state 9
state 10
state 11
state 12
state 13
state 14
state 15
state 16
state 17
state 18
state 19
state 20
state 21
state 22
state 23
state 24
state 25
state 26
state 27
state 28
state 29
state 30
state 31
state 32
state 33
state 34
state 35
state 36
state 37
state 38
state 39
state 40
state 41
state 42
state 43
state 44
state 45
state 46
state 47
state 48
state 49
state 50
state 51
state 52
state 53
state 54
state 55
state 56
state 57
state 58
state 59
state 60
state 61
state 62
state

Next, we need to convert the computed distances to a BICePs-readable format.

In [5]:
#########################################
# Let's create input files for BICePs
############ Preparation ################
# Specify necessary argument values

# REQUIRED: raw data of pre-computed pairwise distances
path = 'NOE/*txt'

# REQUIRED: number of states
states = 100

# REQUIRED: atom indices of pairwise distances
indices = 'atom_indice_noe.txt'

# REQUIRED: experimental data
exp_data = 'noe_distance.txt'

# REQUIRED: topology file (only topology information is read from it, so it doesn't matter which state is used)
top = 'cineromycinB_pdbs/0.fixed.pdb'

# OPTIONAL: output directory of generated files
out_dir = 'noe_J'

p = Preparation('noe',states=states,indices=indices,exp_data=exp_data,top=top,data_dir=path)   # 'noe' scheme is selected
p.write(out_dir=out_dir)
# This converts the pairwise distance file for each state to a BICePs-readable format and saves the new files in the "noe_J" folder.

Wrote noe_J/0.noe
Wrote noe_J/1.noe
Wrote noe_J/2.noe
Wrote noe_J/3.noe
Wrote noe_J/4.noe
Wrote noe_J/5.noe
Wrote noe_J/6.noe
Wrote noe_J/7.noe
Wrote noe_J/8.noe
Wrote noe_J/9.noe
Wrote noe_J/10.noe
Wrote noe_J/11.noe
Wrote noe_J/12.noe
Wrote noe_J/13.noe
Wrote noe_J/14.noe
Wrote noe_J/15.noe
Wrote noe_J/16.noe
Wrote noe_J/17.noe
Wrote noe_J/18.noe
Wrote noe_J/19.noe
Wrote noe_J/20.noe
Wrote noe_J/21.noe
Wrote noe_J/22.noe
Wrote noe_J/23.noe
Wrote noe_J/24.noe
Wrote noe_J/25.noe
Wrote noe_J/26.noe
Wrote noe_J/27.noe
Wrote noe_J/28.noe
Wrote noe_J/29.noe
Wrote noe_J/30.noe
Wrote noe_J/31.noe
Wrote noe_J/32.noe
Wrote noe_J/33.noe
Wrote noe_J/34.noe
Wrote noe_J/35.noe
Wrote noe_J/36.noe
Wrote noe_J/37.noe
Wrote noe_J/38.noe
Wrote noe_J/39.noe
Wrote noe_J/40.noe
Wrote noe_J/41.noe
Wrote noe_J/42.noe
Wrote noe_J/43.noe
Wrote noe_J/44.noe
Wrote noe_J/45.noe
Wrote noe_J/46.noe
Wrote noe_J/47.noe
Wrote noe_J/48.noe
Wrote noe_J/49.noe
Wrote noe_J/50.noe
Wrote noe_J/51.noe
Wrote noe_J/52.noe
Wro

Now, let's take a look at what's inside the newly generated files.

In [3]:
fin = open('noe_J/0.noe','r')
text = fin.read()
fin.close()
print text

#restraint_index atom_index1 res1 atom_name1 atom_index2 res2 atom_name2 exp_noe(A) model_noe(A)
1            23       UNK1     H3           25       UNK1     H5             2.5000      2.8649
2            23       UNK1     H3           27       UNK1     H7             2.5000      4.4001
3            25       UNK1     H5           27       UNK1     H7             2.5000      4.0458
4            24       UNK1     H4           26       UNK1     H6             2.5000      3.4810
5            25       UNK1     H5           26       UNK1     H6             2.5000      2.5667
6            26       UNK1     H6           27       UNK1     H7             2.5000      3.5607
7            27       UNK1     H7           32       UNK1     H12            2.5000      2.7369
8            27       UNK1     H7           30       UNK1     H10            2.5000      3.7437
9            27       UNK1     H7           31       UNK1     H11            2.5000      2.5552
10           28       UNK1     H8      
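As a quick sanity check on the generated files, the experimental and model values can be pulled out and compared. The sketch below assumes the whitespace-separated layout shown in the header above, with the last two columns holding the experimental and model values; the helper name `read_restraints` is hypothetical and is not part of BICePs.

```python
import numpy as np

def read_restraints(lines):
    """Collect (exp, model) value pairs from the last two columns of each data row."""
    pairs = []
    for line in lines:
        fields = line.split()
        if not fields or line.startswith('#'):
            continue    # skip blank lines and the header
        pairs.append((float(fields[-2]), float(fields[-1])))
    return np.array(pairs)

# Header and first two rows copied from the output above
sample = [
    "#restraint_index atom_index1 res1 atom_name1 atom_index2 res2 atom_name2 exp_noe(A) model_noe(A)",
    "1            23       UNK1     H3           25       UNK1     H5             2.5000      2.8649",
    "2            23       UNK1     H3           27       UNK1     H7             2.5000      4.4001",
]
values = read_restraints(sample)
deviation = values[:, 1] - values[:, 0]    # model minus experiment, in Å
print(deviation)
```

To read a whole file, pass an open file handle instead of the list, for example `read_restraints(open('noe_J/0.noe'))`.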

Now let's move on to J couplings. Model predictions of coupling constants from dihedral angles θ were obtained from Karplus relations chosen depending on the relevant stereochemistry. 
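As an illustration of what a Karplus relation looks like, the sketch below evaluates the common three-term form J(θ) = A cos²θ + B cosθ + C. The coefficients used by `compute_nonaa_Jcoupling` are not shown in this tutorial, so the values of A, B, and C here are placeholders, and the function name `karplus` is hypothetical. The dihedral angles themselves could be obtained with, for example, `md.compute_dihedrals`, which returns angles in radians.

```python
import numpy as np

def karplus(theta, A, B, C):
    """Three-term Karplus relation J = A*cos^2(theta) + B*cos(theta) + C (theta in radians)."""
    theta = np.asarray(theta, dtype=float)
    return A * np.cos(theta) ** 2 + B * np.cos(theta) + C

# Placeholder coefficients in Hz, for illustration only
A, B, C = 8.0, -1.0, 1.5

# Dihedral angles of 0, 90, and 180 degrees
theta = np.deg2rad([0.0, 90.0, 180.0])
J = karplus(theta, A, B, C)
print(J)
```

With these placeholder values the coupling is smallest near 90 degrees and largest near 180 degrees, which is the qualitative behavior that lets J couplings report on dihedral angles.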

In [4]:
# atom indices of J coupling constants
ind=np.load('ind.npy')
print 'index', ind

# Karplus relation for each dihedral angle
karplus_key=np.load('Karplus.npy')
print 'Karplus relations', karplus_key

# compute J coupling constants using our built-in function (compute_nonaa_Jcoupling) in toolbox.py
if not os.path.isdir('J_coupling'):    # np.savetxt does not create the output folder
    os.makedirs('J_coupling')
for i in range(100):    # 100 clustered states
    print i
    J = compute_nonaa_Jcoupling('cineromycinB_pdbs/%d.fixed.pdb'%i, index=ind, karplus_key=karplus_key)
    np.savetxt('J_coupling/%d.txt'%i,J)


index [[25  9 10 26]
 [30 14 15 32]
 [31 14 15 32]
 [32 15 20 38]
 [32 15 20 39]
 [32 15 20 40]
 [32 15 16 33]
 [33 16 21 35]
 [33 16 21 36]
 [33 16 21 37]]
Karplus relations ['Allylic' 'Karplus_HH' 'Karplus_HH' 'Karplus_HH' 'Karplus_HH' 'Karplus_HH'
 'Karplus_antiperiplanar_O' 'Karplus_antiperiplanar_O'
 'Karplus_antiperiplanar_O' 'Karplus_antiperiplanar_O']
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99


Again, we need to convert the computed J coupling constants to a BICePs-supported format.

In [5]:
#########################################
# Let's create input files for BICePs
############ Preparation ################
# Specify necessary argument values

# REQUIRED: raw data of pre-computed J coupling constants
path = 'J_coupling/*txt'

# REQUIRED: number of states
states = 100

# REQUIRED: atom indices of J coupling constants
indices = 'atom_indice_J.txt'

# REQUIRED: experimental data
exp_data = 'exp_Jcoupling.txt'

# REQUIRED: topology file (only topology information is read from it, so it doesn't matter which state is used)
top = 'cineromycinB_pdbs/0.fixed.pdb'

# OPTIONAL: output directory of generated files
out_dir = 'noe_J'

p = Preparation('J',states=states,indices=indices,exp_data=exp_data,top=top,data_dir=path)   # 'J' scheme is selected
p.write(out_dir=out_dir)
# This converts the J coupling constant file for each state to a BICePs-readable format and saves the new files in the "noe_J" folder.

Wrote noe_J/0.J
Wrote noe_J/1.J
Wrote noe_J/2.J
Wrote noe_J/3.J
Wrote noe_J/4.J
Wrote noe_J/5.J
Wrote noe_J/6.J
Wrote noe_J/7.J
Wrote noe_J/8.J
Wrote noe_J/9.J
Wrote noe_J/10.J
Wrote noe_J/11.J
Wrote noe_J/12.J
Wrote noe_J/13.J
Wrote noe_J/14.J
Wrote noe_J/15.J
Wrote noe_J/16.J
Wrote noe_J/17.J
Wrote noe_J/18.J
Wrote noe_J/19.J
Wrote noe_J/20.J
Wrote noe_J/21.J
Wrote noe_J/22.J
Wrote noe_J/23.J
Wrote noe_J/24.J
Wrote noe_J/25.J
Wrote noe_J/26.J
Wrote noe_J/27.J
Wrote noe_J/28.J
Wrote noe_J/29.J
Wrote noe_J/30.J
Wrote noe_J/31.J
Wrote noe_J/32.J
Wrote noe_J/33.J
Wrote noe_J/34.J
Wrote noe_J/35.J
Wrote noe_J/36.J
Wrote noe_J/37.J
Wrote noe_J/38.J
Wrote noe_J/39.J
Wrote noe_J/40.J
Wrote noe_J/41.J
Wrote noe_J/42.J
Wrote noe_J/43.J
Wrote noe_J/44.J
Wrote noe_J/45.J
Wrote noe_J/46.J
Wrote noe_J/47.J
Wrote noe_J/48.J
Wrote noe_J/49.J
Wrote noe_J/50.J
Wrote noe_J/51.J
Wrote noe_J/52.J
Wrote noe_J/53.J
Wrote noe_J/54.J
Wrote noe_J/55.J
Wrote noe_J/56.J
Wrote noe_J/57.J
Wrote noe_J/58.J
Wrote n

Now, let's take a look at what's inside the newly generated files.

In [6]:
fin = open('noe_J/0.J','r')
text = fin.read()
fin.close()
print text

#restraint_index atom_index1 res1 atom_name1 atom_index2 res2 atom_name2 atom_index3 res3 atom_name3 atom_index4 res4 atom_name4 exp_J_coupling(Hz) model_J_coupling(Hz)
1            25       UNK1     H5           9        UNK1     C6           10       UNK1     C7           26       UNK1     H6             5.0000       3.2791
2            30       UNK1     H10          14       UNK1     C11          15       UNK1     C12          32       UNK1     H12            1.5000      13.3369
3            31       UNK1     H11          14       UNK1     C11          15       UNK1     C12          32       UNK1     H12            1.5000       0.6008
4            32       UNK1     H12          15       UNK1     C12          20       UNK1     C16          38       UNK1     H18            6.0000      13.8777
4            32       UNK1     H12          15       UNK1     C12          20       UNK1     C16          39       UNK1     H19            6.0000       3.3394
4            32       UNK1     H12  

### Conclusion ###
In this tutorial, we briefly showed how to use the `Preparation` class to prepare input files for BICePs from precomputed observables. So far, BICePs supports the following observables: NOE, J couplings, chemical shifts, and protection factors.

In the example above, we showed how to deal with NOE and J couplings (non-natural amino acids). 

For J couplings for natural amino acids, please check this tutorial. 

Chemical shifts can be computed using different algorithms. We recommend [Shiftx2](http://www.shiftx2.ca), which is also available through the MDTraj library.

Protection factors are a special observable that requires extra work. This tutorial (under construction) shows how to include protection factors in BICePs sampling.

Once the input files are ready, we can move on to [how to set up BICePs sampling](https://github.com/vvoelz/biceps/blob/master/BICePs_2.0/tutorials/BICePs_example/BICePs_example.ipynb).