# Tutorial for BaNDyT: Bayesian Network Analysis of Dynamic Trajectories

In this tutorial, we will use the BaNDyT package to explore the probabilistic relationships between residues of interferon-induced transmembrane protein 3 (IFITM3) and identify functionally important residues.

**Step 1** Using an MD package of choice (for example, *Gromacs*), we calculated the interaction energy of each residue with its surroundings. All measurements were combined in the CSV file *ifitm3.csv*, where each column corresponds to a residue and each row to a frame of the trajectory (an excerpt from the file is shown in Table 1).

| R10       | R11       | R12       |
|-----------|-----------|-----------|
| -456.262  | -462.787  | -432.281  |
| -441.219  | -467.503  | -443.350  |
| -478.927  | -496.202  | -457.204  |
| -489.087  | -498.819  | -445.036  |
| -494.550  | -521.253  | -433.522  |

**Table 1**. Example of a BaNDyT input dataset based on the interaction energies of IFITM3 residues.

Once we have the MD simulation data of interest, we can start processing it with the BaNDyT package. To use a local copy of the package, define its location as follows:

In [None]:
import sys

# put your path here
bandyt_path = '/path/to/BaNDyT/directory'
# make the local copy importable
sys.path.append(bandyt_path)
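
Before loading the file, it may be worth confirming that it has the layout described in Step 1: one column per residue and one row per frame. The following is a minimal sketch using only the Python standard library. It assumes a comma-delimited file with a header row of residue names; `check_layout` is a hypothetical helper, not part of BaNDyT, and the Table 1 excerpt is inlined so the sketch runs without *ifitm3.csv*.

```python
import csv
import io

# Excerpt from Table 1, inlined so the sketch runs without ifitm3.csv.
# Assumption: comma-delimited values with a header row of residue names.
EXCERPT = """R10,R11,R12
-456.262,-462.787,-432.281
-441.219,-467.503,-443.350
-478.927,-496.202,-457.204
-489.087,-498.819,-445.036
-494.550,-521.253,-433.522
"""


def check_layout(handle):
    """Return (residue_names, n_frames); raise ValueError on bad rows."""
    reader = csv.reader(handle)
    header = next(reader)
    n_frames = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} values, got {len(row)}"
            )
        for value in row:
            float(value)  # raises ValueError if the entry is not numeric
        n_frames += 1
    return header, n_frames


residues, n_frames = check_layout(io.StringIO(EXCERPT))
print(residues, n_frames)
```

To check the real file, pass an open file handle instead, for example `check_layout(open('ifitm3.csv', newline=''))`.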

**Step 2** involves loading the input data. The loader function detects whether the dataset is already discretized. If the data are continuous, it uses a maximum entropy algorithm to discretize them into 8 bins by default.

In [None]:
import bandyt

dt = bandyt.read_input_file('ifitm3.csv')
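
To give an intuition for the discretization step, the sketch below shows equal-frequency binning, which is one common way to maximize the entropy of a discretized variable: when every bin holds the same number of frames, the Shannon entropy reaches its upper bound of log2(8) = 3 bits. This is an illustration only. Whether BaNDyT's loader works exactly this way is an assumption, and `equal_frequency_bins` and `entropy_bits` are hypothetical helpers, not part of the package.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins=8):
    """Label each value 0..n_bins-1 so that bins hold near-equal counts."""
    # Rank the values, then map ranks evenly onto the bin labels.
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(values)
    return labels


def entropy_bits(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


# Made-up continuous values standing in for one residue's energies.
energies = [-500.0 + 3.5 * i for i in range(16)]
labels = equal_frequency_bins(energies, n_bins=8)
print(labels)
print(entropy_bits(labels))
```

Note that this simple rank-based version may place tied values in different bins; a production implementation would need to handle ties explicitly.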

**Step 3** initializes the search for the starting Bayesian network. The function *bandyt.search* uses the MU scoring function by default. If you have compiled the C version of the BaNDyT package, you can change the scoring function to *bandyt.cmu* to make the calculation faster.

In [None]:
# change to srch = bandyt.search(dt, ofunc=bandyt.cmu) if the C package is compiled
srch = bandyt.search(dt)

**Step 4** performs the recursive search for the optimal Bayesian network topology. We recommend at least 50 restarts to ensure network convergence.

In [None]:
srch.restarts(nrestarts=50)
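
The general reason restarts help can be illustrated with a toy example: a greedy search can stall at a local optimum, and repeating it from different starting points makes it more likely that the best-scoring solution is found. The sketch below is not BaNDyT's search algorithm. The score landscape, `hill_climb`, and `search_with_restarts` are all hypothetical and only demonstrate the idea of keeping the best result across restarts.

```python
import random

# Toy score landscape with several local maxima (hypothetical values).
SCORES = [1, 3, 2, 5, 4, 9, 3, 2, 6, 1]


def hill_climb(start):
    """Move to the better neighbour until no neighbour improves the score."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(SCORES)]
        best = max(neighbours, key=lambda p: SCORES[p])
        if SCORES[best] <= SCORES[pos]:
            return pos  # local maximum reached
        pos = best


def search_with_restarts(nrestarts, seed=0):
    """Run hill_climb from random starts and track the best score so far."""
    rng = random.Random(seed)
    best_pos, history = None, []
    for _ in range(nrestarts):
        pos = hill_climb(rng.randrange(len(SCORES)))
        if best_pos is None or SCORES[pos] > SCORES[best_pos]:
            best_pos = pos
        history.append(SCORES[best_pos])
    return best_pos, history


best_pos, history = search_with_restarts(nrestarts=50)
print(best_pos, history[-1])
```

The `history` list never decreases, so once it stops changing over many restarts, additional restarts are unlikely to improve the result. This is the sense in which more restarts support convergence.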

**Step 5** saves the final topology as a dot file and a pickle file for future analysis and visualization.

In [None]:
srch.dot(path='ifitm3')
bandyt.convert_bn_to_igraph(srch,fout="ifitm3.pickle",format="pickle")

**Step 6** generates the output CSV file with the weighted degree of each residue and a GraphML object that can be used to visualize the Bayesian network graph in network visualization software (for example, *Cytoscape*). An example of network visualization is shown in Figure 1.

In [None]:
bandyt.getGraphProp('ifitm3.pickle', 'ifitm3')
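
For readers unfamiliar with the term, the weighted degree of a node is the sum of the weights of the edges attached to it. The sketch below computes it for a tiny made-up graph. The edge list and weights are hypothetical and do not come from the IFITM3 network, `weighted_degree` is not a BaNDyT function, and counting both incoming and outgoing edges is an assumption about how the package defines the quantity.

```python
from collections import defaultdict

# Hypothetical directed edges (parent, child, weight); not BaNDyT output.
EDGES = [("R10", "R11", 0.5), ("R11", "R12", 0.25), ("R10", "R12", 1.0)]


def weighted_degree(edges):
    """Sum the weights of all edges touching each node (in plus out)."""
    totals = defaultdict(float)
    for parent, child, weight in edges:
        totals[parent] += weight
        totals[child] += weight
    return dict(totals)


degrees = weighted_degree(EDGES)
# List residues from highest to lowest weighted degree.
for residue, value in sorted(degrees.items(), key=lambda kv: kv[1], reverse=True):
    print(residue, value)
```

Residues with a high weighted degree are the strongly connected nodes of the network, which is why this quantity is used to rank candidate functionally important residues.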

<div style="text-align: center;">
    <img src="figures/G_ifitm3.png" alt="IFITM3 graph" width="500">
</div>

**Figure 1.** Bayesian network graph of the probabilistic relationships between IFITM3 residues. The figure was generated using Cytoscape 3.10.2. Node size and color are proportional to weighted degree; arrow thickness and color are proportional to edge weight. The graph is shown in the edge-weighted spring-embedded layout (based on the force-directed paradigm as implemented by Kamada and Kawai (1988)).
