# Welcome to the Markov-Model hands-on tutorial

If you are reading this, you are on the right path already! This is a "Python notebook":
a series of commands mixed with comments that you may execute, modify, and re-execute
at will.

Cells are executed with **Shift+Enter**. If you need to start over, use **Kernel/Restart** in the top menu.


## Getting started

First we import the modules we are going to need for the tutorial. Please execute the following cell and ignore the output.

In [1]:
from htmd.ui import *
config(viewer="ngl")


Copyright by Acellera Ltd. By executing you are accepting the License. In order to register, run htmd_register on your terminal

ffevaluate module is in beta version


In [2]:
m=Molecule("3ffn")

2019-01-14 16:36:04,626 - htmd.molecule.readers - INFO - Attempting PDB query for 3ffn


In [3]:
m.view()

A Jupyter Widget

# First example dataset: prothrombin

Look into `/mnt/scratch/shared/markov/binding/`. You will find three independent replicas of the same system,
namely prothrombin simulated with an inhibitor.

In [4]:
n=Molecule("/mnt/scratch/shared/markov/binding/1/filtered/filtered.pdb")
n.view()





A Jupyter Widget

In [5]:
n.sequence()

{'': 'EADCGLRPLFEKKSLEDKTERELLESYIIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKEGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDQF'}

In [6]:
n.get("resname",sel="not protein")

array(['MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL',
       'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'MOL',
       'MOL', 'MOL', 'MOL', 'MOL', 'MOL', 'Cl-', 'Cl-', 'Cl-', 'Cl-'], dtype=object)

In [7]:
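ligand = n.copy()              # work on a copy so that `n` keeps the full system
ligand.filter("resname MOL")   # keep only the atoms of the ligand (residue name MOL)
ligand.view()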

2019-01-14 16:36:07,182 - htmd.molecule.molecule - INFO - Removed 4484 atoms. 23 atoms remaining in the molecule.


A Jupyter Widget

The inhibitor seems to be *Piperidin-1-ylmethanediamine*. See PubChem: https://pubchem.ncbi.nlm.nih.gov/compound/67834394

# Note the data set layout

```
/mnt/scratch/shared/markov/binding/
    ├── 1
    │   └── filtered
    │       ├── e10s1_e7s5f133
    │       │   └── e10s1_e7s5f133-SDOERR_thrombinLig6x1-0-1-RND6286_9.filtered.xtc
    │       ├── e10s2_e7s5f159
    │       │   └── e10s2_e7s5f159-SDOERR_thrombinLig6x1-0-1-RND8907_9.filtered.xtc
    │       ├── e10s3_e7s5f133
    │       │   └── e10s3_e7s5f133-SDOERR_thrombinLig6x1-0-1-RND5451_9.filtered.xtc
...    

            ├── e9s6_e2s5f112
            │   └── e9s6_e2s5f112-SDOERR_thrombinLig6x3-0-1-RND3971_9.filtered.xtc
            ├── filtered.pdb
            └── filtered.psf

860 directories, 858 files
```
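The layout above can be explored programmatically. The following is a minimal sketch using only the Python standard library; the helper name `find_trajectories` is hypothetical, and it assumes that every trajectory directory holds exactly one `*.filtered.xtc` file, as the listing suggests.

```python
from pathlib import Path


def find_trajectories(root):
    """Map each trajectory directory name to the .xtc file it contains.

    Hypothetical helper: assumes one `*.filtered.xtc` file per sub-directory.
    """
    root = Path(root)
    trajectories = {}
    for xtc in sorted(root.glob("*/*.filtered.xtc")):
        # The directory name (e.g. "e10s1_e7s5f133") identifies the trajectory.
        trajectories[xtc.parent.name] = xtc
    return trajectories


# Path taken from the tutorial; adjust it if the data lives elsewhere.
filtered_dir = "/mnt/scratch/shared/markov/binding/1/filtered"
trajs = find_trajectories(filtered_dir)
print(len(trajs), "trajectories found")
```

If the directory is missing, the dictionary is simply empty, so the cell is safe to run on any machine.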

Each `.xtc` file is one trajectory of 20 ns (200 frames, spaced 100 ps apart). Each trajectory is a small piece of the whole simulation set. File names indicate the identity of the trajectory (e.g. `e10s1`) and where it started from (e.g. `e7s5f133` means that the trajectory was started from frame 133 of `e7s5`).
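The naming scheme can be decoded with a short regular expression. This is only a sketch: the function `parse_trajectory_name` and the pattern are assumptions based on the directory names shown in the listing, and names that follow a different scheme are rejected with an error. The last lines check the trajectory length from the frame count and spacing.

```python
import re

# Assumed pattern: <identity>_<parent>f<frame>, e.g. "e10s1_e7s5f133".
NAME_RE = re.compile(
    r"^(?P<identity>e\d+s\d+)_(?P<parent>e\d+s\d+)f(?P<frame>\d+)$"
)


def parse_trajectory_name(name):
    """Return (identity, parent trajectory, starting frame) for a directory name."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected trajectory name: {name!r}")
    return match["identity"], match["parent"], int(match["frame"])


print(parse_trajectory_name("e10s1_e7s5f133"))  # ('e10s1', 'e7s5', 133)

# Trajectory length: 200 frames spaced 100 ps apart.
n_frames = 200
frame_spacing_ps = 100
total_ns = n_frames * frame_spacing_ps / 1000
print(total_ns, "ns")  # 20.0 ns
```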

Water molecules were removed. To visualize a trajectory properly, first `wrap()` and then `align()` it (to account for periodic boundary conditions and diffusion, respectively).
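Why periodic boundary conditions matter for visualization can be seen in a toy one-dimensional example. The sketch below shows only the underlying arithmetic, assuming an orthorhombic box with a hypothetical edge length; it is not meant to describe how HTMD implements `wrap()`. Coordinates that differ by a whole box length describe the same position, so an atom that drifts past the box edge reappears on the opposite side.

```python
import math


def wrap_coordinate(x, box_length):
    """Map a coordinate into the primary periodic cell [0, box_length)."""
    return x - box_length * math.floor(x / box_length)


box_length = 10.0                    # hypothetical box edge
unwrapped = [9.0, 9.5, 10.5, 11.0]   # an atom drifting across the box boundary
wrapped = [wrap_coordinate(x, box_length) for x in unwrapped]
print(wrapped)  # [9.0, 9.5, 0.5, 1.0]
```

The apparent jump from 9.5 to 0.5 is the kind of discontinuity that makes an unprocessed trajectory hard to inspect visually.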


In [8]:
m=Molecule("/mnt/scratch/shared/markov/binding/1/filtered/filtered.psf")
m.read("/mnt/scratch/shared/markov/binding/1/filtered/e10s1_e7s5f133/e10s1_e7s5f133-SDOERR_thrombinLig6x1-0-1-RND6286_9.filtered.xtc")
m

<htmd.molecule.molecule.Molecule object at 0x7f7855555668>
Molecule with 4507 atoms and 200 frames
Atom field - altloc shape: (4507,)
Atom field - atomtype shape: (4507,)
Atom field - beta shape: (4507,)
Atom field - chain shape: (4507,)
Atom field - charge shape: (4507,)
Atom field - coords shape: (4507, 3, 200)
Atom field - element shape: (4507,)
Atom field - insertion shape: (4507,)
Atom field - masses shape: (4507,)
Atom field - name shape: (4507,)
Atom field - occupancy shape: (4507,)
Atom field - record shape: (4507,)
Atom field - resid shape: (4507,)
Atom field - resname shape: (4507,)
Atom field - segid shape: (4507,)
Atom field - serial shape: (4507,)
angles shape: (0, 3)
bonds shape: (4562, 2)
bondtype shape: (4562,)
box shape: (3, 200)
boxangles shape: (3, 200)
crystalinfo: None
dihedrals shape: (0, 4)
fileloc shape: (200, 2)
impropers shape: (0, 4)
reps: 
ssbonds shape: (0,)
step shape: (200,)
time shape: (200,)
topoloc: /mnt/scratch/shared/markov/binding/1/filtered/filtere

In [9]:
m.wrap()                         # account for periodic boundary conditions
m.align("protein and name CA")   # superimpose all frames on the protein C-alpha atoms
m.view()

A Jupyter Widget

In [10]:
m.get("serial",sel="protein and name CA")

array([   5,   20,   30,   42,   52,   59,   78,  110,  116,  135,  155,
        170,  192,  214,  225,  244,  259,  271,  293,  307,  322,  346,
        361,  380,  399,  414,  425,  446,  468,  487,  503,  518,  525,
        536,  548,  558,  573,  592,  599,  616,  635,  641,  665,  682,
        698,  715,  734,  754,  778,  800,  819,  825,  842,  857,  876,
        895,  905,  912,  922,  933,  952,  971,  982,  994, 1018, 1042,
       1058, 1077, 1091, 1101, 1111, 1128, 1138, 1157, 1176, 1205, 1219,
       1225, 1249, 1261, 1283, 1297, 1317, 1331, 1346, 1360, 1372, 1391,
       1410, 1426, 1450, 1469, 1476, 1498, 1515, 1526, 1550, 1564, 1588,
       1609, 1624, 1648, 1662, 1681, 1696, 1718, 1737, 1748, 1765, 1784,
       1799, 1821, 1840, 1861, 1880, 1905, 1911, 1935, 1956, 1970, 1994,
       2018, 2033, 2047, 2066, 2078, 2102, 2114, 2133, 2143, 2162, 2179,
       2201, 2220, 2242, 2272, 2278, 2294, 2304, 2324, 2335, 2347, 2368,
       2387, 2412, 2418, 2434, 2444, 2471, 2477, 24

In [11]:
m.get("serial",sel="resname MOL and noh")

array([4481, 4484, 4487, 4490, 4493, 4496, 4497, 4498, 4501])

## Visualize selected frames

The cells below repeat the loading steps above for two other trajectories, for convenience.

In [12]:
m=Molecule("/mnt/scratch/shared/markov/binding/1/filtered/filtered.psf")
m.read("/mnt/scratch/shared/markov/binding/1/filtered/e12s1_e9s4f131/e12s1_e9s4f131-SDOERR_thrombinLig6x1-0-1-RND2190_9.filtered.xtc")
m.wrap()
m.align("protein and name CA")
m.view()

A Jupyter Widget

In [13]:
m=Molecule("/mnt/scratch/shared/markov/binding/1/filtered/filtered.psf")
m.read("/mnt/scratch/shared/markov/binding/1/filtered/e12s5_e10s6f112/e12s5_e10s6f112-SDOERR_thrombinLig6x1-0-1-RND3432_9.filtered.xtc")
m.wrap()
m.align("protein and name CA")
m.view()

A Jupyter Widget