# MDM-TASK-web command line tools

## Quick introduction

The tools mine information about protein dynamics. Most of them require a topology file and a molecular dynamics trajectory. Documentation for each command line tool is obtained with the `-h` flag, for example:

`../calc_correlation.py -h`. 

This notebook gives a short overview of how the command line tools can be used. The <strong>MDM-TASK-web</strong> server is available <a href="https://mdmtaskweb.rubi.ru.ac.za/">here</a>.

In [None]:
import nglview as nv
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import Image
%matplotlib inline

## Dynamic Cross Correlation

DCC determines the pairwise correlation of residue motion across an MD trajectory, using C-alpha atoms by default. One or more other atom types (as named in the topology file) can be used instead by supplying a comma-separated list without spaces, e.g. 'CA,P'. The backbone phosphorus 'P' atom can be a good choice for representing nucleotide residues. In the resulting plot, residue indices are numbered starting from zero.

In [None]:
!../calc_correlation.py --step 50 --topology pr1.pdb pr1.xtc
Image("correlation.png")
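
For intuition, DCC is commonly defined as the normalised covariance of atomic displacements, C_ij = <Δr_i · Δr_j> / sqrt(<Δr_i²> <Δr_j²>). The cell below is a minimal NumPy sketch of that textbook definition on synthetic coordinates. It is not taken from `calc_correlation.py`; the function name `dynamic_cross_correlation` and the toy trajectory are assumptions made for illustration, and the frames are assumed to be already superimposed.

```python
import numpy as np

def dynamic_cross_correlation(coords):
    """Return the N x N DCC matrix for coords of shape (n_frames, n_atoms, 3)."""
    # Displacement of every atom from its mean position over the trajectory.
    delta = coords - coords.mean(axis=0)
    # Time-averaged dot product <dr_i . dr_j> for every atom pair.
    cov = np.einsum("fix,fjx->ij", delta, delta) / coords.shape[0]
    # Normalise by sqrt(<dr_i^2> <dr_j^2>) so values lie in [-1, 1].
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

# Toy trajectory (hypothetical data): atoms 0 and 1 move together,
# atom 2 moves in exactly the opposite direction.
rng = np.random.default_rng(0)
shift = rng.normal(size=(200, 1, 3))
toy = np.concatenate([shift, shift, -shift], axis=1)
dcc = dynamic_cross_correlation(toy)
print(np.round(dcc, 2))
```

Values near 1 indicate correlated motion, values near -1 anti-correlated motion, and values near 0 uncorrelated motion.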

## Residue Interaction Network analysis

Network centrality metrics from a residue interaction network (RIN) are calculated from a single conformation (provided in any MDTraj-supported format). Each graph consists of nodes (protein C-beta atoms, or C-alpha atoms for GLY) connected by unweighted (binary) edges that are assigned to each protein residue pair using a user-defined cut-off distance. The default cut-off distance of 6.7 Angstroms comes from previous work identifying the first of a series of coordination shells using the radial distribution function in a collection of proteins.

In [None]:
!../calc_network.py --topology pr2.pdb --calc-BC pr2.pdb
v = nv.show_file("./pr2_mean_BC.cif")
v.clear_representations()
v.add_representation("spacefill", colorScheme="bfactor")
v.center()
v
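
The graph construction described above can be sketched in plain Python: residues become nodes, and an edge is added wherever two representative atoms lie within the cut-off. Betweenness centrality is then computed here with Brandes' algorithm. This is an illustrative sketch only, not the code inside `calc_network.py`; the names `contact_graph` and `betweenness` and the toy coordinates are hypothetical, and the result is left unnormalised.

```python
import math
from collections import deque

CUTOFF = 6.7  # Angstroms, the default quoted above

def contact_graph(coords, cutoff=CUTOFF):
    """Adjacency sets: residues i and j share an edge if they lie within cutoff."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def betweenness(adj):
    """Unnormalised betweenness centrality of an unweighted, undirected graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back towards s.
        dep = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += sigma[v] / sigma[w] * (1.0 + dep[w])
            if w != s:
                bc[w] += dep[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2.0 for v, b in bc.items()}

# Toy "protein" (hypothetical): five residues on a line, 5 Angstroms apart,
# so only sequence neighbours are in contact.
toy_coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
graph = contact_graph(toy_coords)
bc = betweenness(graph)
print(bc)
```

In this toy chain the central residue lies on the most shortest paths and therefore has the highest betweenness.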

## Dynamic Residue Network analysis

Dynamic Residue Network (DRN) analysis aggregates metrics computed from network graphs, one graph being built for each frame of a conformational sampling experiment, typically MD. Each graph consists of nodes (protein C-beta atoms, or C-alpha atoms for GLY) connected by edges that are assigned to each protein residue pair using a user-defined cut-off distance. The default cut-off distance of 6.7 Angstroms comes from previous work identifying the first of a series of coordination shells using the radial distribution function in a collection of proteins. This value can be changed for experimentation.
For faster uploads we recommend reducing the input trajectory.

In [None]:
!../calc_network.py --topology pr1.pdb --step 50 --calc-BC pr1.xtc
v = nv.show_file("./pr1_mean_BC.cif")
v.clear_representations()
v.add_representation("spacefill", colorScheme="bfactor")
v.center()
v
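
The per-frame aggregation can be illustrated with a short plain-Python sketch. To keep it brief, it swaps in degree (the number of contacts per residue) for the betweenness centrality requested by `--calc-BC` above, and averages it over frames. The function names and the toy trajectory are hypothetical, and this is not the tool's own implementation.

```python
import math
from statistics import mean

def degree_per_residue(frame, cutoff=6.7):
    """Number of residues within cutoff (Angstroms) of each residue in one frame."""
    n = len(frame)
    return [
        sum(1 for j in range(n) if j != i and math.dist(frame[i], frame[j]) <= cutoff)
        for i in range(n)
    ]

def mean_metric(trajectory, metric=degree_per_residue):
    """Average a per-frame, per-residue metric over every frame of a trajectory."""
    per_frame = [metric(frame) for frame in trajectory]
    # zip(*per_frame) regroups the values by residue rather than by frame.
    return [mean(values) for values in zip(*per_frame)]

# Toy trajectory (hypothetical) of three residues: residue 2 is in contact
# with residue 1 in the first frame only.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
]
mean_degree = mean_metric(toy_trajectory)
print(mean_degree)
```

Any other per-frame metric with the same signature could be passed as `metric`; only the averaging step is specific to the dynamic variant of the analysis.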

### Custom plotting of DRN results 

In [None]:
print(pd.read_csv("./pr2_mean.csv").head())

In [None]:
b = pd.read_csv("./pr2_mean.csv")
b.BC.plot()

## Weighted Residue Contact Map

This tool generates a weighted network graph using contact frequencies obtained at a defined cut-off radius (typically 6.7 Angstroms) around a protein residue of interest. It is particularly useful for obtaining a weighted contact graph at a residue locus of high centrality, as determined from the DRN metrics. The edge weights represent the contact frequencies determined over the course of the trajectory. The tool can be coupled to the contact heat map tool for large-scale comparisons.

In [None]:
!../contact_map.py --topology pr1.pdb --step 10 --residue ASH25 --ocsv ASH25_pr1.csv pr1.xtc
Image("ASH25_chainA_contact_map.png")
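
As a rough illustration of how such edge weights arise, the sketch below counts, for one residue of interest, the fraction of frames in which every other residue falls within the cut-off. It is a plain-Python sketch under stated assumptions, not the logic of `contact_map.py`: the function name `contact_frequencies`, the use of a single representative atom per residue, and the toy coordinates are all hypothetical.

```python
import math
from collections import Counter

def contact_frequencies(trajectory, centre, cutoff=6.7):
    """Fraction of frames in which each residue lies within cutoff of `centre`."""
    counts = Counter()
    for frame in trajectory:
        for j, xyz in enumerate(frame):
            if j != centre and math.dist(frame[centre], xyz) <= cutoff:
                counts[j] += 1
    return {j: c / len(trajectory) for j, c in sorted(counts.items())}

# Toy data (hypothetical): residue 1 always touches residue 0,
# residue 2 does so in half of the frames.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (6.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)],
]
weights = contact_frequencies(toy_trajectory, centre=0)
print(weights)
```

A weight of 1.0 corresponds to a contact that persists throughout the sampled frames, while lower values indicate transient contacts.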

In [None]:
%%bash
../contact_map.py --topology pr2.pdb --step 10 --residue ASH25 --ocsv ASH25_pr2.csv pr2.xtc
../contact_map.py --topology pr3.pdb --step 10 --residue ASH25 --ocsv ASH25_pr3.csv pr3.xtc
../contact_map.py --topology pr4.pdb --step 10 --residue ASH25 --ocsv ASH25_pr4.csv pr4.xtc
../contact_map.py --topology pr5.pdb --step 10 --residue ASH25 --ocsv ASH25_pr5.csv pr5.xtc
../contact_map.py --topology pr6.pdb --step 10 --residue ASH25 --ocsv ASH25_pr6.csv pr6.xtc
../contact_map.py --topology pr7.pdb --step 10 --residue ASH25 --ocsv ASH25_pr7.csv pr7.xtc

## Weighted Residue Contact Heatmap

The contact heatmap aggregates the CSV files obtained from multiple weighted contact network graphs. This enables high-throughput comparison of the local neighborhoods of a given residue position across comparable samples. For example, a given position can be compared in the presence and absence of a drug. The position of interest need not be the site of a missense mutation. The heatmap can also serve as a follow-up method, for example to examine the results of degree centrality mapping in greater detail across multiple mutants or various experimental conditions.

In [None]:
!../contact_heatmap.py --annotate --xtickfontsize 8 ASH25_pr*.csv
Image("contact_heatmap.png")
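
Conceptually, the aggregation step aligns the contact frequencies from each sample on a shared set of contacting residues, filling in zero where a contact is absent. The sketch below shows that alignment in plain Python. The layout of the CSV files written by `contact_map.py` is not reproduced here; the sample names, residue labels, and frequencies are made-up placeholders, and `contact_table` is a hypothetical helper.

```python
def contact_table(samples):
    """Align per-sample contact frequencies on the union of contacting residues."""
    columns = sorted({residue for weights in samples.values() for residue in weights})
    rows = {
        name: [weights.get(residue, 0.0) for residue in columns]
        for name, weights in samples.items()
    }
    return columns, rows

# Placeholder contact frequencies around one residue in two samples.
samples = {
    "sample_1": {"RES_A": 1.0, "RES_B": 0.9},
    "sample_2": {"RES_A": 1.0, "RES_C": 0.4},
}
columns, rows = contact_table(samples)
print(columns)
for name, values in rows.items():
    print(name, values)
```

Each row of the resulting table corresponds to one sample and each column to one neighbouring residue, which is the shape a heatmap needs.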

## Perturbation response scanning

PRS sequentially applies a set of uniformly distributed forces (perturbations) to each residue of an initial (reference) conformation, and then assesses the correlation of the response against a target (final) conformation. The final conformation need not be identical to the input topology, but it must contain homologous residues. For the results to be representative, it is important that the trajectory used to calculate the covariance matrix be properly equilibrated.
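
The general linear-response idea behind PRS can be sketched with NumPy: the covariance matrix of the trajectory maps a force applied to one residue onto a predicted displacement of all residues, which is then correlated with the displacement observed between the initial and target conformations. The cell below is a simplified sketch on random data and is not the implementation in `prs.py`; the function name `prs`, its parameters, and the choice to keep the best correlation per residue are assumptions made for illustration, and all structures are assumed to be pre-aligned.

```python
import numpy as np

def prs(trajectory, initial, target, perturbations=30, seed=0):
    """Best correlation, per perturbed residue, of predicted vs observed displacement.

    trajectory: (n_frames, n_res, 3); initial and target: (n_res, 3).
    """
    rng = np.random.default_rng(seed)
    n_frames, n_res, _ = trajectory.shape
    # 3N x 3N covariance of positional fluctuations about the trajectory mean.
    flat = trajectory.reshape(n_frames, 3 * n_res)
    fluct = flat - flat.mean(axis=0)
    cov = fluct.T @ fluct / n_frames
    # Observed per-residue displacement magnitude between the two conformations.
    observed = np.linalg.norm(target - initial, axis=1)
    best = np.full(n_res, -1.0)
    for i in range(n_res):
        for _ in range(perturbations):
            # Unit force with a uniformly random direction, applied to residue i only.
            force = np.zeros(3 * n_res)
            direction = rng.normal(size=3)
            force[3 * i:3 * i + 3] = direction / np.linalg.norm(direction)
            # Linear response: predicted displacement = covariance . force.
            predicted = np.linalg.norm((cov @ force).reshape(n_res, 3), axis=1)
            best[i] = max(best[i], np.corrcoef(predicted, observed)[0, 1])
    return best

# Random toy data (hypothetical), only to show the shapes involved.
rng = np.random.default_rng(1)
toy_traj = rng.normal(size=(100, 6, 3))
toy_initial = toy_traj.mean(axis=0)
toy_target = toy_initial + rng.normal(size=(6, 3))
scores = prs(toy_traj, toy_initial, toy_target, perturbations=10)
print(np.round(scores, 2))
```

Residues with a high score are those whose perturbation best reproduces the observed conformational change under this simplified model.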

### Getting the target conformation

Let's download an open-conformation HIV protease structure from the PDB as our <strong>target conformation</strong>.

In [None]:
!wget https://files.rcsb.org/download/1TW7.pdb

In [None]:
nv.show_file('1TW7.pdb')

### Our initial conformation/topology

In [None]:
nv.show_file('pr1.pdb')

In [None]:
!../prs.py --topology pr1.pdb --perturbations 30 --step 1 --final 1TW7.pdb pr1.xtc 

### Custom plotting of PRS correlation data

In [None]:
pd.read_csv("result.csv").plot()