# Contactome and Protonation assessment

The aim of this program is to assess the protonation states assigned by the proteinPrepare function from htmd. The first step is the creation of the contactome, an object that contains all the important interactions of the protein (as determined by a distance cutoff). Then, using the contactome as a starting point, the protonation of the amino acids is revised.
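
To make the idea of a cutoff-based contact list concrete, here is a minimal sketch that reports every pair of points lying within a given distance of each other. It is not the code in contactome.py: the function name `contacts_within_cutoff` and the toy coordinates are assumptions made for illustration, and the default of 3.5 simply mirrors the cutoff printed in the checking output.

```python
import numpy as np

def contacts_within_cutoff(coords, cutoff=3.5):
    """Return (i, j, distance) for every pair of points within `cutoff` (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise distance matrix via broadcasting: shape (n_points, n_points)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Upper triangle only (k=1): each pair appears once and self-pairs are skipped
    i, j = np.triu_indices(len(coords), k=1)
    pair_dist = dist[i, j]
    mask = pair_dist <= cutoff
    return [(int(a), int(b), float(d)) for a, b, d in zip(i[mask], j[mask], pair_dist[mask])]

# Toy coordinates (assumed, not taken from any structure)
toy_coords = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
pairs = contacts_within_cutoff(toy_coords)
print(pairs)
```

A real contactome presumably also takes atom types and hydrogen-bond geometry into account; the sketch only covers the distance criterion.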

In [1]:
from htmd import *
from contactome import *  # contactome.py has to be located in the working directory
pdb = Molecule('3PBL')
my_pdb, prepData = proteinPrepare(pdb, returnDetails=True) #proteinPrepare step
prova,check = contactome(my_pdb,prepData,returnDetails=True) # contactome creation and contactome checking

2017-06-20 18:17:06,883 - htmd.molecule.readers - INFO - Attempting PDB query for 3PBL



Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. https://dx.doi.org/10.1021/acs.jctc.6b00049

HTMD Documentation at: https://www.htmd.org/docs/latest/

New devel HTMD version (1.7.35 python[==3.5,==3.6]) is available. You are currently on (1.7.32). Use 'conda update -c acellera htmd' to update to the new version. You might need to update your python version as well if there is no release for your current version.



2017-06-20 18:17:11,122 - propka - INFO - No pdbfile provided
2017-06-20 18:18:03,322 - htmd.builder.preparationdata - INFO - The following residues are in a non-standard state: ASP    75  A (ASH), CYS   103  A (CYX), HIS   137  A (HID), HIS   140  A (HID), CYS   181  A (CYX), HIS  1031  A (HIP), HIS   349  A (HID), HIS   354  A (HID), CYS   355  A (CYX), CYS   358  A (CYX), HIS   359  A (HID), ASP    75  B (ASH), CYS   103  B (CYX), CYS   181  B (CYX), GLU  1011  B (GLH), HIS  1031  B (HIP), HIS   349  B (HID), HIS   354  B (HID), CYS   355  B (CYX), CYS   358  B (CYX), HIS   359  B (HID)

**Contactome**: 2 chains detected --> A B. Creating the contactome...
Checking the contactome...

CHARGE CLASHES:  (cutoff=3.5)
Charged residue ('A', 1061, 'ASP', 'OD2') is in direct H-bond orientation with a residue of the same charge: ('A', 1064, 'GLU', 'OE1')
Charged residue ('A', 1061, 'ASP', 'OD2') is in direct H-bond orientation with a residue of the same charge: ('A', 1064, 'GLU', 'OE2')
Char

9.06 min


DUBIOUS HISTIDINE:  (cutoff=3.5)
Chain: B | Resid: 354 | Protonation: HID || Predicted --> HIP 
Chain: B | Resid: 349 | Protonation: HID || Predicted --> HIE 
Chain: A | Resid: 354 | Protonation: HID || Predicted --> HIE 
Chain: A | Resid: 349 | Protonation: HID || Predicted --> HIE HIP 
Chain: B | Resid: 359 | Protonation: HID || Predicted --> HIE 
Chain: A | Resid: 140 | Protonation: HID || Predicted --> HIE 


## 1) Retrieving information

Here are the main ways to retrieve specific information from the contactome.
The first example shows how to get all the contacts of a specific amino acid type.

In [2]:
prova.iloc[prova.index.get_level_values('Resname') == 'SER']

| Chain | Resid | Resname | Name | n | chain | resid | resname | name | bond |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | 35 | SER | OG | 0 | | | | | |
| A | 35 | SER | OGd | 0 | | | | | |
| A | 70 | SER | OG | 1 | A | 120 | ASN | ND2 | 2.84 |
| A | 70 | SER | OG | 2 | A | 71 | LEU | H | 2.93 |
| A | 70 | SER | OGd | 1 | A | 120 | ASN | OD1 | 2.72 |
| A | 70 | SER | OGd | 2 | A | 67 | LEU | O | 3.5 |
| A | 99 | SER | OG | 1 | A | 101 | ILE | H | 2.22 |
| A | 99 | SER | OG | 2 | A | 102 | CYS | H | 2.8 |
| A | 99 | SER | OG | 3 | A | 100 | ARG | H | 2.96 |
| A | 99 | SER | OGd | 0 | | | | | |
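
The same kind of selection can be tried without htmd on a small stand-in DataFrame. The sketch below rebuilds three SER rows from the table above and adds one invented LEU row (an assumption, only there to have something to filter out), then selects by index level in two ways that should be equivalent: a boolean mask and `DataFrame.xs`.

```python
import pandas as pd

# Toy stand-in for the contactome; the last row is invented for illustration
index = pd.MultiIndex.from_tuples(
    [("A", 35, "SER", "OG", 0),
     ("A", 70, "SER", "OG", 1),
     ("A", 70, "SER", "OG", 2),
     ("A", 71, "LEU", "H", 1)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
toy = pd.DataFrame(
    {"chain": [None, "A", "A", "A"],
     "resid": [None, 120, 71, 70],
     "resname": [None, "ASN", "LEU", "SER"],
     "name": [None, "ND2", "H", "OG"],
     "bond": [None, 2.84, 2.93, 2.93]},
    index=index,
)

# Boolean mask on one index level, as in the cell above
ser_mask = toy[toy.index.get_level_values("Resname") == "SER"]

# Same selection with xs; drop_level=False keeps the Resname level in the index
ser_xs = toy.xs("SER", level="Resname", drop_level=False)

print(ser_mask)
```

The mask form generalizes more easily to arbitrary conditions, while `xs` reads better when selecting a single label.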


You can also retrieve information on the contacts from more than one amino acid type:

In [None]:
prova.loc(axis=0)[:,:,('GLU','ASP'),:]
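
As a hedged illustration of selecting several labels on one level, the sketch below uses a toy DataFrame (the residue identifiers come from the charge-clash output above, but the bond values are made up) and compares `pd.IndexSlice` with `Index.isin`. Lists are used for the labels because pandas treats lists as "any of these labels" on MultiIndex levels.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 70, "SER", "OG", 1),
     ("A", 1061, "ASP", "OD2", 1),
     ("A", 1064, "GLU", "OE1", 1)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
# Bond distances are placeholders, not values from the real contactome
toy = pd.DataFrame({"bond": [2.84, 3.0, 3.0]}, index=index).sort_index()

idx = pd.IndexSlice
# All chains, all residue numbers, Resname in the given list
acidic_loc = toy.loc[idx[:, :, ["ASP", "GLU"]], :]

# Equivalent boolean-mask form
acidic_isin = toy[toy.index.get_level_values("Resname").isin(["ASP", "GLU"])]

print(acidic_loc)
```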

By using the names of the different levels you can do a more specific search. For example, to get all the information for a specific residue:

In [None]:
prova.iloc[prova.index.get_level_values('Resid') == 228]

Another possible action is to retrieve all the bond distances of a specific residue as a numpy array:

In [None]:
np.asarray(prova.loc(axis=0)[:,228]['bond']) #All the bonds
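
Once the distances are in a numpy array, the usual array operations apply. A small sketch on a toy DataFrame, using the SER 70 and SER 99 distances shown in the first table (the single `bond` column is a simplification of the real object):

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 70, "SER", "OG", 1), ("A", 70, "SER", "OG", 2),
     ("A", 70, "SER", "OGd", 1), ("A", 70, "SER", "OGd", 2),
     ("A", 99, "SER", "OG", 1)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
toy = pd.DataFrame({"bond": [2.84, 2.93, 2.72, 3.5, 2.22]}, index=index)

# All bond distances of residue 70, regardless of chain or atom name
bonds = toy.xs(70, level="Resid")["bond"].to_numpy()

print(bonds)
print("shortest contact:", bonds.min())
```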

## 2) Filtering for at least one hit

Some residues in the contactome do not interact with anything else. To get only the residues that have one or more contacts, run the following command:

In [None]:
prova.iloc[prova.index.get_level_values('n') != 0]
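
A minimal sketch of the same filter on toy data, assuming (as the first table suggests) that entries with n = 0 carry no bond distance. Under that assumption, dropping rows with a missing `bond` gives the same result, and the n = 0 rows themselves list the atoms left without any contact.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 35, "SER", "OG", 0), ("A", 70, "SER", "OG", 1),
     ("A", 70, "SER", "OG", 2), ("A", 99, "SER", "OGd", 0)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
toy = pd.DataFrame({"bond": [float("nan"), 2.84, 2.93, float("nan")]}, index=index)

n_level = toy.index.get_level_values("n")

# Entries with at least one hit
with_hits = toy[n_level != 0]

# Alternative, valid only if n == 0 always means "no bond distance"
with_bond = toy.dropna(subset=["bond"])

# Atoms without any contact, identified without the hit counter
isolated = toy[n_level == 0].index.droplevel("n")

print(with_hits)
print(list(isolated))
```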

## 3) Contactome methods

### 3.1 Narrow

Returns a new Contactome object with the first hit (n=1) and all the following hits whose difference in bond distance is equal to or lower than a given cutoff. By default the cutoff is set to 0.5.

In [None]:
prova.narrow()

In [None]:
prova.narrow(0.2)

In [None]:
prova.narrow(0)
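
The filtering that narrow performs can be approximated on a plain DataFrame. The sketch below is not the method's implementation: `narrow_sketch` is a hypothetical helper, and it assumes that the difference is measured against the first hit of each query atom, that hits are stored in order of n, and that entries without hits (n = 0) are dropped. The distances are the SER 70 and SER 99 values from the first table.

```python
import pandas as pd

def narrow_sketch(df, cutoff=0.5):
    """Keep, per query atom, the first hit and the hits within `cutoff` of its distance."""
    hits = df[df.index.get_level_values("n") != 0]
    # One group per query atom: every index level except the hit counter n
    first = hits.groupby(level=["Chain", "Resid", "Resname", "Name"])["bond"].transform("first")
    return hits[(hits["bond"] - first) <= cutoff]

index = pd.MultiIndex.from_tuples(
    [("A", 70, "SER", "OG", 1), ("A", 70, "SER", "OG", 2),
     ("A", 70, "SER", "OGd", 1), ("A", 70, "SER", "OGd", 2),
     ("A", 99, "SER", "OG", 1), ("A", 99, "SER", "OG", 2), ("A", 99, "SER", "OG", 3)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
toy = pd.DataFrame({"bond": [2.84, 2.93, 2.72, 3.5, 2.22, 2.8, 2.96]}, index=index)

default_narrow = narrow_sketch(toy)        # cutoff 0.5
first_hits_only = narrow_sketch(toy, 0)    # cutoff 0 leaves only the first hits
print(default_narrow)
```

Under these assumptions, a cutoff of 0 reduces every query atom to its first hit, which matches the last call shown above.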

### 3.2 Viewer

Opens a VMD visualization with the contactome information for a given query.

In [None]:
prova.viewer(my_pdb,'A',245)

You can set the argument displayChains=False to show only the query amino acid and its interactions, without the rest of the chain:

In [None]:
prova.viewer(my_pdb,'A',245,displayChains=False)

### 3.3 get_chn_int

Returns a pandas.DataFrame object with the contactome entries that interact with other chains:

In [None]:
prova.get_chn_int()
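
One way to picture what this method returns is as a comparison between the Chain level of the index (the query atom) and the chain column (its partner). The sketch below works under that assumption on invented rows; it is not the implementation of get_chn_int.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 70, "SER", "OG", 1), ("A", 70, "SER", "OG", 2), ("B", 70, "SER", "OG", 1)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
# Partner chains and distances are made up for illustration
toy = pd.DataFrame({"chain": ["A", "B", "B"], "bond": [2.84, 3.1, 2.9]}, index=index)

query_chain = toy.index.get_level_values("Chain").to_numpy()
partner_chain = toy["chain"].to_numpy()

# Keep entries that have a partner and whose partner sits on a different chain
inter_chain = toy[toy["chain"].notna() & (partner_chain != query_chain)]
print(inter_chain)
```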

### 3.4 check

Calls the functions 'charge_test' and 'his_test' and returns a pandas.DataFrame object with a summary of the results of both functions.

In [None]:
prova.check(my_pdb)

## 4) Destroying multi-index DataFrame

Removes the multi-index format so that the standard pandas.DataFrame methods can be used. All the contactome methods are disabled in this format.

In [None]:
prova.reset_index()
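
A short sketch of what the flat format looks like, using a plain pandas DataFrame with rows taken from the first table as a stand-in for the contactome. Because the index levels are capitalized (Chain, Resid, ...) and the partner columns are lowercase (chain, resid, ...), reset_index can turn the levels into columns without name collisions.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 70, "SER", "OG", 1), ("A", 70, "SER", "OG", 2), ("A", 99, "SER", "OG", 1)],
    names=["Chain", "Resid", "Resname", "Name", "n"],
)
toy = pd.DataFrame(
    {"chain": ["A", "A", "A"], "resid": [120, 71, 101],
     "resname": ["ASN", "LEU", "ILE"], "name": ["ND2", "H", "H"],
     "bond": [2.84, 2.93, 2.22]},
    index=index,
)

flat = toy.reset_index()

# Index levels are now ordinary columns and can be combined in boolean filters
short_contacts = flat[(flat["Resid"] == 70) & (flat["bond"] < 2.9)]

print(flat.columns.tolist())
print(short_contacts)
```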