In [193]:
import pytraj as pt
import numpy as np
import nglview as nv
import pandas as pd

We are interested in titrating all LYS, HIS, and CYS residues as well as the ASP residue in the R-D lock.

The original PDB residue numbers of the ARG and ASP in the R-D lock are 375 and 354, respectively.

First, let's view the structure with nglview and mark the R-D salt-bridge residues.

A screenshot is embedded here in case you are unable to run the nglview cell below.

<img src="RD_lock_residues.png" height="360" width="480" alt="Screenshot of RD lock">

In [165]:
struc=pt.load('structures/R206H.pdb',
              top='structures/step2_solvator.psf')
view=nv.show_pytraj(struc)
view.clear_representations()
view.add_representation('cartoon')
view.add_representation('ball+stick',selection='.CA')
view.add_representation('ball+stick',selection='[ARG] and 375')
view.add_representation('ball+stick',selection='[ASP] and 354')
view

NGLWidget()

Rather than hunting down all the HIS, LYS, and CYS residues visually and jotting them down,
we can get pytraj and pandas to do the grunt work for us.

We will work from the CHARMM-GUI PSF to build a residue info table using pandas and the
resinfo command.

In [194]:
pdbStruc=pt.load('structures/R206H.pdb',top='structures/step2_solvator.psf')
print(pdbStruc)
resinfoDat=pt.resinfo(pdbStruc.top,'resinfo @CA',pdbStruc[0],task='resinfo').split('\n')
resinfoList=[tuple(infoLine.split()) for infoLine in resinfoDat[1:] if len(infoLine.split())>0]
resinfoHeader=resinfoDat[0].replace('#','').split()
resinfoFrame=pd.DataFrame(resinfoList,columns=resinfoHeader)
resinfoFrame.head()

pytraj.Trajectory, 1 frames: 
Size: 0.000000 (GB)
<Topology: 143628 atoms, 46195 residues, 45762 mols, non-PBC>
           


   Res Name First Last Natom Orig Mol
0    1  THR     1   16    16  172   1
1    2  THR    17   30    14  173   1
2    3  ASN    31   44    14  174   1
3    4  VAL    45   60    16  175   1
4    5  GLY    61   67     7  176   1
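
The parsing step in the cell above can be tried without pytraj or the structure files. The sketch below is a minimal stand-alone version, assuming the resinfo text is whitespace-delimited with a `#`-prefixed header line (as implied by the `replace('#','')` call). The `sample_text` string and the `parse_resinfo` helper are hypothetical names introduced here; the rows are copied from the table head shown above. Unlike the original cell, it also converts the numeric columns to integers, so later comparisons do not need a separate cast.

```python
import pandas as pd

# Sample text copied from the first rows of the table above; the exact
# header layout ('#Res Name ...') is an assumption.
sample_text = """#Res Name First Last Natom Orig Mol
1 THR 1 16 16 172 1
2 THR 17 30 14 173 1
3 ASN 31 44 14 174 1
4 VAL 45 60 16 175 1
5 GLY 61 67 7 176 1
"""


def parse_resinfo(text):
    """Turn whitespace-delimited residue info text into a typed DataFrame."""
    # Drop blank lines, then split off the header from the data rows.
    lines = [line for line in text.split('\n') if line.split()]
    header = lines[0].replace('#', '').split()
    rows = [line.split() for line in lines[1:]]
    frame = pd.DataFrame(rows, columns=header)
    # Every column except the residue name holds integers.
    int_columns = {name: int for name in header if name != 'Name'}
    return frame.astype(int_columns)


sample_frame = parse_resinfo(sample_text)
```

With integer columns, a query such as `sample_frame[sample_frame.Orig == 174]` works directly.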


Now that we have the residue information for all protein residues in a data frame,
we can easily query the residues we need.

For cpinutil.py we need the corresponding entries of the 'Res' column, which follows
the residue ordering that cpptraj / pytraj use. tleap puts the residues in the same
order.
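
To make the distinction between the two numbering schemes concrete, here is a small sketch of translating original PDB residue numbers ('Orig') into sequential indices ('Res'). The `example_frame` data and the `res_from_orig` helper are hypothetical, built from the five rows printed above, and the lookup assumes each original number appears only once in the table (a single protein chain).

```python
import pandas as pd

# Rows copied from the residue table printed above.
example_frame = pd.DataFrame({
    'Res':  [1, 2, 3, 4, 5],
    'Name': ['THR', 'THR', 'ASN', 'VAL', 'GLY'],
    'Orig': [172, 173, 174, 175, 176],
})


def res_from_orig(frame, orig_ids):
    """Return the sequential 'Res' indices matching original PDB numbers."""
    # Cast to int so the lookup also works when the columns hold strings.
    matches = frame[frame['Orig'].astype(int).isin(orig_ids)]
    return matches['Res'].astype(int).tolist()


res_ids = res_from_orig(example_frame, [174, 176])
```

Here `res_ids` holds the 'Res' values for ASN 174 and GLY 176, which are the numbers that would be passed on rather than the original PDB numbers.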

In [197]:
titratable_residues_table=resinfoFrame[
    (resinfoFrame.Name.isin(['LYS','CYS','HSD','HIP','HIS'])) |
    (pd.Series(resinfoFrame.Orig,dtype=int)==354)]
print('found %d titratable residue entries' % titratable_residues_table.shape[0])
print('--- top 5 rows of titratable residue table ---')
print(titratable_residues_table.head())
titratable_residues_table.to_csv('titratable_residues_table.csv',index=False)
np.savetxt('titrate_res_list.txt',np.array(titratable_residues_table.Res,dtype=int))
' '.join(np.array(titratable_residues_table.Res,dtype=str))

found 48 titratable residue entries
--- top 5 rows of titratable residue table ---
   Res Name First Last Natom Orig Mol
14  15  HSD   196  212    17  186   1
16  17  CYS   224  234    11  188   1
34  35  HSD   472  488    17  206   1
41  42  CYS   592  602    11  213   1
44  45  LYS   626  647    22  216   1


'15 17 35 42 45 64 72 88 103 113 115 134 147 149 158 163 167 169 174 175 179 180 183 190 208 224 229 257 275 278 301 304 306 321 322 326 345 350 353 362 363 372 375 380 401 415 422 433'
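
One detail worth noting about the saved list: `np.savetxt` uses the format `'%.18e'` by default, so integer indices are written in scientific notation (for example `1.500000000000000000e+01`). If a plain integer file is preferred, a sketch along the following lines may help. It uses only the first five 'Res' values from the output above and writes to a hypothetical scratch path so that no real file is overwritten.

```python
import os
import tempfile

import numpy as np

# First five 'Res' values from the output above.
res_ids = np.array([15, 17, 35, 42, 45], dtype=int)

# Hypothetical scratch location so the sketch does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), 'titrate_res_list_int.txt')

# fmt='%d' writes plain integers, one per line.
np.savetxt(out_path, res_ids, fmt='%d')

with open(out_path) as handle:
    written_lines = handle.read().split()

# Reading back with an integer dtype recovers the original indices,
# which can then be joined into a space-separated string.
reloaded = np.loadtxt(out_path, dtype=int)
res_string = ' '.join(str(value) for value in reloaded)
```

The resulting `res_string` has the same space-separated form as the string shown above.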