In [None]:
(c) Copyright Rosetta Commons Member Institutions.
(c) This file is part of the Rosetta software suite and is made available under license.
(c) The Rosetta software is developed by the contributing members of the Rosetta Commons.
(c) For more information, see http://www.rosettacommons.org. Questions about this can be
(c) addressed to University of Washington CoMotion, email: license@uw.edu.

## This runs on the older PyRosetta bindings
---
---

## Running Symmetry without Perl !!!!!! Gasp!!!! Sacrilege!!!!!

In order for Rosetta to model a symmetric protein and make use of the fancy time saving machinery it has, one must use the perl script written by Frank.... Some call it black magic, others call it a black box.

A while back, thanks to Javier's efforts, some code was added so that you could *auto* detect Cn symmetries as a mover! This is awesome... but sometimes, for some proteins, it just doesn't work: the C++ has reasonable, hard-coded cutoffs that don't suit every protein, and it dies on those. Much of the code here is to his credit.

This is an implementation a la PyRosetta that lets a researcher simply load a protein (which may have Cn symmetry) and, after some math, call symmetric movers, score with symmetric scorefunctions and do sweet symmetry stuff.

Note: This currently will not work for Dn symmetries (Until Stephanie figures that out ;) )

Imagine you have a protein (say KIVD- who works on those anyways?) that has C2 symmetry, and at the active site there are catalytic residues from both chains. Suppose we wanted to design on a related sequence that doesn't have a crystal structure; then we would need to build a homology model. Unfortunately, homology modeling already takes a tremendous amount of compute resources, so when you double the size of the protein (541\*2), it simply takes forever.

Fortunately, the homology modeling code (Yifan & Frank) takes a symmetry definition file.... like the ones that come from Frank's Perl script (make_symm_def_file.pl). However, when you create 1 symmetry definition file, the symmetry is static! Why is this a problem for us? Well, say I am creating a homology model of a relative of 2vbg, and there are 10 available templates. Each template might interact slightly differently at the interface, and if I ran each template through Frank's Perl script, I would get 10 (slightly) different answers.

So, because in the future I think homology modeling of symmetric interfaces should be able to A) use multiple templates and B) sample these different orientations and symmetries at the same time, I wrote this notebook.

In the near future, imagine modeling symmetric interfaces while sampling different interface and symmetry perturbations... It might help us make better models.


## Setup the Functions and Maths needed for symmetry
This essentially is a copy of some of the code in src/numeric/xyzfunctions.hh

There are no bindings for these because they are templates; ideally there would be bindings for all core types (and std types).

The functions defined later in this notebook take an xyzVector and calculate the angle of its projection relative to the x, y and z axes.

Note: Rosetta's symmetry machinery assumes that the center of mass of the Pose is at the origin (0,0,0). See the symmetry documentation for more details, or compare what happens here with what happens when you run the Perl script (load the input files and the files it creates to be run in Rosetta).
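
To make the "center of mass at the origin" requirement concrete, here is a minimal plain-Python sketch that does not use PyRosetta at all. The helper names `centroid` and `center_on_origin` and the toy coordinates are made up for illustration, and the centroid is an unweighted mean of the points; which atoms Rosetta actually averages over is not shown here.

```python
def centroid(points):
    """Return the unweighted mean of a list of (x, y, z) tuples."""
    n = float(len(points))
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def center_on_origin(points):
    """Translate all points so that their centroid sits at (0, 0, 0)."""
    cx, cy, cz = centroid(points)
    return [(x - cx, y - cy, z - cz) for (x, y, z) in points]


# Toy coordinates (hypothetical, not taken from any PDB file)
coords = [(1.0, 2.0, 3.0), (3.0, 2.0, 1.0), (5.0, 5.0, 5.0)]
centered = center_on_origin(coords)
```

The translation is the same idea as applying an identity rotation plus the negated center of mass to the pose, which is what the notebook does further down.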

In [28]:
# This is needed for some builds
import rosetta
import pyrosetta


This next section copies other code, again mostly because there are no Python bindings that interface with the C++ code correctly.

These functions create rotation matrices from an angle of rotation: the `*_degrees` variants take the angle in degrees, the others take it in radians.

In [35]:
import math
import rosetta.numeric as numeric

def x_rotation_matrix( theta ):
    sin_theta = math.sin(theta)
    cos_theta = math.cos(theta)
    mat = numeric.xyzMatrix_double_t()
    print "made mat"
    print mat
    
    newmat = mat.rows( 1, 0, 0,
            0, cos_theta, -sin_theta,
             0, sin_theta, cos_theta
            )
    print "changed mat"
    print newmat
    return newmat
    
def x_rotation_matrix_degrees( theta ):
    return x_rotation_matrix( math.radians(theta))

def z_rotation_matrix( theta ):
    sin_theta = math.sin(theta)
    cos_theta = math.cos(theta)
    mat = numeric.xyzMatrix_double_t()
    print "made mat"
    print mat
    
    newmat = mat.rows( cos_theta, -sin_theta, 0,
            sin_theta, cos_theta, 0,
             0, 0, 1
            )
    print "changed mat"
    print newmat
    return newmat
    
def z_rotation_matrix_degrees( theta ):
    return z_rotation_matrix( math.radians(theta))

def y_rotation_matrix( theta ):
    sin_theta = math.sin(theta)
    cos_theta = math.cos(theta)
    mat = numeric.xyzMatrix_double_t()
    print "made mat"
    print mat
    
    newmat = mat.rows( cos_theta, 0 , sin_theta,
            0, 1, 0,
             -sin_theta, 0, cos_theta
            )
    print "changed mat"
    print newmat
    return newmat
    
def y_rotation_matrix_degrees( theta ):
    return y_rotation_matrix( math.radians(theta))


## This is an example
f = x_rotation_matrix_degrees( 180 )
print type(f)

#numeric.xyz

made mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17928>
changed mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17a40>
<class 'rosetta.numeric.xyzMatrix_double_t'>
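
Since the printed matrices above only show object addresses, a quick way to sanity-check the conventions is a plain-Python version using nested lists. This is a sketch for checking signs, not the PyRosetta code path; `x_rotation`, `z_rotation` and `rotate` are illustrative names, and the matrices use the same row layout as the cell above.

```python
import math


def x_rotation(theta_deg):
    """3x3 rotation about the x axis, angle in degrees, as nested lists."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def z_rotation(theta_deg):
    """3x3 rotation about the z axis, angle in degrees, as nested lists."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate(mat, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(mat[r][k] * v[k] for k in range(3)) for r in range(3))


# A 90 degree turn about z carries the x axis onto the y axis;
# a 180 degree turn about x flips the y axis.
x_to_y = rotate(z_rotation(90.0), (1.0, 0.0, 0.0))
y_flipped = rotate(x_rotation(180.0), (0.0, 1.0, 0.0))
```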


---
---


# Let's begin with rosetta

I highly recommend that before you start, you also open PyMOL and plot the xyz axes

Once you can see the protein (see next few cells) be sure to clean up (cartoon and color by chain)

In [None]:
from rosetta import *

In [9]:
pyrosetta.init('-ignore_unrecognized_res T -ignore_waters T -preserve_header T -prevent_repacking T')

Found rosetta database at: /home/stephanie/anaconda2/envs/DebugPyrosetta/lib/python2.7/site-packages/pyrosetta-4.0-py2.7.egg/database; using it....
PyRosetta-4 2016 [Rosetta 2016 unknown:9ea8e5e15e7c35838a32b8089ca8351ff540888c 2016-12-16 10:52:45 -0500] retrieved from: git@github.com:RosettaCommons/main.git
(C) Copyright Rosetta Commons Member Institutions.
Created in JHU by Sergey Lyskov and PyRosetta Team.



In [30]:
def tmalign( pose, ref_pose ):
    
    print 'Running tmalign on poses'
    tm = rosetta.protocols.hybridization.TMalign()
    tm.apply(pose, ref_pose)
    longest = max(pose.total_residue()+1, ref_pose.total_residue()+1)
    
    print 'TMScore = %s ' %tm.TMscore(longest)
    atommap =  rosetta.core.id.AtomID_Map_core_id_AtomID_t()
    rosetta.core.pose.initialize_atomid_map_AtomID( atommap, pose )
    tm.alignment2AtomMap( pose, ref_pose, atommap )
    aln_cutoff = rosetta.utility.vector1_double()
    for i in [2.0,1.5,1.0,0.5]:
        aln_cutoff.append(i)
    min_coverage = .2
    # align pose onto ref_pose (the arguments of this function, not undefined pose1/pose2)
    rosetta.protocols.hybridization.partial_align(pose, ref_pose, atommap, True, aln_cutoff, min_coverage)

In [10]:
# Read in a pose and setup the pymol observer, you have to score it to get it into pymol

p = pyrosetta.pose_from_file('2VBF_cleaned.pdb')
pm = pyrosetta.PyMolMover()
pm.keep_history(True)
pm.apply(p)
# pyobs = PyMOL_Observer() 
# pyobs.add_observer(p) 
sfxn = pyrosetta.get_fa_scorefxn()
sfxn(p) 


2343.720084716446

Notice that the protein is not at the origin

In [11]:
## Take a look at your protein and the weird chain endings
print p
print p.conformation().chain_endings()

# Fix the weird chain endings (so that later, we can create a nice foldtree)
p.update_pose_chains_from_pdb_chains()
print p.chain_sequence(1)
print p.conformation().chain_endings()

PDB file name: 2VBF_cleaned.pdb
Total residues:1082
Sequence: MYTVGDYLLDRLHELGIEEIFGVPGDYNLQFLDQIISREDMKWIGNANELNASYMADGYARTKKAAAFLTTFGVGELSAINGLAGSYAENLPVVEIVGSPTSKVQNDGKFVHHTLADGDFKHFMKMHEPVTAARTLLTAENATYEIDRVLSQLLKERKPVYINLPVDVAAAKAEKPALSLENTTEQVILSKIEESLKNAQKPVVIAGHEVISFGLEKTVTQFVSETKLPITTLNFGKSAVDESLPSFLGIYNGKLSEISLKNFVESADFILMLGVKLTDSSTGAFTHHLDENKMISLNIDEGIIFNKVVEDFDFRAVVSSLSELKGIEYEGQYIDKQYEEFIPSSAPLSQDRLWQAVESLTQSNETIVAEQGTSFFGASTIFLKSNSRFIGQPLWGSIGYTFPAALGSQIADKESRHLLFIGDGSLQLTVQELGLSIREKLNPICFIINNDGYTVEREIHGPTQSYNDIPMWNYSKLPETFGATEDRVVSKIVRTENEFVSVMKEAQADVNRMYWIELVLEKEDAPKLLKKMGKLFAEQNKMYTVGDYLLDRLHELGIEEIFGVPGDYNLQFLDQIISREDMKWIGNANELNASYMADGYARTKKAAAFLTTFGVGELSAINGLAGSYAENLPVVEIVGSPTSKVQNDGKFVHHTLADGDFKHFMKMHEPVTAARTLLTAENATYEIDRVLSQLLKERKPVYINLPVDVAAAKAEKPALSLENTTEQVILSKIEESLKNAQKPVVIAGHEVISFGLEKTVTQFVSETKLPITTLNFGKSAVDESLPSFLGIYNGKLSEISLKNFVESADFILMLGVKLTDSSTGAFTHHLDENKMISLNIDEGIIFNKVVEDFDFRAVVSSLSELKGIEYEGQYIDKQYEEFIPSSAPLSQDRLWQAVESLTQSNETIVAEQGTSFFGASTIFLKSNSRFIGQPLWGS

In [13]:
#### Now starts the pyrosetta implementation of DetectSymmetry:apply
## We might be able to skip this if we can pull the data from the mmcif pdb file or if
## we know a priori what kind of symmetry exists in the family of proteins we want to model
n_jumps = p.num_jump()
symmetric_type = n_jumps +1
seq1 = p.chain_sequence(1)
#pose made from chain A seq
print len(seq1)
new_pose = rosetta.core.pose.Pose(p, 1, len(seq1))
print new_pose

541
PDB file name: 
Total residues:541
Sequence: MYTVGDYLLDRLHELGIEEIFGVPGDYNLQFLDQIISREDMKWIGNANELNASYMADGYARTKKAAAFLTTFGVGELSAINGLAGSYAENLPVVEIVGSPTSKVQNDGKFVHHTLADGDFKHFMKMHEPVTAARTLLTAENATYEIDRVLSQLLKERKPVYINLPVDVAAAKAEKPALSLENTTEQVILSKIEESLKNAQKPVVIAGHEVISFGLEKTVTQFVSETKLPITTLNFGKSAVDESLPSFLGIYNGKLSEISLKNFVESADFILMLGVKLTDSSTGAFTHHLDENKMISLNIDEGIIFNKVVEDFDFRAVVSSLSELKGIEYEGQYIDKQYEEFIPSSAPLSQDRLWQAVESLTQSNETIVAEQGTSFFGASTIFLKSNSRFIGQPLWGSIGYTFPAALGSQIADKESRHLLFIGDGSLQLTVQELGLSIREKLNPICFIINNDGYTVEREIHGPTQSYNDIPMWNYSKLPETFGATEDRVVSKIVRTENEFVSVMKEAQADVNRMYWIELVLEKEDAPKLLKKMGKLFAEQNK
Fold tree:
FOLD_TREE  EDGE 1 541 -1 


In [15]:
## Go through each subunit (chain) and calculate the CA RMSD between chain A and chain X,
## to make sure all of the 'symmetric' chains really are symmetric

i = 1
while i != symmetric_type:
    #i +=1
    if len(seq1) != len(p.chain_sequence(2)):
        print "Subunits have different length sequences"
    
    test_pose = rosetta.core.pose.Pose(p, i*len(seq1)+1,(i+1)*len(seq1))
    rms = rosetta.core.scoring.CA_rmsd(new_pose,test_pose)
    print rms
    i +=1

0.374408006668
nan
nan
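
The residue arithmetic in the loop above (`i*len(seq1)+1` to `(i+1)*len(seq1)`) can be sketched in plain Python. The helper `subunit_ranges` is a hypothetical name, and deriving the subunit count from the residue totals rather than from the jump count is an assumption about what the loop intends, not a description of what DetectSymmetry does.

```python
def subunit_ranges(total_residues, subunit_length):
    """Return 1-based, inclusive (first, last) residue ranges, one per subunit.

    Assumes every subunit has the same length and that the subunits are
    stored back to back in the pose.
    """
    if subunit_length <= 0 or total_residues % subunit_length != 0:
        raise ValueError("total_residues must be a positive multiple of subunit_length")
    n_subunits = total_residues // subunit_length
    return [(i * subunit_length + 1, (i + 1) * subunit_length)
            for i in range(n_subunits)]


# Numbers from this notebook: 1082 residues in the pose, 541 in chain A
ranges = subunit_ranges(1082, 541)
```

With these numbers there are two ranges, (1, 541) and (542, 1082), so only one chain-A-versus-chain-X comparison stays inside the pose.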


The next few cells first translate the input protein so that its center of mass (COM) sits at the origin. The cells after that apply rotations that align the COM of each chain with the XY, XZ and YZ planes.

In [25]:
### This first move moves the proteins COM to the origin (0,0,0)
#id_rot_mat = numeric.xyzMatrix_Real.identity()
#numeric.xyz
id_rot_mat = numeric.xyzMatrix_double_t.identity()
cm_pose = rosetta.core.pose.center_of_mass(p,1,p.total_residue())
print "starting cm"
print cm_pose
# print type(cm_pose)
# print type(id_rot_mat)
## Apply the negative of the COM as a translation, which places the pose at the origin
p.apply_transform_Rx_plus_v(id_rot_mat, cm_pose.negate())
cm_pose.negate()
print id_rot_mat
#sfxn(p)
print "newcom"
newcom = rosetta.core.pose.center_of_mass(p,1,p.total_residue())
print newcom

starting cm
  3.652859489524545E-16   1.031009330077040E-15  -6.566938408134013E-18
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17110>
newcom
  3.513312048351697E-16   1.027725860872973E-15  -6.566938408134013E-18


In [33]:
from rosetta.protocols.rigid import *
from rosetta.core.pose.symmetry import *
from rosetta.numeric import *
import rosetta.numeric as numeric

## turns out i need some private inline functions
#	inline core::Real angle_with_x_axis_proj_y( xyzVector const & v) const { return numeric::dihedral_degrees(xyzVector(v[0],1,v[2]), xyzVector(0,1,0), xyzVector(0,0,0), xyzVector(1,0,0)); }
def angle_with_x_axis_proj_y( some_xyzVector ):
    return numeric.dihedral_degrees_double_t( numeric.xyzVector_double_t ( some_xyzVector[0],1,some_xyzVector[2]), numeric.xyzVector_double_t(0,1,0), numeric.xyzVector_double_t( 0,0,0 ), numeric.xyzVector_double_t( 1,0,0 ) )

def angle_with_y_axis_proj_x( some_xyzVector ):
    return numeric.dihedral_degrees_double_t( numeric.xyzVector_double_t( 1,some_xyzVector[1],some_xyzVector[2]), numeric.xyzVector_double_t(1,0,0), numeric.xyzVector_double_t( 0,0,0 ), numeric.xyzVector_double_t( 0,1,0 ) )

def angle_with_y_axis_proj_z( some_xyzVector ):
    return numeric.dihedral_degrees_double_t( numeric.xyzVector_double_t( some_xyzVector[0],some_xyzVector[1],1), numeric.xyzVector_double_t(0,0,1), numeric.xyzVector_double_t( 0,0,0 ), numeric.xyzVector_double_t( 0,1,0 ) )

def angle_with_z_axis_proj_y( some_xyzVector ):
    return numeric.dihedral_degrees_double_t( numeric.xyzVector_double_t( some_xyzVector[0],1,some_xyzVector[2]), numeric.xyzVector_double_t(0,1,0), numeric.xyzVector_double_t( 0,0,0 ), numeric.xyzVector_double_t( 0,0,1 ) )



## align the com of chain A with Y axis
#first, rotate around x to align com of chain A with xy plane
cm_chain_A = rosetta.core.pose.center_of_mass(p,1,len(seq1))
angle_y1 = angle_with_y_axis_proj_x( cm_chain_A )
x_rot = x_rotation_matrix_degrees( -1*angle_y1 )
p.apply_transform_Rx_plus_v( x_rot, numeric.xyzVector_double_t(0,0,0))
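# NOTE: residues 1..total_residue() give the COM of the whole pose, which is already
# at the origin, so the check below passes trivially; 1..len(seq1) would test chain A alone.
new_cm_chain_A = rosetta.core.pose.center_of_mass(p,1,p.total_residue())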
print new_cm_chain_A
print "plane tolerance: "
print new_cm_chain_A[2]
if (new_cm_chain_A[2] > -1*0.05) and (new_cm_chain_A[2] < .05):
    print "pass plane check x"
#sfxn(p)


made mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17880>
changed mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17420>
  3.513312048351697E-16   8.029723938545863E-15  -1.083544837342112E-15
plane tolerance: 
-1.08354483734e-15
pass plane check x
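
The idea behind "measure the angle, then rotate by its negative" can be sketched without PyRosetta. This assumes that, for a vector projected onto the yz-plane, the dihedral used in `angle_with_y_axis_proj_x` reduces to the signed angle `atan2(z, y)`; the sign convention of `dihedral_degrees_double_t` is not verified here, and both helper names and the test vector are made up.

```python
import math


def angle_from_y_in_yz_plane(v):
    """Signed angle in degrees from +y to the projection of v on the yz-plane."""
    return math.degrees(math.atan2(v[2], v[1]))


def rotate_about_x(v, theta_deg):
    """Rotate a 3-vector about the x axis by theta_deg degrees."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    x, y, z = v
    return (x, c * y - s * z, s * y + c * z)


# Hypothetical chain COM; rotating by the negative angle drops it into the xy-plane
com = (2.0, 3.0, 4.0)
angle = angle_from_y_in_yz_plane(com)
flattened = rotate_about_x(com, -angle)
```

The x component is untouched, the z component goes to zero, and the y component picks up the full length of the yz projection, which is the situation the plane check is testing for.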


In [37]:
## now rotate about the Z axis
angle_y2 = angle_with_y_axis_proj_z(  new_cm_chain_A )
z_rot = z_rotation_matrix_degrees( -1*angle_y2)
p.apply_transform_Rx_plus_v( z_rot, numeric.xyzVector_double_t(0,0,0))
new2_cm_chain_A = rosetta.core.pose.center_of_mass(p,1,p.total_residue())
print new2_cm_chain_A[2]
if (new2_cm_chain_A[2] > -1*0.05) and (new2_cm_chain_A[2] < .05):
    print "pass plane check z"


made mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035d17ce0>
changed mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f203640d650>
-1.08354483734e-15
pass plane check z


Note now that the protein is centered over the origin, and has undergone some rotations

This next cell is just a sanity check that the z-coordinate of chain B's COM is also near 0

In [38]:
## now check the other chain's com
i = 1
cm_chain = rosetta.core.pose.center_of_mass(p, i*len(seq1)+1,(i+1)*len(seq1))
print "Chain B"
print cm_chain[2]

Chain B
-4.85953442202e-16


In [56]:
## Now perform C2 check plus one last rotation
# print p.residue(1).xyz("CA")
# print p.residue(len(seq1)+1).xyz("CA")

# Midpoint of the first CA atoms of chain A and chain B
vm_old = p.residue(1).xyz("CA") + p.residue(len(seq1)+1).xyz("CA")
vm = rosetta.numeric.xyzVector_double_t(vm_old.x/2, vm_old.y/2, vm_old.z/2)
angle_z_axis = angle_with_z_axis_proj_y( vm )
rot1 = y_rotation_matrix_degrees( -1*angle_z_axis )
print angle_z_axis 
print rot1 
print '-'*30
p.apply_transform_Rx_plus_v( rot1, numeric.xyzVector_double_t(0,0,0))

print p.residue(1).xyz("CA")
print p.residue(len(seq1)+1).xyz("CA")

vm_new_old = ( p.residue(1).xyz("CA") + p.residue(len(seq1)+1).xyz("CA"))
vm_new = rosetta.numeric.xyzVector_double_t(vm_new_old.x/2,vm_new_old.y/2,vm_new_old.z/2) #/2
print vm_new

print "This should be less than .001"
print vm_new[0]
print "This too "
print vm_new[1]

print '-'*30
print vm_new
#pm.apply(p)

made mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035ddff10>
changed mat
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035ddfb20>
0.0
<rosetta.numeric.xyzMatrix_double_t object at 0x7f2035ddfb20>
------------------------------
     -36.34637252436451       4.082909810076414       6.598750987474515
      36.34637252436451       3.706966120699662       7.015905183063775
      0.000000000000000       3.894937965388038       6.807328085269145
This should be less than .001
0.0
This too 
3.89493796539
------------------------------
      0.000000000000000       3.894937965388038       6.807328085269145
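
The check in the cell above rests on a simple property: if two atoms are related by a two-fold rotation about the Z axis, their midpoint lies on the Z axis, so its x and y components vanish. Below is a plain-Python sketch of that test. The helper names are illustrative, the 0.001 tolerance is the one quoted in the cell, and the two CA coordinates are copied from the output printed above.

```python
def midpoint(a, b):
    """Midpoint of two 3-vectors."""
    return tuple((a[k] + b[k]) / 2.0 for k in range(3))


def c2_partner_about_z(v):
    """Image of v under a 180 degree rotation about the Z axis."""
    return (-v[0], -v[1], v[2])


def on_z_axis(v, tol=1e-3):
    """True if the x and y components are both within tol of zero."""
    return abs(v[0]) < tol and abs(v[1]) < tol


# Ideal case: a point and its exact C2 image
ca_a = (-36.3, 4.1, 6.6)  # made-up coordinates
ideal_mid = midpoint(ca_a, c2_partner_about_z(ca_a))

# CA coordinates printed by the cell above (residue 1 and residue len(seq1)+1)
recorded_mid = midpoint(
    (-36.34637252436451, 4.082909810076414, 6.598750987474515),
    (36.34637252436451, 3.706966120699662, 7.015905183063775),
)
ideal_ok = on_z_axis(ideal_mid)
recorded_ok = on_z_axis(recorded_mid)
```

For the recorded coordinates the x component of the midpoint is zero but the y component is about 3.89, so this test does not pass at the 0.001 tolerance; that matches the "This too" value printed above.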


In [59]:
# Now, since the protein is in the right place, we just make a copy of the chain A
alignpose = rosetta.core.pose.Pose(p,1,len(seq1))
pm.apply(alignpose)

In [None]:
# here, we read in the generic sym files in the Rosetta database
# We can use ChainA and the symdef information to create a symmetric pose!
from rosetta.basic import *
from rosetta.core.conformation.symmetry import *

c2_symm = 'C2_Z.sym'
# the sym file is usually located at the below path
#c2_symm = '~/Rosetta/main/database/symmetry/cyclic/C2_Z.sym'
symdef = core.conformation.symmetry.SymmData()
symdef.read_symmetry_data_from_file( c2_symm )

# make_symmetric_pose modifies alignpose in place
core.pose.symmetry.make_symmetric_pose( alignpose, symdef )
p = alignpose

##
Well, that's it!

You should have seen a bunch of symmetry stuff flash on the screen.
We now have a pose inside of Rosetta that is symmetric!!! And not an ounce of Perl was used!

So we can do all the normal stuff like symmetric packing,
symmetric scoring, etc.

In [None]:
import rosetta.core.conformation.symmetry
import rosetta.core.pose.symmetry
from rosetta.core.scoring.symmetry import *
import rosetta.protocols.simple_moves.symmetry


#print p
symopts = rosetta.core.conformation.symmetry.SymmDataOptions()
#rosetta.core.conformation.symmetry.SymmetryInfo()
#So rosetta sees this as a symmetric pose
rosetta.core.conformation.symmetry.is_symmetric(p.conformation())


sym_sfxn = rosetta.core.scoring.symmetry.SymmetricScoreFunction()
sym_packer = protocols.simple_moves.symmetry.SymPackRotamersMover( sym_sfxn )
sym_min = protocols.simple_moves.symmetry.SymMinMover()
mm = MoveMap()
core.pose.symmetry.make_symmetric_movemap( p, mm)
#sym_packer.apply(p)

# But now how do I score it?
# and do i need to turn other options on?
#symopts.show()

In [None]:
print rosetta.core.conformation.symmetry.is_symmetric(p.conformation())

This PyRosetta lesson was brought to you by Steve Bertolani