### PDBArrows

This notebook writes a BILD file that draws arrows in Chimera between the CA atoms of equivalent residues in two structures. The workflow includes a sequence alignment to make sure the right residues are paired between the two sequences.

1. Align the two structures and save them in the same coordinate system. Decide the first and last residue numbers (in the numbering of the first structure) between which arrows are drawn.
2. Run Part 1 below to generate your input for sequence alignment.
3. Copy the output into your favourite alignment software (e.g. ClustalW) and save the result in Clustal format, making sure the sequences in the alignment are in the same order as the input.
4. Change the variables in Part 2 to point to the alignment and specify the bild output name.
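
As a rough illustration of what ends up in the output file, the sketch below formats BILD `.color` and `.arrow` lines for a couple of made-up coordinate pairs and skips pairs that are further apart than a cutoff. The helper names `bild_arrow` and `bild_lines` and the example coordinates are hypothetical and are not part of the notebook; the radii and cutoff simply mirror the values used in Part 2.

```python
import math

def bild_arrow(start, end, r1=0.2, r2=0.8, rho=0.75):
    """Format one BILD `.arrow` line from two (x, y, z) points."""
    coords = " ".join(f"{v:.3f}" for v in (*start, *end))
    return f".arrow {coords} {r1} {r2} {rho}"

def bild_lines(pairs, color="orchid", max_distance=12.0):
    """Return a `.color` line followed by one arrow per sufficiently close pair."""
    lines = [f".color {color}"]
    for a, b in pairs:
        # math.dist is the Euclidean distance (Python 3.8+)
        if math.dist(a, b) < max_distance:
            lines.append(bild_arrow(a, b))
    return lines

# Hypothetical coordinate pairs: the first is 5 apart, the second 30 apart
pairs = [((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
         ((0.0, 0.0, 0.0), (30.0, 0.0, 0.0))]
print("\n".join(bild_lines(pairs)))
```

With the default cutoff of 12, only the first pair produces an arrow.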


### Dependencies

In [1]:
from biopandas.pdb import PandasPdb
import pandas as pd
from pandas import DataFrame
from Bio import PDB
from Bio import AlignIO
import math

### Part 1
Change the variables below and run the cell to generate the input for the sequence alignment.

In [3]:
#########################################################
#Change these variables

"""
The input pdbs
"""
pdb1 = 'Human-dynein-onMT.pdb'
pdb2 = '3VKG-relativeAAA3L.pdb'

"""
A chimera-compatible color name. See here:
https://www.cgl.ucsf.edu/chimerax/docs/user/commands/colornames.html
"""
arrowcolor = "orchid"

"""
First and last residue numbers (pdb1 numbering) of the range to draw arrows for
"""
start = 2573
end = 3108

#########################################################

p1 = PandasPdb().read_pdb(pdb1).df['ATOM']
p2 = PandasPdb().read_pdb(pdb2).df['ATOM']

p1 = p1[p1['atom_name'] == 'CA']
p2 = p2[p2['atom_name'] == 'CA']

p1 = p1[['atom_name','residue_name', 'residue_number', 'x_coord', 'y_coord', 'z_coord']].reset_index()
p2 = p2[['atom_name','residue_name', 'residue_number', 'x_coord', 'y_coord', 'z_coord']].reset_index()

###############

d = {'CYS': 'C', 'ASP': 'D', 'SER': 'S', 'GLN': 'Q', 'LYS': 'K',
     'ILE': 'I', 'PRO': 'P', 'THR': 'T', 'PHE': 'F', 'ASN': 'N', 
     'GLY': 'G', 'HIS': 'H', 'LEU': 'L', 'ARG': 'R', 'TRP': 'W', 
     'ALA': 'A', 'VAL':'V', 'GLU': 'E', 'TYR': 'Y', 'MET': 'M', 'NA':'NA'}

p1_res = p1.residue_name.tolist()
p2_res = p2.residue_name.tolist()

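# Convert each structure separately: zipping the two lists would truncate
# both sequences to the length of the shorter one.
p1_res_fix = "".join(d[r] for r in p1_res)
p2_res_fix = "".join(d[r] for r in p2_res)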
    
print("Align these two. KEEP THE ORDER AS INPUT\n")
    
print(">" + pdb1.split("/")[-1].split(".")[0])
print(p1_res_fix)
print(">" + pdb2.split("/")[-1].split(".")[0])
print(p2_res_fix)

Align these two. KEEP THE ORDER AS INPUT

>211111-HsHC-rebuilt
QDLKGVWSELSKVWEQIDQMKEQPWVSVQPRKLRQNLDALLNQLKSFPARLRQYASYEFVQRLLKGYMKINMLVIELKSEALKDRHWKQLMKRLHVNWVVSELTLGQIWDVDLQKNEAIVKDVLLVAQGEMALEEFLKQIREVWNTYELDLVNYQNKCRLIRGWDDLFNKVKEHINSVSAMKLSPYYKVFEEDALSWEDKLNRIMALFDVWIDVQRRWVYLEGIFTGSADIKHLLPVETQRFQSISTEFLALMKKVSKSPLVMDVLNIQGVQRSLERLADLLGKIQKALGEYLERERSSFPRFYFVGDEDLLEIIGNSKNVAKLQKHFKKMFAGVSSIILNEDNSVVLGISSREGEEVMFKTPVSITEHPKINEWLTLVEKEMRVTLAKLLAESVTEVEIFGKATSIDPNTYITWIDKYQAQLVVLSAQIAWSENVETALSSMGGGGDAAPLHSVLSNVEVTLNVLADSVLMEQPPLRRRKLEHLITELVHQRDVTRSLIKSKIDNAKSFEWLSQMRFYFDPKQTDVLQQLSIQMANAKFNYGFEYLGVQDKLVQTPLTDRCYLTMTQALEARLGGSPFGPAGTGKTESVKALGHQLGRFVLVFNCDETFDFQAMGRIFVGLCQVGAWGCFDEFNRLEERMLSAVSQQVQCIQEALREHSNPNYDKTSAPITCELLNKQVKVSPDMAIFITMNPGYAGRSNLPDNLKKLFRSLAMTKPDRQLIAQVMLYSQGFRTAEVLANKIVPFFKLCDEQLSSQSHYDFGLRALKSVLVSAGNVKRERIQKIKREKEERGEAVDEGEIAENLPEQEILIQSVCETMVPKLVAEDIPLLFSLLSDVFPGVQYHRGEMTALREELKKVCQEMYLTYGDGEEVGGMWVEKVLQLYQITQINHGLMMVGPSGSGKSMAWRVLLKALERLEGVEGVAHIIDPKAISKDH
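
The three-letter to one-letter conversion in Part 1 raises a `KeyError` for any residue name missing from the dictionary (for example a modified residue). A minimal sketch of a more forgiving conversion is shown below; mapping unknown names to `X` is an assumption, chosen so that every CA atom still contributes exactly one character and the sequence stays in register with the coordinate list. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical.

```python
# Standard 20 amino acids; anything else is treated as unknown
THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}

def to_one_letter(residue_names, unknown='X'):
    """Join residue names into a one-letter sequence, one character per residue."""
    return "".join(THREE_TO_ONE.get(name.strip().upper(), unknown)
                   for name in residue_names)

# 'MSE' (selenomethionine) is not in the table, so it becomes 'X'
print(to_one_letter(['MET', 'LYS', 'MSE', 'GLY']))  # prints MKXG
```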

### Part 2

Change the variables below to load the alignment file and to generate the bild file.

In [8]:
#########################################################
#Change these variables

"""
The output from aligning the sequences above
"""

alignmentfile = 'AAA3-to3VKG.clustal_num'

"""
The name for the output file generated by this script
"""

bildoutputname = 'AAA3-to3VKG.bild'

"""
The maximum distance (in Å) between two paired CA atoms for an arrow to be drawn
"""

maxdistance = 12

#########################################################

alignment = AlignIO.read(alignmentfile, 'clustal')

p1ali = str(alignment[0].seq)
p2ali = str(alignment[1].seq)

p1coordsx = p1.x_coord.tolist()
p1coordsy = p1.y_coord.tolist()
p1coordsz = p1.z_coord.tolist()
p1_resnum = p1.residue_number.tolist()

p2coordsx = p2.x_coord.tolist()
p2coordsy = p2.y_coord.tolist()
p2coordsz = p2.z_coord.tolist()

p1coords_ali = []
p2coords_ali = []

i = 0  # index into the ungapped CA list of pdb1
for p in p1ali:
    if p != "-":
        p1coords_ali.append([p1coordsx[i], p1coordsy[i], p1coordsz[i]])
        i+=1
    else:
        p1coords_ali.append([0])
        
i = 0
for p in p2ali:
    if p != "-":
        p2coords_ali.append([p2coordsx[i], p2coordsy[i], p2coordsz[i]])
        i+=1
    else:
        p2coords_ali.append([0])
       
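coordinates = []
k = 0  # index into p1_resnum, advanced once per non-gap column of pdb1
for i,j in zip(p1coords_ali, p2coords_ali):
    if i != [0]:
        # Look up the actual residue number so that numbering gaps
        # (e.g. unmodelled residues) in pdb1 do not shift the range
        num = p1_resnum[k]
        if start <= num <= end and j != [0]:
            coordinates.append([i, j])
        k += 1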
        
#######################

arrows = []

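r1 = 0.2    # radius of the arrow shaft (cylinder)
r2 = 0.8    # radius of the base of the arrow head (cone)
rho = 0.75  # fraction of the arrow length taken up by the shaft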

arrows.append(".color " + arrowcolor)

for c in coordinates:
    
    [x1,y1,z1],[x2,y2,z2] = c
    
    if math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2) < maxdistance:

        arrows.append(f".arrow {x1} {y1} {z1} {x2} {y2} {z2} {r1} {r2} {rho}")

with open(bildoutputname, 'w') as exportfile:
    for item in arrows:
        exportfile.write("%s\n" % item)

print("Wrote " + bildoutputname)

Wrote AAA3-to3VKG.bild
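
The pairing logic in Part 2 walks the two gapped sequences column by column and keeps only columns where both structures have a residue. The sketch below shows the same idea in isolation on a toy alignment; the function name `paired_indices`, the toy sequences and the residue numbers are all hypothetical. It returns index pairs into the ungapped CA lists, so residue numbers can be looked up directly rather than counted.

```python
def paired_indices(ali1, ali2, gap="-"):
    """Yield (i, j) indices into the ungapped sequences for gap-free columns."""
    i = j = 0
    for a, b in zip(ali1, ali2):
        if a != gap and b != gap:
            yield i, j
        # Advance each counter only when its sequence has a residue here
        if a != gap:
            i += 1
        if b != gap:
            j += 1

# Toy alignment: K is unmatched in sequence 1, A is unmatched in sequence 2
ali1 = "MK-LV"
ali2 = "M-ALV"

# Hypothetical residue numbers for sequence 1, with a jump from 11 to 14
resnums1 = [10, 11, 14, 15]

pairs = [(resnums1[i], j) for i, j in paired_indices(ali1, ali2)]
print(pairs)  # [(10, 0), (14, 2), (15, 3)]
```

Looking up `resnums1[i]` instead of incrementing a counter keeps the `start` and `end` filter correct when the structure has unmodelled residues.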
