# Glycosylate a protein

In this tutorial we will glycosylate a protein scaffold with a glycan. We will use part of the HIV-1 envelope protein 4TVP, which is heavily glycosylated. We will attach a Mannose9 glycan to all Asn residues in the scaffold.

In [1]:
import glycosylator as gl

First we load the protein scaffold and the glycan from PDB files. We could of course build the glycan on the fly, but we happen to have it as a PDB file already.

In [None]:
protein = gl.scaffold("files/4tvp.prot.pdb")
protein.reindex()
protein.infer_bonds()

glycan = gl.glycan("files/man9.pdb")

Next we can glycosylate the protein scaffold using the `glycosylate` function (or the scaffold's `attach` method). To do that we first need to set the root atom of the glycan to be `C1` of the first residue. This is because glycosylation will always attach the glycan's root atom to the scaffold.

In [8]:
glycan.root_atom = glycan.get_atom("C1", residue=1)

Now with that done, we can start glycosylating. If we want to attach to all `Asn` residues, we could first get a list of all `Asn` residues and then glycosylate each of them in a for-loop. However, we can also simply use a `sequon` to find the appropriate asparagine residues automatically. Glycosylator pre-implements the `N-linked` sequon that we can use for this. There is also the `O-linked` sequon that works for `Ser` and `Thr` residues. 

In [9]:
protein_glycosylated = gl.glycosylate(protein, glycan, sequon="N-linked")
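To make the idea of a sequon more concrete, here is a minimal sketch that does not use Glycosylator at all. It assumes the common biochemical definition of the N-linked sequon (Asn, then any residue except Pro, then Ser or Thr) and searches a one-letter amino acid sequence with a regular expression. The pattern, the helper `find_n_linked_sites` and the toy sequence are illustrative assumptions; the pattern that Glycosylator uses internally may differ.

```python
import re

# Assumed definition of the N-linked sequon: N, then anything but P, then S or T.
# The lookahead only consumes the "N", so overlapping sequons are all reported.
N_LINKED = re.compile(r"N(?=[^P][ST])")


def find_n_linked_sites(sequence):
    """Return the 0-based indices of Asn residues that sit in an N-X-S/T sequon."""
    return [match.start() for match in N_LINKED.finditer(sequence)]


# Hypothetical toy sequence, not taken from 4TVP
toy_sequence = "MNGTAANPSLLNNSTQ"
sites = find_n_linked_sites(toy_sequence)
print(sites)  # → [1, 11, 12]
```

The Asn at index 6 is skipped because it is followed by a proline, which is why matching on a sequon is more selective than looping over every `Asn` residue.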

Let's see how many glycans we attached:

In [10]:
protein_glycosylated.glycans

defaultdict(dict,
            {Atom(ND2, 892): Glycan(files/man9.pdb),
             Atom(ND2, 1620): Glycan(files/man9.pdb),
             Atom(ND2, 1677): Glycan(files/man9.pdb),
             Atom(ND2, 1851): Glycan(files/man9.pdb),
             Atom(ND2, 1906): Glycan(files/man9.pdb),
             Atom(ND2, 2533): Glycan(files/man9.pdb),
             Atom(ND2, 3106): Glycan(files/man9.pdb),
             Atom(ND2, 3507): Glycan(files/man9.pdb),
             Atom(ND2, 3714): Glycan(files/man9.pdb),
             Atom(ND2, 4019): Glycan(files/man9.pdb),
             Atom(ND2, 4109): Glycan(files/man9.pdb),
             Atom(ND2, 4564): Glycan(files/man9.pdb),
             Atom(ND2, 4674): Glycan(files/man9.pdb),
             Atom(ND2, 4945): Glycan(files/man9.pdb),
             Atom(ND2, 5078): Glycan(files/man9.pdb),
             Atom(ND2, 5392): Glycan(files/man9.pdb),
             Atom(ND2, 5476): Glycan(files/man9.pdb),
             Atom(ND2, 5582): Glycan(files/man9.pdb),
           

Now that we have a basic glycosylated protein, we can save it to a new PDB file:

In [11]:
protein_glycosylated.to_pdb("protein_glycosylated.pdb")

### Optimizing the glycosylated protein

If we inspect the PDB file we find that some of the glycans look rather sad. That is because the geometry of the protein as well as the presence of other glycans is not really considered when attaching glycans. However, we can address these issues by optimizing the structure now. 

Glycosylator uses Biobuild's optimization framework to improve the glycan conformations. Be sure to check out the tutorial on optimization there to get more details. In short, to optimize structures we need to perform four steps: 

(1) make a graph representation of the molecule to optimize 

(2) choose bonds to rotate around in order to optimize the conformation

(3) select an optimization environment to evaluate the quality of our conformations

(4) solve the environment to find a good conformation
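The four steps can be illustrated on a toy system with plain Python. This is only a sketch of the general idea under simplifying assumptions, not the way Glycosylator or Biobuild implement it: the "molecule" has three atoms, there is a single rotatable bond, the evaluation is the distance to one obstacle atom, and the solver is a simple scan over angles. All names and coordinates are made up for illustration.

```python
import math


def rotate_about_axis(point, origin, axis_end, angle):
    """Rotate `point` by `angle` (radians) around the line origin -> axis_end
    using Rodrigues' rotation formula."""
    k = [b - a for a, b in zip(origin, axis_end)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]
    v = [p - o for p, o in zip(point, origin)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    dot = sum(kc * vc for kc, vc in zip(k, v))
    rotated = [v[i] * cos_a + cross[i] * sin_a + k[i] * dot * (1 - cos_a) for i in range(3)]
    return tuple(r + o for r, o in zip(rotated, origin))


# (1) a minimal "graph": named atoms with coordinates
atoms = {"A": (0.0, 0.0, 0.0), "B": (0.0, 0.0, 1.0), "C": (1.0, 0.0, 1.5)}

# (2) the bond to rotate around (A-B); atom C is downstream of it and moves
bond = ("A", "B")

# (3) the "environment": score a conformation by the distance to an obstacle atom
obstacles = [(2.0, 0.0, 1.5)]


def score(angle):
    moved = rotate_about_axis(atoms["C"], atoms[bond[0]], atoms[bond[1]], angle)
    return min(math.dist(moved, obstacle) for obstacle in obstacles)


# (4) "solve" by scanning the rotation angle in 10 degree steps
best_angle_deg = max(range(0, 360, 10), key=lambda deg: score(math.radians(deg)))
best_score = score(math.radians(best_angle_deg))
print(best_angle_deg, round(best_score, 3))  # → 180 3.0
```

Rotating atom C by 180 degrees places it on the opposite side of the bond from the obstacle, which maximizes the distance. A real glycoprotein has many rotatable bonds and many neighboring atoms, so an exhaustive scan is no longer practical and a proper optimization algorithm is needed.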


Here we will outline how we can do this for our glycosylated protein:

In [16]:
# Before we optimize the glycoprotein we "hollow out" the protein scaffold, i.e. we remove all the parts that are "inside" the protein.
# This is convenient because it reduces the number of atoms that need to be considered during optimization.
protein_glycosylated.hollow_out()

# glycosylator comes with a pre-made function to perform steps 1 and 2 automatically. 
# It produces a ResidueGraph for the scaffold and a list of edges that belong to the glycan residues, for optimization.
graph, edges = gl.optimizers.make_scaffold_graph(protein_glycosylated)

Now that we have a graph and edges, we need to set up an environment. Glycosylator has multiple environments that we can use. We will use the default `DistanceRotatron`, which uses geometric reasoning to evaluate conformations. There are a few hyperparameters we might need to tinker with until we get a good result...

In [17]:
env = gl.optimizers.DistanceRotatron(graph, edges, pushback=4, crop_nodes_further_than=1.2, radius=18)

Now we can use the environment to optimize our glycoprotein. We can use the `optimize` function to solve the environment using an optimization algorithm. We will use a particle-swarm optimization. We could pass more arguments here to further guide the behavior of the swarm optimization...

In [18]:
optimized = gl.optimizers.optimize(protein_glycosylated, env, "swarm")
optimized.to_pdb("protein_glycosylated.optimized.pdb")
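For readers unfamiliar with particle-swarm optimization, the following is a generic textbook-style sketch in plain Python that minimizes a one-dimensional cost function over an angle. It is not the implementation behind `gl.optimizers.optimize`; the function `swarm_minimize`, its parameters (inertia 0.7, attraction weights 1.5) and the toy cost function are all assumptions chosen for illustration.

```python
import math
import random


def swarm_minimize(cost, n_particles=20, n_steps=100, seed=0):
    """Minimize `cost(x)` for a single angle x with a basic particle swarm."""
    rng = random.Random(seed)
    # every particle starts at a random angle with zero velocity
    positions = [rng.uniform(-math.pi, math.pi) for _ in range(n_particles)]
    velocities = [0.0] * n_particles
    # best position each particle has seen so far, and the swarm-wide best
    personal_best = list(positions)
    personal_cost = [cost(x) for x in positions]
    leader = min(range(n_particles), key=lambda i: personal_cost[i])
    global_best, global_cost = personal_best[leader], personal_cost[leader]

    for _ in range(n_steps):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # keep some momentum, then pull towards personal and global best
            velocities[i] = (
                0.7 * velocities[i]
                + 1.5 * r1 * (personal_best[i] - positions[i])
                + 1.5 * r2 * (global_best - positions[i])
            )
            positions[i] += velocities[i]
            current = cost(positions[i])
            if current < personal_cost[i]:
                personal_best[i], personal_cost[i] = positions[i], current
                if current < global_cost:
                    global_best, global_cost = positions[i], current
    return global_best, global_cost


# Hypothetical cost: squared deviation from an ideal torsion angle of 1.0 rad
best_angle, best_cost = swarm_minimize(lambda x: (x - 1.0) ** 2)
print(round(best_angle, 2))
```

In the glycoprotein case each particle would hold one angle per rotatable edge rather than a single number, and the cost would come from the environment's evaluation of the resulting conformation.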

That's it for this tutorial. You can now glycosylate your own proteins. Of course, the part about optimization is only illustrative and you may need to tinker around to fit it to your system. Good luck using Glycosylator in your project!