# Analyzing lectins with bound glycans

GlyContact can extract glycan structures from protein-glycan co-crystals. To demonstrate, we'll use `3ZW1`, the complex of the bacterial lectin BambL and Lewis X. But we're getting ahead of ourselves. Let's imagine we have no idea which glycan is in this file. How do we get started?

In [20]:
%load_ext autoreload
%autoreload 2

from glycontact.process import get_glycan_sequences_from_pdb

pdb_file = "./3ZW1.pdb"

get_glycan_sequences_from_pdb(pdb_file)


['Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal', 'Fuc(a1-3)GlcNAc']

Got it! This crystal structure contains two glycan sequences that have been built. Note that the electron density of glycans is often not fully resolved, so "fragments" such as `Fuc(a1-3)GlcNAc` are usually just the resolved portion of the larger sequence, here `Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal`. Now that we know what we're looking for, we can extract the structure of the glycan with the `get_annotation` function and then analyze its torsion angles with the `get_glycosidic_torsions` function.

In [21]:
from glycontact.process import get_annotation, get_glycosidic_torsions
glycan = "Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal"

df, ints = get_annotation(glycan, pdb_file)
get_glycosidic_torsions(df, ints)

| | linkage | phi | psi | omega | anomeric_form | position |
|---|---|---|---|---|---|---|
| 0 | 2_NAG-1_GAL | -86.4 | 100.06 | | b | 3 |
| 1 | 3_FUC-2_NAG | -78.26 | 140.97 | | a | 3 |
| 2 | 4_GAL-2_NAG | -84.06 | -127.42 | | b | 4 |
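
For readers who want to see what a torsion angle is numerically, here is a minimal sketch of computing a signed dihedral angle from four 3D points. It is not GlyContact's implementation: the `dihedral` helper is hypothetical, the coordinates are toy points rather than atoms from `3ZW1`, and which four atoms define phi, psi, and omega depends on the convention in use.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) around the p1-p2 axis (hypothetical helper)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Normalize the central bond so projections onto it are simple dot products
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Remove the components of the outer bonds that lie along the central bond
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))
    # The angle between the two projected vectors is the dihedral angle
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy coordinates (not real atoms): a quarter turn around the z-axis
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 2))  # 90.0
```

Using `atan2` rather than `acos` keeps the sign of the angle, which is why values in the table above can be negative.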


Since lectin-glycan interactions are also about the lectin, GlyContact offers functions for examining the binding pocket as well, such as `get_binding_pocket`. This function extracts all amino acid residues within a cut-off distance (default: 4.0 Å) of a specified monosaccharide in the glycan you're interested in. In our case, since we know that `Fuc` is the relevant bit for BambL, we home in on that to get all residues of interest. If you instead wanted the **entire** binding pocket, with all glycan-adjacent residues, you could simply remove the `binding_monosaccharide` argument from the function call.

By default, this function returns all **atoms** that are closer than the cut-off value. If you're only interested in the **residues**, try running it with `all_atoms = False` for a more concise output.

You can also export the binding pocket (plus the bound glycan) to a new PDB file by setting the optional `filepath` argument to the location where you would like to save the `.pdb` file.

In [47]:
from glycontact.process import get_binding_pocket
get_binding_pocket("Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal", "./3ZW1.pdb", binding_monosaccharide = "FUC")

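| | chain | resSeq | resName | atom_name | target_atom | distance_min |
|---|---|---|---|---|---|---|
| 0 | A | 26 | GLU | OE1 | FUC3_O3 | 2.679193 |
| 1 | A | 26 | GLU | OE2 | FUC3_O4 | 2.683798 |
| 2 | A | 15 | ARG | NH2 | FUC3_O5 | 2.825438 |
| 3 | A | 79 | TRP | NE1 | FUC3_O3 | 2.869875 |
| 4 | A | 15 | ARG | NE | FUC3_O4 | 2.905937 |
| 5 | A | 38 | ALA | N | FUC3_O2 | 2.998354 |
| 6 | A | 26 | GLU | CD | FUC3_O4 | 3.41624 |
| 7 | A | 15 | ARG | CD | FUC3_O4 | 3.472454 |
| 8 | A | 37 | GLY | CA | FUC3_O2 | 3.481 |
| 9 | A | 15 | ARG | CZ | FUC3_O5 | 3.508778 |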
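
To illustrate the difference between an atom-level and a residue-level view, the sketch below collapses the ten rows shown above to one entry per residue, keeping the closest contact for each. The rows are copied by hand from the output; the variable names are illustrative, and this is only an approximation of the idea, not necessarily what `all_atoms = False` returns.

```python
# Rows copied from the output above:
# (chain, resSeq, resName, atom_name, target_atom, distance_min in Å)
contacts = [
    ("A", 26, "GLU", "OE1", "FUC3_O3", 2.679193),
    ("A", 26, "GLU", "OE2", "FUC3_O4", 2.683798),
    ("A", 15, "ARG", "NH2", "FUC3_O5", 2.825438),
    ("A", 79, "TRP", "NE1", "FUC3_O3", 2.869875),
    ("A", 15, "ARG", "NE", "FUC3_O4", 2.905937),
    ("A", 38, "ALA", "N", "FUC3_O2", 2.998354),
    ("A", 26, "GLU", "CD", "FUC3_O4", 3.41624),
    ("A", 15, "ARG", "CD", "FUC3_O4", 3.472454),
    ("A", 37, "GLY", "CA", "FUC3_O2", 3.481),
    ("A", 15, "ARG", "CZ", "FUC3_O5", 3.508778),
]

# Keep the shortest distance seen for each (chain, resSeq, resName) residue
closest = {}
for chain, res_seq, res_name, _atom, _target, dist in contacts:
    key = (chain, res_seq, res_name)
    if key not in closest or dist < closest[key]:
        closest[key] = dist

# Sort residues from nearest to farthest contact
residues = sorted(closest.items(), key=lambda item: item[1])
for (chain, res_seq, res_name), dist in residues:
    print(f"{res_name}{res_seq} (chain {chain}): {dist:.2f} Å")
```

In this listing, the ten atom contacts reduce to five residues: GLU26, ARG15, TRP79, ALA38, and GLY37.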


# Analyzing glycosylated proteins

In [27]:
pdb_file = "./7T6X.pdb"
get_glycan_sequences_from_pdb(pdb_file)

['Man(b1-4)GlcNAc(b1-4)GlcNAc',
 'GlcNAc(b1-4)GlcNAc',
 'Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc',
 'GlcNAc',
 'Man(a1-6)Man(a1-6)Man(b1-4)GlcNAc(b1-4)GlcNAc']
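
As noted for `3ZW1`, shorter sequences are often just the resolved portion of a larger glycan. When a file yields several sequences, one rough heuristic for spotting the most complete ones is to rank them by monosaccharide count. The sketch below assumes that every linkage appears exactly once in parentheses, so the count is the number of `(` plus one; `count_monosaccharides` is a hypothetical helper, not part of GlyContact.

```python
# Sequences copied from the output above
sequences = [
    "Man(b1-4)GlcNAc(b1-4)GlcNAc",
    "GlcNAc(b1-4)GlcNAc",
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc",
    "GlcNAc",
    "Man(a1-6)Man(a1-6)Man(b1-4)GlcNAc(b1-4)GlcNAc",
]


def count_monosaccharides(sequence):
    """Count residues, assuming one parenthesized linkage per bond (hypothetical helper)."""
    return sequence.count("(") + 1


# Largest sequences first; ties keep their original order because sorted() is stable
ranked = sorted(
    ((seq, count_monosaccharides(seq)) for seq in sequences),
    key=lambda pair: -pair[1],
)
for seq, size in ranked:
    print(size, seq)
```

Here two different pentasaccharides tie for the top spot, which is a reminder that size alone cannot tell you which glycan to analyze; the sequences at distinct glycosylation sites can legitimately differ.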

In [22]:
df, ints = get_annotation("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc", pdb_file)
get_glycosidic_torsions(df, ints)

| | linkage | phi | psi | omega | anomeric_form | position |
|---|---|---|---|---|---|---|
| 0 | 2_NAG-1_NAG | -80.85 | -120.6 | | b | 4 |
| 1 | 3_BMA-2_NAG | -82.62 | -121.97 | | b | 4 |
| 2 | 4_MAN-3_BMA | 71.33 | 138.26 | | a | 3 |
| 3 | 5_MAN-3_BMA | 92.63 | -157.5 | -51.99 | a | 6 |


In [29]:
from glycontact.process import compute_merge_SASA_flexibility
compute_merge_SASA_flexibility("Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc", my_path = pdb_file)



| | Monosaccharide_id | Monosaccharide | SASA | Standard Deviation | Coefficient of Variation | flexibility | torsion_flexibility |
|---|---|---|---|---|---|---|---|
| 0 | 1 | GlcNAc(b1-1) | 294.705612 | | | 1.71432 | |
| 1 | 2 | GlcNAc(b1-4) | 231.883071 | | | 1.978299 | |
| 2 | 3 | Man(b1-4) | 109.338062 | | | 2.570273 | |
| 3 | 4 | Man(a1-3) | 234.063838 | | | 2.633248 | |
| 4 | 5 | Man(a1-6) | 240.003828 | | | 2.796275 | |
