<a href="https://colab.research.google.com/github/andrewfavor95/GuidedHallucination/blob/main/demos/oligomer_aspect_ratio_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AfDesign - hallucination custom loss example: cyclic oligomers

In [1]:
#@title install
%%bash
if [ ! -d params ]; then
  # get code
  pip -q install git+https://github.com/sokrypton/ColabDesign.git
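  # for debugging: link the installed package into the working directory
  # (looked up at runtime rather than assuming a python3.7 dist-packages path)
  ln -s "$(python3 -c 'import os, colabdesign; print(os.path.dirname(colabdesign.__file__))')" colabdesign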
  # then add the GuidedHallucination repo for loss functions
  git clone https://github.com/andrewfavor95/GuidedHallucination.git
  # download params
  mkdir params
  curl -fsSL https://storage.googleapis.com/alphafold/alphafold_params_2021-07-14.tar | tar x -C params
  for W in openfold_model_ptm_1 openfold_model_ptm_2 openfold_model_no_templ_ptm_1
  do wget -qnc https://files.ipd.uw.edu/krypton/openfold/${W}.npz -P params; done
fi

In [2]:
#@title import libraries
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import os
import colabdesign
from colabdesign import mk_afdesign_model, clear_mem
from colabdesign.af.alphafold.common import residue_constants
from IPython.display import HTML
from google.colab import files
import numpy as np
import pdb as pydebug
import string

import jax
import jax.numpy as jnp


# Adding an additional loss function

The hallucination protocol seems to favor specific protein shapes, especially when loss terms such as pLDDT are included. One example is a tendency to generate coiled coils or helical bundles when hallucinating cyclic oligomers. We may be able to work around this by adding a custom loss function that controls the aspect ratio of the protein.

In this demo, we'll explore how different shape penalties based on aspect ratios can generate cyclic oligomers with different morphologies.
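
To make the idea of an aspect ratio concrete, here is a minimal sketch in plain NumPy (not part of ColabDesign) showing how the singular values of mean-centered coordinates summarize a shape. The helper name `shape_singular_values` and the toy point clouds are made up for illustration; they stand in for the CA coordinates of a predicted structure.

```python
import numpy as np

def shape_singular_values(coords):
    """Singular values of mean-centered (N, 3) coordinates, largest first."""
    centered = coords - coords.mean(axis=0)
    return np.linalg.svd(centered, compute_uv=False)

# Toy "rod": points spread along z with only a small x/y footprint.
t = np.linspace(0.0, 1.0, 100)
rod = np.stack([0.1 * np.cos(20 * t), 0.1 * np.sin(20 * t), 10.0 * t], axis=1)

# Toy "ring": evenly spaced points on a circle in the x/y plane, flat along z.
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
ring = np.stack([5.0 * np.cos(theta), 5.0 * np.sin(theta), np.zeros_like(theta)], axis=1)

rod_s = shape_singular_values(rod)
ring_s = shape_singular_values(ring)
print("rod :", np.round(rod_s, 2))
print("ring:", np.round(ring_s, 2))
```

For the rod, the first singular value dominates the other two. For the ring, the first two are equal and the third is zero. The loss functions in this notebook turn these kinds of relationships into penalties.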

In [78]:

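def aspect_ratio_loss(inputs,outputs,opt,mode='cyllinder'):
    # CA coordinates of the predicted structure, shape (num_residues, 3)
    pred = outputs["structure_module"]["final_atom_positions"][:,residue_constants.atom_order["CA"]]

    # singular values of the mean-centered coordinates, largest first
    s_vals = jnp.linalg.svd(pred - pred.mean(axis=0),full_matrices=True,compute_uv=False)

    if mode=='cyllinder':
        # penalize unequal extent along the two largest principal axes
        s_vals = s_vals/s_vals.mean()
        aspect_ratio_loss_val = jnp.square( s_vals[0]-s_vals[1] )

    elif mode=='rod':
        # reward concentrating the spread along the first principal axis
        s_vals = s_vals/jnp.linalg.norm(s_vals)
        aspect_ratio_loss_val = -jnp.log( s_vals[0] )

    else:
        # fail early instead of hitting an undefined variable below
        raise ValueError(f"unknown mode: {mode!r}")

    return {"aspect_ratio_loss": aspect_ratio_loss_val }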





def cyllinder_loss(inputs,outputs,opt):

    return aspect_ratio_loss(inputs,outputs,opt,mode='cyllinder')


def rod_loss(inputs,outputs,opt):

    return aspect_ratio_loss(inputs,outputs,opt,mode='rod')






# Trimer demos:

Let's start by hallucinating a trimer with 60 amino acids in each chain.

In [54]:
copies = 3
chain_len = 60 # maybe change variable names later, or even add this as a cell to enter info into

# Example 0:

We can start by seeing what the output looks like with only the standard loss terms, such as pLDDT, PAE, and pTM.

In [51]:
clear_mem()
af_model = mk_afdesign_model(protocol="hallucination",
                             debug=False)

af_model.prep_inputs(length=chain_len, copies=copies, homooligomer=True)


af_model.opt["weights"]["plddt"] = 0.1
af_model.opt["weights"]["pae"] = 0.1
af_model.opt["weights"]["i_pae"] = 0.1
af_model.opt["weights"]["ptm"] = 0.1
af_model.opt["weights"]["i_ptm"] = 0.1


print("weights", af_model.opt["weights"])


weights {'con': 1.0, 'exp_res': 0.0, 'i_con': 1.0, 'i_pae': 0.1, 'pae': 0.1, 'plddt': 0.1, 'seq_ent': 0.0, 'ptm': 0.1, 'i_ptm': 0.1}


In [52]:
# Start with 50 iters of soft design
af_model.restart(mode="gumbel", seed=0)
af_model.design_soft(50)

# three stage design  
af_model.restart(seq=af_model.aux["seq"]["pseudo"], keep_history=True)
af_model.design_3stage(50,50,5)

1 models [2] recycles 0 hard 0 soft 1 temp 1 loss 12.93 con 4.92 i_con 4.22 plddt 0.51 ptm 0.22 i_ptm 0.11 aspect_ratio_loss 3.79
2 models [3] recycles 0 hard 0 soft 1 temp 1 loss 9.40 con 3.61 i_con 4.18 plddt 0.54 ptm 0.25 i_ptm 0.11 aspect_ratio_loss 1.61
3 models [4] recycles 0 hard 0 soft 1 temp 1 loss 9.10 con 3.65 i_con 4.18 plddt 0.42 ptm 0.21 i_ptm 0.10 aspect_ratio_loss 1.27
4 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.15 con 3.58 i_con 4.22 plddt 0.38 ptm 0.18 i_ptm 0.07 aspect_ratio_loss 0.35
5 models [3] recycles 0 hard 0 soft 1 temp 1 loss 9.16 con 3.65 i_con 4.13 plddt 0.53 ptm 0.25 i_ptm 0.11 aspect_ratio_loss 1.37
6 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.74 con 3.65 i_con 4.13 plddt 0.39 ptm 0.20 i_ptm 0.09 aspect_ratio_loss 0.95
7 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.36 con 3.23 i_con 3.85 plddt 0.56 ptm 0.35 i_ptm 0.22 aspect_ratio_loss 0.29
8 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.25 con 3.10 i_con 3.90 plddt 0.53 ptm 0.32 i

In [53]:
af_model.plot_pdb()

As expected, this design is biased toward axial contacts.


# Example 1:

The first output was fairly extended along the symmetry axis.

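Let's see how things look if we apply the cylindrical aspect ratio loss.

This loss minimizes the squared difference between the two largest singular values of the centered CA coordinates (each normalized by the mean singular value), which pushes the structure toward equal extent along its first two principal axes.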
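
As a rough check of what this penalty rewards, the sketch below re-implements the `'cyllinder'` branch in plain NumPy and evaluates it on toy point clouds. The function name `cylinder_loss_np` and the grid-based shapes are assumptions made for illustration; the real loss operates on predicted CA coordinates inside JAX.

```python
import numpy as np

def cylinder_loss_np(coords):
    """NumPy stand-in for the 'cyllinder' branch of aspect_ratio_loss."""
    centered = coords - coords.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    s = s / s.mean()                      # normalize so the values average to 1
    return float((s[0] - s[1]) ** 2)      # squared gap between the two largest

# A 5 x 5 x 5 grid of points in a cube, used as a base shape to stretch.
axis = np.linspace(-1.0, 1.0, 5)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

elongated = grid * np.array([1.0, 1.0, 3.0])   # stretched along one axis
flattened = grid * np.array([3.0, 3.0, 1.0])   # stretched along two axes

loss_elongated = cylinder_loss_np(elongated)
loss_flattened = cylinder_loss_np(flattened)
print("elongated:", round(loss_elongated, 3))
print("flattened:", round(loss_flattened, 3))
```

The elongated cloud is penalized (its normalized singular values are 1.8, 0.6, 0.6, giving a loss of 1.44), while the flattened cloud scores essentially zero. Because only the two largest singular values are compared, a sphere-like shape also scores zero, so this term discourages elongation rather than selecting one specific shape. Since the normalized values sum to 3, the loss is bounded between 0 and 9.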

In [59]:
clear_mem()
af_model = mk_afdesign_model(protocol="hallucination",
                             debug=False,
                             loss_callback=cyllinder_loss)

af_model.prep_inputs(length=chain_len, copies=copies, homooligomer=True)


af_model.opt["weights"]["plddt"] = 0.1
af_model.opt["weights"]["pae"] = 0.1
af_model.opt["weights"]["i_pae"] = 0.1
af_model.opt["weights"]["ptm"] = 0.1
af_model.opt["weights"]["i_ptm"] = 0.1
af_model.opt["weights"]["aspect_ratio_loss"] = 1.0


print("weights", af_model.opt["weights"])


weights {'con': 1.0, 'exp_res': 0.0, 'i_con': 1.0, 'i_pae': 0.1, 'pae': 0.1, 'plddt': 0.1, 'seq_ent': 0.0, 'ptm': 0.1, 'i_ptm': 0.1, 'aspect_ratio_loss': 1.0}


In [60]:
# Start with 50 iters of soft design
af_model.restart(mode="gumbel", seed=0)
af_model.design_soft(50)

# three stage design  
af_model.restart(seq=af_model.aux["seq"]["pseudo"], keep_history=True)
af_model.design_3stage(50,50,5)

1 models [2] recycles 0 hard 0 soft 1 temp 1 loss 10.58 con 4.92 i_con 4.22 plddt 0.51 ptm 0.22 i_ptm 0.11 aspect_ratio_loss 1.43
2 models [3] recycles 0 hard 0 soft 1 temp 1 loss 8.54 con 3.73 i_con 4.14 plddt 0.53 ptm 0.26 i_ptm 0.11 aspect_ratio_loss 0.66
3 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.40 con 3.90 i_con 4.17 plddt 0.37 ptm 0.18 i_ptm 0.10 aspect_ratio_loss 0.32
4 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.00 con 3.81 i_con 4.11 plddt 0.37 ptm 0.20 i_ptm 0.09 aspect_ratio_loss 0.08
5 models [3] recycles 0 hard 0 soft 1 temp 1 loss 8.07 con 3.63 i_con 4.32 plddt 0.43 ptm 0.21 i_ptm 0.09 aspect_ratio_loss 0.11
6 models [4] recycles 0 hard 0 soft 1 temp 1 loss 7.73 con 3.41 i_con 4.27 plddt 0.47 ptm 0.19 i_ptm 0.06 aspect_ratio_loss 0.04
7 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.45 con 3.00 i_con 4.38 plddt 0.58 ptm 0.27 i_ptm 0.11 aspect_ratio_loss 0.07
8 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.51 con 3.21 i_con 4.25 plddt 0.53 ptm 0.26 i

In [61]:
af_model.plot_pdb()

Awesome!  This round of hallucination generated a cyclic oligomer that is much more compact and globular.

# Example 2:

Let's see if we can actually force it to be a rod by maximizing the first singular value relative to the others.
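
To get a feel for the scale of this term, here is a small NumPy sketch of the `'rod'` branch evaluated on toy point clouds. The name `rod_loss_np` and the stretched grids are hypothetical, chosen only for illustration.

```python
import numpy as np

def rod_loss_np(coords):
    """NumPy stand-in for the 'rod' branch of aspect_ratio_loss."""
    centered = coords - coords.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    s = s / np.linalg.norm(s)        # unit-norm singular values
    return float(-np.log(s[0]))      # 0 when all spread lies on one axis

# A 5 x 5 x 5 grid of points in a cube, used as a base shape to stretch.
axis = np.linspace(-1.0, 1.0, 5)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

cube = grid                                     # equal extent in all directions
rod_like = grid * np.array([1.0, 1.0, 3.0])     # stretched along one axis

loss_cube = rod_loss_np(cube)
loss_rod = rod_loss_np(rod_like)
print("cube:", round(loss_cube, 3))
print("rod :", round(loss_rod, 3))
```

The isotropic cube gives the largest possible value, 0.5·ln 3 ≈ 0.549, and the loss falls toward 0 as the shape becomes more rod-like. The whole range of this term is therefore narrow. With a weight of 1.0 it is small next to the `con` and `i_con` terms, which are around 3 to 5 in the logs in this notebook, so it is plausible (though untested here) that a larger weight would be needed for the rod term to change the outcome.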

In [79]:
clear_mem()
af_model = mk_afdesign_model(protocol="hallucination",
                             debug=False,
                             loss_callback=rod_loss)

af_model.prep_inputs(length=chain_len, copies=copies, homooligomer=True)


af_model.opt["weights"]["plddt"] = 0.1
af_model.opt["weights"]["pae"] = 0.1
af_model.opt["weights"]["i_pae"] = 0.1
af_model.opt["weights"]["ptm"] = 0.1
af_model.opt["weights"]["i_ptm"] = 0.1
af_model.opt["weights"]["aspect_ratio_loss"] = 1.0


print("weights", af_model.opt["weights"])

weights {'con': 1.0, 'exp_res': 0.0, 'i_con': 1.0, 'i_pae': 0.1, 'pae': 0.1, 'plddt': 0.1, 'seq_ent': 0.0, 'ptm': 0.1, 'i_ptm': 0.1, 'aspect_ratio_loss': 1.0}


In [80]:
# Start with 50 iters of soft design
af_model.restart(mode="gumbel", seed=0)
af_model.design_soft(50)

# three stage design  
af_model.restart(seq=af_model.aux["seq"]["pseudo"], keep_history=True)
af_model.design_3stage(50,50,5)

1 models [2] recycles 0 hard 0 soft 1 temp 1 loss 9.23 con 4.92 i_con 4.22 plddt 0.51 ptm 0.22 i_ptm 0.11 aspect_ratio_loss 0.08
2 models [3] recycles 0 hard 0 soft 1 temp 1 loss 8.08 con 3.67 i_con 4.18 plddt 0.42 ptm 0.21 i_ptm 0.11 aspect_ratio_loss 0.23
3 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.25 con 3.93 i_con 4.17 plddt 0.43 ptm 0.23 i_ptm 0.11 aspect_ratio_loss 0.16
4 models [4] recycles 0 hard 0 soft 1 temp 1 loss 7.98 con 3.61 i_con 4.10 plddt 0.46 ptm 0.25 i_ptm 0.12 aspect_ratio_loss 0.27
5 models [3] recycles 0 hard 0 soft 1 temp 1 loss 8.07 con 3.73 i_con 4.08 plddt 0.45 ptm 0.25 i_ptm 0.12 aspect_ratio_loss 0.27
6 models [4] recycles 0 hard 0 soft 1 temp 1 loss 8.03 con 3.70 i_con 4.07 plddt 0.43 ptm 0.23 i_ptm 0.11 aspect_ratio_loss 0.26
7 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.33 con 3.20 i_con 3.87 plddt 0.56 ptm 0.31 i_ptm 0.16 aspect_ratio_loss 0.26
8 models [1] recycles 0 hard 0 soft 1 temp 1 loss 7.49 con 3.24 i_con 3.89 plddt 0.55 ptm 0.30 i_

In [81]:
af_model.plot_pdb()

This looks a lot like the first hallucination, which we generated without a custom loss function, although the helices here seem slightly more ordered.