# OpenFold Colab

Runs a simplified version of [OpenFold](https://github.com/aqlaboratory/openfold) on a target sequence. Adapted from DeepMind's [official AlphaFold Colab](https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb).

**Differences from AlphaFold v2.0**

OpenFold is a trainable PyTorch reimplementation of AlphaFold 2. For inference, it is practically identical to the original; the one difference is that OpenFold excludes ensembling, although recycling remains enabled.

In this notebook, OpenFold is run with your choice of our original OpenFold parameters or DeepMind's publicly released parameters for AlphaFold 2.

**Note**

Like DeepMind's official Colab, this notebook uses **no templates (homologous structures)** and a selected portion of the full [BFD database](https://bfd.mmseqs.com/).

**Citing this work**

Any publication that discloses findings arising from using this notebook should [cite](https://github.com/deepmind/alphafold/#citing-this-work) DeepMind's [AlphaFold paper](https://doi.org/10.1038/s41586-021-03819-2).

**Licenses**

This Colab supports inference with the [AlphaFold model parameters](https://github.com/deepmind/alphafold/#model-parameters-license), made available under the Creative Commons Attribution 4.0 International ([CC BY 4.0](https://creativecommons.org/licenses/by/4.0/legalcode)) license. The Colab itself is provided under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0). See the full license statement below.

**More information**

You can find more information about how AlphaFold/OpenFold works in DeepMind's two Nature papers:

*   [AlphaFold methods paper](https://www.nature.com/articles/s41586-021-03819-2)
*   [AlphaFold predictions of the human proteome paper](https://www.nature.com/articles/s41586-021-03828-1)

An FAQ on how to interpret AlphaFold/OpenFold predictions is available [here](https://alphafold.ebi.ac.uk/faq).

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [13]:
import py3Dmol
from proteome import protein
from proteome.models.folding.openfold.modeling import OpenFoldForFolding, OPENFOLD_MODEL_URLS
from proteome.models.folding.openfold.np.relax import relax

In [3]:
sequence = 'MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH'
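
Before folding, it can help to check that the target sequence contains only the 20 standard one-letter amino acid codes. The following is a minimal sketch of such a check; `clean_sequence` and `STANDARD_AMINO_ACIDS` are hypothetical names introduced here for illustration and are not part of the `proteome` or OpenFold APIs.

```python
# Hypothetical helper for illustration; not part of proteome or OpenFold.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Strip whitespace, upper-case, and reject non-standard residue codes."""
    cleaned = "".join(raw.split()).upper()
    if not cleaned:
        raise ValueError("Sequence is empty.")
    invalid = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"Non-standard residue codes: {', '.join(invalid)}")
    return cleaned


# The target sequence from the cell above passes the check unchanged.
sequence = clean_sequence(
    "MAAHKGAEHHHKAAEHHEQAAKHHHAAAEHHEKGEHEQAAHHADTAYAHHKHAEEHAAQAAKHDAEHHAPKPH"
)
```

Raising early on an unexpected character avoids spending time on the alignment searches for an input that may not be accepted downstream.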

In [14]:
model_names = list(OPENFOLD_MODEL_URLS.keys())

In [15]:
for model_name in model_names:
    print("Model name", model_name)
    # Use the same fixed seed for every checkpoint.
    folder = OpenFoldForFolding(model_name, random_seed=0)
    predicted_protein, plddt = folder.fold(sequence)
    unrelaxed_pdb_str = protein.to_pdb(predicted_protein)
    # Save one unrelaxed reference structure per checkpoint.
    with open(f"reference_{folder.model_name}.pdb", mode="w") as f:
        f.write(unrelaxed_pdb_str)

Model name finetuning-3
Running jackhmmer on uniref90 database...
Running jackhmmer on smallbfd database...
Running jackhmmer on mgnify database...
58 sequences found in uniref90
110 sequences found in smallbfd
9 sequences found in mgnify
Model name finetuning-4
Downloading model from s3://openfold/openfold_params/finetuning_4.pt...
Model downloaded and cached in /home/conradry71/.cache/torch/hub/checkpoints/finetuning_4.pt.
Running jackhmmer on uniref90 database...
Running jackhmmer on smallbfd database...
Running jackhmmer on mgnify database...
58 sequences found in uniref90
110 sequences found in smallbfd
9 sequences found in mgnify
Model name finetuning-5
Downloading model from s3://openfold/openfold_params/finetuning_5.pt...
Model downloaded and cached in /home/conradry71/.cache/torch/hub/checkpoints/finetuning_5.pt.
Running jackhmmer on uniref90 database...
Running jackhmmer on smallbfd database...
Running jackhmmer on mgnify database...
58 sequences found in uniref90
110 sequenc

In [9]:
folder = OpenFoldForFolding(model_name, random_seed=0)
predicted_protein, plddt = folder.fold(sequence)

Running jackhmmer on uniref90 database...
Running jackhmmer on smallbfd database...
Running jackhmmer on mgnify database...
58 sequences found in uniref90
110 sequences found in smallbfd
9 sequences found in mgnify


In [10]:
folder.model_name

'finetuning-3'

In [11]:
unrelaxed_pdb_str = protein.to_pdb(predicted_protein)

In [12]:
with open(f"reference_{folder.model_name}_1.pdb", mode="w") as f:
    f.write(unrelaxed_pdb_str)
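
A saved structure can be sanity-checked by reading the per-residue confidence back out of the PDB text. The sketch below assumes that `protein.to_pdb` stores pLDDT in the B-factor column, as AlphaFold's own PDB writer does; that is an assumption about this library, not something the notebook confirms. `ca_b_factors` is a hypothetical helper that relies only on the fixed-width PDB `ATOM` record layout.

```python
from statistics import mean
from typing import List


def ca_b_factors(pdb_text: str) -> List[float]:
    """Return the B-factor of every C-alpha ATOM record, in file order."""
    values = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-indexed): 13-16 atom name,
        # 61-66 temperature factor (B-factor).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def mean_ca_b_factor(pdb_text: str) -> float:
    """Average C-alpha B-factor; equals mean pLDDT only if the assumption holds."""
    return mean(ca_b_factors(pdb_text))
```

For example, `mean_ca_b_factor(unrelaxed_pdb_str)` would give a single summary number per model, which makes the reference files from different checkpoints easier to compare.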

In [20]:
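# Comments describe AlphaFold's relax conventions, which this module is
# assumed to follow.
amber_relaxer = relax.AmberRelaxation(
    max_iterations=0,  # 0 means no cap on minimizer iterations
    tolerance=2.39,  # energy tolerance (kcal/mol in AlphaFold)
    stiffness=10.0,  # restraint stiffness (kcal/mol/A^2 in AlphaFold)
    exclude_residues=[],
    max_outer_iterations=20,
    use_gpu=False,
)
relaxed_pdb, _, _ = amber_relaxer.process(
    prot=predicted_protein, cif_output=False
)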

In [23]:
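# (lower bound, upper bound, color) for each pLDDT confidence band.
PLDDT_BANDS = [
    (0, 50, '#FF7D45'),
    (50, 70, '#FFDB13'),
    (70, 90, '#65CBF3'),
    (90, 100, '#0053D6'),
]
view = py3Dmol.view(width=800, height=600)
view.addModelsAsFrames(relaxed_pdb)
# Keys are band indices (0-3), so the model's 'b' property must hold those
# indices, not raw pLDDT values, for this map to select a color.
color_map = {i: bands[2] for i, bands in enumerate(PLDDT_BANDS)}
style = {'cartoon': {'colorscheme': {'prop': 'b', 'map': color_map}}}

# Also draw all atoms as sticks.
style['stick'] = {}

view.setStyle({'model': -1}, style)
view.zoomTo()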

<py3Dmol.view at 0x7f65d1391660>
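
The color map in the cell above is keyed by band index (0 to 3), so the `b` property that py3Dmol reads needs to hold band indices, not raw pLDDT values. DeepMind's Colab handles this by binning each residue's pLDDT into a band and overwriting the B-factors before visualization. The following is a minimal sketch of the binning step only; `plddt_to_band_indices` is a hypothetical helper, and writing the indices back into the structure is left out because it depends on the layout of the protein object.

```python
from typing import Iterable, List

PLDDT_BANDS = [
    (0, 50, '#FF7D45'),
    (50, 70, '#FFDB13'),
    (70, 90, '#65CBF3'),
    (90, 100, '#0053D6'),
]


def plddt_to_band_indices(plddts: Iterable[float], bands=PLDDT_BANDS) -> List[int]:
    """Map each pLDDT value to the index of the first band that contains it."""
    indices = []
    for value in plddts:
        for idx, (low, high, _color) in enumerate(bands):
            # Bounds are inclusive, so a boundary value falls in the lower band.
            if low <= value <= high:
                indices.append(idx)
                break
        else:
            raise ValueError(f"pLDDT {value} is outside every band")
    return indices


# Same mapping as in the cell above: band index -> hex color.
color_map = {idx: color for idx, (_low, _high, color) in enumerate(PLDDT_BANDS)}
```

Because the bounds are inclusive and the first matching band wins, a value that sits exactly on a boundary (50, 70 or 90) is assigned to the lower of the two bands.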