
# Co-folding with Boltz

*Disclaimer: This topic is a very active area of research and thus prone to quick changes. The Notebook was last edited on 30.06.2025*

---
### In this lesson you'll learn:

- how to predict a protein structure from the amino acid sequence.
- how to predict the structure of a protein-ligand complex from sequence and SMILES.
- about current limitations of co-folding methods.

---

This notebook is about protein folding, especially *co-folding*: predicting not only the 3D structure of a protein from its sequence but also the placement of a ligand in the correct binding pocket and conformation.

While protein folding was largely solved by [AlphaFold2](https://doi.org/10.1038/s41586-021-03819-2), finding the binding conformation of a ligand is typically still done using docking software such as [gnina](https://github.com/gnina/gnina), [GOLD](https://www.ccdc.cam.ac.uk/solutions/software/gold/) or [Glide](https://www.schrodinger.com/platform/products/glide/). Newer models instead try to predict the structure of the whole protein-ligand complex directly from the protein sequence and a SMILES string.

Google's [AlphaFold3](https://github.com/google-deepmind/alphafold3) is still one of the top-performing models for co-folding, but in this notebook we'll use [Boltz](https://github.com/jwohlwend/boltz), specifically Boltz-2. At the time of writing, this is the newest co-folding model, and it also offers binding strength estimation, which we'll look at towards the end of the notebook.

---

## Installation, Google Colab and the Command Line

**Optional Boltz Install**: To complete this notebook, it is not necessary to install Boltz, but we give the option to do so. Another great resource to try out Boltz in Google Colab is [this Notebook](https://colab.research.google.com/github/kimjc95/computational-chemistry/blob/main/Boltz_on_Colab.ipynb).

**Local install**: This notebook is designed to run in Google Colab, and the following cells install Boltz and its dependencies. If you want to run Boltz locally (recommended if you have a strong GPU), please follow the instructions [here]() and install on Linux or in WSL. Then simply execute the shell commands (the commands starting with `!`, e.g. `!boltz predict example_file.fasta`). Note that you may need to adjust the file paths to match where your files are stored.

**Google Colab**: Make sure you are connected to a runtime with GPU support (go to the upper right corner, open the drop-down menu, select "change runtime type" and make sure a Hardware Accelerator other than CPU is selected).

**Command Line**: Boltz is a command-line program, so we will use a few Linux commands instead of typical Python code. You can recognize these commands by the leading `!`, which tells the notebook (IPython) to pass the command to the underlying operating system.
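
For readers new to the `!` syntax, here is a minimal sketch of roughly what such a line does, written in plain Python with the standard `subprocess` module. The helper name `run_shell` is our own and not part of Boltz or Colab.

```python
import subprocess


def run_shell(command: str) -> str:
    """Run a shell command and return its standard output as text."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout


# Comparable in spirit to typing `!echo hello` in a notebook cell
print(run_shell("echo hello"))
```

In the notebook itself the `!` form is shorter and streams the output live, so we stick to it in the following cells.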

---

In [None]:
# Install py3Dmol for viewing
!pip install py3Dmol

## (Optional) Run an actual Boltz Prediction

In [None]:
# Installation (for Google Colab only)
# This code was adapted from Joo-Chan Kim (https://zenodo.org/records/14881401)

import os
import subprocess

print('Installing dependencies (estimate: 2min) ... ', end='')
# Space-separated list of packages that Boltz needs at runtime
dependencies = "torch torchvision torchaudio numpy hydra-core pytorch-lightning "
dependencies += "rdkit dm-tree requests pandas types-requests einops einx fairscale "
dependencies += "mashumaro modelcif wandb click pyyaml biopython scipy numba gemmi "
dependencies += "scikit-learn chembl_structure_pipeline "
dependencies += "cuequivariance_ops_cu12 cuequivariance_ops_torch_cu12 cuequivariance_torch"

# Precision setting that replaces 'bf16-mixed' in the Boltz source below
precision = '32-true'

subprocess.run("pip install ipywidgets torch torchvision torchaudio", shell=True)
subprocess.run("git clone https://github.com/jwohlwend/boltz.git", shell=True)
# Patch the cloned source in place so that it uses the precision chosen above
subprocess.run(f"sed -i 's/bf16-mixed/{precision}/g' /content/boltz/src/boltz/main.py", shell=True)
subprocess.run(f"pip install {dependencies}", shell=True)
# Install Boltz itself in editable mode; its dependencies were installed in the previous step
subprocess.run("cd boltz; pip install --no-deps -e .", shell=True)

print('done.')
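
After the installation it can be worth checking that the command-line tools are actually available. The following sketch only looks up executables on the `PATH`. The helper name `find_tools` is hypothetical, and we assume that a GPU runtime exposes the `nvidia-smi` command.

```python
import shutil


def find_tools(names=("boltz", "nvidia-smi")):
    """Map each executable name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_tools().items():
    print(f"{name}: {path if path else 'not found'}")
```

If `boltz` is reported as not found, the `!boltz predict` cells further down will fail and the installation cell should be rerun.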

Now we'll create a very simple FASTA file using some Linux commands:

In [None]:
# `echo` is the Linux counterpart of `print()`; with `>` we write the output to a file
!echo -e ">A|protein|empty\nAAAA\n" > peptide.fasta

In [None]:
# we can have a look at the file with `cat`
!cat peptide.fasta
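
The same file can also be written from Python, which is handy outside of Colab or when generating many inputs. This is a minimal sketch: the function name `write_fasta` and its parameters are our own, and the header layout simply mirrors the `>A|protein|empty` line used above.

```python
from pathlib import Path


def write_fasta(path, chain_id, sequence, msa="empty"):
    """Write a single-chain protein FASTA file with a `>ID|protein|MSA` header."""
    text = f">{chain_id}|protein|{msa}\n{sequence}\n"
    Path(path).write_text(text)
    return text


# Writes a separate file so that `peptide.fasta` from the cell above is left untouched
print(write_fasta("peptide_py.fasta", "A", "AAAA"))
```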

Now we can use `boltz predict peptide.fasta` to predict the structure of the small peptide.
Note that we used the keyword `empty` in our `.fasta` file. This will lead to much worse predictions, because no multiple sequence alignment (MSA) is used.
Normally, we would add the `--use_msa_server` flag to let an external server compute the sequence alignment (or supply our own `.a3m` file).

In [None]:
!boltz predict peptide.fasta
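
For a real protein we would rather let Boltz obtain an alignment. The sketch below builds the corresponding command in Python. The helper `boltz_command` and the file name `protein.fasta` are hypothetical; only `boltz predict` and `--use_msa_server` are taken from the text above. We also assume that the header of such an input file does not use the `empty` keyword. The command is executed only if `boltz` is installed and the input file exists.

```python
import shutil
import subprocess
from pathlib import Path


def boltz_command(fasta_path, use_msa_server=False):
    """Build the argument list for a `boltz predict` call."""
    command = ["boltz", "predict", str(fasta_path)]
    if use_msa_server:
        command.append("--use_msa_server")
    return command


command = boltz_command("protein.fasta", use_msa_server=True)
print(" ".join(command))

# Run the prediction only when Boltz and the (hypothetical) input file are present
if shutil.which("boltz") and Path("protein.fasta").exists():
    subprocess.run(command, check=True)
```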

The output is saved in the folder `boltz_results_peptide`, which we can browse with Google Colab's file browser (the folder symbol in the left sidebar). The predicted 3D structure can be visualized using py3Dmol.

In [None]:
import py3Dmol

fasta_name = 'peptide'

# Read the predicted structure (a .cif file) written by Boltz
with open(f"boltz_results_{fasta_name}/predictions/{fasta_name}/{fasta_name}_model_0.cif") as ifile:
    system = ifile.read()

view = py3Dmol.view(width=400, height=300)
# State the file format explicitly instead of relying on auto-detection
view.addModelsAsFrames(system, 'cif')
view.setStyle({'model': -1}, {"cartoon": {'color': 'spectrum'}})
view.zoomTo()
view.show()
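
If you prefer not to click through the file browser, the result folder can also be inspected from Python. This is a minimal sketch that assumes only the folder layout used in the cell above (`boltz_results_<name>/predictions/<name>/`); the helper name `list_predictions` is our own.

```python
from pathlib import Path


def list_predictions(fasta_name, root="."):
    """Return the sorted paths of all files in the prediction folder of one input."""
    folder = Path(root) / f"boltz_results_{fasta_name}" / "predictions" / fasta_name
    if not folder.is_dir():
        return []  # nothing has been predicted for this input yet
    return sorted(p for p in folder.iterdir() if p.is_file())


for path in list_predictions("peptide"):
    print(path.name)
```

An empty result simply means that no prediction has been run for that input name.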

### Make a Boltz prediction

In [None]:
#@title (Optional) Make a Boltz prediction




<details>
<summary><strong>How to hide part of a cell:</strong></summary>

hidden details

</details>
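
As a starting point for a protein-ligand prediction, here is a hedged sketch of how the input file could be written. It assumes that Boltz's FASTA format accepts a ligand entry with a `>ID|smiles` header followed by the SMILES string; please check the Boltz documentation for the version you installed. The function name `write_complex_fasta`, the `AAAA` peptide and ethanol (`CCO`) as the ligand are placeholders chosen purely for illustration.

```python
from pathlib import Path


def write_complex_fasta(path, protein_seq, ligand_smiles, msa="empty"):
    """Write a two-entry FASTA file: protein chain A and ligand B given as SMILES."""
    text = (
        f">A|protein|{msa}\n{protein_seq}\n"
        f">B|smiles\n{ligand_smiles}\n"
    )
    Path(path).write_text(text)
    return text


# Placeholder inputs: replace them with your own sequence and ligand
print(write_complex_fasta("ligand.fasta", "AAAA", "CCO"))
```

Running `!boltz predict ligand.fasta` on this file should then write its results to `boltz_results_ligand`, the folder read by the viewer cell below.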

In [None]:
import py3Dmol

# Read the predicted structure of the complex written by Boltz
with open("boltz_results_ligand/predictions/ligand/ligand_model_0.cif") as ifile:
    system = ifile.read()

view = py3Dmol.view(width=400, height=300)
view.addModelsAsFrames(system)
view.setStyle({'model': -1}, {"cartoon": {'color': 'spectrum'}})
view.zoomTo()
view.show()