# **IgFold**: Fast, accurate antibody structure prediction

Official notebook for [IgFold](https://www.biorxiv.org/content/10.1101/2022.04.20.488972): Fast, accurate antibody structure prediction from deep learning on a massive set of natural antibodies. The code, data, and weights for this work are made available for non-commercial use. For commercial inquiries, please contact `jruffolo[at]jhu.edu`.

## Dependency installation in Colab / Jupyter Notebook

In [1]:
# 1. Uninstall and clean up potentially conflicting core packages
print("1. Uninstalling old torch and core dependencies...")
!pip uninstall torch torchvision torchaudio transformers numpy -y

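# 2. Installing modern PyTorch...
# torch==2.1.0 is not published on the cu118 index for this runtime (pip lists only
# 2.2.0+cu118 and later), so pin a listed release with its matching torchvision/torchaudio.
print("2. Installing PyTorch...")
!pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118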

# 3. Install IgFold/ dependencies
print("3. Installing IgFold and base dependencies...")
!pip install igfold --upgrade

# 4. Installing visualization (py3Dmol) and ensuring core versions...
print("4. Installing visualization (py3Dmol) and ensuring core versions...")
!pip install py3Dmol
!pip install transformers --upgrade

# 5. Installing OpenMM refinement dependencies (pdbfixer, openmm) using Conda...
print("5. Installing OpenMM refinement dependencies (pdbfixer, openmm) using Conda...")

# Install condacolab (to use conda/mamba in Google Colab)
!pip install -q condacolab

# Restart session to finalize condacolab setup
import condacolab
condacolab.install()

# Use mamba to install the scientific packages from the trusted conda-forge channel
# Crucially, we remove the explicit Python version pinning that caused the conflict
!mamba install -c conda-forge openmm pdbfixer -y

print("OpenMM and PDBFixer installation complete.")

# 6. Install missing dependency for antibody renumbering (abnumber)
print("6. Installing abnumber...")
!pip install abnumber


print("\nInstallation complete. Please RESTART THE SESSION now (Runtime -> Restart session).")

1. Uninstalling old torch and core dependencies...
Found existing installation: torch 2.8.0+cu126
Uninstalling torch-2.8.0+cu126:
  Successfully uninstalled torch-2.8.0+cu126
Found existing installation: torchvision 0.23.0+cu126
Uninstalling torchvision-0.23.0+cu126:
  Successfully uninstalled torchvision-0.23.0+cu126
Found existing installation: torchaudio 2.8.0+cu126
Uninstalling torchaudio-2.8.0+cu126:
  Successfully uninstalled torchaudio-2.8.0+cu126
Found existing installation: transformers 4.57.1
Uninstalling transformers-4.57.1:
  Successfully uninstalled transformers-4.57.1
Found existing installation: numpy 2.0.2
Uninstalling numpy-2.0.2:
  Successfully uninstalled numpy-2.0.2
Looking in indexes: https://download.pytorch.org/whl/cu118
ERROR: Could not find a version that satisfies the requirement torch==2.1.0 (from versions: 2.2.0+cu118, 2.2.1+cu118, 2.2.2+cu118, 2.3.0+cu118, 2.3.1+cu118, 2.4.0+cu118, 2.4.1+cu118, 2.5.0+cu118, 2.5.1+cu118, 2.6.0+cu118, 2.7.0+cu118, 2.7.1+

4. Installing visualization (py3Dmol) and ensuring core versions...
Collecting py3Dmol
  Downloading py3dmol-2.5.3-py2.py3-none-any.whl.metadata (2.1 kB)
Downloading py3dmol-2.5.3-py2.py3-none-any.whl (7.2 kB)
Installing collected packages: py3Dmol
Successfully installed py3Dmol-2.5.3
5. Installing OpenMM refinement dependencies (pdbfixer, openmm) using Conda...
⏬ Downloading https://github.com/jaimergp/miniforge/releases/download/24.11.2-1_colab/Miniforge3-colab-24.11.2-1_colab-Linux-x86_64.sh...
📦 Installing...
📌 Adjusting configuration...
🩹 Patching environment...
⏲ Done in 0:00:11
🔁 Restarting kernel...

Looking for: ['openmm', 'pdbfixer']

[?25l[2K[0G[+] 0.0s
[2K[1A[2K[0G[+] 0.1s
conda-forge/linux-64  ⣾  
conda-forge/noarch    ⣾  [2K[1A[2K[1A[2K[0G[+] 0.2s
conda-forge/linux-64  ⣾  
conda-forge/noarch     9%[2K[1A[2K[1A[2K[0G[+] 0.3s
conda-forge/linux-64   4%
conda-forge/noarch    16%[2K[1A[2K[1A[2K[0G[+] 0.4s
conda-forge/linux-64  15%
conda-forge/noarch   


Installation complete. Please RESTART THE SESSION now (Runtime -> Restart session).
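
After the restart, it can help to confirm which package versions actually ended up installed, since the log above shows the pinned torch install failing. The following is a minimal sketch using only the standard library; the package list is taken from the install cell, and the helper name `installed_versions` is illustrative rather than part of IgFold.

```python
from importlib import metadata

# Packages that the install cell installs with pip; adjust as needed.
PACKAGES = ["torch", "transformers", "numpy", "igfold", "py3Dmol", "abnumber"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions


for pkg, ver in installed_versions(PACKAGES).items():
    print(f"{pkg:14s} {ver if ver is not None else 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points to a step in the install cell that did not complete.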


## Reload imports/definitions cell

In [1]:
# Initialization process / prepare environment
import os
import torch
import torch.serialization
from igfold import IgFoldRunner
import py3Dmol

# Imports for model deserialization (required by IgFold's loading mechanism)
from transformers.models.bert.configuration_bert import BertConfig
from transformers.models.bert.tokenization_bert import BertTokenizer
from transformers.tokenization_utils import Trie
from transformers.models.bert.tokenization_bert import BasicTokenizer
from transformers.models.bert.tokenization_bert import WordpieceTokenizer

# Core IgFold utility import (note: show_pdb is an optional viz function)
from igfold.utils.visualize import show_pdb

# 1. Setup/ configuration
# Define sequences
SEQUENCES = {
    "H": "QVQLQESGPGLVKPSQTLSLTCAISGDSVSSNSAAWNWIRQSPSRGLEWLGRTYYRSKWYNDYAVSVRRFTISRDDSKNTVYLQMNSLRAEDTAVYYCARYYYYYYGMDYWGQGSLVTVSS",
    "L": "DIQMTQSPSSLSASVGDRVTITCKASQSVSANDVVAWYQQKPGKAPKLVIYWASTRESGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCLQHFWSTPRTFGQGTKVEIK"
}

# Define output path and file name
ANTIBODY_NAME = "chimeric_anti-OKT3_scFv"
PRED_DIR = "igfold_predictions"
DO_REFINE = False
DO_RENUM = False
USE_OPENMM = False
NUM_MODELS = 4

# Create output directory/ file path (safe to run more than once)
os.makedirs(PRED_DIR, exist_ok=True)
pred_pdb_path = os.path.join(PRED_DIR, f"{ANTIBODY_NAME}.pdb")

# 2. Folding execution
if __name__ == "__main__":
    # Load IgFold models using the patched safe_globals context manager
    with torch.serialization.safe_globals({BertConfig, BertTokenizer, Trie, BasicTokenizer, WordpieceTokenizer}):
        print(f"Initializing IgFoldRunner with {NUM_MODELS} models...")
        igfold = IgFoldRunner(num_models=NUM_MODELS)
        print("Successfully loaded IgFold and AntiBERTy models.")

    # Perform folding prediction
    print(f"\nFolding sequences for {ANTIBODY_NAME}...")

    igfold.fold(
        pred_pdb_path,
        sequences=SEQUENCES,
        do_refine=DO_REFINE,
        use_openmm=USE_OPENMM,
        do_renum=DO_RENUM,
    )

    print(f"Structure successfully saved to {pred_pdb_path}")

    # Display final structure using igfold's utility
    show_pdb(
        pred_pdb_path,
        len(SEQUENCES),
        bb_sticks=False,
        sc_sticks=True,
        color="rainbow"
    )

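# 3. Visualization with py3Dmol (Use PRED_DIR and ANTIBODY_NAME for consistency)
# This code runs *outside* the if __name__ == "__main__": block
pdb_file = os.path.join(PRED_DIR, f"{ANTIBODY_NAME}.pdb")

# Read the local PDB file and add it as a model. Passing query='pdb:...' would
# ask py3Dmol to fetch an entry from the RCSB PDB by ID rather than load a local file.
with open(pdb_file, 'r') as f:
    pdb_data = f.read()

# Create and display the py3Dmol view
view = py3Dmol.view(width=800, height=600)
view.addModel(pdb_data, 'pdb')
view.setStyle({'cartoon': {'color': 'spectrum'}})
view.zoomTo()
view.show()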



Initializing IgFoldRunner with 4 models...

    The code, data, and weights for this work are made available for non-commercial use 
    (including at commercial entities) under the terms of the JHU Academic Software License 
    Agreement. For commercial inquiries, please contact awichma2[at]jhu.edu.
    License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
    
Loading 4 IgFold models...
Using device: cpu
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_1.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_2.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_3.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_5.ckpt...
Successfully loaded 4 IgFold models.




Loaded AntiBERTy model.
Successfully loaded IgFold and AntiBERTy models.

Folding sequences for chimeric_anti-OKT3_scFv...


Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /pytorch/aten/src/ATen/native/Cross.cpp:63.)
  n_vec = (b_coord - a_coord).expand(bc_vec.shape).cross(bc_vec)
  return torch._C._get_cublas_allow_tf32()
  with disable_tf32(), autocast(enabled = False):


Completed folding in 42.80 seconds.
Structure successfully saved to igfold_predictions/chimeric_anti-OKT3_scFv.pdb
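
Because folding takes tens of seconds per run, a quick check of the input sequences before calling `igfold.fold` can catch typos early. This is a minimal sketch, assuming only the 20 standard amino-acid letters are intended; `check_sequences` and `STANDARD_AA` are illustrative names, not part of IgFold.

```python
# One-letter codes of the 20 standard amino acids.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequences(sequences):
    """Return {chain: sorted non-standard characters}; an empty dict means all chains are clean."""
    problems = {}
    for chain, seq in sequences.items():
        bad = sorted(set(seq.upper()) - STANDARD_AA)
        if bad:
            problems[chain] = bad
    return problems


# Short made-up fragments for illustration; 'X' is not a standard residue.
example = {"H": "QVQLQESGPG", "L": "DIQMTQSPXS"}
print(check_sequences(example))  # {'L': ['X']}
```

In the notebook, the same function could be called on `SEQUENCES` just before the folding step.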


## Structural prediction of chimeric_anti-OKT3_scFv

In [2]:
# Initialization process/ prepare environment

import os
import torch
import torch.serialization  # provides the safe_globals context manager used below
import py3Dmol  # For visualization

# Imports for model deserialization
from transformers.models.bert.configuration_bert import BertConfig
from transformers.models.bert.tokenization_bert import BertTokenizer
from transformers.tokenization_utils import Trie
from transformers.models.bert.tokenization_bert import BasicTokenizer
from transformers.models.bert.tokenization_bert import WordpieceTokenizer

# Core IgFold and visualization imports
from igfold import IgFoldRunner
from igfold.utils.visualize import show_pdb

# 1. Setup/ configuration

# Define sequences
SEQUENCES = {
    "H": "QVQLQESGPGLVKPSQTLSLTCAISGDSVSSNSAAWNWIRQSPSRGLEWLGRTYYRSKWYNDYAVSVRRFTISRDDSKNTVYLQMNSLRAEDTAVYYCARYYYYYYGMDYWGQGSLVTVSS",
    "L": "DIQMTQSPSSLSASVGDRVTITCKASQSVSANDVVAWYQQKPGKAPKLVIYWASTRESGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCLQHFWSTPRTFGQGTKVEIK"
}

# Define output path and file name
ANTIBODY_NAME = "chimeric_anti-OKT3_scFv"
PRED_DIR = "igfold_predictions"

# Execution flags (set to False because this environment could not support the refinement and renumbering dependencies)
DO_REFINE = False
DO_RENUM = False
USE_OPENMM = False
NUM_MODELS = 4

# Create output directory/ file path (safe to run multiple times)
os.makedirs(PRED_DIR, exist_ok=True)
pred_pdb_path = os.path.join(PRED_DIR, f"{ANTIBODY_NAME}.pdb")

# 2. Folding execution

if __name__ == "__main__":

    # Load IgFold models using the patched safe_globals context manager
    with torch.serialization.safe_globals({BertConfig, BertTokenizer, Trie, BasicTokenizer, WordpieceTokenizer}):
        print(f"Initializing IgFoldRunner with {NUM_MODELS} models...")
        igfold = IgFoldRunner(num_models=NUM_MODELS)
        print("Successfully loaded IgFold and AntiBERTy models.")

    # Perform folding prediction
    print(f"\nFolding sequences for {ANTIBODY_NAME}...")

    igfold.fold(
        pred_pdb_path,
        sequences=SEQUENCES,
        do_refine=DO_REFINE,
        use_openmm=USE_OPENMM,
        do_renum=DO_RENUM,
    )

    print(f"Structure successfully saved to {pred_pdb_path}")

    # Display final structure
    show_pdb(
        pred_pdb_path,
        len(SEQUENCES),
        bb_sticks=False,
        sc_sticks=True,
        color="rainbow"
    )


Initializing IgFoldRunner with 4 models...

    The code, data, and weights for this work are made available for non-commercial use 
    (including at commercial entities) under the terms of the JHU Academic Software License 
    Agreement. For commercial inquiries, please contact awichma2[at]jhu.edu.
    License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
    
Loading 4 IgFold models...
Using device: cpu
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_1.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_2.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_3.ckpt...
Loading /usr/local/lib/python3.12/dist-packages/igfold/trained_models/IgFold/igfold_5.ckpt...
Successfully loaded 4 IgFold models.
Loaded AntiBERTy model.
Successfully loaded IgFold and AntiBERTy models.

Folding sequences for chimeric_anti-OKT3_scFv...


  with disable_tf32(), autocast(enabled = False):


Completed folding in 42.13 seconds.
Structure successfully saved to igfold_predictions/chimeric_anti-OKT3_scFv.pdb
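
One way to sanity-check the saved structure is to count residues per chain in the PDB file and compare them with the input sequence lengths. The sketch below relies on the fixed-column layout of PDB `ATOM` records and assumes the chain IDs in the file match the keys of `SEQUENCES`; `atom_line`, `chain_lengths`, and `length_mismatches` are illustrative helpers, and the demo records are synthetic rather than IgFold output.

```python
def atom_line(serial, name, res, chain, resseq, b=0.0):
    """Format one fixed-column PDB ATOM record (coordinates set to zero for the demo)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


def chain_lengths(pdb_text):
    """Count distinct residues per chain from ATOM records."""
    seen = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]           # column 22: chain identifier
        residue_id = line[22:27]   # columns 23-27: residue number plus insertion code
        seen.setdefault(chain, set()).add(residue_id)
    return {chain: len(ids) for chain, ids in seen.items()}


def length_mismatches(sequences, pdb_text):
    """Return {chain: (expected, found)} for chains whose residue count differs."""
    found = chain_lengths(pdb_text)
    return {c: (len(s), found.get(c, 0))
            for c, s in sequences.items() if found.get(c, 0) != len(s)}


# Synthetic records: two residues on chain H, one on chain L.
demo = "\n".join([
    atom_line(1, "N", "GLN", "H", 1),
    atom_line(2, "CA", "GLN", "H", 1),
    atom_line(3, "CA", "VAL", "H", 2),
    atom_line(4, "CA", "ASP", "L", 1),
])
print(chain_lengths(demo))                            # {'H': 2, 'L': 1}
print(length_mismatches({"H": "QV", "L": "DI"}, demo))  # {'L': (2, 1)}
```

In the notebook, `pdb_text` would be the contents of `pred_pdb_path`, and an empty result from `length_mismatches(SEQUENCES, pdb_text)` would indicate that every chain has the expected number of residues.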


## Visualization of chimeric_anti-OKT3_scFv

In [3]:
import os
import py3Dmol

ANTIBODY_NAME = "chimeric_anti-OKT3_scFv"
PRED_DIR = "igfold_predictions"
pdb_file = os.path.join(PRED_DIR, f"{ANTIBODY_NAME}.pdb")

# read the PDB file contents into a string
with open(pdb_file, 'r') as f:
    pdb_data = f.read()

# create the view
view = py3Dmol.view(width=800, height=600)

# add model data directly
view.addModel(pdb_data, 'pdb')

# apply styling and display
view.setStyle({'cartoon': {'color': 'spectrum'}})
view.zoomTo()
view.show()

In [4]:
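#@title Plot per-residue predicted RMSD
# plot_prmsd must be imported; without it this cell raised
# "NameError: name 'plot_prmsd' is not defined".
from igfold.utils.visualize import plot_prmsd

# Define the prediction directory name (IgFold default)
pred_dir = "igfold_predictions"

# Define the name of your sequence (as used in the prediction step)
name = "chimeric_anti-OKT3_scFv"

# The folding cell discards the return value of igfold.fold(), so fold again and keep it.
pred = igfold.fold(pred_pdb_path, sequences=SEQUENCES, do_refine=DO_REFINE,
                   use_openmm=USE_OPENMM, do_renum=DO_RENUM)

prmsd_fig_file = os.path.join(pred_dir, f"{name}_prmsd.png")
plot_prmsd(SEQUENCES, pred.prmsd.cpu(), prmsd_fig_file, shade_cdr=DO_RENUM, pdb_file=pred_pdb_path)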

In [None]:
#@title Show predicted structure with predicted RMSD

#@markdown Structure is colored from low (blue) to high (red) pRMSD.

show_pdb(pred_pdb_path, len(SEQUENCES), bb_sticks=False, sc_sticks=True, color="b")
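
The `color="b"` option colors by the B-factor column, which suggests the predicted RMSD is stored there in the saved PDB file. Under that assumption, the least confident residues can also be listed as text. This is a minimal sketch with illustrative helper names (`atom_line`, `ca_bfactors`, `highest`) and synthetic demo records, not IgFold output.

```python
def atom_line(serial, name, res, chain, resseq, b=0.0):
    """Format one fixed-column PDB ATOM record (coordinates set to zero for the demo)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


def ca_bfactors(pdb_text):
    """Return [(chain, residue number, B-factor)] for every CA atom, in file order."""
    rows = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            # Columns 61-66 hold the B-factor in the PDB format.
            rows.append((line[21], line[22:27].strip(), float(line[60:66])))
    return rows


def highest(rows, n=5):
    """Return the n rows with the largest B-factor values, largest first."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


# Synthetic records with made-up B-factor values.
demo = "\n".join([
    atom_line(1, "CA", "GLN", "H", 1, b=0.42),
    atom_line(2, "CA", "TYR", "H", 2, b=1.87),
    atom_line(3, "N", "TYR", "H", 2, b=1.87),
    atom_line(4, "CA", "ASP", "L", 1, b=0.65),
])
for chain, resnum, value in highest(ca_bfactors(demo), n=2):
    print(chain, resnum, value)
```

In the notebook, passing the contents of `pred_pdb_path` to `ca_bfactors` would give one value per residue, which could then be compared against the colored structure.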

In [None]:
#@title Download results

#@markdown Download zip file containing structure prediction and annotation results. If download fails, results are also accessible from file explorer on the left panel of the notebook.

from google.colab import files
import locale
locale.getpreferredencoding = lambda: "UTF-8"

!zip -FSr $name".result.zip" $pred_dir/ &> /dev/null
files.download(f"{name}.result.zip")