# Haddock3 Protein-Protein Docking using HADDOCK3 with BioExcel Building Blocks (biobb)
**Based on the official [HADDOCK3 antibody-antigen modelling tutorial](https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/).**

***
This tutorial aims to illustrate the process of **proptein protein docking**, step by step, using **Haddock3** with the **BioExcel Building Blocks library (biobb)**. The particular systems used in this tutorial are the **Interleukin-1β (IL-1β) antigen** (PDB code 4I1B, https://doi.org/10.2210/pdb4I1B/pdb) and the **gevokizumab antibody** (PDB code 4G6K, https://doi.org/10.2210/pdb4G6K/pdb). The complex is also available under the PDB code 4G6M, https://doi.org/10.2210/pdb4G6M/pdb.

***

**Biobb modules** used:

* [biobb_io](https://github.com/bioexcel/biobb_io): Tools to fetch biomolecular data from public databases.
* [biobb_pdb_tools](https://github.com/bioexcel/biobb_pdb_tools): Swiss army knife for manipulating and editing PDB files. 
* [biobb_haddock](https://github.com/bioexcel/biobb_haddock): Module collection to compute information-driven flexible protein-protein docking.

**Auxiliary libraries** used:

* [jupyter](https://jupyter.org/): Free software, open standards, and web services for interactive computing across all programming languages.
* [nglview](http://nglviewer.org/#nglview): Jupyter/IPython widget to interactively view molecular structures and trajectories in notebooks.

### Conda Installation and Launch

```console
git clone https://github.com/bioexcel/biobb_wf_haddock.git
cd biobb_wf_haddock
conda env create -f conda_env/environment.yml
conda activate biobb_wf_haddock
jupyter-notebook biobb_wf_haddock/notebooks/biobb_wf_haddock.ipynb
  ``` 

***
### Pipeline steps:
 1. [Input Parameters](#Input-parameters)
 2. [Preparing PDB files for docking](#Preparing-PDB-files-for-docking)
    - [Fetching PDB structures](#Fetching-PDB-structures)
    - [Preparing the antibody structure](#Preparing-the-antibody-structure)
    - [Preparing the antigen structure](#Preparing-the-antigen-structure)
    - [Preparing the reference structure](#Preparing-the-reference-structure)
 3. [Defining HADDOCK3 restraints](#Defining-HADDOCK3-restraints)
    - [Paratope Restraints](#Paratope-Restraints)
    - [Epitope Restraints](#Epitope-Restraints)
    - [HADDOCK3 passive restraints](#HADDOCK3-passive-restraints)
    - [HADDOCK3 ambiguous restraints](#HADDOCK3-ambiguous-restraints)
    - [Additional restraints for multi-chain proteins](#Additional-restraints-for-multi-chain-proteins)
 4. [Docking](#Docking)
    - [Create topology](#1.-Create-topology)
    - [Rigid Body sampling](#2.-Rigid-Body-sampling)
    - [1st CAPRI evaluation](#3.-1st-CAPRI-evaluation)
    - [Select Top structures](#4.-Select-Top-structures)
    - [Flexible Refinement](#5.-Flexible-Refinement)
    - [2nd CAPRI evaluation](#6.-2nd-CAPRI-evaluation)
    - [Energy minimization refinement](#7.-Energy-minimization-refinement)
    - [3rd CAPRI evaluation](#8.-3rd-CAPRI-evaluation)
    - [Clustering](#9.-Clustering)
    - [Selecting top clusters](#10.-Selecting-top-clusters)
    - [Final CAPRI evaluation](#11.-Final-CAPRI-evaluation)
    - [Contacts analysis](#12.-Contacts-analysis)
    - [Final Results](#13.-Final-Results)
 5. [Questions & Comments](#Questions-&-Comments) 

 
***

<table cellspacing="0" cellpadding="0" style="border-collapse: collapse; width: 100%;">
    <tr style="background: white;">
        <td style="text-align: center; vertical-align: middle; width: 50%;">
            <img src="https://bioexcel.eu/wp-content/uploads/2019/04/Bioexcell_logo_1080px_transp.png" alt="Bioexcel2 logo"
            title="Bioexcel logo" width="400" />        
        </td>
        <td style="text-align: center; vertical-align: middle; width: 50%;">
            <img src="imgs/HADDOCK3-logo.png" alt="HADDOCK3" style="max-width: 100%; height: auto; display: block; margin: auto;">
        </td>
    </tr>
</table>

***

<a class="anchor" name="box"></a>
## Initializing colab
The two cells below are used only in case this notebook is executed via **Google Colab**. Take into account that, for running conda on **Google Colab**, the **condacolab** library must be installed. As [explained here](https://pypi.org/project/condacolab/), the installation requires a **kernel restart**, so when running this notebook in **Google Colab**, don't run all cells until this **installation** is properly **finished** and the **kernel** has **restarted**.

In [1]:
# Only executed when using google colab
import sys
if 'google.colab' in sys.modules:
  import subprocess
  from pathlib import Path
  try:
    subprocess.run(["conda", "-V"], check=True)
  except FileNotFoundError:
    subprocess.run([sys.executable, "-m", "pip", "install", "condacolab"], check=True)
    import condacolab
    condacolab.install()
    # Clone repository
    repo_URL = "https://github.com/bioexcel/biobb_wf_haddock.git"
    repo_name = Path(repo_URL).name.split('.')[0]
    if not Path(repo_name).exists():
      subprocess.run(["mamba", "install", "-y", "git"], check=True)
      subprocess.run(["git", "clone", repo_URL], check=True)
      print("⏬ Repository properly cloned.")
    # Install environment
    print("⏳ Creating environment...")
    env_file_path = f"{repo_name}/conda_env/environment.yml"
    subprocess.run(["mamba", "env", "update", "-n", "base", "-f", env_file_path], check=True)
    print("🎨 Install NGLView dependencies...")
    subprocess.run(["mamba", "install", "-y", "-c", "conda-forge", "nglview==3.1.4", "ipywidgets==8.1.6"], check=True)
    print("👍 Conda environment successfully created and updated.")

In [2]:
# Enable widgets for colab
if 'google.colab' in sys.modules:
  from google.colab import output
  output.enable_custom_widget_manager()
  # Change working dir
  import os
  os.chdir("biobb_wf_haddock/biobb_wf_haddock/notebooks")
  print(f"📂 New working directory: {os.getcwd()}")

## Helper functions
Small **functions** used multiple times in the notebook and defined to make the code shorter and easier to follow.
- ***def_dict***: initializes the properties dictionary with paths for the log files (log.out, log.err). Avoids ending up with a folder full of log files.
- ***display_actpass***: loads a PDB file into NGLview widget and adds HADDOCK3 active and passive residues with surface representation.
- ***capri_visualization_models***: builds a dropdown to select specific models from the list of structures generated by HADDOCK. 
- ***superpose_ref***: superposes a PDB structure to a reference one.
- ***superpose_models***: superposes multiple PDB structures to a reference one and writes a multi-model PDB file.
- ***capri_visualization***: builds the NGL widget to show a 3D representation of the models generated by HADDOCK, with the possibility to select a specific model to represent. 
- ***open_results***: opens a new tab in the web browser and loads a HADDOCK generated html file with summary results.

In [205]:
# Imports
import nglview as nv
import pytraj as pt
import glob
import os
import json
import webbrowser
from Bio.PDB import PDBParser, PDBIO, CEAligner, Selection, Structure, Model, Chain
from IPython.display import display, Markdown
from jupyter_server.serverapp import list_running_servers

# Helpers
def def_dict(properties={}):
    def_props = {'out_log_path': 'log/log.out',
                 'err_log_path': 'log/log.err'}
                 #'remove_tmp': False,               # Uncomment if interested in temporary files x step 
                 #'can_write_console_log': False}    # Uncomment to disable output log information 
    def_props.update(properties)
    return def_props

def pdb_tools_pipeline(inp_file, out_file, steps):
    """Helper function to concatenate calls to pdb_tools"""
    tmp_file = inp_file
    for step, props in steps:
        # Apply each step in the pipeline
        step(input_file_path  = tmp_file, 
             output_file_path = out_file, 
             properties       = def_dict(props))
        tmp_file = 'tmp.pdb'
        os.rename(out_file, tmp_file)
    os.rename(tmp_file,out_file)

def display_actpass(pdb, actpass, opacity=1):
    with open(actpass, 'r') as file:
        actpass = file.read().splitlines()
        act_res = actpass[0].replace(' ', ', ')
        pas_res = actpass[1].replace(' ', ', ')
        
    # Load the PDB files
    view = nv.NGLWidget()
    view.add_component(pdb)
    view.clear()
    view.add_cartoon(color='black')
    view.add_ball_and_stick(color='grey',opacity=opacity)
    view.add_surface(selection=f'not ( {pas_res}, {act_res} )', color='white', opacity=opacity)
    if act_res != '':
        view.add_surface(selection=f'{act_res}', color='red')
    if pas_res != '':
        view.add_surface(selection=f'{pas_res}', color='green', opacity=opacity)
    view.layout.width = '100%'
    return view

def capri_visualization_models(tsv_dir,capri_out):

    models = []
    for i, m in enumerate(capri_out['model']):
        models.append(('Model Rank ' + str(i+1), os.path.normpath(os.path.join(tsv_dir, single_df['model'][i]))))
        
    mdsel = ipywidgets.Dropdown(
        options=models,
        description='Sel. model:',
        disabled=False,
    )
    
    # Dictionary of options in the dropdown
    options_dict = dict(mdsel.options)

    return mdsel

def superpose_ref(pdb_ref, pdb_to_sup, output_file, chain):

    # Parse the structures
    parser = PDBParser(QUIET=True)
    structure1 = parser.get_structure("pdb_ref", pdb_ref)
    structure2 = parser.get_structure("pdb_to_sup", pdb_to_sup)
    
    # Select only a portion of reference structure (e.g., Chain A )
    selected_residues = [res for res in structure1[0][chain]]
    
    # Create a new structure object with the selected residues
    selected_structure = Structure.Structure("selected_structure")
    model = Model.Model(0)
    chain = Chain.Chain("A")
    
    for res in selected_residues:
        chain.add(res)
    
    model.add(chain)
    selected_structure.add(model)    

    # Perform CE alignment using the selected region as reference
    aligner = CEAligner()
    aligner.set_reference(selected_structure)
    aligner.align(structure2)
    
    # Save structure2 to a PDB file
    io = PDBIO()
    io.set_structure(structure2)
    io.save(output_file)
    
def superpose_models(chain,input_path,output_file):
    
    pdb_files = sorted(glob.glob(f"{input_path}/*.pdb"))
    
    # Get all PDB files and sort them
    # Create a trajectory from the PDB files
    traj = pt.iterload(pdb_files, top=pdb_files[0])
    # Align the structures by chain and save the ensemble as a multi-model PDB
    pt.align(traj, ref=0, mask=f'::{chain}')
    traj.save(output_file, options="model", overwrite=True)

def capri_visualization(ensemble_A,ensemble_B,reference_A,reference_B,input_path,single_df,cluster_df):

    def on_dropdown_change(change):
        """Handle dropdown selection changes.
        From https://github.com/nglviewer/nglview/issues/765
        """
    
        if change['type'] == 'change' and change['name'] == 'value': 
            selected_file = change['new']
            if selected_file=='All':
                view1._remote_call('setSelection', target='compList', args=["*"], 
                   kwargs=dict(component_index=0))
                view2._remote_call('setSelection', target='compList', args=["*"], 
                   kwargs=dict(component_index=0))
            else:
                # Extract model number from the filename
                model_arr = selected_file.split('_')
                if (len(model_arr)) > 2:
                    cluster_num = int(model_arr[-3])
                    cluster_model_num = int(model_arr[-1])
                    model_num = (cluster_num * cluster_model_num) - 1
                else:
                    model_num = int(model_arr[-1]) - 1
                print(f"Selected model: {model_num}")
                with open(f"{out_path}/example.txt", "w") as file:
                    file.write(f"Hello: {model_num}")
                    file.write(f"Dict: {dict(component_index=0)}")
                # Update the view with the selected model
                view1._remote_call('setSelection', target='compList', 
                                args=[f"/{model_num}"], 
                                kwargs=dict(component_index=0))
                # You can also update view2 if needed
                view2._remote_call('setSelection', target='compList', 
                                args=[f"/{model_num}"], 
                                kwargs=dict(component_index=0))

    # Create a dropdown widget
    pdb_files = sorted(glob.glob(f"{input_path}/*.pdb"))
    opts = ['All']
    opts.extend([pdb_file.split('/')[-1].split('.')[0] for pdb_file in pdb_files])
    mdsel = ipywidgets.Dropdown(
        options=opts,
        description='Sel. model:',
        disabled=False,
    )
    display(Markdown("#### Please select a model:"))
    display(mdsel)
    
    # Register the callback function
    mdsel.observe(on_dropdown_change, names='value')
    
    # Loading Ensemble aligned to chain A (Antibody)
    view1 = nv.show_structure_file(ensemble_A, default_representation=False)
    
    # Loading Ensemble aligned to chain B (Antigen)
    view2 = nv.show_structure_file(ensemble_B, default_representation=False)
    
    # Adding reference structure for comparison purposes
    view1.add_component(reference_A)
    view1.component_0.add_cartoon(color='cyan',opacity=.6)
    view2.add_component(reference_B)
    view2.component_0.add_cartoon(color='cyan',opacity=.6)
    
    # Colouring models aligned to chain A (Antibody: blue, Antigen: green) 
    view1.component_1.clear()
    view1.component_1.add_cartoon(selection=':A', color='purple')
    view1.component_1.add_cartoon(selection=':B', color='green')
    view1._remote_call('setSize', target='Widget', args=['500px','500px'])
    view1.layout.margin = "auto"
    view1.camera='orthographic'
    
    # Colouring models aligned to chain B (Antibody: blue, Antigen: green) 
    view2.component_1.clear()
    view2.component_1.add_cartoon(selection=':A', color='purple')
    view2.component_1.add_cartoon(selection=':B', color='green')
    view2._remote_call('setSize', target='Widget', args=['500px','500px'])
    view2.layout.margin = "auto"
    view2.camera='orthographic'
    
    # Display the viewer
    box = ipywidgets.HBox([view1, view2])
    display(Markdown("#### Generated models (left, aligned to chain A -Antibody-; right, aligned to chain B -Antigen-, cyan, reference)"))
    display(box)
    
    # Show CAPRI Evaluation values
    display(Markdown("#### CAPRI Evaluation values for Single Structure:"))
    display(Markdown("DockQ: incorrect (<0.23), acceptable (0.23-0.49), medium (0.49-0.80), and high (>=0.80)"))
    display(single_df.head(10))
    display(Markdown("#### CAPRI Evaluation values for Cluster-based output:"))
    display(cluster_df.head(10))

def open_results(html_file):

    # Load the jupyter servers running
    servers = list(list_running_servers())
    
    if server := servers[0]:
        port = server.get("port", 8888)  # Default to 8888 if port is not found

        # Notebook path starting from where the jupyter kernel was initiated
        notebook_path = os.path.relpath(os.getcwd(),server['root_dir'])
        notebook_path = '' if notebook_path == '.' else notebook_path+'/'
        
        # Construct the URL using the detected port
        url = f"http://localhost:{port}/view/{notebook_path}{html_file}"
        webbrowser.open(url)
    else:
        print("Could not determine the Jupyter Notebook port.")


***
## Input parameters
**Input parameters** needed:
 - **antibody_pdb**: Antibody structure PDB code (in this case 4G6K)
 - **antigen_pdb**: Antigen structure PDB code (in this case 4I1B)
 - **complex_pdb**: Complex structure PDB code, if available (in this case 4G6M)
 - **out_path**: Output folder where resulting files will be generated
 - **data_path**: Auxiliary folder with HADDOCK3 input files needed (taken from the [official HADDOCK3 tutorial site](https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/#software-and-data-setup))  

In [4]:
import ipywidgets
import pandas
import re

antibody_pdb = '4G6K'
antigen_pdb = '4I1B'
complex_pdb = '4G6M' 
out_path = './data/antibody'
data_path = './data/haddock'

Creating **output folder hierarchy**:
- ***out_path***
    - ***pre***: *Initial data* 
    - ***docking***: *Docking results*

In [5]:
#os.makedirs('data',exist_ok=True)
os.makedirs(out_path,exist_ok=True)
os.makedirs(f'{out_path}/pre',exist_ok=True)
os.makedirs(f'{out_path}/docking',exist_ok=True)

***
# Preparing PDB files for docking
 
Before initiating a docking process with **HADDOCK3**, the input PDB structures must meet a [**predefined set of requirements**](https://www.bonvinlab.org/haddock3-user-manual/structure_requirements.html#pdb-format). For example, each molecule should consist of a **single chain** with **non-overlapping residue numbering** within the same chain. Additionaly, **insertion codes** must be renumbered, an issue particularly problematic with antibodies, as often follow the Chothia numbering scheme and insertions created by this system (e.g. 82A, 82B, 82C) cannot be directly processed by **HADDOCK3**.

References: <br>

**ANARCI: antigen receptor numbering and receptor classification.**<br>
*Dunbar J, Deane CM.*<br>
*Bioinformatics, 2016;32(2):298-300.*<br>
*Available at: https://doi.org/10.1093/bioinformatics/btv552*
***

<div class="alert alert-block alert-info">
<b>IMPORTANT NOTE:</b> The preparation steps in this notebook were prepared for the specific systems used (<i>Interleukin-1β antigen, gevokizumab antibody</i>), which we know don't have issues such as <b>insertion codes</b> or <b>alternative locations</b>. Please consider using a more complete preparation pipeline for complex cases. Examples can be found in the <a href="https://www.bonvinlab.org/education/HADDOCK3/">HADDOCK3 official tutorials site</a>.   
</div>

***
## Fetching PDB structures
Downloading **PDB structures** from the [PDBe](https://www.ebi.ac.uk/pdbe/) database. Keeping only **standard residues** (removing heteroatoms). <br>
Alternatively, local PDB files can be used as starting structures. <br>
Downloading **three different files**: 
- **antibody**: Antibody protein structure
- **antigen**: Antigen protein structure
- **complex**: Antibody-antigen complex structure 

***
**Building Blocks** used:
 - [Pdb](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb) from **biobb_io.api.pdb**
***

In [6]:
# Downloading desired PDB files
# Import module
from biobb_io.api.pdb import pdb

## Antibody PDB

# Create properties dict and inputs/outputs
ab_pdb  = f'{out_path}/pre/{antibody_pdb}_0.pdb'
ab_prop = def_dict({
    'pdb_code': antibody_pdb,
    'filter': ['ATOM', 'TER', 'END'],
})

# Create and launch bb
pdb(output_pdb_path = ab_pdb,  
    properties = ab_prop)

## Antigen PDB

# Create properties dict and inputs/outputs
ag_pdb  = f'{out_path}/pre/{antigen_pdb}_0.pdb'
ag_prop = def_dict({
    'pdb_code': antigen_pdb,
    'filter': ['ATOM', 'TER', 'END']
})

# Create and launch bb
pdb(output_pdb_path = ag_pdb,  
    properties = ag_prop)

## Complex PDB

# Create properties dict and inputs/outputs
cx_pdb = f'{out_path}/pre/{complex_pdb}_0.pdb'
cx_prop = def_dict({
    'pdb_code': complex_pdb,
    'filter': ['ATOM', 'TER', 'END']
})

# Create and launch bb
pdb(output_pdb_path = cx_pdb,  
    properties = cx_prop)

2025-06-02 16:49:39,195 [MainThread  ] [INFO ]  Module: biobb_io.api.pdb Version: 5.0.1
2025-06-02 16:49:39,196 [MainThread  ] [INFO ]  Downloading 4g6k from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4g6k.ent
2025-06-02 16:49:39,615 [MainThread  ] [INFO ]  Writting pdb to: ./data/antibody/pre/4G6K_0.pdb
2025-06-02 16:49:39,616 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'TER', 'END']
2025-06-02 16:49:39,620 [MainThread  ] [INFO ]  
2025-06-02 16:49:39,622 [MainThread  ] [INFO ]  Module: biobb_io.api.pdb Version: 5.0.1
2025-06-02 16:49:39,623 [MainThread  ] [INFO ]  Downloading 4i1b from: https://www.ebi.ac.uk/pdbe/entry-files/download/pdb4i1b.ent
2025-06-02 16:49:39,831 [MainThread  ] [INFO ]  Writting pdb to: ./data/antibody/pre/4I1B_0.pdb
2025-06-02 16:49:39,831 [MainThread  ] [INFO ]  Filtering lines NOT starting with one of these words: ['ATOM', 'TER', 'END']
2025-06-02 16:49:39,833 [MainThread  ] [INFO ]  
2025-06-02 16:49:39,83

0

### Visualizing 3D structures
Visualizing the downloaded **PDB structures** using **NGLview**:  
- Left: **Antibody**. Heavy Chain: Blue; Light Chain: Red.
- Center: **Antigen** (Green)
- Right: **Complex**. Antibody: Heavy Chain: Blue; Light Chain: Red; Antigen: Green.

In [89]:
# Show structures: antibody, antigen and complex
view1 = nv.show_structure_file(ab_pdb)
view1._remote_call('setSize', target='Widget', args=['350px','400px'])
view1.clear_representations()
view1.add_representation(repr_type='cartoon', selection=':H', color='blue')
view1.add_representation(repr_type='cartoon', selection=':L', color='red')
view1.camera='orthographic'
view1
view2 = nv.show_structure_file(ag_pdb)
view2._remote_call('setSize', target='Widget', args=['350px','400px'])
view2.clear_representations()
view2.add_representation(repr_type='cartoon', selection='protein', color='green')
view2.camera='orthographic'
view2
view3 = nv.show_structure_file(cx_pdb)
view3._remote_call('setSize', target='Widget', args=['350px','400px'])
view3.clear_representations()
view3.add_representation(repr_type='cartoon', selection=':H', color='blue')
view3.add_representation(repr_type='cartoon', selection=':L', color='red')
view3.add_representation(repr_type='cartoon', selection=':A', color='green')
view3.camera='orthographic'
view3
ipywidgets.HBox([view1, view2, view3])

HBox(children=(NGLWidget(), NGLWidget(), NGLWidget()))

## Preparing the antibody structure

The **gevokizumab antibody**  needs to be processed to select only the **Heavy** and **Light** chains and combine both into one single file with a **unified chain** and **segment id** and **residue numbering** starting at 1. Additionaly, only the **variable fragment (Fv)** of the antibody (see image) will be selected and used, to shorten the structure size and thus reduce the computation time. 

<div style="text-align: center;">
    <img src="imgs/antigen.png" alt="antibody_image" width="400">
</div>

Please note that the PDB file used in this tutorial only contains the upper part of the **Heavy chains** (variable region). For a comprehensive example of a **complete antibody**, please refer to [5DK3 PDB](https://www.rcsb.org/structure/5DK3) (Pembrolizumab IgG4 antibody). 

***
**Building Blocks** used:
 - [biobb_pdb_tidy](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_tidy) from **biobb_pdb_tools.pdb_tools.biobb_pdb_tidy**
 - [biobb_pdb_selchain](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selchain) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selchain**
 - [biobb_pdb_delhetatm](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_delhetatm) from **biobb_pdb_tools.pdb_tools.biobb_pdb_delhetatm**
 - [biobb_pdb_fixinsert](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_fixinsert) from **biobb_pdb_tools.pdb_tools.biobb_pdb_fixinsert**
 - [biobb_pdb_selaltloc](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selaltloc) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selaltloc**
 - [biobb_pdb_keepcord](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_keepcord) from **biobb_pdb_tools.pdb_tools.biobb_pdb_keepcord**
 - [biobb_pdb_selres](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selres) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selres**
 - [biobb_pdb_reres](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_reres) from **biobb_pdb_tools.pdb_tools.biobb_pdb_reres**
 - [biobb_pdb_merge](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_merge) from **biobb_pdb_tools.pdb_tools.biobb_pdb_merge**
 - [biobb_pdb_chain](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chain) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chain**
 - [biobb_pdb_chainxseg](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chainxseg) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chainxseg**
***

#### Complete pdb_tools version from the official HADDOCK3 tutorial:
`pdb_tidy -strict 4G6K.pdb | pdb_selchain -H | pdb_delhetatm | pdb_fixinsert | pdb_selaltloc | pdb_keepcoord | pdb_selres -1:120 | pdb_tidy -strict > 4G6K_H.pdb`

### [Preparing the Antibody structure] Step 1: Select and extract protein regions/chains
**Extract chains** (Heavy: H, Light; L) and select specific residues from the **Variable Fragment (Fv)**.<br>
Two PDB files should be generated:
- *4G6K_H_reduced.pdb* (Heavy, FV)
- *4G6K_L_reduced.pdb* (Light, FV)

In [8]:
from biobb_pdb_tools.pdb_tools import  *

pdb_final = {}
for ch, sel in [('H',120),('L',107)]:
    
    # Create properties dict and inputs/outputs

    # CHAINS H (Heavy), L (Light)
    pdb_final[ch] = f'{out_path}/pre/{antibody_pdb}_{ch}_reduced.pdb'
    
    steps = [
        (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True}),              # Adhere to the format specifications
        (biobb_pdb_selchain.biobb_pdb_selchain,   {'chains': ch}),                # Extract chain
        (biobb_pdb_delhetatm.biobb_pdb_delhetatm, {}),                            # Remove all HETATM records 
        (biobb_pdb_fixinsert.biobb_pdb_fixinsert, {}),                            # Delete insertion codes and shift residue numbering 
        (biobb_pdb_selaltloc.biobb_pdb_selaltloc, {}),                            # Modify the chain identifier column 
        (biobb_pdb_keepcoord.biobb_pdb_keepcoord, {}),                            # Remove all non-coordinate records 
        (biobb_pdb_selres.biobb_pdb_selres,       {'selection': '1:'+str(sel)}),  # Select residues by their range
        (biobb_pdb_tidy.biobb_pdb_tidy,           {})                             # Adhere to the format specifications
    ]

    # Create and launch bb
    pdb_tools_pipeline(ab_pdb, pdb_final[ch], steps)

2025-06-02 16:49:40,256 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_tidy Version: 5.0.1
2025-06-02 16:49:40,257 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_89c40cfc-2768-4fb2-a0ed-2832749b2784 directory successfully created
2025-06-02 16:49:40,258 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_0.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_89c40cfc-2768-4fb2-a0ed-2832749b2784
2025-06-02 16:49:40,259 [MainThread  ] [INFO ]  Appending optional boolean property
2025-06-02 16:49:40,259 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4G6K_0.pdb > ./data/antibody/pre/4G6K_H_reduced.pdb
2025-06-02 16:49:40,259 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:40,260 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4G6K_0.pdb > ./data/antibody/pre/4G6K_H_reduced.pdb

2025-06-02 16:49:40,285 [MainThread

### [Preparing the Antibody structure] Step 2: Merge regions into a single file
Merge **both chains** (Heavy, Light) into a single PDB file. <br>
- *4G6K_clean.pdb* (Heavy + Light chains, FV)

In [9]:
import zipfile
from biobb_pdb_tools.pdb_tools.biobb_pdb_merge import biobb_pdb_merge

# Join the generated PDB files in a single ZIP file 
zip_file_path = f'{out_path}/pre/{antibody_pdb}_HL.zip'
# Create a zip file and add the pdb_out file to it
with zipfile.ZipFile(zip_file_path, 'w') as zipf:
    zipf.write(pdb_final['H'], arcname=f'{antibody_pdb}_H.pdb')
    zipf.write(pdb_final['L'], arcname=f'{antibody_pdb}_L.pdb')
    
# Create properties dict and inputs/outputs
complexFile = f'{out_path}/pre/{antibody_pdb}_clean_merge.pdb' 

prop = def_dict({})

# Create and launch bb
biobb_pdb_merge(input_file_path = zip_file_path,  
                output_file_path=complexFile,  
                properties = prop)

2025-06-02 16:49:40,665 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_merge Version: 5.0.1
2025-06-02 16:49:40,666 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a7cdf169-4014-4ddf-898c-4ca68d27e6b8 directory successfully created
2025-06-02 16:49:40,668 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_HL.zip to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a7cdf169-4014-4ddf-898c-4ca68d27e6b8
2025-06-02 16:49:40,669 [MainThread  ] [INFO ]  pdb_merge /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a7cdf169-4014-4ddf-898c-4ca68d27e6b8/4G6K_H.pdb /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a7cdf169-4014-4ddf-898c-4ca68d27e6b8/4G6K_L.pdb > ./data/antibody/pre/4G6K_clean_merge.pdb
2025-06-02 16:49:40,669 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:40,670 [MainThread  ] [INFO ]  pdb_m

0

### [Preparing the Antibody structure] Step 3: Prepare PDB file to meet HADDOCK3 requirements
Add **single chain id** to the whole structure, **renumber residue ids**, and add chain id in the **segment identifier**.<br>
- *4G6K_clean_seg.pdb* 

In [10]:
# Create properties dict and inputs/outputs
antibody_prep = f'{out_path}/pre/{antibody_pdb}_clean.pdb'

steps = [
    (biobb_pdb_reres.biobb_pdb_reres,         {'number': 1}),      # Renumber the residues starting from 1
    (biobb_pdb_chain.biobb_pdb_chain,         {'chain': 'A'}),     # Modify the chain identifier column 
    (biobb_pdb_chainxseg.biobb_pdb_chainxseg, {}),                 # Swap the segment identifier for the chain identifier
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True}),   # Adhere to the format specifications
]

# Create and launch bb
pdb_tools_pipeline(complexFile, antibody_prep, steps)

2025-06-02 16:49:40,692 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_reres Version: 5.0.1
2025-06-02 16:49:40,693 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_84650287-6127-4bef-9f17-9ff8c5f15dba directory successfully created
2025-06-02 16:49:40,693 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_clean_merge.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_84650287-6127-4bef-9f17-9ff8c5f15dba
2025-06-02 16:49:40,693 [MainThread  ] [INFO ]  Appending optional boolean property
2025-06-02 16:49:40,693 [MainThread  ] [INFO ]  pdb_reres -1 /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_84650287-6127-4bef-9f17-9ff8c5f15dba/4G6K_clean_merge.pdb > ./data/antibody/pre/4G6K_clean.pdb
2025-06-02 16:49:40,694 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:40,694 [MainThread  ] [INFO ]  pdb_reres -1 /home/rchav

### [Preparing the Antibody structure] Step 4: Visualize final structure (antibody)

- Left: **Original antibody**, for comparison. Heavy Chain: Blue; Light Chain: Red.
- Right: **Prepared structure**, with only the variable domain (FV). Heavy Chain: Blue; Light Chain: Red.

In [109]:
# Show structures: Fv with and without contant domain
view1 = nv.show_structure_file(ab_pdb)
view1._remote_call('setSize', target='Widget', args=['500px','400px'])
view1.clear_representations()
view1.add_representation(repr_type='cartoon', selection=':H', color='blue')
view1.add_representation(repr_type='cartoon', selection=':L', color='red')
view1.add_representation(repr_type='surface', radius='.3', selection='1-120:H', color='blue')
view1.add_representation(repr_type='surface', radius='.3', selection='1-107:L', color='red')
view1.camera='orthographic'
view1
view2 = nv.show_structure_file(antibody_prep)
view2._remote_call('setSize', target='Widget', args=['500px','400px'])
view2.clear_representations()
view2.add_representation(repr_type='surface', radius='.3', selection='1-120:A', color='blue')
view2.add_representation(repr_type='surface', radius='.3', selection='121-228:A', color='red')
view2.camera='orthographic'
view2
ipywidgets.HBox([view1, view2])

HBox(children=(NGLWidget(), NGLWidget()))

## Preparing the antigen structure

The **Interleukin-1β (IL-1β) antigen** also needs to be processed to meet **HADDOCK3 requirements**. In particular, as each molecule given to **HADDOCK3** in a docking scenario must have a **unique chain id / segment id**, a new identifier ***-B-*** will be assigned.

***
**Building Blocks** used:
 - [biobb_pdb_tidy](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_tidy) from **biobb_pdb_tools.pdb_tools.biobb_pdb_tidy**
 - [biobb_pdb_delhetatm](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_delhetatm) from **biobb_pdb_tools.pdb_tools.biobb_pdb_delhetatm**
 - [biobb_pdb_selaltloc](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selaltloc) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selaltloc**
 - [biobb_pdb_keepcord](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_keepcord) from **biobb_pdb_tools.pdb_tools.biobb_pdb_keepcord**
 - [biobb_pdb_chain](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chain) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chain**
 - [biobb_pdb_chainxseg](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chainxseg) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chainxseg**

***

#### Complete pdb_tools version from the official HADDOCK3 tutorial:

`pdb_fetch 4I1B | pdb_tidy -strict | pdb_delhetatm | pdb_selaltloc | pdb_keepcoord | pdb_chain -B | pdb_chainxseg | pdb_tidy -strict > 4I1B_clean.pdb`

In [12]:
# Create properties dict and inputs/outputs
antigen_prep = f'{out_path}/pre/{antigen_pdb}_clean.pdb'

# Create and launch bb
steps = [
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True}),   # Adhere to the format specifications
    (biobb_pdb_delhetatm.biobb_pdb_delhetatm, {}),                 # Remove all HETATM records 
    (biobb_pdb_selaltloc.biobb_pdb_selaltloc, {}),                 # Modify the chain identifier column 
    (biobb_pdb_keepcoord.biobb_pdb_keepcoord, {}),                 # Remove all non-coordinate records 
    (biobb_pdb_chain.biobb_pdb_chain,         {'chain': 'B'}),     # Modify the chain identifier column 
    (biobb_pdb_chainxseg.biobb_pdb_chainxseg, {}),                 # Swap the segment identifier for the chain identifier
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True})    # Adhere to the format specifications
]
pdb_tools_pipeline(ag_pdb, antigen_prep, steps)

2025-06-02 16:49:40,816 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_tidy Version: 5.0.1
2025-06-02 16:49:40,816 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_bf49b215-367f-4760-86d1-2c08ce9e04ba directory successfully created
2025-06-02 16:49:40,817 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4I1B_0.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_bf49b215-367f-4760-86d1-2c08ce9e04ba
2025-06-02 16:49:40,817 [MainThread  ] [INFO ]  Appending optional boolean property
2025-06-02 16:49:40,817 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4I1B_0.pdb > ./data/antibody/pre/4I1B_clean.pdb
2025-06-02 16:49:40,818 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:40,818 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4I1B_0.pdb > ./data/antibody/pre/4I1B_clean.pdb

2025-06-02 16:49:40,836 [MainThread  ] [INF

### Visualize final structure (antigen)

- Left: **Original antigen**, for comparison. 
- Right: **Prepared structure**.

Note that the only difference in this case is the **chain / segment ids** (changed from ***-A-*** to ***-B-***).

In [160]:

view1 = nv.show_structure_file(ag_pdb)
view1._remote_call('setSize', target='Widget', args=['500px','400px'])
view1.clear_representations()
view1.add_representation(repr_type='cartoon', selection=':A', color='green')
view1.camera='orthographic'
view1
view2 = nv.show_structure_file(antigen_prep)
view2._remote_call('setSize', target='Widget', args=['500px','400px'])
view2.clear_representations()
view2.add_representation(repr_type='cartoon', selection=':B', color='green')
view2.camera='orthographic'
view2
ipywidgets.HBox([view1, view2])

HBox(children=(NGLWidget(), NGLWidget()))

## Preparing the reference structure

Finally, since an experimentally solved structure of the **Antibody-Antigen** complex is available, we aim to compare the **intermediate results** of **HADDOCK3** with it. For this, the **reference PDB** should also be prepared. 

***
**Building Blocks** used:
 - [biobb_pdb_selchain](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selchain) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selchain**
 - [biobb_pdb_selres](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_selres) from **biobb_pdb_tools.pdb_tools.biobb_pdb_selres**
 - [biobb_pdb_reres](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_reres) from **biobb_pdb_tools.pdb_tools.biobb_pdb_reres**
 - [biobb_pdb_merge](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_merge) from **biobb_pdb_tools.pdb_tools.biobb_pdb_merge**
 - [biobb_pdb_chain](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chain) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chain**
 - [biobb_pdb_chainxseg](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_chainxseg) from **biobb_pdb_tools.pdb_tools.biobb_pdb_chainxseg**
 - [biobb_pdb_tidy](https://biobb-pdb-tools.readthedocs.io/en/latest/pdb_tools.html#module-pdb_tools.biobb_pdb_tidy) from **biobb_pdb_tools.pdb_tools.biobb_pdb_tidy**
***

### [Preparing the Reference structure] Step 1: Select and extract Heavy and Light antibody regions/chains from the complex
**Extract chains** (Heavy: H, Light; L) and select specific residues from the **Variable Domain (FV)**.<br>
Two PDB files should be generated:
- *4G6M_H_reduced.pdb* (Heavy, FV)
- *4G6M_L_reduced.pdb* (Light, FV)

In [14]:
# Create properties dict and inputs/outputs
pdb_ref = {}
pdb_in  = f'{out_path}/pre/{complex_pdb}_0.pdb'

for ch, sel in [('H',120),('L',107)]:
    pdb_ref[ch] = f'{out_path}/pre/{complex_pdb}_{ch}_reduced.pdb'
    steps = [
        (biobb_pdb_selchain.biobb_pdb_selchain,   {'chains': ch}),                # Extract chains
        (biobb_pdb_delhetatm.biobb_pdb_delhetatm, {}),                            # Remove all HETATM records
        (biobb_pdb_fixinsert.biobb_pdb_fixinsert, {}),                            # Delete insertion codes and shift residue numbering
        (biobb_pdb_selaltloc.biobb_pdb_selaltloc, {}),                            # Modify the chain identifier column 
        (biobb_pdb_keepcoord.biobb_pdb_keepcoord, {}),                            # Remove all non-coordinate records 
        (biobb_pdb_selres.biobb_pdb_selres,       {'selection': '1:'+str(sel)}),  # Select residues by their range
        (biobb_pdb_tidy.biobb_pdb_tidy,           {})                             # Adhere to the format specifications
    ]

    # Create and launch bb
    pdb_tools_pipeline(pdb_in, pdb_ref[ch], steps)
    

2025-06-02 16:49:41,008 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_selchain Version: 5.0.1
2025-06-02 16:49:41,009 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_aea2a586-2a42-49ca-8ee5-c4ef855508fb directory successfully created
2025-06-02 16:49:41,010 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_0.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_aea2a586-2a42-49ca-8ee5-c4ef855508fb
2025-06-02 16:49:41,010 [MainThread  ] [INFO ]  Appending chains to select
2025-06-02 16:49:41,010 [MainThread  ] [INFO ]  pdb_selchain -H /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_aea2a586-2a42-49ca-8ee5-c4ef855508fb/4G6M_0.pdb > ./data/antibody/pre/4G6M_H_reduced.pdb
2025-06-02 16:49:41,010 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:41,010 [MainThread  ] [INFO ]  pdb_selchain -H /home/rchaves/repo/biobb_wf

### [Preparing the Reference structure] Step 2: Merge Heavy and Light regions (from the complex) into a single file
Merge **both chains** (Heavy, Light) into a single PDB file. <br>
- *4G6M_antibody_clean.pdb* (Heavy + Light chains, FV)

In [15]:
# Join the generated PDB files in a single ZIP file 
zip_file_path = f'{out_path}/pre/{complex_pdb}_HL.zip'
# Create a zip file and add the pdb_out file to it
with zipfile.ZipFile(zip_file_path, 'w') as zipf:
    zipf.write(pdb_ref['H'], arcname=f'{complex_pdb}_H.pdb')
    zipf.write(pdb_ref['L'], arcname=f'{complex_pdb}_L.pdb')
    
# Create properties dict and inputs/outputs
complexFile_ref_antibody = f'{out_path}/pre/{complex_pdb}_antibody_clean.pdb' 

prop = def_dict({})

# Create and launch bb
biobb_pdb_merge(input_file_path = zip_file_path,  
                output_file_path=complexFile_ref_antibody,  
                properties = prop)

2025-06-02 16:49:41,358 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_merge Version: 5.0.1
2025-06-02 16:49:41,359 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_ebb37ada-7fb0-47e5-beeb-daf97f2423e9 directory successfully created
2025-06-02 16:49:41,360 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_HL.zip to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_ebb37ada-7fb0-47e5-beeb-daf97f2423e9
2025-06-02 16:49:41,360 [MainThread  ] [INFO ]  pdb_merge /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_ebb37ada-7fb0-47e5-beeb-daf97f2423e9/4G6M_H.pdb /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_ebb37ada-7fb0-47e5-beeb-daf97f2423e9/4G6M_L.pdb > ./data/antibody/pre/4G6M_antibody_clean.pdb
2025-06-02 16:49:41,361 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:41,361 [MainThread  ] [INFO ]  pd

0

### [Preparing the Reference structure] Step 3: Prepare the Antibody PDB file (from the complex) to match HADDOCK3 requirements
Add **single chain id** to the whole structure, **renumber residue ids**, and add chain id in the **segment identifier**, with the aim of having a **reference PDB** that can be directly compared with the **HADDOCK3** generated models.<br>
- *4G6M_clean_antibody_final.pdb* 

In [16]:
# Create properties dict and inputs/outputs
complexFile_ref_antibody_final = f'{out_path}/pre/{complex_pdb}_clean_antibody_final.pdb'

steps = [
    (biobb_pdb_reres.biobb_pdb_reres,         {'number': 1}),     # Renumber the residues starting from 1
    (biobb_pdb_chain.biobb_pdb_chain,         {'chain': 'A'}),    # Modify the chain identifier column 
    (biobb_pdb_chainxseg.biobb_pdb_chainxseg, {}),                # Swap the segment identifier for the chain identifier
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True})   # Adhere to the format specifications
]

# Create and launch bb
pdb_tools_pipeline(complexFile_ref_antibody, complexFile_ref_antibody_final, steps)


2025-06-02 16:49:41,385 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_reres Version: 5.0.1
2025-06-02 16:49:41,385 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a560c756-d6dd-4cd5-bf3b-8a2e34079cbe directory successfully created
2025-06-02 16:49:41,386 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_antibody_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a560c756-d6dd-4cd5-bf3b-8a2e34079cbe
2025-06-02 16:49:41,386 [MainThread  ] [INFO ]  Appending optional boolean property
2025-06-02 16:49:41,386 [MainThread  ] [INFO ]  pdb_reres -1 /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_a560c756-d6dd-4cd5-bf3b-8a2e34079cbe/4G6M_antibody_clean.pdb > ./data/antibody/pre/4G6M_clean_antibody_final.pdb
2025-06-02 16:49:41,387 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:41,387 [MainThread  ] [INFO ]  pdb

### [Preparing the Reference structure] Step 4: Prepare the Antigen PDB file (from the complex) to match HADDOCK3 requirements 
Add **single chain id** to the whole structure, **renumber residue ids**, and add chain id in the **segment identifier**, with the aim of having a **reference PDB** that can be directly compared with the **HADDOCK3** generated models.<br>
- *4G6M_clean_antigen_final.pdb* 

In [17]:
# Create properties dict and inputs/outputs
complexFile_ref_antigen_final = f'{out_path}/pre/{complex_pdb}_clean_antigen_final.pdb'

steps = [
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True}),    # Adhere to the format specifications
    (biobb_pdb_selchain.biobb_pdb_selchain,   {'chains': 'A'}),     # Extract chains
    (biobb_pdb_chain.biobb_pdb_chain,         {'chain': 'B'}),      # Modify the chain identifier column 
    (biobb_pdb_chainxseg.biobb_pdb_chainxseg, {}),                  # Swap the segment identifier for the chain identifier
    (biobb_pdb_delhetatm.biobb_pdb_delhetatm, {}),                  # Remove all HETATM records
    (biobb_pdb_fixinsert.biobb_pdb_fixinsert, {}),                  # Delete insertion codes and shift residue numbering
    (biobb_pdb_selaltloc.biobb_pdb_selaltloc, {}),                  # Modify the chain identifier column 
    (biobb_pdb_keepcoord.biobb_pdb_keepcoord, {}),                  # Remove all non-coordinate records 
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True})     # Adhere to the format specifications
]
# Create and launch bb
pdb_tools_pipeline(cx_pdb, complexFile_ref_antigen_final, steps)


2025-06-02 16:49:41,489 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_tidy Version: 5.0.1
2025-06-02 16:49:41,490 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_99a093c8-89a9-4898-be53-cc02646b803d directory successfully created
2025-06-02 16:49:41,490 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_0.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_99a093c8-89a9-4898-be53-cc02646b803d
2025-06-02 16:49:41,491 [MainThread  ] [INFO ]  Appending optional boolean property
2025-06-02 16:49:41,491 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4G6M_0.pdb > ./data/antibody/pre/4G6M_clean_antigen_final.pdb
2025-06-02 16:49:41,491 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:41,491 [MainThread  ] [INFO ]  pdb_tidy -strict ./data/antibody/pre/4G6M_0.pdb > ./data/antibody/pre/4G6M_clean_antigen_final.pdb

2025-06-02 16:4

### [Preparing the Reference structure] Step 5: Merge Heavy + Light regions and Antigen (from the complex) into a single file
Merge **all chains** (Antibody Heavy chain, Antibody Light chain, Antigen) into a single PDB file. <br>
- *4G6M_clean.pdb* (Heavy + Light chains -FV- + Antigen)

In [18]:
## Join the generated PDB files in a single ZIP file ##

zip_file_path = f'{out_path}/pre/{complex_pdb}_HL_B.zip'
# Create a zip file and add the pdb_out file to it
with zipfile.ZipFile(zip_file_path, 'w') as zipf:
    zipf.write(complexFile_ref_antibody_final, arcname=f'{complex_pdb}_antibody.pdb')
    zipf.write(complexFile_ref_antigen_final, arcname=f'{complex_pdb}_antigen.pdb')
    
# Create properties dict and inputs/outputs
complex_prep = f'{out_path}/pre/{complex_pdb}_clean.pdb' 

steps = [
    (biobb_pdb_merge,         {}),                                 # Merges several PDB files into one
    (biobb_pdb_tidy.biobb_pdb_tidy,           {'strict': True})    # Adhere to the format specifications
]

# Create and launch bb
pdb_tools_pipeline(zip_file_path, complex_prep, steps)

2025-06-02 16:49:41,710 [MainThread  ] [INFO ]  Module: biobb_pdb_tools.pdb_tools.biobb_pdb_merge Version: 5.0.1
2025-06-02 16:49:41,710 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8871192a-6c1f-468e-a0e1-3b4919b97a94 directory successfully created
2025-06-02 16:49:41,712 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_HL_B.zip to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8871192a-6c1f-468e-a0e1-3b4919b97a94
2025-06-02 16:49:41,713 [MainThread  ] [INFO ]  pdb_merge /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8871192a-6c1f-468e-a0e1-3b4919b97a94/4G6M_antibody.pdb /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8871192a-6c1f-468e-a0e1-3b4919b97a94/4G6M_antigen.pdb > ./data/antibody/pre/4G6M_clean.pdb
2025-06-02 16:49:41,713 [MainThread  ] [INFO ]  Creating command line with instructions and required arguments
2025-06-02 16:49:41,713 [MainThread  ] [INFO

### [Preparing the Reference structure] Step 6: Visualize final structure (antibody-antigen complex)

Visualizing the processed **antibody-antigen complex** using **NGLview**:  
- Left: **Original Antibody-Antigen complex** for comparison. Heavy Chain: Green; Light Chain: Blue; Antigen: Red
- Right: **Processed Antibody-Antigen complex** with only the variable domain (FV). Heavy Chain: Green; Light Chain: Blue; Antigen: Red

In [159]:
# Show structures: antibody, antigen and complex
view1 = nv.show_structure_file(cx_pdb)
view1._remote_call('setSize', target='Widget', args=['500px','400px'])
view1.clear_representations()
view1.add_representation(repr_type='surface', radius='0.3', selection=':H', color='blue')
view1.add_representation(repr_type='surface', radius='0.3', selection=':L', color='red')
view1.add_representation(repr_type='surface', radius='0.3', selection=':A', color='green')
view1.camera='orthographic'
view1
view2 = nv.show_structure_file(complex_prep)
view2._remote_call('setSize', target='Widget', args=['500px','400px'])
view2.clear_representations()
view2.add_representation(repr_type='surface', radius='0.3', selection='1-120:A', color='blue')
view2.add_representation(repr_type='surface', radius='0.3', selection='121-227:A', color='red')
view2.add_representation(repr_type='surface', radius='0.3', selection=':B', color='green')
view2.camera='orthographic'
view2
ipywidgets.HBox([view1, view2])

HBox(children=(NGLWidget(), NGLWidget()))

# Defining HADDOCK3 restraints

The **HADDOCK3 protein-protein docking** tool integrates diverse sources of information, including experimental data, biochemical and biophysical insights, bioinformatics predictions, and prior knowledge, to **guide** and **refine** the docking process effectively. In this tutorial, knowledge of the **hypervariable loops** ([CDRs](https://en.wikipedia.org/wiki/Complementarity-determining_region)) on the **antibody**, and **epitope** information identified from **NMR experiments** for the **antigen** will be used to **guide the docking**.

The small part of the **antibody Fv region** that binds the **antigen** is called **paratope**. The part of the **antigen** that binds to an **antibody** is called **epitope** (see image, left). The **paratope** consists of six **highly flexible loops**, known as **complementarity-determining regions** (CDRs) or **hypervariable loops** whose sequence and conformation are altered to bind to different **antigens** (see image, right). 

Both regions are going to be used as **HADDOCK3 restrictions** to **guide** the **docking process**.

<table cellspacing="0" cellpadding="0" style="border-collapse: collapse; width: 100%;">
    <tr style="background: white;">
        <td style="text-align: center; vertical-align: middle; width: 25%;">
            <img src="imgs/EpitopeParatope.png" alt="Epitope/Paratope" style="max-width: 100%; height: auto; display: block; margin: auto;">
            <i>Paratope (Antibody) and Epitope (Antigen) representation<br> 
        </td>
        <td style="text-align: center; vertical-align: middle; width: 50%;">
            <img src="imgs/CDRs_reduced.png" alt="CDRs" style="max-width: 100%; height: auto; display: block; margin: auto;">
            <i>Paratope complementarity-determining regions (CDRs)<br> Figure represented with permissions from the official <a href="https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/#introduction">HADDOCK3 antibody-antigen tutorial</a></i>
        </td>
    </tr>
</table>

### Paratope Restraints

Developing new **vaccines** and **antibody therapeutics** is a lengthy process, often spanning several years. **Prediction** and understanding of the **paratope** region (**antibody binding site**) can **accelerate** this timeline and **lower the costs**. 
  
As a result, the research community has developed a wide range of tools to **predict paratope regions** (residues within **hypervariable loops** that play a key role in **antibody-antigen binding**) directly from **antibody sequences**.

This tutorial will work with a list of **paratope residues** obtained using the **ProABC-2 predictor** (see reference). **ProABC-2** uses a **convolutional neural network** to identify not only residues which are located in the **paratope region** but also the nature of interactions they are most likely involved in (hydrophobic or hydrophilic). 

**Paratope residues** used are those with either an **overall probability >= 0.4** or a **probability for hydrophobic or hydrophilic > 0.3**. Residue numbering corresponding to the processed file: *4G6K_clean.pdb*

References: <br>

**proABC-2: PRediction of AntiBody contacts v2 and its application to information-driven docking.**<br>
*F. Ambrosetti, T. H. Olsen, P. P. Olimpieri, B. Jiménez-García, E. Milanetti, P. Marcatilli, A. MJJ Bonvin*<br>
*Bioinformatics, 2020;36(20):5107-5108.*<br>
*Available at: https://doi.org/10.1093/bioinformatics/btaa644*<br>
*Source Code: https://github.com/haddocking/proabc-2*
***

In [20]:
# Antibody paratope residue list from proABC-2
paratope_sel = '31,32,33,34,35,52,54,55,56,100,101,102,103,104,105,106,151,152,169,170,173,211,212,213,214,216'

In [157]:
# Show antibody paratope
view = nv.show_structure_file(antibody_prep)
view._remote_call('setSize', target='Widget', args=['500px','500px'])
view.add_representation(repr_type='surface', selection=paratope_sel.replace(',', ', '), color='#a271a2')
centered_view = ipywidgets.HBox([view], layout=ipywidgets.Layout(justify_content="center"))
centered_view

HBox(children=(NGLWidget(),), layout=Layout(justify_content='center'))

### Epitope Restraints

The work describing the **crystal structures** deposited in the **PDB** for the **gevokizumab antibody** (PDB: 4G6K) and the **antibody-antigen** (Gevokizumab-Interleukin1β) complex (PDB: 4G6M) also comes with experimental **NMR chemical shift** titration experiments to map the **binding site** of the antibody on **Interleukin1β**. The residues affected by binding are listed in Table 5 of Blech et al. JMB 2013 (see references below). 

References: <br>

**One Target—Two Different Binding Modes: Structural Insights into Gevokizumab and Canakinumab Interactions to Interleukin-1β.**<br>
*M. Blech, D. Peter, P Fischer, M. M.T. Bauer, M. Hafner, M. Zeeb, H. Nar*<br>
*Journal of Molecular Biology, 2013;425(1):94-111.*<br>
*Available at: https://doi.org/10.1016/j.jmb.2012.09.021*
***

In [22]:
# Antigen Epitope residue list from Blech et al. JMB 2013, Table 5
epitope_sel  = '72,73,74,75,81,83,84,89,90,92,94,96,97,98,115,116,117'

In [155]:
# Show antibody paratope
view = nv.show_structure_file(antigen_prep)
view._remote_call('setSize', target='Widget', args=['500px','500px'])
view.add_representation(repr_type='surface', selection=epitope_sel.replace(',', ', '), color='#a9d18e')
centered_view = ipywidgets.HBox([view], layout=ipywidgets.Layout(justify_content="center"))
centered_view

HBox(children=(NGLWidget(),), layout=Layout(justify_content='center'))

### Visualize interfaces (paratope, epitope)

Visualizing the **predicted paratope** and the **described epitope** using **NGLview**:  
- Left: **predicted paratope** (light purple), with only the antibody variable domain (Fv). 
- Center: **described epitope** (light green), with the antigen.
- Right: **predicted paratope** (light purple) and  **described epitope** (light green) shown on the reference. **Antibody-Antigen complex** with only the variable domain (Fv). Antibody Fv: Blue; Antigen: Green.

In [154]:
# Show structures: antibody, antigen and complex (for reference)
view1 = nv.show_structure_file(antibody_prep)
view1._remote_call('setSize', target='Widget', args=['350px','400px'])
view1.clear_representations()
view1.add_representation(repr_type='cartoon', selection=':A', color='purple')
view1.add_representation(repr_type='surface', ratio='0.3', selection=paratope_sel.replace(',', ', '), color='#a271a2')
view1.camera='orthographic'
view1
view2 = nv.show_structure_file(antigen_prep)
view2._remote_call('setSize', target='Widget', args=['350px','400px'])
view2.clear_representations()
view2.add_representation(repr_type='cartoon', selection=':B', color='green')
view2.add_representation(repr_type='surface', ratio='0.3', selection=epitope_sel.replace(',', ', '), color='#a9d18e')
view2.camera='orthographic'
view2
view3 = nv.show_structure_file(complex_prep)
view3._remote_call('setSize', target='Widget', args=['350px','400px'])
view3.clear_representations()
view3.add_representation(repr_type='cartoon', selection=':A', color='purple')
view3.add_representation(repr_type='surface', ratio='0.3', selection=paratope_sel_ngl, color='#a271a2')
paratope_sel_ngl = ' '.join(f"{num}:A" for num in paratope_sel.split(','))
view3.add_representation(repr_type='cartoon', selection=':B', color='green')
view3.add_representation(repr_type='surface', ratio='0.3', selection=epitope_sel_ngl, color='#a9d18e')
epitope_sel_ngl = ' '.join(f"{num}:B" for num in epitope_sel.split(','))
view3.camera='orthographic'
view3
ipywidgets.HBox([view1, view2, view3])

HBox(children=(NGLWidget(), NGLWidget(), NGLWidget()))

### HADDOCK3 passive restraints

**Binding site interfaces** defined in the previous steps are tagged as **active residues** in **HADDOCK3**. Additionaly, **HADDOCK3** also gives the possibility to define **passive residues** to refine potentially **incomplete binding sites**. These residues are selected based on the **surface neighbors** of the **active residues**, and are included in the **interface definition**. However, they do not incur any **energetic penalty** if they are not part of the binding site in the final models. In contrast, **active residues** (typically identified or predicted as **key binding site residues**) are subject to an **energetic penalty** if they are not part of the binding site in the final models.

Here is the **HADDOCK3 definition** of **active** and **passive** residues:

**Active residues**: These residues are “forced” to be at the interface. If they are not part of the interface in the final models, an energetic penalty will be applied. The interface in this context is defined by the union of active and passive residues on the partner molecules.

**Passive residues**: These residues are expected to be at the interface. However, if they are not, no energetic penalty is applied.

In **HADDOCK3**, **passive** and **active residues** are defined in a **specific file** containing two lines:

- The first line corresponds to the list of **active residues** (numbers separated by spaces)
- The second line corresponds to the list of **passive residues** (numbers separated by spaces).

The file must always consist of two lines, but a line can be empty (e.g., if you do not want to define **active or passive residues** for one molecule). However, there must be at least one set of **active residues** defined for one of the molecules.

In this tutorial, a list of **passive residues** are defined for the **epitope region**, whereas no **passive residues** are going to be added to the **paratope region**, using only the **active residues** to guide the **docking process**. 

***
**Building Blocks** used:
 - [haddock3_passive_from_active](https://biobb-haddock.readthedocs.io/en/latest/haddock_restraints.html#module-haddock_restraints.haddock3_passive_from_active) from **biobb_haddock.haddock_restraints.haddock3_passive_from_active**
***

In [25]:
# Defining only active residues for antibody paratope region 
# Adding empty line as passive residues
ab_actpass = f'{out_path}/pre/{antibody_pdb}_actpass.txt' 
with open(ab_actpass, 'w') as f:
    f.write( paratope_sel.replace(',', ' ')+'\n\n')

In [26]:
# For the antigen, we will use the epitope selection as the active selection
# and some reidues around it as passsive

from biobb_haddock.haddock_restraints.haddock3_passive_from_active import haddock3_passive_from_active

# Create properties dict and inputs/outputs
ag_actpass = f'{out_path}/pre/{antigen_pdb}_actpass.txt'

prop = def_dict({
    'active_list' : epitope_sel
})

# Create and launch bb
haddock3_passive_from_active( 
    input_pdb_path      = antigen_prep,
    output_actpass_path = ag_actpass,
    properties          = prop
)

2025-06-02 16:49:41,878 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock_restraints.haddock3_passive_from_active Version: 5.0.1
2025-06-02 16:49:41,879 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4bbe36c9-a98a-4f9b-b19b-ebb9c43cf11b directory successfully created
2025-06-02 16:49:41,880 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4I1B_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4bbe36c9-a98a-4f9b-b19b-ebb9c43cf11b
2025-06-02 16:49:41,880 [MainThread  ] [INFO ]  haddock3-restraints passive_from_active /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4bbe36c9-a98a-4f9b-b19b-ebb9c43cf11b/4I1B_clean.pdb 72,73,74,75,81,83,84,89,90,92,94,96,97,98,115,116,117 &> /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4bbe36c9-a98a-4f9b-b19b-ebb9c43cf11b/4I1B_actpass.txt

2025-06-02 16:49:42,283 [MainThread  ] [INFO ]  Executing: haddock3-restraints passive

0

### Visualize active / passive residues (epitope)

Visualizing the defined **active** (red) and **passive** (green) residues for the **antigen epitope region** using **NGLview** 


In [161]:
display_actpass(antigen_prep, ag_actpass)

NGLWidget(layout=Layout(width='100%'))

### HADDOCK3 ambiguous restraints

Once **active** and **passive residues** are identified and defined for both molecules, they need to be transformed into CNS-formatted **Ambiguous Interaction Restraints** (AIR) files, so that they can be integrated in the **docking process**.  

**HADDOCK3** uses **CNS** as computational engine. A description of the format for the various restraint types supported by **HADDOCK3** can be found in the **Nature Protocol HADDOCK paper**, 2024, Box 1 (see reference below).

References: <br>

**The HADDOCK2.4 web server for integrative modeling of biomolecular complexes.**<br>
*R. Honorato et al.*<br>
*Nat. Protoc., 2024, 19, 3219–3241.*<br>
*Available at: https://doi.org/10.1038/s41596-024-01011-0*
***

**Building Blocks** used:
 - [haddock3_actpass_to_ambig](https://biobb-haddock.readthedocs.io/en/latest/haddock_restraints.html#module-haddock_restraints.haddock3_actpass_to_ambig) from **biobb_haddock.haddock_restraints.haddock3_actpass_to_ambig**
***

In [28]:
# Convert active/passive to ambiguous restraints
from biobb_haddock.haddock_restraints.haddock3_actpass_to_ambig import haddock3_actpass_to_ambig

# Create properties dict and inputs/outputs
complex_tbl = f'{out_path}/pre/ambig-paratope-NMR-epitope.tbl'

prop = def_dict({
    'segid_one': 'A', 
    'segid_two': 'B'
})

# Create and launch bb
haddock3_actpass_to_ambig( 
    input_actpass1_path=ab_actpass,
    input_actpass2_path=ag_actpass,    
    output_tbl_path=complex_tbl,
    properties = prop
)

2025-06-02 16:49:42,305 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock_restraints.haddock3_actpass_to_ambig Version: 5.0.1
2025-06-02 16:49:42,305 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_456220c5-5abe-4b60-ba67-b79c29503c52 directory successfully created
2025-06-02 16:49:42,306 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_actpass.txt to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_456220c5-5abe-4b60-ba67-b79c29503c52
2025-06-02 16:49:42,307 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4I1B_actpass.txt to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_456220c5-5abe-4b60-ba67-b79c29503c52
2025-06-02 16:49:42,308 [MainThread  ] [INFO ]  haddock3-restraints active_passive_to_ambig /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_456220c5-5abe-4b60-ba67-b79c29503c52/4G6K_actpass.txt /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks

0

### Additional restraints for multi-chain proteins

As an **antibody** consists of two separate chains, it is important to define a few **distance restraints** to keep them together during the high temperature **flexible refinement** stage of **HADDOCK3** otherwise they might slightly drift appart. 

The following step generates **CA-CA distance restraints** with the exact distance measured between **randomly picked CA atoms pairs**, which will keep the two chains together.

***

**Building Blocks** used:
 - [haddock3_restrain_bodies](https://biobb-haddock.readthedocs.io/en/latest/haddock_restraints.html#module-haddock_restraints.haddock3_restrain_bodies) from **biobb_haddock.haddock_restraints.haddock3_restrain_bodies**
***

In [29]:
# Tie antibody chains together
from biobb_haddock.haddock_restraints.haddock3_restrain_bodies import haddock3_restrain_bodies

# Create properties dict and inputs/outputs
body_tbl = f'{out_path}antibody-unambig.tbl'

prop = def_dict({})

# Create and launch bb
haddock3_restrain_bodies( 
    input_structure_path=antibody_prep,
    output_tbl_path=body_tbl,
    properties = prop
)

2025-06-02 16:49:42,648 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock_restraints.haddock3_restrain_bodies Version: 5.0.1
2025-06-02 16:49:42,649 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_e3ea1606-f969-42f9-83c3-b220607edaf8 directory successfully created
2025-06-02 16:49:42,649 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_e3ea1606-f969-42f9-83c3-b220607edaf8
2025-06-02 16:49:42,650 [MainThread  ] [INFO ]  haddock3-restraints restrain_bodies /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_e3ea1606-f969-42f9-83c3-b220607edaf8/4G6K_clean.pdb &> /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_e3ea1606-f969-42f9-83c3-b220607edaf8/antibodyantibody-unambig.tbl

2025-06-02 16:49:42,969 [MainThread  ] [INFO ]  Executing: haddock3-restraints restrain_bodies /home/rchaves/repo/biobb_wf_haddock/biobb

0

# Docking


**HADDOCK3** introduced a new **modular version** of the original **HADDOCK**, broken down in a catalogue of independent modules and then enriched with powerful analysis tools and third-party integrations (https://github.com/haddocking/haddock3, see reference below). In **HADDOCK3**, users have the freedom to configure **docking workflows** into functional pipelines by combining the different modules, thus adapting the **workflows** to their projects. **HADDOCK3** has therefore developed to truthfully work like a puzzle of many pieces (simulation modules) that users can combine freely. In the following cells, **BioBB building blocks** wrapping these independent modules are used to build a **HADDOCK3 docking workflow**, step by step. If you are interested in the **HADDOCK3 complete workflow** configuration file, please refer to the [official HADDOCK3 antibody-antigen tutorial section](https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/#haddock3-workflow-definition).  

<div align="center">
    <img src="imgs/HADDOCK3-workflow-scheme.png" alt="HADDOCK3 modularity" title="HADDOCK3 modularity" width="700"/>
    <br><i>HADDOCK3 workflow modularity<br>Figure represented with permissions from the official 
    <a href="https://www.bonvinlab.org/education/HADDOCK3/HADDOCK3-antibody-antigen/#introduction">HADDOCK3 antibody-antigen tutorial</a></i>
    <br><br>
</div>

The **docking workflow** consists of **14 steps**, outlined below. Since this is a **demonstration workflow**, and given the availability of reliable information on the **paratope** and **epitope**, the sampling steps have been reduced. This optimization minimizes computational cost and accelerates time-to-results while maintaining the accuracy of the generated models.

[**1. Create topology**](#1.-Create-topology): Generates the topologies for the CNS engine and builds missing atoms<br>
[**2. Rigid Body sampling**](#2.-Rigid-Body-sampling): Preforms rigid body energy minimization<br>
[**3. 1st CAPRI evaluation**](#3.-1st-CAPRI-evaluation): Calculates CAPRI metrics (i-RMSD, l-RMSD, Fnat, DockQ) with respect to the reference structure (complex)<br>
[**4. Select Top structures**](#4.-Select-Top-structures): Selects the top X models from the previous module<br>
[**5. Flexible Refinement**](#5.-Flexible-Refinement): Preforms semi-flexible refinement of the interface<br>
[**6. 2nd CAPRI evaluation**](#6.-2nd-CAPRI-evaluation): Calculates CAPRI metrics with the models generated in the previous step<br>
[**7. Energy minimization refinement**](#7.-Energy-minimization-refinement): Final refinement by energy minimization<br>
[**8. 3rd CAPRI evaluation**](#8.-3rd-CAPRI-evaluation): Calculates CAPRI metrics with the models generated in the previous step<br>
[**9. Clustering**](#9.-Clustering): Clustering of models based on the *Fraction of Common Contacts* (FCC)<br>
[**10. Selecting top clusters**](#10.-Selecting-top-clusters): Selects the top models of all clusters<br>
[**11. Final CAPRI evaluation**](#11.-Final-CAPRI-evaluation): Calculates CAPRI metrics with the models generated in the previous step<br>
[**12. Contacts analysis**](#12.-Contacts-analysis): Contacts analysis of intermolecular contacts<br>
[**13. Docking Results**](#13.-Docking-Results): Visualize a summary of the final models with statistics and energies<br>


To know more about the HADDOCK3 modules, please refer to the [**HADDOCK3 modules documentation**](https://www.bonvinlab.org/haddock3/modules/index.html)

References: <br>

**HADDOCK3: A modular and versatile platform for integrative modelling of biomolecular complexes.**<br>
*M. Giulini, V. Reys, J.M.C. Teixeira, B. Jiménez-García, R. V. Honorato, A. Kravchenko, X. Xu, R. Versini, A. Engel, S. Verhoeven, A. M.J.J. Bonvin*<br>
*BioArxiv, 2025*<br>
*Available at: https://doi.org/10.1101/2025.04.30.651432*
***

In [30]:
# Important variables to remember; those storing file names needed for the Docking process

# antibody_prep  --> Processed antibody PDB file
# antigen_prep --> Processed antigen PDB file
# complex_prep --> Processed antibody-antigen complex PDB file

# complex_tbl --> HADDOCK paratope/epitope restraints (AIR file)
# body_tbl --> HADDOCK multi-body restraints

***
### 1. Create topology

Create the **CNS all-atom topology**: CNS compatible parameters (.param) and topologies (.psf) for each of the input structures. Detects **missing atoms**, including **hydrogen atoms**, re-builds them when missing, builds and writes out **topologies** (psf) and **coordinates** (PDB) files.

This module is a **pre-requisite** to run any downstream steps and so is often used as the first module in a **HADDOCK3 workflow**.

***
**Building Blocks** used:
 - [topology](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.topology) from **biobb_haddock.haddock.topology**
***
Corresponding **HADDOCK3 module**:
 - [topo_aa](https://www.bonvinlab.org/haddock3/modules/topology/haddock.modules.topology.topoaa.html) from **topology_modules**
***

In [31]:
from biobb_haddock.haddock.topology import topology
step_idx = 1

# Create properties dict and inputs/outputs
mol1_output_top_zip_path = f'{out_path}/docking/{step_idx}/top_mol1.zip'
mol2_output_top_zip_path = f'{out_path}/docking/{step_idx}/top_mol2.zip'
wf_topology              = f'{out_path}/docking/{step_idx}/wf_topology.zip'

prop = {}

# Create and launch bb
topology(mol1_input_pdb_path        = antibody_prep,
         mol2_input_pdb_path        = antigen_prep,
         mol1_output_top_zip_path   = mol1_output_top_zip_path,
         mol2_output_top_zip_path   = mol2_output_top_zip_path,
         output_haddock_wf_data_zip = wf_topology,
         properties                 = def_dict(prop)
)

2025-06-02 16:49:42,985 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.topology Version: 5.0.1
2025-06-02 16:49:42,986 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_c03a5f55-4b46-4e19-818b-bed1f4696518 directory successfully created
2025-06-02 16:49:42,987 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6K_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_c03a5f55-4b46-4e19-818b-bed1f4696518
2025-06-02 16:49:42,987 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4I1B_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_c03a5f55-4b46-4e19-818b-bed1f4696518
2025-06-02 16:49:42,988 [MainThread  ] [INFO ]  haddock3 e8c27436-914f-416e-9019-59e24a5d68c5/haddock.cfg

2025-06-02 16:49:44,836 [MainThread  ] [INFO ]  Executing: haddock3 e8c27436-914f-416e-9019-59e24a5d68c5/haddock.cfg...
2025-06-02 16:49:44,836 [MainThread  ] [INFO ]  Exit code: 0
2025-06-02 16:49:44,837

0

***
### 2. Rigid Body sampling

**Randomization of orientations** and **rigid-body minimization**. Interacting partners are treated as **rigid bodies**, meaning that all geometrical parameters such as **bond lengths, bond angles, and dihedral angles** are **frozen**. The partners are first separated in space and randomly rotated around their respective centres of mass. Afterwards, the molecules are brought together by **rigid-body energy minimization** with rotations and translation as the only degrees of freedom.

The **driving force** for this **energy minimization** is the **energy function**, which consists of the **intermolecular van der Waals** and **electrostatic energy** terms and the **restraints** defined to **guide** the docking. The definition of those **restraints** is particularly important as they effectively **guide** the minimization process. For example, with a stringent set of **AIRs** or **unambiguous distance restraints**, the solutions of the minimization will **converge** much better and the **sampling** can be limited. In **ab-initio mode**, however, very diverse solutions will be obtained and the **sampling** should be **increased** to make sure to sample enough the possible **interaction space**.

In this case, **restraints** generated in the previous section (**epitope/paratope** and **multi-chain protein**) will be used to **guide** the **rigid body sampling**. 

***
**Building Blocks** used:
 - [rigid_body](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.rigid_body) from **biobb_haddock.haddock.rigid_body**
***
Corresponding **HADDOCK3 module**:
 - [rigidbody](https://www.bonvinlab.org/haddock3/modules/sampling/haddock.modules.sampling.rigidbody.html) from **sampling_modules**
***

In [32]:
from biobb_haddock.haddock.rigid_body import rigid_body
step_idx = 2

# Create properties dict and inputs/outputs
docking_output_zip_path = f'{out_path}/docking/{step_idx}/docking.zip'
wf_rigidbody            = f'{out_path}/docking/{step_idx}/wf_rigidbody.zip'

prop={
    'cfg': {
        #'sampling': 10, # Reduced sampling (10 instead of the default of 1000)
        'sampling': 50, # Reduced sampling (50 instead of the default of 1000)
        #'sampling': 100, # Reduced sampling (100 instead of the default of 1000)
        'clean': False   # Not compressing generated PDB files
    }
}

# Create and launch bb
rigid_body(input_haddock_wf_data_zip     = wf_topology,
           docking_output_zip_path       = docking_output_zip_path,
           ambig_restraints_table_path   = complex_tbl,
           unambig_restraints_table_path = body_tbl,
           output_haddock_wf_data_zip    = wf_rigidbody,
           properties                    = def_dict(prop)
)

2025-06-02 16:49:44,861 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.rigid_body Version: 5.0.1
2025-06-02 16:49:44,862 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8c08a3d9-e7d4-417e-8993-8b0957816d7d directory successfully created
2025-06-02 16:49:44,862 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/ambig-paratope-NMR-epitope.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8c08a3d9-e7d4-417e-8993-8b0957816d7d
2025-06-02 16:49:44,864 [MainThread  ] [INFO ]  Copy: ./data/antibodyantibody-unambig.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_8c08a3d9-e7d4-417e-8993-8b0957816d7d
2025-06-02 16:49:44,867 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/1/wf_topology.zip
2025-06-02 16:49:44,868 [MainThread  ] [INFO ]  to:
2025-06-02 16:49:44,868 [MainThread  ] [INFO ]  ['f3c6a0d2-93e6-4add-8c24-d2e

0

***
### 3. 1st CAPRI evaluation

**CAPRI (Critical Assessment of PRedicted Interactions)** is a community wide initiative for testing computational algorithms in **blind predictions** of experimentally determined 3D structures of **protein complexes**, the “targets”, provided to **CAPRI** prior to publication.

This step calculates **metrics** used during the **CAPRI evaluation process** (i-RMSD, l-RMSD, Fnat, DockQ) with respect to the **reference structure** (complex).

***
**Building Blocks** used:
 - [capri_eval](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.capri_eval) from **biobb_haddock.haddock.capri_eval**
***
Corresponding **HADDOCK3 module**:
 - [caprieval](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.caprieval.html) from **analysis_modules**
***

In [33]:
from biobb_haddock.haddock.capri_eval import capri_eval
step_idx = 3

# Create properties dict and inputs/outputs
output_evaluation_zip_path = f'{out_path}/docking/{step_idx}/caprieval.zip'
wf_caprieval               = f'{out_path}/docking/{step_idx}/wf_caprieval.zip'

prop = {}

# Create and launch bb
capri_eval(input_haddock_wf_data_zip  = wf_rigidbody,
           reference_pdb_path         = complex_prep,
           output_evaluation_zip_path = output_evaluation_zip_path,
           output_haddock_wf_data_zip = wf_caprieval,
           properties                 = def_dict(prop))

2025-06-02 16:51:41,413 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.capri_eval Version: 5.0.1
2025-06-02 16:51:41,413 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_77e64600-f727-4ab3-93ce-45939424ae04 directory successfully created
2025-06-02 16:51:41,414 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_77e64600-f727-4ab3-93ce-45939424ae04
2025-06-02 16:51:41,466 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/2/wf_rigidbody.zip
2025-06-02 16:51:41,466 [MainThread  ] [INFO ]  to:
2025-06-02 16:51:41,466 [MainThread  ] [INFO ]  ['0715d503-a1f8-47d8-ba13-2f021dc5396a/0_topoaa', '0715d503-a1f8-47d8-ba13-2f021dc5396a/1_rigidbody', '0715d503-a1f8-47d8-ba13-2f021dc5396a/analysis', '0715d503-a1f8-47d8-ba13-2f021dc5396a/data', '0715d503-a1f8-47d8-ba13-2f021dc5396a/traceback',

0

In [163]:
# Let's inspect the intermediate results
intermediate_results = f'{out_path}/intermediate_results/capri_1'

with zipfile.ZipFile(wf_caprieval, 'r') as zip_ref:
    zip_ref.extractall(intermediate_results)

In [150]:
superpose_ref??

[31mSignature:[39m superpose_ref(pdb_ref, pdb_to_sup, output_file, chain)
[31mDocstring:[39m <no docstring>
[31mSource:[39m   
[38;5;28;01mdef[39;00m superpose_ref(pdb_ref, pdb_to_sup, output_file, chain):

    [38;5;66;03m# Parse the structures[39;00m
    parser = PDBParser(QUIET=[38;5;28;01mTrue[39;00m)
    structure1 = parser.get_structure([33m"pdb_ref"[39m, pdb_ref)
    structure2 = parser.get_structure([33m"pdb_to_sup"[39m, pdb_to_sup)

    [38;5;66;03m# Select only a portion of reference structure (e.g., Chain A )[39;00m
    selected_residues = [res [38;5;28;01mfor[39;00m res [38;5;28;01min[39;00m structure1[[32m0[39m][chain]]

    [38;5;66;03m# Create a new structure object with the selected residues[39;00m
    selected_structure = Structure.Structure([33m"selected_structure"[39m)
    model = Model.Model([32m0[39m)
    chain = Chain.Chain([33m"A"[39m)

    [38;5;28;01mfor[39;00m res [38;5;28;01min[39;00m selected_residues:
        chain.add(re

In [166]:
# Looking at the cluster-based (clt) and single-structure (ss) CAPRI evaluation outputs
tsv_dir = intermediate_results + '/2_caprieval/'

# Load the cluster and single data into pandas DataFrames
cluster_df = pandas.read_csv(tsv_dir + 'capri_clt.tsv', sep='\t',comment='#')
single_df = pandas.read_csv(tsv_dir + 'capri_ss.tsv', sep='\t',comment='#')

# Align reference to models
model = os.path.normpath(os.path.join(tsv_dir, "../1_rigidbody/rigidbody_1.pdb"))

ref_sup_A = f"{out_path}/aligned_reference_A.pdb"
ref_sup_B = f"{out_path}/aligned_reference_B.pdb"
superpose_ref(model, complex_prep, ref_sup_A, "A")
superpose_ref(model, complex_prep, ref_sup_B, "B")

# Align generated models by chain
input_path = f'{intermediate_results}/1_rigidbody'
for chain in ["A","B"]:
    output_file = f"{out_path}/aligned_ensemble_{chain}.pdb"
    superpose_models(chain,input_path,output_file)

# Show results
aligned_ensemble_A = f"{out_path}/aligned_ensemble_A.pdb"
aligned_ensemble_B = f"{out_path}/aligned_ensemble_B.pdb"
capri_visualization(aligned_ensemble_A,aligned_ensemble_B,ref_sup_A,ref_sup_B,input_path,single_df,cluster_df)

#### Please select a model:

Dropdown(description='Sel. model:', options=('All', 'rigidbody_1', 'rigidbody_10', 'rigidbody_11', 'rigidbody_…

#### Generated models (left, aligned to chain A -Antibody-; right, aligned to chain B -Antigen-)

HBox(children=(NGLWidget(layout=Layout(margin='auto')), NGLWidget(layout=Layout(margin='auto'))))

#### CAPRI Evaluation values for Single Structure:

DockQ: incorrect (<0.23), acceptable (0.23-0.49), medium (0.49-0.80), and high (>=0.80)

Unnamed: 0,model,md5,caprieval_rank,score,irmsd,fnat,lrmsd,ilrmsd,dockq,rmsd,...,dihe,elec,improper,rdcs,rg,sym,total,vdw,vean,xpcs
0,../1_rigidbody/rigidbody_23.pdb,-,1,-12.776,4.556,0.241,9.356,8.419,0.264,4.779,...,0.0,-2.686,0.0,0.0,0.0,0.0,779.591,2.397,0.0,0.0
1,../1_rigidbody/rigidbody_18.pdb,-,2,-8.621,10.655,0.034,20.85,18.646,0.065,11.1,...,0.0,-2.835,0.0,0.0,0.0,0.0,628.731,6.921,0.0,0.0
2,../1_rigidbody/rigidbody_49.pdb,-,3,-7.74,1.187,0.69,2.149,2.362,0.748,1.111,...,0.0,-9.267,0.0,0.0,0.0,0.0,469.449,-9.797,0.0,0.0
3,../1_rigidbody/rigidbody_34.pdb,-,4,-7.294,14.821,0.069,23.483,21.868,0.065,14.225,...,0.0,-1.797,0.0,0.0,0.0,0.0,399.86,-18.973,0.0,0.0
4,../1_rigidbody/rigidbody_15.pdb,-,5,-7.281,4.339,0.241,8.76,8.026,0.278,4.541,...,0.0,-2.49,0.0,0.0,0.0,0.0,1401.83,43.933,0.0,0.0
5,../1_rigidbody/rigidbody_27.pdb,-,6,-7.061,14.847,0.069,23.47,21.871,0.065,14.237,...,0.0,-1.881,0.0,0.0,0.0,0.0,488.546,-9.95,0.0,0.0
6,../1_rigidbody/rigidbody_36.pdb,-,7,-6.255,5.007,0.19,11.639,10.73,0.207,4.486,...,0.0,-5.115,0.0,0.0,0.0,0.0,634.921,-4.691,0.0,0.0
7,../1_rigidbody/rigidbody_42.pdb,-,8,-6.005,1.14,0.828,2.288,2.311,0.798,1.146,...,0.0,-5.828,0.0,0.0,0.0,0.0,466.675,-5.776,0.0,0.0
8,../1_rigidbody/rigidbody_40.pdb,-,9,-5.269,5.001,0.19,11.584,10.721,0.207,4.47,...,0.0,-5.587,0.0,0.0,0.0,0.0,809.231,4.277,0.0,0.0
9,../1_rigidbody/rigidbody_11.pdb,-,10,-5.077,5.031,0.19,11.754,10.807,0.205,4.501,...,0.0,-4.987,0.0,0.0,0.0,0.0,823.225,12.124,0.0,0.0


#### CAPRI Evaluation values for Cluster-based output:

Unnamed: 0,cluster_rank,cluster_id,n,under_eval,score,score_std,irmsd,irmsd_std,fnat,fnat_std,...,bsa_std,desolv,desolv_std,elec,elec_std,total,total_std,vdw,vdw_std,caprieval_rank
0,-,-,50,-,0.065,2.547,10.732,3.601,0.091,0.058,...,87.083,7.468,2.585,-6.388,2.396,1110.982,215.449,25.869,11.508,1


In [127]:
# Open HADDOCK CAPRI evaluation results summary (browser) 

capri_analysis_path = f'{intermediate_results}/analysis/2_caprieval_analysis/report.html'

open_results(capri_analysis_path)

***
### 4. Select Top structures

Select a **number of structures** from the input models. By default, the selection is based on the **HADDOCK score** of the models. The number of models to be selected is defined by the ***select*** parameter. In the standard **HADDOCK protocol**, this number is **200**, which can be increased if more models should be refined.

***
**Building Blocks** used:
 - [sele_top](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.sele_top) from **biobb_haddock.haddock.sele_top**
***
Corresponding **HADDOCK3 module**:
 - [seletop](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.seletop.html) from **analysis_modules**
***

In [37]:
from biobb_haddock.haddock.sele_top import sele_top
step_idx = 4

# Create properties dict and inputs/outputs
output_selection_zip_path = f'{out_path}/docking/{step_idx}/selected.zip'
wf_seletop                = f'{out_path}/docking/{step_idx}/wf_seletop.zip'

prop={
    'cfg': {
        #'select': 8,  # Selection of the top 8 best scoring complexes (instead of the default of 200)
        'select': 20,  # Selection of the top 20 best scoring complexes (instead of the default of 200)
    }
}

# Create and launch bb
sele_top(input_haddock_wf_data_zip  = wf_caprieval,
         output_selection_zip_path  = output_selection_zip_path,
         output_haddock_wf_data_zip = wf_seletop,
         properties                 = def_dict(prop))

2025-06-02 16:51:48,003 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.sele_top Version: 5.0.1
2025-06-02 16:51:48,004 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_abae9258-148c-4fc2-9489-f1c14ab7b7f8 directory successfully created
2025-06-02 16:51:48,085 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/3/wf_caprieval.zip
2025-06-02 16:51:48,085 [MainThread  ] [INFO ]  to:
2025-06-02 16:51:48,086 [MainThread  ] [INFO ]  ['37d25737-6312-4d46-ae67-8dbdfcdf54a7/0_topoaa', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/1_rigidbody', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/2_caprieval', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/analysis', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/data', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/traceback', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/log', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/traceback/consensus.tsv', '37d25737-6312-4d46-ae67-8dbdfcdf54a7/tra

0

***
### 5. Flexible Refinement

**Flexible refinement** with **CNS**, a **semi-flexible simulated annealing** (SA) protocol based on molecular dynamics in **torsion angle** space.

This refinement consists of several stages:

- High temperature rigid body molecular dynamics
- Rigid body SA
- Semi-flexible SA with flexible side-chains at the interface
- Semi-flexible SA with fully flexible interface (both backbone and side-chains)

By default, only the **interface regions** are treated as **flexible**, automatically defined based on the **intermolecular contacts**. It is also possible to manually define the **semi-flexible regions**, and also define **fully flexible regions** that are allowed to move throughout the entire protocol from the high temperature rigid body molecular dynamics on. The **temperature** and **number of steps** for the various stages can be tuned.

***
**Building Blocks** used:
 - [flex_ref](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.flex_ref) from **biobb_haddock.haddock.flex_ref**
***
Corresponding **HADDOCK3 module**:
 - [flexref](https://www.bonvinlab.org/haddock3/modules/refinement/haddock.modules.refinement.flexref.html) from **refinement_modules**
***

In [38]:
from biobb_haddock.haddock.flex_ref import flex_ref
step_idx = 5

# Create properties dict and inputs/outputs
refinement_output_zip_path = f'{out_path}/docking/{step_idx}/flexref.zip'
wf_flexref                 = f'{out_path}/docking/{step_idx}/wf_flexref.zip'

prop={
    'cfg': {
        'tolerance' : 5,   # Failure tolerance percentage
        'clean': False,    # Not compressing generated PDB files
    },
}

# Create and launch bb
flex_ref(input_haddock_wf_data_zip     = wf_seletop,
         refinement_output_zip_path    = refinement_output_zip_path,
         ambig_restraints_table_path   = complex_tbl,
         unambig_restraints_table_path = body_tbl,
         output_haddock_wf_data_zip    = wf_flexref,
         properties                    = def_dict(prop))

2025-06-02 16:51:51,507 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.flex_ref Version: 5.0.1
2025-06-02 16:51:51,507 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_3a34c2c0-cbe4-4e9b-9d55-893cedc73887 directory successfully created
2025-06-02 16:51:51,508 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/ambig-paratope-NMR-epitope.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_3a34c2c0-cbe4-4e9b-9d55-893cedc73887
2025-06-02 16:51:51,508 [MainThread  ] [INFO ]  Copy: ./data/antibodyantibody-unambig.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_3a34c2c0-cbe4-4e9b-9d55-893cedc73887
2025-06-02 16:51:51,543 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/4/wf_seletop.zip
2025-06-02 16:51:51,543 [MainThread  ] [INFO ]  to:
2025-06-02 16:51:51,544 [MainThread  ] [INFO ]  ['09089407-fcc5-46f4-9ee1-b449c4

0

***
### 6. 2nd CAPRI evaluation

**CAPRI (Critical Assessment of PRedicted Interactions)** is a community wide initiative for testing computational algorithms in **blind predictions** of experimentally determined 3D structures of **protein complexes**, the “targets”, provided to **CAPRI** prior to publication.

This step calculates **metrics** used during the **CAPRI evaluation process** (i-RMSD, l-RMSD, Fnat, DockQ) with respect to the **reference structure** (complex).

***
**Building Blocks** used:
 - [capri_eval](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.capri_eval) from **biobb_haddock.haddock.capri_eval**
***
Corresponding **HADDOCK3 module**:
 - [caprieval](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.caprieval.html) from **analysis_modules**
***

In [39]:
from biobb_haddock.haddock.capri_eval import capri_eval
step_idx = 6

# Create properties dict and inputs/outputs
output_evaluation_zip_path2 = f'{out_path}/docking/{step_idx}/caprieval2.zip'
wf_caprieval2               = f'{out_path}/docking/{step_idx}/wf_caprieval2.zip'

prop = {}

# Create and launch bb
capri_eval(input_haddock_wf_data_zip  = wf_flexref,
           reference_pdb_path         = complex_prep,
           output_evaluation_zip_path = output_evaluation_zip_path2,
           output_haddock_wf_data_zip = wf_caprieval2,
           properties                 = def_dict(prop))

2025-06-02 16:57:18,469 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.capri_eval Version: 5.0.1
2025-06-02 16:57:18,470 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4114717c-b4c0-46b5-848b-ec710845a93e directory successfully created
2025-06-02 16:57:18,471 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_4114717c-b4c0-46b5-848b-ec710845a93e
2025-06-02 16:57:18,529 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/5/wf_flexref.zip
2025-06-02 16:57:18,529 [MainThread  ] [INFO ]  to:
2025-06-02 16:57:18,529 [MainThread  ] [INFO ]  ['7c9ea72d-df9c-4cc6-98af-d20ce5642756/0_topoaa', '7c9ea72d-df9c-4cc6-98af-d20ce5642756/1_rigidbody', '7c9ea72d-df9c-4cc6-98af-d20ce5642756/2_caprieval', '7c9ea72d-df9c-4cc6-98af-d20ce5642756/3_seletop', '7c9ea72d-df9c-4cc6-98af-d20ce5642756/4_fle

0

In [132]:
# Let's inspect the intermediate results
intermediate_results = f'{out_path}/intermediate_results/capri_2'

with zipfile.ZipFile(wf_caprieval2, 'r') as zip_ref:
    zip_ref.extractall(intermediate_results)

In [133]:
# Looking at the cluster-based (clt) and single-structure (ss) CAPRI evaluation outputs
tsv_dir = intermediate_results + '/5_caprieval/'

# Load the cluster and single data into pandas DataFrames
cluster_df = pandas.read_csv(tsv_dir + 'capri_clt.tsv', sep='\t',comment='#')
single_df = pandas.read_csv(tsv_dir + 'capri_ss.tsv', sep='\t',comment='#')

# Align reference to models
model = os.path.normpath(os.path.join(tsv_dir, "../4_flexref/flexref_1.pdb"))
ref_sup_A = f"{out_path}/aligned_reference_A.pdb"
ref_sup_B = f"{out_path}/aligned_reference_B.pdb"
superpose_ref(model, complex_prep, ref_sup_A, "A")
superpose_ref(model, complex_prep, ref_sup_B, "B")

# Align generated models by chain
input_path = f'{intermediate_results}/4_flexref'
for chain in ["A","B"]:
    output_file = f"{out_path}/aligned_ensemble_{chain}.pdb"
    superpose_models(chain,input_path,output_file)

# Show results
aligned_ensemble_A = f"{out_path}/aligned_ensemble_A.pdb"
aligned_ensemble_B = f"{out_path}/aligned_ensemble_B.pdb"
capri_visualization(aligned_ensemble_A,aligned_ensemble_B,ref_sup_A,ref_sup_B,input_path,single_df,cluster_df)

#### Please select a model:

Dropdown(description='Sel. model:', options=('All', 'flexref_1', 'flexref_10', 'flexref_11', 'flexref_12', 'fl…

#### Generated models (left, aligned to chain A -Antibody-; right, aligned to chain B -Antigen-)

HBox(children=(NGLWidget(layout=Layout(margin='auto')), NGLWidget(layout=Layout(margin='auto'))))

#### CAPRI Evaluation values for Single Structure:

DockQ: incorrect (<0.23), acceptable (0.23-0.49), medium (0.49-0.80), and high (>=0.80)

Unnamed: 0,model,md5,caprieval_rank,score,irmsd,fnat,lrmsd,ilrmsd,dockq,rmsd,...,dihe,elec,improper,rdcs,rg,sym,total,vdw,vean,xpcs
0,../4_flexref/flexref_3.pdb,-,1,-338.723,1.244,0.776,2.032,2.199,0.771,1.192,...,0.0,-320.933,0.0,0.0,0.0,0.0,-230.381,-20.571,0.0,0.0
1,../4_flexref/flexref_8.pdb,-,2,-333.405,1.178,0.966,2.707,2.115,0.831,1.328,...,0.0,-293.085,0.0,0.0,0.0,0.0,-246.16,-36.222,0.0,0.0
2,../4_flexref/flexref_12.pdb,-,3,-320.623,1.372,0.638,3.643,3.107,0.676,1.257,...,0.0,-301.345,0.0,0.0,0.0,0.0,-113.747,-28.671,0.0,0.0
3,../4_flexref/flexref_11.pdb,-,4,-315.248,0.937,0.845,1.562,1.492,0.844,0.913,...,0.0,-294.431,0.0,0.0,0.0,0.0,-202.525,-24.533,0.0,0.0
4,../4_flexref/flexref_18.pdb,-,5,-304.427,0.965,0.828,3.593,2.197,0.794,1.234,...,0.0,-271.191,0.0,0.0,0.0,0.0,-123.504,-40.062,0.0,0.0
5,../4_flexref/flexref_16.pdb,-,6,-291.084,1.062,0.845,2.915,2.296,0.802,1.136,...,0.0,-245.094,0.0,0.0,0.0,0.0,-150.865,-39.404,0.0,0.0
6,../4_flexref/flexref_10.pdb,-,7,-252.705,4.905,0.138,11.989,10.813,0.186,4.27,...,0.0,-200.766,0.0,0.0,0.0,0.0,-112.22,-55.418,0.0,0.0
7,../4_flexref/flexref_9.pdb,-,8,-249.857,5.025,0.138,11.44,10.865,0.192,4.32,...,0.0,-197.136,0.0,0.0,0.0,0.0,-107.5,-55.768,0.0,0.0
8,../4_flexref/flexref_14.pdb,-,9,-245.142,10.317,0.052,21.244,18.573,0.07,10.729,...,0.0,-212.712,0.0,0.0,0.0,0.0,-65.611,-35.891,0.0,0.0
9,../4_flexref/flexref_2.pdb,-,10,-227.763,10.747,0.034,20.632,18.801,0.066,11.108,...,0.0,-180.73,0.0,0.0,0.0,0.0,-141.625,-42.792,0.0,0.0


#### CAPRI Evaluation values for Cluster-based output:

Unnamed: 0,cluster_rank,cluster_id,n,under_eval,score,score_std,irmsd,irmsd_std,fnat,fnat_std,...,bsa_std,desolv,desolv_std,elec,elec_std,total,total_std,vdw,vdw_std,caprieval_rank
0,-,-,20,-,-234.159,64.291,7.893,5.309,0.28,0.297,...,230.399,1.181,5.704,-191.454,77.351,-108.222,90.972,-39.351,15.44,1


In [42]:
# Open HADDOCK CAPRI evaluation results summary (browser) 

capri_analysis_path = f'{intermediate_results}/analysis/5_caprieval_analysis/report.html'

open_results(capri_analysis_path)

['jpserver-66961.json', 'jpserver-66961-open.html', 'kernel-b0d26e4a-2347-4af7-ac9a-9965498c46e1.json', 'kernel-eecb767a-417f-434c-90f3-4aef819d3dbf.json', 'jpserver-55116.json', 'jpserver-55116-open.html', 'jpserver-42472-open.html', 'jpserver-50791.json', 'kernel-bbf2407d-609d-4987-ae64-79fcbea5b3e9.json', 'jupyter_cookie_secret', 'jpserver-50791-open.html', 'jpserver-42472.json', 'kernel-4dfa4166-d6f6-4ab6-a50f-9abffe9f759d.json']


***
### 7. Energy minimization refinement

**Energy minimization** refinement with **CNS**, refines the input complexes by **energy minimization** using **conjugate gradient** method.
**Coordinates** of the **energy minimized structures** are saved, and each complex is then evaluated using the **HADDOCK scoring function**.

***
**Building Blocks** used:
 - [em_ref](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.em_ref) from **biobb_haddock.haddock.em_ref**
***
Corresponding **HADDOCK3 module**:
- [emref](https://www.bonvinlab.org/haddock3/modules/refinement/haddock.modules.refinement.emref.html) from **refinement.modules**

In [43]:
from biobb_haddock.haddock.em_ref import em_ref
step_idx = 7

# Create properties dict and inputs/outputs
refinement_output_zip_path = f'{out_path}/docking/{step_idx}/emref.zip'
wf_emref                   = f'{out_path}/docking/{step_idx}/wf_emref.zip'

prop={
    'cfg': {
        'tolerance' : 5,   # Failure tolerance percentage
        'clean': False,  # Not compressing generated PDB files
    },
}

# Create and launch bb
em_ref(input_haddock_wf_data_zip  = wf_caprieval2,
       refinement_output_zip_path = refinement_output_zip_path,
       ambig_restraints_table_path   = complex_tbl,
       unambig_restraints_table_path = body_tbl,
       output_haddock_wf_data_zip = wf_emref,
       properties                 = def_dict(prop))

2025-06-02 16:57:23,636 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.em_ref Version: 5.0.1
2025-06-02 16:57:23,637 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_95c8fe27-829b-48ab-bf4c-574b561cf292 directory successfully created
2025-06-02 16:57:23,638 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/ambig-paratope-NMR-epitope.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_95c8fe27-829b-48ab-bf4c-574b561cf292
2025-06-02 16:57:23,638 [MainThread  ] [INFO ]  Copy: ./data/antibodyantibody-unambig.tbl to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_95c8fe27-829b-48ab-bf4c-574b561cf292
2025-06-02 16:57:23,731 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/6/wf_caprieval2.zip
2025-06-02 16:57:23,731 [MainThread  ] [INFO ]  to:
2025-06-02 16:57:23,732 [MainThread  ] [INFO ]  ['530e3be3-9efb-4c68-b957-e16d3

0

***
### 8. 3rd CAPRI evaluation

**CAPRI (Critical Assessment of PRedicted Interactions)** is a community wide initiative for testing computational algorithms in **blind predictions** of experimentally determined 3D structures of **protein complexes**, the “targets”, provided to **CAPRI** prior to publication.

This step calculates **metrics** used during the **CAPRI evaluation process** (i-RMSD, l-RMSD, Fnat, DockQ) with respect to the **reference structure** (complex).

***
**Building Blocks** used:
 - [capri_eval](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.capri_eval) from **biobb_haddock.haddock.capri_eval**
***
Corresponding **HADDOCK3 module**:
 - [caprieval](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.caprieval.html) from **analysis_modules**
***

In [44]:
from biobb_haddock.haddock.capri_eval import capri_eval
step_idx = 8

# Create properties dict and inputs/outputs
output_evaluation_zip_path3 = f'{out_path}/docking/{step_idx}/caprieval3.zip'
wf_caprieval3               = f'{out_path}/docking/{step_idx}/wf_caprieval3.zip'

prop = {}

# Create and launch bb
capri_eval(input_haddock_wf_data_zip  = wf_emref,
           reference_pdb_path         = complex_prep,
           output_evaluation_zip_path = output_evaluation_zip_path3,
           output_haddock_wf_data_zip = wf_caprieval3,
           properties                 = def_dict(prop))

2025-06-02 16:57:53,654 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.capri_eval Version: 5.0.1
2025-06-02 16:57:53,655 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_0d8610fe-b8e4-4294-9132-26e9ad043d9f directory successfully created
2025-06-02 16:57:53,656 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_0d8610fe-b8e4-4294-9132-26e9ad043d9f
2025-06-02 16:57:53,742 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/7/wf_emref.zip
2025-06-02 16:57:53,743 [MainThread  ] [INFO ]  to:
2025-06-02 16:57:53,743 [MainThread  ] [INFO ]  ['49af9454-d819-4809-b758-a50012b035a4/0_topoaa', '49af9454-d819-4809-b758-a50012b035a4/1_rigidbody', '49af9454-d819-4809-b758-a50012b035a4/2_caprieval', '49af9454-d819-4809-b758-a50012b035a4/3_seletop', '49af9454-d819-4809-b758-a50012b035a4/4_flexr

0

In [134]:
# Let's inspect the intermediate results
intermediate_results = f'{out_path}/intermediate_results/capri_3'

with zipfile.ZipFile(wf_caprieval3, 'r') as zip_ref:
    zip_ref.extractall(intermediate_results)

In [135]:
# Looking at the cluster-based (clt) and single-structure (ss) CAPRI evaluation outputs
tsv_dir = intermediate_results + '/7_caprieval/'

# Load the cluster and single data into pandas DataFrames
cluster_df = pandas.read_csv(tsv_dir + 'capri_clt.tsv', sep='\t',comment='#')
single_df = pandas.read_csv(tsv_dir + 'capri_ss.tsv', sep='\t',comment='#')

# Align reference to models
model = os.path.normpath(os.path.join(tsv_dir, "../6_emref/emref_1.pdb"))
ref_sup_A = f"{out_path}/aligned_reference_A.pdb"
ref_sup_B = f"{out_path}/aligned_reference_B.pdb"
superpose_ref(model, complex_prep, ref_sup_A, "A")
superpose_ref(model, complex_prep, ref_sup_B, "B")

# Align generated models by chain
input_path = f'{intermediate_results}/6_emref'
for chain in ["A","B"]:
    output_file = f"{out_path}/aligned_ensemble_{chain}.pdb"
    superpose_models(chain,input_path,output_file)

# Show results
aligned_ensemble_A = f"{out_path}/aligned_ensemble_A.pdb"
aligned_ensemble_B = f"{out_path}/aligned_ensemble_B.pdb"
capri_visualization(aligned_ensemble_A,aligned_ensemble_B,ref_sup_A,ref_sup_B,input_path,single_df,cluster_df)

#### Please select a model:

Dropdown(description='Sel. model:', options=('All', 'emref_1', 'emref_10', 'emref_11', 'emref_12', 'emref_13',…

#### Generated models (left, aligned to chain A -Antibody-; right, aligned to chain B -Antigen-)

HBox(children=(NGLWidget(layout=Layout(margin='auto')), NGLWidget(layout=Layout(margin='auto'))))

#### CAPRI Evaluation values for Single Structure:

DockQ: incorrect (<0.23), acceptable (0.23-0.49), medium (0.49-0.80), and high (>=0.80)

Unnamed: 0,model,md5,caprieval_rank,score,irmsd,fnat,lrmsd,ilrmsd,dockq,rmsd,...,dihe,elec,improper,rdcs,rg,sym,total,vdw,vean,xpcs
0,../6_emref/emref_16.pdb,-,1,-133.27,1.042,0.879,2.822,2.12,0.818,1.163,...,0.0,-531.885,0.0,0.0,0.0,0.0,-444.334,-43.74,0.0,0.0
1,../6_emref/emref_8.pdb,-,2,-131.151,1.177,0.914,2.643,2.047,0.815,1.358,...,0.0,-509.178,0.0,0.0,0.0,0.0,-479.105,-41.409,0.0,0.0
2,../6_emref/emref_12.pdb,-,3,-127.699,1.47,0.672,3.77,3.225,0.673,1.329,...,0.0,-557.161,0.0,0.0,0.0,0.0,-419.589,-45.249,0.0,0.0
3,../6_emref/emref_18.pdb,-,4,-124.272,0.973,0.845,3.586,2.065,0.799,1.282,...,0.0,-547.016,0.0,0.0,0.0,0.0,-415.4,-40.045,0.0,0.0
4,../6_emref/emref_3.pdb,-,5,-120.953,1.281,0.828,2.025,2.181,0.784,1.227,...,0.0,-529.573,0.0,0.0,0.0,0.0,-467.182,-36.018,0.0,0.0
5,../6_emref/emref_11.pdb,-,6,-117.326,0.959,0.828,1.523,1.473,0.835,0.948,...,0.0,-453.659,0.0,0.0,0.0,0.0,-389.786,-42.259,0.0,0.0
6,../6_emref/emref_9.pdb,-,7,-116.105,5.016,0.138,11.381,10.798,0.193,4.333,...,0.0,-348.386,0.0,0.0,0.0,0.0,-273.94,-66.766,0.0,0.0
7,../6_emref/emref_4.pdb,-,8,-113.203,14.895,0.069,23.774,22.009,0.064,14.3,...,0.0,-257.993,0.0,0.0,0.0,0.0,-228.586,-72.258,0.0,0.0
8,../6_emref/emref_10.pdb,-,9,-111.867,4.942,0.155,11.981,10.891,0.191,4.292,...,0.0,-353.53,0.0,0.0,0.0,0.0,-272.765,-60.996,0.0,0.0
9,../6_emref/emref_6.pdb,-,10,-106.399,15.003,0.069,23.835,21.834,0.064,14.505,...,0.0,-321.515,0.0,0.0,0.0,0.0,-271.972,-62.242,0.0,0.0


#### CAPRI Evaluation values for Cluster-based output:

Unnamed: 0,cluster_rank,cluster_id,n,under_eval,score,score_std,irmsd,irmsd_std,fnat,fnat_std,...,bsa_std,desolv,desolv_std,elec,elec_std,total,total_std,vdw,vdw_std,caprieval_rank
0,-,-,20,-,-103.134,16.208,7.882,5.273,0.297,0.315,...,231.534,2.539,5.002,-352.073,104.809,-288.713,110.66,-46.216,15.585,1


In [47]:
# Open HADDOCK CAPRI evaluation results summary (browser) 

capri_analysis_path = f'{intermediate_results}/analysis/7_caprieval_analysis/report.html'

open_results(capri_analysis_path)

['jpserver-66961.json', 'jpserver-66961-open.html', 'kernel-b0d26e4a-2347-4af7-ac9a-9965498c46e1.json', 'kernel-eecb767a-417f-434c-90f3-4aef819d3dbf.json', 'jpserver-55116.json', 'jpserver-55116-open.html', 'jpserver-42472-open.html', 'jpserver-50791.json', 'kernel-bbf2407d-609d-4987-ae64-79fcbea5b3e9.json', 'jupyter_cookie_secret', 'jpserver-50791-open.html', 'jpserver-42472.json', 'kernel-4dfa4166-d6f6-4ab6-a50f-9abffe9f759d.json']


***
### 9. Clustering 

**Cluster** models with the *Fraction of Common Contacts* (FCC) method. The module takes the models generated in the previous step and calculates the **contacts** between them. Then, calculates the **FCC matrix** and **clusters** the models based on the **calculated contacts**.

References: <br>

**Clustering biomolecular complexes by residue contacts similarity.**<br>
*J.P.G.L.M. Rodrigues, M. Trellet, C. Schmitz, P. Kastritis, E. Karaca, A.S.J. Melquiond, A.M.J.J. Bonvin.*<br>
*Proteins, 2012, 80(7),1810-7*<br>
*Available at: https://doi.org/10.1002/prot.24078*
***
**Building Blocks** used:
 - [clust_fcc](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.clust_fcc) from **biobb_haddock.haddock.clust_fcc**
***
Corresponding **HADDOCK3 module**:
 - [clustfcc](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.clustfcc.html) from **analysis_modules**
***


In [48]:
from biobb_haddock.haddock.clust_fcc import clust_fcc
step_idx = 9

# Create properties dict and inputs/outputs
output_cluster_zip_path = f'{out_path}/docking/{step_idx}/clustfcc.zip'
wf_clustfcc             = f'{out_path}/docking/{step_idx}/wf_clustfcc.zip'

prop={
    'cfg': {
        'plot_matrix' : True,   # Plot matrix of members
    }
}

# Create and launch bb
clust_fcc(input_haddock_wf_data_zip = wf_caprieval3,
         output_cluster_zip_path    = output_cluster_zip_path,
         output_haddock_wf_data_zip = wf_clustfcc,
         properties                 = def_dict(prop))

2025-06-02 16:57:58,949 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.clust_fcc Version: 5.0.1
2025-06-02 16:57:58,950 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_306c84f4-e0e8-432a-89c9-648db731a183 directory successfully created
2025-06-02 16:57:59,102 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/8/wf_caprieval3.zip
2025-06-02 16:57:59,103 [MainThread  ] [INFO ]  to:
2025-06-02 16:57:59,103 [MainThread  ] [INFO ]  ['b1f4ee14-c376-467c-bb45-b279f6241ff1/0_topoaa', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/1_rigidbody', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/2_caprieval', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/3_seletop', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/4_flexref', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/5_caprieval', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/6_emref', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/7_caprieval', 'b1f4ee14-c376-467c-bb45-b279f6241ff1/a

0

***
### 10. Selecting top clusters 

Select **models** from the **top clusters**. The selection is based on the **score** of the models within the clusters.

In the standard **HADDOCK** analysis, the **top 4 models** of the **top 10 clusters** are shown.

***
**Building Blocks** used:
 - [sele_top_clusts](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.sele_top_clusts) from **biobb_haddock.haddock.sele_top_clusts**
***
Corresponding **HADDOCK3 module**:
 - [seletopclusts](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.seletopclusts.html) from **analysis_modules**
***

In [167]:
from biobb_haddock.haddock.sele_top_clusts import sele_top_clusts
step_idx = 10

# Create properties dict and inputs/outputs
output_seletopclusts_zip_path = f'{out_path}/docking/{step_idx}/seletopclusts.zip'
wf_seletopclusts              = f'{out_path}/docking/{step_idx}/wf_seletopclusts.zip'

prop={
    'cfg': {
        'top_models': 4,   # Selection of the top 4 best scoring complexes from each cluster
        'clean': False     # Not compressing generated PDB files
    },
}

# Create and launch bb
sele_top_clusts(input_haddock_wf_data_zip  = wf_clustfcc,
                output_selection_zip_path  = output_seletopclusts_zip_path,
                output_haddock_wf_data_zip = wf_seletopclusts,
                properties                 = def_dict(prop))

2025-06-03 10:16:24,621 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.sele_top_clusts Version: 5.0.1
2025-06-03 10:16:24,621 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_ad37d253-7738-457c-bc48-4bf013331dfc directory successfully created
2025-06-03 10:16:24,743 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/9/wf_clustfcc.zip
2025-06-03 10:16:24,743 [MainThread  ] [INFO ]  to:
2025-06-03 10:16:24,743 [MainThread  ] [INFO ]  ['7ea80032-2d0e-4d3f-82e4-de233bda7f28/0_topoaa', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/1_rigidbody', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/2_caprieval', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/3_seletop', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/4_flexref', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/5_caprieval', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/6_emref', '7ea80032-2d0e-4d3f-82e4-de233bda7f28/7_caprieval', '7ea80032-2d0e-4d3f-82e4-de233bda7f

0

***
### 11. Final CAPRI evaluation

**CAPRI (Critical Assessment of PRedicted Interactions)** is a community wide initiative for testing computational algorithms in **blind predictions** of experimentally determined 3D structures of **protein complexes**, the “targets”, provided to **CAPRI** prior to publication.

This step calculates **metrics** used during the **CAPRI evaluation process** (i-RMSD, l-RMSD, Fnat, DockQ) with respect to the **reference structure** (complex).

***
**Building Blocks** used:
 - [capri_eval](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.capri_eval) from **biobb_haddock.haddock.capri_eval**
***
Corresponding **HADDOCK3 module**:
 - [caprieval](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.caprieval.html) from **analysis_modules**
***

In [50]:
from biobb_haddock.haddock.capri_eval import capri_eval
step_idx = 11

# Create properties dict and inputs/outputs
output_evaluation_zip_path4 = f'{out_path}/docking/{step_idx}/caprieval4.zip'
wf_caprieval4               = f'{out_path}/docking/{step_idx}/wf_caprieval4.zip'

prop = {}

# Create and launch bb
capri_eval(input_haddock_wf_data_zip  = wf_seletopclusts,
           reference_pdb_path         = complex_prep,
           output_evaluation_zip_path = output_evaluation_zip_path4,
           output_haddock_wf_data_zip = wf_caprieval4,
           properties                 = def_dict(prop))

2025-06-02 16:58:07,820 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.capri_eval Version: 5.0.1
2025-06-02 16:58:07,821 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_fa1601e6-ebac-4187-8c7c-f04045b05485 directory successfully created
2025-06-02 16:58:07,822 [MainThread  ] [INFO ]  Copy: ./data/antibody/pre/4G6M_clean.pdb to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_fa1601e6-ebac-4187-8c7c-f04045b05485
2025-06-02 16:58:07,955 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/10/wf_seletopclusts.zip
2025-06-02 16:58:07,956 [MainThread  ] [INFO ]  to:
2025-06-02 16:58:07,956 [MainThread  ] [INFO ]  ['0a428204-e3ee-4e5d-9e20-fe4f33b58b3c/0_topoaa', '0a428204-e3ee-4e5d-9e20-fe4f33b58b3c/1_rigidbody', '0a428204-e3ee-4e5d-9e20-fe4f33b58b3c/2_caprieval', '0a428204-e3ee-4e5d-9e20-fe4f33b58b3c/3_seletop', '0a428204-e3ee-4e5d-9e20-fe4f33b58b3

0

In [168]:
# Let's inspect the intermediate results
intermediate_results = f'{out_path}/intermediate_results/capri_4'

with zipfile.ZipFile(wf_caprieval4, 'r') as zip_ref:
    zip_ref.extractall(intermediate_results)

In [206]:
# Looking at the cluster-based (clt) and single-structure (ss) CAPRI evaluation outputs
tsv_dir = intermediate_results + '/10_caprieval/'

# Load the cluster and single data into pandas DataFrames
cluster_df = pandas.read_csv(tsv_dir + 'capri_clt.tsv', sep='\t',comment='#')
single_df = pandas.read_csv(tsv_dir + 'capri_ss.tsv', sep='\t',comment='#')

# Align reference to models
#model = os.path.normpath(os.path.join(tsv_dir, single_df['model'][0]))
model = os.path.normpath(os.path.join(tsv_dir, "../09_seletopclusts/cluster_1_model_1.pdb"))
ref_sup_A = f"{out_path}/aligned_reference_A.pdb"
ref_sup_B = f"{out_path}/aligned_reference_B.pdb"
superpose_ref(model, complex_prep, ref_sup_A, "A")
superpose_ref(model, complex_prep, ref_sup_B, "B")

# Align generated models by chain
input_path = f'{intermediate_results}/09_seletopclusts'
for chain in ["A","B"]:
    output_file = f"{out_path}/aligned_ensemble_{chain}.pdb"
    superpose_models(chain,input_path,output_file)

# Show results
aligned_ensemble_A = f"{out_path}/aligned_ensemble_A.pdb"
aligned_ensemble_B = f"{out_path}/aligned_ensemble_B.pdb"
capri_visualization(aligned_ensemble_A,aligned_ensemble_B,ref_sup_A,ref_sup_B,input_path,single_df,cluster_df)

#### Please select a model:

Dropdown(description='Sel. model:', options=('All', 'cluster_1_model_1', 'cluster_1_model_2', 'cluster_1_model…

#### Generated models (left, aligned to chain A -Antibody-; right, aligned to chain B -Antigen-, cyan, reference)

HBox(children=(NGLWidget(layout=Layout(margin='auto')), NGLWidget(layout=Layout(margin='auto'))))

#### CAPRI Evaluation values for Single Structure:

DockQ: incorrect (<0.23), acceptable (0.23-0.49), medium (0.49-0.80), and high (>=0.80)

Unnamed: 0,model,md5,caprieval_rank,score,irmsd,fnat,lrmsd,ilrmsd,dockq,rmsd,...,dihe,elec,improper,rdcs,rg,sym,total,vdw,vean,xpcs
0,../09_seletopclusts/cluster_1_model_1.pdb,-,1,-133.27,1.042,0.879,2.822,2.12,0.818,1.163,...,0.0,-531.885,0.0,0.0,0.0,0.0,-444.334,-43.74,0.0,0.0
1,../09_seletopclusts/cluster_1_model_2.pdb,-,2,-131.151,1.177,0.914,2.643,2.047,0.815,1.358,...,0.0,-509.178,0.0,0.0,0.0,0.0,-479.105,-41.409,0.0,0.0
2,../09_seletopclusts/cluster_1_model_3.pdb,-,3,-127.699,1.47,0.672,3.77,3.225,0.673,1.329,...,0.0,-557.161,0.0,0.0,0.0,0.0,-419.589,-45.249,0.0,0.0
3,../09_seletopclusts/cluster_1_model_4.pdb,-,4,-124.272,0.973,0.845,3.586,2.065,0.799,1.282,...,0.0,-547.016,0.0,0.0,0.0,0.0,-415.4,-40.045,0.0,0.0
4,../09_seletopclusts/cluster_2_model_1.pdb,-,5,-116.105,5.016,0.138,11.381,10.798,0.193,4.333,...,0.0,-348.386,0.0,0.0,0.0,0.0,-273.94,-66.766,0.0,0.0
5,../09_seletopclusts/cluster_2_model_2.pdb,-,6,-111.867,4.942,0.155,11.981,10.891,0.191,4.292,...,0.0,-353.53,0.0,0.0,0.0,0.0,-272.765,-60.996,0.0,0.0
6,../09_seletopclusts/cluster_2_model_3.pdb,-,7,-105.938,5.057,0.138,11.207,10.617,0.195,4.549,...,0.0,-329.029,0.0,0.0,0.0,0.0,-248.868,-59.685,0.0,0.0
7,../09_seletopclusts/cluster_3_model_1.pdb,-,8,-103.484,10.259,0.052,21.139,18.44,0.071,10.706,...,0.0,-377.958,0.0,0.0,0.0,0.0,-286.462,-44.614,0.0,0.0
8,../09_seletopclusts/cluster_3_model_2.pdb,-,9,-100.216,10.707,0.052,20.576,18.724,0.072,11.092,...,0.0,-317.538,0.0,0.0,0.0,0.0,-286.61,-43.944,0.0,0.0
9,../09_seletopclusts/cluster_2_model_4.pdb,-,10,-99.143,5.2,0.138,11.673,10.88,0.187,4.722,...,0.0,-222.607,0.0,0.0,0.0,0.0,-176.272,-72.549,0.0,0.0


#### CAPRI Evaluation values for Cluster-based output:

Unnamed: 0,cluster_rank,cluster_id,n,under_eval,score,score_std,irmsd,irmsd_std,fnat,fnat_std,...,bsa_std,desolv,desolv_std,elec,elec_std,total,total_std,vdw,vdw_std,caprieval_rank
0,1,1,4,-,-129.098,3.423,1.165,0.19,0.828,0.093,...,38.877,6.844,2.72,-536.31,18.063,-439.607,25.343,-42.611,2.016,1
1,2,3,4,-,-108.263,6.385,5.054,0.094,0.142,0.007,...,38.758,5.871,0.267,-313.388,53.203,-242.961,39.782,-64.999,5.108,2
2,3,2,4,-,-92.713,11.495,10.601,0.253,0.06,0.009,...,46.6,2.706,1.76,-336.32,28.939,-275.385,21.73,-38.054,7.444,3


In [53]:
# Open HADDOCK CAPRI evaluation results summary (browser) 

capri_analysis_path = f'{intermediate_results}/analysis/10_caprieval_analysis/report.html'

open_results(capri_analysis_path)

['jpserver-66961.json', 'jpserver-66961-open.html', 'kernel-b0d26e4a-2347-4af7-ac9a-9965498c46e1.json', 'kernel-eecb767a-417f-434c-90f3-4aef819d3dbf.json', 'jpserver-55116.json', 'jpserver-55116-open.html', 'jpserver-42472-open.html', 'jpserver-50791.json', 'kernel-bbf2407d-609d-4987-ae64-79fcbea5b3e9.json', 'jupyter_cookie_secret', 'jpserver-50791-open.html', 'jpserver-42472.json', 'kernel-4dfa4166-d6f6-4ab6-a50f-9abffe9f759d.json']


***
### 12. Contacts analysis

Compute **contacts** between chains in complexes, generating **heatmaps** and **chordcharts** of the **contacts** observed in the input complexes. If complexes are **clustered**, the analysis of contacts will be performed based on **all structures** from each **cluster**.

**Heatmaps** are describing the **probability of contacts** (<5A) between two residues (both intramolecular and intermolecular).

**Chordcharts** are describing only **intermolecular contacts in circles**, connecting with chords the two residues that are contacting.

***
**Building Blocks** used:
 - [contact_map](https://biobb-haddock.readthedocs.io/en/latest/haddock.html#module-haddock.contact_map) from **biobb_haddock.haddock.contact_map**
***
Corresponding **HADDOCK3 module**:
 - [contactmap](https://www.bonvinlab.org/haddock3/modules/analysis/haddock.modules.analysis.contactmap.html) from **analysis_modules**
***

In [54]:
from biobb_haddock.haddock.contact_map import contact_map
step_idx = 12

# Create properties dict and inputs/outputs
output_contactmap_zip_path = f'{out_path}/docking/{step_idx}/contact_map.zip'
wf_contact_map             = f'{out_path}/docking/{step_idx}/wf_contact_map.zip'

prop = {}

# Create and launch bb
contact_map(input_haddock_wf_data_zip  = wf_caprieval4,
            output_contactmap_zip_path = output_contactmap_zip_path,
            output_haddock_wf_data_zip = wf_contact_map,
            properties                 = def_dict(prop))

2025-06-02 16:58:13,191 [MainThread  ] [INFO ]  Module: biobb_haddock.haddock.contact_map Version: 5.0.1
2025-06-02 16:58:13,192 [MainThread  ] [INFO ]  /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_d2ea5afc-3a9b-419b-8657-3df1a4997632 directory successfully created
2025-06-02 16:58:13,205 [MainThread  ] [INFO ]  Copy: ./data/antibody/docking/11/wf_caprieval4.zip to /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/sandbox_d2ea5afc-3a9b-419b-8657-3df1a4997632
2025-06-02 16:58:13,380 [MainThread  ] [INFO ]  Extracting: /home/rchaves/repo/biobb_wf_haddock/biobb_wf_haddock/notebooks/data/antibody/docking/11/wf_caprieval4.zip
2025-06-02 16:58:13,381 [MainThread  ] [INFO ]  to:
2025-06-02 16:58:13,381 [MainThread  ] [INFO ]  ['291d8196-a5c3-4ce6-91d7-7afa8e5c91b5/00_topoaa', '291d8196-a5c3-4ce6-91d7-7afa8e5c91b5/01_rigidbody', '291d8196-a5c3-4ce6-91d7-7afa8e5c91b5/02_caprieval', '291d8196-a5c3-4ce6-91d7-7afa8e5c91b5/03_seletop', '291d8196-a5c3-4ce6-91d7

1

In [204]:
htmls = sorted(glob.glob(f'{out_path}/final_results/11_contactmap/cluster*.html'))

models = []
for i, m in enumerate(htmls):
    models.append(m.split('/')[-1])
    
mdsel = ipywidgets.Dropdown(
    options=models,
    description='Sel. model:',
    disabled=False,
)
def on_dropdown_change(change):
    """Handle dropdown selection changes.
    From https://github.com/nglviewer/nglview/issues/765
    """
    if change['type'] == 'change' and change['name'] == 'value': 
        contact_map_path = f'{out_path}/final_results/11_contactmap/{change['new']}'
        open_results(contact_map_path)
        
# Register the callback function
mdsel.observe(on_dropdown_change, names='value')
mdsel

Dropdown(description='Sel. model:', options=('cluster1_rank1_chordchart.html', 'cluster1_rank1_heatmap.html', …

***
### 13. Final Results

**Final results** of the **docking process** contain various steps of the defined **workflow** numbered sequentially starting at 0, e.g.:

     00_topoaa/
     01_rigidbody/
     02_caprieval/

In addition, there are four **additional directories** and a **log file**:

- ***analysis directory***: contains various plots to visualize the results for each caprieval step and a general report (report.html) that provides all statistics with various plots (to be opened in your preferred web browser)
- ***data directory***: contains the input data (PDB and restraint files) for the various modules, as well as an input workflow (in configurations directory)
- ***toppar directory***: contains the force field topology and parameter files 
- ***traceback directory***: contains traceback.tsv, which links all models to see which model originates from which throughout all steps of the workflow.
- ***log file***: text file with information about the docking process and duration of the run

Each ***sampling/refinement/selection*** module will contain **PDB files**. For example, the 09_seletopclusts directory contains the **selected models** from each cluster. The **clusters** in that directory are numbered based on their rank, i.e. cluster_1 refers to the top-ranked cluster. Information about the origin of these files can be found in that directory in the seletopclusts.txt file.

***

In [57]:
step_idx = 13

final_results = f'{out_path}/final_results'

with zipfile.ZipFile(wf_contact_map, 'r') as zip_ref:
    zip_ref.extractall(final_results)

In [58]:
# Listing directories of the final results
!ls {out_path}/final_results

00_topoaa     03_seletop    06_emref	  09_seletopclusts  analysis  traceback
01_rigidbody  04_flexref    07_caprieval  10_caprieval	    data
02_caprieval  05_caprieval  08_clustfcc   11_contactmap     log


In [59]:
# Listing -seletopclust- directory content (Note the generated PDB models)
!ls {out_path}/final_results/09_seletopclusts

cluster_1_model_1.pdb.gz  cluster_2_model_2.pdb.gz  cluster_3_model_3.pdb.gz
cluster_1_model_2.pdb.gz  cluster_2_model_3.pdb.gz  cluster_3_model_4.pdb.gz
cluster_1_model_3.pdb.gz  cluster_2_model_4.pdb.gz  io.json
cluster_1_model_4.pdb.gz  cluster_3_model_1.pdb.gz  params.cfg
cluster_2_model_1.pdb.gz  cluster_3_model_2.pdb.gz  seletopclusts.txt


In [60]:
# Listing ANALYSES directories of the final results
!ls {out_path}/final_results/analysis

01_rigidbody_analysis  08_clustfcc_analysis	  4_flexref_analysis
02_caprieval_analysis  09_seletopclusts_analysis  5_caprieval_analysis
03_seletop_analysis    10_caprieval_analysis	  6_emref_analysis
04_flexref_analysis    11_contactmap_analysis	  7_caprieval_analysis
05_caprieval_analysis  1_rigidbody_analysis	  8_clustfcc_analysis
06_emref_analysis      2_caprieval_analysis	  9_seletopclusts_analysis
07_caprieval_analysis  3_seletop_analysis	  README.md


In [61]:
# Listing -seletopclusts- directory content of the final ANALYSES results
!ls {out_path}/final_results/analysis/09_seletopclusts_analysis 

air_clt.html	   dockq_score.html    ilrmsd_elec.html   lrmsd_desolv.html
blosum62_A.aln	   dockq_vdw.html      ilrmsd_score.html  lrmsd_elec.html
blosum62_B.aln	   elec_clt.html       ilrmsd_vdw.html	  lrmsd_score.html
blosum62.izone	   fnat_air.html       io.json		  lrmsd_vdw.html
bsa_clt.html	   fnat_clt.html       irmsd_air.html	  report.html
capri_clt.tsv	   fnat_desolv.html    irmsd_clt.html	  score_clt.html
capri_ss.tsv	   fnat_elec.html      irmsd_desolv.html  summary.tgz
desolv_clt.html    fnat_score.html     irmsd_elec.html	  vdw_clt.html
dockq_air.html	   fnat_vdw.html       irmsd_score.html   weights_params.json
dockq_clt.html	   ilrmsd_air.html     irmsd_vdw.html
dockq_desolv.html  ilrmsd_clt.html     lrmsd_air.html
dockq_elec.html    ilrmsd_desolv.html  lrmsd_clt.html


***

## Questions & Comments

Questions, issues, suggestions and comments are really welcome!

* GitHub issues:
    * [https://github.com/bioexcel/biobb](https://github.com/bioexcel/biobb)

* BioExcel forum:
    * [https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library](https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library)
