This notebook runs the HDOCK-Multimer (HDM) modeling pipeline, which integrates docking and assembly strategies to predict the structures of large protein assemblies.

By default, the notebook runs on the example [PDB: 7D3U](https://www.rcsb.org/structure/7D3U). To predict your own complex, upload your stoi.json and .PDB files to Google Drive and change the path in the "Run HDM" cell, as instructed in that cell.

**Inputs:**
1. stoi.json - a JSON file recording the stoichiometry and subunit information of the complex.
2. mono_files/ - a folder containing AlphaFold2-predicted monomer files.
3. subcomponents/ - a folder containing AlphaFold-Multimer-predicted subcomponent files.
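
As a quick pre-flight check, the input layout listed above can be verified before running the pipeline. The sketch below is illustrative only: `missing_inputs` is a hypothetical helper, not part of HDM, and it checks only that the three entries exist, not that their contents are valid.

```python
import os

def missing_inputs(path_task):
    """Return the expected input entries that are absent from path_task.

    Hypothetical helper (not part of HDM): it only tests for the presence of
    stoi.json, mono_files/ and subcomponents/, not for valid contents.
    """
    expected = {
        "stoi.json": os.path.isfile,      # stoichiometry file
        "mono_files": os.path.isdir,      # AF2 monomer models
        "subcomponents": os.path.isdir,   # AFM subcomponent models
    }
    return [name for name, exists in expected.items()
            if not exists(os.path.join(path_task, name))]
```

Calling `missing_inputs` on your task folder returns an empty list when all three entries are in place; otherwise it returns the names that still need to be uploaded.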

**Outputs:** \
A series of .PDB files containing the structures of the full complex.
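
To work with the output models programmatically, they can be collected in numeric order. This is a minimal sketch that assumes the models are named `model_<n>.pdb`, the naming used by the display cell of this notebook; `list_models` is a hypothetical helper, not part of HDM.

```python
import os
import re

def list_models(result_path):
    """Return paths of model_<n>.pdb files in result_path, ordered by n.

    Hypothetical helper (not part of HDM). The model_<n>.pdb naming is an
    assumption taken from this notebook's display cell.
    """
    pattern = re.compile(r"model_(\d+)\.pdb")
    numbered = []
    for name in os.listdir(result_path):
        match = pattern.fullmatch(name)
        if match:
            # Keep the integer so that model_10 sorts after model_2
            numbered.append((int(match.group(1)), name))
    return [os.path.join(result_path, name) for _, name in sorted(numbered)]
```

Sorting on the parsed integer rather than on the filename avoids the lexicographic ordering in which `model_10.pdb` would precede `model_2.pdb`.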

For more information, see [HDOCK-Multimer](https://github.com/xyao7/HDOCK-Multimer).


In [None]:
#@title Download and install HDM (~2 min)
!pip -q install -U "git+https://github.com/openmm/pdbfixer.git"
!pip -q install py3Dmol
!echo Installed python dependencies

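!echo Installing HDM
!wget -qnc https://github.com/xyao7/HDOCK-Multimer/archive/refs/heads/main.zip -O HDM-master.zip
# -n skips files that already exist, so re-running the cell does not prompt
!unzip -qn HDM-master.zip
# Report success only if setup.sh exits without error
!cd HDOCK-Multimer-main && bash setup.sh && echo HDM installed successfully!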

import py3Dmol
import os

def view_pdb_color_by_chain(pdb_path: str):
  # Read the whole file; the context manager closes the handle afterwards
  with open(pdb_path, "r") as f:
    pdb_content = f.read()
  view = py3Dmol.view(width=400, height=300)
  view.addModelsAsFrames(pdb_content)

  # Get the chain IDs (column 22) from coordinate records only, in a stable order.
  # Other records (HEADER, REMARK, ...) hold unrelated text in that column.
  chains = sorted({line[21] for line in pdb_content.split("\n")
                   if line.startswith(("ATOM", "HETATM")) and len(line) > 21})

  # Assign a color to each chain
  colors = ["red", "blue", "green", "orange", "purple", "yellow", "pink", "brown", "black", "gray", "cyan", "magenta", "olive", "maroon", "navy", "teal", "gold", "silver", "crimson"]

  # Set the style for each chain, cycling through the palette if there are more chains than colors
  for i, chain in enumerate(chains):
      view.setStyle({'chain': chain}, {'cartoon': {'color': colors[i % len(colors)]}})

  view.zoomTo()
  view.show()

In [None]:
#@title View example elements {run: "auto"}

#@markdown  Here we demonstrate the input used to create a model of [PDB: 7D3U](https://www.rcsb.org/structure/7D3U), which is composed of 6 different chains.

#@markdown  - stoi.json - Defines the stoichiometry and subunit information.

#@markdown  - PDB files - Fall into two categories:
#@markdown  1. monomer files (saved in the `mono_files/` folder): monomer models predicted by AlphaFold2 (AF2)\
#@markdown  2. subcomponent files (saved in the `subcomponents/` folder): subcomponent models predicted by AlphaFold-Multimer (AFM)

#@markdown  You can view elements in the input by choosing them and running the cell.

element_to_view = "stoi.json" #@param ["stoi.json", "mono_files/A.pdb", "mono_files/B.pdb", "mono_files/C.pdb", "mono_files/D.pdb", "mono_files/E.pdb", "mono_files/F.pdb", "subcomponents/7D3U_ABC.pdb", "subcomponents/7D3U_ABD.pdb", "subcomponents/7D3U_ABE.pdb", "subcomponents/7D3U_ABF.pdb", "subcomponents/7D3U_ACD.pdb", "subcomponents/7D3U_ACE.pdb", "subcomponents/7D3U_ACF.pdb", "subcomponents/7D3U_ADE.pdb", "subcomponents/7D3U_ADF.pdb", "subcomponents/7D3U_AEF.pdb", "subcomponents/7D3U_BCD.pdb", "subcomponents/7D3U_BCE.pdb", "subcomponents/7D3U_BCF.pdb", "subcomponents/7D3U_BDE.pdb", "subcomponents/7D3U_BDF.pdb", "subcomponents/7D3U_BEF.pdb", "subcomponents/7D3U_CDE.pdb", "subcomponents/7D3U_CDF.pdb", "subcomponents/7D3U_CEF.pdb", "subcomponents/7D3U_DEF.pdb"]

example_path = "/content/HDOCK-Multimer-main/examples/7D3U"

stoi_path = os.path.join(example_path, "stoi.json")
if element_to_view == "stoi.json":
  print(open(stoi_path, 'r').read())
else:
  view_pdb_color_by_chain(os.path.join(example_path, element_to_view))


In [None]:
#@title Run HDM

#@markdown The folder `path_task/` should contain:
#@markdown (1) a file named "stoi.json" recording the stoichiometry and subunit information,
#@markdown (2) a folder named "mono_files/" storing monomer structure files predicted by AF2,
#@markdown and (3) a folder named "subcomponents/" storing subcomponent structure files predicted by AFM.

#@markdown The results will be saved to a new folder named "results", under the folder `path_task/`.

import os 

path_task="/content/HDOCK-Multimer-main/examples/7D3U/" #@param {type:"string"}
max_results_number = "10" #@param[1, 5, 10, 20]

!bash HDOCK-Multimer-main/HDM_pipeline.sh \
  -stoi "{path_task}/stoi.json" \
  -mono_dir "{path_task}/mono_files/" \
  -sub_dir "{path_task}/subcomponents/" \
  -nmax {max_results_number}


In [None]:
#@title Display output 3D structure models {run: "auto"}
model_num = "1" #@param [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
model_num = int(model_num)

result_path = os.path.join(path_task, "results")
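output_filenames = [i for i in os.listdir(result_path) if i.endswith(".pdb")]
output_path = os.path.join(result_path, f"model_{model_num}.pdb")
# Models are numbered from 1, so check for the file itself rather than comparing against the count
assert os.path.isfile(output_path), f"model_{model_num}.pdb not found; {result_path} has {len(output_filenames)} .pdb files"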

view_pdb_color_by_chain(output_path)
