
APE-Gen: Anchored Peptide-MHC Ensemble Generator


  • APE-Gen is a tool that generates multiple clash-free conformations of a peptide bound to an MHC

  • The only required inputs are the peptide sequence and the name of the MHC allotype (if that allotype is supported)

  • Minimal example: python QFKDNVILL HLA-A*24:02

  • For a peptide with "n" residues, APE-Gen will use a template of another peptide with "n" residues to place the anchor residues in the correct pocket of the MHC

    • to change the template, modify n-mer-templates.txt for the corresponding n-mer, then add the template pdb file (peptide and MHC) to the templates/ folder
  • A receptor model is also needed to run APE-Gen

    • if the allotype of the MHC is supported (found in receptor-class-templates.txt), then simply specify the name as input into APE-Gen
    • if the allotype is not supported but you have a PDB file of the receptor:
      • place the PDB file inside the templates/ folder
      • modify receptor-class-templates.txt with an appropriate name
      • then call APE-Gen with the chosen name
    • One way to obtain a model of the receptor is to use the provided scripts in modeller_scripts/ for homology modelling (see below)
  • PDB File preparation:

    • For PDB files to be used as input anywhere in the script, the chains must be labelled in a particular way
    • chain A is the heavy (alpha) chain, chain B is the light chain (beta-2 microglobulin), and chain C is the peptide
  • Specifying receptor degrees-of-freedom

    • modify flex_res.txt to add/remove residues that are allowed to be flexible during the SMINA minimization step
  • dunbrack.bin and loco.score are files required for the RCD step

  • is required for using PYMOL for alignment of PDB files

  • is needed for further minimization using OpenMM
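
The chain-labelling convention described under "PDB File preparation" can be checked before a PDB file is placed in templates/. The following is a minimal sketch that uses only the Python standard library; the function names are hypothetical and are not part of APE-Gen. It relies on the fixed-column PDB format, where the chain identifier is the 22nd character of each ATOM/HETATM record.

```python
def chains_in_pdb(pdb_text):
    """Return the set of chain IDs found on ATOM/HETATM records."""
    chains = set()
    for line in pdb_text.splitlines():
        # In the PDB format the chain ID sits in column 22 (index 21).
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains.add(line[21])
    return chains


def has_expected_chains(pdb_text, expected=("A", "B", "C")):
    """True if every expected chain ID (A, B and C by default) is present."""
    return set(expected) <= chains_in_pdb(pdb_text)
```

A typical use would be `has_expected_chains(open("my_receptor.pdb").read())`, where the file name is a placeholder. The check only confirms that the labels exist; it does not verify that chain C is actually the peptide.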


  • Each round of APE-Gen is saved within a folder with the index of the round (counting from 0)

  • full_system_confs/ contains the ensemble of conformations of peptide and MHC after energy refinement and filtering

    • each conformation is named by the index of the loop as generated by RCD
      • not every conformation makes it past the energy refinement and filtering step
  • peptide_confs.pdb contains only the peptide conformations

  • filtered_energies.npz is a numpy file that contains the energies of the ensemble according to the SMINA scoring function

    • it contains two arrays of the same size: filtered_indices which contains indices of each conformation and filtered_energies which contains the corresponding energies
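
The two arrays in filtered_energies.npz can be read back with NumPy to find the best-scoring conformation of a round. This is a minimal sketch, assuming that lower scores are better and that filtered_indices holds integer loop indices; the function name is hypothetical.

```python
import numpy as np


def lowest_energy_conformation(npz_path):
    """Return (loop_index, energy) of the lowest-energy conformation."""
    with np.load(npz_path) as data:
        indices = data["filtered_indices"]
        energies = data["filtered_energies"]
    # Assumption: a lower score means a better conformation.
    best = int(np.argmin(energies))
    return int(indices[best]), float(energies[best])
```

For example, `lowest_energy_conformation("0/filtered_energies.npz")` would inspect the first round, assuming the file sits directly inside the round folder.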

Helper scripts


    • usage: python <pdb code>
    • assumes pdb code is of a peptide-MHC structure
    • adds missing atoms/residues, removes all waters and ions, labels chains as A,B,C where chain C is the peptide

    • Usage: ~/pymol/bin/pymol -qc <pdb> <selection> <new_residue in 3-letter code> <name of new pdb file>
    • example: ~/pymol/bin/pymol -qc 0.pdb C/1/ ALA 0_mutated.pdb

Options help:

usage: [-h] [-n NUM_CORES] [-l NUM_LOOPS] [-t RCD_DIST_TOL] [-r]
                  [-d] [-p] [-a ANCHOR_TOL] [-o] [-g NUM_ROUNDS]
                  [-b {receptor_only,pep_and_recept}] [-s] [--use_gpu]
                  [--no_progress] [--clean_rcd]
                  peptide_input receptor_class

Anchored Peptide-MHC Ensemble Generator

positional arguments:
  peptide_input         Sequence of peptide to dock or pdbfile of crystal
  receptor_class        Class descriptor of MHC receptor. Use REDOCK along
                        with crystal input to perform redocking. Or pass a PDB
                        file with receptor

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_CORES, --num_cores NUM_CORES
                        Number of cores to use for RCD and smina computations.
                        (default: 8)
  -l NUM_LOOPS, --num_loops NUM_LOOPS
                        Number of loops to generate with RCD. (Note that the
                        final number of sampled conformations may be less due
                        to steric clashes. (default: 100)
  -t RCD_DIST_TOL, --RCD_dist_tol RCD_DIST_TOL
                        RCD tolerance (in angstroms) of inner residues when
                        performing IK (default: 1.0)
  -r, --rigid_receptor  Disable sampling of receptor degrees of freedom
                        specified in flex_res.txt (default: False)
  -d, --debug           Print extra information for debugging (default: False)
  -p, --save_only_pep_confs
                        Disable saving full conformations (peptide and MHC)
                        (default: False)
  -a ANCHOR_TOL, --anchor_tol ANCHOR_TOL
                        Anchor tolerance (in angstroms) of first and last
                        backbone atoms of peptide when filtering (default:
  -o, --score_with_openmm
                        Rescore full conformations with openmm (AMBER)
                        (default: False)
  -g NUM_ROUNDS, --num_rounds NUM_ROUNDS
                        Number of rounds to perform. (default: 1)
  -b {receptor_only,pep_and_recept}, --pass_type {receptor_only,pep_and_recept}
                        When using multiple rounds, pass best scoring
                        conformation across different rounds (choose either
                        'receptor_only' or 'pep_and_recept') (default:
  -s, --min_with_smina  Minimize with SMINA instead of the default Vinardo
                        (default: False)
  --use_gpu             Use GPU for OpenMM Minimization step (default: False)
  --no_progress         Do not print progress bar (default: False)
  --clean_rcd           Remove RCD folder at the end of each round (default:
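
When APE-Gen is launched from another Python program, the options above can be assembled into an argument list. The sketch below is illustrative only: `build_apegen_command` is a hypothetical helper, and `script_path` must be supplied by the caller because the script name is not given in the usage line above. Only flags listed in the help text are used.

```python
import sys


def build_apegen_command(script_path, peptide, receptor_class,
                         num_cores=8, num_loops=100, num_rounds=1,
                         rigid_receptor=False, score_with_openmm=False):
    """Build an argument list for running APE-Gen via subprocess."""
    cmd = [sys.executable, script_path, peptide, receptor_class,
           "-n", str(num_cores),   # --num_cores
           "-l", str(num_loops),   # --num_loops
           "-g", str(num_rounds)]  # --num_rounds
    if rigid_receptor:
        cmd.append("-r")           # --rigid_receptor
    if score_with_openmm:
        cmd.append("-o")           # --score_with_openmm
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing the arguments as a list rather than a shell string means the `*` in an allotype name such as HLA-A*24:02 is never subject to shell globbing.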

Installation instructions:

  1. install miniconda
  2. using conda, install the following
    • conda install -c bioconda smina
    • conda install -c omnia pdbfixer
    • conda install -c conda-forge mdtraj
    • conda install -c schrodinger pymol
    • conda install -c bioconda autodock-vina
    • conda install -c omnia -c conda-forge openmm
  3. install RCD (v1.40)

Using Modeller script

usage: [-h] [-n NUM_MODELS] alpha_chain_seq template

Homology Modeling of HLAs using Modeller

positional arguments:
  alpha_chain_seq       Fasta file (.fasta) containing the sequence of the
                        alpha chain of HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, the program will
                        try to download the sequence from EMBL-EBI.
  template              PDB of the template HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, a template based
                        on the allele's supertype (as defined in
                        supertype_templates.csv) will be chosen.

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_MODELS, --num_models NUM_MODELS
                        Number of models to sample with Modeller (default: 10)
  • Requires Modeller, Biopython, BeautifulSoup4, and requests
    • conda install -c salilab modeller
    • conda install -c conda-forge biopython
    • conda install -c anaconda beautifulsoup4
    • conda install -c anaconda requests
  • Model with the best DOPE score is found in best_model.pdb
  • Example: python P01892.fasta 3I6L.pdb
    • Requires a Modeller license key
    • models HLA-A*02:01 using 3I6L as a template
      • 3I6L contains a model of HLA-A*24:02
  • Even simpler example: python HLA-A*02:01 HLA-A*02:01
    • The first argument tells the script to download the sequence of HLA-A*02:01
    • The second argument tells the script to download a template PDB for a representative allele from the same supertype classification

Instructions to run Minimal Example with Docker

  • Open a Terminal
  • Pull image from Docker hub: docker pull jayab867/apegen:v2.0
  • Go to the directory where you would like the APE-Gen results to be saved
  • Create a container that links the current working directory to a directory in the container called /data
    • docker run -it --rm -v $(pwd):/data --workdir "/data" jayab867/apegen:v2.0
  • Run APE-Gen: python /APE-Gen/ QFKDNVILL HLA-A*24:02
  • Exit the container with Ctrl-D
  • The working directory will then contain one folder per round of APE-Gen, each holding the results of that round
    • The default is one round, so this example produces a single folder called 0/
    • In each folder:
      • The best scoring conformation for each round is called min_energy_system.pdb
      • The whole ensemble generated is located in full_system_confs/
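
The output layout described above can be inspected with a short script once the container has exited. This is a minimal sketch, assuming that round folders are named by their integer index and that the conformations in full_system_confs/ carry a .pdb extension; `summarize_rounds` is a hypothetical helper, not part of APE-Gen.

```python
from pathlib import Path


def summarize_rounds(results_dir):
    """Map each round index to its best-model flag and ensemble size."""
    summary = {}
    for entry in Path(results_dir).iterdir():
        # Round folders are assumed to be named 0, 1, 2, ...
        if entry.is_dir() and entry.name.isdigit():
            confs_dir = entry / "full_system_confs"
            n_confs = len(list(confs_dir.glob("*.pdb"))) if confs_dir.is_dir() else 0
            summary[int(entry.name)] = {
                "has_best": (entry / "min_energy_system.pdb").is_file(),
                "n_confs": n_confs,
            }
    return dict(sorted(summary.items()))
```

Calling `summarize_rounds(".")` from the directory that was mounted as /data would report, for each round, whether min_energy_system.pdb exists and how many conformations survived filtering.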