example_data
py2 moving modeller scripts py2 to py2 folder Jul 12, 2019
singularity
templates
.gitignore
APE_Gen.py
APE_Gen_IO.py
Dockerfile
LICENSE
README.md
__init__.py
align.py
dunbrack.bin
environment.yml
example.ipynb
flex_res.txt
flex_res.txt.orig
get_pMHC_pdb.py
loco.score
minimize.py
model_receptor.py
mutate.py
n-mer-templates.txt
receptor-class-templates.txt added A0203 Aug 8, 2019
rename_atoms.py
setup_notebook.sh
supertype_templates.csv

README.md

APE-Gen: Anchored Peptide-MHC Ensemble Generator

Description:

  • APE-Gen is a tool that generates multiple clash-free conformations of a peptide bound to an MHC

  • All that is required for input is the sequence of the peptide and the MHC allotype name (if supported)

  • Minimal example: python APE_Gen.py QFKDNVILL HLA-A*24:02

  • For a peptide with "n" residues, APE-Gen will use a template of another peptide with "n" residues to place the anchor residues in the correct pocket of the MHC

    • to change the template, modify n-mer-templates.txt for the corresponding n-mer, then add the template pdb file (peptide and MHC) to the templates/ folder
  • A receptor model is also needed to run APE-Gen

    • if the allotype of the MHC is supported (found in receptor-class-templates.txt), then simply specify the name as input into APE-Gen
    • if the allotype is not supported but you have a PDB file of the receptor:
      • place the PDB file inside the templates/ folder
      • modify receptor-class-templates.txt with an appropriate name
      • then call APE-Gen with the chosen name
    • One way to obtain a model of the receptor is to use the provided scripts in modeller_scripts/ for homology modelling (see below)
  • PDB File preparation:

    • For PDB files to be used as input anywhere in the script, the chains must be labelled in a particular way
    • chain A is the heavy (alpha) chain, chain B is the light chain (beta-2 microglobulin), chain C is the peptide
  • Specifying receptor degrees-of-freedom

    • modify flex_res.txt to add/remove residues that are allowed to be flexible during the SMINA minimization step
  • dunbrack.bin and loco.score are files required for the RCD step

  • align.py is required for using PYMOL for alignment of PDB files

  • minimize.py is needed for further minimization using OpenMM
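
The chain labelling convention described above (A, B, C) can be checked before a PDB file is placed in templates/. The following is a minimal sketch that is not part of APE-Gen: the function names are hypothetical, and it assumes standard fixed-column PDB ATOM records (chain ID in column 22).

```python
from collections import OrderedDict


def chain_residue_counts(pdb_lines):
    """Count distinct residues per chain ID, using ATOM records only."""
    residues = OrderedDict()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]      # column 22 of the PDB format
        res_key = line[22:27]    # residue number plus insertion code
        residues.setdefault(chain_id, set()).add(res_key)
    return {chain: len(keys) for chain, keys in residues.items()}


def check_chains(pdb_lines, expected=("A", "B", "C")):
    """Return (counts, problems); problems is empty if labels match."""
    counts = chain_residue_counts(pdb_lines)
    problems = []
    for chain in expected:
        if chain not in counts:
            problems.append("missing chain " + chain)
    for chain in sorted(set(counts) - set(expected)):
        problems.append("unexpected chain " + repr(chain))
    return counts, problems
```

For a receptor-only file, pass expected=("A", "B"). To check a file on disk, open it and pass the file object, since the functions only iterate over lines.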

Output

  • Each round of APE-Gen is saved within a folder with the index of the round (counting from 0)

  • full_system_confs/ contains the ensemble of conformations of peptide and MHC after energy refinement and filtering

    • each conformation is named by the index of the loop as generated by RCD
      • not every conformation makes it past the energy refinement and filtering step
  • peptide_confs.pdb contains only the peptide conformations

  • filtered_energies.npz is a numpy file that contains the energies of the ensemble according to the SMINA scoring function

    • it contains two arrays of the same size: filtered_indices which contains indices of each conformation and filtered_energies which contains the corresponding energies
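
The two arrays in filtered_energies.npz can be read with numpy to rank the ensemble. This is a minimal sketch, not code from the repository: the function name is hypothetical, and it assumes that the indices are integers and that a lower energy means a better conformation.

```python
import numpy as np


def rank_conformations(npz_path):
    """Return (index, energy) pairs sorted from lowest to highest energy."""
    with np.load(npz_path) as data:
        indices = data["filtered_indices"]
        energies = data["filtered_energies"]
    order = np.argsort(energies)  # ascending: lowest energy first
    return [(int(indices[i]), float(energies[i])) for i in order]
```

The first pair in the returned list is the lowest-energy conformation; its index should correspond to a file name in full_system_confs/.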

Helper scripts

  • get_pMHC_pdb.py

    • usage: python get_pMHC_pdb.py <pdb code>
    • assumes pdb code is of a peptide-MHC structure
    • adds missing atoms/residues, removes all waters and ions, labels chains as A,B,C where chain C is the peptide
  • mutate.py

    • Usage: ~/pymol/bin/pymol -qc mutate.py <pdb> <selection> <new_residue in 3-letter code> <name of new pdb file>
    • example: ~/pymol/bin/pymol -qc mutate.py 0.pdb C/1/ ALA 0_mutated.pdb

Options help:

usage: APE_Gen.py [-h] [-n NUM_CORES] [-l NUM_LOOPS] [-t RCD_DIST_TOL] [-r]
                  [-d] [-p] [-a ANCHOR_TOL] [-o] [-g NUM_ROUNDS]
                  [-b {receptor_only,pep_and_recept}] [-s] [--use_gpu]
                  [--no_progress] [--clean_rcd]
                  peptide_input receptor_class

Anchored Peptide-MHC Ensemble Generator

positional arguments:
  peptide_input         Sequence of peptide to dock or pdbfile of crystal
                        structure
  receptor_class        Class descriptor of MHC receptor. Use REDOCK along
                        with crystal input to perform redocking. Or pass a PDB
                        file with receptor

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_CORES, --num_cores NUM_CORES
                        Number of cores to use for RCD and smina computations.
                        (default: 8)
  -l NUM_LOOPS, --num_loops NUM_LOOPS
                        Number of loops to generate with RCD. (Note that the
                        final number of sampled conformations may be less due
                        to steric clashes. (default: 100)
  -t RCD_DIST_TOL, --RCD_dist_tol RCD_DIST_TOL
                        RCD tolerance (in angstroms) of inner residues when
                        performing IK (default: 1.0)
  -r, --rigid_receptor  Disable sampling of receptor degrees of freedom
                        specified in flex_res.txt (default: False)
  -d, --debug           Print extra information for debugging (default: False)
  -p, --save_only_pep_confs
                        Disable saving full conformations (peptide and MHC)
                        (default: False)
  -a ANCHOR_TOL, --anchor_tol ANCHOR_TOL
                        Anchor tolerance (in angstroms) of first and last
                        backbone atoms of peptide when filtering (default:
                        2.0)
  -o, --score_with_openmm
                        Rescore full conformations with openmm (AMBER)
                        (default: False)
  -g NUM_ROUNDS, --num_rounds NUM_ROUNDS
                        Number of rounds to perform. (default: 1)
  -b {receptor_only,pep_and_recept}, --pass_type {receptor_only,pep_and_recept}
                        When using multiple rounds, pass best scoring
                        conformation across different rounds (choose either
                        'receptor_only' or 'pep_and_recept') (default:
                        receptor_only)
  -s, --min_with_smina  Minimize with SMINA instead of the default Vinardo
                        (default: False)
  --use_gpu             Use GPU for OpenMM Minimization step (default: False)
  --no_progress         Do not print progress bar (default: False)
  --clean_rcd           Remove RCD folder at the end of each round (default:
                        False)
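
When APE-Gen is driven from another Python script, the command line can be assembled as a list of arguments. This is a sketch under assumptions: the function name and its keyword names are hypothetical, only flags listed in the help text above are used, and the script path defaults to APE_Gen.py in the current directory.

```python
import sys


def build_ape_gen_command(peptide, receptor_class, script="APE_Gen.py",
                          num_cores=None, num_loops=None, num_rounds=None,
                          rigid_receptor=False, score_with_openmm=False):
    """Build an argument list for APE_Gen.py; unset options are omitted."""
    cmd = [sys.executable, script]
    if num_cores is not None:
        cmd += ["-n", str(num_cores)]
    if num_loops is not None:
        cmd += ["-l", str(num_loops)]
    if num_rounds is not None:
        cmd += ["-g", str(num_rounds)]
    if rigid_receptor:
        cmd.append("-r")
    if score_with_openmm:
        cmd.append("-o")
    # Positional arguments go last: peptide, then receptor class.
    cmd += [peptide, receptor_class]
    return cmd
```

Passing the list to subprocess.run(cmd, check=True) avoids a shell, so the "*" in names such as HLA-A*24:02 is not treated as a wildcard.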

Installation instructions:

  1. install miniconda
  2. using conda, install the following
    • conda install -c bioconda smina
    • conda install -c omnia pdbfixer
    • conda install -c conda-forge mdtraj
    • conda install -c schrodinger pymol
    • conda install -c bioconda autodock-vina
    • conda install -c omnia -c conda-forge openmm
  3. install RCD (v1.40)
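
After installation, a quick check that the command-line tools are on the PATH can catch problems early. This is only a sketch: the function is hypothetical, and the executable names below are assumptions that may differ on your system (the RCD binary is left out because its name depends on how it was installed).

```python
import shutil

# Assumed executable names; adjust them to match your installation.
ASSUMED_TOOLS = ("smina", "vina", "pymol")


def missing_tools(names=ASSUMED_TOOLS):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]
```

An empty list means every listed tool was found; otherwise the returned names indicate which packages to reinstall or which environment to activate.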

Using Modeller script

usage: model_receptor.py [-h] [-n NUM_MODELS] alpha_chain_seq template

Homology Modeling of HLAs using Modeller

positional arguments:
  alpha_chain_seq       Fasta file (.fasta) containing the sequence of the
                        alpha chain of HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, the program will
                        try to download the sequence from EMBL-EBI.
  template              PDB of the template HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, a template based
                        on the allele's supertype (as defined in
                        supertype_templates.csv) will be chosen.

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_MODELS, --num_models NUM_MODELS
                        Number of models to sample with Modeller (default: 10)
  • Requires Modeller, Biopython, BeautifulSoup4, and Requests
    • conda install -c salilab modeller
    • conda install -c conda-forge biopython
    • conda install -c anaconda beautifulsoup4
    • conda install -c anaconda requests
  • Model with the best DOPE score is found in best_model.pdb
  • Example: python model_receptor.py P01892.fasta 3I6L.pdb
    • Modeller requires a license key
    • models HLA-A*02:01 using 3I6L as a template
      • 3I6L contains a structure of HLA-A*24:02
  • Even simpler example: python model_receptor.py HLA-A*02:01 HLA-A*02:01
    • First argument says that the sequence of HLA-A*02:01 will be downloaded
    • Second argument says that a template PDB will be downloaded based on a representative allele from the same supertype classification

Instructions to run Minimal Example with Docker

  • Open a Terminal
  • Pull image from Docker hub: docker pull jayab867/apegen:v2.0
  • Go to the directory where you would like the APE-Gen results to be saved
  • Create a container that links the current working directory to a directory in the container called /data
    • docker run -it --rm -v $(pwd):/data --workdir "/data" jayab867/apegen:v2.0
  • Run APE-Gen: python /APE-Gen/APE_Gen.py QFKDNVILL HLA-A*24:02
  • Exit the container with Ctrl-D
  • There will be a number of folders which contain the results for each round of APE-Gen
    • Default is one round: so a single folder in this example called 0/
    • In each folder:
      • The best scoring conformation for each round is called min_energy_system.pdb
      • The whole ensemble generated is located in full_system_confs/
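
The per-round folders described above can be summarized with a short script once the run has finished. This is a minimal sketch, not part of APE-Gen: the function name is hypothetical, and it assumes that round folders have purely numeric names and that each conformation in full_system_confs/ is stored as a .pdb file.

```python
from pathlib import Path


def summarize_rounds(results_dir):
    """List each round folder, whether its best model exists, and ensemble size."""
    round_dirs = sorted(
        (p for p in Path(results_dir).iterdir()
         if p.is_dir() and p.name.isdigit()),
        key=lambda p: int(p.name),
    )
    summary = []
    for round_dir in round_dirs:
        ensemble_dir = round_dir / "full_system_confs"
        n_confs = len(list(ensemble_dir.glob("*.pdb"))) if ensemble_dir.is_dir() else 0
        summary.append({
            "round": int(round_dir.name),
            "has_best": (round_dir / "min_energy_system.pdb").is_file(),
            "n_confs": n_confs,
        })
    return summary
```

Calling summarize_rounds(".") from the directory that was mounted as /data returns one entry per round, which is a single entry for the default of one round.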