- APE-Gen is a tool that generates multiple clash-free conformations of a peptide bound to an MHC
- All that is required for input is the sequence of the peptide and the MHC allotype name (if supported)
- Minimal example: `python APE_Gen.py QFKDNVILL HLA-A*24:02`
- For a peptide with "n" residues, APE-Gen will use a template of another peptide with "n" residues to place the anchor residues in the correct pocket of the MHC
  - to change the template, modify `n-mer-templates.txt` for the corresponding n-mer, then add the template pdb file (peptide and MHC) to the `templates/` folder
- A receptor model is also needed to run APE-Gen
  - if the allotype of the MHC is supported (found in `receptor-class-templates.txt`), then simply specify the name as input into APE-Gen
  - if the allotype is not supported, but one has the PDB file of the receptor
    - place the PDB file inside the `templates/` folder
    - modify `receptor-class-templates.txt` with an appropriate name
    - then call APE-Gen with the chosen name
  - One way to obtain a model of the receptor is to use the provided scripts in `modeller_scripts/` for homology modelling (see below)
- PDB file preparation:
  - For PDB files to be used as input anywhere in the script, the chains must be labelled in a particular way
  - chain A is the heavy alpha chain, chain B is the light beta-2-microglobulin chain, chain C is the peptide
- Specifying receptor degrees-of-freedom
  - modify `flex_res.txt` to add/remove residues that are allowed to be flexible during the SMINA minimization step
- `dunbrack.bin` and `loco.score` are files required for the RCD step
- `align.py` is required for using PyMOL for alignment of PDB files
- `minimize.py` is needed for further minimization using OpenMM
- Each round of APE-Gen is saved within a folder with the index of the round (counting from 0)
- `full_system_confs/` contains the ensemble of conformations of peptide and MHC after energy refinement and filtering
  - each conformation is named by the index of the loop as generated by RCD
  - not every conformation makes it past the energy refinement and filtering step
- `peptide_confs.pdb` contains only the peptide conformations
- `filtered_energies.npz` is a numpy file that contains the energies of the ensemble according to the SMINA scoring function
  - it contains two arrays of the same size: `filtered_indices` which contains indices of each conformation and `filtered_energies` which contains the corresponding energies
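
As an illustration of how `filtered_energies.npz` might be read, the sketch below loads the two arrays described above and ranks the conformations by energy. The function name `rank_conformations` is hypothetical, and the code assumes that the indices are integer-valued and that a lower energy means a better-scoring conformation.

```python
import numpy as np


def rank_conformations(npz_path):
    """Return (index, energy) pairs sorted from lowest to highest energy.

    Assumes the file holds the two equally sized arrays named in the README:
    'filtered_indices' and 'filtered_energies'.
    """
    with np.load(npz_path) as data:
        indices = data["filtered_indices"]
        energies = data["filtered_energies"]
    order = np.argsort(energies)  # ascending: lowest energy first
    return [(int(indices[i]), float(energies[i])) for i in order]
```

Called as `rank_conformations("0/filtered_energies.npz")` for the first round, the first entry of the returned list gives the loop index of the lowest-energy conformation, which can then be looked up in `full_system_confs/`.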
- `get_pMHC_pdb.py`
  - usage: `python get_pMHC_pdb.py <pdb code>`
  - assumes pdb code is of a peptide-MHC structure
  - adds missing atoms/residues, removes all waters and ions, labels chains as A,B,C where chain C is the peptide
- `mutate.py`
  - usage: `~/pymol/bin/pymol -qc mutate.py <pdb> <selection> <new_residue in 3-letter code> <name of new pdb file>`
  - example: `~/pymol/bin/pymol -qc mutate.py 0.pdb C/1/ ALA 0_mutated.pdb`
usage: APE_Gen.py [-h] [-n NUM_CORES] [-l NUM_LOOPS] [-t RCD_DIST_TOL] [-r]
                  [-d] [-p] [-a ANCHOR_TOL] [-o] [-g NUM_ROUNDS]
                  [-b {receptor_only,pep_and_recept}] [-s] [--use_gpu]
                  [--no_progress] [--clean_rcd]
                  peptide_input receptor_class

Anchored Peptide-MHC Ensemble Generator

positional arguments:
  peptide_input         Sequence of peptide to dock or pdbfile of crystal
                        structure
  receptor_class        Class descriptor of MHC receptor. Use REDOCK along
                        with crystal input to perform redocking. Or pass a PDB
                        file with receptor

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_CORES, --num_cores NUM_CORES
                        Number of cores to use for RCD and smina computations.
                        (default: 8)
  -l NUM_LOOPS, --num_loops NUM_LOOPS
                        Number of loops to generate with RCD. (Note that the
                        final number of sampled conformations may be less due
                        to steric clashes. (default: 100)
  -t RCD_DIST_TOL, --RCD_dist_tol RCD_DIST_TOL
                        RCD tolerance (in angstroms) of inner residues when
                        performing IK (default: 1.0)
  -r, --rigid_receptor  Disable sampling of receptor degrees of freedom
                        specified in flex_res.txt (default: False)
  -d, --debug           Print extra information for debugging (default: False)
  -p, --save_only_pep_confs
                        Disable saving full conformations (peptide and MHC)
                        (default: False)
  -a ANCHOR_TOL, --anchor_tol ANCHOR_TOL
                        Anchor tolerance (in angstroms) of first and last
                        backbone atoms of peptide when filtering (default:
                        2.0)
  -o, --score_with_openmm
                        Rescore full conformations with openmm (AMBER)
                        (default: False)
  -g NUM_ROUNDS, --num_rounds NUM_ROUNDS
                        Number of rounds to perform. (default: 1)
  -b {receptor_only,pep_and_recept}, --pass_type {receptor_only,pep_and_recept}
                        When using multiple rounds, pass best scoring
                        conformation across different rounds (choose either
                        'receptor_only' or 'pep_and_recept') (default:
                        receptor_only)
  -s, --min_with_smina  Minimize with SMINA instead of the default Vinardo
                        (default: False)
  --use_gpu             Use GPU for OpenMM Minimization step (default: False)
  --no_progress         Do not print progress bar (default: False)
  --clean_rcd           Remove RCD folder at the end of each round (default:
                        False)
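
When APE-Gen is driven from another Python script, the options listed in the help text can be assembled into an argument list. The sketch below is one possible way to do that: `build_apegen_command` is a hypothetical helper, and it uses only flags that appear in the help output (`-n`, `-l`, `-g`, `-b`, `-r`). The location of `APE_Gen.py` is an assumption and should be adjusted to the local checkout.

```python
import sys


def build_apegen_command(peptide, receptor_class, num_cores=8, num_loops=100,
                         num_rounds=1, pass_type="receptor_only",
                         rigid_receptor=False, script="APE_Gen.py"):
    """Assemble an argument list for APE_Gen.py using flags from its help text."""
    if pass_type not in ("receptor_only", "pep_and_recept"):
        raise ValueError("pass_type must be 'receptor_only' or 'pep_and_recept'")
    cmd = [sys.executable, script, peptide, receptor_class,
           "-n", str(num_cores),
           "-l", str(num_loops),
           "-g", str(num_rounds),
           "-b", pass_type]
    if rigid_receptor:
        cmd.append("-r")  # disable sampling of residues listed in flex_res.txt
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) to actually launch APE-Gen.
    print(build_apegen_command("QFKDNVILL", "HLA-A*24:02", num_rounds=2))
```

Passing the command as a list rather than a single shell string means the `*` in allotype names such as `HLA-A*24:02` is never subject to shell globbing.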
- install miniconda
- using conda, install the following
  - `conda install -c bioconda smina`
  - `conda install -c omnia pdbfixer`
  - `conda install -c conda-forge mdtraj`
  - `conda install -c schrodinger pymol`
  - `conda install -c bioconda autodock-vina`
  - `conda install -c omnia -c conda-forge openmm`
- install RCD (v1.40)
  - http://chaconlab.org/modeling/rcd/rcd-download
  - make sure RCD is added to path so that `rcd` is a command in the terminal
  - intel mkl may be needed (`conda install -c intel mkl`) and added to library path
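
A quick way to confirm that the command-line tools ended up on the path is sketched below. `missing_tools` is a hypothetical helper built on the standard-library `shutil.which`. The executable name `rcd` comes from the note above; `smina` is assumed to be the name installed by the bioconda package.

```python
import shutil


def missing_tools(names=("rcd", "smina")):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required executables were found.")
```

This only checks that the executables can be located. It does not verify versions (for example RCD v1.40) or that shared libraries such as Intel MKL can be loaded.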
usage: model_receptor.py [-h] [-n NUM_MODELS] alpha_chain_seq template

Homology Modeling of HLAs using Modeller

positional arguments:
  alpha_chain_seq       Fasta file (.fasta) containing the sequence of the
                        alpha chain of HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, the program will
                        try to download the sequence from EMBL-EBI.
  template              PDB of the template HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, a template based
                        on the allele's supertype (as defined in
                        supertype_templates.csv) will be chosen.

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_MODELS, --num_models NUM_MODELS
                        Number of models to sample with Modeller (default: 10)
- Requires Modeller, Biopython, and BeautifulSoup4
  - `conda install -c salilab modeller`
  - `conda install -c conda-forge biopython`
  - `conda install -c anaconda beautifulsoup4`
  - `conda install -c anaconda requests`
- Model with the best DOPE score is found in `best_model.pdb`
- Example: `python model_receptor.py P01892.fasta 3I6L.pdb`
  - Requires license key
  - models HLA-A*02:01 using 3I6L as a template
  - 3I6L contains a model of HLA-A*24:02
- Even simpler example: `python model_receptor.py HLA-A*02:01 HLA-A*02:01`
  - First argument says that the sequence of `HLA-A*02:01` will be downloaded
  - Second argument says that a template PDB will be downloaded based on a representative allele from the same supertype classification
- Open a Terminal
- Pull image from Docker hub: `docker pull jayab867/apegen:v2.0`
- Go to the directory where you would like the APE-Gen results to be saved
- Create a container that links the current working directory to a directory in the container called `/data`: `docker run -it --rm -v $(pwd):/data --workdir "/data" jayab867/apegen:v2.0`
- Run APE-Gen: `python /APE-Gen/APE_Gen.py QFKDNVILL HLA-A*24:02`
- Exit the container with Ctrl-D
- There will be a number of folders which contain the results for each round of APE-Gen
  - Default is one round: so a single folder in this example called `0/`
  - In each folder:
    - The best scoring conformation for each round is called `min_energy_system.pdb`
    - The whole ensemble generated is located in `full_system_confs/`
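
Once the run has finished, the per-round results described above can be collected with a few lines of Python. The sketch below is an illustration only: `best_conformations` is a hypothetical helper that assumes the round folders are named by their integer index (`0/`, `1/`, ...) and that each contains `min_energy_system.pdb`.

```python
from pathlib import Path


def best_conformations(results_dir="."):
    """Map round index -> path to min_energy_system.pdb for each round folder."""
    best = {}
    for entry in Path(results_dir).iterdir():
        # Round folders are named by their index, counting from 0.
        if entry.is_dir() and entry.name.isdigit():
            pdb = entry / "min_energy_system.pdb"
            if pdb.is_file():
                best[int(entry.name)] = pdb
    return dict(sorted(best.items()))


if __name__ == "__main__":
    for round_index, pdb_path in best_conformations(".").items():
        print(round_index, pdb_path)
```

Folders with non-numeric names, or round folders without a `min_energy_system.pdb` file, are skipped silently.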