ExaFold runs fast GPU-based protein folding simulations starting from only a protein sequence and a set of restraints designed to fold the protein.
Pre-install
If you do not have a suitable Python 3 installation, we recommend you use an Anaconda Python distribution.
If you are installing on a cluster or HPC system, you will likely need to install miniconda into a
non-standard directory to avoid permissions issues. For example, on OLCF resources such as Summit,
a good place to install software is /ccs/proj/<projID>/miniconda3.
Here is a lightweight version that will get you started:
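# Use this if you are on PPC64LE
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-ppc64le.sh
# Use this if you are on x86_64
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Run the installer you downloaded, then remove it
# - on PPC64LE, substitute the ppc64le filename in the next two commands
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh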
# If you choose to keep your bashrc clean, add the conda/bin to PATH
# - change the "pwd" here according to your miniconda install path
export PATH="$(pwd)/miniconda3/bin:$PATH"
# Dependencies for the installation itself
conda install pyyaml
# Install OpenMM before ExaFold
# If you are on a plain-vanilla x86 machine
conda install -c omnia openmm
# If you have different hardware, you will need to match
# your CUDA version and hardware to a build of OpenMM,
# or build it yourself in the worst case.
# -> this one is for OLCF Summit w/ default cuda module
conda install openmm -c omnia-dev/label/cuda101
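After installing OpenMM, it can help to confirm which compute platforms it actually reports before moving on to ExaFold. The following is a minimal sketch and not part of ExaFold: the helper name `available_openmm_platforms` is made up for illustration, and it assumes OpenMM is importable either as `openmm` (newer releases) or as `simtk.openmm` (the package name used by 7.3-era builds). It returns an empty list when OpenMM cannot be imported.

```python
import importlib


def available_openmm_platforms():
    """Return the platform names OpenMM reports, or [] if OpenMM is missing.

    Hypothetical helper for illustration; not part of ExaFold.
    """
    mm = None
    # Newer OpenMM builds import as `openmm`; 7.3-era builds as `simtk.openmm`.
    for module_name in ("openmm", "simtk.openmm"):
        try:
            mm = importlib.import_module(module_name)
            break
        except ImportError:
            continue
    if mm is None:
        return []
    count = mm.Platform.getNumPlatforms()
    return [mm.Platform.getPlatform(i).getName() for i in range(count)]


if __name__ == "__main__":
    names = available_openmm_platforms()
    if not names:
        print("OpenMM could not be imported in this environment")
    else:
        print("OpenMM platforms:", ", ".join(names))
        if "CUDA" not in names:
            print("CUDA platform not found; check that the CUDA module matches the OpenMM build")
```

If `CUDA` is missing from the list on a GPU node, the most likely cause is a mismatch between the loaded CUDA version and the OpenMM build.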
Install:
If you are on OLCF Summit (or any other PPC64LE machine), the install takes a non-standard route. A custom fork of
MDTraj is installed from source, side by side with your clone of this repository. This fork provides the
necessary MDTraj functionality without any architecture-specific components (so, essentially, you cannot use any
of the geometric calculations). OpenMM is also pulled from a specific target that hard-codes a build against
CUDA 10.1; if you need a different PPC64LE build of OpenMM, you will have to modify setup.py accordingly.
git clone https://github.com/jkwoods/exafold
cd exafold
git checkout master
# choose develop instead of install if you want to
# make and test source changes
python setup.py install   # or: python setup.py develop
Useful for OpenMM on GPU
These need to go into your job script to make GPUs accessible to OpenMM:
module load <your cuda module name>
# On summit `module load cuda/10.1.168`
export OPENMM_CUDA_COMPILER="$(which nvcc)"
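A job can fail late if the variable above is missing or points to the wrong place, so a quick pre-flight check at the start of a run script may be useful. This is only a sketch under stated assumptions: `check_cuda_compiler` is a hypothetical helper, not an ExaFold function, and it checks nothing beyond whether `OPENMM_CUDA_COMPILER` names an existing executable file.

```python
import os
import shutil


def check_cuda_compiler(env=None):
    """Return (ok, message) describing whether OPENMM_CUDA_COMPILER looks usable.

    Hypothetical helper; `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    compiler = env.get("OPENMM_CUDA_COMPILER")
    if not compiler:
        # Variable not exported: report whether nvcc is at least discoverable.
        found = shutil.which("nvcc", path=env.get("PATH"))
        if found:
            return False, f"OPENMM_CUDA_COMPILER is unset; nvcc is on PATH at {found}"
        return False, "OPENMM_CUDA_COMPILER is unset and nvcc was not found; load a CUDA module"
    if not (os.path.isfile(compiler) and os.access(compiler, os.X_OK)):
        return False, f"OPENMM_CUDA_COMPILER is not an executable file: {compiler}"
    return True, f"OPENMM_CUDA_COMPILER = {compiler}"


if __name__ == "__main__":
    ok, message = check_cuda_compiler()
    print(("OK: " if ok else "WARNING: ") + message)
```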
Requirements:
Python 3
Packages:
- parse
- OpenMM 7.3+
- MDTraj 1.9.3+
To Run:
- Run a test:
# from ExaFold top directory
cd tests
python run-test-omm_walker.py
- (incomplete) Edit my_parameters.yaml with your parameters in a working directory you set up (explanation below)
  - Run from structure + restraint list
  - (incomplete) Run from protein sequence
Input to the Program:
- From structure + restraint list: my_parameters.yaml with paths to the structure and restraint files
- From protein sequence: my_parameters.yaml with a protein sequence or FASTA file
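The two input modes above can be told apart by what the parameter file contains. The sketch below illustrates that decision on an already-parsed mapping (for example, the result of `yaml.safe_load` on my_parameters.yaml). Every key name here (`structure`, `restraints`, `sequence`, `fasta`) is a hypothetical placeholder rather than ExaFold's actual schema, and rejecting files that specify both modes is an assumption made for the example.

```python
def select_input_mode(params):
    """Decide which run mode a parsed parameter mapping describes.

    All key names are hypothetical placeholders, not ExaFold's real schema.
    """
    # Mode 1 needs both a structure path and a restraint file path.
    has_structure = bool(params.get("structure")) and bool(params.get("restraints"))
    # Mode 2 needs either an inline sequence or a path to a FASTA file.
    has_sequence = bool(params.get("sequence")) or bool(params.get("fasta"))

    if has_structure and has_sequence:
        raise ValueError("give either structure + restraints or a sequence/FASTA file, not both")
    if has_structure:
        return "structure+restraints"
    if has_sequence:
        return "sequence"
    raise ValueError("parameters must provide structure + restraints or a sequence/FASTA file")


if __name__ == "__main__":
    example = {"structure": "protein.pdb", "restraints": "restraints.txt"}
    print(select_input_mode(example))
```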
Example API calls
user configuration
exafold.mdsystem.ommsystem
exafold.restraints.reader
exafold.restraints.definitions
Notes
- See previous repos and migrate relevant stuff
The code is currently under development. A full version will:
- [partial] take a protein sequence for input via a config file
- [partial] create a linear protein structure for this sequence
- [outside] calculate distances used as restraints to help the protein fold correctly
- [partial] calculate distances from secondary structure prediction
- [v1 done] read a set of distance restraints
- [v1 done] build a simulation system and apply restraints
- [nostart] run a swarm of protein folding walkers
- [nostart] prune and move walkers to more effectively fold the protein
Short-term Roadmap
v0.1 restraints are read and applied to an OpenMM system
v0.2 hook the sequence to PDB/restraints upstream
v0.x1 launches HPC job with swarm of folding walkers
v0.x2 prune and move walkers
v0.3 updates to config for more control of workflow