Protein DNA Simulations
Note: a newer version of the tools is available under tools/awsem_3spn2_HaoWu
AWSEM is a protein force field, originally designed for studying protein structure prediction, folding dynamics, binding interface prediction, folding in membranes, etc. Since the coarse-grained DNA model (3SPN.2) has been made available as part of the LAMMPS MD package, we have built a protein-DNA hybrid simulation platform by implementing both AWSEM and 3SPN.2 in LAMMPS.
This tutorial walks you through building a protein-DNA simulation step by step. There are three steps: (1) build the DNA data file, (2) build the protein data file, and (3) merge them. You can download the tutorial package (fisDNA_example.tar.gz) from the examples page: https://github.com/adavtyan/awsemmd/tree/master/examples.
You will also need Python, pylab, and Ruby installed on your computer in order to complete the tutorial. For the Python-related packages, we recommend installing Anaconda (https://www.continuum.io/downloads; choose the Python 2.7 version), which provides the scientific Python environment required for this tutorial, including numpy and scipy. Ruby needs to be installed separately (https://www.ruby-lang.org/en/downloads/).
To build a 3SPN.2 DNA model, the DNA data file is created using both the 3SPN.2 and X3DNA toolkits. Before we can build the model, we need to download these packages, unpack them, and set up the paths so that your shell can find them. This tutorial was tested on Linux under a Bash shell, whose prompt is shown as $. First of all, let's move to the working folder.
$ cd buildDna/
where x3dna-v2.2.tar.gz and USER-3SPN2.tar.gz are already provided for you. Now we extract (untar) these files using the following commands
$ tar -zxvf x3dna-v2.2.tar.gz
and
$ tar -zxvf USER-3SPN2.tar.gz
which generate the folders x3dna-v2.2 and USER-3SPN2, respectively.
Second, we set up the correct paths for using the x3dna programs. Our default shell here is Bash. You can set up the paths by entering the following two lines on the command line.
$ export X3DNA="YOUR_LOCAL_FOLDER/x3dna-v2.2"
$ export PATH="YOUR_LOCAL_FOLDER/x3dna-v2.2/bin:$PATH"
The first line sets the environment variable X3DNA, which the x3dna programs need in order to run; the second gives you access to the x3dna executables and scripts from the command line. It is highly recommended to add both lines to your .bashrc (or .profile, etc.) so that they become shell defaults; otherwise the settings are lost when you close the session.
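As a quick sanity check of the two settings above, the following minimal Python sketch reports whether X3DNA is set and whether its bin folder appears on PATH. The helper name check_x3dna_env is hypothetical and is not part of x3dna or of the tutorial package; it only inspects environment variables and does not run any x3dna program.

```python
import os


def check_x3dna_env(env=None):
    """Return a list of problems; an empty list means the setup looks fine."""
    env = os.environ if env is None else env
    problems = []

    x3dna_home = env.get("X3DNA", "")
    if not x3dna_home:
        problems.append("X3DNA is not set")
        return problems

    # The tutorial prepends $X3DNA/bin to PATH; here we only require
    # that the folder appears somewhere in PATH.
    expected_bin = os.path.normpath(os.path.join(x3dna_home, "bin"))
    entries = [os.path.normpath(p)
               for p in env.get("PATH", "").split(os.pathsep) if p]
    if expected_bin not in entries:
        problems.append("%s is not on PATH" % expected_bin)
    return problems


if __name__ == "__main__":
    issues = check_x3dna_env()
    print("x3dna environment looks fine" if not issues else "\n".join(issues))
```

Run it in the same shell session in which you typed the two export lines; a new terminal will only pass the check if the lines were also added to .bashrc.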
Once that is done, we can look at the script file genConf.sh, which runs all the x3dna and 3SPN.2 binaries and scripts needed to build a DNA data file. You can build the DNA with an arbitrary sequence specified in dnaSeq.txt.
$ cat dnaSeq.txt
54
AAATTTGTTTGAATTTTGAGCAAATTTAAATTTGTTTGAATTTTGAGCAAATTT
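Judging from the example above, dnaSeq.txt holds the number of bases on the first line and the sequence on the second (54 matches the length of the sequence shown). Assuming that layout, a minimal Python sketch for producing and re-checking the file content could look like the following. The names format_dna_seq and parse_dna_seq are hypothetical helpers, not part of the tutorial package, and the restriction to the letters A, C, G, and T is an assumption.

```python
def format_dna_seq(seq):
    """Build the two-line dnaSeq.txt content: base count, then the sequence."""
    seq = seq.strip().upper()
    bad = sorted(set(seq) - set("ACGT"))  # assumption: only A/C/G/T are valid
    if not seq or bad:
        raise ValueError("empty sequence or unexpected letters: %r" % bad)
    return "%d\n%s\n" % (len(seq), seq)


def parse_dna_seq(text):
    """Read the two-line content back and check that the count matches."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected two non-empty lines, got %d" % len(lines))
    count, seq = int(lines[0]), lines[1]
    if count != len(seq):
        raise ValueError("declared %d bases but found %d" % (count, len(seq)))
    return seq


if __name__ == "__main__":
    # The sequence used in this tutorial.
    tutorial_seq = "AAATTTGTTTGAATTTTGAGCAAATTTAAATTTGTTTGAATTTTGAGCAAATTT"
    print(format_dna_seq(tutorial_seq))
```

Writing the returned string to dnaSeq.txt before running genConf.sh keeps the declared length and the sequence in sync when you change the sequence.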
Make the shell script executable by typing
$ chmod +x genConf.sh
Now let's execute genConf.sh.
$ ./genConf.sh
The average twist in this sequence is 34.516981
Time used: 00:00:00:00
################################################################
Pair coefficients for 3SPN.2 representation of B-DNA
pair_coeff 1 1 3spn2 0.239006 4.500000
pair_coeff 2 2 3spn2 0.239006 6.200000
pair_coeff 3 3 3spn2 0.239006 5.400000
pair_coeff 4 4 3spn2 0.239006 7.100000
pair_coeff 5 5 3spn2 0.239006 4.900000
pair_coeff 6 6 3spn2 0.239006 6.400000
pair_coeff 7 7 3spn2 0.239006 5.400000
pair_coeff 8 8 3spn2 0.239006 7.100000
pair_coeff 9 9 3spn2 0.239006 4.900000
pair_coeff 10 10 3spn2 0.239006 6.400000
pair_coeff 11 11 3spn2 0.239006 5.400000
pair_coeff 12 12 3spn2 0.239006 7.100000
pair_coeff 13 13 3spn2 0.239006 4.900000
pair_coeff 14 14 3spn2 0.239006 6.400000
################################################################
That's it!
Once this message has been printed to the screen, you have a DNA data file, a set of DNA list files, and several intermediate files.
In principle, the DNA data file (bdna_curv_conf.in) and the set of DNA list files (in00_bond.list, in00_angl.list, and in00_dihe.list) are ready, in their present form, for setting up a DNA-only simulation. However, to incorporate the protein into the simulation, we need to tweak these files a bit. The script gen_multi_dna_awsem_v2.py does the job: it reads the old data file and the list files and turns them into the format needed for merging with the protein.
$ python gen_multi_dna_awsem_v2.py bdna_curv_conf.in in00_bond.list in00_angl.list in00_dihe.list 567 1 0 0 0 dna_premerge.data
Note that 567 refers to an offset for indexing DNA atoms in the list files.
The output would be
1. dna_premerge.data
2. new_bond.list
3. new_angl.list
4. new_dihe.list
These files will be used later.
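The role of the offset can be illustrated with a small Python sketch. It assumes, hypothetically, that each list entry starts with a fixed number of atom indices followed by other fields, and that the offset is simply added to every index so that DNA atoms are numbered after whatever precedes them in the merged system. The actual column layout of the list files and the internals of gen_multi_dna_awsem_v2.py are not described in this tutorial, so treat shift_indices as an illustration of index offsetting, not as the script's implementation.

```python
def shift_indices(rows, offset, n_index_cols):
    """Add `offset` to the first `n_index_cols` fields of every row.

    rows: iterable of tuples such as (i, j, parameter, ...)  [hypothetical layout]
    """
    shifted = []
    for row in rows:
        indices = [int(i) + offset for i in row[:n_index_cols]]
        shifted.append(tuple(indices) + tuple(row[n_index_cols:]))
    return shifted


if __name__ == "__main__":
    # Made-up bond entries: two atom indices followed by one parameter.
    bonds = [(1, 2, 3.9), (2, 3, 3.6)]
    for row in shift_indices(bonds, 567, 2):
        print(row)
```

With an offset of 567, an entry that referred to atoms 1 and 2 would refer to atoms 568 and 569 after shifting, while the remaining fields stay unchanged.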
See the AWSEM tutorial (https://github.com/adavtyan/awsemmd/wiki) for how to make the protein data file.
Once you have your protein and DNA data files ready (summarized below), merging them into one data file is relatively easy!
Protein data file: data.fis
DNA data file: dna_premerge.data
$ cd ../merge
$ python merge.py ../buildProtein/data.fis ../buildDna/dna_premerge.data ../buildProtein/fis.seq
Finally, you will get data.merge where protein and DNA information is properly merged into one data file.
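One way to sanity-check the result is to compare the header counts of the three data files. LAMMPS data files declare counts in header lines such as "N atoms" and "N bonds", and the sketch below parses those lines. The helper name read_header_counts is hypothetical, and the expectation that the atom count of data.merge equals the sum of the two inputs is an assumption about merge.py, not something stated in this tutorial.

```python
import re

# Header lines of a LAMMPS data file look like "  567 atoms" or "  20 atom types".
_HEADER = re.compile(
    r"^[ \t]*(\d+)[ \t]+"
    r"(atoms|bonds|angles|dihedrals|impropers|"
    r"atom types|bond types|angle types|dihedral types|improper types)"
    r"[ \t]*(?:#.*)?$",
    re.MULTILINE,
)


def read_header_counts(text):
    """Return a dict such as {'atoms': 567, 'bonds': 12} from data-file text."""
    return {key: int(num) for num, key in _HEADER.findall(text)}


if __name__ == "__main__":
    # Made-up header, only to show the call.
    sample = "LAMMPS data file\n\n  12 atoms\n  3 atom types\n  8 bonds\n"
    print(read_header_counts(sample))
```

To use it on the tutorial files, read data.fis, dna_premerge.data, and data.merge as text, call read_header_counts on each, and compare the "atoms" entries; a mismatch is a hint to inspect the merge step, not proof of an error.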