bibip-impmc/mypmfs

MyPMFs

Postic G., Hamelryck T., Chomilier J., Stratmann D.

Generate statistical potentials from a user-defined list of protein structures

INSTALL

Type 'make' in the terminal. This will create executable binaries named 'scoring' and 'training'.

GET HELP

Run either program without arguments (or with the -h option).

EXAMPLES

Case#1

$ ./training -l example/list1.txt -d example/dataset/ -o myPotentials

This will create one statistical potential per residue pair, with each residue represented by its alpha carbon (n=210; *.nrg files).
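The figure of 210 is consistent with counting unordered pairs of the 20 standard amino acids, self-pairs included (20 × 21 / 2). The short Python check below illustrates that count. Reading n=210 this way is an assumption, and the residue list is written out here for illustration rather than taken from the MyPMFs source.

```python
from itertools import combinations_with_replacement

# The 20 standard amino acids, three-letter codes.
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]

# Unordered pairs with self-pairs: ALA-ALA counts once, ALA-ARG equals ARG-ALA.
residue_pairs = list(combinations_with_replacement(AMINO_ACIDS, 2))

print(len(residue_pairs))  # 210
```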

The 'myPotentials/' output directory will also contain 3 Tab-Separated Values (.tsv) files with some statistics about the training dataset:

  • the atomic pairs ranked by their lowest energy peaks (top_energies.tsv);
  • the atomic pairs ranked by their frequencies (top_occurrences.tsv);
  • the 100 shortest distances (top_distances.tsv).

Note: The same results can be obtained with the following command:

$ ./training -L example/list3.txt -d example/dataset/ -o myPotentials

Unlike -l, the -L argument does not require a list of native protein structures (i.e. a list of PDB codes), so a set of decoys with arbitrary filenames can be used as input.

$ ./scoring -i example/dataset/1BKR.pdb -d myPotentials/

This will calculate the pseudo-energy of the structure 1BKR by using the previously computed potentials.
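As a rough illustration of what scoring with pairwise potentials involves, the Python sketch below sums a binned pair energy over every residue pair of a toy structure. It is not the MyPMFs implementation: the function name, the data layout, the 1 Å bin width, and all the numbers are hypothetical, and details such as sequence-separation filters are left out.

```python
import math
from itertools import combinations


def pseudo_energy(residues, potentials, bin_width=1.0):
    """Sum binned pair energies over all residue pairs of one structure."""
    total = 0.0
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(residues, 2):
        # ALA-GLY and GLY-ALA share one potential, so sort the key.
        key = tuple(sorted((name_a, name_b)))
        curve = potentials.get(key)
        if curve is None:
            continue  # no potential available for this pair
        bin_index = int(math.dist(xyz_a, xyz_b) / bin_width)
        if bin_index < len(curve):  # pairs beyond the last bin add nothing
            total += curve[bin_index]
    return total


# Toy input (hypothetical): residue name plus one representative atom position.
residues = [
    ("ALA", (0.0, 0.0, 0.0)),
    ("GLY", (3.0, 4.0, 0.0)),
    ("ALA", (0.0, 0.0, 6.5)),
]
# Toy potentials (hypothetical): one energy per 1 A distance bin, 0 to 8 A.
potentials = {
    ("ALA", "GLY"): [0.0, 0.0, 0.0, 0.0, 0.5, -1.0, -0.3, 0.0],
    ("ALA", "ALA"): [0.0, 0.0, 0.0, 0.0, 0.2, -0.5, -0.8, 0.1],
}

total = pseudo_energy(residues, potentials)
print(f"pseudo-energy = {total:.2f}")
```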

Case#2

$ ./training -l example/list1.txt -d example/dataset/ -o myPotentials -r CB -p -g

This will create statistical potentials, with residues represented by their beta carbons (-r CB). Each potential will be plotted as an SVG file (-p). The interatomic squared distances used for the calculations are written into *.dat files (-g).

Note: Any previously created 'myPotentials/parameters.log' file will be overwritten.

$ ./scoring -i example/dataset/1BKR.pdb -d myPotentials/ -c -p -w -o myResults

The pseudo-energy of 1BKR will be calculated with cubic-interpolated potentials (-c). These interpolated potentials will be plotted as SVG files (-p). Two TSV files will be written (-w):

  • the pseudo-energy and distance for each atomic pair (data.tsv);
  • the pseudo-energy for each residue of the protein sequence (energy_[WINDOW_SIZE].tsv).

All these data are written into the 'myResults' directory (-o myResults).
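To give an idea of what cubic interpolation of a binned potential looks like, here is a small Python sketch using a Catmull-Rom spline on a uniform grid. Which cubic scheme the -c option actually applies is not stated in this README, so Catmull-Rom is only one common choice picked for illustration. The function names and the sample energies are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Cubic interpolation between p1 and p2 for t in [0, 1]."""
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t ** 3
    )


def interpolate(values, x, step):
    """Evaluate at x a curve sampled at 0, step, 2*step, ...

    x must lie between 0 and (len(values) - 1) * step.
    """
    pos = x / step
    i = min(int(pos), len(values) - 2)  # left grid point of the segment
    t = pos - i
    # Repeat the end points where a neighbour is missing.
    p0 = values[max(i - 1, 0)]
    p3 = values[min(i + 2, len(values) - 1)]
    return catmull_rom(p0, values[i], values[i + 1], p3, t)


# Hypothetical energies sampled every 1 A.
energies = [2.0, 0.5, -1.0, -0.4, 0.0]
for r in (1.0, 1.5, 2.0, 2.5):
    print(f"r = {r:.1f} A  E = {interpolate(energies, r, 1.0):.3f}")
```

The interpolated curve passes through every sampled value, so scoring at a bin centre gives the same energy with or without interpolation in this sketch.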

Notes:

Case#3

$ ./training -l example/list1.txt -d example/dataset/ -o myPotentials -k e -b SJ-dpi -p

Same training as Case#1, but with Kernel Density Estimation (KDE). Here, we use an Epanechnikov kernel (-k e), and the kernel bandwidth is selected with the Sheather-Jones direct plug-in method (-b SJ-dpi). Each potential will be plotted as an SVG file (-p).
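For readers unfamiliar with KDE, the Python sketch below evaluates a density estimate with the Epanechnikov kernel, K(u) = 0.75 (1 - u²) for |u| ≤ 1 and 0 otherwise. The bandwidth is fixed by hand here; the Sheather-Jones direct plug-in selection is not implemented. The distances and function names are made up for illustration and do not come from MyPMFs.

```python
from typing import Sequence


def epanechnikov(u: float) -> float:
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kde(x: float, samples: Sequence[float], bandwidth: float) -> float:
    """Kernel density estimate at x: (1 / (n * h)) * sum K((x - x_i) / h)."""
    n = len(samples)
    return sum(epanechnikov((x - xi) / bandwidth) for xi in samples) / (n * bandwidth)


# Hypothetical inter-residue distances in angstroms (illustrative only).
distances = [4.1, 4.3, 5.0, 5.2, 5.3, 6.8, 7.1]
h = 0.5  # bandwidth chosen by hand; NOT a Sheather-Jones estimate

for r in (4.0, 5.0, 6.0, 7.0):
    print(f"r = {r:.1f} A  density = {kde(r, distances, h):.4f}")
```

Because the Epanechnikov kernel has finite support, the estimate is exactly zero wherever no sample lies within one bandwidth, as at r = 6.0 in this toy data.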

$ ./scoring -i example/dataset/1BKR.pdb -d myPotentials/ -q 10A,11A,12A,13A,14A,15A,16A,17A,18A,19A,20A -z -s 2000

Only the residues 10A to 20A of 1BKR will be processed (-q). A Z-score will be computed to evaluate the absolute structural quality (-z); the more negative, the better the model. This Z-score will be computed on 2000 random sequence decoys (-s 2000).
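The Python sketch below shows two small things related to this command: building the comma-separated residue selection passed to -q, and the usual form of a Z-score, (E_native - mean(E_decoys)) / std(E_decoys). The exact formula MyPMFs uses, including whether it takes the population or sample standard deviation, is an assumption. The helper names and decoy energies are hypothetical.

```python
from statistics import mean, pstdev


def residue_selection(first, last, chain):
    """Build a selection string such as '10A,11A,...,20A'."""
    return ",".join(f"{i}{chain}" for i in range(first, last + 1))


def z_score(native_energy, decoy_energies):
    """Standard score of the native energy against the decoy distribution."""
    sigma = pstdev(decoy_energies)  # population std; an assumption
    if sigma == 0:
        raise ValueError("decoy energies have zero variance")
    return (native_energy - mean(decoy_energies)) / sigma


print(residue_selection(10, 20, "A"))

# Hypothetical pseudo-energies: one native structure and five sequence decoys.
decoys = [-10.0, -12.0, -8.0, -11.0, -9.0]
print(f"Z-score = {z_score(-16.0, decoys):.3f}")
```

A native energy well below the decoy average gives a strongly negative Z-score, which matches the "more negative is better" reading above.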

Case#4

After any training:

$ ./scoring -l example/list2.txt -d myPotentials/

Multiple inputs: a pseudo-energy will be calculated for each of the 25 structures listed in 'example/list2.txt'. The chain name is provided for 2 structures in this list. By default, all chains found will be processed.

Case#5

$ ./training -l example/list1.txt -d example/dataset/ -o myRefState -r allatom -W

This trains the reference state separately (-W) on all atoms (-r allatom). A 'frequencies.ref' file is created, which can then be used (-R) to train a statistical potential.

$ ./training -l example/list1.txt -d example/dataset/ -o myPotentials -R myRefState/frequencies.ref -r backbone

Thus, the observed frequencies are trained on backbones, while the reference state is trained on all atoms.
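Statistical potentials are commonly written in the inverse Boltzmann form, E(r) = -kT ln(f_obs(r) / f_ref(r)), which is why the observed frequencies and the reference state can come from different atom sets. The Python sketch below applies that relation per distance bin, in units of kT. Whether MyPMFs uses exactly this expression, and how it treats empty bins, is an assumption. The function name and the counts are hypothetical.

```python
import math


def pair_potential(observed_counts, reference_counts):
    """Per-bin energy in kT units: E = -ln(f_obs / f_ref)."""
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            # Undefined without a correction; left as NaN in this sketch.
            energies.append(float("nan"))
            continue
        f_obs = obs / n_obs  # observed frequency in this bin
        f_ref = ref / n_ref  # reference-state frequency in this bin
        energies.append(-math.log(f_obs / f_ref))
    return energies


# Hypothetical counts per distance bin.
observed = [10, 40, 30, 20]   # e.g. trained on backbone atoms
reference = [25, 25, 25, 25]  # e.g. trained on all atoms

for i, e in enumerate(pair_potential(observed, reference)):
    print(f"bin {i}: E = {e:+.3f} kT")
```

Bins observed more often than the reference predicts get a negative (favourable) energy, and bins observed less often get a positive one.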

Contact: guillaume.postic@u-paris.fr
