Protein structure prediction using DCA and XL constraints with structure based models


The script provided herein can be used to generate structure-based models (SBMs) from interaction data obtained from coevolution analysis and chemical cross-linking/MS, and to run protein folding predictions.

The application of these strategies to model building is described in detail in:

R. N. dos Santos, A. J. R. Ferrari, H. C. R. de Jesus, F. C. Gozzo, F. Morcos, L. Martínez, Enhancing protein fold determination by exploring the complementary information of chemical cross-linking and coevolutionary signals. Bioinformatics, 2018

The following input data should be provided:

(1) Primary sequence of target protein.

(2) Predicted type of secondary structure (SS) for each sequence region. Any SS prediction server can be used (e.g. Jpred,
  PSIPRED, PP, etc.). However, SS tags should be represented as follows: "H" for alpha-helices, "E" for beta-strands and "C"
  or "-" for undetermined (coil) structures.
(3) List of highly correlated residue couplings estimated from coevolutionary analysis. Although any available method can be
  used, we highly recommend the Direct-Coupling Analysis (DCA) approach, which is freely available.
(4) List of residue pairs identified by chemical cross-linking/mass spectrometry experiments. The most common chemical 
  linkers are supported: 
        (A) Direct reaction (zero-length) between residue sidechains in the presence of carbodiimides. 
            Possible pairs: DK, EK, DS, ES.
        (B) Disuccinimidyl suberate (DSS). Possible pairs: KK, KS, SS.
        (C) 1,6-Hexanediamine. Possible pairs: EE, ED, DD
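
The linker/residue-pair rules listed above can be checked before building a model. The following is a minimal sketch and not part of the provided script: the names `LINKER_PAIRS` and `compatible_linkers` are hypothetical, and it assumes cross-linked residues are given as 1-based positions in the primary sequence.

```python
# Residue pairs each supported linker can connect, taken from the list above.
# Pairs are treated as order-independent (DK is the same as KD).
LINKER_PAIRS = {
    "zero-length": {"DK", "EK", "DS", "ES"},
    "DSS": {"KK", "KS", "SS"},
    "1,6-hexanediamine": {"EE", "ED", "DD"},
}


def compatible_linkers(sequence, i, j):
    """Return the linkers able to connect residues i and j.

    Assumption: i and j are 1-based positions in `sequence`.
    """
    a, b = sequence[i - 1], sequence[j - 1]
    pair = {a + b, b + a}
    return sorted(name for name, allowed in LINKER_PAIRS.items() if pair & allowed)


if __name__ == "__main__":
    seq = "MKDSE"  # toy sequence, for illustration only
    print(compatible_linkers(seq, 2, 3))  # K-D pair
    print(compatible_linkers(seq, 1, 2))  # M-K pair, no supported linker
```

An empty result means that none of the three supported linkers can explain the reported pair, which usually points to an indexing problem in the cross-link list.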

Data for (1) and (2) should be compiled in a single text file, with the sequence on the first line and the predicted SS on the second line.
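
A quick sanity check of the sequence/SS file can catch formatting problems early. This is a sketch under stated assumptions and not the parser used by the provided script: the function name `parse_sequence_ss` is hypothetical, and it assumes the SS line carries exactly one tag per residue.

```python
VALID_SS = set("HEC-")  # SS tags accepted according to item (2)


def parse_sequence_ss(text):
    """Parse a two-line input: sequence on line 1, SS tags on line 2.

    Assumption: one SS tag per residue, so both lines have equal length.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a sequence line followed by an SS line")
    sequence, ss = lines[0].upper(), lines[1].upper()
    if len(sequence) != len(ss):
        raise ValueError(
            f"length mismatch: {len(sequence)} residues vs {len(ss)} SS tags"
        )
    bad_tags = set(ss) - VALID_SS
    if bad_tags:
        raise ValueError(f"unsupported SS tags: {sorted(bad_tags)}")
    return sequence, ss


if __name__ == "__main__":
    example = "MKDSE\nCHHHC\n"  # toy input, for illustration only
    print(parse_sequence_ss(example))
```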


Finally, an initial 3D model and a topology can be generated using the following command:

    python sequence_SS coevolution_list crosslink_list

Input/output examples are provided in this repository (example_input.tar.gz and example_output.tar.gz). Fold predictions can be carried out in GROMACS/gaussian simulations using the generated .gro and .top files. An example GROMACS parameter file for an annealing protocol based on the proposed methodology is also provided and can be used for any system of study (sbm_calpha_SA_200to1_20ns.mdp).
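
For orientation, the sketch below assembles a typical preprocessing and run invocation for the generated files. It is a sketch under stated assumptions: `model.gro`, `model.top` and the run name `fold` are placeholders, the `gmx` wrapper is assumed (older GROMACS builds expose `grompp` and `mdrun` as separate executables), and SBM setups may need further options that are not shown here.

```python
def gromacs_commands(gro, top, mdp="sbm_calpha_SA_200to1_20ns.mdp", run_name="fold"):
    """Build argument lists for preprocessing (grompp) and running (mdrun).

    Assumption: a GROMACS installation that provides the `gmx` wrapper.
    """
    grompp = ["gmx", "grompp", "-f", mdp, "-c", gro, "-p", top,
              "-o", f"{run_name}.tpr"]
    mdrun = ["gmx", "mdrun", "-deffnm", run_name]
    return [grompp, mdrun]


if __name__ == "__main__":
    # Placeholder file names; substitute the .gro and .top files produced
    # by the model-building step.
    for cmd in gromacs_commands("model.gro", "model.top"):
        print(" ".join(cmd))
```

The argument lists can be passed directly to `subprocess.run(cmd, check=True)` once the file names match the generated output.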

Further assistance can be found at:
