This repository contains programs and data for running, calibrating and testing the structure and stability constrained protein evolution (SSCPE) substitution models computed by the program Prot_evol (https://github.com/ugobas/Prot_evol).
- SSCPE.zip : Program SSCPE.pl for computing phylogenetic trees using Structure and stability constrained substitution models of protein evolution (SSCPE, Lorca I, Arenas M and Bastolla U. 2022. Structure and stability constrained substitution models outperform traditional substitution models used for evolutionary inference. Submitted) and the RAxML-NG program (Kozlov AM, Darriba D, Flouri T, Morel B, Stamatakis A. 2019. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35: 4453-4455).
  To install it, run the following commands. Unzipping also extracts the README, where you can find more information.

      unzip SSCPE.zip
      chmod u+x script_install_SSCPE.sh
      ./script_install_SSCPE.sh

  The script creates the folder BIN and installs there the RAxML-NG executable by Kozlov et al. (you can also download it from https://github.com/amkozlov/raxml-ng). It also compiles and installs the Prot_evol program (https://github.com/ugobas/Prot_evol) and the tnm program (https://github.com/ugobas/tnm).
- Alignments.zip : Multiple sequence alignments, each including one PDB structure that gives its name to the alignment. Courtesy of Julian Echave, Universidad Nacional de San Martin, Argentina. Input files to Prot_evol, needed for computing the SSCPE models.
- RMSD.zip : RMSD predicted for all point mutations of the wild-type protein in the PDB, computed by the program tnm (https://github.com/ugobas/tnm) with input parameter PRED_MUT=1. Input files to Prot_evol, needed for computing the SSCPE models.
- DE.zip : Harmonic energy barrier predicted for all point mutations of the wild-type protein in the PDB, computed by the program tnm (https://github.com/ugobas/tnm) with input parameter PRED_MUT=1. Input files to Prot_evol, needed for computing the SSCPE models.
- Rates.zip : Site-specific evolutionary rate, sequence entropy and hydrophobicity predicted by Prot_evol for every site of the tested proteins, and number of contacts computed from the PDB structure.
- Summary.zip : Properties of the different SSCPE models averaged over sites for all test proteins: log-likelihood of the MSA, sequence entropy, Kullback-Leibler divergences, average hydrophobicity, and global folding free energy DeltaG.
- Prot_evol_models.zip : 10 site-specific SSCPE substitution models used for testing the tree likelihood with the program RAxML-NG (Kozlov AM, Darriba D, Flouri T, Morel B, Stamatakis A. 2019. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35: 4453-4455).
- Input_Prot_evol.in : Sample input file to the program Prot_evol.
- Input_TNM.in : Sample input file to the program tnm.
- Mutation_para.in : Mutation weights used by the tnm program for predicting the RMSD and DE. Pass it to tnm by setting the line MUT_PARA=Mutation_para.in in the tnm input file.

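Before running tnm to regenerate the RMSD and DE predictions, it can help to confirm that the input file carries the two settings mentioned above. The following is a minimal sketch, not part of this repository: the function name check_tnm_input is hypothetical, and it assumes that each setting sits on its own line in KEY=VALUE form, which has not been checked against the tnm input parser.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SSCPE or tnm).
# Succeeds only if the given tnm input file contains both settings named in
# this README. Assumption: one KEY=VALUE setting per line, optionally
# preceded by whitespace and followed by whitespace or end of line.
check_tnm_input() {
    input_file=$1
    for setting in "PRED_MUT=1" "MUT_PARA=Mutation_para.in"; do
        # The setting is used as a regular expression, so the "." in the
        # file name matches any character; acceptable for a quick check.
        if ! grep -Eq "^[[:space:]]*${setting}([[:space:]]|\$)" "$input_file"; then
            echo "missing setting: $setting" >&2
            return 1
        fi
    done
    return 0
}

# Usage (after defining the function):
#   check_tnm_input Input_TNM.in && echo "tnm input looks complete"
```

The check is deliberately strict about the value of PRED_MUT, so a line such as PRED_MUT=10 is reported as missing rather than accepted.
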
The PDB files can be downloaded from the PDB site https://www.rcsb.org/
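
A PDB file can also be fetched from the command line. The sketch below assumes that curl is installed and that the RCSB file download service serves entries at https://files.rcsb.org/download/<PDB_ID>.pdb. The function names pdb_url and fetch_pdb are hypothetical and are not part of this repository.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with SSCPE).

# Print the assumed RCSB download URL for a 4-character PDB identifier.
# The identifier is upper-cased before being placed in the URL.
pdb_url() {
    id=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf 'https://files.rcsb.org/download/%s.pdb\n' "$id"
}

# Download one PDB entry into the current directory as <PDB_ID>.pdb.
# curl flags: -f fail on HTTP errors, -sS quiet but report errors,
# -L follow redirects, -o write to the named file.
fetch_pdb() {
    curl -fsSL -o "$1.pdb" "$(pdb_url "$1")"
}

# Usage (after defining the functions), with <PDB_ID> replaced by the
# identifier that gives its name to the alignment:
#   fetch_pdb <PDB_ID>
```

Keeping the URL construction in its own function makes it easy to check or change the address without touching the download step.
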