TransMembrane (TM) topology prediction from amino acid sequence only
Users may submit their sequences to the PureseqTM Server at: http://pureseqtm.predmp.com
Users may find the training and test data of PureseqTM at: https://github.com/PureseqTM/PureseqTM_Dataset
https://www.biorxiv.org/content/10.1101/627307v2.abstract
git clone https://github.com/PureseqTM/PureseqTM_Package.git
cd ./PureseqTM_Package
cd source_code
make
cd ../
./PureseqTM.sh -i example/4j7cK.fasta -m 0
./PureseqTM.sh -i example/4j7cK.fasta
./PureseqTM.sh -i example/4j7cK.fasta -l example/4j7cK.top
If the ground-truth label is provided, run the commands below to evaluate the prediction accuracy:
grep -v "#" 4j7cK_PureTM/4j7cK.prob | awk '{print $NF" "$3}' > 4j7cK.pred_reso
util/TM2_Evaluation 4j7cK.pred_reso 0.5
rm -f 4j7cK.pred_reso
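The field selection done by the `grep`/`awk` pipeline above can be sketched in Python as follows. This is only an illustration of which fields are picked (the last field, then the third field, of every line that does not contain `#`). The function name `select_prob_fields` and the sample text are hypothetical, and the meaning of each column in the `.prob` file is not assumed here.

```python
def select_prob_fields(prob_text):
    """Mimic: grep -v "#" FILE | awk '{print $NF" "$3}'."""
    rows = []
    for line in prob_text.splitlines():
        if "#" in line:
            # grep -v "#" drops every line that contains '#' anywhere
            continue
        fields = line.split()  # awk splits on runs of whitespace by default
        if not fields:
            # awk would still print a near-empty line here; this sketch skips it
            continue
        third = fields[2] if len(fields) >= 3 else ""
        rows.append((fields[-1], third))
    return rows


# Hypothetical content, NOT the real layout of a .prob file
example = """#-> header line
   1 M 0 0.950 0.050
   2 K 1 0.100 0.900
"""

for last, third in select_prob_fields(example):
    print(last, third)
```

The resulting pairs correspond to the lines written to `4j7cK.pred_reso`, which `util/TM2_Evaluation` then reads together with the `0.5` argument.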
./PureseqTM.sh -i example/4j7cK.fasta -P 1
Note that gnuplot must be installed on the local system.
./PureseqTM.sh -i example/1bhaA.tgt
Note that to generate the evolutionary profile, users are advised to use the TGT_Package to produce the TGT file:
./A3M_TGT_Gen.sh -i <input_fasta> -d uniprot20_2016_02 -h hhsuite2 -n 3 -E 10
For more details on the evolutionary profile, please go to https://github.com/realbigws/TGT_Package
Because processing the whole human proteome takes about 2 hours, a toy example is provided below:
./PureseqTM_proteome.sh -i example/test_proteome.fasta
Convert PDBTM 9-state labels to 0/1 TransMembrane (TM) labels:
python util/pdbtm2binary.py example/1bhaA.pdbtm
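The idea of collapsing a multi-state label string into a 0/1 string can be sketched as below. This is not the logic of `util/pdbtm2binary.py`: the function name `to_binary_labels` is hypothetical, and the set of states treated as transmembrane (here only `H`) is an assumption made for illustration, since the actual mapping is not stated in this README.

```python
# ASSUMPTION: which PDBTM states count as transmembrane is not given here;
# 'H' is used purely as a placeholder and can be overridden by the caller.
ASSUMED_TM_STATES = frozenset("H")


def to_binary_labels(labels, tm_states=ASSUMED_TM_STATES):
    """Map each per-residue state to '1' (in tm_states) or '0' (otherwise)."""
    return "".join("1" if ch in tm_states else "0" for ch in labels)


# Hypothetical label string, one character per residue
print(to_binary_labels("1111HHHH2222"))  # prints 000011110000
```

Passing a different `tm_states` set changes which residues are marked `1`, so the sketch can be adapted once the intended mapping is known.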
By default, there are 5 to 7 output files in XXX_PureTM, where XXX is the input name:
File name | Description | Option |
---|---|---|
XXX.fasta_raw | original input sequence file in FASTA format. | |
XXX.fasta | canonical sequence file without non-canonical characters. | |
XXX.top | simple 2-state TransMembrane (TM) prediction in FASTA format. | |
XXX.prob | detailed 2-state TransMembrane (TM) probability prediction. | |
XXX.pred_mode | prediction mode (1 for sequence and 0 for profile). | |
XXX.png | posterior probabilities plotted by GNUPLOT | if option -P 1 is set |
XXX.gff | segment-level output | if option -P 1 is set |
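For downstream use, a 2-state prediction is often easier to handle as a list of segments. The sketch below assumes the 0/1 string has already been extracted from `XXX.top` (the exact line layout of that file is not described here), and the function name `tm_segments` is hypothetical. It is not the code that produces `XXX.gff`.

```python
def tm_segments(binary_labels):
    """Return 1-based inclusive (start, end) pairs for each run of '1'."""
    segments = []
    start = None
    for pos, ch in enumerate(binary_labels, start=1):
        if ch == "1" and start is None:
            start = pos                      # a new segment begins
        elif ch != "1" and start is not None:
            segments.append((start, pos - 1))  # the segment ended at the previous residue
            start = None
    if start is not None:
        # the string ended while still inside a segment
        segments.append((start, len(binary_labels)))
    return segments


# Hypothetical prediction string
print(tm_segments("0001111000110"))  # prints [(4, 7), (11, 12)]
```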
Title:
A Combined Transmembrane Topology and Signal Peptide Prediction Method
Authors:
Lukas Kall, Anders Krogh and Erik L. L. Sonnhammer
Journal:
J. Mol. Biol. (2004) 338, 1027–1036
Title:
Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks
Authors:
Sheila M. Reynolds, Lukas Kall, Michael E. Riffle, Jeff A. Bilmes, William Stafford Noble
Journal:
PLoS computational biology (2008) 4(11):e1000213
Title:
Protein secondary structure prediction using deep convolutional neural fields
Authors:
Sheng Wang, Jian Peng, Jianzhu Ma, Jinbo Xu
Journal:
Scientific reports (2016) 6:18962