Skip to content
Stand-alone software package of PureseqTM for transmembrane topology prediction from amino acid sequence only.
Roff Other
  1. Roff 98.8%
  2. Other 1.2%
Branch: master
Clone or download
realbigws
realbigws PureseqTM.sh mod
Latest commit 4bb463f Aug 9, 2019
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
bin first commit Apr 26, 2019
example README.md mod May 7, 2019
param add param/model_fast.2 May 6, 2019
source_code first commit Apr 26, 2019
util first commit Apr 26, 2019
PureseqTM.sh PureseqTM.sh mod Aug 9, 2019
PureseqTM_proteome.sh first commit Apr 26, 2019
README.md README.md mod May 7, 2019

README.md

PureseqTM

TransMembrane (TM) topology prediction from amino acid sequence only

Server

Users may submit their sequences to the PureseqTM Server at: http://pureseqtm.predmp.com

Train and test data

Users may find the train, test data of PureseqTM at: https://github.com/PureseqTM/PureseqTM_Dataset

Paper

https://www.biorxiv.org/content/10.1101/627307v2.abstract

Contact email

pureseqtm@gmail.com

Install

Download the PureseqTM package

git clone https://github.com/PureseqTM/PureseqTM_Package.git
cd ./PureseqTM_Package

Compilation if necessary

cd source_code
    make
cd ../

Examples

single protein prediction

prediction with fast mode (i.e., NOT use DBN features from Philius)

./PureseqTM.sh -i example/4j7cK.fasta -m 0

prediction without ground-truth label

./PureseqTM.sh -i example/4j7cK.fasta

given ground-truth 2-state TM label

./PureseqTM.sh -i example/4j7cK.fasta -l example/4j7cK.top

evaluate the prediction result

If the ground-truth label is provided, run below commands to evaluate the prediction accuracy:

grep -v "#" 4j7cK_PureTM/4j7cK.prob | awk '{print $NF" "$3}' > 4j7cK.pred_reso
util/TM2_Evaluation 4j7cK.pred_reso 0.5
rm -f 4j7cK.pred_reso

plot the posterior probabilities

./PureseqTM.sh -i example/4j7cK.fasta -P 1

Note that gnuplot is required to be installed in the local system.

prediction with given profile

./PureseqTM.sh -i example/1bhaA.tgt

Note that to generate the evolutionary profile, users are suggested to use the TGT_Package to generate the TGT file:

./A3M_TGT_Gen.sh -i <input_fasta> -d uniprot20_2016_02 -h hhsuite2 -n 3 -E 10

For more details on evolutionary profile, please goto https://github.com/realbigws/TGT_Package

proteome prediction

As it takes about 2 hours for the whole Human proteome, below please find a toy example:

./PureseqTM_proteome.sh -i example/test_proteome.fasta

transfer PDBTM label

Transfer PDBTM 9-state label to 0/1 TransMembrane (TM) label:

python util/pdbtm2binary.py example/1bhaA.pdbtm

Output files

There shall be 5 to 7 output files in XXX_PureTM by default, where XXX is the input name:

File name Description Option
XXX.fasta_raw original input sequence file in FASTA format.
XXX.fasta canonical sequence file without non-canonical characters.
XXX.top simple 2-state TransMembrane (TM) prediction in FASTA format.
XXX.prob detailed 2-state TransMembrane (TM) probability prediction.
XXX.pred_mode prediction mode (1 for sequence and 0 for profile).
XXX.png posterior probabilities plotted by GNUPLOT if option -P 1 is set
XXX.gff segment-level output if option -P 1 is set

References

Phobius

Title:
     A Combined Transmembrane Topology and Signal Peptide Prediction Method
Authors:
     Lukas Kall, Anders Krogh and Erik L. L. Sonnhammer
Journal:
     J. Mol. Biol. (2004) 338, 1027–1036

Philius

Title:
     Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks
Authors:
     Sheila M. Reynolds, Lukas Kall, Michael E. Riffle, Jeff A. Bilmes, William Stafford Noble
Journal:
     PLoS computational biology (2008) 4(11):e1000213

DeepCNF

Title:
     Protein secondary structure prediction using deep convolutional neural fields
Authors:
     Sheng Wang, Jian Peng, Jianzhu Ma, Jinbo Xu
Journal:
     Scientific reports (2016) 6:18962
You can’t perform that action at this time.