Skip to content

liulifenyf/TSPTFBS

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

TSPTFBS

Python programs for predicting TFBS for 265 Arabidopsis TFs

Dependencies

The program requires:

  • python 3.6
  • tensorflow2.0.0 module
  • keras2.3.1 module
  • pandas module
  • numpy module
  • scikit-learn module
  • TAIR10 reference genome
  • the bedtools software

Install

git clone git@github.com:liulifenyf/TSPTFBS.git

Tutorial

Predicting

cd TSPTFBS
python Predict.py <input fasta file>

After running the program, a file named 'result.csv' will be generated in the current folder which records the prediction results of the models. We here provide a Test.fa file for an example:

python Predict.py Example/Test.fa

Input File Format

The input file must contain DNA sequences which have a length of 201bp with a FASTA format. A FASTA file of the example is:

>chr2L:12715-12916
CAAAAATAGAAATACCCACCACAGGAGCACGATGTTTTAATTGTATTTCTTTAGCAAGCTGCGCAGAAATTCGGCGGGGCATGTGTGGTGGTGCATTGCCACTTGCCGACGGGACGGCAGTTGCCGCGGTCTGCGCTGGTGGCAAATGCAGAAGGAAAACCGAGACTGTACTGGCATTTGTTGCTGACCACAAAGTTGGCG
>chr2L:59143-59344
TAGACCGCCTGACAAGTTCGGGTGACCATCGAGCGTCTCTGCTTACCGTGCGCTTAAGCGAACCACACGTCCTAATCGAAACAACTATACAGCGCGACTGTGCGGACGAGTGTCTTGAGACTCTGGGCAAGCGCAGCCAGCCAACCAAGTTTCGAAGTCTGGCTTTTGGGCCAAGCTTGGTCTGCGCCACGCTTGGCCCCG

Output File Format

The output file will seem like below: the first column represents the names of 265 Arabidopsis TFs, the remaining columns (The example has two remaining columns because the input file has two enquired DNA sequences) record the probabilities of given DNA sequences to be predicted as a TFBS of one of 265 Arabidopsis TFs.

TF Name	chr2L:12715-12916	chr2L:59143-59344
AT3G10113	5.8268368E-05	0.00023753107
AT3G12130	0.0003848466	0.077134416
AT3G52440	0.6031477	0.42776716
AT3G12730	0.0059271543	0.03132692
AT3G60580	0.05004888	0.012927864
AT3G60490	0.86310613	0.8729982
AT4G00250	0.13277796	0.2112361
AT3G24120	0.011899605	0.052325442
AT3G09600	0.0072161327	0.016331125
...

About

trans-species prediction of TFBS in plants

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages