KateK/PredcircRNA

 
 

PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features

PredcircRNA distinguishes circular RNA (circRNA) from other long non-coding RNAs (lncRNAs) using multiple kernel learning. First, we extract discriminative features from several sources for each transcript: graph features, conservation information, sequence composition, ALU and tandem repeats, SNP density, and open reading frames (ORFs). Second, to better integrate features from these different sources, we use a computational approach based on the multiple kernel learning framework to fuse the heterogeneous features.
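The idea of fusing heterogeneous features through kernels can be illustrated with a minimal sketch. This is not the PredcircRNA implementation: the function names, the toy feature values, and the fixed weights are all assumptions made for illustration. In actual multiple kernel learning the weights are learned from training data rather than set by hand.

```python
import math


def rbf_kernel(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - y||^2) over the rows of X."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in X] for x in X]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized Gram matrices.

    Weights must be non-negative and sum to 1, so the result is again
    a valid (positive semi-definite) kernel matrix.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]


# Hypothetical toy data: three transcripts described by two feature sources.
sequence_features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
conservation_features = [[0.5], [0.4], [0.9]]

# One kernel per feature source, then a single fused kernel.
K_seq = rbf_kernel(sequence_features, gamma=1.0)
K_cons = rbf_kernel(conservation_features, gamma=1.0)
K = combine_kernels([K_seq, K_cons], weights=[0.7, 0.3])
```

Building one kernel per feature source keeps each source on its own similarity scale, and the fused matrix `K` can then be passed to any kernel-based classifier.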

Dependencies:

  1. GraphProt: http://www.bioinf.uni-freiburg.de/Software/GraphProt/
  2. SHOGUN: http://www.shogun-toolbox.org/
  3. txCdsPredict: http://hgdownload.cse.ucsc.edu/admin/
  4. Tandem Repeats Finder (trf): http://tandem.bu.edu/trf/trf.download.html

Input BED file format (for example, test_bed):
chr2 69304539 69318051 + gene1
chr7 138593736 138597206 - gene2
chr22 39134591 39137055 - gene3

NOTICE: the last column must contain a unique name (here gene1, gene2, ...) for each transcript.
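Before running the tool, it can help to check that an input file follows this layout. The sketch below is a hypothetical helper, not part of PredcircRNA. It assumes the five whitespace-separated columns are chromosome, start, end, strand, and name, which is inferred from the example above rather than stated by the authors.

```python
def parse_bed_line(line):
    """Parse one input line into (chrom, start, end, strand, name)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    chrom, start, end, strand, name = fields
    start, end = int(start), int(end)
    if start >= end:
        raise ValueError(f"start must be smaller than end: {line!r}")
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-': {line!r}")
    return chrom, start, end, strand, name


def read_bed(lines):
    """Parse all non-empty lines and enforce unique names in the last column."""
    seen_names = set()
    records = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        record = parse_bed_line(line)
        name = record[4]
        if name in seen_names:
            raise ValueError(f"line {line_number}: duplicate name {name!r}")
        seen_names.add(name)
        records.append(record)
    return records


# The example records shown above.
example = [
    "chr2 69304539 69318051 + gene1",
    "chr7 138593736 138597206 - gene2",
    "chr22 39134591 39137055 - gene3",
]
records = read_bed(example)
```

To check a file on disk, pass an open file handle, for example `read_bed(open("test_bed"))`.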

To run the tool, use the following command:
python PredcircRNA.py --inputfile=test_bed --outputfile=test_bed_out

The output file contains the predicted lncRNA type for each transcript in the last column.
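Since the prediction is in the last column, a quick summary of a result file can be sketched as follows. The helper name, the sample lines, and the labels `circRNA` and `lncRNA` are hypothetical. The sketch also assumes whitespace-separated columns with one transcript per line, which the description above does not confirm.

```python
from collections import Counter


def count_predicted_types(lines):
    """Count how often each value appears in the last column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        counts[fields[-1]] += 1
    return counts


# Hypothetical output lines; the real labels written by the tool may differ.
sample_output = [
    "chr2 69304539 69318051 + gene1 circRNA",
    "chr7 138593736 138597206 - gene2 lncRNA",
    "chr22 39134591 39137055 - gene3 circRNA",
]
summary = count_predicted_types(sample_output)
```

For a real run, the same function can be applied to an open handle on the output file, for example `count_predicted_types(open("test_bed_out"))`.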

