
This is the accurate splicer (asp) program accompanying the paper 
"Accurate Splice Site Prediction Using Support Vector Machines"
by Soeren Sonnenburg, Gabriele Schweikert, Petra Philips,
Jonas Behr and Gunnar Raetsch [1].


ASP PROGRAM REQUIREMENTS:

Asp requires a working Python installation (2.4 or later) with numpy
(version 1.0 or later) and the shogun toolbox (version 0.7.3 or later).
Shogun is available from http://www.shogun-toolbox.org for Linux, MacOSX
and cygwin/win32. If you are running Debian GNU/Linux, shogun 0.7.3 is
available in Debian unstable:
http://packages.debian.org/unstable/science/shogun-python-modular
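
The version requirements above can be checked before the first run. The
following is a minimal sketch, not part of asp: the helper names
(version_tuple, meets_minimum, check_requirements) are hypothetical, and the
importable module name "shogun" is an assumption that may differ between
shogun releases. The shogun version itself is not checked, because how it is
exposed to Python is not stated here.

```python
import re
import sys


def version_tuple(text):
    """Turn a version string such as '0.7.3' into a comparable tuple (0, 7, 3)."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def meets_minimum(found, required):
    """Return True if version string `found` is at least `required`."""
    return version_tuple(found) >= version_tuple(required)


def check_requirements():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < (2, 4):
        problems.append("python 2.4 or later is required")
    try:
        import numpy
        if not meets_minimum(numpy.__version__, "1.0"):
            problems.append("numpy 1.0 or later is required, found %s"
                            % numpy.__version__)
    except ImportError:
        problems.append("numpy is not installed")
    try:
        import shogun  # assumption: module name of the shogun python interface
    except ImportError:
        problems.append("shogun could not be imported")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty list from check_requirements() only says that the modules import;
it does not prove that the installed shogun is 0.7.3 or later.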

ASP PROGRAM RUNNING TIME AND MEMORY REQUIREMENTS:

Asp requires about 100M of memory for short sequences. Memory requirements
grow only slightly with input size (an additional term linear in the length
of the input sequence). On the first run with a new model (see the --model
option below), asp loads and decompresses the .bz2-compressed model file and
stores it as a native Python pickle dump, which increases startup time
considerably. Due to the optimizations in [2], splice form prediction
(layer 1) times do not change much for many or long sequences.
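
The decompress-once-then-pickle behaviour described above can be illustrated
with a small sketch. This is not asp's actual code: the function name
load_model_cached, the ".pickle" cache suffix and the parse callback are all
hypothetical, and the real model file format is not described here.

```python
import bz2
import os
import pickle


def load_model_cached(bz2_path, parse, cache_path=None):
    """Load a model, decompressing the .bz2 file only when no cache exists.

    `parse` is a caller-supplied function that turns the decompressed bytes
    into a model object (hypothetical; the real format is not specified).
    """
    if cache_path is None:
        cache_path = bz2_path + ".pickle"  # assumption: cache next to the model

    # Fast path: a pickle dump from an earlier run already exists.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            return pickle.load(handle)

    # Slow path (first run): decompress, parse, then store the pickle dump.
    with bz2.BZ2File(bz2_path, "rb") as handle:
        model = parse(handle.read())
    with open(cache_path, "wb") as handle:
        pickle.dump(model, handle, pickle.HIGHEST_PROTOCOL)
    return model
```

In this sketch the expensive decompress-and-parse step runs only when the
cache file is missing; deleting the cache file forces it to run again.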

ASP PROGRAM USAGE:

./asp fasta_file.fa

This reads all entries in the .fa file and prints a .gff file with the
predictions for each entry to stdout. Optionally, specify the start and
stop of the transcript via --start <basenum> / --stop <basenum>, and the
model via --model <name>, where <name> is one of worm, fly, cress, fish,
human. <basenum> is zero-based.
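
The invocation above can also be scripted. The following is a minimal sketch
that assembles the command line from the documented options and redirects
stdout to a .gff file. The helper names (build_asp_command, run_asp) are
hypothetical, and placing the options before the fasta file is an assumption
about how asp parses its arguments.

```python
import subprocess

# Model names as listed for the --model option.
MODELS = ("worm", "fly", "cress", "fish", "human")


def build_asp_command(fasta_path, model=None, start=None, stop=None, asp="./asp"):
    """Build the argument list for one asp run (options are all optional)."""
    command = [asp]
    if model is not None:
        if model not in MODELS:
            raise ValueError("unknown model %r, expected one of %s"
                             % (model, ", ".join(MODELS)))
        command += ["--model", model]
    for flag, value in (("--start", start), ("--stop", stop)):
        if value is not None:
            if int(value) < 0:
                raise ValueError("%s must be a zero-based position >= 0" % flag)
            command += [flag, str(int(value))]
    command.append(fasta_path)
    return command


def run_asp(fasta_path, gff_path, **options):
    """Run asp and write the predictions it prints on stdout to gff_path."""
    with open(gff_path, "w") as out:
        subprocess.check_call(build_asp_command(fasta_path, **options), stdout=out)
```

For example, run_asp("fasta_file.fa", "predictions.gff", model="worm") would
execute ./asp --model worm fasta_file.fa and save the output, assuming asp is
in the current directory.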


REFERENCES:

[1]	S. Sonnenburg, G. Schweikert, P. Philips, J. Behr and G. Raetsch,
	Accurate Splice Site Prediction Using Support Vector Machines,
	BMC Bioinformatics, Special Issue from NIPS workshop on New Problems
	and Methods in Computational Biology, Whistler, Canada, 18 December 2006,
	BMC Bioinformatics 8(Suppl. 10):S7, December 2007.

[2]	S. Sonnenburg, G. Rätsch, C. Schäfer and B. Schölkopf,
	Large Scale Multiple Kernel Learning,
	Journal of Machine Learning Research 7:1531-1565, July 2006,
	K. Bennett and E. P.-Hernandez, Editors.