SW# - CUDA-parallelized Smith-Waterman using Hirschberg's algorithm
Copyright (C) 2011 Matija Korpar, contributor Mile Šikić
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Contact the author at mkorpar@gmail.com.
1) DEPENDENCIES
2) INSTALLATION
3) USAGE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
1) DEPENDENCIES
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
The application uses the following software:
1) Linux shell
2) gcc 4.4 (for the newest nvcc) - the required version depends on the nvcc version
3) nvcc 2.*+
4) make
5) ar (for lib building) (optional)
6) doxygen (for doc building) (optional)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
2) INSTALLATION
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
1) run Makefile with make <options>
make <options>:
1) make - builds the standard executable
3) make install - builds include/ and lib/ directories in the current
directory for API usage
(see doc/examples api_usage.cpp)
4) make docs - builds the html documentation in doc/html/
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
3) USAGE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
./sw# -i <first_sequence> -j <second_sequence> [OPTIONS] [FLAGS]
OPTIONS:
--query, -i <first_sequence>
fasta query file path
- example fasta query file:
> name
query chars
in more rows
- if the NAME ROW is NOT PROVIDED the program will TERMINATE
--target, -j <second_sequence>
fasta database file path
a fasta database is a set of fasta files appended into a single text file
- example fasta database file:
> name 1
query chars 1
> name 2
query chars 2
> name 3
query chars 3
- the example contains 3 database fasta entries; the same rules apply
as for the fasta query file
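The layout described above can be illustrated with a short Python sketch. It is not part of SW# and does not reflect how the program itself reads its input; the function name read_fasta_entries and the sample sequences are made up for this example. It splits an appended fasta file into (name, sequence) pairs and rejects sequence data that has no name row.

```python
def read_fasta_entries(text):
    """Split fasta text into (name, sequence) pairs.

    Hypothetical helper for illustration only; SW# has its own reader.
    """
    entries = []
    name, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # a new name row closes the previous entry, if there is one
            if name is not None:
                entries.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            if name is None:
                # mirrors the rule that a missing name row is an error
                raise ValueError("sequence data found before a '>' name row")
            chunks.append(line)  # sequence chars may span several rows
    if name is not None:
        entries.append((name, "".join(chunks)))
    return entries


# made-up database with three entries, the first spanning two rows
database = """> seq 1
ACGT
ACGT
> seq 2
TTGA
> seq 3
CCCA
"""
entries = read_fasta_entries(database)
for entry_name, sequence in entries:
    print(entry_name, len(sequence))
```

A query file is the same format with a single entry, so the same helper covers both inputs.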
--matrix-file <file_path>
(optional)
similarity matrix file path
if given, the similarity matrix is loaded from the file
- example matrix file:
5
A C T G *
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 0
- the first row gives the number of matcher elements (symbols)
- the example file scores +1 when the same symbol A, C, T, or G is
matched with itself and 0 for every other pair
- * is a wildcard, which means all other symbols are treated as this
symbol
- the n-th row and the n-th column of the matrix both correspond to the
n-th element of the second row
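To make the matrix file layout concrete, here is a minimal Python sketch that reads the example above and looks up a score. It is an illustration only, not the SW# loader; parse_matrix_file and score are hypothetical names, and whitespace-separated tokens are assumed.

```python
def parse_matrix_file(text):
    """Parse: element count, symbol row, then an n x n block of scores.

    Hypothetical helper; assumes whitespace-separated tokens.
    """
    tokens = text.split()
    n = int(tokens[0])                      # first row: number of elements
    symbols = tokens[1:1 + n]               # second row: the symbols
    values = [float(t) for t in tokens[1 + n:1 + n + n * n]]
    if len(symbols) != n or len(values) != n * n:
        raise ValueError("matrix file is shorter than its declared size")
    rows = [values[i * n:(i + 1) * n] for i in range(n)]
    index = {symbol: i for i, symbol in enumerate(symbols)}
    return index, rows


def score(index, rows, a, b):
    """Look up the score of a against b; unknown symbols map to '*'."""
    wildcard = index.get("*")
    i = index.get(a, wildcard)
    j = index.get(b, wildcard)
    if i is None or j is None:
        raise KeyError("unknown symbol and no '*' wildcard in the matrix")
    return rows[i][j]


example = """5
A C T G *
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 0
"""
index, rows = parse_matrix_file(example)
print(score(index, rows, "A", "A"))  # same symbol on the diagonal
print(score(index, rows, "A", "C"))  # different symbols
print(score(index, rows, "N", "N"))  # N is not listed, so it falls back to *
```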
--matrix-table <BLOSUM50|BLOSUM55|BLOSUM62|BLOSUM70|DNA|BLAST|CHIAR>
embedded similarity matrix
(optional)
(default: CHIAR)
if given, the --matrix-file option is not used
--match <float>
(optional)
(default: 1.0)
match value
if --match and --mismatch are given, --matrix-table and --matrix-file
options are not used
--mismatch <float>
(optional)
(default: -3.0)
mismatch value
if --match and --mismatch are given, --matrix-table and --matrix-file
options are not used
--gap-open, -g <float>
(optional)
(default: 5.0)
gap open value
--gap-extend, -e <float>
(optional)
(default: 2.0)
gap extend value
--threshold <float>
(optional)
(default: NO_THRESHOLD)
score threshold
the program outputs not just the max score and its reconstruction, but
all the scores which are above the given threshold
if given, the --solve flag is ignored
if given, the --global flag is ignored
if given, the --shotgun flag is ignored
--kbest <int>
(optional)
(default: 1)
compute k best scores
the program outputs the k best disjoint solutions
if given, the --solve flag is ignored
if given, the --global flag is ignored
if given, the --shotgun flag is ignored
--min-hits <int>
(optional)
(default: 0)
used together with the --window option: there must be at least
min_hits matches in every segment of window length
--window <int>
(optional)
(default: 0)
used together with the --min-hits option: there must be at least
min_hits matches in every segment of window length
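The --min-hits / --window rule can be sketched as a sliding-window check. The Python below is one possible reading, not the SW# implementation. It assumes the rule is applied to the columns of a finished alignment (two equal-length strings with '-' for gaps), that a value of 0 disables the check, and that an alignment shorter than the window is treated as a single segment. The name passes_min_hits is hypothetical.

```python
def passes_min_hits(aligned_a, aligned_b, window, min_hits):
    """Return True if every window-length segment has >= min_hits matches.

    Illustrative only: input form and edge cases are assumptions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if window <= 0 or min_hits <= 0:
        return True  # assumption: the defaults of 0 mean "no filtering"
    # 1 for a column where both symbols match and neither is a gap
    hits = [int(x == y and x != "-") for x, y in zip(aligned_a, aligned_b)]
    if len(hits) < window:
        return sum(hits) >= min_hits  # assumption: one short segment
    current = sum(hits[:window])
    if current < min_hits:
        return False
    # slide the window one column at a time, updating the running count
    for k in range(window, len(hits)):
        current += hits[k] - hits[k - window]
        if current < min_hits:
            return False
    return True


print(passes_min_hits("ACGTACGT", "ACGTACGT", window=4, min_hits=3))
print(passes_min_hits("ACGTTTTT", "ACGTAAAA", window=4, min_hits=2))
```

In the second call the trailing columns are all mismatches, so some window holds fewer than 2 matches and the check fails.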
--out <string>
(optional)
output results to the given file
--verbose <int>
(optional)
(default: 1)
verbose level:
0 : silent, no output is printed on stdout
1 : default, preferences and results are printed on stdout
2 : everything from level 1, plus progress bars and progress details
FLAGS:
--complement
(optional)
if the input is a nucleotide sequence, also search the other strand
--shotgun
(optional)
the shotgun flag is used for assembly; the input files are treated as
shotgun files
the shotgun piece separator char is: '_'
the number of results will be m * n, where m is the number of pieces in
the first file and n the number of pieces in the second
- example shotgun file (similar to fasta entries):
> name
CCC_AAA_
- the example file consists of two pieces, CCC and AAA
- it is MANDATORY that each shotgun sequence ENDS WITH the PIECE
SEPARATOR CHAR
if the --global flag is provided, all of the results will contain
the whole corresponding piece from the database file
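The piece separator rule and the m * n result count can be shown with a few lines of Python. This is a sketch for illustration, not SW# code; split_pieces and the second sequence GGG_TTT_ACG_ are invented for the example.

```python
from itertools import product


def split_pieces(sequence, separator="_"):
    """Split a shotgun sequence into its pieces.

    Hypothetical helper; enforces the trailing-separator rule.
    """
    if not sequence.endswith(separator):
        raise ValueError("shotgun sequence must end with the piece separator")
    # drop the final separator, then split on the remaining ones
    return sequence[:-len(separator)].split(separator)


first = split_pieces("CCC_AAA_")       # m = 2 pieces
second = split_pieces("GGG_TTT_ACG_")  # n = 3 pieces (made-up sequence)

# every piece of the first file is paired with every piece of the second
pairs = list(product(first, second))
print(first, second, len(pairs))
```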
--solve
(optional)
solve only, no reconstruction result or statistics are provided
--cpu
(optional)
force CPU solving
--single
(optional)
use only one CUDA card
--global
(optional)
global alignment
if --threshold is given, this option is not used