The Filtered Spaced Word Matches Approach

FSWM

The program takes a set of genomic sequences as input and generates a distance matrix with the pairwise distances between the input sequences.

Usage: to compile, type: make

run with: ./fswm [options]

input format:

The input sequences must be contained in a single file in FASTA format. Each species/genome must be represented by exactly one sequence in the input FASTA file. If you have multiple reads, contigs or chromosomes per input species, please concatenate them into a single sequence so that each species/genome corresponds to only one sequence. Example:

>Genome1
ATAGTAGATGAT..
>Genome2
ATAGTAGTAGTAG..
>Genome3
ATGATGATGATGATG..
..
etc.
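
The concatenation step described above can be done in many ways. The following is a minimal Python sketch, not part of FSWM itself: the function name `concatenate_fasta` and the record names are hypothetical, and it assumes that plain end-to-end joining of the records is acceptable, since this README does not say whether a separator should be placed between contigs.

```python
def concatenate_fasta(lines, name):
    """Merge every record found in `lines` into one FASTA record called `name`.

    `lines` is any iterable of text lines (an open file works).
    Header lines (starting with '>') and blank lines are dropped;
    the remaining sequence lines are joined in their original order.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        parts.append(line)
    return ">" + name + "\n" + "".join(parts) + "\n"


# Hypothetical input: two contigs belonging to the same genome.
contigs = [">contig1", "ATAG", "TAGA", ">contig2", "TGAT"]
print(concatenate_fasta(contigs, "Genome1"), end="")
```

Running this once per genome and appending the results to one file gives an input with a single sequence per species.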

options:

-h: print this help and exit
-k : pattern weight (default: 12)
-t : number of threads (default: 10)
-s : the minimum score of a spaced-word match to be considered homologous (default: 0)
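
To give a rough idea of what the pattern weight and the match score refer to, here is a small Python sketch. It assumes the usual definition from the spaced-words literature: a binary pattern marks match positions with 1 and don't-care positions with 0, and the weight is the number of 1s. The function names and the +1/-1 scoring are assumptions made for illustration only; they are not the program's actual code or scoring scheme.

```python
def pattern_weight(pattern):
    """Number of match positions ('1') in a binary pattern."""
    return pattern.count("1")


def is_spaced_word_match(a, b, pattern):
    """True if a and b agree at every match position of the pattern."""
    return all(x == y for x, y, p in zip(a, b, pattern) if p == "1")


def toy_score(a, b, pattern, match=1, mismatch=-1):
    """Toy score over the don't-care positions ('0') only.

    The +1/-1 values are placeholders chosen for this example.
    """
    return sum(
        match if x == y else mismatch
        for x, y, p in zip(a, b, pattern)
        if p == "0"
    )


pattern = "1101"            # weight 3, one don't-care position
a, b = "ACGT", "ACTT"       # differ only at the don't-care position
print(pattern_weight(pattern))
print(is_spaced_word_match(a, b, pattern))
print(toy_score(a, b, pattern))
```

In this picture, -k sets how many positions must agree exactly, and -s sets the threshold that a match's score must reach for the match to be kept.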

Scientific publications using filtered spaced word matches should cite:

C.-A. Leimeister, S. Sohrabi-Jahromi, B. Morgenstern (2017). Fast and Accurate Phylogeny Reconstruction using Filtered Spaced-Word Matches. Bioinformatics 33, 971-979.
