OWL: a tool to order FASTQ reads using elastic cluster mapping.

OWL is a tool that reorders FASTQ reads, disregarding their original order. It maps the reads to a reference sequence using k-mer positional hashing and then orders them using elastic clustering. OWL is needed only during compression, so decompression remains very fast and uses little memory. The tool can substantially improve the compression of FASTQ files (see the following figure for a pipeline using the general-purpose GZIP compressor). The time complexity of the tool is approximately linear.
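
The idea of mapping reads through k-mer positions and then ordering them can be pictured with a toy sketch. This is not OWL's implementation: the function names `kmer_index` and `order_reads` are hypothetical, each read is mapped only by an exact match of its first k-mer, and elastic clustering is not modeled at all.

```shell
#!/usr/bin/env bash
# Toy illustration only; NOT OWL's algorithm. All names are hypothetical.

# kmer_index REF K: print "kmer<TAB>position" for the first occurrence
# of every k-mer in the reference string REF (1-based positions).
kmer_index() {
  awk -v ref="$1" -v k="$2" 'BEGIN {
    n = length(ref)
    for (i = 1; i + k - 1 <= n; i++) {
      w = substr(ref, i, k)
      if (!(w in seen)) { seen[w] = 1; printf "%s\t%d\n", w, i }
    }
  }'
}

# order_reads REF K: read sequences (one per line) from stdin and print
# them sorted by the reference position of their leading k-mer.
# Reads whose leading k-mer is absent from REF are placed last.
order_reads() {
  awk -v ref="$1" -v k="$2" '
    BEGIN {
      n = length(ref)
      for (i = 1; i + k - 1 <= n; i++) {
        w = substr(ref, i, k)
        if (!(w in pos)) pos[w] = i
      }
    }
    {
      w = substr($0, 1, k)
      p = (w in pos) ? pos[w] : n + 1
      printf "%d\t%s\n", p, $0
    }' | sort -n -k1,1 | cut -f2
}
```

For instance, `printf 'GTAC\nACGT\n' | order_reads ACGTACGT 4` places the read that matches the start of the reference before the one that matches further along. Reads that land near each other in the reference tend to be similar, which is what a downstream compressor can exploit.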

A human reference genome can be downloaded using the script GetHuman.sh contained in the scripts folder.

1. INSTALLATION

Downloading and installing OWL:

git clone https://github.com/pratas/owl.git
cd owl/src/
cmake .
make

CMake is required for the installation (http://www.cmake.org/). You can download it directly from http://www.cmake.org/cmake/resources/software.html or use an appropriate package manager, such as:

sudo apt-get install cmake

Alternatively, on Linux only, OWL can be built without CMake using the following instructions:

cp Makefile.linux Makefile
make

2. USAGE

To see the available options of OWL, type

./OWL

or

./OWL -h

These will print the following options:

Usage: OWL [OPTIONS]... [FILE] [FILE]
A tool to order FASTQ reads using elastic cluster mapping.

Non-mandatory arguments:

  -h               give this help,
  -V               display version number,
  -v               verbose mode (more information),
  -N               does NOT order reads,
  -W               writes the full header,
  -D               does NOT delete the temporary file,
  -k <k-mer>       k-mer size [1;20],
  -m <minimum>     minimum block size.

Mandatory arguments:

  <FILE>           reference file,
  < <FILE>         stdin input FASTQ file,
  > <FILE>         stdout output sorted FASTQ file.

Example: ./OWL -v -k 16 -m 40 reference.fa < ex1.fq > ex1-sort.fq

Report bugs to <{pratas,ap}@ua.pt>.

The parameters are explained in more detail in the following table:

Parameters	Meaning
-h	It will print the parameters menu (help menu).
-V	It will print the OWL version number, license type and author information.
-v	It will print progress information.
-N	It will NOT sort the reads (for analysis purposes).
-W	It will write the full header in the output FASTQ file. Usually, part of the header is not needed.
-D	It will not delete the temporary file used for ordering the reads (for analysis purposes).
-k <k-mer>	The word size of the sliding window. From 1 to 20. Usually, larger values need more memory.
-m <minimum>	The minimum size of proximity. Used in the elastic clustering.
[FILE]	Reference filename (DNA sequence or FASTA file).
< [FILE]	Input FASTQ file with an arbitrary read order (standard input).
> [FILE]	Output FASTQ file with the reads ordered (standard output).

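Because the k-mer size is limited to the range [1;20], a script that calls OWL may want to validate it first. The following is a minimal sketch, assuming the OWL binary sits in the current directory; `check_k` and `run_owl` are hypothetical helper names and are not part of OWL.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the ./OWL flags come from the help text above.

# check_k K: succeed only if K is an integer in the documented range [1;20].
check_k() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 20 ]
}

# run_owl K MIN REF IN OUT: validate K, then run the documented invocation.
run_owl() {
  check_k "$1" || { echo "k-mer size must be in [1;20], got '$1'" >&2; return 2; }
  ./OWL -v -k "$1" -m "$2" "$3" < "$4" > "$5"
}
```

A call such as `run_owl 16 40 reference.fa ex1.fq ex1-sort.fq` then corresponds to the example printed by the help menu.
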
3. EXAMPLES

The OWL tool can be integrated with most general-purpose and FASTQ-specific compressors. For the examples, consider a reference sequence in FASTA format named 'reference.fa' and a FASTQ file named 'reads.fq'.

3.1 EXAMPLE WITH GZIP

The following instructions show how to integrate OWL with GZIP:

./OWL -v -k 10 -m 40 reference.fa < reads.fq | gzip > reads.gz

and for decompression:

gunzip reads.gz
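
Note that `gunzip reads.gz` replaces the archive with a file named `reads`. A minimal sketch for keeping the archive and choosing the output name is given below, together with a simple record count that can serve as a sanity check. The helper names `restore` and `count_reads` are hypothetical, and the count assumes the common four-lines-per-record FASTQ layout.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the decompression side; OWL itself is not needed here.

# restore ARCHIVE OUT: decompress ARCHIVE into OUT and keep ARCHIVE on disk.
restore() {
  gunzip -c "$1" > "$2"
}

# count_reads FILE: number of FASTQ records, assuming four lines per record.
count_reads() {
  awk 'END { print int(NR / 4) }' "$1"
}
```

For example, `restore reads.gz reads-sorted.fq` followed by `count_reads reads-sorted.fq` and `count_reads reads.fq` lets the two totals be compared, assuming the ordering step keeps every read.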

3.2 EXAMPLE WITH FQZ_COMP

The following instructions show how to integrate OWL with FQZ_COMP:

./OWL -v -k 10 -m 40 reference.fa < reads.fq | ./fqz_comp > reads.gz

and for decompression:

./fqz_comp -d < reads.gz > reads.fq

4. CITATION

If you use this tool or method, please cite:

D. Pratas, A. J. Pinho (2017). v1.1 pratas/owl: A tool to order FASTQ reads using elastic cluster mapping.
DOI: 10.5281/zenodo.1048947

5. ISSUES

Please report any issue through the repository's issues page.

6. LICENSE

GPL v3.

For more information:

http://www.gnu.org/licenses/gpl-3.0.html
