
OWL


OWL: a tool to order FASTQ reads using elastic cluster mapping.


OWL is a tool that reorders the reads of a FASTQ file, disregarding their original order. It maps the reads to a reference sequence using k-mer positional hashing and then orders them using elastic clustering. OWL is needed only at compression time, so decompression remains very fast and uses little memory. The tool can substantially improve the compression of FASTQ files (see the following figure for a pipeline using the general-purpose GZIP compressor). Its time complexity is approximately linear.
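The idea of mapping reads by k-mer position and then ordering them can be illustrated with a minimal Python sketch. It is not OWL's implementation: the function names are hypothetical, the index keeps only the first occurrence of each k-mer, and a plain sort by estimated position stands in for the elastic clustering step, which is not reproduced here.

```python
from collections import Counter

def build_kmer_index(reference, k):
    """Map each k-mer to the first position where it occurs in the reference."""
    index = {}
    for pos in range(len(reference) - k + 1):
        index.setdefault(reference[pos:pos + k], pos)
    return index

def estimate_position(read, index, k):
    """Let every k-mer hit vote for a read start offset; return None if no k-mer hits."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        pos = index.get(read[i:i + k])
        if pos is not None:
            votes[pos - i] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]

def order_reads(reads, reference, k):
    """Sort mapped reads by estimated position; unmapped reads keep their order at the end.

    Assumption: a simple sort replaces OWL's elastic clustering.
    """
    index = build_kmer_index(reference, k)
    keyed = [(estimate_position(r, index, k), r) for r in reads]
    mapped = sorted((pr for pr in keyed if pr[0] is not None), key=lambda pr: pr[0])
    unmapped = [r for p, r in keyed if p is None]
    return [r for _, r in mapped] + unmapped
```

Reads that come from nearby regions of the reference end up next to each other, which is what gives a general-purpose compressor more redundancy to exploit within its window.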

[Figure: OWL pipeline using the general-purpose GZIP compressor]


A human reference genome can be downloaded using the script GetHuman.sh contained in the scripts folder.

1. INSTALLATION

Downloading and installing OWL:

git clone https://github.com/pratas/owl.git
cd owl/src/
cmake .
make

CMake is needed for the installation (http://www.cmake.org/). You can download it directly from http://www.cmake.org/cmake/resources/software.html or use an appropriate package manager, such as:

sudo apt-get install cmake

As an alternative to CMake, limited to Linux, OWL can be built with the provided Makefile:

cp Makefile.linux Makefile
make

2. USAGE

To see the available options of OWL, type

./OWL

or

./OWL -h

These will print the following options:

Usage: OWL [OPTIONS]... [FILE] [FILE]
A tool to order FASTQ reads using elastic cluster mapping.

Non-mandatory arguments:
  -h              give this help,
  -V              display version number,
  -v              verbose mode (more information),
  -N              does NOT order reads,
  -W              writes the full header,
  -D              does NOT delete the temporary file,
  -k <k-mer>      k-mer size [1;20],
  -m <minimum>    minimum block size.

Mandatory arguments:
  <FILE>          reference file,
  < <FILE>        stdin input FASTQ file,
  > <FILE>        stdout output sorted FASTQ file.

Example: ./OWL -v -k 16 -m 40 reference.fa < ex1.fq > ex1-sort.fq

Report bugs to <{pratas,ap}@ua.pt>.

The parameters are explained in more detail in the following table:

| Parameter | Meaning |
|---|---|
| -h | Prints the parameters menu (help menu). |
| -V | Prints the OWL version number, license type and author information. |
| -v | Prints progress information. |
| -N | Does NOT sort the reads (for analysis purposes). |
| -W | Writes the full header in the output FASTQ file. Usually a large part of the header is not needed. |
| -D | Does NOT delete the temporary file used for ordering the reads (for analysis purposes). |
| -k <k-mer> | The word size of the sliding window, from 1 to 20. Usually, larger values need more memory. |
| -m <minimum> | The minimum proximity size (minimum block size) used in the elastic clustering. |
| [FILE] | Reference filename (DNA sequence or FASTA file). |
| < [FILE] | Input FASTQ file with arbitrary read order (standard input). |
| > [FILE] | Output FASTQ file with the reads ordered (standard output). |
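For scripted pipelines, the options above can be assembled programmatically. The following is a minimal sketch, assuming the binary is available as ./OWL in the working directory; build_owl_command and run_owl are hypothetical helper names, not part of OWL, and only the flags listed in the table are used.

```python
import subprocess

def build_owl_command(reference, k=None, minimum=None, verbose=False,
                      no_order=False, full_header=False, keep_temp=False,
                      binary="./OWL"):
    """Return the argument list for one OWL run, using only the documented flags."""
    cmd = [binary]
    if verbose:
        cmd.append("-v")
    if no_order:
        cmd.append("-N")
    if full_header:
        cmd.append("-W")
    if keep_temp:
        cmd.append("-D")
    if k is not None:
        # The help text documents the k-mer size range as [1;20].
        if not 1 <= k <= 20:
            raise ValueError("k must be in the documented range [1;20]")
        cmd += ["-k", str(k)]
    if minimum is not None:
        cmd += ["-m", str(minimum)]
    cmd.append(reference)  # the reference file is the only positional argument
    return cmd

def run_owl(reference, fastq_in, fastq_out, **options):
    """Run OWL with the input FASTQ on stdin and the ordered FASTQ on stdout."""
    with open(fastq_in, "rb") as src, open(fastq_out, "wb") as dst:
        subprocess.run(build_owl_command(reference, **options),
                       stdin=src, stdout=dst, check=True)
```

For instance, run_owl("reference.fa", "ex1.fq", "ex1-sort.fq", verbose=True, k=16, minimum=40) corresponds to the example shown in the help output.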

3. EXAMPLES

OWL can be integrated with most general-purpose and FASTQ-specific compressors. For the examples below, consider a reference sequence in FASTA format named 'reference.fa' and a FASTQ file named 'reads.fq'.

3.1 EXAMPLE WITH GZIP

The following instructions show how to integrate OWL with GZIP:

./OWL -v -k 10 -m 40 reference.fa < reads.fq | gzip > reads.gz

and for decompression:

gunzip reads.gz
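After a round trip it can be useful to check that the reordered file still holds the same reads and to measure how much the ordering helps GZIP. The sketch below is a hypothetical helper script, not part of OWL. It assumes 4-line FASTQ records and that OWL leaves the sequence and quality lines unchanged; headers are ignored in the comparison because, without -W, only part of the header is written.

```python
import gzip
from collections import Counter

def read_fastq(path):
    """Yield (header, sequence, quality) from a FASTQ file with 4 lines per record."""
    with open(path) as fh:
        while True:
            header = fh.readline().rstrip("\n")
            if not header:
                break
            seq = fh.readline().rstrip("\n")
            fh.readline()  # skip the '+' separator line
            qual = fh.readline().rstrip("\n")
            yield header, seq, qual

def same_reads(path_a, path_b):
    """True if both files hold the same multiset of (sequence, quality) pairs."""
    a = Counter((s, q) for _, s, q in read_fastq(path_a))
    b = Counter((s, q) for _, s, q in read_fastq(path_b))
    return a == b

def gzip_size(path):
    """Size in bytes of the file after in-memory gzip compression."""
    with open(path, "rb") as fh:
        return len(gzip.compress(fh.read()))
```

Since gunzip reads.gz writes its output to a file named reads, a check could be same_reads("reads.fq", "reads"), followed by comparing gzip_size("reads.fq") with gzip_size("reads"). Both helpers load data into memory, so very large FASTQ files would need a streaming variant.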

3.2 EXAMPLE WITH FQZ_COMP

The following instructions show how to integrate OWL with FQZ_COMP:

./OWL -v -k 10 -m 40 reference.fa < reads.fq | ./fqz_comp > reads.gz

and for decompression:

./fqz_comp -d < reads.gz > reads.fq

4. CITATION

If you use this tool or method, please cite:

D. Pratas, A. J. Pinho (2017). v1.1 pratas/owl: A tool to order FASTQ reads using elastic cluster mapping.
DOI: 10.5281/zenodo.1048947

5. ISSUES

Please report any issue through the issues link of the repository.

6. LICENSE

GPL v3.

For more information:

http://www.gnu.org/licenses/gpl-3.0.html