Remove PCR duplicates from FASTQ files
This Rust program removes PCR duplicates from FASTQ files. The driving goal was to process large files on a modest computer, so it trades some speed for less memory consumption by doing two things:

  1. encoding the already-seen sequences as byte sequences (3 bases per byte, followed by a byte encoding the original sequence length);
  2. storing these byte sequences in a Patricia tree, which reduces the memory used by taking advantage of common prefixes.
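
The two steps above can be illustrated with a minimal sketch. This is not fqdedup's actual code: the base-5 digit assignment (A, C, G, T, and everything else treated as N), the zero padding of the last group, and the 255-base limit implied by a one-byte length are all assumptions made for illustration. A standard-library `BTreeSet` stands in for the Patricia tree, so the sketch shows the deduplication logic but not the prefix-sharing memory savings.

```rust
use std::collections::BTreeSet;

/// Map a base to a digit in 0..5 (assumed alphabet: A, C, G, T, anything else as N).
fn base_code(b: u8) -> u8 {
    match b.to_ascii_uppercase() {
        b'A' => 0,
        b'C' => 1,
        b'G' => 2,
        b'T' => 3,
        _ => 4,
    }
}

/// Pack three bases per byte as base-5 digits (maximum value 124),
/// then append one byte holding the original sequence length.
/// Returns None for reads longer than 255 bases (an assumption of this sketch).
fn encode(seq: &[u8]) -> Option<Vec<u8>> {
    if seq.len() > 255 {
        return None;
    }
    let mut out = Vec::with_capacity(seq.len() / 3 + 2);
    for chunk in seq.chunks(3) {
        let mut byte = 0u8;
        for i in 0..3 {
            // A short final group is padded with zeros; the trailing
            // length byte keeps padded and unpadded reads distinct.
            let digit = chunk.get(i).map_or(0, |&b| base_code(b));
            byte = byte * 5 + digit;
        }
        out.push(byte);
    }
    out.push(seq.len() as u8);
    Some(out)
}

/// Keep the first occurrence of each sequence, in input order.
/// Reads that cannot be encoded are kept as-is rather than dropped.
fn dedup<'a>(reads: &[&'a str]) -> Vec<&'a str> {
    // Stand-in for the Patricia tree: any set keyed on the encoded bytes works here.
    let mut seen: BTreeSet<Vec<u8>> = BTreeSet::new();
    let mut kept = Vec::new();
    for &read in reads {
        let is_new = match encode(read.as_bytes()) {
            Some(key) => seen.insert(key),
            None => true,
        };
        if is_new {
            kept.push(read);
        }
    }
    kept
}
```

The length byte matters because padding alone would make a read such as `T` indistinguishable from `TA` or `TAA`; with the length appended, each of the three gets a different key.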


fqdedup is written in Rust, so it requires Rust and Cargo to compile. fqdedup is known to run on OS X and Linux. It may be possible to compile it on other platforms as long as the supporting Rust libraries can be made to compile; some of the dependencies are likely to be the most troublesome in that respect.

To install Rust and Cargo, follow the official Rust installation instructions. fqdedup should compile with Rust 1.18.0 or later. Alternatively, on OS X, if you have Homebrew installed and up to date, you can use brew to install Rust as follows:

brew install rust

Afterwards, uncompress the source and build:

tar xzf fqdedup-1.0.0.tar.gz
cd fqdedup-1.0.0
cargo build --release 

After compilation, copy the fqdedup binary (fqdedup-1.0.0/target/release/fqdedup) to a folder in your PATH, for example /usr/local/bin.


    fqdedup [options] -i <filename>
    fqdedup (--help | --version)

  -h --help              Show this screen.
  --version              Show version.
  -o --output OUTPUT     Output filename (defaults to appending '_deduplicated' to the input name).
