A seqtk-style toolkit for sequence analysis. It is by no means feature complete; in fact, it largely contains features that other authors have not merged into their respective tools.
Install zlib version >= 1.2.5, then:

    git clone https://github.com/kdmurray91/seqhax.git
    cd seqhax
    mkdir build && cd build
    cmake ..
    make
    make install
To build static binaries, replace the cmake step in the commands above with:

    cmake -DSTATIC_BUILD=On ..
If you run into any other issues, please file a bug report on GitHub.
The seqhax command has many subcommands. Running seqhax with no arguments displays them, along with a synopsis of what each one does.
At the time of writing, these were:
    $ seqhax
    USAGE: seqhax PROGRAM [options]

    where PROGRAM is one of:
        anon     -- Rename sequences by a sequential number
        convert  -- Convert between FASTA and FASTQ formats
        filter   -- Filter reads from a sequence file
        pairs    -- (De)interleave paired end reads
        pecheck  -- Check that paired end reads match properly (also join them)
        preapp   -- Prepend or append string to sequences
        randseq  -- Generate a random sequence file
        stats    -- Basic statistics about sequence files
        trunc    -- Truncate sequences
The usage of each subcommand can be obtained by passing the -h flag to that subcommand, e.g. seqhax preapp -h.
anon: Renames sequences with a sequential numeric ID.
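As an illustration of the idea only, sequential renaming can be pictured with the Python sketch below. It is not seqhax's implementation: the function name, the (name, sequence) record layout, and the choice to start counting at 1 are all assumptions made for the example.

```python
def anonymise(records, start=1):
    """Yield (new_name, sequence) pairs, replacing each name with a counter.

    `records` is any iterable of (name, sequence) tuples. This layout and the
    default starting number are assumptions of the sketch.
    """
    for number, (_old_name, sequence) in enumerate(records, start=start):
        yield str(number), sequence


reads = [("HWI-ST123:1:1101", "ACGT"), ("HWI-ST123:1:1102", "GGCA")]
renamed = list(anonymise(reads))
```

The original names are discarded, so the mapping cannot be reversed unless it is recorded separately.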
convert: Converts between FASTA and FASTQ formats.
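The two directions of this conversion are not symmetric: FASTQ to FASTA discards the quality string, while FASTA to FASTQ has to invent one. A minimal Python sketch of both follows. The function names and the `fill_quality` parameter are hypothetical, and how seqhax itself fills in qualities is not stated here.

```python
def fastq_to_fasta(name, sequence, quality):
    """Format one FASTQ record as FASTA; the quality string is discarded."""
    return f">{name}\n{sequence}\n"


def fasta_to_fastq(name, sequence, fill_quality="I"):
    """Format one FASTA record as FASTQ.

    FASTA carries no qualities, so every base gets the same placeholder
    character (`fill_quality` is a hypothetical parameter of this sketch).
    """
    return f"@{name}\n{sequence}\n+\n{fill_quality * len(sequence)}\n"
```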
filter: Removes sequences based on certain criteria; run seqhax filter -h for the available options.
pairs: Interleaves or de-interleaves paired reads. Converts between the following forms:
- Separate R1/R2 paired files and a single-read file.
- "Strict" interleaved file, where failed/missing reads are replaced with a single 'N'.
- "Broken paired" interleaved files, where failed/missing reads are simply not present.
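The difference between the "strict" and "broken paired" interleaved forms can be sketched in Python as follows. This is an illustration under stated assumptions, not seqhax's code: reads are modelled as (read_id, sequence) tuples, both inputs are in the same order, and `None` stands for a failed or missing read.

```python
def interleave(r1, r2, strict=True):
    """Interleave two equally ordered lists of (read_id, sequence) records.

    A sequence of None marks a failed or missing read (an assumption of this
    sketch). In strict mode it is replaced by a single 'N'; otherwise the
    read is left out, giving a "broken paired" file.
    """
    out = []
    for mate1, mate2 in zip(r1, r2):
        for read_id, sequence in (mate1, mate2):
            if sequence is not None:
                out.append((read_id, sequence))
            elif strict:
                out.append((read_id, "N"))
    return out


r1 = [("a/1", "ACGT"), ("b/1", None)]
r2 = [("a/2", "TTGA"), ("b/2", "CCAT")]
strict_reads = interleave(r1, r2, strict=True)
broken_reads = interleave(r1, r2, strict=False)
```

In the strict form every read keeps a mate at a fixed offset, so downstream tools can assume records come in twos; the broken-paired form is smaller but requires the reader to match IDs.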
pecheck: Checks that read pairs are correctly matched, either between split (R1 & R2) files or within interleaved files. Optionally, it can join multiple R1/R2 files from the same sample into a single interleaved file while checking that read IDs match.
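A pair check of this kind amounts to comparing read IDs position by position. The Python sketch below shows one way to do that; the helper names are hypothetical, and the /1 and /2 mate suffix is an assumed naming convention (other schemes exist, and seqhax's own matching rules may differ).

```python
def pair_key(read_id):
    """Strip a trailing /1 or /2 so both mates share one key.

    The /1 and /2 suffix convention is an assumption of this sketch.
    """
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def first_mismatch(r1_ids, r2_ids):
    """Return the index of the first pair whose IDs disagree, or None."""
    for index, (id1, id2) in enumerate(zip(r1_ids, r2_ids)):
        if pair_key(id1) != pair_key(id2):
            return index
    if len(r1_ids) != len(r2_ids):
        # One file ran out of reads before the other.
        return min(len(r1_ids), len(r2_ids))
    return None
```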
preapp: Adds a constant prefix and/or suffix to each sequence.
randseq: Generates a FASTA or FASTQ file containing random sequences.
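For readers who want a feel for what such a file contains, here is a small Python sketch that builds random FASTA records. It assumes bases are drawn independently and uniformly from A, C, G and T, which may not match the distribution seqhax uses; the record names and lengths are likewise made up for the example.

```python
import random


def random_sequence(length, rng, alphabet="ACGT"):
    """Draw `length` bases independently and uniformly from `alphabet`."""
    return "".join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
records = [(f"seq{i}", random_sequence(20, rng)) for i in range(1, 4)]
fasta = "".join(f">{name}\n{sequence}\n" for name, sequence in records)
```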
stats: Counts the number of reads and base pairs in sequence files and outputs a convenient table.
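The counting itself is simple, as the following Python sketch suggests. The function names and the column headings of the table are hypothetical and chosen for the example; the actual layout of the seqhax output is not reproduced here.

```python
def sequence_stats(sequences):
    """Return (number of reads, total base pairs) for an iterable of sequences."""
    n_reads = 0
    n_bases = 0
    for sequence in sequences:
        n_reads += 1
        n_bases += len(sequence)
    return n_reads, n_bases


def format_table(rows):
    """Render (filename, reads, bases) rows as tab-separated text with a header."""
    lines = ["filename\treads\tbases"]
    lines += [f"{name}\t{reads}\t{bases}" for name, reads, bases in rows]
    return "\n".join(lines)


reads, bases = sequence_stats(["ACGT", "GG", "TTTAAA"])
table = format_table([("example.fastq", reads, bases)])
```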
trunc: Truncates reads at a given length.
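When truncating FASTQ reads, the quality string has to be cut to the same length as the sequence. A minimal Python sketch of that rule is shown below. It assumes the start of the read is kept and the tail is dropped, and the function name is hypothetical.

```python
def truncate(sequence, quality, length):
    """Keep at most the first `length` bases and their matching qualities.

    Reads shorter than `length` are returned unchanged.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return sequence[:length], quality[:length]
```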
GPL v3 (see ./LICENSE).
Copyright (c) 2014-2016 Kevin Murray.