This repository has been archived by the owner on Jun 22, 2022. It is now read-only.
Home
Andrea Telatin edited this page Sep 17, 2020 · 16 revisions
A collection of Sequence FASTX Utilities, partly shipped with this repository and partly coming from external sources.
They were originally built on these principles:
- Reading both FASTQ and FASTA sequences with the same parser
- Parsing both name and comments from sequence headers (e.g. `>Seq_name length=1200`)
- Supporting .gz input files, and possibly other compression formats
- Supporting streams (standard input / standard output)
- Native support for Illumina Paired-End libraries when needed
New scripts also adopt BioX::Seq.
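The principles above (one parser for FASTA and FASTQ, name and comment split from the header, .gz input, standard input) can be illustrated with a minimal Python sketch. This is not the code used by the tools in this repository: `Record`, `open_fastx`, `split_header` and `read_fastx` are hypothetical names chosen for the example, and the sketch assumes each file holds a single format and that FASTQ records span exactly four lines.

```python
import gzip
import sys
from typing import IO, Iterator, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    name: str            # first word of the header
    comment: str         # rest of the header, e.g. "length=1200"
    seq: str
    qual: Optional[str]  # None for FASTA records


def open_fastx(path: str) -> IO[str]:
    """Open plain or gzip-compressed input; '-' means standard input."""
    if path == "-":
        return sys.stdin
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # gzip magic bytes
        return gzip.open(path, "rt")
    return open(path, "rt")


def split_header(header: str) -> Tuple[str, str]:
    """Split 'Seq_name length=1200' into ('Seq_name', 'length=1200')."""
    parts = header.split(None, 1)
    name = parts[0] if parts else ""
    comment = parts[1].strip() if len(parts) > 1 else ""
    return name, comment


def read_fastx(handle: IO[str]) -> Iterator[Record]:
    """Yield records from a FASTA or FASTQ stream (assumption: 4-line FASTQ)."""
    line = handle.readline()
    while line:
        line = line.rstrip("\r\n")
        if not line:  # skip blank lines between records
            line = handle.readline()
            continue
        if line[0] == ">":  # FASTA: sequence may wrap over several lines
            name, comment = split_header(line[1:])
            chunks = []
            line = handle.readline()
            while line and not line.startswith(">"):
                chunks.append(line.strip())
                line = handle.readline()
            yield Record(name, comment, "".join(chunks), None)
        elif line[0] == "@":  # FASTQ: header, sequence, '+', quality
            name, comment = split_header(line[1:])
            seq = handle.readline().strip()
            handle.readline()  # '+' separator line, ignored
            qual = handle.readline().strip()
            yield Record(name, comment, seq, qual)
            line = handle.readline()
        else:
            raise ValueError(f"Unexpected line in FASTX input: {line!r}")
```

With this sketch, `for rec in read_fastx(open_fastx("-")): ...` would consume records from standard input, and the same loop would work unchanged on a `.fa`, `.fq` or `.gz` path.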
- seqfu - core utility
- fu-cat, concatenate FASTX files
- fu-grep, extract sequences by DNA pattern, by name or comment
- fu-len, filter sequences by size
- fu-count, count sequences
- fu-rename, rename sequences with a prefix
- fu-sort, sort sequences by size
- pe-cat, concatenate paired-end files (error tolerant, can be used to repair broken PE)
- pe-len, filter paired end sets by length
- pe-grep, filter paired end sets
- pe-ren, rename FASTQ files using barcodes or Illumina paired ends
- n50, calculate N50, number of sequences, minimum, maximum and total length
- interleafq, interleave and deinterleave paired sequences
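As an illustration of the statistics reported by n50, here is a short Python sketch using the common definition of N50: the length L such that sequences of length L or longer contain at least half of the total bases. The names `LengthStats` and `length_stats` are hypothetical, and the script shipped in this repository may differ in edge cases and output format.

```python
from typing import Iterable, NamedTuple


class LengthStats(NamedTuple):
    count: int
    total: int
    minimum: int
    maximum: int
    n50: int


def length_stats(lengths: Iterable[int]) -> LengthStats:
    """Compute count, total, min, max and N50 from sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("no sequences given")
    total = sum(ordered)
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        # Stop at the first length where the running sum reaches half the total
        if running * 2 >= total:
            n50 = length
            break
    return LengthStats(len(ordered), total, ordered[-1], ordered[0], n50)
```

For lengths 2, 3, 4, 5 and 6 the total is 20; the two longest sequences (6 + 5 = 11) already cover half of it, so the N50 is 5.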
SeqFU - a collection of tools to parse and manipulate FASTA and FASTQ files, supporting compressed input