moold/kseq-rs

kseq

kseq is a simple FASTA/FASTQ (fastx) format parser library for Rust. Its main function is to iterate over the records in fastx files (similar to kseq in C). It reads and stores records in a shared buffer, which keeps parsing fast. It supports plain or gz fastx files, io::stdin, and fofn (file-of-file-names) files, which list multiple plain or gz fastx files (one per line).
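To make the fofn layout concrete, here is a minimal sketch of reading such a file using only the standard library. It is an illustration of the format and not kseq's own implementation: the `read_fofn` helper and the file names are hypothetical, and skipping blank lines is an assumption of this sketch.

```rust
use std::io::{self, BufRead};
use std::path::PathBuf;

/// Collect the paths listed in a fofn: one path per line.
/// Hypothetical helper; skipping blank lines is an assumption of this sketch.
fn read_fofn<R: BufRead>(reader: R) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let trimmed = line.trim();
        if !trimmed.is_empty() {
            paths.push(PathBuf::from(trimmed));
        }
    }
    Ok(paths)
}

fn main() -> io::Result<()> {
    // In-memory stand-in for a fofn file; the file names are made up.
    let fofn = "reads_1.fastq\nreads_2.fastq.gz\n\nassembly.fa\n";
    for path in read_fofn(fofn.as_bytes())? {
        println!("{}", path.display());
    }
    Ok(())
}
```

With kseq itself none of this is needed, since parse_path accepts the fofn path directly.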

Using kseq is simple: call parse_path to parse a path or parse_reader to parse a reader, then call the iter_record method to get each record.

  • parse_path: This function takes a path that implements AsRef<std::path::Path> as input. The path can be a fastx file, - for io::stdin, or a fofn file. It returns a Result type:

    • Ok(T): A struct T with the iter_record method.
    • Err(E): An error E, such as missing input, a file that cannot be opened or read, a wrong fastx format, or an invalid path or file.
  • parse_reader: This function takes a reader that implements std::io::Read as input. It returns a Result type:

    • Ok(T): A struct T with the iter_record method.
    • Err(E): An error E, such as missing input, a file that cannot be opened or read, a wrong fastx format, or an invalid path or file.
  • iter_record: This method can be called in a loop. It returns a Result<Option<Record>> type:

    • Ok(Some(Record)): A struct Record with methods:

      • head -> &str: get sequence id/identifier
      • seq -> &str: get sequence
      • des -> &str: get sequence description/comment
      • sep -> &str: get separator
      • qual -> &str: get quality scores
      • len -> usize: get sequence length

      Note: des, sep and qual return "" if the Record doesn't have these attributes.

    • Ok(None): Stream has reached EOF.

    • Err(ParseError): A ParseError, including IO, TruncateFile, InvalidFasta and InvalidFastq errors.
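The Result<Option<Record>> shape described above can be illustrated with a toy, standard-library-only reader. This is a sketch under stated assumptions and not kseq's implementation: `FastqReader`, `read_line_into` and `summarize` are hypothetical names, the reader only handles four-line FASTQ records, and it reports errors as io::Error instead of ParseError. It shows one plausible reason for a loop method over a std Iterator: each record borrows from buffers that the reader reuses, so it must be dropped before the next call.

```rust
use std::io::{self, BufRead};

/// Hypothetical record view; it borrows from the reader's reused buffers.
struct Record<'a> {
    header: &'a str, // header line without the leading '@'
    seq: &'a str,
    sep: &'a str,
    qual: &'a str,
}

impl<'a> Record<'a> {
    /// Identifier: the header text up to the first whitespace.
    fn head(&self) -> &'a str {
        self.header.split_whitespace().next().unwrap_or("")
    }
    /// Description: the text after the identifier, or "" if there is none.
    fn des(&self) -> &'a str {
        match self.header.split_once(char::is_whitespace) {
            Some((_, rest)) => rest.trim(),
            None => "",
        }
    }
    fn seq(&self) -> &'a str { self.seq }
    fn sep(&self) -> &'a str { self.sep }
    fn qual(&self) -> &'a str { self.qual }
    fn len(&self) -> usize { self.seq.len() }
}

/// Toy four-line FASTQ reader that reuses its buffers for every record.
struct FastqReader<R> {
    inner: R,
    header: String,
    seq: String,
    sep: String,
    qual: String,
}

/// Read one line into `buf` without its line ending; Ok(false) means EOF.
fn read_line_into<R: BufRead>(inner: &mut R, buf: &mut String) -> io::Result<bool> {
    buf.clear();
    let n = inner.read_line(buf)?;
    while buf.ends_with('\n') || buf.ends_with('\r') {
        buf.pop();
    }
    Ok(n > 0)
}

impl<R: BufRead> FastqReader<R> {
    fn new(inner: R) -> Self {
        FastqReader {
            inner,
            header: String::new(),
            seq: String::new(),
            sep: String::new(),
            qual: String::new(),
        }
    }

    /// Ok(Some(record)) for a record, Ok(None) at EOF, Err on bad input.
    fn iter_record(&mut self) -> io::Result<Option<Record<'_>>> {
        if !read_line_into(&mut self.inner, &mut self.header)? {
            return Ok(None); // clean EOF before a new record starts
        }
        let complete = read_line_into(&mut self.inner, &mut self.seq)?
            && read_line_into(&mut self.inner, &mut self.sep)?
            && read_line_into(&mut self.inner, &mut self.qual)?;
        if !complete || !self.header.starts_with('@') {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "invalid or truncated FASTQ record",
            ));
        }
        Ok(Some(Record {
            header: &self.header[1..],
            seq: &self.seq,
            sep: &self.sep,
            qual: &self.qual,
        }))
    }
}

/// Copy out (head, des, len) per record, propagating errors with `?`.
fn summarize(data: &str) -> io::Result<Vec<(String, String, usize)>> {
    let mut records = FastqReader::new(data.as_bytes());
    let mut out = Vec::new();
    while let Some(record) = records.iter_record()? {
        out.push((record.head().to_string(), record.des().to_string(), record.len()));
    }
    Ok(out)
}

fn main() -> io::Result<()> {
    let data = "@r1 first read\nACGT\n+\nIIII\n@r2\nGG\n+\nII\n";
    let mut records = FastqReader::new(data.as_bytes());
    while let Some(record) = records.iter_record()? {
        println!("head:{} des:{:?} seq:{} sep:{} qual:{} len:{}",
            record.head(), record.des(), record.seq(),
            record.sep(), record.qual(), record.len());
    }
    println!("{} records", summarize(data)?.len());
    Ok(())
}
```

Because each record borrows from the reader, anything that must outlive the loop iteration has to be copied out, as `summarize` does with `to_string`.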

Example

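use std::env::args;
// Only needed for the parse_reader alternative below:
// use std::fs::File;
// use kseq::parse_reader;
use kseq::parse_path;

fn main() {
	// First command-line argument: a fastx file, "-" for stdin, or a fofn file.
	let path: String = args().nth(1).expect("missing input path");
	let mut records = parse_path(path).unwrap();
	// let mut records = parse_reader(File::open(path).unwrap()).unwrap();
	// iter_record returns Ok(None) at EOF, which ends the loop.
	while let Some(record) = records.iter_record().unwrap() {
		println!("head:{} des:{} seq:{} qual:{} len:{}",
			record.head(), record.des(), record.seq(),
			record.qual(), record.len());
	}
}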

Installation

cargo add kseq

Benchmarking

cargo bench
