mazu

A Rust library for building modular, fast and compact indexes over genomic data

Mazu (媽祖)... revered as a tutelary deity of seafarers, including fishermen and sailors...

Disclaimer: this library is in alpha and under active development.

Highlights

  1. Query-ready indexes via plug-and-play k-mer-to-unitig and unitig-to-occurrence mappings.
  2. Load-only compatibility with pufferfish: deserialize pufferfish indices and work with them in Rust.
  3. Streaming queries for generic indexes, for free, with .as_streaming().
  4. An easy test bed for new compression algorithms for unitig occurrences and k-mer dictionaries.
  5. No more CMake.
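To make the first highlight concrete, here is a minimal, standard-library-only sketch of how a k-mer-to-unitig mapping and a unitig-to-occurrence mapping can compose into a queryable index. Every name in it (`KmerToUnitig`, `UnitigToOcc`, `HashDict`, `VecOccs`, `ToyIndex`) is hypothetical and chosen for illustration; none of this is mazu's API, and the sketch ignores orientation (reverse complements) entirely.

```rust
use std::collections::HashMap;

/// Where a k-mer sits in the unitig set: which unitig, and at what offset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UnitigHit {
    unitig_id: usize,
    offset: usize,
}

/// One occurrence of a unitig (or k-mer) on a reference sequence.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Occurrence {
    ref_id: usize,
    pos: usize,
}

/// First layer (hypothetical trait): k-mer -> unitig.
trait KmerToUnitig {
    fn lookup(&self, kmer: &str) -> Option<UnitigHit>;
}

/// Second layer (hypothetical trait): unitig -> occurrences on references.
trait UnitigToOcc {
    fn occurrences(&self, unitig_id: usize) -> &[Occurrence];
}

/// A naive k-mer dictionary backed by a hash map.
struct HashDict {
    table: HashMap<String, UnitigHit>,
}

impl HashDict {
    /// Index every k-mer of every unitig (ASCII sequences assumed).
    fn from_unitigs(unitigs: &[&str], k: usize) -> Self {
        let mut table = HashMap::new();
        for (unitig_id, u) in unitigs.iter().enumerate() {
            if u.len() < k {
                continue; // unitig too short to contain a k-mer
            }
            for offset in 0..=u.len() - k {
                let kmer = u[offset..offset + k].to_string();
                table.insert(kmer, UnitigHit { unitig_id, offset });
            }
        }
        HashDict { table }
    }
}

impl KmerToUnitig for HashDict {
    fn lookup(&self, kmer: &str) -> Option<UnitigHit> {
        self.table.get(kmer).copied()
    }
}

/// A naive occurrence table: one uncompressed list per unitig.
struct VecOccs(Vec<Vec<Occurrence>>);

impl UnitigToOcc for VecOccs {
    fn occurrences(&self, unitig_id: usize) -> &[Occurrence] {
        &self.0[unitig_id]
    }
}

/// The index is generic over both layers, so either can be swapped out.
struct ToyIndex<D, O> {
    dict: D,
    occs: O,
}

impl<D: KmerToUnitig, O: UnitigToOcc> ToyIndex<D, O> {
    /// Reference positions of a k-mer: unitig occurrences shifted by the
    /// k-mer's offset within the unitig.
    fn query(&self, kmer: &str) -> Vec<Occurrence> {
        match self.dict.lookup(kmer) {
            Some(hit) => self
                .occs
                .occurrences(hit.unitig_id)
                .iter()
                .map(|o| Occurrence { ref_id: o.ref_id, pos: o.pos + hit.offset })
                .collect(),
            None => Vec::new(),
        }
    }
}
```

Because `ToyIndex` only depends on the two traits, a more compact dictionary or a compressed occurrence table could replace `HashDict` or `VecOccs` without touching the query logic, which is the kind of plug-and-play composition the highlight describes.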

Examples

// Load a pufferfish index built by the C++ implementation
let p = to_abs_path(YEAST_CHR01_INDEX);
let pi = DenseIndex::deserialize_from_cpp(p).unwrap();
// Extract the unitigs and build an SSHash over them
let unitig_set = pi.as_ref().clone();
let sshash = SSHash::from_unitig_set(unitig_set, 15, 32, WyHashState::default()).unwrap();

// Drop in an SSHash for a new index
let pi = ModIndex::from_parts(
    pi.base.clone(),
    sshash,
    pi.as_u2pos().clone(),
    pi.as_refseqs().clone(),
);

// Generic implementations take care of query and validation
pi.validate_self();

// Attach a streaming cache and drive the index.
let driver = pi.as_streaming();
driver.validate_self();
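
For intuition about what a streaming cache can buy, the sketch below shows one common technique: consecutive k-mers of a read usually fall on the same unitig, so the previous hit can often be extended by one position instead of doing a full dictionary lookup. This is an illustrative assumption about streaming queries in general, not a description of how mazu's .as_streaming() works internally; `ToyDict`, `Streaming`, and `query_read` are hypothetical names, and orientation is ignored.

```rust
use std::collections::HashMap;

/// Toy k-mer dictionary over a set of unitigs (hypothetical, not mazu's types).
struct ToyDict {
    k: usize,
    unitigs: Vec<String>,
    /// k-mer -> (unitig id, offset within that unitig)
    table: HashMap<String, (usize, usize)>,
}

impl ToyDict {
    fn new(unitigs: &[&str], k: usize) -> Self {
        let mut table = HashMap::new();
        for (id, u) in unitigs.iter().enumerate() {
            if u.len() < k {
                continue; // unitig too short to contain a k-mer
            }
            for off in 0..=u.len() - k {
                table.insert(u[off..off + k].to_string(), (id, off));
            }
        }
        ToyDict {
            k,
            unitigs: unitigs.iter().map(|u| u.to_string()).collect(),
            table,
        }
    }

    fn lookup(&self, kmer: &str) -> Option<(usize, usize)> {
        self.table.get(kmer).copied()
    }
}

/// Streaming wrapper: remembers the previous hit and tries to extend it.
struct Streaming<'a> {
    dict: &'a ToyDict,
    last: Option<(usize, usize)>,
    /// Number of queries that needed a full dictionary lookup.
    full_lookups: usize,
}

impl<'a> Streaming<'a> {
    fn new(dict: &'a ToyDict) -> Self {
        Streaming { dict, last: None, full_lookups: 0 }
    }

    fn query(&mut self, kmer: &str) -> Option<(usize, usize)> {
        // Fast path: does the k-mer match the position right after the last hit?
        if let Some((id, off)) = self.last {
            let u = &self.dict.unitigs[id];
            let next = off + 1;
            if next + self.dict.k <= u.len() && &u[next..next + self.dict.k] == kmer {
                self.last = Some((id, next));
                return self.last;
            }
        }
        // Slow path: fall back to the dictionary and refresh the cache.
        self.full_lookups += 1;
        self.last = self.dict.lookup(kmer);
        self.last
    }
}

/// Query every k-mer of a read, left to right, through the streaming wrapper.
fn query_read(s: &mut Streaming<'_>, read: &str) -> Vec<Option<(usize, usize)>> {
    let k = s.dict.k;
    if read.len() < k {
        return Vec::new();
    }
    (0..=read.len() - k).map(|i| s.query(&read[i..i + k])).collect()
}
```

In this toy version, a read that walks along a single unitig costs one full lookup followed by cheap string comparisons; a miss or a jump to another unitig resets the cache and falls back to the dictionary.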