COMBINE-lab/mazu

mazu

A Rust library for building modular, fast and compact indexes over genomic data

Mazu (媽祖)... revered as a tutelary deity of seafarers, including fishermen and sailors...

Disclaimer: This library is in alpha and under active development.

Highlights

  1. Query-ready indexes via plug-and-play k-mer-to-unitig and unitig-to-occurrence mappings.
  2. Load-only compatibility with pufferfish: deserialize pufferfish indexes and work with them in Rust.
  3. Streaming queries for generic indexes, for free, with .as_streaming().
  4. An easy test bed for new compression algorithms for unitig occurrences and k-mer dictionaries.
  5. No more CMake.
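The first highlight can be pictured as an index composed of two interchangeable parts: one that maps a k-mer to a unitig, and one that maps a unitig to its occurrences on the references. The sketch below is only an illustration of that idea and is not mazu's API. The names `KmerToUnitig`, `UnitigToOcc`, `HashK2U`, `VecU2Occ`, `Index` and `toy_index` are hypothetical, and the sketch ignores reverse complements and canonical k-mers.

```rust
use std::collections::HashMap;

/// Hypothetical trait: k-mer -> (unitig id, offset of the k-mer in that unitig).
trait KmerToUnitig {
    fn k(&self) -> usize;
    fn lookup(&self, kmer: &str) -> Option<(usize, usize)>;
}

/// Hypothetical trait: unitig id -> occurrences as (reference id, position).
trait UnitigToOcc {
    fn occurrences(&self, unitig_id: usize) -> &[(usize, usize)];
}

/// Simplest possible k-mer dictionary: a hash map over all k-mers of the unitigs.
struct HashK2U {
    k: usize,
    map: HashMap<String, (usize, usize)>,
}

impl HashK2U {
    fn from_unitigs(unitigs: &[&str], k: usize) -> Self {
        let mut map = HashMap::new();
        for (id, u) in unitigs.iter().enumerate() {
            // Unitigs shorter than k contribute no k-mers.
            for off in 0..(u.len() + 1).saturating_sub(k) {
                map.insert(u[off..off + k].to_string(), (id, off));
            }
        }
        Self { k, map }
    }
}

impl KmerToUnitig for HashK2U {
    fn k(&self) -> usize {
        self.k
    }
    fn lookup(&self, kmer: &str) -> Option<(usize, usize)> {
        self.map.get(kmer).copied()
    }
}

/// Simplest possible occurrence table: one uncompressed list per unitig.
struct VecU2Occ {
    occs: Vec<Vec<(usize, usize)>>,
}

impl UnitigToOcc for VecU2Occ {
    fn occurrences(&self, unitig_id: usize) -> &[(usize, usize)] {
        &self.occs[unitig_id]
    }
}

/// Generic index: any k-mer dictionary combined with any occurrence table.
struct Index<K, U> {
    k2u: K,
    u2occ: U,
}

impl<K: KmerToUnitig, U: UnitigToOcc> Index<K, U> {
    /// Reference positions of a k-mer: unitig occurrence shifted by the k-mer's offset.
    fn locate(&self, kmer: &str) -> Vec<(usize, usize)> {
        debug_assert_eq!(kmer.len(), self.k2u.k());
        match self.k2u.lookup(kmer) {
            Some((uid, off)) => self
                .u2occ
                .occurrences(uid)
                .iter()
                .map(|&(r, p)| (r, p + off))
                .collect(),
            None => Vec::new(),
        }
    }
}

/// Toy data: two unitigs with made-up occurrences on two references.
fn toy_index() -> Index<HashK2U, VecU2Occ> {
    let k2u = HashK2U::from_unitigs(&["ACGTA", "GGCC"], 3);
    let u2occ = VecU2Occ {
        occs: vec![vec![(0, 10), (1, 0)], vec![(0, 20)]],
    };
    Index { k2u, u2occ }
}

fn main() {
    let idx = toy_index();
    println!("{:?}", idx.locate("GTA")); // prints [(0, 12), (1, 2)]
}
```

Because `Index` is generic over both traits, swapping the hash map for a more compact dictionary, or the plain vectors for a compressed occurrence table, would leave `locate` unchanged. That is the kind of substitution the highlights describe.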

Examples

// Load a pufferfish index built by the C++ implementation
let p = to_abs_path(YEAST_CHR01_INDEX);
let pi = DenseIndex::deserialize_from_cpp(p).unwrap();
// Extract the unitigs and build an SSHash over them
let unitig_set = pi.as_ref().clone();
let sshash = SSHash::from_unitig_set(unitig_set, 15, 32, WyHashState::default()).unwrap();

// Assemble a new index with the SSHash dropped in
let pi = ModIndex::from_parts(
    pi.base.clone(),
    sshash,
    pi.as_u2pos().clone(),
    pi.as_refseqs().clone(),
);

// Generic implementations take care of query and validation
pi.validate_self();

// Attach a streaming cache and drive the index.
let driver = pi.as_streaming();
driver.validate_self();
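The example above attaches a streaming cache with `.as_streaming()`. The sketch below illustrates one common way such a cache can work, and it is an assumption rather than a description of mazu's internals: consecutive k-mers of a read overlap by k-1 bases, so the next k-mer often sits one position further along the unitig that was just hit, and checking that position first can avoid a dictionary lookup. The names `Dict` and `Streaming` are hypothetical, and reverse complements are ignored.

```rust
use std::collections::HashMap;

/// Hypothetical k-mer dictionary that also keeps the unitig sequences.
struct Dict {
    k: usize,
    unitigs: Vec<String>,
    map: HashMap<String, (usize, usize)>,
}

impl Dict {
    fn from_unitigs(unitigs: &[&str], k: usize) -> Self {
        let mut map = HashMap::new();
        for (id, u) in unitigs.iter().enumerate() {
            for off in 0..(u.len() + 1).saturating_sub(k) {
                map.insert(u[off..off + k].to_string(), (id, off));
            }
        }
        let unitigs = unitigs.iter().map(|u| u.to_string()).collect();
        Self { k, unitigs, map }
    }
}

/// Hypothetical streaming wrapper: remembers the last hit and tries to extend it.
struct Streaming<'a> {
    dict: &'a Dict,
    last: Option<(usize, usize)>,
    dict_lookups: usize,
}

impl<'a> Streaming<'a> {
    fn new(dict: &'a Dict) -> Self {
        Self { dict, last: None, dict_lookups: 0 }
    }

    /// Look up one k-mer, preferring the position right after the previous hit.
    fn query(&mut self, kmer: &str) -> Option<(usize, usize)> {
        let dict = self.dict;
        if let Some((uid, off)) = self.last {
            let u = &dict.unitigs[uid];
            let next = off + 1;
            // Fast path: the k-mer continues along the cached unitig.
            if next + dict.k <= u.len() && &u[next..next + dict.k] == kmer {
                self.last = Some((uid, next));
                return self.last;
            }
        }
        // Slow path: fall back to the dictionary and refresh the cache.
        self.dict_lookups += 1;
        self.last = dict.map.get(kmer).copied();
        self.last
    }

    /// Query every k-mer of a read, left to right.
    fn query_read(&mut self, read: &str) -> Vec<Option<(usize, usize)>> {
        let k = self.dict.k;
        (0..(read.len() + 1).saturating_sub(k))
            .map(|i| self.query(&read[i..i + k]))
            .collect()
    }
}

fn main() {
    let dict = Dict::from_unitigs(&["ACGTAC", "TTGCA"], 3);
    let mut s = Streaming::new(&dict);
    let hits = s.query_read("ACGTAC");
    // Four k-mers are resolved with a single dictionary lookup.
    println!("{} hits, {} dictionary lookups", hits.len(), s.dict_lookups);
}
```

In this toy setting the results match what a plain dictionary lookup per k-mer would return; only the number of dictionary accesses changes.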
