kmindex is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.
Given a databank
Indexing/Querying example (can be tested in the examples directory):
- Index a dataset:
kmindex build --fof fof1.txt --run-dir D1_index --index ./G --register-as D --hard-min 2 --kmer-size 25 --nb-cell 1000000
- Query the index:
kmindex query --index ./G --fastx query.fasta --zvalue 3
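The two commands above can be wrapped in a small script so that paths and names are set in one place. This is a minimal sketch, not part of kmindex: the variable names, the run helper, and the DRY_RUN switch are illustrative assumptions, and only the kmindex flags are taken from the example. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: parameterize the build and query commands from the example.
# Variable names, run(), and DRY_RUN are hypothetical helpers for illustration;
# the kmindex flags and values are copied from the example above.
set -euo pipefail

FOF="${FOF:-fof1.txt}"          # file of files describing the samples
RUN_DIR="${RUN_DIR:-D1_index}"  # working directory for the build
INDEX="${INDEX:-./G}"           # global index location
NAME="${NAME:-D}"               # name under which the sub-index is registered
QUERY="${QUERY:-query.fasta}"   # sequences to query
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = execute them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

run kmindex build --fof "$FOF" --run-dir "$RUN_DIR" --index "$INDEX" \
    --register-as "$NAME" --hard-min 2 --kmer-size 25 --nb-cell 1000000
run kmindex query --index "$INDEX" --fastx "$QUERY" --zvalue 3
```

Running the script with DRY_RUN=0 executes both steps, which requires kmindex to be installed and the input files to exist.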
Full documentation is available at https://tlemane.github.io/kmindex
Citation: Lemane, Téo, et al. "Indexing and real-time user-friendly queries in terabyte-sized complex genomic datasets with kmindex and ORA." Nature Computational Science 4.2 (2024): 104-109.
A preprint of the paper is available on bioRxiv.