v0.1.0 — research preview

@nuin nuin released this 02 Jun 20:43
· 87 commits to main since this release

Initial research preview of vcfclick.

A small VCF database tool with an embedded ClickHouse engine,
embedded DuckDB annotations, and an MCP natural-language layer.

Quick start

pip install vcfclick

# pull the included 1000 Genomes BRCA1 demo database
vcfclick db pull demo https://github.com/nuin/vcfclick/releases/download/v0.1.0/1000g-brca1-demo.tar.gz

# run a query
vcfclick db query demo "SELECT count(DISTINCT (ingest_id, sample_id)) AS n_samples FROM genotypes WHERE chrom='chr17' AND pos BETWEEN 43044295 AND 43170245"
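
For scripted use, the same query can be issued from Python by shelling out to the CLI. This is a minimal sketch that relies only on the `vcfclick db query <name> "<sql>"` form shown above; the helper names `build_query_argv` and `run_query` are hypothetical, and the output is returned as raw text because its format is not described here.

```python
import subprocess

# SQL copied from the quick-start query above.
SAMPLE_COUNT_SQL = (
    "SELECT count(DISTINCT (ingest_id, sample_id)) AS n_samples "
    "FROM genotypes "
    "WHERE chrom='chr17' AND pos BETWEEN 43044295 AND 43170245"
)


def build_query_argv(db_name: str, sql: str) -> list[str]:
    """Build the argument list for `vcfclick db query <name> "<sql>"`.

    Passing a list (not a shell string) avoids quoting problems in the SQL.
    """
    return ["vcfclick", "db", "query", db_name, sql]


def run_query(db_name: str, sql: str) -> str:
    """Run the CLI and return whatever it writes to stdout, unparsed.

    Assumes `vcfclick` is on PATH and the database has already been pulled.
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(
        build_query_argv(db_name, sql),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_query("demo", SAMPLE_COUNT_SQL)` after the `db pull` step should return the same text the command line prints.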

Assets

  • 1000g-brca1-demo.tar.gz — a ready-to-use vcfclick database bundle
    (1000 Genomes Phase 3 30x BRCA1 region, 3,014 variants × 3,202 samples).
    Restore with vcfclick db pull <name> <url>.
  • vcfclick-0.1.0-py3-none-any.whl — Python wheel.
  • vcfclick-0.1.0.tar.gz — source distribution.

Benchmark

End-to-end on a 1000 Genomes chr17 10 Mb slice (235,768 variants ×
3,202 samples, 44.99M sparse calls), vcfclick parallel-8 ingests in
69s, about 70× faster than the comparable TileDB-VCF workflow. Full
methodology and caveats are in bench/BENCHMARK.md.
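
To put the benchmark figures in context, the short calculation below derives the size of the full variant-by-sample grid, the share of it covered by the sparse calls, and the ingest throughput. It is a back-of-the-envelope sketch that uses only the numbers quoted above; reading "sparse calls" as the stored subset of the full grid is an assumption, and bench/BENCHMARK.md remains the authoritative source.

```python
# Figures quoted in the benchmark paragraph.
n_variants = 235_768
n_samples = 3_202
sparse_calls = 44.99e6      # "44.99M sparse calls"
ingest_seconds = 69

# Size of the full variant x sample grid if every cell were stored.
dense_cells = n_variants * n_samples

# Assumption: sparse calls are the stored subset of that grid.
density = sparse_calls / dense_cells

# Ingest throughput over the reported wall-clock time.
calls_per_second = sparse_calls / ingest_seconds
variants_per_second = n_variants / ingest_seconds

print(f"dense grid cells:    {dense_cells:,}")
print(f"sparse fraction:     {density:.2%}")
print(f"calls per second:    {calls_per_second:,.0f}")
print(f"variants per second: {variants_per_second:,.0f}")
```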