v0.1.0 — scikit-bio for DuckDB/SQL
First release of vgi-scikit-bio, a VGI worker exposing
scikit-bio to DuckDB/SQL — ~90 functions across 5 schemas:
- sequence: GC content, reverse complement, complement, transcription,
  translation (incl. six-frame), validation, sequence distances, k-mer & residue
  composition
- alignment: global/local pairwise alignment (scores and aligned strings)
- diversity: the full alpha-diversity metric family (aggregates),
  beta-diversity distance matrices, phylogenetic Faith's PD & UniFrac, and
  rarefaction
- stats: PCA/CA/PCoA ordination, PERMANOVA/ANOSIM/Mantel tests,
  CLR/ILR/ALR (and inverse) compositional transforms, and ANCOM /
  Dirichlet-multinomial differential abundance
- tree: neighbour joining / UPGMA / minimum evolution, Newick inspection,
  and tree comparison (Robinson–Foulds, cophenetic)
Runs over stdio (DuckDB spawns it) or HTTP; a multi-arch container image is
published to ghcr.io/query-farm/vgi-scikit-bio. MIT licensed.
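-- Install and load the vgi extension from the DuckDB community repository
INSTALL vgi FROM community; LOAD vgi;
-- Attach the worker under the name skbio
ATTACH 'skbio' (TYPE vgi, LOCATION 'vgi-scikit-bio');
-- Call a function from the sequence schema
SELECT skbio.sequence.gc_content('ATGCGGATTACAGG');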
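
For readers who want to sanity-check the result of the gc_content query above, here is a minimal pure-Python sketch of the quantity it is expected to report. It assumes GC content means the fraction of bases that are G or C; the helper name gc_fraction is hypothetical and is not part of vgi-scikit-bio or scikit-bio, and the worker's exact handling of degenerate bases or return scaling is not confirmed by these notes.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (case-insensitive).

    Hypothetical helper for illustration only; it ignores degenerate
    IUPAC codes, which the real function may treat differently.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Same sequence as in the SQL example: 5 G + 2 C out of 14 bases.
print(gc_fraction("ATGCGGATTACAGG"))  # prints 0.5
```

Under the assumption above, the SQL call on the same sequence should agree with this value.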