Releases · aayush4vedi/drift-spark

First public release of Drift — a Spark-native embedding lifecycle library.

Added

embed() — declarative batch embedding with cross-run dedup (MD5 hash scoped to (model, sink)), batching, exponential backoff, idempotent point IDs, and per-run cost tracking. shadow_mode=True runs with deterministic mock vectors and no API key.
watch() — incremental CDC refresh over the Delta Change Data Feed; handles insert / update_postimage / delete and writes the version watermark back to the ledger.
migrate() — model upgrades via two strategies, dual-write and drift-adapter (Orthogonal Procrustes), gated on ARR ≥ 0.97 (AdapterQualityError when the gate is not met).
Ledger — SQLite lineage ledger (~/.drift/ledger.db) with cost_by_model(), provenance(), and recent_runs().
drift CLI — embed, watch, migrate, status.
Qdrant and pgvector sinks (pgvector write-only for now).
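
The embed() entry above mentions an MD5 hash scoped to (model, sink), idempotent point IDs, and exponential backoff. The sketch below shows one way those three pieces can fit together, using only the Python standard library. It is not Drift's actual implementation: the helper names (dedup_key, point_id, backoff_delays, filter_new), the length-prefixed hashing layout, the UUID form of the ID, and the base/cap values are all assumptions made for illustration.

```python
import hashlib
import uuid


def dedup_key(text: str, model: str, sink: str) -> str:
    """MD5 over the text, scoped to (model, sink).

    Hypothetical layout: each part is length-prefixed so that
    ("ab", "c") and ("a", "bc") cannot collide by concatenation.
    """
    h = hashlib.md5()
    for part in (model, sink, text):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def point_id(key: str) -> str:
    """Derive a stable ID from the dedup key (assumed scheme).

    An MD5 hex digest is 32 hex characters, which is exactly the
    size of a UUID, so the same input always maps to the same ID
    and a re-run overwrites the point instead of duplicating it.
    """
    return str(uuid.UUID(hex=key))


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays in seconds for each retry: base * 2**attempt, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def filter_new(texts: list[str], seen: set[str], model: str, sink: str) -> list[str]:
    """Keep only texts whose key was not recorded by an earlier run."""
    fresh = []
    for text in texts:
        key = dedup_key(text, model, sink)
        if key not in seen:
            seen.add(key)
            fresh.append(text)
    return fresh


if __name__ == "__main__":
    seen: set[str] = set()
    first = filter_new(["hello", "world"], seen, "model-a", "qdrant")
    second = filter_new(["hello", "again"], seen, "model-a", "qdrant")
    print(first, second)
    print(point_id(dedup_key("hello", "model-a", "qdrant")))
    print(backoff_delays(6))
```

Because the key includes the model and sink, the same text is embedded again after a model change or when written to a different sink, while repeated runs against the same pair skip it.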