v0.2.0

@github-actions github-actions released this 05 May 13:59
· 28 commits to main since this release
1234b18

Added

  • CLI run variants via ads-bib run --from-run <run-id-or-path> --set ..., including automatic stage planning, --dry-run, downstream artifact hydration for visualization/citation-only variants, and optional variant provenance in run_summary.yaml.

Changed

  • Final dataset bundle exports now clean publication/reference keys, prune dangling reference IDs, and remove placeholder or duplicate author UIDs before writing public Parquet outputs and the dataset manifest.
  • New runs now use a modular artifact layout under runs/<run_id>/data/, with run-local stage restart points (search, export, translated, tokenized, and) plus final dataset and citations outputs; artifact_layout_version: 2 is recorded in run_summary.yaml.
  • Translated and tokenized snapshots now carry metadata fingerprints, so variants with a changed config do not reuse snapshots built from a stale source/config combination. When AND is enabled, runs let ads-and validate its own cache metadata instead of loading disambiguated snapshots directly.
  • OpenRouter embedding defaults now use larger documented batch sizes, the OpenRouter preset pins Toponymy-internal embeddings to Qwen3, and OpenRouter retries now fail fast on non-retryable request/auth/payment errors.
  • Tokenized snapshot metadata now stores AND source fingerprints so validated AND cache hits can avoid recomputing expensive frame fingerprints on future runs.
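
The fingerprint entries above describe a cache-validation idea that can be sketched in a few lines. The following is only an illustration under stated assumptions, not the ads-bib implementation: the function names, the "fingerprint" metadata key, and the source/config values are all hypothetical. It hashes a canonical JSON form of the source identifier and config, then reuses a snapshot only when the stored digest matches.

```python
import hashlib
import json


def metadata_fingerprint(source_id: str, config: dict) -> str:
    """Return a stable SHA-256 hex digest for a source/config combination.

    Hypothetical helper: the real fingerprint inputs used by ads-bib are
    not described in these notes.
    """
    # sort_keys gives a canonical serialization, so dict key order
    # does not change the digest.
    payload = json.dumps(
        {"source": source_id, "config": config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def can_reuse_snapshot(snapshot_meta: dict, source_id: str, config: dict) -> bool:
    """Reuse a snapshot only if its stored fingerprint matches the current one."""
    return snapshot_meta.get("fingerprint") == metadata_fingerprint(source_id, config)


# Hypothetical snapshot metadata written by an earlier run.
meta = {"fingerprint": metadata_fingerprint("export-v1", {"model": "a", "batch": 8})}

same_config = can_reuse_snapshot(meta, "export-v1", {"batch": 8, "model": "a"})
changed_config = can_reuse_snapshot(meta, "export-v1", {"model": "b", "batch": 8})
missing_meta = can_reuse_snapshot({}, "export-v1", {"model": "a", "batch": 8})
```

Here same_config is true because only the key order differs, while changed_config and missing_meta are false, so a changed-config variant or a snapshot without metadata would be recomputed rather than reused. Storing the digest alongside the snapshot is also what would let a later run skip recomputing an expensive fingerprint when the cached metadata has already been validated.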