Added
CLI run variants via ads-bib run --from-run <run-id-or-path> --set ..., including automatic stage planning, --dry-run, downstream artifact hydration for visualization/citation-only variants, and optional variant provenance in run_summary.yaml.
Changed
Final dataset bundle exports now clean publication/reference keys, prune dangling reference IDs, and remove placeholder or duplicate author UIDs before writing public Parquet outputs and the dataset manifest.
New runs now use a modular artifact layout under runs/<run_id>/data/, with run-local stage restart points (search, export, translated, tokenized, and AND) plus final dataset and citations outputs; artifact_layout_version: 2 is recorded in run_summary.yaml.
Translated and tokenized snapshots now carry metadata fingerprints so that changed-config variants do not reuse stale source/config combinations. Runs with AND enabled now let ads-and validate its own cache metadata instead of loading disambiguated snapshots directly.
OpenRouter embedding defaults now use larger documented batch sizes, the OpenRouter preset pins Toponymy-internal embeddings to Qwen3, and OpenRouter retries now fail fast on non-retryable request/auth/payment errors.
Tokenized snapshot metadata now stores AND source fingerprints so validated AND cache hits can avoid recomputing expensive frame fingerprints on future runs.
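The fingerprint checks described in the entries above can be illustrated with a minimal sketch. This is not the ads-bib implementation: the helper names `fingerprint` and `snapshot_is_reusable`, and the metadata keys `source_fingerprint` and `config_fingerprint`, are assumptions made up for illustration. The idea is only that a snapshot records a digest of the inputs it was built from and is reused when the current inputs hash to the same values.

```python
import hashlib
import json


def fingerprint(payload: dict) -> str:
    """Return a stable SHA-256 hex digest of a JSON-serializable mapping.

    Keys are sorted so that logically equal mappings hash identically.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def snapshot_is_reusable(snapshot_meta: dict, source_meta: dict, config: dict) -> bool:
    """Reuse a snapshot only if both recorded fingerprints match the current inputs.

    The metadata keys used here are hypothetical, not the real ads-bib schema.
    """
    return (
        snapshot_meta.get("source_fingerprint") == fingerprint(source_meta)
        and snapshot_meta.get("config_fingerprint") == fingerprint(config)
    )
```

Storing the fingerprints alongside the snapshot means a later run can compare small strings rather than re-hash the full data frame, which is the kind of saving the last entry describes for validated AND cache hits.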