v1.3.0

@ayoub-ibm released this 15 Feb 21:05
· 62 commits to main since this release

Features

Delta Extraction (new)
  • Add opt-in delta contract with flat graph IR batching, global merge/dedup, and template projection controls (0b19e08)
  • Update unit and integration coverage for delta batching/merge/projection and contract routing (8150def)
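
As a rough illustration of what global merge/dedup over flat graph IR batches can look like, here is a minimal sketch. The batch shape (`nodes` with `type`/`id`/`props`, `edges` with `source`/`label`/`target`) and the function name `merge_batches` are assumptions made for this example, not the project's actual contract.

```python
def merge_batches(batches):
    """Merge flat node/edge batches into one deduplicated graph.

    Hypothetical batch shape (an assumption for this sketch):
      {"nodes": [{"type": str, "id": str, "props": dict}],
       "edges": [{"source": str, "label": str, "target": str}]}
    """
    nodes = {}  # (type, id) -> merged node
    edges = {}  # (source, label, target) -> edge, in first-seen order

    for batch in batches:
        for node in batch.get("nodes", []):
            key = (node["type"], node["id"])
            merged = nodes.setdefault(
                key, {"type": node["type"], "id": node["id"], "props": {}}
            )
            for name, value in node.get("props", {}).items():
                # Deterministic rule: the first non-null value seen wins.
                if value is not None:
                    merged["props"].setdefault(name, value)

        for edge in batch.get("edges", []):
            key = (edge["source"], edge["label"], edge["target"])
            edges.setdefault(key, dict(edge))

    return {"nodes": list(nodes.values()), "edges": list(edges.values())}
```

Because batches are processed in order and the first value wins, the same input always yields the same merged graph; a real implementation may well use a different conflict rule.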
LLM Extraction Pipeline
  • Harden LLM pipeline with contract dispatch, staged extraction, deterministic merge, and observability (92a5089)
  • Improve catalog definition, flatten ID discovery, and add validation retries (a1aba89)
Structured Output (default ON)
  • Enable default schema-enforced structured output via LiteLLM with prompt-schema fallback (6e96f54)
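
The fallback idea can be sketched as follows: try schema-enforced output first, and if the model or provider does not support it, embed the schema in the prompt instead. The callables `call_structured` and `call_plain` and the exception `StructuredOutputUnsupported` are hypothetical stand-ins for this example; they are not LiteLLM or project APIs.

```python
import json


class StructuredOutputUnsupported(Exception):
    """Hypothetical error: the backend rejected schema enforcement."""


def extract_json(prompt, schema, call_structured, call_plain):
    """Return parsed JSON, preferring schema-enforced output.

    call_structured(prompt, schema) -> str and call_plain(prompt) -> str
    are injected so the sketch stays independent of any client library.
    """
    try:
        raw = call_structured(prompt, schema)
    except StructuredOutputUnsupported:
        # Fallback: describe the schema in the prompt itself.
        fallback_prompt = (
            f"{prompt}\n\nRespond with JSON only, matching this schema:\n"
            f"{json.dumps(schema, sort_keys=True)}"
        )
        raw = call_plain(fallback_prompt)
    return json.loads(raw)
```

Injecting the two callables keeps the routing logic testable without a network call; the fallback path still depends on the model following the prompt, so the result may need validation.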
Custom LLM Endpoints
  • Support custom OpenAI-compatible endpoints via env-based auth and init scaffolding (0bebc44)
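
Env-based auth for an OpenAI-compatible endpoint might be resolved along these lines. The variable names `CUSTOM_LLM_BASE_URL` and `CUSTOM_LLM_API_KEY` and the helper `load_endpoint_config` are invented for this sketch; consult the project documentation for the real names.

```python
import os


def load_endpoint_config(env=None):
    """Read endpoint settings from the environment (hypothetical names)."""
    env = os.environ if env is None else env
    base_url = env.get("CUSTOM_LLM_BASE_URL", "").strip()
    api_key = env.get("CUSTOM_LLM_API_KEY", "").strip()

    required = (("CUSTOM_LLM_BASE_URL", base_url), ("CUSTOM_LLM_API_KEY", api_key))
    missing = [name for name, value in required if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Normalize the URL so callers can append paths without double slashes.
    return {"base_url": base_url.rstrip("/"), "api_key": api_key}
```

Keeping the key in the environment rather than in a config file avoids committing secrets alongside extraction templates.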

Refactoring

Input ingestion
  • Unify ingestion via Docling conversion with DoclingDocument passthrough (689426b)
Trace & debug
  • Revamp debug trace_data into a chronological event log (4ba4b5b)
  • Improve stage naming and split serializer into helpers (0378f65)

Documentation

GitHub Pages & traces
  • Refresh pages with updated output handling and debug artifacts (07e0cbc)
Delta extraction
  • Document delta extraction contract (flat graph IR), config/CLI flags, and migration notes (66aa6be)
Staged extraction
  • Update staged extraction docs, schema definition and performance tuning guides (bfabcbb)

Bug Fixes

Delta Extraction Quality
  • Prevent spurious list-entity nodes by adding identity allowlists and post-merge filtering (f45f790)
  • Improve entity ID quality by limiting index-based ID inference and enabling content-based dedup (4767e26)
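
One way to picture these two fixes together: derive an entity ID from an allowlisted set of identity fields, and decline to assign an ID when those fields are missing instead of falling back to a list index. The function `content_based_id`, its parameters, and the hashing scheme are assumptions made for illustration, not the project's implementation.

```python
import hashlib


def content_based_id(entity_type, props, identity_fields):
    """Build a stable ID from allowlisted identity fields.

    Returns None when any identity field is missing or blank, so the
    caller can filter the entity rather than invent an index-based ID.
    """
    parts = []
    for name in identity_fields:
        value = props.get(name)
        if value is None or not str(value).strip():
            return None
        # Normalize case and whitespace so trivial variants deduplicate.
        parts.append(" ".join(str(value).lower().split()))

    if not parts:
        return None

    digest = hashlib.sha1("|".join([entity_type, *parts]).encode("utf-8")).hexdigest()
    return f"{entity_type}:{digest[:12]}"
```

Two mentions of the same entity with different spacing or casing then map to the same ID, which lets a later merge step collapse them into one node.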
Continuous Integration
  • Remove unused mypy ignores for rapidfuzz and spacy imports (9fa8f75)