goldenmatch v1.13.0

benzsevern released this 11 May 04:16
· 771 commits to main since this release
5ebca19

Release plumbing wave. No algorithm changes - DQbench / Febrl3 / NCVR / DBLP-ACM numbers unchanged from v1.12.0.

Added

  • Typed accessor API on MatchkeyConfig / MatchkeyField (PR #151). New properties: MatchkeyConfig.fuzzy_threshold, MatchkeyField.fuzzy_scorer, MatchkeyField.fuzzy_weight, MatchkeyField.resolved_field. Each raises ValueError when the matchkey is not a fuzzy/weighted type, so the invariant is now enforceable in pyright strict.
```python
from goldenmatch.config.schemas import MatchkeyConfig, MatchkeyField

mk = MatchkeyConfig(
    name="identity",
    type="weighted",
    threshold=0.85,
    fields=[
        MatchkeyField(
            field="name",
            transforms=["lowercase"],
            scorer="jaro_winkler",
            weight=1.0,
        ),
    ],
)
assert mk.fuzzy_threshold == 0.85  # safe on a weighted matchkey
# Accessing mk.fuzzy_threshold on an exact matchkey raises ValueError.
```
  • docs/scale-envelope.md (PR #149): Polars / DuckDB / Ray operating ranges plus block-size failure modes.
  • Postgres CI lane (PR #144): flipped from skipped to live.
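The typed accessors in the first bullet follow a common pattern: a property returns a non-optional value or raises. The sketch below illustrates why that helps a strict type checker. It is a hypothetical stand-in, not goldenmatch's implementation. `SketchMatchkey`, `FUZZY_TYPES`, and the `"fuzzy"` type name are assumptions made for illustration; only `"weighted"`, `fuzzy_threshold`, and the `ValueError` behavior come from the notes above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed set of matchkey types that carry a threshold. Only "weighted"
# appears in the release notes; "fuzzy" is a guess made for this sketch.
FUZZY_TYPES = frozenset({"weighted", "fuzzy"})


@dataclass
class SketchMatchkey:
    """Hypothetical stand-in for a matchkey config (not the real class)."""

    name: str
    type: str
    threshold: Optional[float] = None  # left unset for exact matchkeys here

    @property
    def fuzzy_threshold(self) -> float:
        """Return the threshold as a plain float, or raise if it does not apply."""
        if self.type not in FUZZY_TYPES or self.threshold is None:
            raise ValueError(
                f"matchkey {self.name!r} of type {self.type!r} has no fuzzy threshold"
            )
        # Past the guard, a type checker sees float rather than Optional[float].
        return self.threshold


weighted = SketchMatchkey(name="identity", type="weighted", threshold=0.85)
exact = SketchMatchkey(name="email", type="exact")

print(weighted.fuzzy_threshold)
try:
    exact.fuzzy_threshold
except ValueError as err:
    print(f"ValueError: {err}")
```

Callers that read `fuzzy_threshold` get a `float` without needing a cast or a suppression comment at each call site, which is presumably the kind of workaround the accessors replaced.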

Changed

  • PyPI metadata corrected (PR #148): [project.urls] Homepage / Repository / Documentation now point at the monorepo. This release is what makes the refresh land on PyPI.
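One way to check the corrected links after upgrading is to read the `Project-URL` entries from the installed package metadata with the standard library. This is a minimal sketch that assumes the distribution is installed under the name `goldenmatch`. The helper `project_urls` is a hypothetical name, and the script reads local metadata rather than querying PyPI.

```python
from importlib.metadata import PackageNotFoundError, metadata


def project_urls(dist_name: str) -> dict[str, str]:
    """Return {label: url} from an installed distribution's Project-URL fields."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return {}  # distribution is not installed in this environment
    urls: dict[str, str] = {}
    # Each Project-URL entry has the form "Label, https://example.invalid/...".
    for entry in meta.get_all("Project-URL") or []:
        label, _, url = entry.partition(",")
        urls[label.strip()] = url.strip()
    return urls


# "goldenmatch" as the distribution name is an assumption of this sketch.
for label, url in sorted(project_urls("goldenmatch").items()):
    print(f"{label}: {url}")
```

If nothing is printed, either the package is not installed in the active environment or the installed version carries no project URLs.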

Fixed

  • Reproducibility of all four published benchmark numbers (PR #152, replaces #150): DQbench composite 91.04, DBLP-ACM 0.9641, Febrl3 0.9443, NCVR 0.9719 all reproduce from a fresh clone. See docs/reproducing-benchmarks.md.

Internal (contributors only)

  • Ruff lint expanded to F / I / B-narrowed / UP rule sets across packages/python/ (PR #146).
  • Pyright strict on the 21-file core slice of goldenmatch (PR #147). Typed accessors in PR #151 eliminated 7 type-suppression workarounds.

Benchmarks (zero-config, no LLM)

Unchanged vs v1.12.0 - algorithm not touched this wave.

| Dataset | v1.12.0 | v1.13.0 | Delta |
|---|---|---|---|
| DBLP-ACM | 0.9641 | 0.9641 | +0.0000 |
| Febrl3 | 0.9443 | 0.9443 | +0.0000 |
| NCVR | 0.9719 | 0.9719 | +0.0000 |
| DQbench composite | 91.04 | 91.04 | +0.00 |