
v1.18.0 -- intelligent golden rules + custom plugin slot

benzsevern released this 22 May 13:05
372e9f0

goldenmatch 1.18.0 -- 2026-05-22

Addresses all four follow-up items from the golden-field
consolidation discussion + adds post-cluster auto-config refinement.

Three new built-in strategies

  • longest_value -- longest non-null string. Free-text use case.
  • unanimous_or_null -- emit value only on full agreement; emit
    None on any disagreement. Compliance use case.
  • confidence_majority -- majority weighted by cluster pair_scores.
    Strong-edge minority can beat weak-edge majority.
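
A minimal sketch of the three strategies, not the shipped implementation. Assumptions: the function signatures are hypothetical, nulls are ignored when checking agreement, pair_scores maps (i, j) record-index pairs to a score, and each record's vote weight is the mean score of the edges touching it.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Tuple


def longest_value(values: Sequence[Optional[str]]) -> Optional[str]:
    """Longest non-null string; ties keep the first one seen."""
    candidates = [v for v in values if v is not None]
    return max(candidates, key=len) if candidates else None


def unanimous_or_null(values: Sequence[Optional[str]]) -> Optional[str]:
    """Emit a value only when all non-null values agree (assumption: nulls are ignored)."""
    distinct = {v for v in values if v is not None}
    return next(iter(distinct)) if len(distinct) == 1 else None


def confidence_majority(
    values: Sequence[Optional[str]],
    pair_scores: Dict[Tuple[int, int], float],
) -> Optional[str]:
    """Majority vote where each record votes with the mean score of its edges."""
    edge_scores = defaultdict(list)
    for (i, j), score in pair_scores.items():
        edge_scores[i].append(score)
        edge_scores[j].append(score)

    votes = defaultdict(float)
    for idx, value in enumerate(values):
        if value is None:
            continue
        scores = edge_scores.get(idx)
        # Assumption: a record with no scored edges contributes zero weight.
        votes[value] += sum(scores) / len(scores) if scores else 0.0
    return max(votes, key=votes.get) if votes else None


# Three weakly linked records say "Acme"; two strongly linked ones say "ACME Corp".
values = ["Acme", "Acme", "Acme", "ACME Corp", "ACME Corp"]
pair_scores = {(0, 1): 0.3, (0, 2): 0.3, (1, 2): 0.3, (2, 3): 0.9, (3, 4): 0.95}
winner = confidence_majority(values, pair_scores)
```

Here the two-record minority wins because its edges carry more total weight than the three weakly linked records.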

Post-cluster GoldenRulesRefiner

Two-phase auto-config: matchkey + blocking stay pre-cluster
(clustering depends on them); golden-rules picking moves
POST-cluster, where it benefits from real cluster shape signals.

core/golden_rules_refiner.py runs between build_clusters and
build_golden_records when golden_rules.adaptive=True. Reads:

  • Compliance column-name patterns (ssn / npi / tax_id / license /
    dob / mrn / passport / hipaa_id / cusip / lei / isin / etc.)
  • High-cardinality identifier columns
  • Sibling timestamp detection (updated_at / modified_at / etc.
    with > 80% coverage)
  • Within-cluster value spread
  • Per-source completeness (uses the source column if present)
  • Date-column timestamp coverage
  • ColumnProfile signals (col_type, avg_len, null_rate)
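
Two of these signals can be sketched as follows. This is an illustration only: the helper names, the token-boundary matching, and the column layout (a mapping of column name to values) are assumptions. The token list and the 80% coverage threshold come from the notes above, and the "etc." entries are left out.

```python
import re
from typing import Mapping, Optional, Sequence

# Tokens named in the release notes; the real list is longer ("etc.").
COMPLIANCE_TOKENS = (
    "ssn", "npi", "tax_id", "license", "dob", "mrn",
    "passport", "hipaa_id", "cusip", "lei", "isin",
)
TIMESTAMP_NAMES = ("updated_at", "modified_at")


def is_compliance_column(name: str) -> bool:
    """True when a compliance token appears as a whole token in the column name."""
    normalized = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return any(
        re.search(rf"(^|_){re.escape(token)}(_|$)", normalized)
        for token in COMPLIANCE_TOKENS
    )


def find_sibling_timestamp(
    columns: Mapping[str, Sequence[object]],
    min_coverage: float = 0.8,
) -> Optional[str]:
    """Return the first timestamp-like column whose non-null share exceeds min_coverage."""
    for name in TIMESTAMP_NAMES:
        column = columns.get(name)
        if not column:
            continue
        coverage = sum(v is not None for v in column) / len(column)
        if coverage > min_coverage:
            return name
    return None
```

Matching on token boundaries keeps names like lesson_title from tripping the ssn pattern.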

Rule table:

  1. (PRE) compliance column name -> unanimous_or_null
  2. (PRE) col_type=identifier + cardinality > 0.9 -> unanimous_or_null
  3. col_type=date + > 50% cluster coverage -> most_recent
  4. Mutable field + sibling timestamp -> most_recent on the sibling
  5. One source > 1.5x median completeness -> source_priority
  6. Free-text + long + within-cluster disagreement -> longest_value
  7. null_rate > 0.5 -> first_non_null
  8. spread > 2.0 -> confidence_majority
  9. Else -> defer to base default
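
The table reads as a first-match-wins cascade. A sketch under stated assumptions: ColumnSignals and pick_strategy are hypothetical names, the "text" col_type label and the 40-character "long" cutoff are placeholders (the notes give no value), and the signal fields are assumed to be precomputed. Thresholds 0.9, 0.5, 1.5, 0.5, and 2.0 are taken from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ColumnSignals:
    """Hypothetical container for the per-column signals the rules consume."""
    is_compliance_name: bool = False
    col_type: str = "string"
    cardinality: float = 0.0          # distinct values / rows
    cluster_coverage: float = 0.0     # share of clusters with a usable value
    is_mutable: bool = False
    sibling_timestamp: Optional[str] = None
    top_source_ratio: float = 1.0     # best source completeness / median
    avg_len: float = 0.0
    has_disagreement: bool = False
    null_rate: float = 0.0
    spread: float = 0.0               # within-cluster value spread


LONG_TEXT_MIN_LEN = 40.0  # placeholder; the notes do not define "long"


def pick_strategy(s: ColumnSignals) -> Optional[Tuple[str, Optional[str]]]:
    """Return (strategy, date_column) for the first matching rule, or None to defer."""
    if s.is_compliance_name:                                   # rule 1 (PRE)
        return ("unanimous_or_null", None)
    if s.col_type == "identifier" and s.cardinality > 0.9:     # rule 2 (PRE)
        return ("unanimous_or_null", None)
    if s.col_type == "date" and s.cluster_coverage > 0.5:      # rule 3
        return ("most_recent", None)
    if s.is_mutable and s.sibling_timestamp:                   # rule 4
        return ("most_recent", s.sibling_timestamp)
    if s.top_source_ratio > 1.5:                               # rule 5
        return ("source_priority", None)
    if (s.col_type == "text" and s.avg_len >= LONG_TEXT_MIN_LEN
            and s.has_disagreement):                           # rule 6
        return ("longest_value", None)
    if s.null_rate > 0.5:                                      # rule 7
        return ("first_non_null", None)
    if s.spread > 2.0:                                         # rule 8
        return ("confidence_majority", None)
    return None                                                # rule 9: base default
```

Order matters: a sparse column with high spread resolves to first_non_null because rule 7 is checked before rule 8.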

Opt-in via GoldenRulesConfig.adaptive: bool = False (default off).
Default-on is a v1.19 candidate after benchmark validation.

Custom plugin slot

strategy="custom:<name>" looks up a registered GoldenStrategyPlugin.
Rich protocol signature (values + sources + dates + quality_weights
+ pair_scores + rule_kwargs). Defensive defaults: missing plugin OR
plugin exception -> WARNING + most_complete fallback. Opt-in strict
mode via GOLDENMATCH_GOLDEN_STRATEGY_STRICT=1.
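
The lookup-and-fallback behavior can be sketched like this. It is not the goldenmatch API: register, run_custom, and the registry dict are hypothetical, and _fallback is a placeholder standing in for most_complete (it just returns the first non-null value). Only the "custom:<name>" form and the GOLDENMATCH_GOLDEN_STRATEGY_STRICT variable come from the notes.

```python
import logging
import os
from typing import Any, Callable, Dict, Optional, Sequence

logger = logging.getLogger("golden_strategy_sketch")

# Hypothetical in-process registry: plugin name -> callable.
_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str, plugin: Callable[..., Any]) -> None:
    _REGISTRY[name] = plugin


def _fallback(values: Sequence[Optional[Any]]) -> Optional[Any]:
    """Placeholder for the most_complete fallback: first non-null value."""
    return next((v for v in values if v is not None), None)


def run_custom(strategy: str, values: Sequence[Optional[Any]], **context: Any) -> Any:
    """Resolve "custom:<name>" and call the plugin, falling back unless strict."""
    prefix, _, name = strategy.partition(":")
    if prefix != "custom" or not name:
        raise ValueError(f"not a custom strategy: {strategy!r}")

    strict = os.environ.get("GOLDENMATCH_GOLDEN_STRATEGY_STRICT") == "1"
    plugin = _REGISTRY.get(name)
    if plugin is None:
        if strict:
            raise KeyError(f"no plugin registered as {name!r}")
        logger.warning("plugin %r not found; using fallback", name)
        return _fallback(values)

    try:
        # context may carry sources, dates, quality_weights, pair_scores, rule_kwargs.
        return plugin(values=values, **context)
    except Exception:
        if strict:
            raise
        logger.warning("plugin %r raised; using fallback", name, exc_info=True)
        return _fallback(values)


def shout(values, **_ignored):
    """Example plugin: upper-case the first non-null value."""
    first = _fallback(values)
    return first.upper() if first is not None else None


register("shout", shout)
```

With strict mode unset, run_custom("custom:missing", [None, "a"]) logs a warning and returns "a"; with GOLDENMATCH_GOLDEN_STRATEGY_STRICT=1 the same call raises.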

Specs

  • docs/superpowers/specs/2026-05-22-intelligent-golden-rules-design.md
  • docs/superpowers/specs/2026-05-22-golden-strategy-plugin-slot-design.md

Full CHANGELOG: https://github.com/benseverndev-oss/goldenmatch/blob/v1.18.0/packages/python/goldenmatch/CHANGELOG.md