v0.3.1 -- Domain Packs, Evaluate CLI, Incremental Matching, GitHub Actions Try-It

@benzsevern benzsevern released this 23 Mar 00:03

What's New

Domain Packs (7 built-in)

Pre-built YAML rulebooks for instant domain-specific entity resolution:

  • Electronics -- model numbers, SKUs, specs (36 brands)
  • Software -- versions, editions, platforms (23 brands)
  • Healthcare -- NDC, NPI, ICD-10, pharma brands (20 brands)
  • Financial -- CUSIP, ISIN, LEI, institutions (20 brands)
  • Real Estate -- ZIP, APN, MLS, property attributes (10 brokerages)
  • People -- SSN, DOB, phone, email patterns
  • Retail -- UPC, EAN, GTIN, CPG brands (20 brands)

Custom packs: drop a YAML file in .goldenmatch/domains/ and it's auto-discovered.
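
To check which files in a project would be candidates for auto-discovery, a small helper like the one below can list them. This is only a sketch: the function name `list_candidate_packs` is hypothetical, it does not call or reproduce GoldenMatch's own discovery logic, and treating both `.yaml` and `.yml` as accepted extensions is an assumption.

```python
from pathlib import Path


def list_candidate_packs(project_root="."):
    """Return the YAML file names found in <project_root>/.goldenmatch/domains/.

    Hypothetical helper for illustration only; the extensions checked here
    (.yaml and .yml) are an assumption, not documented GoldenMatch behavior.
    """
    domains_dir = Path(project_root) / ".goldenmatch" / "domains"
    if not domains_dir.is_dir():
        # No custom pack directory in this project.
        return []
    return sorted(
        p.name
        for p in domains_dir.iterdir()
        if p.is_file() and p.suffix.lower() in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    print(list_candidate_packs())
```

The helper only reports file names; whether a given file is a valid pack is decided by GoldenMatch when it loads it.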

New CLI Commands

  • goldenmatch evaluate -- measure precision/recall/F1 against a ground-truth CSV
  • goldenmatch incremental -- match new CSV records against an existing base dataset without re-running the full pipeline
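
For readers unfamiliar with the metrics that evaluate reports, the sketch below shows one common way precision, recall, and F1 are computed for entity resolution: by comparing predicted match pairs with ground-truth pairs. It illustrates the standard pairwise definitions only. Whether goldenmatch evaluate scores pairs or clusters is not stated in these notes, and the function names and sample data here are hypothetical.

```python
from itertools import combinations


def pairs_from_clusters(clusters):
    """Expand clusters of record IDs into a set of unordered match pairs."""
    pairs = set()
    for cluster in clusters:
        # Sorting gives each pair a canonical (smaller, larger) order.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pairwise_scores(predicted, truth):
    """Return (precision, recall, f1) for two sets of match pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample data: record IDs grouped into entities.
truth = pairs_from_clusters([{1, 2, 3}, {4, 5}])
predicted = pairs_from_clusters([{1, 2}, {4, 5, 6}])

precision, recall, f1 = pairwise_scores(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this sample, two of the four predicted pairs are correct and two of the four true pairs are found, so all three scores come out to 0.50.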

GitHub Actions "Try It"

Zero-install demo: paste a CSV URL into the workflow_dispatch form and get the deduplication results as a downloadable artifact. No setup required.

Codespaces

One-click dev environment via .devcontainer. Open a Codespace, start coding immediately.

dbt Integration

New dbt-goldenmatch package for DuckDB-based entity resolution in dbt pipelines.

Community

  • GitHub Discussions enabled with seed posts
  • Bug report and feature request issue templates
  • Contributing guide, Code of Conduct, Security policy
  • Download count badge on README

Stats

  • 855 tests passing (+ 6 skipped)
  • 19 CLI commands
  • 268 PyPI downloads in the first 3 days

Install / Upgrade

pip install --upgrade goldenmatch