v1.0.0 -- Production/Stable

benzsevern released this 23 Mar 19:07
· 1612 commits to main since this release

GoldenMatch 1.0.0

Production/Stable. The Beta label is gone. Semver is enforced.

What is GoldenMatch?

An entity resolution toolkit: deduplicate records, match across sources, and run privacy-preserving record linkage. It works on files, on databases, or as a Python library.

pip install goldenmatch

import goldenmatch as gm
result = gm.dedupe("customers.csv", exact=["email"], fuzzy={"name": 0.85})
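
To make the `exact` and `fuzzy` arguments concrete, here is a minimal standard-library sketch of what such a rule could mean. It does not use GoldenMatch and is not its implementation. The sample records, the `is_duplicate` helper, the choice of `difflib` as the similarity measure, and the assumption that exact and fuzzy conditions are combined with AND are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records, not taken from the GoldenMatch docs.
records = [
    {"id": 1, "name": "Jon Smith", "email": "jon@example.com"},
    {"id": 2, "name": "John Smith", "email": "jon@example.com"},
    {"id": 3, "name": "Jane Doe", "email": "jane@example.com"},
]


def is_duplicate(a, b, exact=("email",), fuzzy=None):
    """True if all exact fields are equal and all fuzzy fields meet their threshold.

    Assumption: conditions are ANDed. The real library may combine them differently.
    """
    fuzzy = fuzzy or {}
    if any(a[field] != b[field] for field in exact):
        return False
    return all(
        SequenceMatcher(None, a[field].lower(), b[field].lower()).ratio() >= threshold
        for field, threshold in fuzzy.items()
    )


# Compare every pair of records and keep the IDs of those judged duplicates.
pairs = [
    (a["id"], b["id"])
    for a, b in combinations(records, 2)
    if is_duplicate(a, b, exact=("email",), fuzzy={"name": 0.85})
]
print(pairs)
```

Comparing every pair is quadratic in the number of records, so this only illustrates the matching rule, not how a toolkit would scale it.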

Highlights

Accuracy

  • 97.2% F1 on structured data (DBLP-ACM, zero-config fuzzy)
  • 92.4% F1 on privacy-preserving record linkage (PPRL) person data (FEBRL4, auto-configured Bloom filters)
  • 72.2% F1 on product matching (Abt-Buy, domain extraction + LLM)
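
For readers unfamiliar with the metric, the sketch below shows one common way to compute F1 for entity resolution: pairwise F1 over within-cluster record pairs. The release does not state which F1 variant the benchmarks use, so treating it as pairwise is an assumption, and the `cluster_pairs` and `pairwise_f1` helpers and the toy clusters are hypothetical.

```python
from itertools import combinations


def cluster_pairs(clusters):
    """Expand clusters of record IDs into the set of unordered within-cluster pairs."""
    return {
        tuple(sorted(pair))
        for cluster in clusters
        for pair in combinations(cluster, 2)
    }


def pairwise_f1(predicted, truth):
    """Harmonic mean of pairwise precision and recall; 0.0 if both are zero."""
    pred, gold = cluster_pairs(predicted), cluster_pairs(truth)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction misses that record 3 belongs with records 1 and 2.
truth = [[1, 2, 3], [4, 5]]
predicted = [[1, 2], [3], [4, 5]]
score = pairwise_f1(predicted, truth)
print(f"pairwise F1 = {score:.3f}")
```

Here precision is 1.0 (both predicted pairs are correct) and recall is 0.5 (two of four true pairs found), which gives an F1 of about 0.667.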

Features

  • 21 CLI commands -- dedupe, match, evaluate, label, pprl, incremental, serve, mcp-serve, and more
  • 96 Python exports -- import goldenmatch as gm gives access to every feature
  • In-context LLM clustering -- send blocks of records to GPT/Claude for group clustering with uncertainty scores
  • Privacy-preserving linkage -- Bloom filter encryption, multi-party protocol, auto-configuration that beats manual tuning
  • Ray distributed backend -- scale to 10M+ records with --backend ray
  • 7 domain packs -- healthcare, financial, real estate, people, retail, electronics, software
  • REST API + MCP server -- 10 endpoints, 17 MCP tools for Claude Desktop
  • CI/CD quality gates -- goldenmatch evaluate --min-f1 0.90 for automated pipelines
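
As background for the privacy-preserving linkage bullet, the following is a generic sketch of the Bloom filter technique commonly used in PPRL: each party encodes character bigrams with a shared secret key, and only the encodings are compared. It is not GoldenMatch's implementation. The function names, the filter size, the number of hashes, and the use of HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac


def bigrams(text):
    """Character bigrams of the lowercased text, padded so the edges count too."""
    padded = f"_{text.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(text, secret, size=256, num_hashes=4):
    """Return the set of bit positions that the bigrams of `text` switch on.

    `secret` is a bytes key shared by the linking parties (hypothetical setup).
    """
    bits = set()
    for gram in bigrams(text):
        for k in range(num_hashes):
            digest = hmac.new(secret, f"{k}|{gram}".encode(), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


key = b"shared-secret"  # hypothetical key agreed on out of band
similar = dice(bloom_encode("katherine", key), bloom_encode("catherine", key))
different = dice(bloom_encode("katherine", key), bloom_encode("robert", key))
print(f"similar names: {similar:.2f}, different names: {different:.2f}")
```

Names that share most bigrams set mostly the same bits, so their encodings score high without either party seeing the other's plaintext. Filter size and hash count trade accuracy against resistance to frequency attacks, which is presumably the kind of parameter auto-configuration would choose.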

For Developers

  • Clean API -- gm.dedupe(), gm.match(), gm.pprl_link(), gm.evaluate() with typed results
  • Jupyter display -- rich HTML tables in notebooks
  • REST client -- gm.Client("http://localhost:8000")
  • 7 runnable examples -- copy-paste starting points in examples/
  • 935 tests on Python 3.11/3.12/3.13

API Stability

Public API is frozen. See docs/api-stability.md for the full surface:

  • CLI commands and flags
  • Config YAML schema
  • Python function signatures
  • REST endpoints
  • MCP tools

New features in minor releases. No breaking changes until 2.0.
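
Given that policy, downstream projects can pin with a specifier such as `goldenmatch>=1.0,<2.0`. The sketch below shows the same rule as a runtime check. The `parse_version` and `is_compatible` helpers are hypothetical, are not part of GoldenMatch, and only handle plain `MAJOR.MINOR.PATCH` strings with no pre-release tags.

```python
def parse_version(version):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(installed, required="1.0.0"):
    """True if `installed` shares the major version of `required` and is not older."""
    inst, req = parse_version(installed), parse_version(required)
    return inst[0] == req[0] and inst >= req


for candidate in ("1.0.0", "1.4.2", "2.0.0", "0.9.9"):
    print(candidate, is_compatible(candidate))
```

Under the stated policy, any 1.x release at or above the required version passes, while 2.0.0 is rejected because a new major version may break the API.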

Install

pip install goldenmatch                  # core
pip install "goldenmatch[ray]"           # + distributed backend
pip install "goldenmatch[llm]"           # + LLM scoring
pip install "goldenmatch[embeddings]"    # + sentence-transformers
pip install "goldenmatch[pprl]"          # + secure multi-party computation
pip install "goldenmatch[postgres]"      # + database sync