v1.3.0 — CCMS Cluster Comparison & Sensitivity Analysis
What's New
CCMS Cluster Comparison
Compare two entity resolution (ER) clustering outcomes without ground truth, using the Case Count Metric System (CCMS; Talburt et al., arXiv:2601.02824v1).
import goldenmatch as gm
result = gm.compare_clusters(clusters_a, clusters_b)
print(result.summary())
# {"unchanged": 42, "merged": 3, "partitioned": 5, "overlapping": 1, "twi": 0.92, ...}
CLI:
goldenmatch compare-clusters run_a.json run_b.json --details --case-type merged
Parameter Sensitivity Analysis
Sweep config parameters and see how clustering changes at each value:
from goldenmatch import run_sensitivity, SweepParam
results = run_sensitivity(
    file_specs=[("data.csv", "src")],
    config=cfg,
    sweep_params=[SweepParam("threshold", 0.70, 0.95, 0.05)],
    sample_size=5000,
)
for r in results:
    print(r.stability_report())
CLI:
goldenmatch sensitivity data.csv -c config.yaml --sweep threshold:0.70:0.95:0.05 --sample 5000
Details
- Four cluster transformation cases: unchanged, merged, partitioned, overlapping
- Talburt-Wang Index (TWI) as a normalized similarity measure between the two clusterings
- Per-point error handling: failed sweep points are logged and skipped, and partial results are preserved
- 16 new tests, 1260 total passing
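To make the four cases and the TWI concrete, here is a minimal standalone sketch. It does not use goldenmatch and is not its implementation. The function name `compare_partitions` is hypothetical, and the case definitions are one plausible reading, classifying each cluster of run B against run A. The library and the paper may define or count the cases differently. The TWI is computed as sqrt(|A| * |B|) / |V|, where V is the set of non-empty overlaps between clusters of A and B.

```python
from math import sqrt


def compare_partitions(a, b):
    """Classify each cluster of b relative to a and compute the TWI.

    a, b: lists of sets of record ids, both partitions of the same ids.
    Hypothetical helper for illustration; not part of goldenmatch.
    """
    # Map every record id to the index of its cluster in a.
    owner_a = {rec: i for i, cluster in enumerate(a) for rec in cluster}
    counts = {"unchanged": 0, "merged": 0, "partitioned": 0, "overlapping": 0}
    overlaps = 0  # number of non-empty intersections between a- and b-clusters

    for cluster in b:
        touched = {owner_a[rec] for rec in cluster}  # a-clusters this b-cluster meets
        overlaps += len(touched)
        if len(touched) == 1:
            only = next(iter(touched))
            if a[only] == cluster:
                counts["unchanged"] += 1    # identical cluster in both runs
            else:
                counts["partitioned"] += 1  # strict piece of a single a-cluster
        elif all(a[i] <= cluster for i in touched):
            counts["merged"] += 1           # exact union of two or more a-clusters
        else:
            counts["overlapping"] += 1      # mixes parts of several a-clusters

    twi = sqrt(len(a) * len(b)) / overlaps
    return counts, twi


run_a = [{1, 2}, {3}, {4}, {5, 6, 7}, {8, 9}, {10}]
run_b = [{1, 2}, {3, 4}, {5, 6}, {7}, {8}, {9, 10}]
counts, twi = compare_partitions(run_a, run_b)
print(counts, twi)
# {'unchanged': 1, 'merged': 1, 'partitioned': 3, 'overlapping': 1} 0.75
```

Under these definitions, identical clusterings give a TWI of 1.0, and the value drops as the clusters of the two runs cut across each other more often.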
Full Changelog: v1.2.7...v1.3.0