Multi-label entities: use max similarity across all names by claude[bot] · Pull Request #28 · resolveworks/worldgraph

claude · 2026-03-27T14:02:59Z

Summary

Replaces Node.name: str with Node.names: list[str] so entities can carry multiple names (e.g. "Meridian Technologies" and "Meridian Tech")
Name similarity seeding in propagate_similarity now computes max(soft_tfidf(a, b)) across all name pairs, ensuring the closest match is always used
IDF computation includes all names from all entities
Graph I/O serializes the names list and loads legacy single-name format for backward compatibility
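
To make the two scoring changes above concrete, here is a minimal sketch rather than the code from this PR. The helper names `max_name_similarity` and `idf_from_all_names` are hypothetical, `stand_in_similarity` is a `difflib`-based placeholder for `soft_tfidf` (whose real signature is not shown here), and the `log(N / df)` weighting with whitespace tokenization is an assumption about how IDF might be computed.

```python
import math
from collections import Counter
from difflib import SequenceMatcher
from itertools import product


def stand_in_similarity(a: str, b: str) -> float:
    """Placeholder scorer; NOT the project's soft_tfidf."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def max_name_similarity(names_a, names_b, sim=stand_in_similarity) -> float:
    """Best score over every (name_a, name_b) pair; 0.0 if either side is empty."""
    return max((sim(a, b) for a, b in product(names_a, names_b)), default=0.0)


def idf_from_all_names(entities: list[list[str]]) -> dict[str, float]:
    """Treat every name of every entity as one document (assumed log(N / df) form)."""
    docs = [set(name.lower().split()) for names in entities for name in names]
    df = Counter(tok for doc in docs for tok in doc)
    return {tok: math.log(len(docs) / count) for tok, count in df.items()}
```

Because the maximum is taken over all pairs, adding an alias to an entity can only raise or preserve its seed score against another entity, never lower it.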

Context

Prerequisite for progressive merging (#25): when entities merge during propagation, the merged entity retains all names from both sides so subsequent name-similarity seeding remains accurate.
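
As an illustration of why a names list matters for merging, the following sketch shows one way a merged entity could keep the names from both sides. The function `merge_names` is hypothetical and is not part of this PR or of #25; it only assumes that names are plain strings and that insertion order should be preserved.

```python
def merge_names(left: list[str], right: list[str]) -> list[str]:
    """Union of both name lists, first occurrence wins, order preserved."""
    # dict.fromkeys keeps insertion order and drops exact duplicates.
    return list(dict.fromkeys([*left, *right]))
```

Keeping the first entity's names first would also leave `names[0]` stable for any caller that uses it as the representative name.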

Test plan

New test: multi-label entity seeds similarity from best name pair
New test: all names contribute to IDF computation
New test: multi-label names survive save/load round-trip
New test: legacy single-name JSON format loads correctly
All 58 existing tests pass unchanged

Closes #27

🤖 Generated with Claude Code

Entities can carry multiple names (e.g. "Meridian Technologies" and "Meridian Tech"). Name similarity seeding now computes max(soft_tfidf) across all name pairs, ensuring the closest match is always used.

- Node.name: str → Node.names: list[str]
- Graph.add_entity() accepts str or list[str]
- IDF built from all names across all entities
- Graph I/O serializes names list, loads legacy single-name format
- Functionality pooling uses first name as representative

Closes #27

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
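
The commit message says Graph.add_entity() accepts either a str or a list[str]. A minimal sketch of that kind of input normalization follows; `normalize_names` is a hypothetical helper, and the real method may handle this differently.

```python
from typing import Union


def normalize_names(name_or_names: Union[str, list[str]]) -> list[str]:
    """Wrap a single name in a list; copy an existing list so callers cannot alias it."""
    if isinstance(name_or_names, str):
        return [name_or_names]
    return list(name_or_names)
```

The explicit `isinstance(..., str)` check comes first because a string is itself iterable, so calling `list()` on it would split the name into characters.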

claude

One issue worth fixing: backward-compatibility shim for legacy single-name JSON format.

load_graph (graph.py:52-56) adds fallback logic to handle the old "name" key:

raw_names = node_data.get("names") or [node_data["name"]]
names=raw_names if isinstance(raw_names, list) else [raw_names],

There are no old-format graph files in the repo and no external users. Per CLAUDE.md conventions ("never add backward-compatibility shims... Refactor completely"), this should just read node_data["names"] directly. The test_load_legacy_single_name_format test should be removed as well — it tests the shim.
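
For comparison with the fallback quoted above, a shim-free load could look roughly like this. It is a sketch under assumptions: this `Node` is a stripped-down stand-in with only a `names` field, and `load_node` is a hypothetical helper rather than the actual body of load_graph.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Stand-in for the project's Node; only the field discussed here.
    names: list[str]


def load_node(node_data: dict) -> Node:
    """Read the names list directly; a legacy {"name": ...} record raises KeyError."""
    return Node(names=list(node_data["names"]))
```

With no fallback, an old-format file fails loudly at load time instead of being silently converted.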

Everything else looks correct: the max-over-name-pairs seeding, IDF over all names, names[0] for functionality phrase pairs (same semantics as the old .name), and the union-find display. Clean refactor otherwise.

monneyboi

Remove the backward-compatibility shim in load_graph — just read node_data["names"] directly. Delete test_load_legacy_single_name_format as well.

Per review: no old-format files exist and no external users, so the fallback in load_graph and its test are unnecessary shims. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude

Looks good. All 57 tests pass. The Node.name → Node.names refactor is applied consistently across graph.py, match.py, and all tests. Name similarity seeding correctly takes the max over all name pairs, IDF includes all names, and the backward-compat shim was properly removed per CLAUDE.md conventions. No dead code or half-finished refactors.

claude bot commented Mar 27, 2026

monneyboi requested changes Mar 27, 2026

Remove backward-compatibility shim for legacy single-name format

a224470

claude bot commented Mar 27, 2026

monneyboi merged commit 8a635b6 into main Mar 27, 2026
2 checks passed

monneyboi merged 2 commits into main from multi-label-entities
