Reduce update churn and stabilize community IDs by FatahChan · Pull Request #822 · Graphify-Labs/graphify

FatahChan · 2026-05-11T23:24:59Z

Summary

- Add `graphify update --no-cluster` and thread it through the watch rebuild path.
- Make `graphify update` idempotent by skipping `graph.json` / `GRAPH_REPORT.md` rewrites when graph/report content is unchanged (ignoring commit-hash-only report drift).
- Stabilize clustering output with deterministic partition input ordering, seeded Leiden when supported, and overlap-based remapping of new communities to prior IDs.
- Add regression tests for remapping stability, idempotent rebuild under cluster-ID flapping, and `update --no-cluster` behavior.
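
For readers unfamiliar with overlap-based remapping, here is a minimal sketch of the idea, not the code in this PR. The function name `remap_by_overlap` and the `{community_id: set_of_nodes}` shape are assumptions made for illustration. Each new community takes the ID of the prior community it shares the most members with, and unmatched communities get fresh IDs.

```python
from typing import Dict, Hashable, Set


def remap_by_overlap(
    new: Dict[int, Set[Hashable]],
    previous: Dict[int, Set[Hashable]],
) -> Dict[int, Set[Hashable]]:
    """Relabel `new` so each community reuses the prior ID it overlaps most.

    Hypothetical helper: greedy matching on intersection size.
    """
    # Every (overlap, new_id, prev_id) pair that shares at least one member.
    pairs = [
        (len(members & prev_members), new_id, prev_id)
        for new_id, members in new.items()
        for prev_id, prev_members in previous.items()
        if members & prev_members
    ]
    # Largest overlap first; ties are broken by ID so the result is deterministic.
    pairs.sort(key=lambda p: (-p[0], p[2], p[1]))

    assigned: Dict[int, int] = {}  # new_id -> final ID
    used: Set[int] = set()         # prior IDs already claimed
    for _, new_id, prev_id in pairs:
        if new_id in assigned or prev_id in used:
            continue
        assigned[new_id] = prev_id
        used.add(prev_id)

    # Communities with no prior match get fresh IDs above every prior ID.
    next_id = max(previous, default=-1) + 1
    for new_id in sorted(new):
        if new_id not in assigned:
            assigned[new_id] = next_id
            next_id += 1

    return {assigned[new_id]: members for new_id, members in new.items()}
```

Greedy matching is a simple choice here; an optimal assignment (for example the Hungarian algorithm) would be an alternative when many communities split or merge at once.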

Closes #741

Test plan

uv run --with pytest pytest tests/test_cluster.py tests/test_watch.py tests/test_cli_export.py
uv run python -m graphify update . && uv run python -m graphify update .
uv run --with pytest pytest (pre-existing baseline failures unrelated to this change remain)

Make `graphify update` idempotent by skipping output rewrites when graph/report content is unchanged, add `update --no-cluster`, and preserve community IDs across runs via overlap-based remapping with deterministic partition inputs.

Co-authored-by: Cursor <cursoragent@cursor.com>
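
The "skip the rewrite when nothing changed" behavior can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `write_if_changed` is a hypothetical helper, and the regex assumes the report records the commit as a 7 to 40 character hex string following the word "commit", which may not match the real `GRAPH_REPORT.md` layout.

```python
import re
from pathlib import Path

# Assumption: the commit appears as "commit: <hex>" or "commit <hex>" in the report.
_COMMIT_RE = re.compile(r"(commit[:\s]+)[0-9a-f]{7,40}", re.IGNORECASE)


def _normalize(text: str) -> str:
    """Mask commit hashes so hash-only drift does not count as a change."""
    return _COMMIT_RE.sub(r"\1<hash>", text)


def write_if_changed(path: Path, new_text: str) -> bool:
    """Write `new_text` unless the file already holds equivalent content.

    Returns True when the file was written, False when the write was skipped.
    """
    if path.exists():
        old_text = path.read_text(encoding="utf-8")
        if _normalize(old_text) == _normalize(new_text):
            return False
    path.write_text(new_text, encoding="utf-8")
    return True
```

Skipping the write leaves the file's modification time and the previously recorded hash untouched, which is what keeps a second `graphify update` from producing a diff.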

Use safe JSON serialization fallbacks for deterministic sort keys in clustering and graph canonicalization, and skip invalid community IDs with a stderr warning instead of raising during update rebuilds.

Co-authored-by: Cursor <cursoragent@cursor.com>
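
A rough sketch of the two ideas in this commit message follows. The names `stable_sort_key` and `coerce_community_ids` are hypothetical, and the warning wording is invented for illustration; the actual code in the PR may differ.

```python
import json
import sys
from typing import Any, Dict, Hashable


def stable_sort_key(value: Any) -> str:
    """Return a string sort key that works across mixed value types."""
    try:
        return json.dumps(value, sort_keys=True, ensure_ascii=False)
    except (TypeError, ValueError):
        # Fallback for values json cannot encode. repr() is not guaranteed
        # to be stable for every type (e.g. objects that print an address).
        return f"{type(value).__name__}:{value!r}"


def coerce_community_ids(raw: Dict[Hashable, Any]) -> Dict[Hashable, int]:
    """Keep entries whose community ID converts to int; warn about the rest."""
    clean: Dict[Hashable, int] = {}
    for node, cid in raw.items():
        try:
            clean[node] = int(cid)
        except (TypeError, ValueError):
            print(
                f"warning: skipping invalid community id {cid!r} for node {node!r}",
                file=sys.stderr,
            )
    return clean
```

Sorting with `sorted(nodes, key=stable_sort_key)` avoids the `TypeError` Python raises when comparing, say, `int` and `str` node IDs directly, so the partition input order stays the same from run to run.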

cluster-only re-runs Leiden clustering and then re-applies the existing .graphify_labels.json by raw cid index, which causes labels to attach to clusters whose members are unrelated to the label's original meaning whenever the graph has changed between labeling and re-clustering.

Mirror the safety net already present in watch.py:_rebuild_code added in Graphify-Labs#822 for the watch/update paths.

Adds a regression test that fails without the fix (label cids become orphaned from graph.json community attributes after re-clustering).

Refs: Graphify-Labs#1027

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Ahmad Fathallah and others added 2 commits May 12, 2026 02:23

FatahChan closed this May 12, 2026

This was referenced May 26, 2026

cluster-only skips overlap-based community remapping, so labels become misaligned after re-clustering #1027

Closed

fix(cli): apply remap_communities_to_previous in cluster-only path #1028

Merged

TPAteeq mentioned this pull request Jul 4, 2026

feat(cluster): carry community LLM labels over re-clustering via member overlap #1662

Open


FatahChan wants to merge 2 commits into Graphify-Labs:v7 from FatahChan:fix/update-idempotency-churn
