feat: community-detection auto-albums over the entity graph (P13e-1) #148
Merged
Group library items into thematic "Discovered" albums via deterministic label propagation over the entity graph (shared uploader/playlist/tag + co-download) — items linked through a web of signals that no single-facet smart album captures. Every-device (pure Datalog pull + Dart clustering; no embedder). Surfaced in Collections → Albums beside "Suggested", labeled by dominant shared signal (tag → uploader → site → title), with one-tap Save as collection (reuses the SuggestedAlbum model + screen). No schema, no deps. https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
P13e-1 — Community-detection auto-albums
First PR of P13e (advanced graph analytics). Groups library items into thematic "Discovered" albums via deterministic label propagation over the entity graph — items linked through a web of shared signals (uploader / playlist / tag + co-download) that no single-facet smart album captures. The looser, thematic cousin of the P10 "Suggested" similarity albums (whose tight clusterer explicitly deferred this to P13).
Every-device — pulls from Cozo (Datalog) and clusters in pure Dart, no embedder/RAM gate (maintainer call; semantic-similarity + tier enhancements → BACKLOG). No schema change, no deps.
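Because the clustering is pure Dart, the core idea fits in a short function. The following is a minimal sketch of deterministic label propagation over shared-signal buckets and co-download pairs, not the PR's `detectCommunities` implementation: the function name `sketchCommunities`, the integer item ids, the unit edge weights, the default limits, and the iteration cap are all assumptions made for illustration.

```dart
/// Hypothetical sketch; names, types, and defaults are illustrative only.
///
/// [buckets] maps a shared-signal key (for example 'tag:jazz') to item ids.
/// [pairs] are extra undirected edges (for example co-download pairs).
List<List<int>> sketchCommunities(
  Map<String, List<int>> buckets,
  List<List<int>> pairs, {
  int maxGroupSize = 50, // assumed default
  int minSize = 3, // assumed default
  int maxSize = 200, // assumed default
  int maxIterations = 20, // assumed safety cap
}) {
  // Weighted adjacency: every shared bucket or pair adds weight 1 to an edge.
  final adj = <int, Map<int, int>>{};
  void link(int a, int b) {
    if (a == b) return;
    adj.putIfAbsent(a, () => <int, int>{})
        .update(b, (w) => w + 1, ifAbsent: () => 1);
    adj.putIfAbsent(b, () => <int, int>{})
        .update(a, (w) => w + 1, ifAbsent: () => 1);
  }

  for (final members in buckets.values) {
    // Over-generic buckets would glue unrelated items together, so skip them.
    if (members.length > maxGroupSize) continue;
    for (var i = 0; i < members.length; i++) {
      for (var j = i + 1; j < members.length; j++) {
        link(members[i], members[j]);
      }
    }
  }
  for (final p in pairs) {
    link(p[0], p[1]);
  }

  // Each node starts with its own id as label; nodes are visited in id order.
  final nodes = adj.keys.toList()..sort();
  final label = {for (final n in nodes) n: n};

  for (var iter = 0; iter < maxIterations; iter++) {
    var changed = false;
    for (final n in nodes) {
      // Sum edge weight per neighbouring label (sequential: uses fresh labels).
      final score = <int, int>{};
      adj[n]!.forEach((m, w) {
        score.update(label[m]!, (s) => s + w, ifAbsent: () => w);
      });
      // Heaviest label wins; ties go to the smallest label.
      var best = label[n]!;
      var bestScore = -1;
      score.forEach((l, s) {
        if (s > bestScore || (s == bestScore && l < best)) {
          best = l;
          bestScore = s;
        }
      });
      if (best != label[n]) {
        label[n] = best;
        changed = true;
      }
    }
    if (!changed) break;
  }

  // Group members by final label, filter by size, and sort largest-first.
  final groups = <int, List<int>>{};
  for (final n in nodes) {
    groups.putIfAbsent(label[n]!, () => <int>[]).add(n);
  }
  final result = groups.values
      .where((g) => g.length >= minSize && g.length <= maxSize)
      .toList()
    ..sort((a, b) {
      final bySize = b.length.compareTo(a.length);
      return bySize != 0 ? bySize : a.first.compareTo(b.first);
    });
  return result;
}
```

In this sketch, items 1 to 3 sharing both a tag and an uploader, items 4 to 6 sharing a different tag and uploader, and a single co-download pair `[3, 4]` come out as two communities, `[1, 2, 3]` and `[4, 5, 6]`, because the doubled intra-group edge weight outvotes the lone bridge. A plain variant like this one is sensitive to visiting order and tie-breaking, which is why a fixed order and a fixed tie rule matter for reproducible albums.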
How it works
- `cozo_query.dart`: `entityMembershipScript()` (global `[mediaId, kind, key]` projection for uploader/playlist/tag; site excluded as too coarse) and `coDownloadPairsScript()` (`[a, b]`).
- `community_clustering.dart` (new, pure; mirrors `near_duplicate_clustering.dart`): `detectCommunities(...)` builds an item graph from shared buckets (dropping over-generic buckets larger than `maxGroupSize`, the "discard blobs" rule) plus co-download edges, runs deterministic label propagation (sequential, ties → smallest label), and returns communities with `minSize ≤ size ≤ maxSize`, largest-first, each with its dominant tag.
- `GraphQueryService.communityClusters()`: decodes the results of the two scripts and passes them to the clusterer; returns `[]` when the store is unavailable.
- `clusteredAlbumsProvider` + `clusterLabel`: hydrate to `MediaItem`s and label by dominant shared signal (tag → uploader → site → newest-title fallback), reusing the existing `SuggestedAlbum` model and `/suggested-album` screen (MediaGrid + Save as collection).
Tests (CI; native Cozo is APK-verified)
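To illustrate the labeling rules that the tests in this section pin down (a dominant tag needs support ≥ 2, and the label falls back tag → uploader → site → newest title), here is a minimal sketch. The function names and parameters are hypothetical, since the real `clusterLabel` signature is not shown in this description, and reading "tie → smallest" as "lexicographically smallest tag" is an assumption.

```dart
/// Hypothetical helper: picks the tag carried by the most items.
/// Returns null unless some tag appears on at least two items.
String? sketchDominantTag(Iterable<List<String>> tagsPerItem) {
  final support = <String, int>{};
  for (final tags in tagsPerItem) {
    // toSet() so a tag repeated on one item counts once.
    for (final t in tags.toSet()) {
      support.update(t, (c) => c + 1, ifAbsent: () => 1);
    }
  }
  String? best;
  var bestCount = 1; // a winner must beat this, i.e. have support >= 2
  support.forEach((tag, count) {
    final winsTie =
        count == bestCount && best != null && tag.compareTo(best!) < 0;
    if (count > bestCount || winsTie) {
      best = tag;
      bestCount = count;
    }
  });
  return best;
}

/// Hypothetical label picker: first non-empty signal wins, in the order
/// tag, uploader, site, and finally the newest item's title.
String sketchClusterLabel({
  String? dominantTag,
  String? sharedUploader,
  String? sharedSite,
  required String newestTitle,
}) {
  for (final candidate in [dominantTag, sharedUploader, sharedSite]) {
    if (candidate != null && candidate.trim().isNotEmpty) return candidate;
  }
  return newestTitle;
}
```

Keeping the precedence in one small pure function is what makes a truth-table test practical: each row sets a different combination of signals to null and checks which one surfaces.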
- `community_clustering_test`: entity-bucket splitting, web-merge across signals, co-download edges, `minSize` filtering, `maxGroupSize` pruning, dominant tag (support ≥ 2, tie → smallest), and determinism (a single bridge edge correctly does not merge two dense clusters).
- `cozo_query_test`: both new script strings.
- `graph_query_service_test`: `communityClusters()` decode + unavailable.
- `clustered_albums_test`: provider hydrate/label, `clusterLabel` truth table, and empty-when-unavailable.
- `collections_screen_test`: the Discovered section renders.
- `dart format` + `flutter analyze` clean.
Docs
- `P13-PLAN.md`: e-1 `[~]` + parent `P13e` `[~]` (1 of 3).
- `GRAPH-SPEC.md` §7: clustered-albums row realized at e-1 over the entity graph.
- `VERIFICATION.md`: P13e-1 checklist.
- `BACKLOG.md`: semantic-similarity/tier enhancements + richer labels/caching.
Verification
Next: e-2 centrality "Rediscover", then e-3 path/bridge + graph-view polish.
https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
Generated by Claude Code