fix(search): scope local-search graph expansion to top-hit docs (RAN-35)#80
`search.LocalSearch` narrowed the initial entity set to the top-hit documents, but its BFS walk called `RelationshipsForEntity` without any doc filter. That re-expanded through unrelated relationships anywhere in the project, so a scoped query could leak off-scope entities and edges into the result set.

Add `Store.RelationshipsForEntityInDocs(entityID, depth, docIDs)`, a BFS variant that constrains every hop to `doc_id IN (...)` and dedups relationships across hops. Use it from `LocalSearch` so the expansion stays inside the top-hit doc set.

Both the REST search path (`/search/local`) and the MCP `docsiq_search_local` tool go through `search.LocalSearch`, so the REST/MCP behaviour stays aligned after the fix. Entity-focused paths (`/entities/:id`, MCP entity-graph tools) intentionally keep the unscoped walk: they take an entity as input, not a document set.

Regression coverage:
- store: `TestRelationshipsForEntityInDocs_OnlyReturnsEdgesFromScopedDocs` proves a relationship from an unrelated document cannot appear in a scoped BFS result; depth and empty-input cases covered too.
- search: `TestLocalSearch_GraphExpansionScopedToTopHitDocs` exercises the full `LocalSearch` path end-to-end.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
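The scoped walk described above can be pictured with a minimal in-memory sketch. This is not the store's SQL implementation: the `Relationship` struct, the integer IDs, and the `relationshipsInDocs` helper are hypothetical stand-ins. Only the behaviour (every hop filtered by the doc set, edges deduplicated across hops) is taken from the description.

```go
package main

import "fmt"

// Relationship is a hypothetical in-memory stand-in for a stored edge.
type Relationship struct {
	ID       int
	SourceID int
	TargetID int
	DocID    int
}

// relationshipsInDocs walks outward from seed for at most depth hops,
// following only edges whose DocID is in docIDs, and returns each edge once.
func relationshipsInDocs(edges []Relationship, seed, depth int, docIDs []int) []Relationship {
	if len(docIDs) == 0 || depth <= 0 {
		return nil // empty scope or no hops: no expansion
	}
	inScope := make(map[int]bool, len(docIDs))
	for _, d := range docIDs {
		inScope[d] = true
	}
	visited := map[int]bool{seed: true}
	seenEdge := map[int]bool{}
	frontier := []int{seed}
	var out []Relationship
	for hop := 0; hop < depth && len(frontier) > 0; hop++ {
		inFrontier := make(map[int]bool, len(frontier))
		for _, id := range frontier {
			inFrontier[id] = true
		}
		var next []int
		for _, e := range edges {
			// The doc filter applies on every hop, not only to the seed set.
			if !inScope[e.DocID] || seenEdge[e.ID] {
				continue
			}
			if !inFrontier[e.SourceID] && !inFrontier[e.TargetID] {
				continue
			}
			seenEdge[e.ID] = true
			out = append(out, e)
			for _, n := range []int{e.SourceID, e.TargetID} {
				if !visited[n] {
					visited[n] = true
					next = append(next, n)
				}
			}
		}
		frontier = next
	}
	return out
}

func main() {
	edges := []Relationship{
		{ID: 1, SourceID: 1, TargetID: 2, DocID: 10},
		{ID: 2, SourceID: 2, TargetID: 3, DocID: 10},
		{ID: 3, SourceID: 1, TargetID: 4, DocID: 99}, // unrelated doc
	}
	// Edge 3 touches the seed but lives in doc 99, so it stays out.
	fmt.Println(len(relationshipsInDocs(edges, 1, 2, []int{10}))) // prints 2
}
```

In this sketch an unscoped walk would also return edge 3, which is the kind of leak RAN-35 describes.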
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7ebcb85fd5
`const docChunkSize = 900`
`const frontierChunkSize = 900`
Reduce chunk sizes to honor SQLite bind parameter limit
`RelationshipsForEntityInDocs` chunks both `frontier` and `docIDs` at 900, but each query binds `source_id IN (...)` and `target_id IN (...)` for the same frontier chunk plus `doc_id IN (...)`, i.e. `2*len(fChunk)+len(dChunk)` placeholders. At the configured sizes this reaches 2700 parameters, which exceeds the limit on SQLite builds that keep the common 999-variable cap and causes runtime `too many SQL variables` errors when local search spans larger frontiers or doc sets. In `LocalSearch` those errors are silently skipped (`continue`), so graph expansion can disappear without surfacing an error.
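The arithmetic in this comment can be checked with a small sketch. It assumes the 999-variable cap mentioned above and splits the budget evenly. The helper names (`placeholders`, `chunkSizes`, `chunk`) are hypothetical and not taken from the PR, and an even three-way split is only one possible choice.

```go
package main

import "fmt"

// defaultMaxVars is SQLite's default SQLITE_MAX_VARIABLE_NUMBER before
// version 3.32.0; newer builds default to 32766.
const defaultMaxVars = 999

// placeholders returns the bind-parameter count for one query that uses the
// frontier chunk twice (source_id and target_id) and the doc chunk once.
func placeholders(frontierChunk, docChunk int) int {
	return 2*frontierChunk + docChunk
}

// chunkSizes picks equal frontier and doc chunk sizes so that
// 2*frontierChunk + docChunk stays within maxVars.
func chunkSizes(maxVars int) (frontierChunk, docChunk int) {
	size := maxVars / 3
	return size, size
}

// chunk splits ids into consecutive slices of at most size elements.
func chunk(ids []int64, size int) [][]int64 {
	if size <= 0 {
		return nil
	}
	var out [][]int64
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		out = append(out, ids[start:end])
	}
	return out
}

func main() {
	fmt.Println(placeholders(900, 900)) // prints 2700
	f, d := chunkSizes(defaultMaxVars)
	fmt.Println(f, d, placeholders(f, d)) // prints 333 333 999
	fmt.Println(len(chunk(make([]int64, 1000), f))) // prints 4
}
```

With both chunk sizes at 333, the worst-case query binds exactly 999 parameters. A smaller doc chunk would leave room for a larger frontier chunk, and the reverse.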
Summary
- `search.LocalSearch` narrowed the initial entity set to top-hit docs but then called `RelationshipsForEntity` with no doc filter, so BFS expansion could leak off-scope entities/edges into a scoped result. Fixes RAN-35.
- `Store.RelationshipsForEntityInDocs(entityID, depth, docIDs)`: BFS that constrains every hop to `doc_id IN (...)` and dedups across hops. `LocalSearch` now uses it so the expansion stays inside the top-hit doc set.
- REST (`/search/local`) and MCP (`docsiq_search_local`) both route through `search.LocalSearch`, so behaviour stays aligned after the fix. Entity-focused endpoints that take an entity ID (not a doc set) keep the unscoped walk.

Regression coverage
- `internal/store/relationships_for_entity_in_docs_test.go`
  - `TestRelationshipsForEntityInDocs_OnlyReturnsEdgesFromScopedDocs`: proves a relationship from an unrelated document cannot appear in a scoped BFS result (directly exercises the leak described in RAN-35).
  - `TestRelationshipsForEntityInDocs_EmptyDocsReturnsNil`: empty doc set → no expansion.
  - `TestRelationshipsForEntityInDocs_RespectsDepthLimit`: depth=1 from a seed only returns the direct edge.
- `internal/search/local_scope_test.go`
  - `TestLocalSearch_GraphExpansionScopedToTopHitDocs`: end-to-end: top chunk's doc set scopes the graph walk; a relationship a seed entity participates in via an unrelated doc does not leak through.

Test plan
- `CGO_ENABLED=1 go build -tags sqlite_fts5 ./...`
- `CGO_ENABLED=1 go vet -tags sqlite_fts5 ./...`
- `CGO_ENABLED=1 go test -tags sqlite_fts5 -timeout 300s ./...`: 662 passed
- `main` (verified by asserting that unscoped `RelationshipsForEntity` still surfaces the out-of-scope edge in the fixture)