Integrate HierarchyBuilder into ingestion, add hierarchical query routing, enforce Williams-bound consistency by Copilot · Pull Request #96 · devlux76/cortex

Copilot · 2026-03-14T11:08:16Z

Original prompt

This section details the original issue you should resolve

<issue_title>Design alignment review: audit docs vs implementation</issue_title>
<issue_description>### Goal
Audit the CORTEX design docs (wiki + DESIGN.md) against the current implementation and produce a concrete backlog of issues to fix any divergences.

Deliverables

A complete mapping of design claims → code implementation

A prioritized list of gaps (missing features, mismatches, outdated docs)

A set of linked subtasks (issues) that implement the fixes

Subtask Issues

P1-A: Integrate HierarchyBuilder into ingestion (books→volumes→shelves) #90 — Integrate HierarchyBuilder into ingestion (books→volumes→shelves)

P1-E: Implement dialectical retrieval pipeline (Metroid + subgraph + coherence) #91 — Implement dialectical retrieval pipeline (Metroid + subgraph + coherence)

P0-X: Finalize semantic neighbor naming (remove remaining Metroid references) #92 — Finalize semantic neighbor naming (remove remaining Metroid references)

P1-P: Ensure Williams-bound budgets are applied consistently and efficiently #93 — Ensure Williams-bound budgets are applied consistently and efficiently

P2-D: Sync documentation with current implementation and planned features #94 — Sync documentation with current implementation and planned features

P2-C: Audit Daydreamer consolidation behavior against design (LTP/LTD, throttling, budgets) #95 — Audit Daydreamer consolidation behavior against design (LTP/LTD, throttling, budgets)

Initial checklist

Audit ingestion/hierarchy (Books/Volumes/Shelves)

Audit retrieval pipeline (dialectical Metroid + subgraph + coherence)

Audit consolidation (Daydreamer throttles, LTP/LTD, recalc budgets)

Audit storage & sharing (OPFS integrity, MIME/model checks, eligibility)

Audit policy usage (Williams bounds, model-derived numerics, performance)

Create linked subtask issues for each major gap

Notes

This issue is the parent for the design alignment work; child issues should include Part of devlux76/cortex#89 references to this issue and be tracked as subtasks.
</issue_description>

<agent_instructions>The goal here is to finish the complete implementation of the library side of things. Design docs are now at https://github.com/devlux76/cortex/wiki. Please make sure to close each sub-issue as you go along.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes Design alignment review: audit docs vs implementation #89

Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

…d references from neighbor graph code Closes #92 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

ingestText() now calls buildHierarchy() after page creation to produce the full Books→Volumes→Shelves hierarchy on every ingest call. IngestResult extended with books[], volumes[], shelves[] fields. Closes #90 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

…peline Query.ts now routes through Shelf→Volume→Book→Page hierarchy using rankShelves/rankVolumes/rankBooks before flat page scoring. Combines hierarchy-discovered pages with hotpath pages for comprehensive results. Closes #91 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

- FullNeighborRecalc now derives maxNeighbors from computeNeighborMaxDegree instead of using a fixed default of 16
- Query.ts uses getResidentCount() as proxy for corpus size to avoid scanning all pages on the hot path for computeSubgraphBounds
Closes #93 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

…ms-derived max degree HebbianUpdater.decayAndPrune now derives maxDegree from computeNeighborMaxDegree(graphMass) instead of a fixed 16, matching the design requirement for Williams-bound-driven edge pruning. Closes #95 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

Check off all implemented and tested items across Hippocampus (HierarchyBuilder, FastNeighborInsert, Ingest), Cortex (Ranking, MetroidBuilder, KnowledgeGapDetector, OpenTSPSolver, Query, QueryResult), Daydreamer (ExperienceReplay, ClusterStability), integration tests, benchmarks, CI hardening, documentation, and product surfaces. Update section headings with ✅ COMPLETE markers and refresh the summary table to reflect v0.1/v0.5/v1.0 phases as complete. Only P3-A (WebGL provider) and P3-H (GitHub sync smoke test) remain pending. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…implementation Update PLAN.md to reflect full completion of Hippocampus (100%), Cortex (100%), and all documentation sections. Update TODO.md checkboxes for all completed P1/P2/P3 items. Fix HierarchyBuilder test to expect volumes/shelves from ingestText. Closes #94 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

Copilot

Pull request overview

Aligns implementation with the design across ingestion, querying, consolidation budgets, and semantic-neighbor naming by integrating hierarchy construction into ingest, adding hierarchical routing to queries, and removing remaining “metroid neighbor” terminology.

Changes:

Update ingest to build full Pages→Books→Volumes→Shelves hierarchy and return it in IngestResult.
Add Shelf→Volume→Book→Page routing in query() and avoid full-corpus scans for subgraph bounds by using resident count.
Replace hardcoded neighbor degree defaults with Williams-derived computeNeighborMaxDegree(...), and clean up semantic-neighbor naming in code/tests + sync plan/todo docs.
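The third change follows a "derive the limit unless the caller supplies one" pattern. Here is a minimal sketch of that pattern. The real formula behind `computeNeighborMaxDegree` is not shown in this PR, so `sketchBound` is a hypothetical stand-in with a square-root-of-mass-times-log shape, and `PruneOptions`, `resolveMaxDegree`, and `pruneEdges` are illustrative names rather than the library's API.

```typescript
// Hypothetical stand-in for computeNeighborMaxDegree; the actual formula
// used by the library is NOT shown in this PR.
type DegreeBound = (graphMass: number) => number;

const sketchBound: DegreeBound = (graphMass) =>
  Math.max(1, Math.ceil(Math.sqrt(graphMass * Math.log2(1 + graphMass))));

interface PruneOptions {
  graphMass: number;
  maxDegree?: number; // an explicit value overrides the derived bound
}

// Use the caller's maxDegree when given, otherwise derive it from graph mass.
function resolveMaxDegree(
  opts: PruneOptions,
  bound: DegreeBound = sketchBound,
): number {
  return opts.maxDegree ?? bound(opts.graphMass);
}

interface SketchEdge {
  neighborId: string;
  weight: number;
}

// Keep only the strongest edges, up to the resolved degree limit.
// (Keeping the heaviest edges is an assumption about the pruning policy.)
function pruneEdges(edges: SketchEdge[], opts: PruneOptions): SketchEdge[] {
  const maxDegree = resolveMaxDegree(opts);
  return [...edges].sort((a, b) => b.weight - a.weight).slice(0, maxDegree);
}
```

Passing the bound as a parameter keeps the fallback testable without committing to one formula; callers that still want a fixed degree can set `maxDegree` explicitly.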

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 4 comments.

File	Description
tests/sharing/SubgraphExchange.test.ts	Renames mock neighbor map from `metroidNeighbors` to `semanticNeighbors`.
tests/integration/IngestQuery.test.ts	Updates integration assertions for multi-book ingest + persisted volumes/shelves.
tests/hippocampus/Ingest.test.ts	Adjusts ingest unit test expectations for hierarchy output.
tests/hippocampus/HierarchyBuilder.test.ts	Updates hierarchy-builder ingest expectations to include volumes/shelves and multi-book coverage.
tests/daydreamer/PrototypeRecomputer.test.ts	Renames mock neighbor map to `semanticNeighbors`.
tests/daydreamer/HebbianUpdater.test.ts	Renames mock neighbor map to `semanticNeighbors`.
tests/daydreamer/FullNeighborRecalc.test.ts	Renames mock neighbor map + sync accessor naming.
tests/daydreamer/ClusterStability.test.ts	Updates comments/stubs to semantic-neighbor naming.
tests/cortex/Query.test.ts	Adds coverage ensuring query works with hierarchy-based routing present.
tests/SalienceEngine.test.ts	Removes obsolete Metroid-neighbor stub methods from mock store.
package-lock.json	Locks added dev deps (@vitest/coverage-v8, husky) already reflected in package.json.
lib/storage/IndexedDbMetadataStore.ts	Updates v3 store comment to semantic-neighbor wording.
lib/sharing/SubgraphExporter.ts	Renames local variables/comments to “semantic neighbors”.
lib/hippocampus/Ingest.ts	Changes ingest to insert semantic neighbors + build hierarchy; expands `IngestResult`.
lib/daydreamer/HebbianUpdater.ts	Derives `maxDegree` from Williams bounds when not provided.
lib/daydreamer/FullNeighborRecalc.ts	Derives `maxNeighbors` from Williams bounds when not provided.
lib/daydreamer/ClusterStability.ts	Renames documentation/comments to semantic-neighbor terminology.
lib/cortex/Query.ts	Implements hierarchical routing and uses resident count for Williams-derived subgraph bounds.
TODO.md	Updates completion status and “Last Updated” date.
PLAN.md	Updates plan status text to reflect completed modules and corrected terminology.

devlux76 · 2026-03-14T13:07:29Z

lib/hippocampus/Ingest.ts

@@ -149,8 +102,20 @@ export async function ingestText(
    metadataStore,
  });

-  // Run hotpath promotion for the newly ingested pages and book.
-  await runPromotionSweep([...pageIds, bookId], metadataStore);
+  // Build the full hierarchy: Pages → Books → Volumes → Shelves.
+  // buildHierarchy handles medoid selection, adjacency edges, prototype
+  // computation, Williams fanout enforcement, and promotion sweeps.
+  const hierarchy = await buildHierarchy(pageIds, {
+    modelProfile,
+    vectorStore,
+    metadataStore,
+  });


@copilot use your best judgement to resolve the issue in accordance with the DESIGN to the best of your abilities even if that means the DESIGN will need to be updated as well. Try to solve the problem while minimizing downstream and knock-on effects.

Fixed in 4335f67. Swapped the ordering so buildHierarchy() runs before insertSemanticNeighbors() — this ensures books/volumes (and their reverse indexes) exist when the neighbor inserter traverses page→book→volume to flag dirty volumes via flagVolumeForNeighborRecalc(). Also addressed the other review items in the same commit: renamed the misleading hierarchyPageIds in Query.ts to per-tier variables (volumeIdsFromShelves, bookIdsFromVolumes, pageIdsFromBooks) and strengthened the integration test to enforce exactly-once book membership.
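The ordering dependency described above can be illustrated with in-memory stand-ins. This is a sketch under stated assumptions: `SketchStore`, `buildHierarchySketch`, and `insertNeighborsSketch` are hypothetical simplifications (synchronous, one book and one volume), and the "silently skipped" behavior when the reverse index is missing is an assumption about the failure mode, not something this PR confirms.

```typescript
type Hash = string;

// In-memory stand-in for the metadata store's reverse indexes (hypothetical shape).
class SketchStore {
  pageToBook = new Map<Hash, Hash>();
  bookToVolume = new Map<Hash, Hash>();
  dirtyVolumes = new Set<Hash>();
}

// Stand-in for buildHierarchy(): puts every page in one book and one volume.
function buildHierarchySketch(pageIds: Hash[], store: SketchStore): void {
  const bookId = "book-1";
  const volumeId = "volume-1";
  for (const pid of pageIds) store.pageToBook.set(pid, bookId);
  store.bookToVolume.set(bookId, volumeId);
}

// Stand-in for insertSemanticNeighbors(): walks page -> book -> volume and
// flags the volume as dirty. If the hierarchy does not exist yet, the lookup
// fails and nothing is flagged (assumed failure mode).
function insertNeighborsSketch(pageIds: Hash[], store: SketchStore): void {
  for (const pid of pageIds) {
    const bookId = store.pageToBook.get(pid);
    const volumeId =
      bookId === undefined ? undefined : store.bookToVolume.get(bookId);
    if (volumeId !== undefined) store.dirtyVolumes.add(volumeId);
  }
}

const ingestedPageIds: Hash[] = ["p1", "p2"];

// Old order: neighbors first, hierarchy second.
const oldOrderStore = new SketchStore();
insertNeighborsSketch(ingestedPageIds, oldOrderStore);
buildHierarchySketch(ingestedPageIds, oldOrderStore);

// Fixed order: hierarchy first, neighbors second.
const fixedOrderStore = new SketchStore();
buildHierarchySketch(ingestedPageIds, fixedOrderStore);
insertNeighborsSketch(ingestedPageIds, fixedOrderStore);

console.log(oldOrderStore.dirtyVolumes.size, fixedOrderStore.dirtyVolumes.size); // prints: 0 1
```

In this toy model the old order never flags a volume for recalculation, which is the kind of silent miss the reordering is meant to prevent.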

devlux76 · 2026-03-14T13:08:06Z

lib/cortex/Query.ts

+  if (hotpathShelfEntries.length > 0) {
+    const topShelves = await rankShelves(
+      queryEmbedding,
+      hotpathShelfEntries.map((e) => e.entityId),
+      Math.max(2, Math.ceil(hotpathShelfEntries.length / 2)),
+      rankingOptions,
+    );
+    for (const s of topShelves) {
+      const shelf = await metadataStore.getShelf(s.id);
+      if (shelf) {
+        for (const vid of shelf.volumeIds) hierarchyPageIds.add(vid);
+      }
+    }
+  }
+
+  // Rank volumes — include both hotpath volumes and those found via shelf drill-down
+  const volumeCandidateIds = new Set<Hash>([
+    ...hotpathVolumeEntries.map((e) => e.entityId),
+    ...hierarchyPageIds,
+  ]);
+  hierarchyPageIds.clear();
+
+  if (volumeCandidateIds.size > 0) {
+    const topVolumes = await rankVolumes(
+      queryEmbedding,
+      [...volumeCandidateIds],
+      Math.max(2, Math.ceil(volumeCandidateIds.size / 2)),
+      rankingOptions,
+    );
+    for (const v of topVolumes) {
+      const volume = await metadataStore.getVolume(v.id);
+      if (volume) {
+        for (const bid of volume.bookIds) hierarchyPageIds.add(bid);
+      }
+    }
+  }

-  const hotResults = await rankPages(queryEmbedding, hotpathIds, topK, rankingOptions);
+  // Rank books — include both hotpath books and those found via volume drill-down
+  const bookCandidateIds = new Set<Hash>([
+    ...hotpathBookEntries.map((e) => e.entityId),
+    ...hierarchyPageIds,
+  ]);
+  hierarchyPageIds.clear();
+
+  if (bookCandidateIds.size > 0) {
+    const topBooks = await rankBooks(
+      queryEmbedding,
+      [...bookCandidateIds],
+      Math.max(2, Math.ceil(bookCandidateIds.size / 2)),
+      rankingOptions,
+    );
+    for (const b of topBooks) {
+      const book = await metadataStore.getBook(b.id);
+      if (book) {
+        for (const pid of book.pageIds) hierarchyPageIds.add(pid);
+      }


@copilot use your best judgement to resolve the issue in accordance with the DESIGN to the best of your abilities even if that means the DESIGN will need to be updated as well. Try to solve the problem while minimizing downstream and knock-on effects.

Fixed in 91b10f4. Ranking functions (rankShelves/rankVolumes/rankBooks/rankPages/spillToWarm) now return RankedResult { id, score, childIds } — child IDs are captured during the scoring pass and returned alongside the score. Query.ts uses s.childIds/v.childIds/b.childIds directly from ranking results instead of re-fetching entities from IDB.
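A rough sketch of how a ranking pass can hand back child IDs so the caller does not re-fetch each entity. Only the `RankedResult { id, score, childIds }` shape and the `Math.max(2, Math.ceil(n / 2))` budget come from this PR; `SketchEntity`, `rankTier`, `tierBudget`, and `collectChildIds` are hypothetical, and a precomputed `similarity` number stands in for the embedding comparison.

```typescript
type Hash = string;

// Shape named in the fix: score plus the children seen during scoring.
interface RankedResult {
  id: Hash;
  score: number;
  childIds: Hash[];
}

// Hypothetical entity: `similarity` replaces the real embedding comparison.
interface SketchEntity {
  id: Hash;
  similarity: number;
  childIds: Hash[];
}

// Stand-in for rankShelves/rankVolumes/rankBooks: score, sort descending,
// keep the top K, and carry childIds along with each result.
function rankTier(entities: SketchEntity[], topK: number): RankedResult[] {
  return entities
    .map((e) => ({ id: e.id, score: e.similarity, childIds: e.childIds }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Per-tier budget as written in the diff: at least 2, otherwise half the candidates.
const tierBudget = (n: number): number => Math.max(2, Math.ceil(n / 2));

// Union the child IDs of the ranked results; no store lookups needed.
function collectChildIds(ranked: RankedResult[]): Set<Hash> {
  const ids = new Set<Hash>();
  for (const r of ranked) {
    for (const childId of r.childIds) ids.add(childId);
  }
  return ids;
}

const sketchShelves: SketchEntity[] = [
  { id: "s1", similarity: 0.9, childIds: ["v1", "v2"] },
  { id: "s2", similarity: 0.4, childIds: ["v3"] },
  { id: "s3", similarity: 0.1, childIds: ["v4"] },
];

const topShelves = rankTier(sketchShelves, tierBudget(sketchShelves.length));
const volumeIdsFromShelves = collectChildIds(topShelves);
console.log(Array.from(volumeIdsFromShelves)); // prints: [ 'v1', 'v2', 'v3' ]
```

The same two calls would repeat for volumes and books, feeding `bookIdsFromVolumes` and `pageIdsFromBooks` in turn.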

devlux76 · 2026-03-14T13:08:29Z

lib/cortex/Query.ts

+  // Collect candidate page IDs from hierarchical routing.
+  const hierarchyPageIds = new Set<Hash>();
+
+  // Shelf → Volume → Book → Page drill-down
+  if (hotpathShelfEntries.length > 0) {
+    const topShelves = await rankShelves(
+      queryEmbedding,
+      hotpathShelfEntries.map((e) => e.entityId),
+      Math.max(2, Math.ceil(hotpathShelfEntries.length / 2)),
+      rankingOptions,
+    );
+    for (const s of topShelves) {
+      const shelf = await metadataStore.getShelf(s.id);
+      if (shelf) {
+        for (const vid of shelf.volumeIds) hierarchyPageIds.add(vid);
+      }
+    }
+  }
+
+  // Rank volumes — include both hotpath volumes and those found via shelf drill-down
+  const volumeCandidateIds = new Set<Hash>([
+    ...hotpathVolumeEntries.map((e) => e.entityId),
+    ...hierarchyPageIds,
+  ]);
+  hierarchyPageIds.clear();
+


@copilot use your best judgement to resolve the issue in accordance with the DESIGN to the best of your abilities even if that means the DESIGN will need to be updated as well. Try to solve the problem while minimizing downstream and knock-on effects.

Already addressed in 4335f67 — hierarchyPageIds was renamed to per-tier variables: volumeIdsFromShelves, bookIdsFromVolumes, pageIdsFromBooks. Each variable now clearly indicates its tier.

devlux76 · 2026-03-14T13:11:01Z

tests/integration/IngestQuery.test.ts

+    // Every page must belong to exactly one book
+    const allBookPageIds = result.books.flatMap((b) => b.pageIds);
    for (const page of result.pages) {
-      expect(storedBook!.pageIds).toContain(page.pageId);
+      expect(allBookPageIds).toContain(page.pageId);
+    }


@copilot it makes more sense if "exactly" is replaced with "at least". There's a small, albeit non-zero, chance that we hit the same page across multiple books. For instance, legal corpora are famously long, duplicative, and full of cross-citations.

Fixed in 91b10f4. Changed the test comment to "Every page must belong to at least one book" and removed the exactly-once membership assertion. Pages can now appear in multiple books without failing the test.
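The relaxed invariant can be checked as follows. This is a sketch: `result.pages`, `page.pageId`, `result.books`, and `book.pageIds` appear in the test diff, while the `Sketch*` types, the `bookId` field, and both helper functions are hypothetical.

```typescript
interface SketchPage {
  pageId: string;
}

interface SketchBook {
  bookId: string; // hypothetical field, not shown in the diff
  pageIds: string[];
}

interface SketchIngestResult {
  pages: SketchPage[];
  books: SketchBook[];
}

// Count how many books contain each ingested page.
function bookMembershipCounts(result: SketchIngestResult): Map<string, number> {
  const counts = new Map<string, number>();
  for (const page of result.pages) counts.set(page.pageId, 0);
  for (const book of result.books) {
    for (const pid of book.pageIds) {
      if (counts.has(pid)) counts.set(pid, (counts.get(pid) ?? 0) + 1);
    }
  }
  return counts;
}

// "At least one book" holds when this list is empty.
function pagesWithoutBook(result: SketchIngestResult): string[] {
  return Array.from(bookMembershipCounts(result).entries())
    .filter(([, count]) => count === 0)
    .map(([pageId]) => pageId);
}

// p2 sits in two books, which the relaxed invariant allows.
const sketchResult: SketchIngestResult = {
  pages: [{ pageId: "p1" }, { pageId: "p2" }, { pageId: "p3" }],
  books: [
    { bookId: "b1", pageIds: ["p1", "p2"] },
    { bookId: "b2", pageIds: ["p2", "p3"] },
  ],
};

console.log(pagesWithoutBook(sketchResult).length); // prints: 0
```

An exactly-once check would instead require every count to equal 1, which this sample fails for `p2`.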

…line Swap the ordering so that Books/Volumes/Shelves (and their reverse indexes) exist before semantic neighbor insertion attempts to flag dirty volumes via page→book→volume traversal. Also: rename misleading `hierarchyPageIds` in Query.ts to per-tier variables (`volumeIdsFromShelves`, `bookIdsFromVolumes`, `pageIdsFromBooks`) and strengthen the integration test assertion to enforce exactly-once book membership. Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

…book membership assertion Ranking functions (rankShelves/rankVolumes/rankBooks/rankPages/spillToWarm) now return RankedResult { id, score, childIds } so Query.ts can use child IDs from the ranking pass directly instead of re-fetching entities from IDB. Test assertion for page-to-book membership relaxed from "exactly one" to "at least one" per user feedback — pages can legitimately appear in multiple books (e.g., legal corpora with cross-citations). Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

Initial plan

58e7485

Copilot AI assigned Copilot and devlux76 Mar 14, 2026

Copilot started work on behalf of devlux76 March 14, 2026 11:08

Copilot AI and others added 2 commits March 14, 2026 11:13

Initial plan for design alignment audit

e4813bf

Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

fix(#92): finalize semantic neighbor naming — remove remaining Metroi…

e6b44bf

…d references from neighbor graph code Closes #92 Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>

github-actions bot added the layer: testing Test coverage and integration tests label Mar 14, 2026

Copilot AI and others added 6 commits March 14, 2026 11:17

Copilot AI changed the title ~~[WIP] Audit CORTEX design docs against current implementation~~ Integrate HierarchyBuilder into ingestion, add hierarchical query routing, enforce Williams-bound consistency Mar 14, 2026

Copilot AI requested a review from devlux76 March 14, 2026 11:38

Copilot finished work on behalf of devlux76 March 14, 2026 11:38

devlux76 marked this pull request as ready for review March 14, 2026 12:33

Copilot AI review requested due to automatic review settings March 14, 2026 12:33

Copilot started reviewing on behalf of devlux76 March 14, 2026 12:33

Copilot AI reviewed Mar 14, 2026

github-actions bot added the layer: documentation API docs, developer guide, architecture diagrams label Mar 14, 2026

Copilot started work on behalf of devlux76 March 14, 2026 13:07

Copilot finished work on behalf of devlux76 March 14, 2026 13:13

Copilot started work on behalf of devlux76 March 14, 2026 13:13

Copilot finished work on behalf of devlux76 March 14, 2026 13:19

devlux76 merged commit 55d6249 into main Mar 14, 2026
3 checks passed

devlux76 deleted the copilot/audit-design-docs-implementation branch March 14, 2026 17:02

devlux76 merged 11 commits into main from copilot/audit-design-docs-implementation

3 participants
