docs(architecture): add GENOME-FOUNDRY-SENTINEL — artifact-sharing economy on consumer hardware (Lane H design)#1327
Conversation
joelteply
left a comment
Critique here is limited to overlap with already-shipped code; the other comments are better placed on (a)/(b)/(c).
Broker/governor split (Part 11) — endorsed. "PressureBroker keeps owning admission; SubstrateGovernor owns sizing" is the right boundary. My #1299 broker is hard-wired around eviction-when-over-threshold; sizing decisions (working-set caps, speculation depth, federation cadence) genuinely don't belong in it. The governor consuming broker's PressureAlert stream as its pressure-response input (Part 11 cascade) makes the broker a producer for the governor's loop without changing the broker's surface. Clean.
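The split can be sketched in a few lines of Rust. This is a hypothetical illustration, not the shipped #1299 broker surface: the PressureAlert variants, SizingPolicy fields, and respond() helper are all invented names.

```rust
// Hypothetical sketch: the broker produces alerts; the governor consumes
// them and owns sizing. Neither writes into the other's state.

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PressureAlert {
    VramHigh { used_pct: u8 },
    ThermalHot,
    Recovered,
}

/// Sizing decisions the governor owns (working-set caps, speculation
/// depth, federation cadence). Illustrative fields only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct SizingPolicy {
    pub working_set_cap_mb: u32,
    pub speculation_depth: u8,
}

/// The governor consumes the broker's alert stream as its pressure input.
pub fn respond(current: SizingPolicy, alert: PressureAlert) -> SizingPolicy {
    match alert {
        PressureAlert::VramHigh { .. } | PressureAlert::ThermalHot => SizingPolicy {
            working_set_cap_mb: current.working_set_cap_mb / 2,
            speculation_depth: current.speculation_depth.saturating_sub(1),
        },
        PressureAlert::Recovered => current, // restore logic elided
    }
}
```

The point of the sketch is the direction of flow: alerts in, policy out, no shared mutable state between broker and governor.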
Artifact noun overlap with PIECE-2 — pin the relationship before Lane H lands. This doc's Artifact taxonomy (Part 1, six durable kinds + provenance) is conceptually adjacent to PIECE-2's ArtifactKey (runtime/artifact_handle.rs, #1321/#1323) — both name producer-side outputs other parts of the system care about. They're NOT the same scope today:
- PIECE-2's ArtifactKey = transient bus-event keys for module-to-module dispatch (e.g. paging/broker.snapshot). String newtype, no provenance. Routed via MessageBus (per PR-3 design check).
- This doc's Artifact = durable cache-resident objects (model weights, KV pages, LoRAs, traces, compositions). Strongly typed, mandatory Provenance, cached across the 5-tier hierarchy.
That's a real distinction (transient signal vs persistent value) and the doc is correct to NOT just reuse ArtifactKey. But Lane H's PR-1 should explicitly cross-reference PIECE-2 in the docstring + name the relationship: "These are durable, content-addressed artifacts; bus-event keys for inter-module signaling are runtime::artifact_handle::ArtifactKey (PIECE-2). Different abstraction layer; not interchangeable." Otherwise readers will hit both names + assume they're variants of the same thing.
One forward-looking nit on Part 11. "Direct hardware-class detection" as a governor input — make sure this is the same inference_capability::HardwareProfile that codex's #1315 PR-1 ships (cpu_cores, free_vram_bytes, has_metal/has_cuda/has_vulkan). Inventing a second hardware-class type splits the truth. The doc doesn't currently say either way.
LGTM on the overall shape. Lane H sequence (7 PRs) is the right cadence; PR-1 (governor types + policy schema only, no consumers) is the natural pure-functions slice. Peer ack — can't formally approve (shared author identity).
Four updates to ALPHA-GAP-ANALYSIS.md following continuum#1327:
1. Lane H added to the lane status table: Substrate governor + tiered genome cache. Sibling to Lane E (broker owns admission; governor owns sizing). 7-PR implementation sequence detailed in GENOME-FOUNDRY-SENTINEL.md Part 13. Currently Proposed; needs owner claim.
2. Lane claim update at the end of the lane discussion: Lane H proposed via continuum#1327 with the full design pinned to that doc; sibling to Lane E with the boundary stated explicitly.
3. Document Map gets a GENOME-FOUNDRY-SENTINEL entry under "Runtime substrate (load-bearing)" — the artifact-sharing economy on top of the CBAR substrate. Tiered genome cache, page faults, foundry as JIT, sentinel-AI as profile-guided optimizer, demand-aligned recall, composer + speculator, SubstrateGovernor (DVFS).
4. Immediate Next Actions step 9 added: claim Lane H. Step 10 (formerly step 9) updated to reflect what's landed in this doc batch (CBAR-SUBSTRATE refinement via #1324, CONTINUUM-ARCHITECTURE refresh via #1317, CONTINUUM-VISION refresh via #1320, GENOME-FOUNDRY-SENTINEL via #1327) and what's next (CLAUDE.md substrate pointer; stale-section deprecations in UNIVERSAL-SENSORY / LEARNING / QUEUE-DRIVEN-COGNITION).
…onomy on consumer hardware
Adds docs/architecture/GENOME-FOUNDRY-SENTINEL.md, the design doc for
the artifact-sharing economy that flows on top of the CBAR-SUBSTRATE
runtime contract.
The synthesis: persona = process; genome = cache hierarchy; engrams =
paged virtual memory; foundry = JIT compiler; sentinel-AI = profile-
guided optimizer; substrate governor = DVFS. The autonomy side and the
efficiency side are the same architecture seen from two angles. The
substrate works on a MacBook Air (16GB UMA) and on an RTX 5090 (32+64
GB) with the same Rust code; only the governor's policy file differs.
Structure (15 parts + diagram + see-also):
1. Artifact taxonomy — six durable artifact kinds (commands, modules,
personas, LoRA layers, MoE experts, engrams) plus transient
composition state, each with creator / adopter / refinement /
provenance shape. Provenance is mandatory — the substrate refuses
artifacts without it.
2. Cache hierarchy — five tiers (L1 accelerator-resident through L5
cold archive), eviction policy per tier, two hardware anchors
(MacBook Air and RTX 5090). Same Rust code, parameterized.
3. Paging, working set, page faults — WorkingSet + WorkingSetManager
types; PageFault as a typed event on the trace bus; how recurring
faults become the substrate's main "working set mismatch" signal.
4. Compartmentalization — personas as processes, genome pool as shared
read-only library, MMU-style permission table per region, audit log.
5. Foundry as JIT — Foundry trait, SOTASource, ImportedArtifact, why
it's substrate not external service (provenance, hardware-awareness,
federation alignment).
6. Sentinel-AI as profile-guided optimizer — SentinelAI trait,
CognitionTrace, RefinedArtifact, why local-first and one-per-
instance not one-per-persona.
7. Demand-aligned recall — DemandAlignedRecall trait, CapabilityQuery,
RankedPool, RecallScore. The central substrate API every cell
should reach for; persona keeps composition agency.
8. Composition — CompositionPlan, Composer trait, materialize.
Composition is the binary; genome pool is the library; composer
is the linker.
9. Speculative pre-composition — SpeculativeBranch, Speculator trait,
hit-rate tracking. Conservative on Air, aggressive on 5090.
10. Sharing protocol — global-scale hive, eventual consistency with
provenance not MESI, trust-class lookup, trust learned not declared.
11. Substrate governor — DVFS for AI, HardwareClass detection,
PressureSignal cascade in defined order (speculation first,
concurrency next, working set, federation cadence, consolidation
deferral).
12. Artifact lifecycle — Created → Adopted → Refined → Archived →
Retired with provenance preserved at every transition, all
typed events on the trace bus.
13. Connection to CBAR-SUBSTRATE — three connection points (recall on
ModuleContext, broker informs governor, RuntimeFrame carries
CompositionRef). Proposes Lane H in ALPHA-GAP with 7-PR sequence.
14. Acceptance criteria — concrete proofs across provenance,
observability, hardware portability, recall, foundry, sentinel,
lifecycle, compartmentalization, governor cascade.
15. Open questions — 8 real questions the engineer will hit, with
tentative answers (MoE granularity, engram embedding, cross-persona
privacy default, foundry trust anchor, speculation discard cost,
24/7 instance scheduling, federation discovery, composition
stability).
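Part 3's fault-as-signal idea can be sketched as follows. A hedged illustration only: beyond WorkingSet and PageFault, every name and field here is invented, and the real WorkingSetManager will differ.

```rust
// Hypothetical sketch of Part 3: a page fault is a typed trace-bus event,
// and recurring faults on the same artifact become the "working set
// mismatch" signal. All names beyond WorkingSet/PageFault are invented.
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ArtifactId(pub String);

#[derive(Clone, Debug)]
pub struct PageFault {
    pub artifact: ArtifactId,
    pub tier_missed: u8, // which cache tier the lookup fell through to
}

#[derive(Default)]
pub struct WorkingSet {
    fault_counts: HashMap<ArtifactId, u32>,
}

impl WorkingSet {
    /// Record a fault; return true once the same artifact has faulted
    /// enough times to count as a working-set mismatch.
    pub fn record_fault(&mut self, fault: &PageFault, threshold: u32) -> bool {
        let n = self.fault_counts.entry(fault.artifact.clone()).or_insert(0);
        *n += 1;
        *n >= threshold
    }
}
```

The design choice the sketch encodes: a single fault is noise; a recurring fault on the same artifact is the substrate's signal to resize or re-page.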
Architecture diagram included — synthesis flow showing foundry +
sentinel + consolidation feeding genome pool, persona working sets
paging from the pool, substrate governor underneath. Diagram earns
its space; not decorative.
Doc-only PR. No code touched. Every Rust trait shape shown is proposed
(targeted at src/workers/continuum-core/src/genome/, foundry/,
sentinel/, governor/). Implementation lands per ALPHA-GAP Lane H once
the design here is reviewed.
Part 11 was the most under-specified section of the doc relative to its load-bearing role: "same Rust code on Air and 5090, different policy file" is the architectural pitch, but the policy file format was not shown, the cascade thresholds were not stated, and the governor's own performance budget was missing. This commit obsesses on Part 11 specifically, expanding it from ~50 lines to ~280 — deep enough that an engineer can land governor-types (the first Lane H PR) without writing more docs first.

Added subsections:
- Trait surface — SubstrateGovernor with wait-free Arc current_policy reads and subscribe() for wake-on-change; it never blocks readers. Policy is rewritten under pressure, never mutated in place (arc_swap pattern).
- HardwareClass detection — deterministic probe sequence at boot (silicon, vram, system_ram, power_source, thermal_class, battery, thermal_headroom). Each probe has a typed fallback; a silent guess-where-we-are is forbidden by the same no_silent_fallback rule as the rest of the substrate. Re-detection triggers: eGPU hot-plug, power-source change, 5-minute periodic sanity check.
- Policy file format — concrete TOML schemas for the two anchor configurations (Apple M-series thin-and-light 16GB UMA and NVIDIA 5090 workstation). Same schema, same Rust loader, same GovernorPolicy struct — only the numbers differ. Intermediate hardware ships as defaults; ~/.continuum/policy/local.toml is the user-overlay escape hatch.
- Adjustment cascade — thresholds, hysteresis, algorithm. Six steps (0 = normal, 5 = max throttle). Each step has an enter threshold and an exit threshold; the gap between them is the hysteresis that prevents oscillation. Specific signal thresholds named (SpeculationMissRate > 0.5, VRAMHigh > 85, Thermal::Hot, etc.). Rust pseudocode for the step-up / step-down algorithm. Restore-order rule: speculation aggressiveness is restored one step LATER than it was throttled (calibration window) — the single most important anti-oscillation rule.
- Runtime adjustment loop — a small explicit tokio loop, the only place that mutates GovernorState. No subsystem writes to the governor directly; pressure flows in via PressureBroker (CBAR-SUBSTRATE), policy flows out via Arc subscriptions.
- Federation policy reconciliation — deliberately minimal. Instances do NOT sync policy (a 5090 must not be throttled by a fellow Air's pressure). Only RecallScoreWeights are federated, so the federation agrees on what counts as trustworthy without agreeing on hardware sizing.
- Override mechanism — three escape hatches for engineers: the CONTINUUM_POLICY_FILE env var; the ~/.continuum/policy/local.toml overlay; the `continuum governor pin --step N` CLI. All overrides emit typed GovernorOverride events so VDD records aren't misattributed.
- Observability — five event types emitted to the trace bus on every state change. Every VDD record carries the active policy_version and cascade_step, so VDD runs at different throttle levels are attributable to the governor, not noise.
- Performance budget for the governor itself — wait-free reads < 50 ns, subscriber wake < 1 μs, cascade evaluation < 10 μs, policy rewrite < 100 μs, periodic re-evaluation < 1 ms every 5 s. The governor cannot become a contention point or a latency tax; its own performance is part of its acceptance criteria.

The section is now engineer-buildable: the first Lane H PR (governor-types) lands the trait surface, the policy loader, and the hardware detection probes. Subsequent PRs land the cascade algorithm and the federation reconciliation. The doc tells the engineer exactly what each PR ships.

Doc-only change. Part 11 only; other parts of the doc unchanged.
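The enter/exit hysteresis described above can be sketched like this. The threshold numbers are illustrative placeholders, not the doc's actual values; only SpeculationMissRate > 0.5 and VRAMHigh > 85 are named in the text, and pressure is collapsed to a single scalar purely for demonstration.

```rust
// Hypothetical hysteresis sketch: each cascade step has an enter
// threshold and a lower exit threshold; the gap prevents oscillation.
// All numbers below are placeholders, not the doc's policy values.

#[derive(Clone, Copy)]
struct StepThresholds {
    enter: f32, // step up when pressure exceeds this
    exit: f32,  // step down only when pressure falls below this
}

/// Illustrative table for steps 1..=5 (step 0 is "normal").
const CASCADE: [StepThresholds; 5] = [
    StepThresholds { enter: 0.60, exit: 0.50 },
    StepThresholds { enter: 0.70, exit: 0.60 },
    StepThresholds { enter: 0.80, exit: 0.70 },
    StepThresholds { enter: 0.88, exit: 0.78 },
    StepThresholds { enter: 0.95, exit: 0.85 },
];

/// One evaluation tick: move at most one step per tick, in either
/// direction. Inside the enter/exit gap, hold the current step.
pub fn next_step(current: usize, pressure: f32) -> usize {
    if current < CASCADE.len() && pressure > CASCADE[current].enter {
        current + 1 // throttle one step further
    } else if current > 0 && pressure < CASCADE[current - 1].exit {
        current - 1 // recover one step
    } else {
        current // inside the hysteresis band: hold
    }
}
```

Because the exit threshold sits below the enter threshold, pressure hovering near a boundary cannot flap the step up and down every tick; the speculation-restored-one-step-later rule adds a second, asymmetric layer on top of this.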
…ss the grid

Recall is the single most-used substrate primitive and the place where consumer-hardware federation either earns its keep or doesn't. Previously sketched at the trait level; now deep enough that an engineer can land the recall PR confidently and another agent can write a compliant client against it.

The dynamicism-across-the-grid framing changed the shape of this section. Recall is no longer a local lookup — it's the substrate the federated underdogs use to coordinate, and the ingenuity of its design is what makes a swarm of consumer machines compete with single-datacenter brute force.

Added subsections (in order):
- Trait surface — explicit recall() + replay() pair. CapabilityQuery gains RecallScope (Local | LocalThenGrid | Federation) and FreshnessTarget. PersonaContext is explicit. RankedPool gains a per-artifact ResidencyHint so the persona sees not just what's relevant but where it lives and what it costs to use. This is the load-bearing addition: cost-aware composition without the persona having to know the topology.
- The scoring function — explicit, tunable, sentinel-refined. Concrete Rust score() showing how the five factors combine; each factor has a clean definition. grid_penalty(latency_ms) is the steep cost function for federated recall: same-LAN ~0.55, cross-region ~0.15. The penalty is steep on purpose — a hot local L3 hit usually wins, which is why a federated swarm of Airs can compete with a datacenter (the swarm's local cache wins latency; the swarm's diversity wins coverage; the substrate's recall makes both visible).
- Dynamic weights — both governor and sentinel tune. The governor sets per-hardware-class baseline weights (Air emphasizes tier_proximity; 5090 emphasizes semantic match because it has room to hold more hot). The sentinel observes recall→outcome chains and refines per-persona weights as profile-guided optimization of the recall function itself. Sentinel-refined weights are themselves publishable artifacts with provenance.
- Indexing — sub-ms local, coordinated grid. Four layered structures with explicit costs: working-set index (in-memory HashMap, < 1 ms log n); local catalog (sqlite + hnsw ANN, < 1 ms top-K); grid catalog (gossip-propagated peer summaries, < 5 ms cached); federation catalog (pull-based, governor-rate-limited). The first layer that satisfies budget + freshness wins.
- Within-turn caching and coalescing — two behaviors: memoization of identical CapabilityQuery within one turn; coalescing of concurrent identical queries via a shared BroadcastReceiver. Across personas, coalescing is sub-query (embed once, ANN-lookup once, score per-persona). Prevents the multi-recall-per-turn pattern from re-running the pipeline.
- Cross-instance recall — the grid coordination layer. Three rules: per-instance pull cadence governs both pushes and pulls (Air ≈ 10 min, 5090 ≈ 1 min); the grid catalog is gossip-propagated, NOT query-on-demand, so recall hits the local cache of the gossip at sub-ms latency; grid artifact blobs require explicit promotion to fetch — RankedPool shows GridPeer residency without paying network cost until the persona pins. The win: a swarm of Airs gossiping summaries every 10 minutes has an effectively realtime federated artifact catalog, because the scoring function uses the cached summary. Only on pin does the blob move. Performs on cellular bandwidth, and coordinates at the level of "what exists, what's been refined."
- Replay semantics — RecallTrace captures the snapshotted query + context + policy version + content-hashed catalog snapshot + returned pool. replay(trace) re-runs score() deterministically. The sentinel uses this to attribute "did my refinement actually win the ranking?" — without deterministic replay, the sentinel can't tell help from luck.
- Recall under pressure — explicit table mapping governor cascade steps 0..5 to recall behavior. Step 5 caps at L1+L2 only; cold archive returns Deferred(MemoryPressure). Recall under pressure is correct — it doesn't lie and doesn't return placeholders; it returns smaller pools with explicit Deferred entries. The composer sees this and narrows or defers; it never silently degrades.
- Performance budget — concrete sub-ms targets for both anchors. The first three rows (within-turn cache hit, working-set index hit, local catalog ANN) cover ≥ 95% of recalls. Acceptance criteria include a P50/P99 smoke test.
- "Why this earns its space in the doc" — six properties together (local-first, gossip-aware, sentinel-refined, governor-tuned, cost-visible-to-persona, deterministic-in-replay) let an Air solo, a 5090 solo, and a mixed swarm all use the same Rust code path and all benefit from each other's evolved genome. Dynamicism-across-the-grid made concrete.

Section grew from ~40 lines to ~280. Engineer-buildable. The Part 7 PR (recall-api) is now a clean piece of work: trait + scoring function + working-set index + within-turn cache + local catalog. Grid + federation + replay are subsequent PRs in the same Lane H sequence.

Doc-only change. Part 7 only.
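A hedged sketch of a score() with that shape. The five factors and the penalty anchors (~0.55 same-LAN, ~0.15 cross-region) come from the text above; the weights, the linear combination, and the latency bands are invented for illustration.

```rust
// Hypothetical sketch of the recall scoring shape. The factor list
// follows the text; the combination rule and latency bands are guesses.

pub struct RecallScoreWeights {
    pub semantic: f32,
    pub tier_proximity: f32,
    pub freshness: f32,
    pub trust: f32,
    pub grid: f32,
}

pub struct CandidateFactors {
    pub semantic_match: f32,          // 0..1, query-embedding similarity
    pub tier_proximity: f32,          // 0..1, 1.0 = accelerator-resident
    pub freshness: f32,               // 0..1, decays with artifact age
    pub trust: f32,                   // 0..1, learned trust class
    pub grid_latency_ms: Option<f32>, // None = local artifact
}

/// Steep cost for federated hits: same-LAN ~0.55, cross-region ~0.15.
/// The band boundaries (5 ms / 50 ms) are illustrative.
pub fn grid_penalty(latency_ms: f32) -> f32 {
    if latency_ms < 5.0 { 0.55 } else if latency_ms < 50.0 { 0.30 } else { 0.15 }
}

pub fn score(w: &RecallScoreWeights, c: &CandidateFactors) -> f32 {
    let grid_factor = c.grid_latency_ms.map_or(1.0, grid_penalty);
    w.semantic * c.semantic_match
        + w.tier_proximity * c.tier_proximity
        + w.freshness * c.freshness
        + w.trust * c.trust
        + w.grid * grid_factor
}
```

Even in this toy form the claimed property holds: a hot local hit outscores a cross-region candidate with a somewhat better semantic match, because the grid factor collapses from 1.0 to ~0.15.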
eab0c79 to eb60c60
… document map; lane status truth-up (#1316)

* docs(alpha): refresh status against 2026-05-16 canary

  Three changes to ALPHA-GAP-ANALYSIS.md:
  1. Header date 2026-05-13 -> 2026-05-16. Add explicit cross-link to CBAR-SUBSTRATE-ARCHITECTURE.md as the runtime substrate spec.
  2. Restructure the Document Map (was a flat list) into categorized references (Runtime substrate / Cognition migration / Memory paging / Model registry / Grid), and add the precedence rule: if any supporting doc disagrees with ALPHA-GAP on the substrate contract (concurrency, scheduling, memory, pressure, telemetry, artifact handles), defer to CBAR-SUBSTRATE-ARCHITECTURE.md.
  3. Refresh the Current Snapshot table against canary @ 2026-05-16:
     - Rust core row reflects the PressureBroker bootstrap stack (#1307 / #1308 / #1310), the runtime lease broker (#1313), cognition oxidization (#1284 / #1290 / #1291 / #1293 / #1298 / #1301 / #1303 / #1292), the dead-Candle deletes (#1277 / #1279 / #1281 / #1288), and the inference-grpc fail-closed (#1314). GRID-INFERENCE-ROUTING PR-1 announcer in flight on feat/grid-inference-routing-pr2-announcer.
     - Node/TS row notes the net-negative trend (~2500 LOC of TS deleted via the 8-PR cognition stacks).
     - Docker row records Docker tier Phase 1 (#1297).
     - Config row records the SQLite-first default (#1271).
     - Tests row records the no-CPU-fallback contract gap: the existing regression test in workers/continuum-core covers llama.cpp / ORT only, not the Candle-side paths where the orpheus + inference-grpc fallbacks lived before #1314.

* docs(alpha): refresh lane status table and immediate-next-actions

  Two updates to ALPHA-GAP-ANALYSIS.md:
  1. Lane status table now reflects actual state @ 2026-05-16, not aspiration:
     - Lane A: in progress; model_registry/ exists with the admission resolver.
     - Lane B: Phase 1 landed (#1297 docker-tier-stats); GPU profile + tier-pool eviction (#1238 / #1239) still open.
     - Lane C: structured RuntimeMetric emits from inference paths; vdd-report-command not yet bound.
     - Lane D: UNSTARTED — flagged as the highest-leverage open lane because Lane E (PressureBroker) and the inbox coalescing pattern both presuppose RuntimeFrame / CognitionTurnFrame.
     - Lane E: bootstrap landed (#1307 / #1308 / #1310 / #1313); paging and pre-broker concurrency-hack deletion remain. Concrete deletion target called out: get_num_workers() in inference-grpc/main.rs, which reads INFERENCE_WORKERS from config.env and otherwise picks the worker count from system memory at startup — both branches violate the "we do not hard code" / "dynamic, broker-owned concurrency" rule.
     - Lane F: ~2500 LOC of TS deleted manually this session; the mechanical CI ratchet still not landed (the deletion is reversible until it is).
     - Lane G: refresh in flight on joel/docs-alpha-refresh.
     Adds an "adjacent active workstream" note for GRID-INFERENCE-ROUTING (PR-1 announcer + probe + registry in flight on feat/grid-inference-routing-pr2-announcer) as the grid-side counterpart to Lane A.
  2. Immediate Next Actions reordered by alpha leverage, not by who is online. The top three items are the Lane D claim, the universal-trait "for free" triplet (RuntimeModule base trait + derive macro + scaffold generator from CBAR-SUBSTRATE-ARCHITECTURE.md), and the get_num_workers() deletion. Adds the Lane C VDD report command and the widening of no_cpu_fallback_contract.rs to cover Candle paths. Adds doc-refresh follow-ups so each supporting doc gets cross-linked back into the Document Map.

* docs(alpha): add Lane H + GENOME-FOUNDRY-SENTINEL cross-links

  Four updates to ALPHA-GAP-ANALYSIS.md following continuum#1327:
  1. Lane H added to the lane status table: Substrate governor + tiered genome cache. Sibling to Lane E (broker owns admission; governor owns sizing). 7-PR implementation sequence detailed in GENOME-FOUNDRY-SENTINEL.md Part 13. Currently Proposed; needs owner claim.
  2. Lane claim update at the end of the lane discussion: Lane H proposed via continuum#1327 with the full design pinned to that doc; sibling to Lane E with the boundary stated explicitly.
  3. Document Map gets a GENOME-FOUNDRY-SENTINEL entry under "Runtime substrate (load-bearing)" — the artifact-sharing economy on top of the CBAR substrate. Tiered genome cache, page faults, foundry as JIT, sentinel-AI as profile-guided optimizer, demand-aligned recall, composer + speculator, SubstrateGovernor (DVFS).
  4. Immediate Next Actions step 9 added: claim Lane H. Step 10 (formerly step 9) updated to reflect what's landed in this doc batch (CBAR-SUBSTRATE refinement via #1324, CONTINUUM-ARCHITECTURE refresh via #1317, CONTINUUM-VISION refresh via #1320, GENOME-FOUNDRY-SENTINEL via #1327) and what's next (CLAUDE.md substrate pointer; stale-section deprecations in UNIVERSAL-SENSORY / LEARNING / QUEUE-DRIVEN-COGNITION).

---------
Co-authored-by: Test <test@test.com>
…ning; navigate to MODULE-CATALOG queue

Second refresh of ALPHA-GAP Immediate Next Actions to reflect work landed since #1316 merged. Six items closed; navigation into the MODULE-CATALOG queue made explicit.

Closed: #6 contract widening (#1341), #8 GRID-INFERENCE-ROUTING PR-1 (#1315), CBAR-PIECE-5 end-to-end (#1331/#1333/#1335/#1338), PIECE-8 inference-grpc hardcoded clamps (#1340), doc-family architecture surface (#1324/#1327/#1332/#1336/#1337 open; #1316/#1317/#1320/#1329 merged).

Item #9 reorganized to point at MODULE-CATALOG's 'Next Modules To Build' queue (audit-recorder → threat-detector → working-set-manager → demand-aligned-recall → substrate-governor). Adds a closeout summary section listing what's done, what's open (5 architecture-doc PRs ready for review + 2 airc PRs), and what's queued (5 modules with dependency state + LoC + acceptance criteria in MODULE-CATALOG). The doc-driven development cycle is working: doc spec → implementing agent picks up → ships PR → next spec referenced.
…ning; navigate to MODULE-CATALOG queue (#1342)
What
Adds `docs/architecture/GENOME-FOUNDRY-SENTINEL.md` (716 lines), the design doc for the artifact-sharing economy that flows on top of CBAR-SUBSTRATE's runtime contract.
Doc-only. No code. Every Rust trait shape shown is proposed, targeted at `src/workers/continuum-core/src/{genome,foundry,sentinel,governor}/`. Implementation lands per ALPHA-GAP Lane H once this design is reviewed.
The Synthesis
Persona = process. Genome = cache hierarchy. Engrams = paged virtual memory. Foundry = JIT compiler. Sentinel-AI = profile-guided optimizer. Substrate governor = DVFS.
The autonomy side (artifact-sharing economy) and the efficiency side (classical computer-architecture toolbox) are the same architecture seen from two angles. The substrate works on a MacBook Air (16 GB UMA) and on an RTX 5090 (32 + 64 GB) with the same Rust code path — only the governor's policy file differs.
Structure
15 parts + one architecture diagram + see-also.
Architecture Diagram
Synthesis flow showing foundry + sentinel + consolidation feeding the genome pool; persona working sets paging from the pool; substrate governor underneath governing tier sizes / cadences / concurrency / speculation. Two hardware anchors named. Diagram earns its space — not decorative.
Connection to existing PRs
This is the doc the previous four refinements have been pointing at.
Follow-up PRs (small) will add cross-link pointers in #1316 (Lane H entry) and #1324 (See Also bump).
Why This Doc Is Worth Reading In Full
Two design choices that anchor the rest:
Provenance is mandatory at the type level. Every artifact carries a `Provenance` record. The substrate refuses artifacts without one. This is what `no_silent_fallback` looks like at the artifact economy layer — and what makes federation safe at global scale.
The governor is one Rust subsystem; the policy file is what changes per hardware. Same code on Air and 5090. The architectural beauty is that "runs on consumer hardware" stops being a degraded mode and becomes the same architecture with different governor settings.
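One way the mandatory-provenance refusal can live in the type system, sketched under the assumption that the doc's Provenance is a plain struct; the field names here are illustrative, not the doc's.

```rust
// Hypothetical sketch: provenance is a non-optional field, so an
// Artifact without provenance is unrepresentable. This is the type-level
// analogue of no_silent_fallback. Field names are illustrative.

#[derive(Clone, Debug)]
pub struct Provenance {
    pub creator: String,      // originating instance / persona
    pub created_at_ms: u64,   // creation timestamp
    pub lineage: Vec<String>, // refinement chain, oldest first
}

#[derive(Clone, Debug)]
pub struct Artifact {
    pub id: String,
    pub provenance: Provenance, // not Option<Provenance>: refusal by construction
}

impl Artifact {
    /// The only constructor: callers cannot produce an artifact
    /// without supplying provenance.
    pub fn new(id: impl Into<String>, provenance: Provenance) -> Self {
        Self { id: id.into(), provenance }
    }
}
```

With this shape the substrate does not need a runtime "reject if provenance missing" branch for locally-built artifacts; only deserialization from federation peers needs a validating path.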
Validation