
feat(brain): section posteriors with Thompson Sampling (ADR-048 Phase 1) #505

Closed
ohdearquant wants to merge 7 commits into main from feat/brain-persistence-impl

Conversation

@ohdearquant
Owner

Summary

  • SectionType enum (10 knowledge-section types) with per-section Beta posteriors
  • Combinatorial Thompson Sampling weight derivation: softmax over Gamma-sampled Beta posteriors
    with temperature schedule (ADR-048 Correction 1 — NOT normalized-Beta which has linear regret)
  • ESS cap using (m, s) canonical form preserving mean exactly (ADR-048 Correction 2)
  • V20 migration: section_posteriors table with (profile_id, namespace, section) unique constraint
  • brain.create_profile handler accepts seed_priors param for role-based initialization
  • Marsaglia-Tsang gamma sampler (no external stats dependency)
  • 660 insertions across 7 files
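The weight derivation in the summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: `SplitMix64`, `sample_gamma`, `sample_beta`, and `thompson_weights` are hypothetical names, the RNG is a stand-in chosen only to avoid external crates, and the temperature `tau` is passed in rather than computed from a schedule.

```rust
/// SplitMix64: a tiny deterministic RNG so the sketch needs no external crates.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform sample in (0, 1], so `ln` never sees zero.
    pub fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) + 1) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    pub fn normal(&mut self) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Marsaglia-Tsang sampler for Gamma(shape, scale = 1).
pub fn sample_gamma(shape: f64, rng: &mut SplitMix64) -> f64 {
    if shape < 1.0 {
        // Boost trick: Gamma(k) = Gamma(k + 1) * U^(1/k) for k < 1.
        let boost = rng.uniform().powf(1.0 / shape);
        return sample_gamma(shape + 1.0, rng) * boost;
    }
    let d = shape - 1.0 / 3.0;
    let c = 1.0 / (9.0 * d).sqrt();
    loop {
        let x = rng.normal();
        let v = 1.0 + c * x;
        if v <= 0.0 {
            continue;
        }
        let v3 = v * v * v;
        let u = rng.uniform();
        // Accept-reject test from Marsaglia and Tsang (2000).
        if u.ln() < 0.5 * x * x + d - d * v3 + d * v3.ln() {
            return d * v3;
        }
    }
}

/// Beta(a, b) draw as X / (X + Y) with X ~ Gamma(a), Y ~ Gamma(b).
pub fn sample_beta(a: f64, b: f64, rng: &mut SplitMix64) -> f64 {
    let x = sample_gamma(a, rng);
    let y = sample_gamma(b, rng);
    x / (x + y)
}

/// One Thompson draw per section, then a temperature softmax over the draws.
/// `tau` must be positive; smaller values concentrate weight on the best draw.
pub fn thompson_weights(posteriors: &[(f64, f64)], tau: f64, rng: &mut SplitMix64) -> Vec<f64> {
    let draws: Vec<f64> = posteriors
        .iter()
        .map(|&(a, b)| sample_beta(a, b, rng))
        .collect();
    // Subtract the max before exponentiating for numerical stability.
    let max = draws.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = draws.iter().map(|d| ((d - max) / tau).exp()).collect();
    let total: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / total).collect()
}
```

Calling `thompson_weights(&[(1.0, 1.0), (8.0, 2.0)], 0.5, &mut SplitMix64(42))` returns two positive weights that sum to 1; repeated calls give different weights because each call redraws from the posteriors.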

Test plan

  • 131 existing brain pack tests pass (129 unit + 2 integration)
  • 11 new section tests: weight normalization, ESS cap mean preservation, role seed priors
  • cargo clippy -p khive-pack-brain --all-targets -- -D warnings clean
  • cargo check --workspace clean

🤖 Generated with Claude Code

ohdearquant and others added 2 commits May 27, 2026 16:23
…ivation (ADR-048 Phase 1)

- SectionType enum (10 knowledge-section types) in state.rs
- SectionPosterior struct with Beta(a,b) updates, ESS cap via (m,s) canonical form
- section.rs: combinatorial TS weight derivation using softmax-over-Gamma-sampled Beta
  posteriors with temperature schedule (Marsaglia-Tsang gamma sampler)
- V20 migration: section_posteriors table with (profile_id, section) unique constraint
- BrainState: section_posteriors HashMap integrated into snapshots + restore
- create_profile handler: seed_priors parameter for role-based prior initialization
  ("implementer", "researcher", "tester" templates)
- 11 tests covering weight normalization, ESS cap mean preservation, role seed priors
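The mean-preserving ESS cap mentioned in this commit can be illustrated with a short sketch. It assumes the (m, s) canonical form means m = a / (a + b) and s = a + b; the type `BetaPosterior` and its methods are hypothetical stand-ins, not the `SectionPosterior` struct from the PR.

```rust
/// Hypothetical stand-in for a per-section Beta posterior.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BetaPosterior {
    pub alpha: f64,
    pub beta: f64,
}

impl BetaPosterior {
    /// Posterior mean m = alpha / (alpha + beta).
    pub fn mean(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }

    /// Effective sample size s = alpha + beta.
    pub fn ess(&self) -> f64 {
        self.alpha + self.beta
    }

    /// Standard Beta-Bernoulli update: one success or one failure.
    pub fn update(&mut self, success: bool) {
        if success {
            self.alpha += 1.0;
        } else {
            self.beta += 1.0;
        }
    }

    /// Cap the ESS at `cap` while keeping the mean fixed:
    /// convert to (m, s), clamp s, then convert back to (alpha, beta).
    pub fn apply_ess_cap(&mut self, cap: f64) {
        let s = self.ess();
        if s <= cap {
            return;
        }
        let m = self.mean();
        self.alpha = m * cap;
        self.beta = (1.0 - m) * cap;
    }
}
```

For example, capping Beta(90, 30) at 50 gives Beta(37.5, 12.5): the mean stays at 0.75 while the posterior becomes easier to move with new evidence.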

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…orState

Resolves all 3 CRIT and 7 MAJ findings from codex review of PR #505:

- F-001/F-002/F-003: delete conflicting section.rs types; section.rs now
  delegates to SectionPosteriorState.weights() / deterministic_weights()
- F-004/F-005: implement ADR-048 Correction 1 temperature schedule in
  sample_weights(): tau = tau_0 * (exploration_epoch / epoch_max),
  softmax over Thompson samples with numerically stable max-subtraction
- F-005/F-007: exploit path uses tau_exploit=0.1 softmax over posterior
  means; apply_floor_and_renorm() iterates to guarantee all weights ≥ 0.05
- F-006: DEFAULT_ESS_CAP 50 → 100 (ADR-048 Correction 2)
- F-009/F-010: lib.rs section_posteriors → section_states; feedback routes
  through SectionPosteriorFold (only updates when section_signals present,
  not uniform update to all sections)
- Fix apply_ess_cap formula: scale = (cap - prior_ess)/(ess - prior_ess)
  so resulting ESS equals cap exactly
- Add Default impls for SectionPosteriorState and SectionPosteriorFold

143 unit tests + 2 integration tests pass; clippy -D warnings clean.
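A rough sketch of the pieces this commit describes: the linear temperature schedule, the stable softmax, the exploit path with a 0.05 weight floor, and the ESS-cap scale factor. All function names and signatures here are hypothetical, and `shrink_to_cap` reflects only one reading of the formula (shrinking the evidence accumulated above the prior).

```rust
/// Softmax with temperature, subtracting the max for numerical stability.
pub fn softmax(scores: &[f64], tau: f64) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| ((s - max) / tau).exp()).collect();
    let total: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / total).collect()
}

/// tau = tau_0 * (exploration_epoch / epoch_max), as stated in the commit.
/// This is 0 at epoch 0; `softmax` needs tau > 0, so a caller would need a
/// lower bound that the source does not specify.
pub fn exploration_tau(tau_0: f64, exploration_epoch: u32, epoch_max: u32) -> f64 {
    tau_0 * (exploration_epoch as f64 / epoch_max as f64)
}

/// Raise every weight to at least `floor`, rescaling the rest so the total is 1.
/// Iterates because rescaling can push another weight below the floor.
pub fn apply_floor_and_renorm(weights: &mut [f64], floor: f64) {
    assert!(floor * weights.len() as f64 <= 1.0, "floor is infeasible");
    let mut pinned = vec![false; weights.len()];
    loop {
        let mut changed = false;
        for (w, p) in weights.iter_mut().zip(pinned.iter_mut()) {
            if !*p && *w < floor {
                *p = true;
                *w = floor;
                changed = true;
            }
        }
        let pinned_mass = floor * pinned.iter().filter(|&&p| p).count() as f64;
        let free_mass: f64 = weights
            .iter()
            .zip(pinned.iter())
            .filter(|&(_, &p)| !p)
            .map(|(w, _)| *w)
            .sum();
        if free_mass > 0.0 {
            let scale = (1.0 - pinned_mass) / free_mass;
            for (w, p) in weights.iter_mut().zip(pinned.iter()) {
                if !*p {
                    *w *= scale;
                }
            }
        }
        if !changed {
            break;
        }
    }
}

/// Exploit path: softmax over posterior means with tau_exploit = 0.1, then floor.
pub fn exploit_weights(posteriors: &[(f64, f64)]) -> Vec<f64> {
    let means: Vec<f64> = posteriors.iter().map(|&(a, b)| a / (a + b)).collect();
    let mut weights = softmax(&means, 0.1);
    apply_floor_and_renorm(&mut weights, 0.05);
    weights
}

/// scale = (cap - prior_ess) / (ess - prior_ess), applied to the evidence above
/// the prior so the resulting ESS equals `cap`. Assumes cap >= prior ESS.
pub fn shrink_to_cap(posterior: (f64, f64), prior: (f64, f64), cap: f64) -> (f64, f64) {
    let ess = posterior.0 + posterior.1;
    let prior_ess = prior.0 + prior.1;
    if ess <= cap {
        return posterior;
    }
    let scale = (cap - prior_ess) / (ess - prior_ess);
    (
        prior.0 + scale * (posterior.0 - prior.0),
        prior.1 + scale * (posterior.1 - prior.1),
    )
}
```

With posteriors Beta(9, 1), Beta(1, 9), and Beta(5, 5), `exploit_weights` pins the two weaker sections at the 0.05 floor and leaves 0.9 on the strongest one.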

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ohdearquant
Owner Author

Merge sequence

| Order | PR | Description | Base | Status |
|---|---|---|---|---|
| 1 | #499 | authorize() → Result | main | Codex: fixed |
| 2 | #500 | EntityType registry | #499 | Codex: fixed |
| 3 | #501 | Lifecycle tests | main | Codex: fixed |
| 4 | #502 | Memory/brain/knowledge | #499 | Codex: fixed |
| 5 | #503 | Write-key conflicts | #499 | Codex: fixed |
| 6 | #504 | Vamana ANN index | main | Codex: fixed |
| 7 | #505 | Brain section posteriors | main | Codex: fixed |
| 8 | #506 | Runtime backfill/sweep/pipeline | main | Codex: fixed |
| 9 | #507 | KG namespace isolation | main | New |

Merge #499 first — PRs #500, #502, #503 depend on it (GitHub auto-retargets on merge).

ohdearquant and others added 4 commits May 27, 2026 16:55
…sertions

DEFAULT_ESS_CAP was set to 100.0 by the prior agent commit, diverging
from ADR-048 spec which defines 50.0. Corrected to match spec.

V20 migration test assertions still referenced the old table name
(section_posteriors) instead of the actual V20 tables
(brain_profile_snapshots, brain_event_log). Updated all 4 test
assertions to verify the correct table and index names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the persistence layer that saves brain state to the V20
tables (brain_profile_snapshots, brain_event_log):

- persist.rs: new module with append_brain_event(), upsert_snapshot(),
  load_latest_snapshot(), load_events_since(), ensure_loaded(), and
  persist_after_feedback() async functions
- PersistenceTracker: tracks loaded namespaces and dirty event counts
  with configurable batch threshold (default 5)
- Save path: after every feedback fold, append to brain_event_log;
  upsert brain_profile_snapshots when dirty count reaches threshold
- Load path: ensure_loaded() lazily loads latest snapshot and replays
  newer events through BalancedRecallFold + SectionPosteriorFold
- Wired into dispatch() for lazy initialization on first verb call
- Persistence failures are non-fatal (logged to stderr); the shared
  event store remains the source of truth for event history
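The batching behaviour of the save path can be sketched as a small counter. This is a hedged illustration only: the field names and the `mark_loaded` / `record_event` methods are assumptions, and the real `PersistenceTracker` works alongside async database calls that are omitted here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical sketch: tracks which namespaces are loaded and how many
/// events each has appended since its last snapshot.
pub struct PersistenceTracker {
    loaded: HashSet<String>,
    dirty: HashMap<String, usize>,
    batch_threshold: usize,
}

impl Default for PersistenceTracker {
    /// The commit message states a default batch threshold of 5.
    fn default() -> Self {
        Self::new(5)
    }
}

impl PersistenceTracker {
    pub fn new(batch_threshold: usize) -> Self {
        Self {
            loaded: HashSet::new(),
            dirty: HashMap::new(),
            batch_threshold,
        }
    }

    pub fn is_loaded(&self, namespace: &str) -> bool {
        self.loaded.contains(namespace)
    }

    /// Returns true the first time a namespace is marked, false afterwards.
    pub fn mark_loaded(&mut self, namespace: &str) -> bool {
        self.loaded.insert(namespace.to_string())
    }

    /// Call after appending one event to the log. Returns true when the dirty
    /// count reaches the threshold (a snapshot upsert is due) and resets it.
    pub fn record_event(&mut self, namespace: &str) -> bool {
        let count = self.dirty.entry(namespace.to_string()).or_insert(0);
        *count += 1;
        if *count >= self.batch_threshold {
            *count = 0;
            true
        } else {
            false
        }
    }
}
```

With the default threshold, the first four calls to `record_event("ns")` return false and the fifth returns true, which is the point where a snapshot upsert would run.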

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RuntimeConfig::default() now includes additional_embedding_models with
ParaphraseMultilingualMiniLmL12V2, which gets registered in the DB and
causes note-creation tests to attempt model loading. In CI, the ONNX
model files don't exist, causing 5 integration tests to fail with
"model initialization failed: IO error: No such file or directory".

Fix: explicitly set additional_embedding_models: vec![] in all test
RuntimeConfig blocks that set embedding_model: None.
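The shape of that fix can be shown with struct update syntax. Everything below is a stand-in for illustration: the real `RuntimeConfig` has fields not shown in this PR, and the model identifiers, the `db_path` field, and the `test_config` helper are hypothetical.

```rust
/// Stand-in for the real config type; only the two fields named in the
/// commit message come from the source, the rest is hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct RuntimeConfig {
    pub embedding_model: Option<String>,
    pub additional_embedding_models: Vec<String>,
    pub db_path: String,
}

impl Default for RuntimeConfig {
    fn default() -> Self {
        Self {
            // Hypothetical default; the source does not say what this is.
            embedding_model: Some("primary-model".to_string()),
            // Mirrors the new default that triggered model loading in CI.
            additional_embedding_models: vec!["ParaphraseMultilingualMiniLmL12V2".to_string()],
            db_path: ":memory:".to_string(),
        }
    }
}

/// Test config that disables every embedding model explicitly, so no test
/// inherits a model from `Default` and tries to load ONNX files.
pub fn test_config() -> RuntimeConfig {
    RuntimeConfig {
        embedding_model: None,
        additional_embedding_models: vec![],
        ..RuntimeConfig::default()
    }
}
```

Setting both fields explicitly matters because `..RuntimeConfig::default()` silently fills in any field the test does not mention, including ones added to the default later.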

Also includes cargo fmt and deno fmt fixes for runtime.rs and ADR docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ohdearquant
Owner Author

@codex review

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 025a6a279a

Comment on lines +204 to +205
    if let Some((snapshot, updated_at)) = snapshot_result {
        let replay_events = load_events_since(sql.as_ref(), &namespace, updated_at).await?;

P1: Initialize state for namespaces without snapshots

When a namespace has no row in brain_profile_snapshots, this branch is skipped and the shared BrainState is left as whatever namespace was previously loaded. In a long-running server, the first brain.* request for a fresh namespace after namespace A was mutated will see A's profiles/bindings/posteriors, and any feedback already appended before the first batched snapshot is also never replayed because load_events_since is only called inside this branch. Treat the no-snapshot case as a fresh BrainState for this namespace and replay the event log from the beginning.
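The suggested behaviour can be sketched synchronously with stand-in types. This is an illustration of the control flow only, under the assumption that events carry a comparable timestamp; `BrainState`, `Event`, and `load_namespace` here are simplified, hypothetical versions, and the real code is async and folds events through `BalancedRecallFold` and `SectionPosteriorFold`.

```rust
/// Stand-in state: records which event ids have been folded in.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct BrainState {
    pub applied_events: Vec<u64>,
}

/// Stand-in event with an id and a timestamp.
pub struct Event {
    pub id: u64,
    pub at: u64,
}

/// Build the state for one namespace. With a snapshot, replay only newer
/// events; without one, start from a fresh state and replay the whole log,
/// so state from a previously loaded namespace is never reused.
pub fn load_namespace(snapshot: Option<(BrainState, u64)>, event_log: &[Event]) -> BrainState {
    let (mut state, since) = match snapshot {
        Some((state, updated_at)) => (state, Some(updated_at)),
        None => (BrainState::default(), None),
    };
    for event in event_log {
        let is_newer = match since {
            Some(cutoff) => event.at > cutoff,
            None => true,
        };
        if is_newer {
            state.applied_events.push(event.id);
        }
    }
    state
}
```

The key property is that both branches produce a state owned by this namespace: the `None` arm no longer falls through to whatever was loaded before.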


Comment on lines +928 to +930
    if let Some(ref ss) = p.section_signals {
        data["section_signals"] = ss.clone();
    }

P2: Reject malformed section feedback

If section_signals contains an unknown section key or signal value, the handler still returns emitted: true and stores the payload, but interpret() parses it with .ok() and converts the whole map to None, so SectionPosteriorFold performs no update. For knowledge-compose feedback with a typo such as "formalisms" or "usefull", callers get a successful response while the section posterior is silently unchanged; validate this object here and return InvalidInput before appending the event.
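One way to validate up front is to parse every key and value strictly and fail on the first unknown one. The sketch below is an assumption-heavy illustration: the `Section` and `Signal` variants and their string forms are guesses inferred from the typo examples, `FeedbackError` is a stand-in for the handler's real error type, and the input is simplified to string pairs rather than a JSON object.

```rust
use std::str::FromStr;

/// Hypothetical subset of section types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Section {
    Formalism,
    Example,
}

/// Hypothetical signal values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal {
    Useful,
    NotUseful,
}

/// Stand-in for the handler's error type.
#[derive(Debug, PartialEq, Eq)]
pub enum FeedbackError {
    InvalidInput(String),
}

impl FromStr for Section {
    type Err = FeedbackError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "formalism" => Ok(Section::Formalism),
            "example" => Ok(Section::Example),
            other => Err(FeedbackError::InvalidInput(format!("unknown section: {other}"))),
        }
    }
}

impl FromStr for Signal {
    type Err = FeedbackError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "useful" => Ok(Signal::Useful),
            "not_useful" => Ok(Signal::NotUseful),
            other => Err(FeedbackError::InvalidInput(format!("unknown signal: {other}"))),
        }
    }
}

/// Parse every (section, signal) pair; any unknown key or value rejects the
/// whole payload before an event would be appended.
pub fn validate_section_signals(
    raw: &[(&str, &str)],
) -> Result<Vec<(Section, Signal)>, FeedbackError> {
    let mut parsed = Vec::with_capacity(raw.len());
    for &(section, signal) in raw {
        parsed.push((section.parse::<Section>()?, signal.parse::<Signal>()?));
    }
    Ok(parsed)
}
```

Failing the whole payload rather than dropping bad entries keeps the response honest: the caller either gets an update for every section it named or an error naming the offending key.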


ADR-048 Correction 2 specifies ESS cap = 100 (half-life ~140 events).
The research-derived optimal C_opt ≈ 1/(2ε²) for ε=0.07 yields ~100.
50 was too aggressive for slowly-stabilizing sections.
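Plugging ε = 0.07 into the stated formula gives a value close to the rounded cap of 100 (this checks only the arithmetic, not the derivation of the formula itself):

```latex
C_{\mathrm{opt}} \approx \frac{1}{2\varepsilon^{2}}
  = \frac{1}{2 \times 0.07^{2}}
  = \frac{1}{0.0098}
  \approx 102
```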

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ohdearquant
Owner Author

Superseded by clean rebase — new PR incoming (old branch had 20+ merge conflicts from cherry-picked CI fixes).
