Skip to content

fix(memory): serialize memory_tree schema init across worker pool#2059

Merged
senamakel merged 12 commits into
tinyhumansai:mainfrom
oxoxDev:fix/sentry-mt-init-worker-race
May 19, 2026
Merged

fix(memory): serialize memory_tree schema init across worker pool#2059
senamakel merged 12 commits into
tinyhumansai:mainfrom
oxoxDev:fix/sentry-mt-init-worker-race

Conversation

@oxoxDev
Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev commented May 18, 2026

Summary

  • Four tree_jobs_worker tasks were racing into memory::tree::store::with_connection on cold start, triggering ~19k Sentry events/day across three SQLite WAL/SHM bootstrap codes.
  • Gate apply_schema behind a process-wide Mutex<HashSet<PathBuf>> so only the first caller per DB path runs WAL setup + SCHEMA + ALTER TABLE migrations; subsequent callers skip.
  • Retry the open+init flow up to 3x on the three transient cold-start codes (14 / 1546 / 4874) with 50→150→500ms backoff as belt-and-suspenders.
  • Add a regression test spawning 8 threads against a cold tempdir DB; asserts all Ok and apply_schema runs exactly once per path.

Problem

Three Sentry issues, all tagged domain=memory, operation=tree_jobs_worker, worker_idx=0..3, sharing the same root cause:

Sentry ID Events Error message
OPENHUMAN-TAURI-HH 14168 Failed to initialize memory_tree schema: disk I/O error: Error code 1546: Error attempting to truncate file
OPENHUMAN-TAURI-ZM 3013 Failed to initialize memory_tree schema: disk I/O error: Error code 4874: I/O error within the xShmMap method (trying to resize an existing shared-memory segment)
OPENHUMAN-TAURI-MB 2177 Failed to initialize memory_tree schema: unable to open database file: Error code 14: Unable to open the database file

ZM + MB share a trace_id — two SQLite errors in the same event chain. Cross-platform (Windows + macOS) → race, not OS-specific. All three codes are flavors of the same race window on WAL+SHM cold start:

  • code 14 (SQLITE_CANTOPEN) — lockfile race, WAL not yet ready
  • code 1546 (SQLITE_IOERR_TRUNCATE) — WAL truncation contention
  • code 4874 (SQLITE_IOERR_SHMMAP) — shared-memory segment resize race

The 4 worker tasks (WORKER_COUNT = 4 in memory::tree::jobs::worker) all enter with_connection via claim_next immediately after the pool boots. Each connection re-ran PRAGMA journal_mode=WAL + execute_batch(SCHEMA) + ALTER TABLE migrations — fine for a warm DB, racy on cold start when WAL + SHM files don't yet exist.

Solution

Centralised the init in store.rs:

  • SCHEMA_INITIALIZED: OnceLock<Mutex<HashSet<PathBuf>>> records DB paths whose schema has been applied this process.
  • with_connection calls open_and_init_with_retry(&db_path):
    • Fast path: path already in set → just open conn + set busy_timeout + return.
    • Cold path: open conn, acquire mutex, double-check, run apply_schema (WAL + SCHEMA + ALTERs + index) once, insert path, drop mutex.
  • The whole flow is wrapped in a 3-attempt retry with 50/150/500ms backoff on the three transient codes; classifier walks the anyhow cause chain so codes buried under .context() layers still match.
  • Keyed on PathBuf (not "process-wide once") so per-test tempdir() configs each initialise independently — guards against the stale-handle pattern in feedback_oncelock_sqlite_test_isolation.md.

The busy_timeout = 15s and journal_mode = WAL from the existing code are unchanged.

Test: with_connection_serialises_concurrent_schema_init spawns 8 threads racing into with_connection on a fresh tempdir. Asserts (a) no errors surface and (b) apply_schema runs exactly once for the shared path. Per-path counter avoids false positives from parallel tests in the same binary against their own tempdirs.

Micro-commits (in order):

  1. fix(memory): serialize memory_tree schema init across workers — the mutex-gated guard + retry in store.rs
  2. test(memory): cover concurrent memory_tree schema init regression — 8-thread regression test in store_tests.rs

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • N/A: behaviour-change is a serialization guard around an existing init path; covered by the new 8-thread regression test plus the existing store test suite (13 tests pass locally). Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml.
  • N/A: behaviour-only change to existing memory_tree init path, no new feature row. Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change.
  • N/A: bug fix tracked via Sentry IDs, not GH-issue feature flow. All affected feature IDs from the matrix are listed in the PR description under ## Related.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: change is in the Rust core's SQLite init path; not a release-cut surface. Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md).
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform: desktop (Windows + macOS observed in Sentry). Eliminates ~19k events/day of Failed to initialize memory_tree schema errors from the worker pool's cold-start path.
  • Performance: hot path (every with_connection call after first) now does Mutex<HashSet>::lock().contains() instead of running PRAGMA journal_mode=WAL, the full SCHEMA execute_batch, and 12+ ALTER TABLE probes. Net speedup per call on warm DBs — small but free, especially under contention.
  • Security/migration/compatibility: none. Schema is unchanged; the guard only changes when the existing migrations are run (once per process per path instead of once per with_connection call).

Related

  • Closes OPENHUMAN-TAURI-HH
  • Closes OPENHUMAN-TAURI-ZM
  • Closes OPENHUMAN-TAURI-MB
  • Follow-up PR(s)/TODOs:

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/sentry-mt-init-worker-race
  • Commit SHA: see HEAD of branch

Validation Run

  • N/A: Rust-only change. pnpm --filter openhuman-app format:check
  • N/A: Rust-only change. pnpm typecheck
  • Focused tests: cargo test --lib openhuman::memory::tree::store -- --test-threads=4 → 13/13 pass (includes new with_connection_serialises_concurrent_schema_init)
  • Rust fmt/check (if changed): cargo fmt --check clean; cargo check clean
  • N/A: no Tauri shell changes. Tauri fmt/check (if changed)

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: serialise schema init across concurrent with_connection callers; skip redundant WAL + SCHEMA + ALTER work after first per process.
  • User-visible effect: eliminates the three Sentry "Failed to initialize memory_tree schema" error classes on cold start. No UX-visible delta on success path.

Parity Contract

  • Legacy behavior preserved: every connection still gets busy_timeout = 15s set; WAL journal mode persists in the DB header from the first init; all ALTER TABLE migrations remain idempotent and are unchanged.
  • Guard/fallback/dispatch parity checks: retry classifier (is_transient_cold_start) only matches the three documented cold-start codes; non-transient errors still bubble up unchanged.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution: N/A

Summary by CodeRabbit

  • Bug Fixes

    • DB schema initialization is now serialized per file and retries transient SQLite startup errors with backoff.
    • Foreign-key enforcement is consistently enabled on every connection.
  • New Features

    • Schema expanded with embedding and summary fields, importance/reason, backlinks, lifecycle status and index, content pointers, raw references, and an entity "is_user" flag.
  • Tests

    • Added regression tests for concurrent schema init and transient-error classification, a manual smoke/stress binary for concurrent init, and tightened a test with a global lock to prevent race flakes.

Review Change Stack

oxoxDev added 2 commits May 18, 2026 13:03
Four tree_jobs_worker tasks were racing into with_connection on cold
start, each hitting SQLite WAL/SHM bootstrap simultaneously. Surfaced
as ~19k events/day across three codes — 14 (CANTOPEN), 1546
(IOERR_TRUNCATE), 4874 (IOERR_SHMMAP) — all reported as
"Failed to initialize memory_tree schema".

Gate apply_schema behind a process-wide Mutex<HashSet<PathBuf>> so
only the first caller per DB path runs WAL setup + SCHEMA + ALTERs;
subsequent callers see the path in the set and skip the work. Keyed
on PathBuf (not "process-wide once") so per-test tempdir configs
each initialise independently — see feedback_oncelock_sqlite_test_isolation.

Belt-and-suspenders: retry the open+init flow up to 3x on the three
transient cold-start codes with 50→150→500ms backoff before
surfacing the error.

Closes OPENHUMAN-TAURI-HH
Closes OPENHUMAN-TAURI-ZM
Closes OPENHUMAN-TAURI-MB
Spawns 8 threads racing into with_connection on a cold tempdir DB.
Asserts all callers return Ok and apply_schema runs exactly once
per path — the structural guarantee that the mutex in the parent
commit eliminates the OPENHUMAN-TAURI-HH/-ZM/-MB cold-start race.

Per-path counter (not global) so the test is robust against other
tests in the same binary running in parallel against their own
tempdirs.
@oxoxDev oxoxDev requested a review from a team May 18, 2026 07:34
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

SQLite initialization now uses per-DB-path guards and retries transient cold-starts with backoff; schema application is idempotent and extended with additive migrations. Tests and a smoke binary validate concurrent init correctness and that foreign key enforcement is enabled per connection.

Changes

Concurrent-Safe SQLite Initialization with Schema Extension

Layer / File(s) Summary
Concurrency guards and initialization tracking
src/openhuman/memory/tree/store.rs
Adds imports for path/keyed process-local bookkeeping, a per-path SCHEMA_INITIALIZED guard and SCHEMA_APPLY_COUNTS counter, and exposes a test-only accessor to read apply counts.
Connection opening with retry and FK per-connection
src/openhuman/memory/tree/store.rs
Refactors with_connection to open via open_and_init_with_retry, which checks the per-path guard and retries transient SQLite cold-start errors with bounded backoff before invoking schema application; moves foreign_keys enabling to per-connection setup and removes it from the base SCHEMA.
Schema migration column additions
src/openhuman/memory/tree/store.rs
apply_schema extended to idempotently add chunk embeddings, mem_tree_score LLM importance/reason fields, mem_tree_chunks.parent_summary_id, mem_tree_summaries.embedding, mem_tree_chunks.lifecycle_status and its index, content_path/content_sha256 for chunks and summaries, mem_tree_chunks.raw_refs_json, and mem_tree_entity_index.is_user.
Concurrent initialization regression test & smoke CLI
src/openhuman/memory/tree/store_tests.rs, Cargo.toml, src/bin/memory_tree_init_smoke.rs, src/openhuman/wallet/execution.rs
Adds with_connection_serialises_concurrent_schema_init and is_transient_cold_start_classifies_known_extended_codes tests asserting single-apply under concurrency and classification of transient extended codes; adds memory-tree-init-smoke binary and Cargo entry to stress-test concurrent initialization; synchronizes a wallet test with TEST_LOCK.

Sequence Diagram

sequenceDiagram
  participant WorkerThread
  participant with_connection
  participant open_and_init_with_retry
  participant InitializationGuard as Guard
  participant SQLiteDB as DB
  WorkerThread->>with_connection: request connection(path)
  with_connection->>open_and_init_with_retry: open/init path
  open_and_init_with_retry->>Guard: check/acquire per-path lock
  Guard->>open_and_init_with_retry: grant or block
  open_and_init_with_retry->>DB: open and run SCHEMA / migrations (if leader)
  open_and_init_with_retry->>with_connection: return Connection
  with_connection->>WorkerThread: provide Connection
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

working

Suggested reviewers

  • senamakel

Poem

🐰 I hopped to guard the DB door,
One schema run, not one run more.
Threads raced in flurry and light,
Backoff hummed through startup night.
Now counts whisper: "once" — alright.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: serializing SQLite schema initialization for memory_tree to prevent concurrent worker races.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/memory/tree/store.rs`:
- Around line 758-815: The connection-level PRAGMA for foreign key enforcement
must be applied in open_connection so every returned Connection has FK checks
enabled; update open_connection (the function that calls Connection::open and
configures busy_timeout) to run "PRAGMA foreign_keys = ON" (or the equivalent
rusqlite call) before returning the conn, and remove or leave FK-related PRAGMA
only in apply_schema for one-time DDL if present, ensuring that
try_open_and_init / open_and_init_with_retry fast-path connections get FK
enforcement too.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 584d3086-9a24-4d25-99cd-a8609b5f0613

📥 Commits

Reviewing files that changed from the base of the PR and between 0f616e4 and e6db2a0.

📒 Files selected for processing (2)
  • src/openhuman/memory/tree/store.rs
  • src/openhuman/memory/tree/store_tests.rs

Comment thread src/openhuman/memory/tree/store.rs Outdated
Manual harness that spins N threads racing into with_connection
against a shared workspace. Cold-start exercises the race window
the OPENHUMAN-TAURI-HH/-ZM/-MB fix in the prior commits closes.

Bumps `with_connection` visibility from pub(crate) to pub so the
bin (a separate crate target) can call it directly. Existing
internal callers (read_rpc, ingest, bucket_seal, score::store)
are unaffected.

Usage:
  rm -rf /tmp/mt-smoke
  OPENHUMAN_WORKSPACE=/tmp/mt-smoke \
    cargo run --bin memory-tree-init-smoke -- 64

Exit 0 if all threads Ok, 1 if any failed.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bin/memory_tree_init_smoke.rs`:
- Around line 49-54: Replace the std print calls used for smoke diagnostics
(e.g. the eprintln! that prints "[smoke] workspace={} cold_start={} threads={}"
and the other println!/eprintln! occurrences around those blocks) with
structured diagnostic logs using the repo's logging crate (tracing or log) at
debug/trace level; update the top of this module to import and use
tracing::debug! or tracing::trace! (or log::debug! if the project uses log) and
change each eprintln!/println! invocation to the appropriate debug/trace macro
so the messages (including workspace.display(), cold, n and the other formatted
fields in the other two blocks) are emitted via the logging subsystem rather
than stdout/stderr.
- Around line 37-43: The current parsing sets let n: usize =
...parse().expect(...) but allows 0; after parsing (the binding named n)
validate that n > 0 and fail fast if not by panicking or returning an error with
a clear message (e.g., "thread count must be a positive integer greater than
0"); update the parsing block around the variable n (where
std::env::args().nth(1)...parse() is called) to check the parsed value and
replace the current expect with logic that enforces strictly positive thread
count.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4a52bd65-537c-4e4c-bb80-8e2de785b296

📥 Commits

Reviewing files that changed from the base of the PR and between e6db2a0 and a68109c.

📒 Files selected for processing (3)
  • Cargo.toml
  • src/bin/memory_tree_init_smoke.rs
  • src/openhuman/memory/tree/store.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/memory/tree/store.rs

Comment thread src/bin/memory_tree_init_smoke.rs Outdated
Comment thread src/bin/memory_tree_init_smoke.rs Outdated
oxoxDev added 2 commits May 18, 2026 13:53
PRAGMA foreign_keys is connection-local in SQLite — it resets to off
on every new Connection::open. The earlier schema-init refactor
moved the pragma out of per-call execution into apply_schema, which
only runs on first init per DB path. Result: every later
with_connection() fast-path call returned a connection with FK
enforcement off.

Move the pragma to open_connection() so every connection — cold
init + fast path — sets it. Add a regression test that calls
with_connection twice and asserts PRAGMA foreign_keys returns 1
on both calls.

Caught by CodeRabbit review on PR.
Two CodeRabbit review fixes for the memory-tree-init-smoke binary:

- Switch all println!/eprintln! to log::info/debug/error and init
  env_logger so the diagnostics flow through the same subsystem as
  the rest of the core. Workers default to debug level; control via
  RUST_LOG (e.g. RUST_LOG=debug to see per-worker results).
- Validate the thread-count arg: reject n=0 and unparseable values
  with a clear error message and exit code 2 (vs 1 = at least one
  worker failed, 0 = all ok). n=0 previously produced a spurious
  "0/0 ok" pass with no work done.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 18, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walkthrough

Solid fix for a real production pain point — 19k Sentry events/day from concurrent WAL/SHM bootstrap races is nasty. The mutex-gated SCHEMA_INITIALIZED set with per-path keying is clean, the retry backoff on the three known transient codes is well-targeted, and the FK pragma move to open_connection() (addressing CodeRabbit's earlier catch) is correct. Two issues worth addressing before merge: test instrumentation leaking into production, and the pub visibility escape on with_connection.

Change Summary

File Change type Description
Cargo.toml Modified Added [[bin]] entry for memory-tree-init-smoke manual diagnostic tool
src/bin/memory_tree_init_smoke.rs Added Manual stress-test binary: N threads racing with_connection on cold workspace
src/openhuman/memory/tree/store.rs Modified Core fix: SCHEMA_INITIALIZED mutex guard, retry loop with backoff, open_connection() extracted with per-connection FK pragma, apply_schema() split out
src/openhuman/memory/tree/store_tests.rs Modified Two regression tests: concurrent init serialisation + FK pragma persistence

Comment thread src/openhuman/memory/tree/store.rs
Comment thread src/openhuman/memory/tree/store.rs Outdated
Comment thread src/openhuman/memory/tree/store.rs Outdated
Comment thread src/openhuman/memory/tree/store_tests.rs
oxoxDev added 3 commits May 18, 2026 16:11
…tinyhumansai#2059)

The SCHEMA_APPLY_COUNTS HashMap exists only so the concurrent-init
regression test can assert apply_schema runs exactly once per path.
Before this commit, the static + record_schema_apply ran in release
builds too — adding an OnceLock<Mutex<HashMap>>, a mutex acquisition,
and a HashMap insert on every cold-start schema init.

Wrap both the static and the write side behind cfg(test) so the
production binary carries no instrumentation. record_schema_apply
still exists in release builds but compiles to a no-op body.

Addresses graycyrus review (major) on PR tinyhumansai#2059.
…mansai#2059)

Previous commit widened with_connection from pub(crate) to pub so the
smoke binary in src/bin/ (a separate crate target) could call it.
That side effect made it part of openhuman_core's public surface —
any downstream crate could rely on it.

Add #[doc(hidden)] and a doc-comment line stating it is NOT a stable
API. Lightest-touch fix: smoke bin still resolves the symbol, but
rustdoc drops it from generated docs and downstream callers get a
clear signal not to depend on it.

Addresses graycyrus review (major) on PR tinyhumansai#2059.
…nyhumansai#2059)

The retry loop's transient-error gate matched raw extended_code
literals (14, 1546, 4874). Replace with named constants so the
match arm is self-documenting and grep-friendly:

  SQLITE_CANTOPEN        = 14
  SQLITE_IOERR_TRUNCATE  = 1546
  SQLITE_IOERR_SHMMAP    = 4874

is_transient_cold_start was the gatekeeper for the retry loop but
had no direct unit test — only exercised incidentally if the
concurrent-init test happened to fire a transient (rare on dev
machines). Add a targeted test that pins classification for all
three cold-start codes and explicitly rejects SQLITE_BUSY +
non-SQLite errors so regressions in the match arms surface.

Addresses graycyrus review (minor x2) on PR tinyhumansai#2059.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 18, 2026
quote_store_round_trips_and_expires called reset_quote_store_for_tests()
without acquiring TEST_LOCK, racing against the async execute_prepared_*
tests that store a quote and then await. The reset wiped the in-flight
quote between store + await, surfacing intermittently as 'quote q_retry
not found' in Rust Core Coverage (CI run 26029017503 failed this PR on
that race even though the wallet code is untouched here).

All other tests in this module acquire TEST_LOCK before touching the
shared store. One-line drive-by fix — flake was blocking the Coverage
Gate on tinyhumansai#2059.
@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 18, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 18, 2026
Resolved Cargo.toml [[bin]] conflict — keep both memory-tree-init-smoke
(this PR) and inference-probe (added on main as separate top-level bin
entry). Version field auto-resolved to 0.54.0 via release bumps on main.

Verified post-merge:
- cargo check core + both new bins → ok
- memory_tree::store tests: 15/15
- wallet::execution tests: 12/12
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuation Review — Previous Requested Changes Addressed

All four findings from the previous review have been addressed in follow-up commits:

  1. SCHEMA_APPLY_COUNTS test instrumentation (major) — #[cfg(test)]-gated in 883b295. Static, HashMap import, and record_schema_apply body all compile to no-ops in production. ✓
  2. with_connection pub visibility (major) — #[doc(hidden)] pub with explanatory doc comment in b9fb859. Smoke binary resolves the symbol; rustdoc hides it. ✓
  3. Magic SQLite error codes (minor) — Named constants SQLITE_CANTOPEN / SQLITE_IOERR_TRUNCATE / SQLITE_IOERR_SHMMAP with doc comments in 54a104e. ✓
  4. Missing is_transient_cold_start unit test (minor) — is_transient_cold_start_classifies_known_extended_codes added in 54a104e, covering all three transient codes plus negative cases. ✓

Additional Changes Since Last Review

File Change type Description
src/openhuman/wallet/execution.rs Bug fix Added TEST_LOCK guard in quote_store_round_trips_and_expires to prevent cross-test race with reset_quote_store_for_tests()
merge commit c7ad0bb Merge upstream/main into fix branch — no conflicts

The wallet test fix follows the established TEST_LOCK pattern already used in that module (see surrounding code in wallet/execution.rs). Clean, no issues.

Summary

Solid fix for a high-volume Sentry issue (~19k events/day). The mutex-gated init set, per-connection FK pragma, and retry-with-backoff are all well-structured. Test coverage is thorough — concurrent init, FK persistence across cold/warm paths, and classifier pin tests. Code follows existing patterns in memory/tree/ (naming, error handling, logging prefixes). No new critical or major findings. All previously requested changes are resolved.

All four review threads have been resolved.

@graycyrus
Copy link
Copy Markdown
Contributor

@oxoxDev bro conflicts

…#1574 §7)

Reconcile worker-pool race fix (tinyhumansai#2059) with per-(row,model) embedding
storage cutover (tinyhumansai#1574 §7):

- store.rs:
  - Keep tinyhumansai#2059's pragma comment AND tinyhumansai#1574's TREE_EMBEDDING_MIGRATION_VERSION
    const at top.
  - Keep tinyhumansai#2059's `apply_schema` pure (schema only) — preserves split between
    `open_connection` (per-call FK pragma), `apply_schema` (first-init only),
    and the legacy migration.
  - Wire tinyhumansai#1574's `migrate_legacy_embeddings_to_sidecar(&conn, config)?` into
    `with_connection` after `open_and_init_with_retry` returns. Fast-path
    cost is one PRAGMA user_version read once gate set.

- store_tests.rs:
  - Keep all three tinyhumansai#2059 regression tests
    (`with_connection_serialises_concurrent_schema_init`,
     `is_transient_cold_start_classifies_known_extended_codes`,
     `with_connection_keeps_foreign_keys_on_for_every_call`).
  - Keep tinyhumansai#1574's `legacy_embeddings_migrate_to_sidecar_once`.

Verified: `cargo check --tests` clean; 600/600
`openhuman::memory::tree::` lib tests pass, including all 4
conflicting tests.
@oxoxDev
Copy link
Copy Markdown
Contributor Author

oxoxDev commented May 19, 2026

@graycyrus resolved bro — merged upstream/main into branch, reconciled #2059 worker-race split with #1574 §7 legacy→sidecar migration. 600/600 memory_tree tests green. CI re-running.

# Conflicts:
#	src/openhuman/memory/tree/store.rs
@senamakel senamakel merged commit 20dfafb into tinyhumansai:main May 19, 2026
26 checks passed
mtkik pushed a commit to mtkik/openhuman-meet that referenced this pull request May 21, 2026
CodeGhost21 pushed a commit to CodeGhost21/openhuman that referenced this pull request May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants