fix(runtime): harden multi-db rollout and metrics rollups#173

Merged
mergify[bot] merged 5 commits into matrixorigin:main from
gouhongshen:fix/runtime-multi-db-rollout-and-rollups
Apr 9, 2026

Conversation

@gouhongshen
Contributor

Summary

  • persist per-user schema versions in mem_schema_meta so restart no longer re-runs heavy compat migrations for already-marked user DBs
  • tighten multi-db routing/startup/pool handling and wire the metrics rollup paths through the updated API and storage surfaces
  • add router multi-db coverage for schema marker persistence and stale-marker repair

Validation

  • cargo test -p memoria-storage --test router_multi_db -- --test-threads=1
  • cargo check -q
  • preserved 200-user dataset backfill run: seed_rest_store p95 ~81.47s, p99 ~82.09s, 0 errors
  • restart + preserved 200-user rerun: seed_rest_store p95 ~10.50s, p99 ~10.69s, 3/403 timeout-shaped 500s remaining

Notes

  • direct deployment/startup is still safe; old user DBs backfill their durable schema marker on first touch
  • after that one-time backfill, subsequent restarts can skip compat migrate for already-marked user DBs
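
The marker behavior described in the Summary and Notes can be sketched roughly as follows. This is a minimal in-memory model, not the PR's code: `SchemaMarkers`, `StartupAction`, and `CURRENT_SCHEMA_VERSION` are hypothetical names, and the `HashMap` stands in for the per-user `mem_schema_meta` table. It assumes a marker counts as current only when it equals the expected version.

```rust
use std::collections::HashMap;

/// Hypothetical version constant; the real value is not shown in this PR.
const CURRENT_SCHEMA_VERSION: u32 = 3;

#[derive(Debug, PartialEq)]
enum StartupAction {
    /// Marker matches the current version: skip the heavy compat migration.
    Skip,
    /// Marker is missing or stale: run the compat migration, then write the marker.
    MigrateAndMark,
}

/// In-memory stand-in for the durable per-user schema marker.
struct SchemaMarkers {
    by_user: HashMap<String, u32>,
}

impl SchemaMarkers {
    fn new() -> Self {
        SchemaMarkers { by_user: HashMap::new() }
    }

    /// Decide what a restart (or first touch) should do for this user DB.
    fn plan(&self, user_id: &str) -> StartupAction {
        match self.by_user.get(user_id) {
            Some(&v) if v == CURRENT_SCHEMA_VERSION => StartupAction::Skip,
            _ => StartupAction::MigrateAndMark,
        }
    }

    /// Touch a user DB; returns true if the compat migration would have run.
    fn touch(&mut self, user_id: &str) -> bool {
        if self.plan(user_id) == StartupAction::Skip {
            return false;
        }
        // The compat migration itself would run here (omitted in this sketch).
        self.by_user.insert(user_id.to_string(), CURRENT_SCHEMA_VERSION);
        true
    }
}
```

In this model an old user DB pays the migration cost once, on first touch, and every later touch is a cheap marker lookup.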

Copilot AI review requested due to automatic review settings April 8, 2026 11:31

Copilot AI left a comment

Pull request overview

This PR hardens the multi-database runtime rollout by introducing durable per-user schema version markers, tightening router/pool behavior, and reworking metrics/tooling paths to avoid expensive per-user fan-out on critical request paths.

Changes:

  • Introduces/uses per-user schema version markers to skip heavy migrations on restart, and improves multi-db routing/pool initialization (global user pool, cache changes, init concurrency control).
  • Reworks metrics aggregation to support multi-db via shared-db summaries + dirty marking, and lazily loads per-user tool usage in multi-db mode.
  • Expands test coverage for multi-db routing/schema markers, search scoring fields, trust-tier validation, and branch behavior.
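
The "dirty marking" idea in the second bullet could look something like the sketch below. It is an assumption-laden illustration using only the standard library: `DirtyUsers`, `mark`, and `drain` are hypothetical names, and the PR's actual summary manager is not shown here. The point is that a mutating request only records the user ID, while a background rollup later drains the set and refreshes summaries for those users.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical dirty set shared between request handlers and a rollup task.
#[derive(Default)]
struct DirtyUsers {
    inner: Mutex<HashSet<String>>,
}

impl DirtyUsers {
    /// Called from a mutating request path: a cheap set insert, no per-user DB query.
    fn mark(&self, user_id: &str) {
        self.inner.lock().unwrap().insert(user_id.to_string());
    }

    /// Called by the background rollup: takes every dirty user and leaves the set empty.
    /// Sorted only so the output is deterministic.
    fn drain(&self) -> Vec<String> {
        let mut guard = self.inner.lock().unwrap();
        let mut users: Vec<String> = guard.drain().collect();
        users.sort();
        users
    }
}
```

Marking the same user several times between rollups collapses into one refresh, which is what keeps the fan-out off the request path.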

Reviewed changes

Copilot reviewed 41 out of 43 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| memoria/crates/memoria-storage/tests/store_crud.rs | Adds tests around vector retrieval score propagation and hybrid ranking behavior. |
| memoria/crates/memoria-storage/tests/router_multi_db.rs | Adds coverage for user schema marker persistence/repair in multi-db routing. |
| memoria/crates/memoria-storage/tests/branch_ops.rs | Adds routed-store branch fallback test ensuring “main” qualification behavior. |
| memoria/crates/memoria-storage/src/router.rs | Refactors router to use sync moka caches, adds global user pool, schema init cache + semaphore, and registry registration helper. |
| memoria/crates/memoria-storage/src/migration.rs | Adds configurable migration concurrency and adjusts per-user migration to use lightweight pools + registry registration. |
| memoria/crates/memoria-storage/src/lib.rs | Re-exports new pool health types. |
| memoria/crates/memoria-storage/src/graph/store.rs | Switches graph store to qualified-table mode (db_name + t()), sync moka cache, and updates SQL accordingly. |
| memoria/crates/memoria-storage/src/graph/retriever.rs | Removes access_count frequency boost from activation scoring. |
| memoria/crates/memoria-storage/Cargo.toml | Adds futures dependency; enables moka sync feature. |
| memoria/crates/memoria-service/src/service.rs | Updates background flushers and entity/graph pool behavior for multi-db; switches to routed store constructors where needed. |
| memoria/crates/memoria-service/src/scoring.rs | Adjusts scoring store impl to clone store before async calls. |
| memoria/crates/memoria-service/src/scheduler.rs | Makes governance pool sizing configurable and multi-db aware. |
| memoria/crates/memoria-service/src/governance.rs | Refactors governance to iterate routed stores without caching per-user pools; queries registry directly in multi-db. |
| memoria/crates/memoria-service/Cargo.toml | Enables moka sync feature. |
| memoria/crates/memoria-mcp/tests/tools_unit.rs | Validates tool schema docs for trust tier guidance and enum values. |
| memoria/crates/memoria-mcp/tests/perf_optimizations_e2e.rs | Updates near-duplicate behavior expectations (no cross-type duplicate). |
| memoria/crates/memoria-mcp/tests/core_tools_e2e.rs | Adds invalid trust_tier rejection test and capabilities hints assertions. |
| memoria/crates/memoria-mcp/tests/branch_e2e.rs | Adds assertion that stored active branch resets to main after delete. |
| memoria/crates/memoria-mcp/src/tools.rs | Adds tool-name enum dispatch, improves tool descriptions/schema, enforces trust_tier parsing errors, and adds call_owned. |
| memoria/crates/memoria-mcp/src/server.rs | Refactors JSON-RPC dispatching with owned variants and method enum; adjusts HTTP dispatch plumbing. |
| memoria/crates/memoria-mcp/src/remote.rs | Reuses shared capabilities text and adds call_owned. |
| memoria/crates/memoria-mcp/src/git_tools.rs | Adds tool-name enum dispatch; multi-db qualification fixes; implements SQL-based diff/merge helpers and branch list/active behavior improvements. |
| memoria/crates/memoria-git/src/service.rs | Qualifies branch/snapshot DDL and restore operations with db name to ensure correct database scoping. |
| memoria/crates/memoria-core/src/types.rs | Adds canonical MemoryType::ALL_NAMES and FEEDBACK_SIGNALS, plus drift-prevention tests. |
| memoria/crates/memoria-core/src/lib.rs | Re-exports FEEDBACK_SIGNALS. |
| memoria/crates/memoria-cli/src/main.rs | Gates server commands behind feature, adds migration concurrency flag, and introduces configurable git pool init for multi-db. |
| memoria/crates/memoria-cli/Cargo.toml | Adds server-runtime feature and makes API/MCP optional deps; updates default features. |
| memoria/crates/memoria-api/tests/api_e2e.rs | Adds e2e validation that MCP tools/call records tool usage. |
| memoria/crates/memoria-api/src/state.rs | Replaces moka api key cache with custom TTL cache; adds metrics summary manager init and dirty marking hook. |
| memoria/crates/memoria-api/src/routes/snapshots.rs | Uses qualified table names and switches branch “active” detection to branch name. |
| memoria/crates/memoria-api/src/routes/sessions.rs | Uses unquoted/qualified table formatting for session summary queries. |
| memoria/crates/memoria-api/src/routes/metrics.rs | Switches business metrics to summary-backed approach (multi-db) with health metrics; retains cached scrape output. |
| memoria/crates/memoria-api/src/routes/memory.rs | Adds lazy per-user tool usage DB load on empty cache; adjusts some table formatting. |
| memoria/crates/memoria-api/src/routes/mcp.rs | Tracks tool usage, marks metrics dirty after mutating MCP calls, and updates dispatch to owned http entrypoint. |
| memoria/crates/memoria-api/src/routes/auth.rs | Updates api key cache invalidation calls for new cache type. |
| memoria/crates/memoria-api/src/routes/admin.rs | Qualifies tables per user store, marks metrics dirty on delete_user, and fixes call-log table routing. |
| memoria/crates/memoria-api/src/lib.rs | Adds metrics summary module and marks summaries dirty from v1 mutating routes; adjusts call logging behavior. |
| memoria/crates/memoria-api/src/auth.rs | Adds lazy per-user tool usage load, table qualification for usage/call log flushes, and rebuild merge semantics. |
| memoria/crates/memoria-api/Cargo.toml | Enables moka sync feature. |
| memoria/Cargo.lock | Adds futures dependency entry. |
| docs/per-user-database-architecture.md | Updates architecture doc to “implemented” state with deployment/migration runbook and rationale. |
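
The state.rs entry above mentions a custom TTL cache for API keys. The PR's implementation is not shown on this page, so the following is only a generic sketch of what such a cache might look like: `TtlCache`, `insert_at`, `get_at`, and `invalidate` are hypothetical names, and time is passed in explicitly so expiry is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical TTL cache: each entry stores its own expiry instant.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    /// Store a value that expires `ttl` after `now`.
    fn insert_at(&mut self, key: &str, value: V, now: Instant) {
        self.entries.insert(key.to_string(), (now + self.ttl, value));
    }

    /// Return the value if it is still live; drop the entry if it has expired.
    fn get_at(&mut self, key: &str, now: Instant) -> Option<V> {
        let live = match self.entries.get(key) {
            Some((expires, v)) if now < *expires => Some(v.clone()),
            _ => None,
        };
        if live.is_none() {
            self.entries.remove(key);
        }
        live
    }

    /// Explicit invalidation, e.g. after a key is revoked.
    fn invalidate(&mut self, key: &str) {
        self.entries.remove(key);
    }
}
```

Expired entries are evicted lazily on lookup here; a real cache would likely also bound its size or sweep periodically.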

Comment thread memoria/crates/memoria-storage/src/migration.rs Outdated
Comment thread memoria/crates/memoria-api/src/state.rs
Comment thread memoria/crates/memoria-mcp/src/server.rs
gouhongshen and others added 2 commits April 8, 2026 19:40
1. Atomic claim_batch: replace SELECT+UPDATE race with UPDATE-then-
   SELECT-by-token pattern. MySQL single-UPDATE atomicity prevents
   duplicate claims across concurrent workers.

2. Per-family DELETE scope in flush_rollups: compute per-family user
   sets from claimed_mask instead of using all_user_ids. Users who
   triggered a family but produce zero rows still get old rows deleted.
   Users who did NOT trigger a family are untouched.

3. Write actual refreshed_version: carry ClaimedUser.claimed_version
   through the by_family grouping and bind it in INSERT instead of 0u64.

4. Batch flush_state: replace per-user UPDATE loop with chunked CASE-
   based batch UPDATE (50 users/statement). MySQL left-to-right SET
   evaluation lets has_pending derive from the post-update pending_mask.
   Version-protected CAS semantics preserved.
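
The claim pattern in item 1 can be modeled in memory as below. This is an illustrative sketch, not the PR's SQL: `StateRow`, `StateTable`, and the `claim_token` field are hypothetical, and the mutex-guarded update stands in for the atomicity of a single UPDATE statement. Each worker first stamps rows with its own token, then reads back only the rows carrying that token.

```rust
use std::sync::Mutex;

/// Hypothetical state row; the real table layout is not shown in this PR.
struct StateRow {
    user_id: String,
    has_pending: bool,
    claim_token: Option<u64>,
}

struct StateTable {
    rows: Mutex<Vec<StateRow>>,
}

impl StateTable {
    fn with_pending(user_ids: &[&str]) -> Self {
        let rows = user_ids
            .iter()
            .map(|id| StateRow {
                user_id: id.to_string(),
                has_pending: true,
                claim_token: None,
            })
            .collect();
        StateTable { rows: Mutex::new(rows) }
    }

    /// Claim up to `limit` pending rows for the worker identified by `token`.
    fn claim_batch(&self, token: u64, limit: usize) -> Vec<String> {
        {
            // "UPDATE" step: stamp unclaimed pending rows in one atomic section.
            let mut rows = self.rows.lock().unwrap();
            rows.iter_mut()
                .filter(|r| r.has_pending && r.claim_token.is_none())
                .take(limit)
                .for_each(|r| r.claim_token = Some(token));
        }
        // "SELECT by token" step: only rows stamped by this worker are returned.
        let rows = self.rows.lock().unwrap();
        rows.iter()
            .filter(|r| r.claim_token == Some(token))
            .map(|r| r.user_id.clone())
            .collect()
    }
}
```

Because a row is stamped before it is read, two workers can interleave freely and still never return the same user, which is the race the SELECT+UPDATE ordering allowed.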

Added 8 regression tests covering per-family deletion scope, zero-entry
user cleanup, FULL mask coverage, cross-family leak prevention, and
refreshed_version propagation.
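
The per-family scoping in items 2 and 3 might be sketched like this. `ClaimedUser`, `claimed_mask`, and `claimed_version` come from the commit message; the family bit values and the `users_by_family` helper are hypothetical, since the real mask layout is not shown.

```rust
use std::collections::BTreeMap;

/// Hypothetical family bits; the real mask layout is not shown in this PR.
const FAMILY_A: u8 = 0b001;
const FAMILY_B: u8 = 0b010;
const FAMILY_C: u8 = 0b100;
const ALL_FAMILIES: [u8; 3] = [FAMILY_A, FAMILY_B, FAMILY_C];

struct ClaimedUser {
    user_id: String,
    claimed_mask: u8,
    claimed_version: u64,
}

/// Group claimed users by family bit, carrying each user's claimed version.
/// The DELETE scope for a family is exactly the users listed under it,
/// whether or not they later produce any rollup rows.
fn users_by_family(claimed: &[ClaimedUser]) -> BTreeMap<u8, Vec<(String, u64)>> {
    let mut out = BTreeMap::new();
    for family in ALL_FAMILIES {
        let users: Vec<(String, u64)> = claimed
            .iter()
            .filter(|u| (u.claimed_mask & family) != 0)
            .map(|u| (u.user_id.clone(), u.claimed_version))
            .collect();
        if !users.is_empty() {
            out.insert(family, users);
        }
    }
    out
}
```

A user who did not trigger a family never appears in that family's set, so their existing rows are left alone.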

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Persist per-user schema versions to avoid rerunning compat migrations after restart, tighten routed pool defaults and startup wiring, and finish the API/metrics rollup updates that support the capped multi-db runtime validation path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@gouhongshen gouhongshen force-pushed the fix/runtime-multi-db-rollout-and-rollups branch from 1547ed7 to df71466 Compare April 8, 2026 11:42
gouhongshen and others added 2 commits April 8, 2026 21:23
Fresh databases skip apply_user_compat_migrations() after the
schema-version marker change, so indexes that were previously
added only via compat migrations must also appear in the
bootstrap CREATE TABLE statements.

Adds to bootstrap_user_schema():
- idx_memories_user_observed on mem_memories
- idx_feedback_memory_user on mem_retrieval_feedback
- idx_feedback_created_at on mem_retrieval_feedback

Fixes CI: test_feedback_created_at_index_exists
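
One way to guard against this kind of drift is a check that every index the compat path used to add also appears in the bootstrap DDL. The sketch below is hypothetical: the index names are taken from the list above, but `BOOTSTRAP_DDL` is a placeholder with columns elided, and `missing_indexes` is not a function from this repository.

```rust
/// Index names listed in this commit message.
const REQUIRED_INDEXES: &[&str] = &[
    "idx_memories_user_observed",
    "idx_feedback_memory_user",
    "idx_feedback_created_at",
];

/// Placeholder bootstrap DDL; the real statements and columns are not shown in this PR.
const BOOTSTRAP_DDL: &[&str] = &[
    "CREATE TABLE mem_memories (/* columns elided */ KEY idx_memories_user_observed (/* cols */))",
    "CREATE TABLE mem_retrieval_feedback (/* columns elided */ \
     KEY idx_feedback_memory_user (/* cols */), KEY idx_feedback_created_at (/* cols */))",
];

/// Return every required index name that no bootstrap statement mentions.
/// A plain substring match is enough for a drift check in a unit test.
fn missing_indexes(bootstrap_ddl: &[&str], required: &[&str]) -> Vec<String> {
    required
        .iter()
        .filter(|name| !bootstrap_ddl.iter().any(|ddl| ddl.contains(**name)))
        .map(|name| name.to_string())
        .collect()
}
```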

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

buffer_unordered() does not preserve input order, so zipping
collected results with selected_users could misassociate reports.
Return (user_id, result) tuples from the stream instead.
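
The same hazard exists with any completion-ordered fan-out. The sketch below is a standard-library analog using threads and a channel, not the `futures` stream code from the PR: `migrate_all` and the simulated delays are hypothetical. Each worker sends back its own `user_id` alongside the result, so the caller never needs to zip against the input list.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run one simulated job per user; results arrive in completion order.
/// Each message carries its own user_id, so no positional pairing is needed.
fn migrate_all(users: &[(&str, u64)]) -> Vec<(String, Result<(), String>)> {
    let (tx, rx) = mpsc::channel();
    for &(user_id, delay_ms) in users {
        let tx = tx.clone();
        let user_id = user_id.to_string();
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(delay_ms));
            // Simulated outcome: users whose ID ends in "bad" fail.
            let result = if user_id.ends_with("bad") {
                Err(format!("migration failed for {user_id}"))
            } else {
                Ok(())
            };
            tx.send((user_id, result)).unwrap();
        });
    }
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);
    rx.iter().collect()
}
```

With input `[("slow-bad", 30), ("fast", 0)]` the fast job usually reports first, which is exactly the case where zipping results with the input order would attach the failure to the wrong user.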

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mergify

mergify Bot commented Apr 9, 2026

Merge Queue Status

  • Entered queue 2026-04-09 04:45 UTC · Rule: main
  • Checks passed · in-place
  • Merged 2026-04-09 05:12 UTC · at 89ee664aa572dac7d07728f915f306cf373d10c6

This pull request spent 26 minutes 40 seconds in the queue, including 26 minutes 29 seconds running CI.

@mergify mergify Bot added the queued label Apr 9, 2026
@mergify mergify Bot merged commit fbd4d48 into matrixorigin:main Apr 9, 2026
6 checks passed
@mergify mergify Bot removed the queued label Apr 9, 2026

3 participants