
feat(session): engine-agnostic sessions with per-turn engine switching #5

Merged
HikoQiu merged 11 commits into main from feat/engine-agnostic-session on Mar 30, 2026

Conversation

@HikoQiu (Contributor) commented Mar 30, 2026

Summary

Existing sessions now seamlessly switch to a new AI engine when the user changes the default in Settings — same session, continuous conversation, no restart needed. The engine is resolved dynamically per-turn via drift detection at the sendMessage/resumeSession chokepoint. This PR also hardens the memory extraction pipeline, adds structured dev-tracing logs across session and memory lifecycles, and unifies UI switch components.

Changes

Engine-agnostic sessions

  • Add ManagedSession.switchEngine() with conversation summary injection for context continuity
  • Add detectAndApplyEngineDrift() DRY helper called from both sendMessage() and resumeSessionInternal()
  • Fix engine drift detection running after the fast path — move it before so active sessions don't silently ignore engine switches
  • Add EngineSwitchEvent type and SystemEventView rendering with ArrowRightLeft icon
  • Add buildConversationSummary() pure function with budget-capped head+tail truncation
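
The head+tail truncation behind `buildConversationSummary()` could look like the sketch below. This is an illustrative reconstruction, not the PR's actual code: the `Turn` shape, the default budget, and the ellipsis marker are all assumptions.

```typescript
// Hypothetical sketch of a budget-capped head+tail summary builder.
// Turn and the budget value are illustrative, not the project's real types.
interface Turn {
  role: "user" | "assistant";
  text: string;
}

function buildConversationSummary(turns: Turn[], budget = 2000): string {
  const lines = turns
    .filter((t) => t.text.trim().length > 0)
    .map((t) => `${t.role}: ${t.text}`);
  const joined = lines.join("\n");
  if (joined.length <= budget) return joined;

  // Keep the head and tail of the conversation; elide the middle,
  // so both the opening context and the latest turns survive.
  const marker = "\n[... truncated ...]\n";
  const half = Math.floor((budget - marker.length) / 2);
  return joined.slice(0, half) + marker + joined.slice(-half);
}
```

Keeping both ends (rather than only the most recent turns) preserves the session's initial framing, which is often what the new engine needs to pick up the thread.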

Memory system hardening

  • Three-layer defense against silent memory loss: raise maxContentLength 500→1000, add length constraint to extraction prompt, graceful truncation instead of discard
  • Follow-up: trust LLM judgment on content length — replace the post-hoc truncation with pass-through, so over-limit memories are stored whole instead of losing information
  • Show project name for project-scoped memories in list (e.g., "Project · OpenCow")
  • Unify scope terminology: "User" → "Global" across filter toggle and memory cards

Structured dev-tracing logs

  • Session lifecycle: QueryLifecycle start/push/stop, SessionOrchestrator pre-flight/completion, ManagedSession state transitions and engine switches
  • Memory lifecycle: DataBus event arrival, extraction start/end, per-candidate quality gate routing, retrieval counts, debounce queue depth/flush/overflow
  • Add parseResponse diagnostics for zero-candidate extraction failures

UI consistency

  • Replace inline switch implementations in UpdateSection and MemorySection with shared Switch component from ui/switch

Test Plan

  • CI passes (lint, typecheck, test, build)
  • Switch engine in Settings mid-conversation → next message uses new engine seamlessly, system event shown in chat
  • Memory extraction produces memories for rich project context without silent loss
  • Settings toggles (Updates auto-check, Memory enable/silent mode) render and behave consistently with IM Bot switches

HikoQiu and others added 11 commits March 30, 2026 07:26
… switching

When the user changes the default AI engine in Settings, existing sessions
now seamlessly switch to the new engine on the next message — same session,
continuous conversation. The engine is resolved dynamically per-turn via
drift detection in the sendMessage/resumeSession chokepoint, reusing the
existing provider-mode-drift pattern.

Key changes:
- ManagedSession.switchEngine() — updates dual engineKind storage, injects
  conversation summary into context, emits engine_switch system event
- detectAndApplyEngineDrift() — DRY helper in SessionOrchestrator called
  from both sendMessage() and resumeSessionInternal()
- buildConversationSummary() — pure function extracting text turns with
  budget-capped head+tail truncation strategy
- EngineSwitchEvent type + SystemEventView rendering with ArrowRightLeft icon
- Debug logs at settings change, message entry, and drift detection points

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…umeSessionInternal

When the SDK lifecycle was still alive (multi-turn wait mode), the fast
path at L1127 pushed the message directly into the existing queue and
returned before engine drift detection at L1150 ever ran. This caused
engine switches to be silently ignored for active sessions.

Move detectAndApplyEngineDrift() before the fast path check. If drift is
detected, set forceRestart=true so the fast path is skipped and the
session gets a full lifecycle restart with the new engine.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
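The ordering fix above can be sketched as follows. The `Session` shape, `configured` parameter, and restart bookkeeping are simplified assumptions for illustration; only the control flow mirrors the commit.

```typescript
type EngineKind = "claude" | "codex";

// Simplified stand-in for the real session; fields are illustrative.
interface Session {
  engineKind: EngineKind;
  lifecycleAlive: boolean;
  queue: string[];
  restarts: number;
}

// Returns true when the configured default engine differs from the
// session's engine; applies the switch and requests a restart.
function detectAndApplyEngineDrift(session: Session, configured: EngineKind): boolean {
  if (session.engineKind === configured) return false;
  session.engineKind = configured;
  return true;
}

function sendMessage(session: Session, configured: EngineKind, text: string): void {
  // Drift check runs BEFORE the fast path, otherwise an alive
  // lifecycle swallows the message and the switch is silently ignored.
  const forceRestart = detectAndApplyEngineDrift(session, configured);

  if (!forceRestart && session.lifecycleAlive) {
    session.queue.push(text); // fast path: reuse the live lifecycle
    return;
  }

  session.restarts += 1; // slow path: full restart on the new engine
  session.lifecycleAlive = true;
  session.queue = [text];
}
```

With the original ordering, the early `return` in the fast path would have been reached before any drift check ran, which is exactly the bug this commit fixes.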
Add comprehensive logging to close observability gaps in the request
lifecycle, modeled on the Codex logging standard:

- QueryLifecycle: start (prompt preview, option keys, model), pushMessage
  (turn seq + content preview), stop (turns completed)
- SessionOrchestrator: pre-flight summary before lifecycle.start() with
  prompt layer sizes, message count, MCP server count, model; completion
  summary with duration and final state
- ManagedSession: state machine transitions (event type + from state),
  addMessage (role + block count), switchEngine (from/to + summary length)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ifecycle

Add comprehensive logging to close observability gaps in the memory
extraction and injection pipeline:

- MemoryService: DataBus event arrival, content skip reasons, extraction
  start/end with content length, context injection entry/result
- MemoryExtractor: LLM call success path with response length, candidate
  count, and duration
- MemoryQualityGate: per-candidate routing decisions (accept/merge/reject
  with reason), FTS search failure warning
- MemoryRetriever: retrieval entry with projectId, fetch counts for
  project/user memories
- MemoryDebounceQueue: enqueue with queue depth, flush with batch size,
  queue overflow warning
- SessionOrchestrator: memory context injection failure now logs warning
  instead of silent catch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
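A minimal sketch of the debounce-queue logging described above. The class shape, `maxDepth` bound, drop-oldest overflow policy, and log format are assumptions; the commit only specifies that depth, flush size, and overflow are logged.

```typescript
// Illustrative debounce queue with depth/flush/overflow logging;
// the overflow policy (drop oldest) and log strings are assumptions.
class MemoryDebounceQueue<T> {
  private items: T[] = [];
  constructor(
    private readonly maxDepth: number,
    private readonly onFlush: (batch: T[]) => void,
    private readonly log: (msg: string) => void = console.log,
  ) {}

  enqueue(item: T): void {
    if (this.items.length >= this.maxDepth) {
      this.log(`[memory] queue overflow, dropping oldest (depth=${this.items.length})`);
      this.items.shift(); // bound memory use under bursty extraction
    }
    this.items.push(item);
    this.log(`[memory] enqueued (depth=${this.items.length})`);
  }

  flush(): void {
    if (this.items.length === 0) return;
    this.log(`[memory] flushing batch (size=${this.items.length})`);
    const batch = this.items;
    this.items = [];
    this.onFlush(batch);
  }
}
```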
…tion

When the LLM returns a response but 0 candidates are parsed, there was
no visibility into why. Add logging for all silent failure paths:

- No memories array in response (wrong JSON structure) — logs keys + preview
- All candidates filtered out — logs per-reason counters (empty, too long,
  low confidence)

Prompted by real debugging: LLM returned 1065 chars but candidateCount=0
with no explanation in logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
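The diagnostics described above might be sketched like this. The response shape (`memories` array of strings), the 120-char preview, and the rejection counters are assumptions for illustration.

```typescript
// Sketch of zero-candidate parse diagnostics; the response schema and
// field names are assumed, not taken from the real extractor.
function parseMemories(raw: string, log: (m: string) => void): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    log(`[memory] unparseable JSON, preview="${raw.slice(0, 120)}"`);
    return [];
  }
  const obj = parsed as Record<string, unknown>;
  if (!Array.isArray(obj.memories)) {
    // Wrong JSON structure: log the keys we DID get plus a preview,
    // instead of returning an unexplained candidateCount=0.
    log(`[memory] no memories array, keys=[${Object.keys(obj).join(",")}], preview="${raw.slice(0, 120)}"`);
    return [];
  }
  const rejected = { empty: 0 };
  const kept = (obj.memories as unknown[]).filter((m) => {
    const text = typeof m === "string" ? m.trim() : "";
    if (text.length === 0) {
      rejected.empty += 1; // per-reason counter for filtered candidates
      return false;
    }
    return true;
  }) as string[];
  if (kept.length === 0 && rejected.empty > 0) {
    log(`[memory] all candidates filtered: empty=${rejected.empty}`);
  }
  return kept;
}
```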
…sized content

Root cause: LLM extracted a rich project memory (tech stack + architecture
+ conventions) exceeding the 500-char maxContentLength limit, which was
silently discarded. User saw nothing — worst outcome for valuable content.

Three-layer fix following "it just works" UX principle:

1. Raise maxContentLength 500 → 1000 — accommodates real-world structured
   knowledge that is naturally longer than simple preferences
2. Add length constraint to extraction prompt — LLM now knows the limit
   and is instructed to split rich content into multiple atomic memories
3. Graceful degradation — if content still exceeds the limit, truncate
   with ellipsis instead of discarding. Partial value > zero value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
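Layer 3 of this commit, graceful degradation, might be sketched as below. Note that a later commit in this PR replaces this truncation with pass-through; the constant and function name here are hypothetical.

```typescript
// Sketch of layer 3 (graceful degradation) as of this commit;
// MAX_CONTENT_LENGTH mirrors the raised 1000-char limit. A later
// commit in this PR removes this clamp in favor of pass-through.
const MAX_CONTENT_LENGTH = 1000;

function clampMemoryContent(content: string): string {
  if (content.length <= MAX_CONTENT_LENGTH) return content;
  // Truncate with an ellipsis instead of discarding: partial value > zero.
  return content.slice(0, MAX_CONTENT_LENGTH - 1) + "…";
}
```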
Replace truncation with pass-through: the extraction prompt already
specifies the length constraint. If the LLM still exceeds it, the extra
content is necessary for completeness — store as-is rather than
destroying information integrity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Project-scoped memories now display the project name alongside the scope
label (e.g., "Project · OpenCow") instead of just "Project". Resolves
the name from appStore.projects using the memory's projectId.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The filter toggle showed "全局/Global" while memory cards showed
"用户/User" for the same user-scope concept. Unified to "全局/Global"
everywhere since user-scope memories apply across all projects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d MemorySection

Replace inline switch implementations with the shared Switch from
@/components/ui/switch, consistent with ConnectionCard usage.
Eliminates duplicated toggle markup and ensures visual consistency.

Co-Authored-By: Claude <noreply@anthropic.com>
Tests used 600-char content to exceed the old 500-char limit, but the
limit was raised to 1000. Bump test content to 1100 chars so the
"too long" rejection paths are still exercised.

Co-Authored-By: Claude <noreply@anthropic.com>
HikoQiu merged commit 40b5b63 into main on Mar 30, 2026
6 checks passed
HikoQiu deleted the feat/engine-agnostic-session branch on March 30, 2026 01:06