feat(cognition,#1385): generate_response PR-1 — pure types + prompt builder + identity-reminder template #1388
Merged
…uilder + identity-reminder template

Oxidizer for AIDecisionService.generateResponse (TS, see src/system/ai/server/AIDecisionService.ts:316-452 + buildResponseMessages helper). Sibling to check_redundancy stack (#1375) + should_respond (already oxidized). This is the LAST remaining TS-side AI logic in AIDecisionService.ts.

## What this ships (PR-1 scope — pure, atomic)

- `GenerateResponseRequest` (ts-rs) — { context, model?, temperature?, max_tokens?, timeout_ms? }
- `GenerateResponseResult` (ts-rs) — { text, model, response_time_ms, timestamp, tokens_used? }
- `TokenUsage` (ts-rs) — { input, output, total }
- `build_response_messages(&AIDecisionContext, current_time_ms) -> Vec<ChatMessage>` — pure. Composes:
  1. System-prompt message (from context.system_prompt)
  2. Conversation history with [HH:MM] time prefix + hour-gap markers (⏱️ N hour passed)
  3. Identity-reminder system message at end
- `build_identity_reminder(persona_name, members, current_time) -> String` — pure. Canonical ~50-line critical-topic-detection prompt.
- `extract_room_members(system_prompt) -> &str` — pure. Pulls `Current room members: ...` from a system prompt body.
- `format_current_time(ms) -> String` — pure. UTC `MM/DD/YYYY HH:MM`.
- `format_time_prefix(Option<ms>) -> String` — pure. UTC `[HH:MM] `.
- `hour_gap_marker(gap_ms) -> Option<String>` — pure.

## NOT in this PR

- **PR-2**: cognition/generate-response IPC handler — async composer that calls build_response_messages -> AI provider (existing local Qwen router) -> result with timing + tokio::time::timeout replacing the TS Promise.race.
- **PR-3**: TS shim — AIDecisionService.generateResponse delegates to RustCoreIPCClient.cognitionGenerateResponse.
- **PR-4**: Delete dead TS — buildResponseMessages + inline identity-reminder template (~250 LOC removed).

After PR-3 + PR-4, AIDecisionService.ts is pure slot-coordination + shim code.

## Discipline

- All pure functions; caller passes current_time_ms so tests are deterministic.
- UTC time formatting removes hidden TZ dependency the TS version had (server timezone was leaking into model prompts via toLocaleDateString).
- Members extraction falls back to literal "unknown members" string — matches TS exactly so prompt machinery doesn't regress.
- Empty system_prompt treated as missing (avoids emitting an empty system row that some providers reject).
- Identity-reminder template byte-for-byte parity with TS modulo substitutions.
- All ts-rs export bindings.

## Tests (29 — 26 logic + 3 ts-rs export)

format_current_time:
- mm/dd/yyyy hh:mm UTC at known timestamp
- epoch zero boundary

extract_room_members:
- well-formed line extraction
- no trailing newline
- missing prefix -> UNKNOWN_MEMBERS fallback
- empty after prefix -> UNKNOWN_MEMBERS fallback

format_time_prefix:
- HH:MM UTC render
- None -> empty string

hour_gap_marker:
- under threshold -> None
- 1 hour singular
- 2+ hours plural

identity_reminder:
- embeds persona + members + time
- preserves four-step protocol
- preserves time-gap heuristic line

build_response_messages:
- system + history + identity in order
- omits system when None
- omits system when empty string
- injects hour-gap marker for > 1h gaps
- no marker under one hour
- gap tracking ignores clockless messages (TS parity)
- name fallback when missing
- extracts members for identity reminder end-to-end
- unknown members fallback when prompt missing line
- no system prompt -> unknown members fallback
- preserves role strings as-is (TS casts but Rust preserves)
- empty history

Full cognition regression: 325/325 pass.

Ref: #1385 oxidizer card just filed; #1248 umbrella.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
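The three time helpers named in this commit message are small enough to sketch with the standard library alone. What follows is an illustration of the described behavior, not the merged code: the function names come from the commit message, while the `u64` millisecond type, the strict "more than one hour" threshold, the floor-to-whole-hours rule, and the exact marker wording are assumptions inferred from the listed test names.

```rust
// Sketch only: names follow the commit message, bodies are assumptions.
const HOUR_MS: u64 = 3_600_000;
const DAY_MS: u64 = 86_400_000;

/// Converts days since 1970-01-01 into (year, month, day) in UTC,
/// using the well-known civil-from-days arithmetic (no time-zone lookup).
fn civil_from_days(days: u64) -> (u64, u64, u64) {
    let z = days + 719_468; // shift the origin to 0000-03-01
    let era = z / 146_097; // 400-year cycles
    let doe = z - era * 146_097; // day of era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + u64::from(month <= 2);
    (year, month, day)
}

/// UTC `MM/DD/YYYY HH:MM` for a millisecond timestamp.
pub fn format_current_time(ms: u64) -> String {
    let (year, month, day) = civil_from_days(ms / DAY_MS);
    let minute_of_day = (ms % DAY_MS) / 60_000;
    format!(
        "{:02}/{:02}/{:04} {:02}:{:02}",
        month,
        day,
        year,
        minute_of_day / 60,
        minute_of_day % 60
    )
}

/// UTC `[HH:MM] ` prefix, or an empty string when the message has no clock.
pub fn format_time_prefix(ms: Option<u64>) -> String {
    match ms {
        Some(ms) => {
            let minute_of_day = (ms % DAY_MS) / 60_000;
            format!("[{:02}:{:02}] ", minute_of_day / 60, minute_of_day % 60)
        }
        None => String::new(),
    }
}

/// Marker text for a gap longer than one hour (threshold and wording assumed).
pub fn hour_gap_marker(gap_ms: u64) -> Option<String> {
    if gap_ms <= HOUR_MS {
        return None;
    }
    let hours = gap_ms / HOUR_MS; // whole hours, rounded down
    let unit = if hours == 1 { "hour" } else { "hours" };
    Some(format!("⏱️ {} {} passed", hours, unit))
}
```

Because every function takes the timestamp as an argument instead of reading the clock, the same input always yields the same string, which is the determinism property the Discipline section relies on.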
joelteply added a commit that referenced this pull request on May 18, 2026

…se + cognition/generate-response IPC handler (#1390)

Stacks on PR-1 #1388 (pure types + prompt builder + identity-reminder template). PR-2 wires the async path: build_response_messages → adapter.generate_text (existing local Qwen router via global_registry) → result with timing + tokio::time::timeout replacing the TS Promise.race.

## What this ships (PR-2)

- `evaluate_response(GenerateResponseRequest) -> Result<GenerateResponseResult, GenerateResponseError>` — async composer. Honors per-request model/temperature/max_tokens/timeout overrides; defaults match TS (Qwen3.5 / 0.7 / 150 / 180_000ms).
- `GenerateResponseError` — typed: NoAdapter, Generation, Timeout. No silent default-on-error; caller picks fail-open vs fail-closed.
- `build_response_generation_request(&request, model, start_ms) -> TextGenerationRequest` — pure helper. Pins wire shape (provider="local", response_format=Text, purpose="cognition/generate-response", persona/room attribution).
- `result_from_response(response, model, start_ms, end_ms) -> GenerateResponseResult` — pure helper. Trims text, stamps model + timing, populates tokens_used only when total_tokens > 0 (mirrors TS truthiness).
- `cognition/generate-response` command arm in CognitionModule.

## Discipline

- `tokio::time::timeout` wraps `adapter.generate_text` — clean Timeout variant on the error enum (TS Promise.race equivalent).
- Saturating subtraction on response_time_ms — clock-backwards artifact (NTP adjustment mid-call) reports 0, not a wrapped huge u64.
- tokens_used = None when provider reports zeros — avoids emitting fake {0,0,0} measurements for providers that don't instrument usage.
- response_format=Text (TS default) — local Qwen takes plain text, no JSON-mode constraint.
- All constants are documented (DEFAULT_GENERATE_PROVIDER/MODEL/TEMPERATURE/MAX_TOKENS/TIMEOUT_MS).

## Tests (10 new — full module now 39 passing)

build_response_generation_request:
- defaults: provider=local, model=Qwen-default, temp=0.7, max=150, response_format=Text, purpose="cognition/generate-response", persona/room attribution, message count
- overrides honored (custom model + temp + max)
- caller timestamp embedded in identity reminder (time-flow through layers)

result_from_response:
- trims surrounding whitespace
- stamps model + timing
- populates tokens when provider reports total > 0
- tokens None when provider reports 0
- response_time saturates clock-backwards

GenerateResponseError:
- NoAdapter Display carries provider + model
- Timeout Display includes duration

Full cognition regression: 335/335 pass.

## NOT in this PR

- **PR-3**: TS shim — AIDecisionService.generateResponse delegates to RustCoreIPCClient.cognitionGenerateResponse + cognition mixin binding.
- **PR-4**: Delete dead TS — buildResponseMessages helper + inline identity-reminder template (~250 LOC removed).

Ref: #1385 oxidizer card, #1388 PR-1 (MERGED).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
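The pure result-mapping rules in this commit message (trim the text, saturate the elapsed time at zero, report no token usage when the provider reports zero) can be illustrated in a few lines. This is a hedged sketch rather than the merged implementation: `ProviderResponse` and its field names are hypothetical stand-ins for the real provider type, `timestamp = end_ms` is an assumption, and the error variant payloads and `Display` wording are guesses shaped only by the test names above.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq)]
pub struct TokenUsage {
    pub input: u32,
    pub output: u32,
    pub total: u32,
}

/// Hypothetical stand-in for the provider response; field names are assumed.
pub struct ProviderResponse {
    pub text: String,
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub total_tokens: u32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct GenerateResponseResult {
    pub text: String,
    pub model: String,
    pub response_time_ms: u64,
    pub timestamp: u64,
    pub tokens_used: Option<TokenUsage>,
}

/// Maps a provider response into the result shape without touching the clock.
pub fn result_from_response(
    response: ProviderResponse,
    model: &str,
    start_ms: u64,
    end_ms: u64,
) -> GenerateResponseResult {
    // A zero total is treated as "provider does not report usage".
    let tokens_used = (response.total_tokens > 0).then(|| TokenUsage {
        input: response.input_tokens,
        output: response.output_tokens,
        total: response.total_tokens,
    });
    GenerateResponseResult {
        text: response.text.trim().to_string(),
        model: model.to_string(),
        // If the clock moved backwards mid-call, report 0 instead of wrapping.
        response_time_ms: end_ms.saturating_sub(start_ms),
        timestamp: end_ms, // assumption: the result is stamped with the end time
        tokens_used,
    }
}

/// Typed failures; payload shapes and message wording are assumptions.
#[derive(Debug)]
pub enum GenerateResponseError {
    NoAdapter { provider: String, model: String },
    Generation(String),
    Timeout { timeout_ms: u64 },
}

impl fmt::Display for GenerateResponseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NoAdapter { provider, model } => {
                write!(f, "no adapter for provider '{}' (model '{}')", provider, model)
            }
            Self::Generation(reason) => write!(f, "generation failed: {}", reason),
            Self::Timeout { timeout_ms } => {
                write!(f, "generation timed out after {}ms", timeout_ms)
            }
        }
    }
}

impl std::error::Error for GenerateResponseError {}
```

Keeping the mapping separate from the async call means the timing and token rules can be unit-tested with hand-picked `start_ms` and `end_ms` values, without a provider or a runtime.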
joelteply added a commit that referenced this pull request on May 18, 2026

… TS (PR-4 folded) (#1402)

Stacks on PR-2 #1390 (async evaluate_response + cognition/generate-response IPC handler). AIDecisionService.generateResponse now delegates to RustCoreIPCClient.cognitionGenerateResponse; ~110 LOC of TS prompt assembly + timeout race + token decoding deleted. Mirrors codex's check_redundancy PR-3 #1383 shape (folded PR-4 dead-code delete in).

## What this ships

- `AIDecisionService.generateResponse` now a thin shim:
  - InferenceCoordinator.requestSlot (TS owns slot coordination — platform concern)
  - client.cognitionGenerateResponse(request) — single IPC call
  - InferenceCoordinator.releaseSlot
  - logError + rethrow on failure (no fail-open silent default)
- New TS binding method `cognitionGenerateResponse(GenerateResponseRequest) -> Promise<GenerateResponseResult>` in the cognition mixin
- `GenerateResponseRequest` + `GenerateResponseResult` re-exported from the generated barrel (already present from PR-1)

## Dead TS deleted (PR-4 folded in)

- `private static buildResponseMessages(context)` helper (~115 LOC): system-prompt injection, conversation history with [HH:MM] prefix, hour-gap markers, ~50-line identity-reminder template — all moved to Rust in PR-1.
- `import { AIProviderDaemon }` — no longer referenced after both checkRedundancy (#1383) + generateResponse migrations.
- `import type { TextGenerationRequest, TextGenerationResponse }` — ditto, only used by deleted helper.
- Inline timeout Promise.race code — replaced by Rust-side tokio::time::timeout in PR-2.

After this PR, `AIDecisionService.ts` contains only:

- evaluateGating (already shim to cognition/should-respond)
- checkRedundancy (already shim to cognition/check-redundancy)
- generateResponse (now shim to cognition/generate-response)
- InferenceCoordinator slot management (TS-owned platform concern)
- logging helpers (TS-owned platform concern)

## Discipline

- No fail-open path — errors throw, caller decides (consistent with codex's check_redundancy shim pattern).
- Cast `context as unknown as RustAIDecisionContext` matches the pattern in cognitionShouldRespond + cognitionCheckRedundancy — TS RAGContext.identity wraps the system prompt; TS already resolves to context.systemPrompt before sending.
- Slot coordination explicitly stays TS — that's the seam codex drew with check_redundancy, preserved here.
- Token shape preserved: `result.tokensUsed` is `TokenUsage | None`; TS just passes through (Rust already mapped from provider's UsageMetrics, returning None for zero-token providers).

## Stack progress

- #1385 PR-1 (pure types + prompt builder + identity-reminder template): #1388 MERGED
- #1385 PR-2 (async evaluate_response + IPC handler): #1390 OPEN
- #1385 PR-3 (TS shim + dead-TS delete): **this PR**
- #1385 PR-4 (dead-TS delete): **folded into this PR**

## Refs

- #1385 sub-card
- #1388 PR-1 (MERGED)
- #1390 PR-2 (in flight)
- #1383 codex's check_redundancy PR-3 — same shape
- #1248 umbrella

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 18, 2026
## Summary

Oxidizer for `AIDecisionService.generateResponse` (TS, see `src/system/ai/server/AIDecisionService.ts:316-452` + `buildResponseMessages` helper at lines 439-540). Sibling to the just-completed `check_redundancy` stack (#1375 — PRs #1377/#1381/#1383 MERGED). This is the last remaining TS-side AI logic in `AIDecisionService.ts`.

## What this ships (PR-1 scope — pure, atomic)

- `GenerateResponseRequest` (ts-rs) — `{ context, model?, temperature?, max_tokens?, timeout_ms? }`
- `GenerateResponseResult` (ts-rs) — `{ text, model, response_time_ms, timestamp, tokens_used? }`
- `TokenUsage` (ts-rs) — `{ input, output, total }`
- `build_response_messages(&AIDecisionContext, current_time_ms) -> Vec<ChatMessage>` — pure. Composes:
  1. System-prompt message (from `context.system_prompt`)
  2. Conversation history with `[HH:MM]` time prefix + hour-gap markers (`⏱️ N hour passed`)
  3. Identity-reminder system message at end
- `build_identity_reminder(persona_name, members, current_time) -> String` — pure. Canonical ~50-line critical-topic-detection prompt template (byte-for-byte parity with TS modulo substitutions).
- `extract_room_members(system_prompt) -> &str` — pure. Pulls `Current room members: ...` from a system prompt body.
- `format_current_time(ms) -> String` — pure. UTC `MM/DD/YYYY HH:MM`.
- `format_time_prefix(Option<ms>) -> String` — pure. UTC `[HH:MM] `.
- `hour_gap_marker(gap_ms) -> Option<String>` — pure.

## NOT in this PR

- **PR-2**: `cognition/generate-response` IPC handler — async composer that calls `build_response_messages` → AI provider (existing local Qwen router) → result with timing + `tokio::time::timeout` replacing the TS `Promise.race`.
- **PR-3**: TS shim — `AIDecisionService.generateResponse` delegates to `RustCoreIPCClient.cognitionGenerateResponse`.
- **PR-4**: Delete dead TS — `buildResponseMessages` + inline identity-reminder template (~250 LOC removed).

After PR-3 + PR-4, `AIDecisionService.ts` is pure slot-coordination + shim code.

## Discipline

- All pure functions; caller passes `current_time_ms` so tests are deterministic.
- UTC time formatting removes hidden TZ dependency the TS version had (server timezone was leaking into model prompts via `toLocaleDateString`).
- Members extraction falls back to literal `"unknown members"` string — matches TS exactly so prompt machinery doesn't regress.
- Empty `system_prompt` treated as missing (avoids emitting an empty system row that some providers reject).

## Tests (29 — 26 logic + 3 ts-rs export)
format_current_time (2): mm/dd/yyyy hh:mm UTC at known timestamp, epoch zero boundary.
extract_room_members (4): happy path, no trailing newline, missing prefix→fallback, empty after prefix→fallback.
format_time_prefix (2): HH:MM UTC render, None→empty.
hour_gap_marker (3): under threshold→None, 1 hour singular, 2+ hours plural.
identity_reminder (3): embeds persona+members+time, preserves four-step protocol, preserves time-gap heuristic line.
build_response_messages (12): system+history+identity order, omits system when None, omits system when empty string, hour-gap marker for >1h gaps, no marker under one hour, gap tracking ignores clockless messages (TS parity), name fallback, members extracted for identity reminder end-to-end, unknown members fallback when prompt missing line, no system prompt→unknown members fallback, preserves role strings as-is, empty history.
ts-rs exports (3): all 3 wire types pin barrel consistency.
Full cognition regression: 325/325 pass.
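To make the `build_response_messages` test cases above concrete, here is a minimal sketch of the composition order and the gap-tracking rule. It is an approximation under stated assumptions, not the PR's code: the struct fields, the `"Unknown"` name fallback, the `name: content` row layout, the `system` role for markers, and the one-line reminder (a placeholder for the real ~50-line template) are all hypothetical. Only the function names, the `Current room members:` prefix, and the `"unknown members"` fallback come from the description.

```rust
const HOUR_MS: u64 = 3_600_000;
const MEMBERS_PREFIX: &str = "Current room members:";
const UNKNOWN_MEMBERS: &str = "unknown members";

#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Hypothetical history row; field names are assumptions.
pub struct HistoryMessage {
    pub role: String,
    pub name: Option<String>,
    pub content: String,
    pub timestamp_ms: Option<u64>,
}

/// Hypothetical subset of the real context type.
pub struct AIDecisionContext {
    pub persona_name: String,
    pub system_prompt: Option<String>,
    pub history: Vec<HistoryMessage>,
}

/// Returns the text after `Current room members:` up to the end of that line,
/// or the literal fallback when the line is missing or empty.
pub fn extract_room_members(system_prompt: &str) -> &str {
    let Some(start) = system_prompt.find(MEMBERS_PREFIX) else {
        return UNKNOWN_MEMBERS;
    };
    let rest = &system_prompt[start + MEMBERS_PREFIX.len()..];
    let members = rest.lines().next().unwrap_or("").trim();
    if members.is_empty() { UNKNOWN_MEMBERS } else { members }
}

fn system(content: String) -> ChatMessage {
    ChatMessage { role: "system".to_string(), content }
}

pub fn build_response_messages(ctx: &AIDecisionContext, current_time_ms: u64) -> Vec<ChatMessage> {
    let mut messages = Vec::new();
    // An empty prompt is treated the same as a missing one.
    let prompt = ctx.system_prompt.as_deref().filter(|p| !p.is_empty());
    if let Some(p) = prompt {
        messages.push(system(p.to_string()));
    }

    let mut last_ts: Option<u64> = None;
    for msg in &ctx.history {
        let mut prefix = String::new();
        // Messages without a timestamp neither emit a marker nor reset the gap.
        if let Some(ts) = msg.timestamp_ms {
            let gap = last_ts.map_or(0, |prev| ts.saturating_sub(prev));
            if gap > HOUR_MS {
                let hours = gap / HOUR_MS;
                let unit = if hours == 1 { "hour" } else { "hours" };
                messages.push(system(format!("⏱️ {} {} passed", hours, unit)));
            }
            last_ts = Some(ts);
            let minutes = (ts / 60_000) % 1_440; // minute of the UTC day
            prefix = format!("[{:02}:{:02}] ", minutes / 60, minutes % 60);
        }
        let name = msg.name.as_deref().unwrap_or("Unknown");
        messages.push(ChatMessage {
            role: msg.role.clone(), // role string is passed through unchanged
            content: format!("{}{}: {}", prefix, name, msg.content),
        });
    }

    // Placeholder reminder; the real template is far longer.
    let members = extract_room_members(prompt.unwrap_or(""));
    messages.push(system(format!(
        "Identity reminder (placeholder): you are {}. Room members: {}. Time (ms): {}",
        ctx.persona_name, members, current_time_ms
    )));
    messages
}
```

In this sketch the gap is measured between consecutive timestamped messages only, which is one reading of the "gap tracking ignores clockless messages" case: an untimed row in the middle of the history does not hide a long pause on either side of it.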
## Refs

- `rate_proposals` stack (MERGED):
  - refactor(cognition,#1289): rate_proposals PR-1 — Rust types+prompt+parser slice #1290
  - refactor(cognition,#1289): rate_proposals PR-2 — IPC handler + orchestrator #1291
  - refactor(cognition,#1289): rate_proposals PR-3 — delete dead TS adapter (no production callers) #1293
- `should_respond.rs` (already oxidized — evaluateGating arm)

## Test plan

- `cargo test --package continuum-core --lib --features metal,accelerate cognition::generate_response::` — 29/29 pass
- `cargo test --package continuum-core --lib --features metal,accelerate cognition::` — 325/325 pass (no regressions)

🤖 Generated with Claude Code