feat(cognition,#1385): generate_response PR-2 — async evaluate_response + cognition/generate-response IPC handler by joelteply · Pull Request #1390 · CambrianTech/continuum

joelteply · 2026-05-18T16:24:07Z

Summary

Stacks on PR-1 #1388 (pure types + prompt builder + identity-reminder template, MERGED at 872e84a). PR-2 wires the async path: build_response_messages → adapter.generate_text (existing local Qwen router via global_registry) → result with timing + tokio::time::timeout replacing the TS Promise.race.
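The deadline-bounded call described above can be illustrated without an async runtime. The sketch below is a std-only stand-in and not the PR's code: it runs a hypothetical blocking generation closure on a worker thread and waits with `mpsc::Receiver::recv_timeout`, mapping expiry to a `Timeout` variant in the way `tokio::time::timeout` maps an elapsed deadline to an error. The names `call_with_deadline` and `CallError` are assumptions made for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type; the PR's real enum is GenerateResponseError.
#[derive(Debug, PartialEq)]
enum CallError {
    Timeout { timeout_ms: u64 },
    Generation(String),
}

/// Runs `work` on a worker thread and waits at most `timeout_ms` for its result.
fn call_with_deadline<F>(work: F, timeout_ms: u64) -> Result<String, CallError>
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline passed; ignore that.
        let _ = tx.send(work());
    });
    match rx.recv_timeout(Duration::from_millis(timeout_ms)) {
        Ok(Ok(text)) => Ok(text),
        Ok(Err(msg)) => Err(CallError::Generation(msg)),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(CallError::Timeout { timeout_ms }),
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err(CallError::Generation("worker exited without a result".to_string()))
        }
    }
}

fn main() {
    let fast = call_with_deadline(|| Ok("hello".to_string()), 1_000);
    let slow = call_with_deadline(
        || {
            thread::sleep(Duration::from_millis(500));
            Ok("too late".to_string())
        },
        20,
    );
    println!("{:?} / {:?}", fast, slow);
}
```

One difference worth keeping in mind: dropping a timed-out future in tokio cancels it at its next await point, whereas the worker thread in this stand-in keeps running until the closure returns.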

What this ships

evaluate_response(GenerateResponseRequest) -> Result<GenerateResponseResult, GenerateResponseError> — async composer. Honors per-request model/temperature/max_tokens/timeout_ms overrides; defaults match TS (Qwen3.5 / 0.7 / 150 / 180_000ms).
GenerateResponseError — typed: NoAdapter, Generation, Timeout. No silent default-on-error; caller picks fail-open vs fail-closed.
build_response_generation_request(&request, model, start_ms) -> TextGenerationRequest — pure helper. Pins wire shape (provider="local", response_format=Text, purpose="cognition/generate-response", persona/room attribution).
result_from_response(response, model, start_ms, end_ms) -> GenerateResponseResult — pure helper. Trims text, stamps model + timing, populates tokens_used only when total_tokens > 0 (mirrors TS truthiness check on usage object).
cognition/generate-response command arm in CognitionModule::handle_command.
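As a rough illustration of how per-request overrides could fall back to the TS-matching defaults, here is a minimal sketch. The constant names follow the DEFAULT_GENERATE_* family mentioned in the commit message; their types, the `Overrides` and `Resolved` structs, and the `resolve` function are assumptions, and the model string is a placeholder because the PR text only says "Qwen3.5".

```rust
// Defaults named in the PR text. Types are assumed; the model identifier is a
// placeholder, since the exact string is not given in the PR description.
const DEFAULT_GENERATE_MODEL: &str = "qwen3.5-placeholder";
const DEFAULT_GENERATE_TEMPERATURE: f32 = 0.7;
const DEFAULT_GENERATE_MAX_TOKENS: u32 = 150;
const DEFAULT_GENERATE_TIMEOUT_MS: u64 = 180_000;

/// Hypothetical subset of the request: only the overridable fields.
#[derive(Default)]
struct Overrides {
    model: Option<String>,
    temperature: Option<f32>,
    max_tokens: Option<u32>,
    timeout_ms: Option<u64>,
}

/// Effective settings after applying defaults.
#[derive(Debug, PartialEq)]
struct Resolved {
    model: String,
    temperature: f32,
    max_tokens: u32,
    timeout_ms: u64,
}

/// Each field uses the caller's value when present, else the default.
fn resolve(o: &Overrides) -> Resolved {
    Resolved {
        model: o
            .model
            .clone()
            .unwrap_or_else(|| DEFAULT_GENERATE_MODEL.to_string()),
        temperature: o.temperature.unwrap_or(DEFAULT_GENERATE_TEMPERATURE),
        max_tokens: o.max_tokens.unwrap_or(DEFAULT_GENERATE_MAX_TOKENS),
        timeout_ms: o.timeout_ms.unwrap_or(DEFAULT_GENERATE_TIMEOUT_MS),
    }
}

fn main() {
    let defaults = resolve(&Overrides::default());
    let custom = resolve(&Overrides {
        max_tokens: Some(64),
        ..Overrides::default()
    });
    println!("{:?}\n{:?}", defaults, custom);
}
```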

Discipline

tokio::time::timeout wraps adapter.generate_text — clean Timeout variant on error enum (TS Promise.race equivalent).
Saturating subtraction on response_time_ms — clock-backwards artifact (NTP adjustment mid-call) reports 0, not a wrapped huge u64.
tokens_used = None when provider reports zeros — avoids emitting fake {0,0,0} measurements for providers that don't instrument usage.
response_format=Text (TS default) — local Qwen takes plain text, no JSON-mode constraint.
All constants documented with their TS-default origins.
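The trimming, saturating-timing, and zero-usage rules above fit in a few lines. The following is a minimal sketch under assumed types: `TokenUsage`, `ProviderResponse`, and `GenerateResult` are simplified stand-ins for the real structs, and only the function name `result_from_response` and the three behaviors come from the PR description.

```rust
/// Simplified stand-in for the provider's usage metrics.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TokenUsage {
    prompt_tokens: u32,
    completion_tokens: u32,
    total_tokens: u32,
}

/// Simplified stand-in for the adapter's response.
struct ProviderResponse {
    text: String,
    usage: TokenUsage,
}

/// Simplified stand-in for GenerateResponseResult.
#[derive(Debug, PartialEq)]
struct GenerateResult {
    text: String,
    model: String,
    response_time_ms: u64,
    tokens_used: Option<TokenUsage>,
}

fn result_from_response(
    response: ProviderResponse,
    model: &str,
    start_ms: u64,
    end_ms: u64,
) -> GenerateResult {
    GenerateResult {
        // Strip leading and trailing whitespace from the generated text.
        text: response.text.trim().to_string(),
        model: model.to_string(),
        // If the clock moved backwards mid-call, report 0 instead of wrapping.
        response_time_ms: end_ms.saturating_sub(start_ms),
        // Treat an all-zero usage report as "not instrumented".
        tokens_used: if response.usage.total_tokens > 0 {
            Some(response.usage)
        } else {
            None
        },
    }
}

fn main() {
    let response = ProviderResponse {
        text: "  hello there \n".to_string(),
        usage: TokenUsage { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
    };
    // end_ms < start_ms simulates a clock adjustment during the call.
    let result = result_from_response(response, "some-model", 2_000, 1_500);
    println!("{:?}", result);
}
```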

Tests (10 new — full module now 39 passing)

build_response_generation_request (3): defaults shape (provider/model/temp/max/response_format/purpose/attribution/message count), overrides honored, caller timestamp embedded in identity reminder.

result_from_response (5): trims whitespace, stamps model + timing, populates tokens when provider reports total > 0, tokens None when zero, response_time saturates on clock-backwards.

GenerateResponseError (2): NoAdapter Display carries provider + model, Timeout Display includes duration.
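For readers who want a picture of what the Display tests check, here is a minimal sketch of a typed error enum. The variant names come from this PR; the field layout, the message wording, and the fail-open fallback in `main` are assumptions chosen for illustration.

```rust
use std::fmt;

/// Sketch of the typed error. Variant names are from the PR; fields are assumed.
#[derive(Debug)]
enum GenerateResponseError {
    NoAdapter { provider: String, model: String },
    Generation(String),
    Timeout { timeout_ms: u64 },
}

impl fmt::Display for GenerateResponseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NoAdapter { provider, model } => {
                write!(f, "no adapter for provider '{}' (model '{}')", provider, model)
            }
            Self::Generation(msg) => write!(f, "generation failed: {}", msg),
            Self::Timeout { timeout_ms } => {
                write!(f, "generation timed out after {}ms", timeout_ms)
            }
        }
    }
}

impl std::error::Error for GenerateResponseError {}

fn main() {
    let errors = vec![
        GenerateResponseError::NoAdapter {
            provider: "local".to_string(),
            model: "some-model".to_string(),
        },
        GenerateResponseError::Generation("backend unavailable".to_string()),
        GenerateResponseError::Timeout { timeout_ms: 180_000 },
    ];
    for err in errors {
        // The caller, not the library, decides the failure policy.
        // This caller fails open with an empty reply; another could propagate.
        let outcome: Result<String, GenerateResponseError> = Err(err);
        let reply = outcome.unwrap_or_else(|e| {
            eprintln!("generate-response failed: {}", e);
            String::new()
        });
        println!("reply = {:?}", reply);
    }
}
```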

Full cognition regression: 335/335 pass.

NOT in this PR

PR-3: TS shim — AIDecisionService.generateResponse delegates to RustCoreIPCClient.cognitionGenerateResponse + cognition mixin binding.
PR-4: Delete dead TS — buildResponseMessages helper + inline identity-reminder template (~250 LOC removed). After PR-3 + PR-4, AIDecisionService.ts is pure slot-coordination + IPC shim code.

Refs

#1385 (oxidizer: migrate AIDecisionService.generateResponse to Rust cognition/generate-response): sub-card
#1388 (generate_response PR-1: pure types + prompt builder + identity-reminder template): PR-1 (MERGED)
#1248 (TS-side AI logic that violates 'TS is thin glue' directive, umbrella triage): umbrella
#1375 (oxidizer: migrate AIDecisionService.checkRedundancy to Rust cognition/check-redundancy): sibling pattern, the check_redundancy stack. MERGED PRs: #1377 (check_redundancy PR-1: pure types + prompt + parser), #1381 (wire check redundancy IPC), #1383 (delegate redundancy shim to rust)

Test plan

cargo test --package continuum-core --lib --features metal,accelerate cognition::generate_response:: — 39/39 pass
cargo test --package continuum-core --lib --features metal,accelerate cognition:: — 335/335 pass
Pre-push TS clean, ESLint baseline held, Rust compile clean
CI green
PR-3 (TS shim) + PR-4 (dead-TS delete) stack on top

🤖 Generated with Claude Code

…se + cognition/generate-response IPC handler

Stacks on PR-1 #1388 (pure types + prompt builder + identity-reminder template). PR-2 wires the async path: build_response_messages → adapter.generate_text (existing local Qwen router via global_registry) → result with timing + tokio::time::timeout replacing the TS Promise.race.

## What this ships (PR-2)

- `evaluate_response(GenerateResponseRequest) -> Result<GenerateResponseResult, GenerateResponseError>`: async composer. Honors per-request model/temperature/max_tokens/timeout overrides; defaults match TS (Qwen3.5 / 0.7 / 150 / 180_000ms).
- `GenerateResponseError`: typed (NoAdapter, Generation, Timeout). No silent default-on-error; caller picks fail-open vs fail-closed.
- `build_response_generation_request(&request, model, start_ms) -> TextGenerationRequest`: pure helper. Pins wire shape (provider="local", response_format=Text, purpose="cognition/generate-response", persona/room attribution).
- `result_from_response(response, model, start_ms, end_ms) -> GenerateResponseResult`: pure helper. Trims text, stamps model + timing, populates tokens_used only when total_tokens > 0 (mirrors TS truthiness).
- `cognition/generate-response` command arm in CognitionModule.

## Discipline

- `tokio::time::timeout` wraps `adapter.generate_text`: clean Timeout variant on the error enum (TS Promise.race equivalent).
- Saturating subtraction on response_time_ms: a clock-backwards artifact (NTP adjustment mid-call) reports 0, not a wrapped huge u64.
- tokens_used = None when provider reports zeros: avoids emitting fake {0,0,0} measurements for providers that don't instrument usage.
- response_format=Text (TS default): local Qwen takes plain text, no JSON-mode constraint.
- All constants are documented (DEFAULT_GENERATE_PROVIDER/MODEL/TEMPERATURE/MAX_TOKENS/TIMEOUT_MS).

## Tests (10 new, full module now 39 passing)

build_response_generation_request:
- defaults: provider=local, model=Qwen-default, temp=0.7, max=150, response_format=Text, purpose="cognition/generate-response", persona/room attribution, message count
- overrides honored (custom model + temp + max)
- caller timestamp embedded in identity reminder (time-flow through layers)

result_from_response:
- trims surrounding whitespace
- stamps model + timing
- populates tokens when provider reports total > 0
- tokens None when provider reports 0
- response_time saturates clock-backwards

GenerateResponseError:
- NoAdapter Display carries provider + model
- Timeout Display includes duration

Full cognition regression: 335/335 pass.

## NOT in this PR

- **PR-3**: TS shim. AIDecisionService.generateResponse delegates to RustCoreIPCClient.cognitionGenerateResponse + cognition mixin binding.
- **PR-4**: Delete dead TS. buildResponseMessages helper + inline identity-reminder template (~250 LOC removed).

Ref: #1385 oxidizer card, #1388 PR-1 (MERGED).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… TS (PR-4 folded)

Stacks on PR-2 #1390 (async evaluate_response + cognition/generate-response IPC handler). AIDecisionService.generateResponse now delegates to RustCoreIPCClient.cognitionGenerateResponse; ~110 LOC of TS prompt assembly + timeout race + token decoding deleted. Mirrors codex's check_redundancy PR-3 #1383 shape (folded PR-4 dead-code delete in).

## What this ships

- `AIDecisionService.generateResponse` now a thin shim:
  - InferenceCoordinator.requestSlot (TS owns slot coordination, a platform concern)
  - client.cognitionGenerateResponse(request): single IPC call
  - InferenceCoordinator.releaseSlot
  - logError + rethrow on failure (no fail-open silent default)
- New TS binding method `cognitionGenerateResponse(GenerateResponseRequest) -> Promise<GenerateResponseResult>` in the cognition mixin
- `GenerateResponseRequest` + `GenerateResponseResult` re-exported from the generated barrel (already present from PR-1)

## Dead TS deleted (PR-4 folded in)

- `private static buildResponseMessages(context)` helper (~115 LOC): system-prompt injection, conversation history with [HH:MM] prefix, hour-gap markers, ~50-line identity-reminder template. All moved to Rust in PR-1.
- `import { AIProviderDaemon }`: no longer referenced after both checkRedundancy (#1383) + generateResponse migrations.
- `import type { TextGenerationRequest, TextGenerationResponse }`: ditto, only used by deleted helper.
- Inline timeout Promise.race code: replaced by Rust-side tokio::time::timeout in PR-2.

After this PR, `AIDecisionService.ts` contains only:

- evaluateGating (already shim to cognition/should-respond)
- checkRedundancy (already shim to cognition/check-redundancy)
- generateResponse (now shim to cognition/generate-response)
- InferenceCoordinator slot management (TS-owned platform concern)
- logging helpers (TS-owned platform concern)

## Discipline

- No fail-open path: errors throw, caller decides (consistent with codex's check_redundancy shim pattern).
- Cast `context as unknown as RustAIDecisionContext` matches the pattern in cognitionShouldRespond + cognitionCheckRedundancy. TS RAGContext.identity wraps the system prompt; TS already resolves to context.systemPrompt before sending.
- Slot coordination explicitly stays TS: that's the seam codex drew with check_redundancy, preserved here.
- Token shape preserved: `result.tokensUsed` is `TokenUsage | None`; TS just passes through (Rust already mapped from provider's UsageMetrics, returning None for zero-token providers).

## Stack progress

- #1385 PR-1 (pure types + prompt builder + identity-reminder template): #1388 MERGED
- #1385 PR-2 (async evaluate_response + IPC handler): #1390 OPEN
- #1385 PR-3 (TS shim + dead-TS delete): **this PR**
- #1385 PR-4 (dead-TS delete): **folded into this PR**

## Refs

- #1385 sub-card
- #1388 PR-1 (MERGED)
- #1390 PR-2 (in flight)
- #1383 codex's check_redundancy PR-3 (same shape)
- #1248 umbrella

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


joelteply merged commit 3fe8ecd into canary May 18, 2026
3 checks passed

joelteply deleted the feat/oxidizer-generate-response-pr2 branch May 18, 2026 16:24

github-actions Bot added the size: L label May 18, 2026

joelteply mentioned this pull request May 18, 2026

feat(cognition,#1385): generate_response PR-3 — TS shim + delete dead TS (PR-4 folded) #1402

Merged

4 tasks

This was referenced May 18, 2026

oxidizer: migrate ToolRegistry semantic search to Rust cognition/tool-embedding #1411

Closed

feat(cognition,#1411): tool_embedding PR-1 — pure types + cosine_similarity + threshold #1413

Merged
