
feat(cognition,#1385): generate_response PR-1 — pure types + prompt builder + identity-reminder template#1388

Merged
joelteply merged 1 commit into canary from feat/oxidizer-generate-response-pr1
May 18, 2026

@joelteply
Contributor

Summary

Oxidizer for AIDecisionService.generateResponse (TS, see src/system/ai/server/AIDecisionService.ts:316-452 + buildResponseMessages helper at lines 439-540). Sibling to the just-completed check_redundancy stack (#1375 — PRs #1377/#1381/#1383 MERGED). This is the last remaining TS-side AI logic in AIDecisionService.ts.

What this ships (PR-1 scope — pure, atomic)

  • GenerateResponseRequest (ts-rs) — { context, model?, temperature?, max_tokens?, timeout_ms? }
  • GenerateResponseResult (ts-rs) — { text, model, response_time_ms, timestamp, tokens_used? }
  • TokenUsage (ts-rs) — { input, output, total }
  • build_response_messages(&AIDecisionContext, current_time_ms) -> Vec<ChatMessage> — pure. Composes:
    1. System-prompt message (from context.system_prompt)
    2. Conversation history with [HH:MM] time prefix + hour-gap markers (⏱️ N hour passed)
    3. Identity-reminder system message at end
  • build_identity_reminder(persona_name, members, current_time) -> String — pure. Canonical ~50-line critical-topic-detection prompt template (byte-for-byte parity with TS modulo substitutions).
  • extract_room_members(system_prompt) -> &str — pure. Pulls Current room members: ... from a system prompt body.
  • format_current_time(ms) -> String — pure. UTC MM/DD/YYYY HH:MM.
  • format_time_prefix(Option<ms>) -> String — pure. UTC [HH:MM] .
  • hour_gap_marker(gap_ms) -> Option<String> — pure.

NOT in this PR

  • PR-2: cognition/generate-response IPC handler — async composer that calls build_response_messages → AI provider (existing local Qwen router) → result with timing + tokio::time::timeout replacing the TS Promise.race.
  • PR-3: TS shim — AIDecisionService.generateResponse delegates to RustCoreIPCClient.cognitionGenerateResponse.
  • PR-4: Delete dead TS — buildResponseMessages + inline identity-reminder template (~250 LOC removed). After PR-3 + PR-4, AIDecisionService.ts is pure slot-coordination + shim code.

Discipline

  • All pure functions; caller passes current_time_ms so tests are deterministic.
  • UTC time formatting removes hidden TZ dependency the TS version had (server timezone was leaking into model prompts via toLocaleDateString).
  • Members extraction falls back to literal "unknown members" string — matches TS exactly so prompt machinery doesn't regress.
  • Empty system_prompt treated as missing (avoids emitting an empty system row that some providers reject).
  • Identity-reminder template byte-for-byte parity with TS modulo substitutions.
  • Conversation types reused from gating stack (no new shapes invented for shared concepts).

Tests (29 — 26 logic + 3 ts-rs export)

format_current_time (2): mm/dd/yyyy hh:mm UTC at known timestamp, epoch zero boundary.
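
The two cases above can be reproduced with a std-only sketch. This is not the code from the PR: the `u64` millisecond input, the helper name `civil_from_days`, and the exact zero-padding are assumptions; only the `MM/DD/YYYY HH:MM` UTC shape comes from the description.

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the proleptic
/// Gregorian calendar, using the widely known "civil from days" arithmetic.
/// Hypothetical helper, not part of the PR.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year eras
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // 0..=365, year starts in March
    let mp = (5 * doy + 2) / 153; // 0 = March .. 11 = February
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Render epoch milliseconds as `MM/DD/YYYY HH:MM` in UTC.
/// No local-timezone lookup happens anywhere, so output is host-independent.
pub fn format_current_time(ms: u64) -> String {
    let secs = ms / 1_000;
    let (year, month, day) = civil_from_days((secs / 86_400) as i64);
    let secs_of_day = secs % 86_400;
    format!(
        "{:02}/{:02}/{:04} {:02}:{:02}",
        month,
        day,
        year,
        secs_of_day / 3_600,
        (secs_of_day % 3_600) / 60
    )
}
```

Because the caller supplies the timestamp, the same input always yields the same string, which is what makes the known-timestamp and epoch-zero tests stable.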

extract_room_members (4): happy path, no trailing newline, missing prefix→fallback, empty after prefix→fallback.
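
A minimal sketch of the four cases above, assuming the members line is matched by the literal prefix `Current room members:` and that surrounding whitespace is trimmed (the trimming is an assumption; the PR only states the prefix and the `"unknown members"` fallback). `MEMBERS_PREFIX` is a hypothetical name.

```rust
/// Fallback used when the prompt carries no usable members line.
pub const UNKNOWN_MEMBERS: &str = "unknown members";

/// Hypothetical constant name for the prefix the description mentions.
const MEMBERS_PREFIX: &str = "Current room members:";

/// Return the text following the members prefix, up to the end of that line.
/// Borrows from the input, so no allocation is needed.
pub fn extract_room_members(system_prompt: &str) -> &str {
    let start = match system_prompt.find(MEMBERS_PREFIX) {
        Some(idx) => idx + MEMBERS_PREFIX.len(),
        None => return UNKNOWN_MEMBERS, // prefix missing
    };
    // `lines()` handles both "no trailing newline" and "more text follows".
    let line = system_prompt[start..].lines().next().unwrap_or("").trim();
    if line.is_empty() {
        UNKNOWN_MEMBERS // prefix present but nothing after it
    } else {
        line
    }
}
```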

format_time_prefix (2): HH:MM UTC render, None→empty.
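
The prefix only needs the time of day, so a sketch can skip calendar math entirely. The `Option<u64>` millisecond input is an assumption; the `[HH:MM] ` shape with a trailing space and the empty string for `None` follow the description.

```rust
/// Render an optional epoch-millisecond timestamp as a `[HH:MM] ` prefix in UTC.
/// Messages without a timestamp get no prefix at all.
pub fn format_time_prefix(timestamp_ms: Option<u64>) -> String {
    match timestamp_ms {
        Some(ms) => {
            let secs_of_day = (ms / 1_000) % 86_400;
            format!("[{:02}:{:02}] ", secs_of_day / 3_600, (secs_of_day % 3_600) / 60)
        }
        None => String::new(),
    }
}
```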

hour_gap_marker (3): under threshold→None, 1 hour singular, 2+ hours plural.
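
One way the three cases could look, as a sketch. The exact marker wording, the `u64` input, and the choice to emit a marker once at least one whole hour has elapsed are assumptions; the description only fixes the `⏱️ N hour passed` shape and the singular/plural split.

```rust
const HOUR_MS: u64 = 60 * 60 * 1_000;

/// Return a marker string when at least one whole hour separates two messages.
/// Gaps are rounded down to whole hours.
pub fn hour_gap_marker(gap_ms: u64) -> Option<String> {
    match gap_ms / HOUR_MS {
        0 => None, // under the threshold: no marker
        1 => Some("⏱️ 1 hour passed".to_string()),
        n => Some(format!("⏱️ {n} hours passed")),
    }
}
```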

identity_reminder (3): embeds persona+members+time, preserves four-step protocol, preserves time-gap heuristic line.

build_response_messages (12): system+history+identity order, omits system when None, omits system when empty string, hour-gap marker for >1h gaps, no marker under one hour, gap tracking ignores clockless messages (TS parity), name fallback, members extracted for identity reminder end-to-end, unknown members fallback when prompt missing line, no system prompt→unknown members fallback, preserves role strings as-is, empty history.
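
The ordering and fallback rules listed above can be sketched end to end. Everything here is illustrative: `Context`, `HistoryMessage`, the `"Unknown"` name fallback, emitting the gap marker as its own system message, and the one-line identity reminder are all assumptions standing in for the real `AIDecisionContext` and the ~50-line template.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Hypothetical history entry; the real type comes from the gating stack.
pub struct HistoryMessage {
    pub role: String,
    pub name: Option<String>,
    pub content: String,
    pub timestamp_ms: Option<u64>,
}

/// Hypothetical stand-in for AIDecisionContext.
pub struct Context {
    pub persona_name: String,
    pub system_prompt: Option<String>,
    pub history: Vec<HistoryMessage>,
}

const HOUR_MS: u64 = 3_600_000;

fn hhmm(ms: u64) -> String {
    let s = (ms / 1_000) % 86_400;
    format!("{:02}:{:02}", s / 3_600, (s % 3_600) / 60)
}

fn members(system_prompt: Option<&str>) -> &str {
    system_prompt
        .and_then(|p| p.lines().find_map(|l| l.trim().strip_prefix("Current room members:")))
        .map(str::trim)
        .filter(|m| !m.is_empty())
        .unwrap_or("unknown members")
}

pub fn build_response_messages(ctx: &Context, current_time_ms: u64) -> Vec<ChatMessage> {
    let mut out = Vec::new();

    // 1. System prompt: an empty string is treated the same as None.
    let system = ctx.system_prompt.as_deref().filter(|p| !p.is_empty());
    if let Some(prompt) = system {
        out.push(ChatMessage { role: "system".into(), content: prompt.to_string() });
    }

    // 2. History with time prefixes; only timestamped messages move the gap clock.
    let mut last_ts: Option<u64> = None;
    for msg in &ctx.history {
        if let Some(ts) = msg.timestamp_ms {
            if let Some(prev) = last_ts {
                let hours = ts.saturating_sub(prev) / HOUR_MS;
                if hours >= 1 {
                    let unit = if hours == 1 { "hour" } else { "hours" };
                    let marker = format!("⏱️ {hours} {unit} passed");
                    out.push(ChatMessage { role: "system".into(), content: marker });
                }
            }
            last_ts = Some(ts);
        }
        let prefix = msg.timestamp_ms.map(|ts| format!("[{}] ", hhmm(ts))).unwrap_or_default();
        let name = msg.name.as_deref().unwrap_or("Unknown");
        // Role strings are passed through untouched.
        out.push(ChatMessage {
            role: msg.role.clone(),
            content: format!("{prefix}{name}: {}", msg.content),
        });
    }

    // 3. Identity reminder always comes last (placeholder text, not the real template).
    out.push(ChatMessage {
        role: "system".into(),
        content: format!(
            "IDENTITY REMINDER: you are {}. Members: {}. Time now: {} UTC.",
            ctx.persona_name,
            members(system),
            hhmm(current_time_ms)
        ),
    });
    out
}
```

Passing `current_time_ms` in, rather than reading the clock inside the builder, keeps the function pure and lets a test pin the reminder text exactly.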

ts-rs exports (3): one export test per wire type, pinning consistency with the generated barrel.

Full cognition regression: 325/325 pass.

Refs

Test plan

  • cargo test --package continuum-core --lib --features metal,accelerate cognition::generate_response:: — 29/29 pass
  • cargo test --package continuum-core --lib --features metal,accelerate cognition:: — 325/325 pass (no regressions)
  • Pre-push TS clean, ESLint baseline held, Rust compile clean
  • CI green
  • PR-2 IPC handler stack on top (separate PR)

🤖 Generated with Claude Code

…uilder + identity-reminder template

Oxidizer for AIDecisionService.generateResponse (TS, see
src/system/ai/server/AIDecisionService.ts:316-452 + buildResponseMessages
helper). Sibling to check_redundancy stack (#1375) + should_respond
(already oxidized). This is the LAST remaining TS-side AI logic in
AIDecisionService.ts.

## What this ships (PR-1 scope — pure, atomic)

- `GenerateResponseRequest` (ts-rs) — { context, model?, temperature?,
  max_tokens?, timeout_ms? }
- `GenerateResponseResult` (ts-rs) — { text, model, response_time_ms,
  timestamp, tokens_used? }
- `TokenUsage` (ts-rs) — { input, output, total }
- `build_response_messages(&AIDecisionContext, current_time_ms) ->
  Vec<ChatMessage>` — pure. Composes:
    1. System-prompt message (from context.system_prompt)
    2. Conversation history with [HH:MM] time prefix + hour-gap markers
       (⏱️ N hour passed)
    3. Identity-reminder system message at end
- `build_identity_reminder(persona_name, members, current_time) ->
  String` — pure. Canonical ~50-line critical-topic-detection prompt.
- `extract_room_members(system_prompt) -> &str` — pure. Pulls
  `Current room members: ...` from a system prompt body.
- `format_current_time(ms) -> String` — pure. UTC `MM/DD/YYYY HH:MM`.
- `format_time_prefix(Option<ms>) -> String` — pure. UTC `[HH:MM] `.
- `hour_gap_marker(gap_ms) -> Option<String>` — pure.

## NOT in this PR

- **PR-2**: cognition/generate-response IPC handler — async composer
  that calls build_response_messages -> AI provider (existing local
  Qwen router) -> result with timing + tokio::time::timeout replacing
  the TS Promise.race.
- **PR-3**: TS shim — AIDecisionService.generateResponse delegates to
  RustCoreIPCClient.cognitionGenerateResponse.
- **PR-4**: Delete dead TS — buildResponseMessages + inline
  identity-reminder template (~250 LOC removed). After PR-3 + PR-4,
  AIDecisionService.ts is pure slot-coordination + shim code.

## Discipline

- All pure functions; caller passes current_time_ms so tests are
  deterministic.
- UTC time formatting removes hidden TZ dependency the TS version had
  (server timezone was leaking into model prompts via
  toLocaleDateString).
- Members extraction falls back to literal "unknown members" string —
  matches TS exactly so prompt machinery doesn't regress.
- Empty system_prompt treated as missing (avoids emitting an empty
  system row that some providers reject).
- Identity-reminder template byte-for-byte parity with TS modulo
  substitutions.
- All ts-rs export bindings.

## Tests (29 — 26 logic + 3 ts-rs export)

format_current_time:
- mm/dd/yyyy hh:mm UTC at known timestamp
- epoch zero boundary

extract_room_members:
- well-formed line extraction
- no trailing newline
- missing prefix -> UNKNOWN_MEMBERS fallback
- empty after prefix -> UNKNOWN_MEMBERS fallback

format_time_prefix:
- HH:MM UTC render
- None -> empty string

hour_gap_marker:
- under threshold -> None
- 1 hour singular
- 2+ hours plural

identity_reminder:
- embeds persona + members + time
- preserves four-step protocol
- preserves time-gap heuristic line

build_response_messages:
- system + history + identity in order
- omits system when None
- omits system when empty string
- injects hour-gap marker for > 1h gaps
- no marker under one hour
- gap tracking ignores clockless messages (TS parity)
- name fallback when missing
- extracts members for identity reminder end-to-end
- unknown members fallback when prompt missing line
- no system prompt -> unknown members fallback
- preserves role strings as-is (TS casts but Rust preserves)
- empty history

Full cognition regression: 325/325 pass.

Ref: #1385 oxidizer card just filed; #1248 umbrella.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit 872e84a into canary May 18, 2026
1 check passed
@joelteply joelteply deleted the feat/oxidizer-generate-response-pr1 branch May 18, 2026 16:10
joelteply added a commit that referenced this pull request May 18, 2026
…se + cognition/generate-response IPC handler (#1390)

Stacks on PR-1 #1388 (pure types + prompt builder + identity-reminder
template). PR-2 wires the async path: build_response_messages →
adapter.generate_text (existing local Qwen router via global_registry)
→ result with timing + tokio::time::timeout replacing the TS
Promise.race.

## What this ships (PR-2)

- `evaluate_response(GenerateResponseRequest) -> Result<GenerateResponseResult, GenerateResponseError>`
  — async composer. Honors per-request model/temperature/max_tokens/
  timeout overrides; defaults match TS (Qwen3.5 / 0.7 / 150 / 180_000ms).
- `GenerateResponseError` — typed: NoAdapter, Generation, Timeout. No
  silent default-on-error; caller picks fail-open vs fail-closed.
- `build_response_generation_request(&request, model, start_ms) -> TextGenerationRequest`
  — pure helper. Pins wire shape (provider="local", response_format=Text,
  purpose="cognition/generate-response", persona/room attribution).
- `result_from_response(response, model, start_ms, end_ms) -> GenerateResponseResult`
  — pure helper. Trims text, stamps model + timing, populates
  tokens_used only when total_tokens > 0 (mirrors TS truthiness).
- `cognition/generate-response` command arm in CognitionModule.

## Discipline

- `tokio::time::timeout` wraps `adapter.generate_text` — clean Timeout
  variant on the error enum (TS Promise.race equivalent).
- Saturating subtraction on response_time_ms — clock-backwards artifact
  (NTP adjustment mid-call) reports 0, not a wrapped huge u64.
- tokens_used = None when provider reports zeros — avoids emitting
  fake {0,0,0} measurements for providers that don't instrument usage.
- response_format=Text (TS default) — local Qwen takes plain text,
  no JSON-mode constraint.
- All constants are documented (DEFAULT_GENERATE_PROVIDER/MODEL/
  TEMPERATURE/MAX_TOKENS/TIMEOUT_MS).
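
The timeout discipline above can be illustrated without an async runtime. The sketch below is a std-only stand-in, not the PR's code: it swaps `tokio::time::timeout` for a worker thread plus `mpsc::Receiver::recv_timeout`, and `GenerateError` and `generate_with_timeout` are hypothetical names. The point it shows is the same: a timeout surfaces as its own typed variant instead of a silent default.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical, reduced error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum GenerateError {
    Generation(String),
    Timeout { timeout_ms: u64 },
}

/// Run `generate` on a worker thread and wait at most `timeout_ms` for it.
pub fn generate_with_timeout<F>(generate: F, timeout_ms: u64) -> Result<String, GenerateError>
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may be gone after a timeout; a failed send is fine.
        let _ = tx.send(generate());
    });
    match rx.recv_timeout(Duration::from_millis(timeout_ms)) {
        Ok(Ok(text)) => Ok(text),
        Ok(Err(message)) => Err(GenerateError::Generation(message)),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(GenerateError::Timeout { timeout_ms }),
        // The worker panicked or exited before sending anything.
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err(GenerateError::Generation("worker exited without a result".to_string()))
        }
    }
}
```

One difference worth keeping in mind: dropping a timed-out tokio future cancels it, whereas the worker thread in this stand-in keeps running until its closure returns.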

## Tests (10 new — full module now 39 passing)

build_response_generation_request:
- defaults: provider=local, model=Qwen-default, temp=0.7, max=150,
  response_format=Text, purpose="cognition/generate-response",
  persona/room attribution, message count
- overrides honored (custom model + temp + max)
- caller timestamp embedded in identity reminder (time-flow through layers)
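
The defaults-versus-overrides behavior tested above might be resolved as in this sketch. The constant names follow the `DEFAULT_GENERATE_*` pattern named in the Discipline section, but the model string is a placeholder and the `Overrides` and `Resolved` structs are hypothetical simplifications of the real request types.

```rust
pub const DEFAULT_GENERATE_PROVIDER: &str = "local";
pub const DEFAULT_GENERATE_MODEL: &str = "qwen-default"; // placeholder, not the real model id
pub const DEFAULT_GENERATE_TEMPERATURE: f32 = 0.7;
pub const DEFAULT_GENERATE_MAX_TOKENS: u32 = 150;
pub const DEFAULT_GENERATE_TIMEOUT_MS: u64 = 180_000;

/// Hypothetical subset of GenerateResponseRequest: only the optional knobs.
#[derive(Default)]
pub struct Overrides {
    pub model: Option<String>,
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub timeout_ms: Option<u64>,
}

/// Hypothetical resolved settings handed to the provider call.
#[derive(Debug, PartialEq)]
pub struct Resolved {
    pub provider: &'static str,
    pub model: String,
    pub temperature: f32,
    pub max_tokens: u32,
    pub timeout_ms: u64,
    pub purpose: &'static str,
}

/// Each per-request override wins; anything left unset falls back to a constant.
pub fn resolve(overrides: &Overrides) -> Resolved {
    Resolved {
        provider: DEFAULT_GENERATE_PROVIDER,
        model: overrides
            .model
            .clone()
            .unwrap_or_else(|| DEFAULT_GENERATE_MODEL.to_string()),
        temperature: overrides.temperature.unwrap_or(DEFAULT_GENERATE_TEMPERATURE),
        max_tokens: overrides.max_tokens.unwrap_or(DEFAULT_GENERATE_MAX_TOKENS),
        timeout_ms: overrides.timeout_ms.unwrap_or(DEFAULT_GENERATE_TIMEOUT_MS),
        purpose: "cognition/generate-response",
    }
}
```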

result_from_response:
- trims surrounding whitespace
- stamps model + timing
- populates tokens when provider reports total > 0
- tokens None when provider reports 0
- response_time saturates clock-backwards

GenerateResponseError:
- NoAdapter Display carries provider + model
- Timeout Display includes duration

Full cognition regression: 335/335 pass.

## NOT in this PR

- **PR-3**: TS shim — AIDecisionService.generateResponse delegates to
  RustCoreIPCClient.cognitionGenerateResponse + cognition mixin
  binding.
- **PR-4**: Delete dead TS — buildResponseMessages helper + inline
  identity-reminder template (~250 LOC removed).

Ref: #1385 oxidizer card, #1388 PR-1 (MERGED).

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply pushed a commit that referenced this pull request May 18, 2026
… TS (PR-4 folded)

Stacks on PR-2 #1390 (async evaluate_response + cognition/generate-response
IPC handler). AIDecisionService.generateResponse now delegates to
RustCoreIPCClient.cognitionGenerateResponse; ~110 LOC of TS prompt
assembly + timeout race + token decoding deleted. Mirrors codex's
check_redundancy PR-3 #1383 shape (folded PR-4 dead-code delete in).

## What this ships

- `AIDecisionService.generateResponse` now a thin shim:
  - InferenceCoordinator.requestSlot (TS owns slot coordination — platform concern)
  - client.cognitionGenerateResponse(request) — single IPC call
  - InferenceCoordinator.releaseSlot
  - logError + rethrow on failure (no fail-open silent default)
- New TS binding method `cognitionGenerateResponse(GenerateResponseRequest)
  -> Promise<GenerateResponseResult>` in the cognition mixin
- `GenerateResponseRequest` + `GenerateResponseResult` re-exported
  from the generated barrel (already present from PR-1)

## Dead TS deleted (PR-4 folded in)

- `private static buildResponseMessages(context)` helper (~115 LOC):
  system-prompt injection, conversation history with [HH:MM] prefix,
  hour-gap markers, ~50-line identity-reminder template — all moved
  to Rust in PR-1.
- `import { AIProviderDaemon }` — no longer referenced after both
  checkRedundancy (#1383) + generateResponse migrations.
- `import type { TextGenerationRequest, TextGenerationResponse }` —
  ditto, only used by deleted helper.
- Inline timeout Promise.race code — replaced by Rust-side
  tokio::time::timeout in PR-2.

After this PR, `AIDecisionService.ts` contains only:
  - evaluateGating (already shim to cognition/should-respond)
  - checkRedundancy (already shim to cognition/check-redundancy)
  - generateResponse (now shim to cognition/generate-response)
  - InferenceCoordinator slot management (TS-owned platform concern)
  - logging helpers (TS-owned platform concern)

## Discipline

- No fail-open path — errors throw, caller decides (consistent with
  codex's check_redundancy shim pattern).
- Cast `context as unknown as RustAIDecisionContext` matches the
  pattern in cognitionShouldRespond + cognitionCheckRedundancy —
  TS RAGContext.identity wraps the system prompt; TS already
  resolves to context.systemPrompt before sending.
- Slot coordination explicitly stays TS — that's the seam codex
  drew with check_redundancy, preserved here.
- Token shape preserved: `result.tokensUsed` is `TokenUsage | None`;
  TS just passes through (Rust already mapped from provider's
  UsageMetrics, returning None for zero-token providers).

## Stack progress

- #1385 PR-1 (pure types + prompt builder + identity-reminder
  template): #1388 MERGED
- #1385 PR-2 (async evaluate_response + IPC handler): #1390 OPEN
- #1385 PR-3 (TS shim + dead-TS delete): **this PR**
- #1385 PR-4 (dead-TS delete): **folded into this PR**

## Refs

- #1385 sub-card
- #1388 PR-1 (MERGED)
- #1390 PR-2 (in flight)
- #1383 codex's check_redundancy PR-3 — same shape
- #1248 umbrella

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 18, 2026
… TS (PR-4 folded) (#1402)

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>