feat(cognition,#1385): generate_response PR-3 — TS shim + delete dead TS (PR-4 folded)#1402
Merged
Merged
Conversation
… TS (PR-4 folded) Stacks on PR-2 #1390 (async evaluate_response + cognition/generate-response IPC handler). AIDecisionService.generateResponse now delegates to RustCoreIPCClient.cognitionGenerateResponse; ~110 LOC of TS prompt assembly + timeout race + token decoding deleted. Mirrors codex's check_redundancy PR-3 #1383 shape (folded PR-4 dead-code delete in). ## What this ships - `AIDecisionService.generateResponse` now a thin shim: - InferenceCoordinator.requestSlot (TS owns slot coordination — platform concern) - client.cognitionGenerateResponse(request) — single IPC call - InferenceCoordinator.releaseSlot - logError + rethrow on failure (no fail-open silent default) - New TS binding method `cognitionGenerateResponse(GenerateResponseRequest) -> Promise<GenerateResponseResult>` in the cognition mixin - `GenerateResponseRequest` + `GenerateResponseResult` re-exported from the generated barrel (already present from PR-1) ## Dead TS deleted (PR-4 folded in) - `private static buildResponseMessages(context)` helper (~115 LOC): system-prompt injection, conversation history with [HH:MM] prefix, hour-gap markers, ~50-line identity-reminder template — all moved to Rust in PR-1. - `import { AIProviderDaemon }` — no longer referenced after both checkRedundancy (#1383) + generateResponse migrations. - `import type { TextGenerationRequest, TextGenerationResponse }` — ditto, only used by deleted helper. - Inline timeout Promise.race code — replaced by Rust-side tokio::time::timeout in PR-2. After this PR, `AIDecisionService.ts` contains only: - evaluateGating (already shim to cognition/should-respond) - checkRedundancy (already shim to cognition/check-redundancy) - generateResponse (now shim to cognition/generate-response) - InferenceCoordinator slot management (TS-owned platform concern) - logging helpers (TS-owned platform concern) ## Discipline - No fail-open path — errors throw, caller decides (consistent with codex's check_redundancy shim pattern). - Cast `context as unknown as RustAIDecisionContext` matches the pattern in cognitionShouldRespond + cognitionCheckRedundancy — TS RAGContext.identity wraps the system prompt; TS already resolves to context.systemPrompt before sending. - Slot coordination explicitly stays TS — that's the seam codex drew with check_redundancy, preserved here. - Token shape preserved: `result.tokensUsed` is `TokenUsage | None`; TS just passes through (Rust already mapped from provider's UsageMetrics, returning None for zero-token providers). ## Stack progress - #1385 PR-1 (pure types + prompt builder + identity-reminder template): #1388 MERGED - #1385 PR-2 (async evaluate_response + IPC handler): #1390 OPEN - #1385 PR-3 (TS shim + dead-TS delete): **this PR** - #1385 PR-4 (dead-TS delete): **folded into this PR** ## Refs - #1385 sub-card - #1388 PR-1 (MERGED) - #1390 PR-2 (in flight) - #1383 codex's check_redundancy PR-3 — same shape - #1248 umbrella Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b4b4f81 to
b4f257d
Compare
This was referenced May 18, 2026
joelteply
pushed a commit
that referenced
this pull request
May 18, 2026
…(Rust admits now) Follow-up to #1402. Joel's a89c8ab (admit generate-response through Rust resource gate) added ResourceAdmissionGate inside cognition/generate_response.rs::evaluate_response. TS-side InferenceCoordinator.requestSlot/releaseSlot calls in AIDecisionService.generateResponse are now redundant — they double-coordinate the same path. Per directive: hosts should not coordinate slots outside Rust. This PR removes them. ## What this changes - AIDecisionService.generateResponse: - Drop InferenceCoordinator.requestSlot/releaseSlot calls (success + error paths) - Drop messageId / isMentioned options (slot-coord-specific — unused without slot coord) - Drop messageId derivation + slot-denied fallback throw - Drop LOCAL_MODELS.DEFAULT fallback (Rust evaluate_response carries its own DEFAULT_GENERATE_MODEL constant; passing `undefined` lets Rust apply its default — single source of truth) - Drop LOCAL_MODELS import (no longer referenced in file) - InferenceCoordinator import kept (still used by evaluateGating + checkRedundancy — those still slot-coord because Rust admission hasn't been extended to those paths yet) After this PR: generateResponse is a 25-LOC try/catch around a single IPC call — the thinnest possible shim. Slot leak risk codex flagged on #1402 becomes structurally impossible (no slots = no leaks). ## Verification - npm run build:ts — clean - ESLint baseline held at 5435 (no new errors) - Greppable call sites of AIDecisionService.generateResponse: zero TS callers pass isMentioned or messageId (only a doc reference exists in widgets/WIDGET-ABSTRACTION-BREAKTHROUGH.md to a different daemon) ## Refs - #1402 — PR-3 of the generate_response oxidizer stack - a89c8ab — Joel's commit adding Rust ResourceAdmissionGate - #1385 — completed oxidizer sub-card (now closed) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 tasks
joelteply
added a commit
that referenced
this pull request
May 18, 2026
…(Rust admits now) (#1407) Follow-up to #1402. Joel's a89c8ab (admit generate-response through Rust resource gate) added ResourceAdmissionGate inside cognition/generate_response.rs::evaluate_response. TS-side InferenceCoordinator.requestSlot/releaseSlot calls in AIDecisionService.generateResponse are now redundant — they double-coordinate the same path. Per directive: hosts should not coordinate slots outside Rust. This PR removes them. ## What this changes - AIDecisionService.generateResponse: - Drop InferenceCoordinator.requestSlot/releaseSlot calls (success + error paths) - Drop messageId / isMentioned options (slot-coord-specific — unused without slot coord) - Drop messageId derivation + slot-denied fallback throw - Drop LOCAL_MODELS.DEFAULT fallback (Rust evaluate_response carries its own DEFAULT_GENERATE_MODEL constant; passing `undefined` lets Rust apply its default — single source of truth) - Drop LOCAL_MODELS import (no longer referenced in file) - InferenceCoordinator import kept (still used by evaluateGating + checkRedundancy — those still slot-coord because Rust admission hasn't been extended to those paths yet) After this PR: generateResponse is a 25-LOC try/catch around a single IPC call — the thinnest possible shim. Slot leak risk codex flagged on #1402 becomes structurally impossible (no slots = no leaks). ## Verification - npm run build:ts — clean - ESLint baseline held at 5435 (no new errors) - Greppable call sites of AIDecisionService.generateResponse: zero TS callers pass isMentioned or messageId (only a doc reference exists in widgets/WIDGET-ABSTRACTION-BREAKTHROUGH.md to a different daemon) ## Refs - #1402 — PR-3 of the generate_response oxidizer stack - a89c8ab — Joel's commit adding Rust ResourceAdmissionGate - #1385 — completed oxidizer sub-card (now closed) Co-authored-by: Test <test@test.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 18, 2026
Merged
joelteply
added a commit
that referenced
this pull request
May 18, 2026
…/should-respond, drop parallel reimpl (#1421) AIShouldRespondServerCommand carried a separate gating implementation (custom prompt + JSON-repair-via-second-LLM retry) parallel to the canonical cognition/should-respond Rust path that AIDecisionService. evaluateGating already used. Two paths could drift independently; the TS prompt was already stale relative to the Rust template. This PR collapses both into one: the TS command now constructs the AIDecisionContext from the public AIShouldRespondParams shape and delegates to client.cognitionShouldRespond. Rust owns the prompt, model call, parser, and typed AIGatingDecision contract. ## Diff - AIShouldRespondServerCommand.ts: -85 / +59 LOC (net -26) - ESLint baseline ratchet: 5435 -> 5433 (-2) ## Dead TS deleted - `import { AIProviderDaemon }` (no longer referenced in this file) - `import { TextGenerationRequest }` (parent-class type only; not used by the delegation) - `import { LOCAL_MODELS }` (Rust evaluate_gating carries its own DEFAULT_GATING_MODEL constant; missing model defaults Rust-side) - Inline gating instruction build + message-array construction (Rust cognition/should_respond.rs::build_gating_prompt owns it) - JSON-repair-via-second-LLM retry path (Rust returns typed errors; caller decides retry policy at the IPC seam) - Stale `>>> trigger <<<` marking logic (Rust handles trigger marking inside build_gating_prompt — verified parity) ## What stays - The thin TS shim: param -> RustAIDecisionContext mapping + AIShouldRespondResult construction. - Verbose debug mode: still emits ragContext.messageCount + conversationPreview (TS-derivable, no Rust round-trip needed). `promptSent` / `aiResponse` debug fields now sentinel-pointer to the Rust logs (`cognition::should_respond`) where they actually live. - Catch-around-throw error path (matches sibling shim discipline). ## Discipline - Cast `params -> RustAIDecisionContext` mirrors the existing pattern in AIDecisionService.evaluateGating (`as unknown as` for the structurally-matching surface). - Synthetic `triggerMessage.id` derived from the timestamp so repeat calls don't multiply observability noise (params don't carry one; Rust requires it). - No fail-open default — failures throw, caller catches via the existing error-return path. ## Refs - #1420 sub-card (just filed) - Existing Rust: cognition/should_respond.rs::evaluate_gating (already shipped, in production via AIDecisionService.evaluateGating) - Sibling pattern: codex's #1383 check_redundancy delegation, my #1402 generate_response delegation - #1248 umbrella Co-authored-by: Test <test@test.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Stacks on PR-2 #1390 (MERGED).
AIDecisionService.generateResponsenow delegates toRustCoreIPCClient.cognitionGenerateResponse; ~110 LOC of TS prompt assembly + Promise.race timeout + token decoding deleted. Mirrors codex's check_redundancy PR-3 #1383 shape (PR-4 dead-code delete folded in).What this ships
AIDecisionService.generateResponsenow a thin shim:InferenceCoordinator.requestSlot(TS owns slot coordination — platform concern)client.cognitionGenerateResponse(request)— single IPC callInferenceCoordinator.releaseSlotcognitionGenerateResponse(GenerateResponseRequest) -> Promise<GenerateResponseResult>in the cognition mixinGenerateResponseRequest+GenerateResponseResultre-exported from the generated barrel (already present from PR-1)Dead TS deleted (PR-4 folded in)
private static buildResponseMessages(context)helper (~115 LOC): system-prompt injection, conversation history with[HH:MM]prefix, hour-gap markers, ~50-line identity-reminder template — all moved to Rust in PR-1.import { AIProviderDaemon }— no longer referenced after bothcheckRedundancy(feat(cognition): delegate redundancy shim to rust #1383) +generateResponsemigrations.import type { TextGenerationRequest, TextGenerationResponse }— ditto, only used by deleted helper.tokio::time::timeoutin PR-2.After this PR,
AIDecisionService.tscontains only:evaluateGating(already shim tocognition/should-respond)checkRedundancy(already shim tocognition/check-redundancy)generateResponse(now shim tocognition/generate-response)InferenceCoordinatorslot management (TS-owned platform concern)Discipline
context as unknown as RustAIDecisionContextmatches the pattern incognitionShouldRespond+cognitionCheckRedundancy.result.tokensUsedisTokenUsage | None; TS just passes through (Rust already mapped from provider'sUsageMetrics, returningNonefor zero-token providers).Stack progress
Diff: +40 / -141 (2 files).
Refs
Test plan
npm run build:ts— TS compilation clean🤖 Generated with Claude Code