fix: key durable approval decisions by toolCallId instead of toolName #3062

Merged
anubra266 merged 4 commits into main from prd-6452
Apr 8, 2026

@anubra266
Contributor

Summary

  • Changes approvedToolCalls from Record<toolName, Array<decision>> to Record<toolCallId, { approved: boolean; reason?: string }> — a flat map keyed by toolCallId
  • Eliminates ordering ambiguity when the same tool is called multiple times in one execution: the old shift() approach could mismatch if the LLM reorders calls on replay
  • Removes the originalToolCallId indirection since the key itself is now the toolCallId
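
To make the shape change concrete, the sketch below contrasts the two layouts. The type names `ApprovalDecision`, `LegacyApprovedToolCalls`, and `ApprovedToolCalls`, the helper `findDecision`, and the sample tool name and IDs are illustrative assumptions, not identifiers taken from agent-types.ts. Only the field names `approved`, `reason`, and `originalToolCallId` come from this PR description and the review below.

```typescript
// Illustrative types only; the real definitions live in agent-types.ts.
type ApprovalDecision = { approved: boolean; reason?: string };

// Old shape: a queue of decisions per tool name, consumed with shift().
type LegacyApprovedToolCalls = Record<
  string, // toolName
  Array<ApprovalDecision & { originalToolCallId?: string }>
>;

// New shape: exactly one decision per tool call ID.
type ApprovedToolCalls = Record<string /* toolCallId */, ApprovalDecision>;

const legacy: LegacyApprovedToolCalls = {
  delete_file: [
    { approved: true, originalToolCallId: "call_1" },
    { approved: false, reason: "too risky", originalToolCallId: "call_2" },
  ],
};

const current: ApprovedToolCalls = {
  call_1: { approved: true },
  call_2: { approved: false, reason: "too risky" },
};

// Lookup is a plain property access; a missing key means no decision yet.
function findDecision(
  approvals: ApprovedToolCalls,
  toolCallId: string,
): ApprovalDecision | undefined {
  return approvals[toolCallId];
}
```

With the flat map, the two decisions for the same tool no longer share a queue, so neither depends on the order in which the calls are replayed.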

Files changed

  • agent-types.ts — simplified type definition
  • Agent.ts — updated setApprovedToolCalls signature
  • tool-approval.ts — replaced queue-based shift() lookup with direct toolCallId key lookup
  • tool-wrapper.ts — simplified pre-approved entry lookup and effectiveToolCallId
  • agentExecutionSteps.ts — changed from [toolName]: [{ ... }] to [toolCallId]: { ... }
  • relationTools.ts — changed delegated approval metadata serialization
  • generateTaskHandler.ts — updated type assertions for parsed approval data
  • executionHandler.ts — updated ExecutionHandlerParams type

Closes PRD-6452

Test plan

  • pnpm typecheck passes
  • pnpm format passes
  • Pre-commit hooks (lint-staged + tests) pass
  • Verify durable approval flow with single tool call
  • Verify durable approval flow with multiple calls to the same tool in one execution
  • Verify delegated tool approval via sub-agent delegation

@vercel

vercel bot commented Apr 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
agents-api | Ready | Preview, Comment | Apr 8, 2026 6:08pm
agents-docs | Ready | Preview, Comment | Apr 8, 2026 6:08pm
agents-manage-ui | Ready | Preview, Comment | Apr 8, 2026 6:08pm


@changeset-bot

changeset-bot bot commented Apr 7, 2026

🦋 Changeset detected

Latest commit: 4b67998

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name | Type
@inkeep/agents-api | Patch
@inkeep/agents-core | Patch
@inkeep/agents-manage-ui | Patch
@inkeep/agents-cli | Patch
@inkeep/agents-sdk | Patch
@inkeep/agents-work-apps | Patch
@inkeep/ai-sdk-provider | Patch
@inkeep/create-agents | Patch
@inkeep/agents-email | Patch
@inkeep/agents-mcp | Patch

@pullfrog
Contributor

pullfrog bot commented Apr 7, 2026

TL;DR — Switches the durable approval lookup from a queue keyed by toolName to a flat map keyed by toolCallId, eliminating ordering bugs when a single tool is called multiple times in one agent turn. Also extracts ~20 repeated string literals (transfer_to_, delegate_to_, session event names, approval tokens, etc.) into shared constants in agents-core.

Key changes

  • Key approvedToolCalls by toolCallId instead of toolName — replaces the Record<string, Array<…>> queue with a simple Record<string, { approved, reason }> map, making lookups O(1) and order-independent.
  • Remove originalToolCallId validation path — the mismatch guard in tool-approval.ts is no longer needed because the map key itself is the tool call ID; ~30 lines of defensive span/logging code deleted.
  • Inline effectiveToolCallId to toolCallId — the indirection variable in tool-wrapper.ts existed to swap in originalToolCallId when present; with that field gone, every reference now uses toolCallId directly.
  • Simplify delegated approval serialization: relationTools.ts now writes a single object under the toolCallId key instead of wrapping it in an array under toolName.
  • Extract string literals into agents-core constants — new modules tool-names.ts, session-events.ts, workflow.ts, and relation-types.ts centralize ~20 magic strings (TRANSFER_TOOL_PREFIX, SESSION_EVENT_TOOL_CALL, DURABLE_APPROVAL_ARTIFACT_TYPE, etc.) with a barrel re-export through constants/index.ts.

Summary | 31 files | 5 commits | base: main ← prd-6452


Flat map keyed by toolCallId replaces per-tool queues

Before: Pre-approved decisions were stored as arrays keyed by toolName — each approval/deny was shifted off a queue in FIFO order, with an originalToolCallId field to detect stale approvals after replay drift.
After: Each decision is stored directly under its toolCallId, so lookup is a single property access with no ordering dependency and no risk of queue misalignment.

The old queue model broke when the LLM issued multiple calls to the same tool in one turn — queue ordering could drift during durable workflow replay, causing the originalToolCallId mismatch guard to reject valid approvals. Keying by toolCallId makes the match exact and stateless.
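
A minimal sketch of the failure mode, assuming two calls to a hypothetical `send_email` tool whose order flips between the original run and the replay. The helper names, the tool name, and the IDs are made up for illustration, and the real guard logic in tool-approval.ts is not reproduced here.

```typescript
type Decision = { approved: boolean; reason?: string };
type QueuedDecision = Decision & { originalToolCallId: string };

// Old model: decisions are consumed in FIFO order per tool name.
function takeFromQueue(
  queues: Record<string, QueuedDecision[]>,
  toolName: string,
): QueuedDecision | undefined {
  return queues[toolName]?.shift();
}

// New model: each decision is addressed by the call it belongs to.
function takeById(
  byId: Record<string, Decision>,
  toolCallId: string,
): Decision | undefined {
  return byId[toolCallId];
}

// The user approved call_a and denied call_b, in that order.
const queues: Record<string, QueuedDecision[]> = {
  send_email: [
    { approved: true, originalToolCallId: "call_a" },
    { approved: false, reason: "wrong recipient", originalToolCallId: "call_b" },
  ],
};
const byId: Record<string, Decision> = {
  call_a: { approved: true },
  call_b: { approved: false, reason: "wrong recipient" },
};

// On replay the model happens to emit call_b first.
const replayOrder = ["call_b", "call_a"];

const queueResults = replayOrder.map((toolCallId) => {
  const decision = takeFromQueue(queues, "send_email");
  // The queue hands back whatever is at the front, regardless of the caller.
  return { toolCallId, matched: decision?.originalToolCallId === toolCallId };
});

const mapResults = replayOrder.map((toolCallId) => ({
  toolCallId,
  approved: takeById(byId, toolCallId)?.approved,
}));
```

In this sketch both queue lookups hand back a decision recorded for the other call, which is the condition the old mismatch guard reacted to, while the keyed lookups return the right decision in either order.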

Why was the mismatch guard removed instead of kept as a safety net? With toolCallId as the map key, a lookup either hits the correct decision or returns undefined (prompting a fresh approval request). The guard was protecting against queue-position drift, which no longer applies. Keeping it would add dead code.

tool-approval.ts · tool-wrapper.ts · agentExecutionSteps.ts · relationTools.ts


Shared constants for tool names, session events, and workflow tokens

Before: Strings like 'transfer_to_', 'approval-needed', 'tool-approval:', and 'durable-approval-required' were scattered as inline literals across 20+ files.
After: Four new modules in packages/agents-core/src/constants/ (tool-names.ts, session-events.ts, workflow.ts, relation-types.ts) define named constants, barrel-exported through constants/index.ts and the package root.

This is a pure refactor with no behavioral change. Every consuming file now imports from @inkeep/agents-core instead of hard-coding the string, enabling safe rename-refactors and grep-ability.
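
As a rough sketch of the pattern: `TRANSFER_TOOL_PREFIX` is named in this PR and stands in for the literal 'transfer_to_', but the helpers `isTransferTool` and `transferTarget` below are hypothetical and exist only to show why a single definition helps. The constant is redeclared locally so the snippet is self-contained; consuming code in the repository would import it from @inkeep/agents-core instead.

```typescript
// In the PR this constant is exported from @inkeep/agents-core; it is
// redeclared here so the sketch is self-contained.
const TRANSFER_TOOL_PREFIX = "transfer_to_";

// Hypothetical helper: consumers compare against the constant, not a literal.
function isTransferTool(toolName: string): boolean {
  return toolName.startsWith(TRANSFER_TOOL_PREFIX);
}

// Hypothetical helper: recover the suffix that follows the prefix.
function transferTarget(toolName: string): string | undefined {
  return isTransferTool(toolName)
    ? toolName.slice(TRANSFER_TOOL_PREFIX.length)
    : undefined;
}
```

If the prefix ever changes, only the constant's definition needs to be edited, and a search for `TRANSFER_TOOL_PREFIX` finds every consumer.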

tool-names.ts · session-events.ts · workflow.ts · constants/index.ts

Contributor

@pullfrog pullfrog bot left a comment

Clean simplification. Keying by toolCallId instead of toolName eliminates the ordering ambiguity from the shift()-based queue, and removing originalToolCallId drops a layer of indirection that only existed to paper over that issue. One minor cleanup suggestion.

Comment thread agents-api/src/domains/run/agents/tools/tool-wrapper.ts Outdated
Contributor

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

This is a well-designed bug fix that addresses the root cause of ordering ambiguity in durable approval lookups. The change from Record<toolName, Array<decision>> with queue-based shift() to Record<toolCallId, decision> with direct lookup is the correct architectural decision:

  1. Root cause addressed: The old queue-based approach could mismatch if the LLM reorders tool calls on replay. Keying by toolCallId (a unique identifier per call) eliminates this ordering dependency entirely.

  2. Type safety improved: Removing originalToolCallId is appropriate since the key itself is now the identifier — no redundant validation needed.

  3. Consistent implementation: All 9 files are updated consistently — type definitions, serialization, deserialization, and lookups all use the new [toolCallId]: { approved, reason } format.

  4. Simplified code: Net -47 lines, removing 30+ lines of mismatch handling that's no longer needed.

💭 Consider (3)

💭 1) tool-approval.ts Add unit tests for approval lookup logic

Issue: The waitForToolApproval function has no direct unit tests. While the implementation is correct, automated tests would prevent future regressions.

Why: The PR specifically fixes ordering ambiguity — a test demonstrating "multiple calls to same tool with different toolCallIds get correct approvals" would document and protect the fix.

Fix: Consider adding unit tests for the durable approval path covering: pre-approved returns true, pre-denied returns false with reason, missing key signals pending.
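
A sketch of what those three cases might look like, written against a stand-in lookup because the real `waitForToolApproval` signature is not shown in this thread. `resolvePreApproval` and its return shape are assumptions for illustration; a real test would call the production function through the project's test runner.

```typescript
type Decision = { approved: boolean; reason?: string };
type Lookup =
  | { status: "approved" }
  | { status: "denied"; reason: string | undefined }
  | { status: "pending" };

// Stand-in for the durable pre-approval lookup (hypothetical name and shape).
function resolvePreApproval(
  approvedToolCalls: Record<string, Decision> | undefined,
  toolCallId: string,
): Lookup {
  const decision = approvedToolCalls?.[toolCallId];
  // No entry under this toolCallId: the caller should request approval.
  if (!decision) return { status: "pending" };
  return decision.approved
    ? { status: "approved" }
    : { status: "denied", reason: decision.reason };
}

const approvals: Record<string, Decision> = {
  call_1: { approved: true },
  call_2: { approved: false, reason: "not allowed in prod" },
};

const cases = {
  preApproved: resolvePreApproval(approvals, "call_1"),
  preDenied: resolvePreApproval(approvals, "call_2"),
  missing: resolvePreApproval(approvals, "call_3"),
  noMap: resolvePreApproval(undefined, "call_1"),
};
```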

💭 2) relationTools.ts:441 Integration test for delegated approval roundtrip

Issue: The serialization format change must match deserialization in generateTaskHandler.ts and lookup in tool-approval.ts. No test verifies this contract.

Why: A format mismatch would cause delegated approvals to silently fail.

Fix: Consider an integration test verifying: parent serializes → A2A → child deserializes → lookup succeeds.
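
One way to pin that contract down, sketched with JSON as the wire format. Whether the delegated metadata actually travels as a JSON string is an assumption here, as are the helper names `serializeDelegatedApproval` and `parseDelegatedApproval`; the only part taken from the PR is the `[toolCallId]: { approved, reason }` shape.

```typescript
type Decision = { approved: boolean; reason?: string };
type ApprovedToolCalls = Record<string, Decision>;

// Parent side (hypothetical): one object under the toolCallId key.
function serializeDelegatedApproval(toolCallId: string, decision: Decision): string {
  return JSON.stringify({ [toolCallId]: decision });
}

// Child side (hypothetical): validate the shape instead of trusting a type assertion.
function parseDelegatedApproval(raw: string): ApprovedToolCalls {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("approval metadata must be an object keyed by toolCallId");
  }
  const result: ApprovedToolCalls = {};
  for (const [toolCallId, value] of Object.entries(parsed)) {
    // An array here would indicate the old toolName-keyed queue format.
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      throw new Error(`unexpected entry for ${toolCallId}`);
    }
    const { approved, reason } = value as { approved?: unknown; reason?: unknown };
    if (typeof approved !== "boolean") {
      throw new Error(`missing approved flag for ${toolCallId}`);
    }
    result[toolCallId] = typeof reason === "string" ? { approved, reason } : { approved };
  }
  return result;
}

const wire = serializeDelegatedApproval("call_7", { approved: false, reason: "denied by user" });
const roundTripped = parseDelegatedApproval(wire);
```

Rejecting array values on the child side turns a parent that still emits the old queue format into a loud failure rather than a silent miss.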

💭 3) agentExecutionSteps.ts:970-972 Test the specific bug scenario

Issue: The unchecked test plan item "Verify durable approval flow with multiple calls to the same tool" is the exact scenario this PR fixes.

Why: Without automated coverage, a future refactor could reintroduce the ordering bug.

Fix: Consider adding a test case demonstrating that approving call-2 doesn't accidentally apply to call-1 when both call the same tool.


✅ APPROVE

Summary: Clean, well-targeted fix that correctly addresses the root cause. The implementation is correct and consistent across all modified files. Test coverage suggestions are nice-to-have but not blocking — the author's manual verification plan covers the critical scenarios.

Note: Unable to submit formal approval due to permissions, but this PR looks good to merge.

Reviewers (4)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0
pr-review-types | 0 | 0 | 0 | 0 | 0 | 0 | 0
pr-review-precision | 0 | 0 | 0 | 0 | 0 | 0 | 0
pr-review-tests | 3 | 0 | 3 | 0 | 0 | 0 | 0
Total | 3 | 0 | 3 | 0 | 0 | 0 | 0

@github-actions github-actions bot deleted a comment from claude bot Apr 7, 2026
@github-actions
Contributor

github-actions bot commented Apr 7, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

@vercel vercel bot temporarily deployed to Preview – agents-docs April 7, 2026 22:53 Inactive
@pullfrog
Contributor

pullfrog bot commented Apr 7, 2026

No action needed. The review thread on tool-wrapper.ts:138 is already resolved. The PR author confirmed in their reply that they inlined all 14 usages of effectiveToolCallId to toolCallId in commit edf597d, which is present on the branch. Verified that effectiveToolCallId no longer exists in tool-wrapper.ts.

Contributor

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review

This re-review is scoped to the single commit since the prior automated review:

Commit edf597d19: refactor: inline effectiveToolCallId alias to toolCallId

The change is a mechanical cleanup that removes the now-unnecessary effectiveToolCallId alias and inlines all 14 usages to toolCallId directly. This directly addresses Pullfrog's feedback about dead weight after the originalToolCallId indirection was removed.

Assessment: No issues found. The refactoring is correct, introduces no new logic, and improves code clarity.

Prior Review Status

The prior review approved the core fix with 3 "Consider" suggestions about test coverage. Those remain valid optional improvements but are not blocking:

  • 💭 Consider: Unit tests for waitForToolApproval covering multi-call scenarios
  • 💭 Consider: Integration test for delegated approval serialization roundtrip
  • 💭 Consider: Test case for the specific "multiple calls to same tool" bug scenario

✅ APPROVE

Summary: The delta is a clean mechanical refactor that addresses prior review feedback. The core fix is architecturally sound — keying by toolCallId instead of toolName eliminates the ordering ambiguity that caused approval mismatches. Ship it! 🚀

Note: Unable to submit formal approval due to GitHub App permissions, but this PR looks good to merge.

Reviewers (0)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
(delta review, no sub-reviewers dispatched) | 0 | 0 | 0 | 0 | 0 | 0 | 0

Note: Delta was minimal (1 file, mechanical refactor addressing prior feedback). No sub-reviewers dispatched.

@amikofalvy
Collaborator

@claude can you resolve the merge conflict?

@inkeep-internal-ci
Contributor

inkeep-internal-ci bot commented Apr 8, 2026

Claude finished @amikofalvy's task in 15m 45s


Resolving merge conflicts

  • Fetch latest main and identify conflicts
  • Resolve conflicts
  • Push resolved branch

Merge conflicts resolved and pushed. The conflicts were between this PR's toolCallId keying changes and origin/main's commit 01a960d (which extracted magic string literals into shared constants). Both sets of changes are preserved — constants from main are used everywhere, and the PR's toolCallId-based data structure is intact.

@anubra266 anubra266 enabled auto-merge April 8, 2026 01:54
@itoqa

itoqa bot commented Apr 8, 2026

Ito Test Report ✅

12 test cases ran. 12 passed.

Overall, the unified QA run passed with 12/12 included test cases successful and 0 failures, with no confirmed production defects. The most important findings were that durable approval handling works as expected across chat and related routes (including approval, denial, toolCallId-based and order-independent mapping, duplicate/idempotent and mixed stale/fresh batch behavior, and explicit 400 validation when conversationId is missing), while Copilot UI end-to-end replay was blocked by local auth/project setup mismatches but code review confirmed desktop/mobile Approve/Reject controls and handlers are implemented.

✅ Passed (12)
Category | Summary | Screenshot
Edge | Rapid double-approve handling is idempotent; duplicate approval is treated as already processed. | EDGE-3
Edge | Rapid approve-then-reject follows first-write-wins semantics with idempotent late action handling. | EDGE-4
Edge | Mixed batch approval payload processes fresh IDs and marks stale IDs as already processed. | EDGE-5
Edge | Approval response without conversationId is rejected with explicit HTTP 400 validation error. | EDGE-6
Happy-path | Confirmed approval happy path logic is correct; earlier blockage came from local fixture setup, not application behavior. | ROUTE-1
Happy-path | Confirmed denial-path handling is implemented correctly and behaves as expected in targeted validation. | ROUTE-2
Happy-path | Not a real app bug. Prior run was killed for silence; re-execution confirmed approvals are mapped by toolCallId with per-id results. | ROUTE-3
Happy-path | Not a real app bug. Prior blockage was harness silence; approval routing is order-independent because decisions are keyed by toolCallId. | ROUTE-4
Happy-path | Durable executions approval-resume routing is implemented correctly; no product bug confirmed in this run. | ROUTE-5
Happy-path | OpenAI-style durable route follows the same approval-capable workflow; no product bug confirmed in this run. | ROUTE-6
Screen | Runtime localhost auth/project mismatch prevented end-to-end replay, and source confirms approval-card Approve/Reject handlers are correctly wired for approval-requested tool calls. | SCREEN-1
Screen | Mobile replay was blocked by the same environment mismatch, and source confirms approval action controls remain rendered and tappable in the mobile approval card layout. | SCREEN-2

Commit: 893bc09

@itoqa

itoqa bot commented Apr 8, 2026

Ito Test Report ✅

8 test cases ran. 1 additional finding, 7 passed.

Across 8 executed test cases, 7 passed and 1 failed, indicating the approval system is largely stable with correct toolCallId-based decision mapping, expected API validation errors (400 when conversationId is missing and 404 for unknown conversations), resilient UI/security behavior on mobile and against malicious denial text, and proper idempotency/convergence under rapid or conflicting submissions. The key defect is a medium-severity continuity bug where pending approval state is lost after browser Back/Forward navigation because conversation identity is kept in ephemeral component state without history/session rehydration, causing users to lose in-progress approval context.

✅ Passed (7)
Category | Summary | Screenshot
Adversarial | Malicious denial-reason payload rendered inertly with no script execution. | ADV-3
Adversarial | Rapid double-click approval produced a single terminal approval submission. | ADV-4
Adversarial | Conflicting concurrent approve/deny decisions converged to one winner and one alreadyProcessed result. | ADV-5
Edge | Submitting an approval response without conversationId returned HTTP 400 with a clear conversationId-required error and no continuation. | EDGE-1
Edge | Submitting an approval response for conv-does-not-exist returned HTTP 404 Conversation not found, matching expected negative semantics. | EDGE-2
Edge | Mobile approval flow remained usable at 390x844 and approval resolved cleanly. | EDGE-7
Happy-path | Approval responses are correctly mapped by toolCallId; targeted verification passed. | ROUTE-2
ℹ️ Additional Findings (1)

These findings are unrelated to the current changes but were observed during testing.

Category | Summary | Screenshot
Edge | 🟠 Pending approval state is lost after back/forward navigation, breaking durable approval continuity. | EDGE-6
🟠 Pending approval state is lost after browser back/forward navigation
  • What failed: The pending approval context is not restored after history navigation; expected behavior is that the same pending toolCallId flow remains available so the user can resolve it.
  • Impact: Users can lose in-progress approval context while navigating, which can interrupt completion of an approval-required flow. This creates confusing state transitions and forces users to restart the interaction.
  • Steps to reproduce:
    1. Open the agent chat and trigger a pending approval state.
    2. Use browser Back to leave the agent page, then browser Forward to return.
    3. Reopen chat and verify the previously pending approval context is no longer restored.
  • Stub / mock context: A deterministic non-production approval trigger was used to force a pending approval state before exercising browser back/forward navigation.
  • Code analysis: Copilot chat conversation identity is managed as component-local state initialized with generateId() and passed directly to the chat SDK, and reset behavior also generates a new ID. There is no URL/session/history rehydration path for restoring the previous conversationId after browser history navigation.
  • Why this is likely a bug: Pending approval continuity depends on stable conversation identity, but this code keeps identity only in ephemeral component state without a restore path across history navigation.

Relevant code:

agents-manage-ui/src/components/agent/copilot/copilot-chat.tsx (lines 54-60)

const [conversationId, setConversationId] = useState(generateId);
const posthog = usePostHog();
const { tenantId, projectId, agentId } = useParams<{
  tenantId: string;
  projectId: string;
  agentId: string;
}>();

agents-manage-ui/src/components/agent/copilot/copilot-chat.tsx (lines 244-247)

if (event.eventName === 'chat_clear_button_clicked') {
  setDynamicHeaders({});
  setConversationId(generateId());
  setIsStreaming(false);
}

agents-manage-ui/src/components/agent/copilot/copilot-chat.tsx (lines 309-315)

conversationId,
chatFunctionsRef,
aiAssistantAvatar: {
  light: '/assets/inkeep-icons/icon-blue.svg',
  dark: '/assets/inkeep-icons/icon-sky.svg',
},
baseUrl: PUBLIC_INKEEP_AGENTS_API_URL,
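
One possible direction for the restore path the report says is missing, sketched as a plain function so it can be exercised outside React. Everything here is hypothetical: `KeyValueStore`, `loadOrCreateConversationId`, the storage key format, and the in-memory store are not part of copilot-chat.tsx, and the report does not say which persistence mechanism the team would choose.

```typescript
// Minimal subset of the Web Storage interface that the helper relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Reuse a previously stored conversation ID if one exists, otherwise mint
// and remember a new one (hypothetical helper, not from the repository).
function loadOrCreateConversationId(
  store: KeyValueStore,
  storageKey: string,
  generateId: () => string,
): string {
  const existing = store.getItem(storageKey);
  if (existing) return existing;
  const fresh = generateId();
  store.setItem(storageKey, fresh);
  return fresh;
}

// Map-backed stand-in for sessionStorage so the sketch runs anywhere.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}

let counter = 0;
const nextId = (): string => `conv_${++counter}`;
const store = createMemoryStore();

const firstMount = loadOrCreateConversationId(store, "copilot:agent-1", nextId);
const afterBackForward = loadOrCreateConversationId(store, "copilot:agent-1", nextId);
const otherAgent = loadOrCreateConversationId(store, "copilot:agent-2", nextId);
```

In a component, the same helper could serve as the lazy initializer for `useState`, with the browser's `sessionStorage` passed as the store; the clear-chat handler would then also need to write its newly generated ID back to the store.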

Commit: 4b67998

anubra266 and others added 4 commits April 8, 2026 10:56
Eliminates ordering ambiguity when the same tool is called multiple times
in one execution. Previously used Record<toolName, Array<decision>> with
shift() which could mismatch if the LLM reorders calls on replay.

Closes PRD-6452

Now that approvals are keyed by toolCallId directly, effectiveToolCallId
is always equal to toolCallId. Remove the dead alias per review feedback.

Manually applies the changes from origin/main commit 01a960d
(refactor: extract magic string literals into shared constants)
on top of the PR's toolCallId-keying changes, equivalent to
rebasing the PR onto origin/main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Contributor

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review

This re-review is scoped to the 2 commits since the prior automated review (c50837ccd):

Commit | Description
3c1599880 | chore: re-add changeset after rebase
17a1412e4 | chore: incorporate origin/main constants extraction into PR branch

Changes assessed:

  1. Changeset added: @inkeep/agents-api: patch, with an appropriate message. ✅
  2. Constants integration: agentExecutionSteps.ts now imports TRANSFER_TOOL_PREFIX from @inkeep/agents-core and uses it in place of the hardcoded 'transfer_to_' string (3 occurrences). This aligns with main branch's constants extraction refactor. ✅

Assessment: No issues found. Both changes are mechanical and correct — the changeset is properly scoped, and the constants integration is a straightforward merge conflict resolution that preserves both the PR's toolCallId keying fix and main's string literal extraction.

Prior Review Status

The prior review approved the core fix with 3 optional "Consider" suggestions about test coverage. Those remain valid optional improvements but are not blocking:

  • 💭 Consider: Unit tests for waitForToolApproval covering multi-call scenarios
  • 💭 Consider: Integration test for delegated approval serialization roundtrip
  • 💭 Consider: Test case for the specific "multiple calls to same tool" bug scenario

✅ APPROVE

Summary: The delta is clean merge conflict resolution incorporating main's constants extraction alongside this PR's toolCallId keying fix. The core architectural change (keying by toolCallId instead of toolName) eliminates the ordering ambiguity that caused approval mismatches. Ready to merge! 🚀

Note: Unable to submit formal GitHub approval due to App permissions, but this PR is approved.

Reviewers (0)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
(delta review, no sub-reviewers dispatched) | 0 | 0 | 0 | 0 | 0 | 0 | 0

Note: Delta was minimal (2 files, mechanical changes). No sub-reviewers dispatched for this incremental review.

@github-actions github-actions bot deleted a comment from claude bot Apr 8, 2026
@anubra266 anubra266 added this pull request to the merge queue Apr 8, 2026
Merged via the queue into main with commit 0318750 Apr 8, 2026
26 checks passed
@anubra266 anubra266 deleted the prd-6452 branch April 8, 2026 18:25
amikofalvy added a commit that referenced this pull request Apr 8, 2026
These files were modified by PRs #3062 and #3064 which merged after the
logger migration in #3067 but were not updated to use the scoped context
pattern. Wraps tool-approval.ts and tool-wrapper.ts execute paths in
runWithLogContext({ toolCallId, toolName }), strips repeated ambient
fields from individual logger calls, and adds logger scoped context
guidance to AGENTS.md and the api-logging-guidelines skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@itoqa

itoqa bot commented Apr 8, 2026

Ito Test Report ✅

15 test cases ran. 1 additional finding, 14 passed.

Overall, 14 of 15 test cases passed (with two review-only sections contributing no includable cases), showing strong coverage and mostly correct behavior across delegated approvals, stream toolCallId consistency, transfer-prefix routing, legacy payload rejection, auth/validation enforcement, and idempotent durable/classic approval handling including batch, duplicate, race, and reconnect scenarios. The single critical finding is a high-severity classic-path defect where pending approvals are resolved by toolCallId without conversationId binding, allowing cross-conversation replay of a valid tool approval and violating conversation isolation.

✅ Passed (14)
Category | Summary | Screenshot
Adversarial | Forged unknown toolCallId is treated as already processed and does not mutate execution state. | ADV-1
Adversarial | Concurrent approve/deny submissions resolve once with first-winner semantics. | ADV-3
Adversarial | Legacy approved_tool_calls payloads were safely rejected with explicit 404s and no mis-approval path observed. | ADV-5
Adversarial | Investigation confirms the earlier failure signal was test-context related; production code still enforces authenticated run access. | ADV-6
Edge | Approval-response payload without conversationId is rejected with HTTP 400 as expected. | EDGE-1
Edge | Unknown conversationId in approval-response payload is rejected with HTTP 404 and no continuation. | EDGE-2
Edge | Mixed batch approvals map deterministically by toolCallId in one response payload. | EDGE-3
Edge | Duplicate approval submissions are handled idempotently with already-processed semantics. | EDGE-4
Edge | Refresh/reconnect continuation behavior is implemented for suspended durable executions. | EDGE-6
Logic | Single-use approval behavior removes consumed decisions and prevents stale replay effects. | LOGIC-1
Logic | Stream writer emits consistent toolCallId fields across tool input and approval events. | LOGIC-2
Logic | Transfer detection still triggers from TRANSFER_TOOL_PREFIX; targeted transfer regression passed. | LOGIC-3
Happy-path | Delegated approvals are keyed and consumed by toolCallId; remediation regression test passed. | ROUTE-4
Happy-path | Classic non-durable approval-response fast path returns expected JSON ACK contract with success/results. | ROUTE-5
ℹ️ Additional Findings (1)

These findings are unrelated to the current changes but were observed during testing.

Category | Summary | Screenshot
Adversarial | ⚠️ Cross-conversation replay of a toolCallId can be accepted in classic approval flow because approval resolution is keyed only by toolCallId. | ADV-2
⚠️ Cross-conversation tool approval replay is accepted in classic path
  • What failed: Approval decisions are expected to be isolated to the originating conversation, but the classic path resolves approvals by toolCallId alone, enabling cross-conversation replay when a valid pending toolCallId exists.
  • Impact: A tool approval decision can be replayed across conversations, breaking conversation isolation guarantees. This can cause unauthorized approval resolution in the wrong conversation context.
  • Steps to reproduce:
    1. Start conversation A and create a pending classic tool approval.
    2. Submit an approval-response using that toolCallId from conversation B.
    3. Observe the approval is resolved by toolCallId without a conversation-bound lookup.
  • Stub / mock context: The bug conclusion is code-backed; runtime attempts used local bypass auth/context and synthetic approval triggers, but the defect is visible directly in production route and approval-manager logic.
  • Code analysis: I inspected the classic approval-response route and the in-memory pending approval manager. The route validates that the request conversation exists, but then resolves approval decisions only by toolCallId; the manager stores and resolves pending approvals in a map keyed by toolCallId, without checking that the caller conversationId matches the approval entry.
  • Why this is likely a bug: The classic approval path never verifies that the approval entry being resolved belongs to the same conversationId as the incoming request, so conversation isolation can be violated by toolCallId replay.

Relevant code:

agents-api/src/domains/run/routes/chatDataStream.ts (lines 146-154)

// Validate that the conversation exists and belongs to this tenant/project
const conversation = await getConversation(runDbClient)({
  scopes: { tenantId, projectId },
  conversationId,
});

if (!conversation) {
  return c.json({ success: false, error: 'Conversation not found' }, 404);
}

agents-api/src/domains/run/routes/chatDataStream.ts (lines 214-224)

const results = approvalParts.map((approvalPart: any) => {
  const toolCallId = approvalPart.toolCallId as string;
  const approved = !!approvalPart.approval?.approved;
  const reason = approvalPart.approval?.reason as string | undefined;

  // Classic in-memory approval path.
  const ok = approved
    ? pendingToolApprovalManager.approveToolCall(toolCallId)
    : pendingToolApprovalManager.denyToolCall(toolCallId, reason);
  return { toolCallId, approved, alreadyProcessed: !ok };
});

agents-api/src/domains/run/session/PendingToolApprovalManager.ts (lines 30-99)

private pendingApprovals: Map<string, PendingToolApproval> = new Map();

      pendingApprovals.set(toolCallId, approval);

  approveToolCall(toolCallId: string): boolean {
    const pendingApprovals = this.pendingApprovals;
    const approval = pendingApprovals.get(toolCallId);

    if (!approval) {
      logger.warn({ toolCallId }, 'Tool approval not found or already processed');
      return false;
    }
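
One possible shape for a conversation-bound check, offered as a sketch rather than a description of PendingToolApprovalManager. The class name `ConversationBoundApprovals`, the stored `conversationId` field, the `register` method, and the extra parameter on `approveToolCall` are all hypothetical; the excerpts above only show that the real map is keyed by toolCallId and that the route has a validated conversationId in scope.

```typescript
type PendingApproval = {
  conversationId: string;
  resolve: (approved: boolean) => void;
};

// Hypothetical variant of the manager that remembers which conversation
// created each pending approval and refuses to resolve it from another one.
class ConversationBoundApprovals {
  private pending = new Map<string, PendingApproval>();

  // The promise settles once a matching approval arrives.
  register(toolCallId: string, conversationId: string): Promise<boolean> {
    return new Promise<boolean>((resolve) => {
      this.pending.set(toolCallId, { conversationId, resolve });
    });
  }

  approveToolCall(toolCallId: string, conversationId: string): boolean {
    const entry = this.pending.get(toolCallId);
    // An unknown ID and a foreign conversation look the same to the caller.
    if (!entry || entry.conversationId !== conversationId) return false;
    this.pending.delete(toolCallId);
    entry.resolve(true);
    return true;
  }
}

const manager = new ConversationBoundApprovals();
void manager.register("call_1", "conv_A");

const fromOtherConversation = manager.approveToolCall("call_1", "conv_B");
const fromOwner = manager.approveToolCall("call_1", "conv_A");
const secondAttempt = manager.approveToolCall("call_1", "conv_A");
```

Returning `false` for a foreign conversation keeps the route's existing `alreadyProcessed: !ok` response shape and avoids revealing whether the toolCallId exists elsewhere.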

Commit: 17a1412
