Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🦋 Changeset detectedLatest commit: 4b67998 The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
TL;DR — Switches the durable approval lookup from a queue keyed by Key changes
Summary | 31 files | 5 commits | base: Flat map keyed by
|
There was a problem hiding this comment.
Clean simplification. Keying by toolCallId instead of toolName eliminates the ordering ambiguity from the shift()-based queue, and removing originalToolCallId drops a layer of indirection that only existed to paper over that issue. One minor cleanup suggestion.
Claude Opus | 𝕏
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
This is a well-designed bug fix that addresses the root cause of ordering ambiguity in durable approval lookups. The change from Record<toolName, Array<decision>> with queue-based shift() to Record<toolCallId, decision> with direct lookup is the correct architectural decision:
-
Root cause addressed: The old queue-based approach could mismatch if the LLM reorders tool calls on replay. Keying by
toolCallId(a unique identifier per call) eliminates this ordering dependency entirely. -
Type safety improved: Removing
originalToolCallIdis appropriate since the key itself is now the identifier — no redundant validation needed. -
Consistent implementation: All 9 files are updated consistently — type definitions, serialization, deserialization, and lookups all use the new
[toolCallId]: { approved, reason }format. -
Simplified code: Net -47 lines, removing 30+ lines of mismatch handling that's no longer needed.
💭 Consider (3)
💭 1) tool-approval.ts Add unit tests for approval lookup logic
Issue: The waitForToolApproval function has no direct unit tests. While the implementation is correct, automated tests would prevent future regressions.
Why: The PR specifically fixes ordering ambiguity — a test demonstrating "multiple calls to same tool with different toolCallIds get correct approvals" would document and protect the fix.
Fix: Consider adding unit tests for the durable approval path covering: pre-approved returns true, pre-denied returns false with reason, missing key signals pending.
💭 2) relationTools.ts:441 Integration test for delegated approval roundtrip
Issue: The serialization format change must match deserialization in generateTaskHandler.ts and lookup in tool-approval.ts. No test verifies this contract.
Why: A format mismatch would cause delegated approvals to silently fail.
Fix: Consider an integration test verifying: parent serializes → A2A → child deserializes → lookup succeeds.
💭 3) agentExecutionSteps.ts:970-972 Test the specific bug scenario
Issue: The unchecked test plan item "Verify durable approval flow with multiple calls to the same tool" is the exact scenario this PR fixes.
Why: Without automated coverage, a future refactor could reintroduce the ordering bug.
Fix: Consider adding a test case demonstrating that approving call-2 doesn't accidentally apply to call-1 when both call the same tool.
✅ APPROVE
Summary: Clean, well-targeted fix that correctly addresses the root cause. The implementation is correct and consistent across all modified files. Test coverage suggestions are nice-to-have but not blocking — the author's manual verification plan covers the critical scenarios.
Note: Unable to submit formal approval due to permissions, but this PR looks good to merge.
Reviewers (4)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-standards |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-types |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-precision |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-tests |
3 | 0 | 3 | 0 | 0 | 0 | 0 |
| Total | 3 | 0 | 3 | 0 | 0 | 0 | 0 |
Preview URLsUse these stable preview aliases for testing this PR:
These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find. Raw Vercel deployment URLs
|
|
No action needed. The review thread on
|
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
Delta Review
This re-review is scoped to the single commit since the prior automated review:
Commit edf597d19: refactor: inline effectiveToolCallId alias to toolCallId
The change is a mechanical cleanup that removes the now-unnecessary effectiveToolCallId alias and inlines all 14 usages to toolCallId directly. This directly addresses Pullfrog's feedback about dead weight after the originalToolCallId indirection was removed.
Assessment: No issues found. The refactoring is correct, introduces no new logic, and improves code clarity.
Prior Review Status
The prior review approved the core fix with 3 "Consider" suggestions about test coverage. Those remain valid optional improvements but are not blocking:
- 💭 Consider: Unit tests for
waitForToolApprovalcovering multi-call scenarios - 💭 Consider: Integration test for delegated approval serialization roundtrip
- 💭 Consider: Test case for the specific "multiple calls to same tool" bug scenario
✅ APPROVE
Summary: The delta is a clean mechanical refactor that addresses prior review feedback. The core fix is architecturally sound — keying by toolCallId instead of toolName eliminates the ordering ambiguity that caused approval mismatches. Ship it! 🚀
Note: Unable to submit formal approval due to GitHub App permissions, but this PR looks good to merge.
Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| (delta review — no sub-reviewers dispatched) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Note: Delta was minimal (1 file, mechanical refactor addressing prior feedback). No sub-reviewers dispatched.
|
@claude can you resolve the merge conflict? |
|
Claude finished @amikofalvy's task in 15m 45s —— View job Resolving merge conflicts
Merge conflicts resolved and pushed. The conflicts were between this PR's toolCallId keying changes and |
Ito Test Report ✅12 test cases ran. 12 passed. Overall, the unified QA run passed with 12/12 included test cases successful and 0 failures, with no confirmed production defects. The most important findings were that durable approval handling works as expected across chat and related routes (including approval, denial, toolCallId-based and order-independent mapping, duplicate/idempotent and mixed stale/fresh batch behavior, and explicit 400 validation when conversationId is missing), while Copilot UI end-to-end replay was blocked by local auth/project setup mismatches but code review confirmed desktop/mobile Approve/Reject controls and handlers are implemented. ✅ Passed (12)Commit: Tell us how we did: Give Ito Feedback |
Ito Test Report ✅8 test cases ran. 1 additional finding, 7 passed. Across 8 executed test cases, 7 passed and 1 failed, indicating the approval system is largely stable with correct toolCallId-based decision mapping, expected API validation errors (400 when conversationId is missing and 404 for unknown conversations), resilient UI/security behavior on mobile and against malicious denial text, and proper idempotency/convergence under rapid or conflicting submissions. The key defect is a medium-severity continuity bug where pending approval state is lost after browser Back/Forward navigation because conversation identity is kept in ephemeral component state without history/session rehydration, causing users to lose in-progress approval context. ✅ Passed (7)ℹ️ Additional Findings (1)
🟠 Pending approval state is lost after browser back/forward navigation
Relevant code:
const [conversationId, setConversationId] = useState(generateId);
const posthog = usePostHog();
const { tenantId, projectId, agentId } = useParams<{
tenantId: string;
projectId: string;
agentId: string;
}>();
if (event.eventName === 'chat_clear_button_clicked') {
setDynamicHeaders({});
setConversationId(generateId());
setIsStreaming(false);
}
conversationId,
chatFunctionsRef,
aiAssistantAvatar: {
light: '/assets/inkeep-icons/icon-blue.svg',
dark: '/assets/inkeep-icons/icon-sky.svg',
},
baseUrl: PUBLIC_INKEEP_AGENTS_API_URL,Commit: Tell us how we did: Give Ito Feedback |
Eliminates ordering ambiguity when the same tool is called multiple times in one execution. Previously used Record<toolName, Array<decision>> with shift() which could mismatch if the LLM reorders calls on replay. Closes PRD-6452
Now that approvals are keyed by toolCallId directly, effectiveToolCallId is always equal to toolCallId. Remove the dead alias per review feedback.
Manually applies the changes from origin/main commit 01a960d (refactor: extract magic string literals into shared constants) on top of the PR's toolCallId-keying changes, equivalent to rebasing the PR onto origin/main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
Delta Review
This re-review is scoped to the 2 commits since the prior automated review (c50837ccd):
| Commit | Description |
|---|---|
3c1599880 |
chore: re-add changeset after rebase |
17a1412e4 |
chore: incorporate origin/main constants extraction into PR branch |
Changes assessed:
- Changeset added —
@inkeep/agents-api: patchwith appropriate message. ✅ - Constants integration —
agentExecutionSteps.tsnow importsTRANSFER_TOOL_PREFIXfrom@inkeep/agents-coreand uses it in place of the hardcoded'transfer_to_'string (3 occurrences). This aligns with main branch's constants extraction refactor. ✅
Assessment: No issues found. Both changes are mechanical and correct — the changeset is properly scoped, and the constants integration is a straightforward merge conflict resolution that preserves both the PR's toolCallId keying fix and main's string literal extraction.
Prior Review Status
The prior review approved the core fix with 3 optional "Consider" suggestions about test coverage. Those remain valid optional improvements but are not blocking:
- 💭 Consider: Unit tests for
waitForToolApprovalcovering multi-call scenarios - 💭 Consider: Integration test for delegated approval serialization roundtrip
- 💭 Consider: Test case for the specific "multiple calls to same tool" bug scenario
✅ APPROVE
Summary: The delta is clean merge conflict resolution incorporating main's constants extraction alongside this PR's toolCallId keying fix. The core architectural change (keying by toolCallId instead of toolName) eliminates the ordering ambiguity that caused approval mismatches. Ready to merge! 🚀
Note: Unable to submit formal GitHub approval due to App permissions, but this PR is approved.
Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| (delta review — no sub-reviewers dispatched) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Note: Delta was minimal (2 files, mechanical changes). No sub-reviewers dispatched for this incremental review.
These files were modified by PRs #3062 and #3064 which merged after the logger migration in #3067 but were not updated to use the scoped context pattern. Wraps tool-approval.ts and tool-wrapper.ts execute paths in runWithLogContext({ toolCallId, toolName }), strips repeated ambient fields from individual logger calls, and adds logger scoped context guidance to AGENTS.md and the api-logging-guidelines skill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ito Test Report ✅15 test cases ran. 1 additional finding, 14 passed. Overall, 14 of 15 test cases passed (with two review-only sections contributing no includable cases), showing strong coverage and mostly correct behavior across delegated approvals, stream toolCallId consistency, transfer-prefix routing, legacy payload rejection, auth/validation enforcement, and idempotent durable/classic approval handling including batch, duplicate, race, and reconnect scenarios. The single critical finding is a high-severity classic-path defect where pending approvals are resolved by toolCallId without conversationId binding, allowing cross-conversation replay of a valid tool approval and violating conversation isolation. ✅ Passed (14)ℹ️ Additional Findings (1)
|




































Summary
approvedToolCallsfromRecord<toolName, Array<decision>>toRecord<toolCallId, { approved: boolean; reason?: string }>— a flat map keyed bytoolCallIdshift()approach could mismatch if the LLM reorders calls on replayoriginalToolCallIdindirection since the key itself is now thetoolCallIdFiles changed
agent-types.ts— simplified type definitionAgent.ts— updatedsetApprovedToolCallssignaturetool-approval.ts— replaced queue-basedshift()lookup with directtoolCallIdkey lookuptool-wrapper.ts— simplified pre-approved entry lookup andeffectiveToolCallIdagentExecutionSteps.ts— changed from[toolName]: [{ ... }]to[toolCallId]: { ... }relationTools.ts— changed delegated approval metadata serializationgenerateTaskHandler.ts— updated type assertions for parsed approval dataexecutionHandler.ts— updatedExecutionHandlerParamstypeCloses PRD-6452
Test plan
pnpm typecheckpassespnpm formatpasses