Skip to content

fix(ui): harden permission event reconciliation#422

Closed
pascalandr wants to merge 4 commits into
NeuralNomadsAI:devfrom
Pagecran:fix/permission-flow-solidification
Closed

fix(ui): harden permission event reconciliation#422
pascalandr wants to merge 4 commits into
NeuralNomadsAI:devfrom
Pagecran:fix/permission-flow-solidification

Conversation

@pascalandr
Copy link
Copy Markdown
Contributor

@pascalandr pascalandr commented May 9, 2026

Fixes #290
Supersedes #367

Summary

  • keep permission replies idempotent by clearing both the global queue and message-v2 state after a successful reply
  • reconcile pending permissions when tool parts arrive, matching the existing question flow
  • prevent stale or duplicate permission events from leaving mismatched active/attached permission state
  • drain YOLO auto-accept from the permission flow itself instead of the rendered instance shell
  • preserve optimistic user attachments across SSE event reordering until official server parts arrive

Bugs and Fragile Flows Fixed

  • Reply cleanup depended on permission.replied SSE: sendPermissionResponse() removed only the global permission queue entry, so if the replied event was delayed or missed, inline tool-call UI could keep showing a stale permission.
  • Delayed permission events could resurrect handled prompts: a late permission.updated for an already accepted permission could re-add it locally until reload corrected state from permission.list().
  • Duplicate permission payloads were discarded or downgraded: repeated events for the same permission ID could drop richer session/worktree/tool-call metadata when a later payload was partial.
  • Pending session badges could drift: replacing a permission payload with a different session ID did not rebalance pending permission counters.
  • Permission/tool-call attachment was asymmetric with questions: live message.part.updated re-linked questions, but permissions depended on narrower incidental rebind paths or full message reload.
  • message-v2 could keep multiple attachment keys for one permission: a permission could remain under an old global/ID key and later also be inserted under the resolved tool part.
  • message-v2 active permission could stay stale: active state was only refreshed in limited cases instead of being derived from the ordered pending queue.
  • Optimistic user attachments could disappear during ID replacement: message.updated could replace a synthetic message ID before user part events arrived and clear the optimistic parts, including resolved pasted/file attachment content.
  • YOLO auto-accept depended on shell rendering: the auto-accept drain lived in instance-shell2, so heavy message rendering or permission UI churn could delay replies even though the request was already queued.
  • YOLO retry could amplify UI churn or outlive permissions: transient reply failures were retried through a reactive render effect, and retry state was not fully cleaned up when permissions were removed, queues were cleared, or YOLO was disabled.

Implementation

  • Add short-lived local tombstones for replied permission IDs to ignore stale permission updates after a successful local or SSE reply.
  • Merge duplicate permission entries conservatively in both the global queue and message-v2, preserving existing IDs and nested metadata/time/tool data when incoming events are partial.
  • Update duplicate permission entries in the global queue, including pending session counters and worktree slug mapping.
  • Remove permissions from both instances.ts and message-v2 after successful replies so the UI does not depend on receiving permission.replied.
  • Run reconcilePendingPermissionsV2() for every live tool part update.
  • Make message-v2 enforce one byMessage attachment per permission ID and recalculate active permission from the sorted queue.
  • Allow permission reconciliation to repair entries whose partId exists but whose byMessage attachment was lost.
  • Preserve optimistic user message parts when temporary message IDs are replaced, then remove matching synthetic text/file parts only when the official server part arrives.
  • Move YOLO draining into permission-auto-accept.ts and trigger it from permission sync, live permission enqueue, and the Status-tab toggle.
  • Add bounded retry backoff for failed YOLO replies while verifying the permission is still pending before retrying.
  • Clear YOLO retry/in-flight state when permissions are removed, queues are cleared, or per-session auto-accept is disabled.

YOLO Mode

  • No product behavior change: YOLO remains per-session and continues to reply with once only.
  • This only decouples auto-accept from rendered UI so queued permissions can be answered without waiting for permission/tool-call UI to render.

Validation

  • git diff --check
  • npm run typecheck --workspace @codenomad/ui
  • npm run build --workspace @codenomad/ui
  • final read-only gatekeeper review: clear

Notes

  • packages/ui does not currently define a test script/runner, so this PR is validated with typecheck/build rather than new automated unit tests.
  • Build still emits the existing virtua JSX import-source warning and large chunk warnings.

pascalandr added 2 commits May 9, 2026 22:09
Permission prompts could remain visible or reappear when SSE events arrived out of order: local replies only cleared the global queue, duplicate permission updates were discarded even when they carried better metadata, and delayed permission updates could resurrect already-handled requests.

Keep the global permission queue and message-v2 store in sync after successful replies, tombstone recently replied permission IDs to ignore stale updates, update duplicate permission payloads and pending counters, and reconcile permissions whenever tool parts arrive just like questions.

Message-v2 now keeps one attachment per permission ID and recalculates active permission ordering from the queue so inline tool-call, timeline, and permission-center state converge after out-of-order events.

Validated with git diff --check, npm run typecheck --workspace @codenomad/ui, and npm run build --workspace @codenomad/ui.
YOLO auto-accept was driven by an InstanceShell effect, so permission replies could be delayed by heavy message rendering or permission UI churn even though the mode itself is session-scoped and always replies once.

Move the drain trigger into the permission queue flow and run it from permission sync, live permission enqueue, and the status-tab toggle. Keep the product behavior unchanged: YOLO remains per session and sends only allow-once replies.

Add a bounded retry backoff for failed auto-accept replies so transient failures can recover without immediate reactive retry loops that can amplify UI freezes.

Validated with git diff --check, npm run typecheck --workspace @codenomad/ui, and npm run build --workspace @codenomad/ui.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

PR builds are available as GitHub Actions artifacts:

https://github.com/NeuralNomadsAI/CodeNomad/actions/runs/25610676103

Artifacts expire in 7 days.
Artifacts:

  • pr-422-4894835ea9fc8c9e60480e69e97484262bdd4a2b-electron-linux

Keep duplicate permission events from downgrading richer payloads so late or partial SSE updates do not detach pending approvals from their session, message, or tool metadata.

Preserve optimistic user message parts when temporary message IDs are replaced, then remove matching synthetic parts only after official server parts arrive. This avoids dropping resolved attachments during event reordering without leaving duplicate user input behind.

Clean up YOLO auto-accept retries when permissions are removed, queues are cleared, or per-session auto-accept is disabled, while keeping the behavior session-scoped and replying with once-only approvals.

Validated with git diff --check, UI typecheck, UI build, and a final read-only gatekeeper review.
@pascalandr
Copy link
Copy Markdown
Contributor Author

Final gatekeeper pass is clear after the latest fixes.

Validated locally:

  • git diff --check
  • npm run typecheck --workspace @codenomad/ui
  • npm run build --workspace @codenomad/ui
  • read-only gatekeeper review focused on duplicate permission merge, message-v2 optimistic attachment preservation, YOLO retry cleanup, StatusTab delegation, and YOLO once-only/session-scoped behavior

Latest commit pushed: 440e5668 fix(ui): preserve permission state during yolo drain.

@pascalandr
Copy link
Copy Markdown
Contributor Author

@CodeNomadBot

I want you to perform the gatekeeper checks for latest PR state

check for regressions
check for better possible implementations
Be the PR gatekeeper
Be ruthless about code quality

@codenomadbot
Copy link
Copy Markdown
Contributor

codenomadbot Bot commented May 9, 2026

Gatekeeper review for latest PR state complete.

What I understood: you wanted a fresh, ruthless read-only review of PR #422’s current head, focused on regressions and whether the permission/message-v2/YOLO implementation is sound. I treated this as a review only; no code changes were made.

Findings:

No blocking regressions found.

Non-blocking quality notes:

  • packages/ui/src/stores/instances.ts: replied permission tombstones are described as short-lived, but expiry is only opportunistic when the same permission ID is checked again. In long-lived UI sessions with many permission replies, repliedPermissionIdsByInstance can retain old IDs indefinitely. This is low risk, but a small pruning pass during markPermissionReplied, syncPendingPermissions, or instance cleanup would make the invariant real.
  • packages/ui/src/stores/instances.ts and packages/ui/src/stores/message-v2/instance-store.ts: permission merge logic is duplicated. The duplicated implementation is currently consistent, but this is a future drift risk because both stores must preserve the same metadata/session/tool-call semantics. A shared helper would be cleaner if this logic grows further.
  • packages/ui/src/stores/instances.ts: removePermissionFromQueue() leaves an unused updatedQueue local. Harmless, but it is dead code in a sensitive queue path and should be removed in a cleanup pass.

Regression review notes:

  • The local reply cleanup path is stronger than before: successful sendPermissionResponse() removes both global queue state and message-v2 state, so the UI no longer depends on permission.replied SSE arriving.
  • The tombstone check correctly protects against late permission.updated rehydrating a locally answered prompt.
  • Duplicate permission updates preserve richer prior payload data while allowing later metadata to hydrate queue entries.
  • Session pending counters are adjusted when duplicate permission hydration changes the resolved session ID.
  • message-v2 permission attachment dedupe enforces one by-message attachment per permission ID and recalculates active state from the sorted queue.
  • Tool-part updates now reconcile permissions and questions symmetrically, which closes the stale unattached permission path.
  • Optimistic user parts are preserved across synthetic message ID replacement and removed only when matching official parts arrive.
  • YOLO drain no longer depends on shell rendering and correctly verifies the permission is still pending before retrying.
  • YOLO in-flight/retry state is cleared on permission removal, queue clearing, and per-session disable.

Validation run:

  • git diff --check dev...HEAD: passed
  • npm ci: completed; reported existing dependency audit issues, not reviewed as part of this PR
  • npm run typecheck --workspace @codenomad/ui: passed
  • npm run build --workspace @codenomad/ui: passed

Build notes:

  • Existing virtua JSX import-source warning still appears.
  • Existing large chunk warnings still appear.

Worktree note:

  • I ignored an unrelated local modification in .opencode/package-lock.json; it is not part of the PR diff I reviewed.

Gatekeeper verdict: clear to proceed. The implementation is more complex than ideal, but the complexity is targeted at real event-ordering failures and I do not see a simpler safe implementation that preserves the same behavior without reintroducing stale permission/UI races.

--
Yours,
CodeNomadBot

Keep replied permission tombstones bounded in long-lived UI sessions by pruning expired entries whenever replies are marked or checked, and clear the map when an instance is removed.

Also removes a dead local variable from the permission removal path noted during gatekeeper review. The permission merge helper duplication is intentionally left untouched to avoid adding cross-store churn in this PR.

Validated with git diff --check, UI typecheck, and UI build.
@pascalandr
Copy link
Copy Markdown
Contributor Author

Addressed the non-blocking cleanup notes from the latest gatekeeper pass in 6f288b64:

  • replied permission tombstones are now pruned when replies are marked or checked
  • replied tombstones are cleared when an instance is removed
  • removed the dead updatedQueue local from removePermissionFromQueue()

I intentionally left the duplicate permission merge helper in place for this PR because extracting a shared helper would touch both queue/message-v2 boundaries again and add churn after the behavior is already validated. If this logic grows further, a shared helper is the right follow-up.

Revalidated:

  • git diff --check
  • npm run typecheck --workspace @codenomad/ui
  • npm run build --workspace @codenomad/ui

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

PR builds are available as GitHub Actions artifacts:

https://github.com/NeuralNomadsAI/CodeNomad/actions/runs/25611188333

Artifacts expire in 7 days.
Artifacts:

  • pr-422-440e56688a6e6495e5a6dec31b9170d5e9b49340-electron-linux

1 similar comment
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

PR builds are available as GitHub Actions artifacts:

https://github.com/NeuralNomadsAI/CodeNomad/actions/runs/25611188333

Artifacts expire in 7 days.
Artifacts:

  • pr-422-440e56688a6e6495e5a6dec31b9170d5e9b49340-electron-linux

@pascalandr
Copy link
Copy Markdown
Contributor Author

@CodeNomadBot
same review

@codenomadbot
Copy link
Copy Markdown
Contributor

codenomadbot Bot commented May 9, 2026

Fresh gatekeeper review after 6f288b6 complete.

What I understood: rerun the same read-only gatekeeper review after the cleanup commit that addressed the prior non-blocking notes. I reviewed the latest branch state only; no code changes were made.

Findings:

No blocking regressions found.

Prior cleanup notes status:

  • Replied permission tombstones are now pruned on mark/check and cleared on instance removal. This resolves the prior long-lived map concern.
  • The dead updatedQueue local in removePermissionFromQueue() is gone.
  • Duplicate permission merge logic remains duplicated by intent. I still agree this is acceptable for this PR because extracting it now would cross queue/message-v2 boundaries and add churn after the behavior is validated.

New non-blocking note:

  • packages/ui/src/stores/instances.ts: DisconnectedInstanceInfo now has an unused details?: string field introduced in 6f288b6. It is harmless and typecheck-clean, but it appears unrelated to the permission tombstone cleanup and could be dropped to keep the cleanup commit tight.

Regression review notes:

  • The tombstone pruning change preserves the stale-event protection: hasRecentlyRepliedPermission() still ignores late permission.updated events within the tombstone window.
  • Pruning before markPermissionReplied() and hasRecentlyRepliedPermission() is sufficient to prevent unbounded growth during normal permission activity.
  • Clearing replied tombstones from removeInstance() covers instance lifecycle cleanup.
  • Local reply cleanup still removes both global queue state and message-v2 state after successful permission.reply.
  • YOLO auto-accept state cleanup still runs on permission removal, queue clearing, and per-session disable.
  • Message-v2 permission dedupe and active-state recalculation still look consistent with the queue ordering model.
  • Optimistic user part preservation across synthetic ID replacement still looks correct for matching official text/file parts.

Validation run:

  • git status --short --branch: clean branch at 6f288b6
  • git diff --check dev...HEAD: passed
  • npm run typecheck --workspace @codenomad/ui: passed
  • npm run build --workspace @codenomad/ui: passed

Build notes:

  • Existing virtua JSX import-source warning still appears.
  • Existing large chunk warnings still appear.

Gatekeeper verdict: clear to proceed. The cleanup commit resolves the actionable notes from the previous review, and I do not see a blocker in the latest PR state.

--
Yours,
CodeNomadBot

@shantur
Copy link
Copy Markdown
Collaborator

shantur commented May 9, 2026

@pascalandr - Were you able to reproduce this issue?

@pascalandr
Copy link
Copy Markdown
Contributor Author

pascalandr commented May 9, 2026

we already had this discussion #367 (comment)

I already had all kind of buggy behavior with permission this is why I have dig into it and found several issues.

@shantur
Copy link
Copy Markdown
Collaborator

shantur commented May 9, 2026

TBH, this is touching too many core parts and I am not sure why synthetic parts are touched while fixing permissions?
We really need this to be targeted to be able to review it properly.
Why are we keeping a map of replied permissions in UI when the source of truth is the opencode server?
Would you be able to document the sure shot way to reproducing this? Currently the PR seems a combination of optimistic fixes done by LLM

@pascalandr
Copy link
Copy Markdown
Contributor Author

pascalandr commented May 9, 2026

Well, I described the permission problems I had observed, then explained what I thought could be causing them and looked for existing bugs or weak spots matching those behaviors.
For the YOLO mode issue specifically, I’m pretty sure the fix matches the observed problem exactly.
That said, I agree this should be reworked to be more targeted, conservative, and explained.

@pascalandr
Copy link
Copy Markdown
Contributor Author

pascalandr commented May 9, 2026

I split this broad PR into smaller targeted PRs for better review:

For reminder, these are issues I have personally hit multiple times. I cannot reproduce them deterministically on demand because they happen when the UI is under heavy pressure, but they happened often enough to reveal a consistent pattern.

@pascalandr pascalandr closed this May 10, 2026
shantur pushed a commit that referenced this pull request May 10, 2026
## Summary
- Split out from #422 as the YOLO-only permission fix.
- Moves YOLO auto-accept draining out of `InstanceShell` render effects
and into the permission queue flow.
- Keeps behavior unchanged: YOLO remains per-session and replies with
`once` only.
- Keeps in-flight cleanup for auto-accept attempts so duplicate sends
are guarded outside UI render timing.

## Why
In YOLO mode, the app could still appear to wait on a permission or
block the UI, even though auto-accept was enabled and the permission was
already queued.

The auto-accept drain was tied to an `InstanceShell` render effect, so
the reply path depended on a specific UI shell rendering. Draining from
permission sync/enqueue and from the YOLO toggle makes auto-accept run
from the permission queue itself instead of UI render timing.

## Validation
- `git diff --check`
- `npm run typecheck --workspace @codenomad/ui`
shantur pushed a commit that referenced this pull request May 10, 2026
## Summary
- Split out from #422 as the stale-permission-events fix.
- Clears permission UI state immediately after a successful local
permission reply instead of waiting for `permission.replied` SSE.
- Tracks replied permission IDs until a newer pending-permission sync
observes that the server no longer reports them pending.
- Marks SSE `permission.replied` events into the same replied-ID path so
delayed pending events/sync results cannot resurrect prompts that were
just answered.

## Why
A permission prompt could remain on screen after clicking Allow/Deny, or
clicking Allow could look like it did not take effect and require
another click. Reloading could fix the UI, which pointed to stale local
permission state rather than the server still waiting.

The UI receives permission state from local replies, SSE events, and
pending-permission sync. If an older pending event or sync result is
processed after a confirmed reply, the UI can re-add a permission that
was already answered. Replied IDs stay suppressed until a sync started
after the local reply proves the server has dropped that permission from
the pending list.

## Validation
- `git diff --check`
- `node --test packages/ui/src/stores/permission-replies.test.ts`
- `npm run typecheck --workspace @codenomad/ui`
pascalandr added a commit that referenced this pull request May 10, 2026
## Summary
- Split out from #422 as the permission/tool-call reconciliation fix.
- Reconciles pending permissions whenever live tool parts update,
matching the existing question re-link path.
- Keeps one message-v2 attachment per server permission ID and
recalculates the active permission from queue order.
- Conservatively merges duplicate or out-of-order permission updates so
known session/message/tool routing metadata is not lost.
- Fixes #290

## Why

The observed failure shape is that permission prompts can appear
missing, frozen, or attached in unexpected places when permission events
and tool-call parts are observed in different orders. In those cases,
the server-side permission may exist, but the UI can temporarily attach
it globally, attach it to the wrong tool location, or fail to move it
when the matching tool part arrives later.

This PR focuses on the UI-side attachment/order problem: one UI
attachment per server permission ID, re-linking permissions when tool
parts arrive, and preserving known routing metadata across
duplicate/out-of-order updates. It does not attempt semantic
deduplication across different permission IDs that happen to ask for the
same logical approval.

## Validation
- `git diff --check`
- `npm exec --no -- tsx --test packages/ui/src/types/permission.test.ts
packages/ui/src/stores/message-v2/instance-store.test.ts`
- `node --test packages/ui/src/stores/permission-replies.test.ts`
- `npm run typecheck --workspace @codenomad/ui`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: sometime missing display ask

2 participants