🤖 fix: show delegated workflow activity in sidebar #3459

Merged
ThomasK33 merged 6 commits into main from sidebar-0h0p on Jun 5, 2026

@ThomasK33 ThomasK33 commented Jun 4, 2026

Summary

Adds delegated-work sidebar state for slash-launched workflows so launcher workspaces and collapsed task groups stay visibly active while descendant sub-agents are running, without reusing misleading assistant streaming copy.

Background

Slash workflows run in the background and spawn child task workspaces, so the launcher workspace can otherwise look idle while the workflow is actively fanning out. This change keeps the derived state frontend-local and rolls descendant task activity up through workspace metadata.

Implementation

  • Rolls descendant running/queued task state up by workspace id, including inherited workflow ownership and cycle/duplicate defenses.
  • Gives delegated work lower precedence than own workspace question/error/stream/provisioning states.
  • Shares the live-sidebar fallback with collapsed task-group counts so metadata lag does not make grouped rows look idle.
  • Exposes delegated and task-group status text through aria-describedby while keeping action-oriented row labels stable.
  • Keeps resumed descendants active even if they still carry stale reportedAt metadata from an earlier completed run.
  • Keeps non-finalized interrupted descendants delegated-active while their live stream is still visible, but suppresses finalized completed reports.
  • Adds real-row ProjectSidebar coverage. Dogfood screenshots/video were generated locally but intentionally left untracked because dogfood-output/ is ignored.

Validation

  • bun test src/browser/utils/ui/workspaceFiltering.test.ts src/browser/components/AgentListItem/AgentListItem.test.tsx src/browser/components/ProjectSidebar/TaskGroupListItem.test.tsx src/browser/components/ProjectSidebar/ProjectSidebar.test.tsx
  • TEST_INTEGRATION=1 bun x jest tests/ui/workspaces/subagents.test.ts --runInBand --silent=false
  • make typecheck
  • make fmt-check
  • make lint
  • make static-check
  • MUX_ESLINT_CONCURRENCY=1 make static-check after the default lint process was killed by local OOM.

Dogfooding

Dogfooding was performed with local artifacts under dogfood-output/workflow-sidebar/ (screenshots, video, console, errors). Those artifacts are not tracked in this PR per the repository ignore policy.

Risks

Moderate UI-state risk in the sidebar: the roll-up is derived from existing workspace metadata and live sidebar state, but it changes status precedence and grouped-row styling. Targeted unit/integration coverage exercises the precedence, terminal-state, metadata-lag, resumed-task, interrupted-live, and accessibility branches.


📋 Implementation Plan

Plan: Show slash-workflow delegated work in the left sidebar

Context and evidence

When a workflow is launched manually with /workflow ..., the launcher workspace can look grey/inactive even while workflow-spawned sub-agents are running. The investigation found this is expected with current data flow:

  • src/browser/components/AgentListItem/AgentListItem.tsx derives active row state from the workspace's own sidebar stream/provisioning state (canInterrupt, isStarting, isInitializing), not descendant task/workflow state.
  • src/browser/components/WorkspaceStatusIndicator/WorkspaceStatusIndicator.tsx only renders own workspace question/status/streaming/provisioning text.
  • Slash workflow launches in src/browser/utils/chatCommands.ts call api.workflows.start({ runInBackground: true, rawCommand, continuationOptions, ... }); the parent workspace is not streaming during the background workflow run.
  • Workflow steps spawn child task workspaces through src/node/services/workflows/WorkflowTaskServiceAdapter.ts → src/node/services/taskService.ts; those children carry parentWorkspaceId, optional workflowTask, and taskStatus metadata.
  • src/browser/components/ProjectSidebar/TaskGroupListItem.tsx already receives running/queued/completed counts for grouped children, but active groups still read visually muted.

Recommendation

Implement a distinct delegated work sidebar state: the parent workspace should look active while descendant sub-agents/workflow tasks are active, but it should not pretend the parent assistant is streaming.

Recommended user-facing behavior:

  • Parent row uses active affordance while descendant workflow/sub-agent work is running.
  • Secondary text is explicit, for example Workflow running · 2 sub-agents active or 2 sub-agents active.
  • Existing own-workspace states keep precedence: archiving/removing > system error > question > own stream/provisioning > delegated work > unread/seen.
  • Existing WorkspaceStatusIndicator streaming copy is not reused for delegated work, so the UI avoids false model - streaming... language.
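
The precedence order above can be sketched as a single resolver. This is a minimal illustration, not the code in AgentListItem.tsx: the RowState and RowSignals types and the resolveRowState function are hypothetical names, and only the ordering and the signal names canInterrupt, isStarting, and awaitingUserQuestion come from this plan.

```typescript
// Hypothetical types for illustration; only the precedence order is from the plan.
type RowState =
  | "archiving-or-removing"
  | "system-error"
  | "question"
  | "own-activity"
  | "delegated-work"
  | "unread-or-seen";

interface RowSignals {
  isArchivingOrRemoving: boolean;
  hasSystemError: boolean;
  awaitingUserQuestion: boolean;
  canInterrupt: boolean; // the workspace's own stream is live
  isStarting: boolean; // the workspace's own provisioning is in progress
  hasActiveDelegatedWork: boolean; // derived from descendant roll-up
}

// Checks run from highest to lowest precedence; the first match wins.
function resolveRowState(s: RowSignals): RowState {
  if (s.isArchivingOrRemoving) return "archiving-or-removing";
  if (s.hasSystemError) return "system-error";
  if (s.awaitingUserQuestion) return "question";
  if (s.canInterrupt || s.isStarting) return "own-activity";
  if (s.hasActiveDelegatedWork) return "delegated-work";
  return "unread-or-seen";
}
```

Keeping the checks in one early-return chain makes it hard for delegated work to accidentally outrank an own-workspace question or error.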

Net product LoC estimate for the recommended approach: +120 to +190 LoC.

Recommended approach: frontend descendant roll-up

Phase 1 — Model delegated activity in the sidebar

  1. Add a small UI-only type near the sidebar code, e.g. WorkspaceDelegatedActivity:
    • activeCount
    • queuedCount
    • workflowActiveCount
    • workflowQueuedCount
  2. In ProjectSidebar.tsx or a small extracted helper (preferred for tests), derive a Map<string, WorkspaceDelegatedActivity> from the full per-project metadata list before completed-child filtering and task-group coalescing:
    • First de-dupe by workspace.id into a Map<workspaceId, metadata> so multi-project render paths cannot double count a workspace.
    • Build childrenByParentId from parentWorkspaceId.
    • Walk descendants recursively with cycle defense.
    • Count descendants with taskStatus === "running" || taskStatus === "awaiting_report" as active.
    • Count descendants with taskStatus === "queued" as queued/pending.
    • Treat workflow ownership as inherited: workflowOwned = own.workflowTask != null || ancestorWorkflowOwned, so nested descendants of workflow-owned tasks still produce workflow copy.
    • Optionally consult workspaceStore.getWorkspaceSidebarState(child.id) so a live child stream remains active if metadata lags; wrap this read in a safe helper/catch because metadata/store teardown can race.
  3. Feed this same derived signal into workspaceHasAttention() so project/section headers also wake up when a workspace tree has active delegated work.
  4. Pass the derived activity through every relevant render path: normal project rows, multi-project rows, and rows that may later be grouped/coalesced.
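
The Phase 1 roll-up could look roughly like the sketch below. It assumes a simplified WorkspaceMeta shape (the real metadata type has more fields), uses the computeDelegatedActivityByWorkspaceId name suggested in the quality gate, and leaves out the optional live sidebar-state fallback. Starting workflow ownership at false for each root is also an assumption.

```typescript
// Simplified, assumed metadata shape; the real type carries more fields.
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported" | "interrupted";

interface WorkspaceMeta {
  id: string;
  parentWorkspaceId?: string;
  taskStatus?: TaskStatus;
  workflowTask?: unknown;
}

interface WorkspaceDelegatedActivity {
  activeCount: number;
  queuedCount: number;
  workflowActiveCount: number;
  workflowQueuedCount: number;
}

function computeDelegatedActivityByWorkspaceId(
  metadata: WorkspaceMeta[],
): Map<string, WorkspaceDelegatedActivity> {
  // De-dupe by id so multi-project render paths cannot double count.
  const byId = new Map<string, WorkspaceMeta>();
  for (const m of metadata) {
    if (m.id.length === 0) throw new Error("workspace id must be non-empty");
    byId.set(m.id, m);
  }

  const childrenByParentId = new Map<string, WorkspaceMeta[]>();
  for (const m of byId.values()) {
    if (m.parentWorkspaceId == null) continue;
    const siblings = childrenByParentId.get(m.parentWorkspaceId) ?? [];
    siblings.push(m);
    childrenByParentId.set(m.parentWorkspaceId, siblings);
  }

  const result = new Map<string, WorkspaceDelegatedActivity>();
  for (const root of byId.values()) {
    const totals: WorkspaceDelegatedActivity = {
      activeCount: 0,
      queuedCount: 0,
      workflowActiveCount: 0,
      workflowQueuedCount: 0,
    };
    const visited = new Set<string>([root.id]);

    const walk = (parentId: string, ancestorWorkflowOwned: boolean): void => {
      for (const child of childrenByParentId.get(parentId) ?? []) {
        if (visited.has(child.id)) continue; // cycle defense for malformed metadata
        visited.add(child.id);
        // Workflow ownership is inherited by nested descendants.
        const workflowOwned = child.workflowTask != null || ancestorWorkflowOwned;
        if (child.taskStatus === "running" || child.taskStatus === "awaiting_report") {
          totals.activeCount += 1;
          if (workflowOwned) totals.workflowActiveCount += 1;
        } else if (child.taskStatus === "queued") {
          totals.queuedCount += 1;
          if (workflowOwned) totals.workflowQueuedCount += 1;
        }
        walk(child.id, workflowOwned);
      }
    };

    walk(root.id, false); // assumption: ownership is only inherited from descendants
    if (totals.activeCount > 0 || totals.queuedCount > 0) result.set(root.id, totals);
  }
  return result;
}
```

This version walks the subtree once per workspace, which is quadratic in the worst case; for sidebar-sized lists that seems acceptable, but a bottom-up pass would be the natural change if the list grows.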

Quality gate after Phase 1:

  • Add focused unit coverage for recursive roll-up, ideally against a pure computeDelegatedActivityByWorkspaceId helper.
  • Cover parent → child, parent → child → grandchild, collapsed/grouped descendants, workflow-owned propagation through descendants, and cycle defense.
  • Confirm completed/reported descendants do not keep the parent active.

Phase 2 — Render delegated activity in workspace rows

  1. Extend AgentListItem props with optional delegated activity, keeping it UI-only and not crossing IPC boundaries.
  2. In AgentListItem.tsx, compute hasActiveDelegatedWork from the prop.
  3. Update visual state derivation so active delegated work can produce active styling only after own archiving/removing/error/question/stream/provisioning checks.
  4. Clarify secondary-status precedence in code: live own question/error/stream/provisioning still wins, but delegated-work text should beat stale inactive agentStatus/todo-derived status so old idle summaries do not hide Workflow running · .... Concretely, treat own status as stale/inactive only when there is no own canInterrupt, no isStarting, no awaitingUserQuestion, and no own system error.
  5. Add a small inline component/helper for delegated status text, for example:
    • Workflow running · 2 sub-agents active
    • Workflow running · 1 sub-agent active · 2 queued
    • 2 sub-agents active
    • Queued-only state should stay pending/muted rather than fully green/active unless product explicitly decides queued work should pulse active.
  6. Ensure delegated activity does not create a blank secondary row: render secondary text only when there is delegated text to show.
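
A formatter for the delegated status text in step 5 might look like the sketch below. The formatDelegatedStatus name is hypothetical, and the queued-only wording is an assumption since the plan only gives examples with active sub-agents. Returning null when there is nothing to show covers step 6, so the caller can skip the secondary row.

```typescript
interface WorkspaceDelegatedActivity {
  activeCount: number;
  queuedCount: number;
  workflowActiveCount: number;
  workflowQueuedCount: number;
}

// Hypothetical helper: builds the secondary-row text, or null when there is none.
function formatDelegatedStatus(a: WorkspaceDelegatedActivity): string | null {
  if (a.activeCount === 0 && a.queuedCount === 0) return null;

  const parts: string[] = [];
  // Workflow copy appears when any counted descendant is workflow-owned.
  if (a.workflowActiveCount > 0 || a.workflowQueuedCount > 0) {
    parts.push("Workflow running");
  }
  if (a.activeCount > 0) {
    const noun = a.activeCount === 1 ? "sub-agent" : "sub-agents";
    parts.push(`${a.activeCount} ${noun} active`);
  }
  if (a.queuedCount > 0) {
    parts.push(`${a.queuedCount} queued`); // queued-only wording is an assumption
  }
  return parts.join(" · ");
}
```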

Quality gate after Phase 2:

  • AgentListItem.test.tsx covers:
    • idle own state + active delegated workflow → active dot + delegated text.
    • own stream + delegated workflow → existing streaming UI wins.
    • own question/system error + delegated workflow → question/error UI wins.
    • queued-only behavior matches chosen visual treatment.

Phase 3 — Improve grouped sub-agent rows

  1. In TaskGroupListItem.tsx, use runningCount > 0 (and optionally queuedCount > 0) to apply a clearer in-progress style to collapsed task groups.
  2. Keep the existing count text, but make the icon/title/status less muted when running work exists.
  3. Avoid adding new animations unless the existing active-dot style can be reused cleanly.

Quality gate after Phase 3:

  • Add or update a focused test for a coalesced task group with runningCount > 0.
  • Verify grouped and ungrouped active children produce consistent parent roll-up.

Alternatives considered

Alternative A — child-row task-status fallback only

Make child rows active when their own metadata.taskStatus is running or awaiting_report, even if WorkspaceStore streaming state is missing.

  • Pros: very small, fixes child rows that are grey due to stale/missing activity snapshots.
  • Cons: does not fix the parent launcher row looking idle while delegated work is active.
  • Net product LoC estimate: +25 to +60 LoC.

Alternative B — backend workflow activity source

Add workflow-run activity to backend/sidebar state so the parent row is active while any workflow run for that workspace is pending, running, or backgrounded, even when no child agent is currently running.

  • Pros: semantically complete for coordinator-only workflow gaps.
  • Cons: requires new live workflow activity plumbing or polling; more lifecycle/recovery surface area.
  • Net product LoC estimate: +250 to +420 LoC.

Alternative C — mark parent workspace streaming during workflow

Force the parent workspace activity snapshot to streaming: true while the workflow runs.

  • Pros: simplest-looking integration.
  • Cons: misleading UX (model - streaming...), wrong mental model, and likely wrong interrupt/control expectations.
  • Recommendation: do not implement.
  • Net product LoC estimate: +40 to +90 LoC, but rejected.

Acceptance criteria

  • Launching a slash workflow that spawns active sub-agents makes the launcher workspace row visibly active/delegated-work-aware.
  • The launcher row does not say streaming... unless the launcher workspace's own assistant turn is actually streaming.
  • The delegated-work status includes useful counts and distinguishes workflow-owned activity when present.
  • Own workspace question/error/stream/provisioning states retain priority over delegated work.
  • Active descendants roll up through nested parent chains, including nested descendants of workflow-owned tasks.
  • Finished/reported descendants do not keep ancestors active.
  • Collapsed/hidden/grouped descendants still contribute to parent roll-up.
  • Collapsed task groups with active members read as in-progress.
  • Stale inactive agentStatus/todo text does not mask active delegated-work text.
  • No IPC/backend type changes are required for the recommended implementation.

Validation plan

Run targeted checks first, then broader static checks:

  1. bun test src/browser/components/AgentListItem/AgentListItem.test.tsx
  2. bun test src/browser/components/ProjectSidebar/ProjectSidebar.test.tsx
  3. bun test src/browser/utils/ui/workspaceFiltering.test.ts if any shared filtering helper changes.
  4. make typecheck
  5. make lint
  6. If changes touch shared UI helpers broadly, run make test or the closest repo-approved broader suite.

Dogfooding plan

Skills read for dogfooding/setup: dogfood, agent-browser, and dev-server-sandbox.

  1. Start an isolated app/backend instance:
    make dev-server-sandbox DEV_SERVER_SANDBOX_ARGS="--clean-projects"
    Use the emitted VITE_PORT URL as the target URL.
  2. Before browser automation, load live agent-browser CLI instructions as directed by the agent-browser skill:
    agent-browser skills get core
  3. Create a dogfood output directory, for example:
    mkdir -p dogfood-output/workflow-sidebar/screenshots dogfood-output/workflow-sidebar/videos
  4. Open the dev server with a named session:
    agent-browser --session workflow-sidebar open http://localhost:<VITE_PORT>
    agent-browser --session workflow-sidebar wait --load networkidle
  5. Configure a test project/workspace in the sandbox, enable/verify dynamic workflows, and create or use a workflow that spawns at least two sub-agents. If no reusable workflow exists, create a small scratch workflow whose agent steps are long enough to observe in the sidebar.
  6. Record the main repro path:
    agent-browser --session workflow-sidebar record start dogfood-output/workflow-sidebar/videos/slash-workflow-sidebar.webm
    • Screenshot before running the workflow.
    • Run /workflow <name> <args> from the workspace chat input.
    • Screenshot while the workflow is running and child agents are active.
    • Collapse/expand relevant task groups and screenshot both states.
    • Wait for completion/terminal continuation and screenshot the final idle/completed state.
    • Stop recording:
      agent-browser --session workflow-sidebar record stop
  7. Required evidence artifacts:
    • screenshots/before.png
    • screenshots/workflow-running-parent-active.png
    • screenshots/task-group-running.png if grouped rows apply
    • screenshots/workflow-completed.png
    • videos/slash-workflow-sidebar.webm
  8. During dogfooding, also capture console/errors:
    agent-browser --session workflow-sidebar errors
    agent-browser --session workflow-sidebar console
  9. Verification checklist:
    • Parent row active during delegated work.
    • Parent row uses delegated-work copy, not streaming....
    • Child/sub-agent rows remain independently understandable.
    • Project header/section attention wakes up while delegated work is active.
    • State returns to normal after completion unless the terminal parent continuation is actually streaming.

Implementation notes and guardrails

  • Keep changes minimal and frontend-local for the recommended approach.
  • Prefer simple derived data over a new store/API unless dogfooding reveals coordinator-only workflow gaps are a major UX problem.
  • Use assertions in recursive roll-up helpers for non-empty workspace ids and add cycle defense to avoid malformed metadata causing render failures.
  • Do not add speculative controls to the parent row; workflow interrupt/resume remains on workflow cards/commands.
  • Avoid manual memoization unless needed for correctness; React Compiler is enabled.

Generated with mux • Model: openai:gpt-5.5 • Thinking: xhigh • Cost: $198.41

ThomasK33 added 2 commits June 4, 2026 19:48
Roll descendant task activity up to parent workspace rows so slash workflow launchers stay visibly active without reusing generic streaming copy. Add focused sidebar, row, filtering, and task group tests for delegated activity states.

---

_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `719628{MUX_COSTS_USD:-unknown}`_

<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=131.74 -->
@ThomasK33
Member Author

@codex review

Please review the delegated-work sidebar roll-up, status precedence, task-group live fallback, and accessibility wiring.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Swish!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33
Member Author

@codex review

Pushed a follow-up test expectation update for the delegated-work active-dot behavior that failed the integration job.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ebe831460

Comment thread src/browser/utils/ui/workspaceFiltering.ts Outdated
@ThomasK33
Member Author

Addressed Codex finding PRRT_kwDOPxxmWM6HMtMj:

  • Removed the reportedAt != null shortcut from delegated terminal detection so resumed running/queued descendants with stale report timestamps still roll up correctly.
  • Added a regression test covering a running descendant with stale reportedAt.
  • Kept terminal reported / interrupted descendants suppressed from stale live-active hints.

@ThomasK33
Member Author

@codex review

Please re-review after the resumed-delegated-task fix.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc08a04d1a

Comment thread src/browser/utils/ui/workspaceFiltering.ts Outdated
@ThomasK33
Member Author

Addressed Codex finding PRRT_kwDOPxxmWM6HMzAU:

  • Delegated activity now allows non-finalized interrupted descendants to remain active when the live sidebar fallback reports they are still streaming.
  • Completed reports (reported, or interrupted with reportedAt) still suppress stale live hints.
  • Added regression coverage for a live interrupted descendant before report finalization.
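
The terminal-detection rules from this fix and the earlier stale-reportedAt fix can be summarized in a small predicate. This is a sketch under assumptions, not the code in workspaceFiltering.ts: the Descendant shape, the function names, and the liveIsStreaming parameter (standing in for the live sidebar fallback) are all hypothetical.

```typescript
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported" | "interrupted";

// Assumed minimal shape of a descendant's metadata.
interface Descendant {
  taskStatus?: TaskStatus;
  reportedAt?: string | null;
}

// A report is finalized when the task is reported, or interrupted with a report timestamp.
function isFinalizedReport(d: Descendant): boolean {
  return d.taskStatus === "reported" || (d.taskStatus === "interrupted" && d.reportedAt != null);
}

// liveIsStreaming stands in for the live sidebar fallback for this descendant.
function isDelegatedActive(d: Descendant, liveIsStreaming: boolean): boolean {
  // Running descendants count even with a stale reportedAt from an earlier run.
  if (d.taskStatus === "running" || d.taskStatus === "awaiting_report") return true;
  // Finalized reports suppress stale live hints.
  if (isFinalizedReport(d)) return false;
  // Queued descendants are counted separately as queued, not active.
  if (d.taskStatus === "queued") return false;
  // Non-finalized interrupted descendants (or lagging metadata) follow the live stream.
  return liveIsStreaming;
}
```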

@ThomasK33
Member Author

@codex review

Please re-review after the live-interrupted delegated activity fix.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab88d3b982

Comment thread dogfood-output/workflow-sidebar/console.txt Outdated
@ThomasK33
Member Author

Addressed Codex finding PRRT_kwDOPxxmWM6HM6y3:

  • Removed the ignored dogfood-output/ evidence artifacts from the tracked PR diff.
  • Kept the dogfood artifacts local/ignored and updated the PR body to state that they are not tracked.

@ThomasK33
Member Author

@codex review

Please re-review after removing the tracked dogfood artifacts.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

@ThomasK33 ThomasK33 added this pull request to the merge queue Jun 5, 2026
Merged via the queue into main with commit 4f0a7fa Jun 5, 2026
24 checks passed
@ThomasK33 ThomasK33 deleted the sidebar-0h0p branch June 5, 2026 06:23
@mux-bot mux-bot Bot mentioned this pull request Jun 5, 2026