Skip to content

feat: breadcrumb navigation, subworkflow isolation, and stop reliability#113

Merged
jrob5756 merged 23 commits intomicrosoft:mainfrom
PolyphonyRequiem:feat/web-breadcrumb-navigation
May 4, 2026
Merged

feat: breadcrumb navigation, subworkflow isolation, and stop reliability#113
jrob5756 merged 23 commits intomicrosoft:mainfrom
PolyphonyRequiem:feat/web-breadcrumb-navigation

Conversation

@PolyphonyRequiem
Copy link
Copy Markdown
Member

Summary

Adds full subworkflow awareness to the per-run web dashboard with breadcrumb navigation, fixes node status pollution from repeated subworkflow runs, and fixes the Stop button not working during subworkflows.

Changes

State Pollution Fix

  • Added wfDepth counter to the Zustand workflow store — only depth-0 workflow_started initializes root context
  • Each subworkflow invocation gets an isolated SubworkflowContext with its own nodes/routes/agents/groupProgress maps
  • Repeated runs of the same subworkflow no longer share state (no more green/blue node bleed)

Subworkflow Event Handling

  • Added TypeScript types for subworkflow_started, subworkflow_completed, subworkflow_failed
  • Event handlers create/update child contexts and route events via activeTarget() helper

Breadcrumb Navigation

  • BreadcrumbBar component shows the context stack above the graph
  • Click any breadcrumb to navigate to that context level
  • Double-click a workflow agent node to dive into its subworkflow graph
  • getViewedContext() + useViewed*() hooks provide stable, memoized context data

Subworkflow Node Visual

  • New WorkflowNode React Flow node type with dashed border and Layers icon
  • SubworkflowDetail panel lists subworkflow runs with status/cost, click to navigate

Stop Button Reliability (Engine Fix)

  • _check_interrupt(): in subworkflows (depth > 0), raises InterruptError instead of silently consuming the interrupt event — unwinds the child engine back to the parent
  • _handle_web_pause(): in subworkflows, also watches interrupt_event so Stop-while-paused works without requiring Resume first

Context-Aware Rendering

  • All graph node components read from useViewed*() hooks instead of root state
  • Detail panels resolve nodes from the viewed context

Testing

  • Zero new TypeScript errors (all tsc errors are pre-existing)
  • Verified on live dashboard instance at localhost:60108

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 90.19608% with 5 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@af2bbec). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/conductor/web/server.py 86.48% 5 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #113   +/-   ##
=======================================
  Coverage        ?   85.59%           
=======================================
  Files           ?       53           
  Lines           ?     7629           
  Branches        ?        0           
=======================================
  Hits            ?     6530           
  Misses          ?     1099           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@PolyphonyRequiem
Copy link
Copy Markdown
Member Author

Follow-up: extending this PR with the recursive-gate-navigation fix

Heads up — I'm pushing an additional commit to this branch that fixes two related issues in this same surface area, plus the engine plumbing they need:

Symptoms being fixed

  1. Gates inside nested sub-workflows are unreachable from the dashboard. The view doesn't follow the active context when a sub-workflow spawns deeper, so a human_gate inside a child workflow appears as already-selected/inactive in the breadcrumb tree until the user manually navigates to it.
  2. Cannot drill into sub-workflow contexts from the parent agent node detail panel reliably under concurrency.

Root cause

The store today resolves "which sub-workflow context owns this event" by reading state.activeContextPath, then mutates that path on every subworkflow_started. That works for strictly serial nesting but breaks under any concurrency, and means the view cursor (what the user is looking at) and the active cursor (what the engine is currently executing) are conflated. There's no engine-side identity for a sub-workflow context, so the frontend has to infer it.

Approach

Move the source of truth for sub-workflow identity to the engine:

  • Each WorkflowEngine carries _dashboard_context_path: list[str] — the slot-key path identifying its own position in the recursive sub-workflow tree. Root engine = []. Child engine spawned by _execute_subworkflow(agent, ..., slot_key=…) gets [*parent_path, slot_key].
  • _emit auto-stamps subworkflow_path on every event a sub-engine produces. The dashboard uses this to route per-iteration agent events to the correct context (essential once concurrency is in play — see feat(composition): allow sub-workflows in for_each groups (#102) #110 below).
  • subworkflow_started / _completed / _failed for sequential sub-workflows now include parent_path: list[str] and slot_key: str so the frontend never has to guess parentage from activeContextPath.
  • Frontend store: subworkflow_* and workflow_* handlers resolve the owning context strictly from engine-supplied paths (via a new resolveSlotPath helper). activeTarget consults subworkflow_path on the event payload before falling back to activeContextPath. Handlers gracefully fall back to legacy behavior when these new fields are absent — no regression for events from older engine versions.
  • viewContextPath sticks to the live edge when the user hasn't navigated away (deep-equal check before the engine advances activeContextPath), so newly-spawned gates are reachable without manual breadcrumb clicks. If the user has manually navigated, their view is left alone.
  • SubworkflowDetail now navigates by ctx.slotKey (stable across reorders) instead of (agent_name, iteration).
  • New engine test: TestSubWorkflowDashboardPath verifies sequential subworkflow_started/_completed carry parent_path/slot_key and that child workflow_completed is auto-stamped with subworkflow_path.

Relationship to #110 (feat/for-each-workflows)

The "concurrency" half of this fix is mostly latent until #110 lands — sequential sub-workflows can't have concurrent siblings under one parent. But the frontend handlers and the engine infra are both needed for #110's for_each-of-workflow path to render correctly in the dashboard (otherwise concurrent for_each iterations of a sub-workflow stack as descendants of one another instead of laying out as siblings of the parent group).

I'm pushing a no-op-until-this-lands companion commit to #110 that adds the for_each-side parent_path/slot_key/item_key/iteration emit fields and threads slot_key into _execute_subworkflow_with_inputs. It uses getattr for _dashboard_context_path so it doesn't blow up on #110 alone. Once both PRs merge to main, the wiring is end-to-end.

Order of merge between #110 and #113 doesn't matter — both PRs are independently safe.

PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 3, 2026
For each iteration of a for_each whose member is a sub-workflow,
emit subworkflow_started / _completed / _failed with three new fields:

- parent_path: snapshot of the parent engine's dashboard context path
  (empty list when the dashboard infra from microsoft#113 is absent)
- slot_key: a unique "<group>[<key>]" identity per iteration so
  concurrent iterations don't stack under one shared dashboard context
- iteration: 1-based ordinal (matches the existing item_key)

Conditionally thread `_dashboard_context_path` into the child engine
via getattr so this code is forward-compatible with PR microsoft#113 landing
in either order. When microsoft#113 has not landed, the kwarg is omitted and
behavior is unchanged. When microsoft#113 has landed, the child engine receives
[*parent_path, slot_key] and auto-stamps subworkflow_path on every
event it emits.

Wraps `_execute_subworkflow_with_inputs` in try/except so a failing
iteration emits subworkflow_failed before for_each_item_failed; the
original exception propagates to keep existing error handling intact.

Adds a regression test that runs three for_each iterations and asserts
each gets a distinct slot_key (batch[0], batch[1], batch[2]) on both
started and completed events.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PolyphonyRequiem
Copy link
Copy Markdown
Member Author

Pushed 34de287 (cherry-picked from local main, then frontend assets rebuilt for that bundle hash). Force-push from fork landed cleanly.

Status:

  • Gate runs: only license/cla shows on this head SHA. The branch is currently mergeStateStatus: DIRTY / behind_by: 5 against main, so the upstream CI workflows do not appear to have been triggered for this push. A rebase (or a maintainer "approve workflow runs" click) should kick them.
  • Companion PR feat(composition): allow sub-workflows in for_each groups (#102) #110 (feat/for-each-workflows) was pushed in parallel with the for_each-side companion changes and is fully green (7/7 checks). The two PRs are independently safe to land in either order — see the design discussion comment above.

@PolyphonyRequiem PolyphonyRequiem force-pushed the feat/web-breadcrumb-navigation branch from 34de287 to 5d4bb86 Compare May 3, 2026 20:24
@PolyphonyRequiem
Copy link
Copy Markdown
Member Author

Resolved the conflicts by rebasing feat/web-breadcrumb-navigation onto microsoft/conductor:main (was 5 behind, including #109's _execute_subworkflow_with_inputs and #107's run_context/metadata field).

Conflict resolution:

  • WorkflowEngine.__init__: kept both new kwargs (run_context, _dashboard_context_path) and both initialization blocks (system metadata + dashboard context path) — the two are orthogonal.
  • tests/test_engine/test_subworkflow.py: kept upstream's TestSubWorkflowInputMapping class as-is, appended TestSubWorkflowDashboardPath after it.

Rebased head is 5d4bb86. All 7 gates green.

jrob5756 pushed a commit that referenced this pull request May 4, 2026
* feat(composition): allow sub-workflows in for_each groups (#102)

Remove validator restriction blocking type='workflow' in for_each groups.
Wire execute_single_item() to call _execute_subworkflow_with_inputs()
for workflow agents, rendering input_mapping with loop variables in scope.

- Validator: Remove workflow rejection in for_each validation
- Engine: Add workflow branch in execute_single_item(), new helper
  _execute_subworkflow_with_inputs() for pre-built inputs
- Tests: Update test_workflow_in_for_each to validate (not reject)
- Experimental workflows: test-for-each-workflow parent/child pair

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* style: ruff format workflow.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Emit parent_path/slot_key/iteration on for_each-of-workflow lifecycle

For each iteration of a for_each whose member is a sub-workflow,
emit subworkflow_started / _completed / _failed with three new fields:

- parent_path: snapshot of the parent engine's dashboard context path
  (empty list when the dashboard infra from #113 is absent)
- slot_key: a unique "<group>[<key>]" identity per iteration so
  concurrent iterations don't stack under one shared dashboard context
- iteration: 1-based ordinal (matches the existing item_key)

Conditionally thread `_dashboard_context_path` into the child engine
via getattr so this code is forward-compatible with PR #113 landing
in either order. When #113 has not landed, the kwarg is omitted and
behavior is unchanged. When #113 has landed, the child engine receives
[*parent_path, slot_key] and auto-stamps subworkflow_path on every
event it emits.

Wraps `_execute_subworkflow_with_inputs` in try/except so a failing
iteration emits subworkflow_failed before for_each_item_failed; the
original exception propagates to keep existing error handling intact.

Adds a regression test that runs three for_each iterations and asserts
each gets a distinct slot_key (batch[0], batch[1], batch[2]) on both
started and completed events.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Green <dangreen@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PolyphonyRequiem PolyphonyRequiem force-pushed the feat/web-breadcrumb-navigation branch from 3fec4d9 to 9c1abed Compare May 4, 2026 15:07
Copy link
Copy Markdown
Collaborator

@jrob5756 jrob5756 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool change. Made a few comments. Then we can merge.

state.agentsTotal = agentNames.size;
} else {
// Child workflow — populate the active child context
const ctx = resolveContext(state.subworkflowContexts, state.activeContextPath);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurrent for-each-of-workflow context corruption.

This handler routes via the global state.activeContextPath instead of the engine-stamped data.subworkflow_path. The PR went to lengths to add subworkflow_path auto-stamping in _emit (workflow.py:430-434) and uses it in workflow_completed (line 1293) and workflow_failed (line 1331) — workflow_started is the missing one.

Reproduction with concurrent for_each-of-workflow:

  1. Iteration A emits subworkflow_startedactiveContextPath=[0]
  2. Iteration B emits subworkflow_startedactiveContextPath=[1]
  3. Iteration A's child engine emits workflow_started (carrying subworkflow_path=[A_slot]) → resolved to activeContextPath=[1] → A's graph/agents populate B's context.

Fix: resolve via subworkflow_path first, falling back to activeContextPath:

const ctx = (data.subworkflow_path !== undefined
  ? resolveSlotPath(state.subworkflowContexts, data.subworkflow_path as string[])?.ctx
  : null)
  ?? resolveContext(state.subworkflowContexts, state.activeContextPath);

Copy link
Copy Markdown
Member Author

@PolyphonyRequiem PolyphonyRequiem May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d27109d. Adopted your suggested resolution pattern (subworkflow_path with activeContextPath fallback) and added subworkflow_path?: string[] to WorkflowStartedData so the resolver can read it without a cast.

Also added a regression test in test_subworkflow.py::TestSubWorkflowDashboardPath::test_concurrent_for_each_subworkflow_emits_distinct_slot_keys that exercises max_concurrent=3 (the existing for_each test only covered the sequential max_concurrent=1 case).

// engine-supplied parent_path so concurrent completions land correctly.
let parentIndexPath: number[];
if (data.parent_path !== undefined) {
const resolved = resolveSlotPath(state.subworkflowContexts, data.parent_path);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent fallback to root + activeContextPath corruption on slot-resolution miss.

resolved?.indexPath ?? [] silently falls back to the root context ([]), and state.activeContextPath = parentIndexPath later in the handler then unconditionally overwrites the active path. If resolveSlotPath returns null (e.g., event arrives before its sibling subworkflow_started, or any path inconsistency), the code:

  1. Targets state.nodes (root) and stamps status: 'completed' on a same-named root agent that has nothing to do with the subworkflow.
  2. Resets activeContextPath to [], breaking subsequent sibling routing.

Compare to subworkflow_started at line 1364 which correctly does if (!resolved) return;. Mirror that here:

if (data.parent_path !== undefined) {
  const resolved = resolveSlotPath(state.subworkflowContexts, data.parent_path);
  if (!resolved) return;
  parentIndexPath = resolved.indexPath;
} else {
  parentIndexPath = state.activeContextPath;
}

The identical bug exists in subworkflow_failed at line 1458 — please fix both.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 37d856d. Mirrored the if (!resolved) return; guard from subworkflow_started (line 1364) in both subworkflow_completed (line 1419) and subworkflow_failed (line 1458). On a slot-resolution miss, both handlers now return early instead of silently falling back to root and clobbering activeContextPath.

# unwinds the child engine back to the parent, stopping the workflow.
if self._web_dashboard is not None:
if self._subworkflow_depth > 0:
raise InterruptError(agent_name=current_agent_name)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Headline feature is untested.

This new if self._subworkflow_depth > 0: raise InterruptError(...) branch — and the matching new stop_task logic in _handle_web_pause (lines 1077-1110) — have zero test coverage. Every test in TestCheckInterrupt (tests/test_engine/test_workflow_interrupt.py:572-658) constructs an engine with _subworkflow_depth=0, so this branch is dead code under tests. You could delete lines 969-970 and all 45 tests in the suite would still pass.

Since "Stop button reliability during subworkflows" is a headline feature of this PR, please add at minimum:

  1. test_check_interrupt_raises_in_subworkflow — engine with _subworkflow_depth=1 and _web_dashboard set; assert InterruptError is raised.
  2. test_handle_web_pause_stop_event_in_subworkflow — pause a child engine, set _interrupt_event, assert InterruptError.
  3. test_nested_subworkflow_path_accumulates — depth ≥ 2; assert subworkflow_path chains correctly across multiple levels.
  4. test_subworkflow_failed_event_carries_parent_path_and_slot_key — make the child raise; assert the failed event payload (currently only the success path is asserted at tests/test_engine/test_subworkflow.py:1014-1015).
  5. A for_each-of-workflow test asserting distinct slot_key per iteration — the existing test only covers the trivial sequential case (slot_key == agent.name).

Copy link
Copy Markdown
Member Author

@PolyphonyRequiem PolyphonyRequiem May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in fef6b36. Added the test cases you specified (and one bonus):

  • TestCheckInterruptSubworkflow::test_check_interrupt_raises_in_subworkflow - engine with _subworkflow_depth=1 and _web_dashboard set; asserts InterruptError is raised and interrupt_event is cleared.
  • TestCheckInterruptSubworkflow::test_check_interrupt_consumed_silently_at_root - regression guard for the root-depth silent-consume branch.
  • TestHandleWebPauseSubworkflow::test_handle_web_pause_stop_event_in_subworkflow - pauses a child engine, sets _interrupt_event mid-pause, asserts InterruptError.
  • TestHandleWebPauseSubworkflow::test_handle_web_pause_root_ignores_interrupt_event - documents the intentional root-vs-subworkflow asymmetry as an executable spec.
  • TestSubWorkflowDashboardPath::test_subworkflow_failed_event_carries_parent_path_and_slot_key - child raises, asserts the failed event payload (agent_name, parent_path, slot_key, error_type, message).
  • TestSubWorkflowDashboardPath::test_nested_subworkflow_path_accumulates - depth 2 (parent -> mid -> leaf), asserts subworkflow_path chains correctly across both levels and that the root engine still emits workflow_completed without a stamp.
  • TestSubWorkflowDashboardPath::test_concurrent_for_each_subworkflow_emits_distinct_slot_keys - your point Idle recovery: cumulative counter fails long-running agents — add per-workflow config and reset-on-progress #5. The existing test_for_each_subworkflow_emits_distinct_slot_keys already covers slot-key uniqueness for for_each-of-workflow, but it ran with max_concurrent=1 so iterations were sequential. The new test uses max_concurrent=3 so iterations actually overlap, proving uniqueness is not an artifact of serial execution.

Engine suite: 53 passed (was 46), one pre-existing Windows path failure in test_event_log.py is unrelated.

# the user to first Resume then wait for the next between-agent check.
stop_task = None
if self._subworkflow_depth > 0 and self._interrupt_event is not None:
self._interrupt_event.clear()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pause+Stop UX diverges between root and subworkflow.

Pre-clearing interrupt_event here means a quick double-click of Stop (where the second click lands before partial output reaches _handle_web_pause) silently discards the second signal. Behavior also differs from root: at root, a second Stop while paused is ignored until Resume (only Kill works); here, the second Stop is honored — but only if it arrives after the pause is established.

At minimum, add a code comment explaining the intentional asymmetry. Ideally, document the divergence in the dashboard UX docs (or align behavior between root and subworkflow).

Copy link
Copy Markdown
Member Author

@PolyphonyRequiem PolyphonyRequiem May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1c3d199 with an inline comment in _handle_web_pause covering all three points:

  1. Why root depth deliberately omits the interrupt_event subscription (pause exits only on Resume or Kill).
  2. Why subworkflow depth subscribes to it (so Stop unwinds the child engine without requiring Resume first).
  3. The tiny clear()/create_task() race window where a Stop click can be silently discarded, and the trade-off vs. carrying a stale Stop signal across pause cycles.

Also added TestHandleWebPauseSubworkflow::test_handle_web_pause_root_ignores_interrupt_event (in fef6b36) as an executable spec for the asymmetry, so the next person who tries to "fix" it gets a failing test.

Aligning behavior between root and subworkflow Stop semantics feels out of scope for this PR - happy to file a follow-up if you want them unified one way or the other.

PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
The workflow_started handler resolved the owning context via
state.activeContextPath, which is unsafe under concurrent
for_each-of-workflow:

  1. Iteration A emits subworkflow_started -> activeContextPath=[0]
  2. Iteration B emits subworkflow_started -> activeContextPath=[1]
  3. Iteration A's child engine emits workflow_started (carrying
     subworkflow_path=[A_slot]) -> resolved to activeContextPath=[1]
     -> A's graph/agents populate B's context.

Mirror the resolution pattern already used by workflow_completed and
workflow_failed: prefer subworkflow_path from the event payload, fall
back to activeContextPath only when the field is absent (legacy
engines).

Also adds the optional subworkflow_path field to the
WorkflowStartedData TypeScript type so the resolver can read it
without a cast.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Both handlers used 
esolved?.indexPath ?? [] as a fallback when
resolveSlotPath returned null. That silently routed the event to the
root context, with two distinct corruption modes:

  1. 	argetNodes defaulted to state.nodes (root), so a same-named
     root agent had its status overwritten to 'completed' or 'failed'
     by an unrelated subworkflow event.
  2. state.activeContextPath = parentIndexPath then unconditionally
     reset the active path to [], breaking subsequent sibling routing.

Mirror the guard that subworkflow_started already uses (line 1365):
when resolveSlotPath returns null, return early. A null result means
the event arrived before its sibling subworkflow_started or the path
is inconsistent — neither warrants polluting root state.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
…yloads

Adds coverage for code paths that previously had none, all introduced
by the breadcrumb-navigation PR:

  TestCheckInterruptSubworkflow
    - test_check_interrupt_raises_in_subworkflow
        Sub-engine in web mode propagates the interrupt as
        InterruptError so the child unwinds back to the parent.
    - test_check_interrupt_consumed_silently_at_root
        Regression guard for the root-depth web-mode silent-consume
        branch (the partial-output handler does the real pausing).

  TestHandleWebPauseSubworkflow
    - test_handle_web_pause_stop_event_in_subworkflow
        Pause-then-Stop in a sub-workflow exits via interrupt_event
        without requiring Resume first.
    - test_handle_web_pause_root_ignores_interrupt_event
        Documents the intentional root-vs-subworkflow asymmetry: at
        root, only Resume or Kill exit a pause.

  TestSubWorkflowDashboardPath (additions)
    - test_subworkflow_failed_event_carries_parent_path_and_slot_key
        The exception branch of _execute_subworkflow emits
        subworkflow_failed with parent_path and slot_key. Previously
        only the success-path emit was asserted.
    - test_nested_subworkflow_path_accumulates
        At depth >= 2 (parent -> mid -> leaf), each engine emits its
        own workflow_completed and the auto-stamped subworkflow_path
        chains correctly across nesting levels.
    - test_concurrent_for_each_subworkflow_emits_distinct_slot_keys
        Existing for_each test ran with max_concurrent=1; this variant
        uses max_concurrent=3 so iterations actually overlap, proving
        slot_key uniqueness is not an artifact of serial execution.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Adds an inline comment in _handle_web_pause explaining:

  1. Why root depth deliberately omits the interrupt_event subscription
     (pause exits only on Resume or Kill).
  2. Why subworkflow depth subscribes to it (so Stop unwinds the child
     engine without requiring Resume first).
  3. The tiny clear()/create_task() race window where a Stop click can
     be silently discarded, and the trade-off vs. carrying a stale Stop
     signal across pause cycles.

No behavior change. Documents an intentional design choice flagged in
PR microsoft#113 review.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
…yloads

Adds coverage for code paths that previously had none, all introduced
by the breadcrumb-navigation PR:

  TestCheckInterruptSubworkflow
    - test_check_interrupt_raises_in_subworkflow
        Sub-engine in web mode propagates the interrupt as
        InterruptError so the child unwinds back to the parent.
    - test_check_interrupt_consumed_silently_at_root
        Regression guard for the root-depth web-mode silent-consume
        branch (the partial-output handler does the real pausing).

  TestHandleWebPauseSubworkflow
    - test_handle_web_pause_stop_event_in_subworkflow
        Pause-then-Stop in a sub-workflow exits via interrupt_event
        without requiring Resume first.
    - test_handle_web_pause_root_ignores_interrupt_event
        Documents the intentional root-vs-subworkflow asymmetry: at
        root, only Resume or Kill exit a pause.

  TestSubWorkflowDashboardPath (additions)
    - test_subworkflow_failed_event_carries_parent_path_and_slot_key
        The exception branch of _execute_subworkflow emits
        subworkflow_failed with parent_path and slot_key. Previously
        only the success-path emit was asserted.
    - test_nested_subworkflow_path_accumulates
        At depth >= 2 (parent -> mid -> leaf), each engine emits its
        own workflow_completed and the auto-stamped subworkflow_path
        chains correctly across nesting levels.
    - test_concurrent_for_each_subworkflow_emits_distinct_slot_keys
        Existing for_each test ran with max_concurrent=1; this variant
        uses max_concurrent=3 so iterations actually overlap, proving
        slot_key uniqueness is not an artifact of serial execution.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Adds an inline comment in _handle_web_pause explaining:

  1. Why root depth deliberately omits the interrupt_event subscription
     (pause exits only on Resume or Kill).
  2. Why subworkflow depth subscribes to it (so Stop unwinds the child
     engine without requiring Resume first).
  3. The tiny clear()/create_task() race window where a Stop click can
     be silently discarded, and the trade-off vs. carrying a stale Stop
     signal across pause cycles.

No behavior change. Documents an intentional design choice flagged in
PR microsoft#113 review.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PolyphonyRequiem PolyphonyRequiem force-pushed the feat/web-breadcrumb-navigation branch 2 times, most recently from 4e07de4 to 6fb6e93 Compare May 4, 2026 18:08
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Both handlers used 
esolved?.indexPath ?? [] as a fallback when
resolveSlotPath returned null. That silently routed the event to the
root context, with two distinct corruption modes:

  1. 	argetNodes defaulted to state.nodes (root), so a same-named
     root agent had its status overwritten to 'completed' or 'failed'
     by an unrelated subworkflow event.
  2. state.activeContextPath = parentIndexPath then unconditionally
     reset the active path to [], breaking subsequent sibling routing.

Mirror the guard that subworkflow_started already uses (line 1365):
when resolveSlotPath returns null, return early. A null result means
the event arrived before its sibling subworkflow_started or the path
is inconsistent — neither warrants polluting root state.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
…yloads

Adds coverage for code paths that previously had none, all introduced
by the breadcrumb-navigation PR:

  TestCheckInterruptSubworkflow
    - test_check_interrupt_raises_in_subworkflow
        Sub-engine in web mode propagates the interrupt as
        InterruptError so the child unwinds back to the parent.
    - test_check_interrupt_consumed_silently_at_root
        Regression guard for the root-depth web-mode silent-consume
        branch (the partial-output handler does the real pausing).

  TestHandleWebPauseSubworkflow
    - test_handle_web_pause_stop_event_in_subworkflow
        Pause-then-Stop in a sub-workflow exits via interrupt_event
        without requiring Resume first.
    - test_handle_web_pause_root_ignores_interrupt_event
        Documents the intentional root-vs-subworkflow asymmetry: at
        root, only Resume or Kill exit a pause.

  TestSubWorkflowDashboardPath (additions)
    - test_subworkflow_failed_event_carries_parent_path_and_slot_key
        The exception branch of _execute_subworkflow emits
        subworkflow_failed with parent_path and slot_key. Previously
        only the success-path emit was asserted.
    - test_nested_subworkflow_path_accumulates
        At depth >= 2 (parent -> mid -> leaf), each engine emits its
        own workflow_completed and the auto-stamped subworkflow_path
        chains correctly across nesting levels.
    - test_concurrent_for_each_subworkflow_emits_distinct_slot_keys
        Existing for_each test ran with max_concurrent=1; this variant
        uses max_concurrent=3 so iterations actually overlap, proving
        slot_key uniqueness is not an artifact of serial execution.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Adds an inline comment in _handle_web_pause explaining:

  1. Why root depth deliberately omits the interrupt_event subscription
     (pause exits only on Resume or Kill).
  2. Why subworkflow depth subscribes to it (so Stop unwinds the child
     engine without requiring Resume first).
  3. The tiny clear()/create_task() race window where a Stop click can
     be silently discarded, and the trade-off vs. carrying a stale Stop
     signal across pause cycles.

No behavior change. Documents an intentional design choice flagged in
PR microsoft#113 review.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PolyphonyRequiem
Copy link
Copy Markdown
Member Author

@jrob5756 good catch — confirmed and fixed.

what happened: i did the review-fix work on a local branch from another machine, then a force-push of unrelated work (6fb6e93 viewport auto-fit, etc.) from this machine landed on top of 9c1abed and overwrote the chain. the fix commits stayed reachable in my local reflog but never made it back to origin.

remediation (no force-push):

  • f374a19 — fix(web): bail on slot-resolution miss in subworkflow_completed/failed (your finding Support file path as an input #2)
  • 75e2142 — test(engine): cover subworkflow stop/interrupt and dashboard event payloads (53 tests, all pass)
  • 15afe2c — docs(engine): document root-vs-subworkflow pause/stop asymmetry
  • e1212ed — build: rebuild static assets so the bundle reflects the slot-resolution fix

cherry-picked on top of current origin and fast-forwarded — no force-push, your in-flight reviews stay intact.

re finding #1 (workflow_started subworkflow_path routing): 22f1988 already on the branch is functionally equivalent to my orphaned d27109d (same fix to the same line range, just inline-cast vs typed-field). no need to re-cherry-pick.

PR head is now e1212ed. apologies for the noise — the cross-machine state had me chasing a ghost.

Daniel Green and others added 8 commits May 4, 2026 15:09
Adds full subworkflow awareness to the per-run web dashboard:

State pollution fix:
- Added wf_depth counter to workflow store — only depth-0
  workflow_started initializes root context. Inner workflow events
  are routed to isolated SubworkflowContext objects.
- Each subworkflow invocation gets its own nodes/routes/agents maps,
  keyed by (parentAgent, iteration). Repeated runs of the same
  subworkflow no longer share state.

Subworkflow event handling:
- Added TypeScript types for subworkflow_started, subworkflow_completed,
  subworkflow_failed events (mirrors engine emit).
- Event handlers create/update child contexts and track the active
  context path for routing subsequent events.

Breadcrumb navigation:
- New BreadcrumbBar component shows the context stack above the graph
  (e.g., Root > twig-sdlc-planning > plan-issue).
- Click any breadcrumb to navigate to that context level.
- Double-click a workflow agent node in the graph to dive into its
  subworkflow context.
- Graph rebuilds automatically when context changes.

Context stack architecture:
- SubworkflowContext[] tree structure mirrors workflow nesting.
- activeContextPath tracks where live events are routed.
- viewContextPath tracks what the user is viewing (independent).
- getViewedContext() returns the correct nodes/routes for rendering.
- All event handlers use activeTarget() helper to route to the
  correct context's nodes/groupProgress.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Phase 4-5 of breadcrumb navigation feature:

WorkflowNode component:
- New node type for type:'workflow' agents with dashed border and
  Layers icon (visually distinct from regular agent nodes).
- Shows child workflow name, elapsed time, and a chevron indicator
  when a SubworkflowContext exists.
- Double-click to navigate into the subworkflow graph.

SubworkflowDetail panel:
- New detail component shown when a workflow agent is selected.
- Lists all subworkflow runs for that agent with status, agent
  count, and cost summary.
- Click any run to navigate into its context.

Context-aware rendering:
- All graph node components (AgentNode, ScriptNode, GateNode,
  GroupNode, AnimatedEdge) now read from getViewedContext().nodes
  instead of root state.nodes — ensures correct status display
  when viewing child contexts.
- DetailPanel reads from viewed context for node lookup.
- GroupDetail reads groupProgress from viewed context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
getViewedContext() creates a new object on every call, causing infinite
re-render loops (React error #185) when used inside Zustand selectors.

New hooks in use-viewed-context.ts use useMemo with stable state
references:
- useViewedNodes() — nodes map for current context
- useViewedGroupProgress() — group progress for current context
- useViewedHighlightedEdges() — edge highlights
- useViewedSubworkflowContexts() — child contexts
- useViewedGraphData() — full graph data for WorkflowGraph

All graph components and detail panels updated to use these hooks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a subworkflow agent is running, the shared interrupt_event was
being silently consumed by _check_interrupt() in web mode (line 857:
'if self._web_dashboard is not None: return None'). This meant Stop
would pause the current agent but then resume and continue — the
interrupt never propagated to the parent engine.

Two fixes:

1. _check_interrupt(): when _subworkflow_depth > 0, raise
   InterruptError instead of silently consuming the interrupt.
   This unwinds the child engine back to the parent's
   _execute_subworkflow try/except, stopping the workflow.

2. _handle_web_pause(): in subworkflows, also watch interrupt_event
   alongside resume/kill/disconnect events. A second Stop click
   while an agent is paused now raises InterruptError immediately,
   without requiring Resume first.

Root-level (depth 0) behavior is unchanged — Stop still pauses the
current agent with Resume/Kill options in the dashboard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On Windows with Python 3.14+, the proactor event loop's accept callback
can fire after Server.close() sets _sockets = None during shutdown,
causing an AssertionError in base_events.py:_attach that crashes the
workflow process.

Fix:
- Add _guarded_serve() wrapper that catches AssertionError when the
  uvicorn server is in shutdown state (should_exit = True)
- Install a custom event-loop exception handler during server lifetime
  that suppresses the same race when it surfaces through callbacks
- _is_proactor_shutdown_race() validates: AssertionError type, server
  shutdown state, and asyncio-originating traceback frames
- Restore original exception handler in stop()

The guard is narrowly scoped: only AssertionError during server shutdown
is suppressed. All other exceptions delegate to the original handler.

Tests: 9 new tests covering the race detection, exception handler
delegation, guarded serve behavior, and edge cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yers

When navigating between subworkflow layers via breadcrumbs or double-click,
old React Flow edges from the previous layer persisted as floating links
disconnected from any visible nodes.

Two fixes:
- WorkflowGraph: explicitly clear nodes and edges when switching to an
  empty context (subworkflow data not yet populated)
- graph-layout: filter edges against the actual node ID set to prevent
  orphan edges from routes referencing non-existent nodes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… nodes

Parse ?agent={name} and ?subworkflow={name} query params on initial load
to auto-select and center the matching node in the workflow graph. This
enables the meta-dashboard (conductor-dashboard) to generate clickable
breadcrumb links that open the conductor UI focused on a specific node.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update useDeepLink hook to:
- Parse slash-separated subworkflow paths (e.g., ?subworkflow=planning/design)
  for navigating multiple levels deep into nested subworkflows
- Support combined ?subworkflow=X&agent=Y to select an agent within
  a subworkflow context
- Remove dependency on subworkflowContexts selector (array mutation
  doesn't trigger re-renders); rely on late-joiner replay instead

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Daniel Green and others added 15 commits May 4, 2026 15:09
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rewrote useDeepLink to fix timing issues that prevented navigation:

- Use zustand.subscribe() instead of useEffect + selector reactivity.
  The old approach relied on subworkflowContexts selector changes,
  but the store mutates the array in-place during processEvent,
  so the selector never detected changes.
- Resolve the full subworkflow path in one shot via index walking
  instead of calling navigateIntoSubworkflow() in a loop.
- Set viewContextPath directly via setState instead of relying on
  action functions that might see stale state.
- Add error banner when deep-link target is invalid: shows the error
  message and a link back to the root dashboard.
- Validate agent exists in the target context's agent list.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When viewing a sub-workflow graph, the start and end nodes are now
replaced with distinct ingress/egress nodes that:
- Use a dashed border with rounded-xl style to visually distinguish
  them from regular start/end nodes
- Display 'From <parent agent>' on the ingress node
- Display 'Return to <parent agent>' on the egress node
- Navigate back to the parent workflow on double-click

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The breadcrumb navigation feature could not reach gates inside nested
sub-workflows: the dashboard's view path did not follow the active
context as a parent agent spawned a child workflow, so a human_gate
inside that child was unreachable from the breadcrumb tree.

The root cause: the store inferred sub-workflow parentage from the
shared activeContextPath, then mutated it on every subworkflow_started.
That conflated the user's view cursor with the engine's execution
cursor and made events impossible to route correctly under any
concurrency.

Engine: each WorkflowEngine now carries _dashboard_context_path, the
slot-key path identifying its position in the recursive sub-workflow
tree (root = []). _execute_subworkflow accepts a slot_key and threads
[*parent_path, slot_key] into the spawned child engine. _emit auto-
stamps subworkflow_path on every event a sub-engine produces.
Sequential subworkflow_started/_completed/_failed include parent_path
and slot_key.

Frontend: subworkflow_started/completed/failed and workflow_completed/
failed handlers resolve the owning context strictly from engine-
supplied parent_path / subworkflow_path via a new resolveSlotPath
helper, instead of mutating a single shared activeContextPath.
activeTarget consults subworkflow_path on event data when present so
per-iteration agent_message/tool/turn events land in the right per-
iteration context. viewContextPath sticks to the live edge when the
user has not navigated away, so newly-spawned gates inside sub-
workflows are reachable without manual breadcrumb clicks. Handlers
fall back to legacy behavior when these new fields are absent.

SubworkflowDetail navigates by ctx.slotKey (stable across iteration
reorders) instead of (agent_name, iteration). Breadcrumbs render slot
keys so concurrent iterations are distinguishable.

Adds TestSubWorkflowDashboardPath covering parent_path/slot_key
emission for sequential sub-workflows and the auto-stamped
subworkflow_path on the child workflow_completed event.

This is a self-contained slice of a broader fix that also addresses
concurrent for_each-of-workflow stacking; the matching for_each-side
emit changes are staged in microsoft#110 and become functional once both PRs
land. Order of merge does not matter — handlers degrade gracefully
when events lack the new fields.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add flexible segment matching for subworkflow deep-links:
  1. Exact slotKey match (e.g. plan_child[item-0])
  2. Positional index (e.g. plan_child#0, 0-based)
  3. Bare agent name (when unambiguous)

Ambiguous bare names (multiple for_each iterations) now produce
an actionable error listing valid alternatives instead of silently
failing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…arrows

Edges whose source or target was a child node inside a parallel group
rendered at incorrect absolute positions (React Flow positions children
relative to the parent group). This caused orphaned arrowheads in the
top-left corner of the graph.

Fix: remap any edge endpoint that references a group child to the
parent group node instead, so dagre can properly route the edge and
React Flow draws it at the correct position.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add FitViewOnContextSwitch component that triggers fitView when
navigating between sub-workflow contexts, preventing stale viewport
positioning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When ?agent= is provided without ?subworkflow=, explicitly pin
viewContextPath to root ([]) before the agent lookup runs. This
prevents the sticky-follow mechanism from advancing the view into
a stale subworkflow/for_each iteration during WS replay, which
caused 'Agent not found' errors for root-level agents.

Also handles ?subworkflow=&agent=foo (empty subworkflow string)
correctly — empty string is falsy so falls through to the
agent-only reset path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Under concurrent for_each iterations, the global activeContextPath is
advanced by each subworkflow_started event. When workflow_started for
an earlier iteration's child engine arrived, it was resolved against
the (now-stale) activeContextPath, so its agents/routes were written
to the wrong sibling ctx — or got overwritten by a later sibling.
That manifested as phantom routes (and missing routes) when the user
navigated into a deeply-nested for_each iteration: routes from one
iteration would surface in another, leaving edges drawn between
agents that don't appear in the iteration's view.

workflow_completed and workflow_failed already used the engine-
supplied subworkflow_path slot key (commit f77b20b); workflow_started
was the one handler missed. This change makes it consistent with
its sibling handlers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Both handlers used 
esolved?.indexPath ?? [] as a fallback when
resolveSlotPath returned null. That silently routed the event to the
root context, with two distinct corruption modes:

  1. 	argetNodes defaulted to state.nodes (root), so a same-named
     root agent had its status overwritten to 'completed' or 'failed'
     by an unrelated subworkflow event.
  2. state.activeContextPath = parentIndexPath then unconditionally
     reset the active path to [], breaking subsequent sibling routing.

Mirror the guard that subworkflow_started already uses (line 1365):
when resolveSlotPath returns null, return early. A null result means
the event arrived before its sibling subworkflow_started or the path
is inconsistent — neither warrants polluting root state.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yloads

Adds coverage for code paths that previously had none, all introduced
by the breadcrumb-navigation PR:

  TestCheckInterruptSubworkflow
    - test_check_interrupt_raises_in_subworkflow
        Sub-engine in web mode propagates the interrupt as
        InterruptError so the child unwinds back to the parent.
    - test_check_interrupt_consumed_silently_at_root
        Regression guard for the root-depth web-mode silent-consume
        branch (the partial-output handler does the real pausing).

  TestHandleWebPauseSubworkflow
    - test_handle_web_pause_stop_event_in_subworkflow
        Pause-then-Stop in a sub-workflow exits via interrupt_event
        without requiring Resume first.
    - test_handle_web_pause_root_ignores_interrupt_event
        Documents the intentional root-vs-subworkflow asymmetry: at
        root, only Resume or Kill exit a pause.

  TestSubWorkflowDashboardPath (additions)
    - test_subworkflow_failed_event_carries_parent_path_and_slot_key
        The exception branch of _execute_subworkflow emits
        subworkflow_failed with parent_path and slot_key. Previously
        only the success-path emit was asserted.
    - test_nested_subworkflow_path_accumulates
        At depth >= 2 (parent -> mid -> leaf), each engine emits its
        own workflow_completed and the auto-stamped subworkflow_path
        chains correctly across nesting levels.
    - test_concurrent_for_each_subworkflow_emits_distinct_slot_keys
        Existing for_each test ran with max_concurrent=1; this variant
        uses max_concurrent=3 so iterations actually overlap, proving
        slot_key uniqueness is not an artifact of serial execution.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds an inline comment in _handle_web_pause explaining:

  1. Why root depth deliberately omits the interrupt_event subscription
     (pause exits only on Resume or Kill).
  2. Why subworkflow depth subscribes to it (so Stop unwinds the child
     engine without requiring Resume first).
  3. The tiny clear()/create_task() race window where a Stop click can
     be silently discarded, and the trade-off vs. carrying a stale Stop
     signal across pause cycles.

No behavior change. Documents an intentional design choice flagged in
PR microsoft#113 review.

Reported-by: jrob5756 (PR microsoft#113 review)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rebuilds the production bundle so it incorporates the
subworkflow_completed/failed slot-resolution fix (f374a19) that was
cherry-picked into this branch. Without this rebuild the committed
`static/` bundle would not reflect the source change and the dashboard
would still exhibit the bug at runtime.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous fix (593f4e1) reset viewContextPath to root for agent-only
deep-links to stop sticky-follow from stranding the user inside a stale
for_each iteration during replay. That solved the false-negative case
(agent at root) but introduced a false-negative for nested agents: if
the agent only exists inside a sub-workflow / for_each iteration, the
resolver would now error with 'Agent X not found in root workflow.'
That broke external integrations (e.g. notification feeds) that surface
?agent=X without knowing the parent slot chain.

Resolution algorithm now:
  - subworkflow=foo&agent=bar: navigate to foo, look for bar there;
    if bar isn't at foo but exists elsewhere, list discovered locations
    in the error so the next click is obvious.
  - agent=bar (no subworkflow): try root first; otherwise walk every
    sub-workflow context. On exactly one match: navigate there. On
    many matches (e.g. bar ran in every for_each iteration): pick
    running > deepest > newest, mirroring the engine's live-event
    routing precedence.
  - Zero matches anywhere: deterministic error, view pinned to root
    so sticky-follow still doesn't strand.

Also reworks the wait condition: instead of firing on the first
state change after agents.length > 0 (which races against WS replay
of nested subworkflow_started events), the resolver debounces 200ms
of state quiescence and only applies once the target is resolvable
or the workflow has reached a terminal state. A 5s hard cap keeps
live-workflow deep-links from hanging when the target never appears.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PolyphonyRequiem PolyphonyRequiem force-pushed the feat/web-breadcrumb-navigation branch from bc0c747 to b2eb97f Compare May 4, 2026 22:09
Copy link
Copy Markdown
Collaborator

@jrob5756 jrob5756 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All critical and important fixes verified at b2eb97f:

  • ✅ C1: workflow_started routes via subworkflow_path with activeContextPath fallback
  • ✅ I1: bail-on-resolution-miss guards added in both subworkflow_completed and subworkflow_failed
  • ✅ I2: 9 new tests in TestCheckInterruptSubworkflow / TestHandleWebPauseSubworkflow / TestSubWorkflowDashboardPath cover the depth>0 interrupt branch, stop-while-paused, distinct slot_keys per for_each iteration, nested path accumulation, and failed event payload — all passing
  • ✅ S1: comprehensive root-vs-subworkflow asymmetry comment in _handle_web_pause

Carrying I3 (server.py over-broad AssertionError suppression on Windows shutdown) and S2/S3 (comment/docstring nits) into a follow-up issue rather than blocking on them. Nice work, ship it.

@jrob5756 jrob5756 merged commit b3e3485 into microsoft:main May 4, 2026
7 checks passed
PolyphonyRequiem pushed a commit to PolyphonyRequiem/conductor that referenced this pull request May 4, 2026
Bundles four follow-ups for PR microsoft#113:

* feat(web): "Dive In" affordance on for-each iteration rows so users can
  navigate into a specific iteration's subworkflow context. The detail
  panel for a for_each group now exposes a Layers button per iteration
  that resolves the slot key (`<group>[<key>]`) against
  subworkflowContexts and calls navigateIntoSubworkflow. Resolves the
  reported case where pg_execution_group[0] could not be linked into.

* fix(web/server): tighten _is_proactor_shutdown_race to require the
  deepest traceback frame to originate from asyncio internals (microsoft#145 I3).
  Previously the gate fell back to permissive True when the traceback
  was missing or empty, which could swallow unrelated AssertionErrors
  raised by user/workflow code during shutdown.

* docs(engine): explain why _check_interrupt clears the interrupt event
  before raising — single-listener reset semantics, no race, callers
  rely on the post-raise event being clear (microsoft#145 S2).

* docs(web): note on resolveSlotPath that newest-wins makes older
  for-each iterations unreachable by stale slot paths (microsoft#145 S3).

Tests:
- Updated TestProactorShutdownRace to assert the tightened contract:
  asyncio-framed traceback -> True, non-asyncio frame -> False, no
  traceback -> False. All 11 cases pass.
- 137 web/engine tests pass; ruff check + format clean.

Refs: microsoft#113, microsoft#145

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 pushed a commit that referenced this pull request May 5, 2026
…nits (#145) (#146)

* fix(web): dedupe duplicate routes in graph layout

When user YAML defines a conditional route plus a catch-all to the same
target (a common pattern, e.g. 'when: ready==false' then bare 'to: gate'
fallback), the engine evaluates them in order with first-match-wins
semantics — they represent ONE visual transition, not parallel edges.

Previously graph-layout.ts emitted an Edge per route entry, so duplicate
(from, to) pairs produced overlapping/diverging edges that dagre laid out
as phantom strands stretching off-canvas. Visible at the impl subworkflow
view for routes like preflight_lite -> preflight_lite_gate (when + bare)
and load_work_tree -> \ (when + bare).

Dedupe by (from, to) — first route wins, matching engine semantics.
Highlighted-edge state is already keyed by (from, to) so this is
consistent with the rest of the rendering pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: for-each dive-in nav + PR #113 review nits (#145)

Bundles four follow-ups for PR #113:

* feat(web): "Dive In" affordance on for-each iteration rows so users can
  navigate into a specific iteration's subworkflow context. The detail
  panel for a for_each group now exposes a Layers button per iteration
  that resolves the slot key (`<group>[<key>]`) against
  subworkflowContexts and calls navigateIntoSubworkflow. Resolves the
  reported case where pg_execution_group[0] could not be linked into.

* fix(web/server): tighten _is_proactor_shutdown_race to require the
  deepest traceback frame to originate from asyncio internals (#145 I3).
  Previously the gate fell back to permissive True when the traceback
  was missing or empty, which could swallow unrelated AssertionErrors
  raised by user/workflow code during shutdown.

* docs(engine): explain why _check_interrupt clears the interrupt event
  before raising — single-listener reset semantics, no race, callers
  rely on the post-raise event being clear (#145 S2).

* docs(web): note on resolveSlotPath that newest-wins makes older
  for-each iterations unreachable by stale slot paths (#145 S3).

Tests:
- Updated TestProactorShutdownRace to assert the tightened contract:
  asyncio-framed traceback -> True, non-asyncio frame -> False, no
  traceback -> False. All 11 cases pass.
- 137 web/engine tests pass; ruff check + format clean.

Refs: #113, #145

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address PR #146 review comments

1. _guarded_serve now reuses _is_proactor_shutdown_race gate instead of
   permissive should_exit-only check. Both exception handler sites now
   share the same strict contract: suppress only when the deepest
   traceback frame originates from asyncio internals.

2. Edge dedupe clears the 'when' label when collapsing routes with
   different conditions to the same target, avoiding a misleading
   single-condition label on a multi-condition edge.

3. Added test_guarded_serve_reraises_non_asyncio_assertion_during_shutdown
   to verify the tightened contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Green <dangreen@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 added a commit that referenced this pull request May 5, 2026
…, #121-#123, #125, #129, #130, #131, #139, #141-#144, #146)

CHANGELOG: add 6 newer PRs (#119, #121, #122, #123, #125, #113, #130, #131, #141, #146) to [Unreleased] alongside the previously documented batch.

docs/workflow-syntax.md:
  - Add metadata + instructions fields to the workflow configuration block.
  - Add input_mapping and max_depth to Sub-Workflow Steps; correct stale claims that circular references are rejected and that workflow steps cannot be used in for_each groups.
  - Add 'Sub-workflows in for_each groups' subsection with example.
  - Add JSON stdout auto-parsing note + example to Script Steps output section.
  - Add type-appropriate zero values table to Workflow Inputs.
  - Add 'Workflow Metadata Variables' subsection covering workflow.dir, workflow.file, workflow.name.
  - Update on_start hook context list to include the new workflow.dir/file vars.

docs/cli-reference.md:
  - Document --metadata/-m, --workspace-instructions, and --instructions flags on conductor run.
  - Add 'Metadata and Instructions' examples block.
  - Update conductor validate to describe the new template-reference error/warning checks added in #125.

docs/providers/claude.md, docs/providers/comparison.md:
  - Replace stale 'All models support a 200K token context window' / '200K (all models)' claims with notes that the dashboard now sources context_window_max from each provider's SDK at runtime (#144).

README.md:
  - Refresh the Features list to mention sub-workflow composition, dialog mode, workspace instructions, breadcrumb navigation, and the enhanced validate behavior.
  - Add --metadata, --workspace-instructions, --instructions to the conductor run options table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jrob5756 added a commit that referenced this pull request May 5, 2026
* docs: changelog + doc updates for unreleased PRs (#100, #109-#111, #119, #121-#123, #125, #129, #130, #131, #139, #141-#144, #146)

CHANGELOG: add 6 newer PRs (#119, #121, #122, #123, #125, #113, #130, #131, #141, #146) to [Unreleased] alongside the previously documented batch.

docs/workflow-syntax.md:
  - Add metadata + instructions fields to the workflow configuration block.
  - Add input_mapping and max_depth to Sub-Workflow Steps; correct stale claims that circular references are rejected and that workflow steps cannot be used in for_each groups.
  - Add 'Sub-workflows in for_each groups' subsection with example.
  - Add JSON stdout auto-parsing note + example to Script Steps output section.
  - Add type-appropriate zero values table to Workflow Inputs.
  - Add 'Workflow Metadata Variables' subsection covering workflow.dir, workflow.file, workflow.name.
  - Update on_start hook context list to include the new workflow.dir/file vars.

docs/cli-reference.md:
  - Document --metadata/-m, --workspace-instructions, and --instructions flags on conductor run.
  - Add 'Metadata and Instructions' examples block.
  - Update conductor validate to describe the new template-reference error/warning checks added in #125.

docs/providers/claude.md, docs/providers/comparison.md:
  - Replace stale 'All models support a 200K token context window' / '200K (all models)' claims with notes that the dashboard now sources context_window_max from each provider's SDK at runtime (#144).

README.md:
  - Refresh the Features list to mention sub-workflow composition, dialog mode, workspace instructions, breadcrumb navigation, and the enhanced validate behavior.
  - Add --metadata, --workspace-instructions, --instructions to the conductor run options table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: bump version to 0.1.11 and changelog #148

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants