Skip to content

fix: while/do-while loop condition reads stale iteration-0 step output#2662

Merged
mnriem merged 8 commits into
github:mainfrom
mnriem:fix/2592-while-loop-stale-condition
May 21, 2026
Merged

fix: while/do-while loop condition reads stale iteration-0 step output#2662
mnriem merged 8 commits into
github:mainfrom
mnriem:fix/2592-while-loop-stale-condition

Conversation

@mnriem
Copy link
Copy Markdown
Collaborator

@mnriem mnriem commented May 21, 2026

Summary

Fixes #2592while/do-while loop conditions always read the iteration-0 step output, ignoring updates from subsequent iterations.

Root cause

The engine namespaces nested step IDs per iteration (e.g., my-loop:my-gate:1) to preserve per-iteration history. However, the condition resolver (evaluate_condition_resolve_dot_path) looks up step output by the original unprefixed key (steps.my-gate), which is only written during iteration 0 and never updated afterward.

Fix

Keep namespaced step IDs for execution (preserving unique log entries, state keys, and on_step_start callback IDs per iteration), but execute each body step individually and immediately alias its result back to the unprefixed key after it completes successfully. This ensures:

  • evaluate_condition() sees the latest iteration's output
  • Inter-step references within multi-step loop bodies work correctly (step B sees step A's output from the current iteration, not a stale previous one)
  • Namespaced history entries are preserved for debugging (parent:child:1, parent:child:2, etc.)
  • Pause/fail is checked after each step — aliasing only occurs for successfully completed steps
  • Fully backward-compatible — same namespaced key format, same log IDs, no changes to condition syntax or workflow YAML

Changes

  • src/specify_cli/workflows/engine.py — Execute loop body steps one at a time with namespaced IDs, aliasing each result back to the unprefixed key immediately after successful completion
  • tests/test_workflows.py — 5 new engine-level regression tests in TestWorkflowEngine:
    • test_while_loop_condition_reads_latest_iteration (positive: loop stops early)
    • test_do_while_loop_condition_reads_latest_iteration (positive: loop stops early)
    • test_while_loop_runs_to_max_when_condition_stays_true (negative: loop exhausts max_iterations)
    • test_do_while_loop_runs_to_max_when_condition_stays_true (negative: loop exhausts max_iterations)
    • test_while_loop_multi_step_body_inter_step_refs (multi-step body: step B sees step A's current-iteration output)

Testing

All 2998 existing + new tests pass (2998 passed, 1 skipped, 0 failures).

After executing namespaced loop body steps, copy each iteration's
results back to the original unprefixed step key so that
evaluate_condition() sees the latest values instead of stale
iteration-0 data.

Fixes github#2592
Copilot AI review requested due to automatic review settings May 21, 2026 14:15
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a workflow-engine bug where while / do-while loop conditions kept reading stale iteration-0 nested-step outputs by copying the latest namespaced nested-step results back onto the original (unprefixed) step IDs after each loop iteration.

Changes:

  • Update loop execution in the workflow engine to map namespaced nested-step IDs back to original IDs so condition evaluation sees the latest outputs.
  • Add regression tests covering early loop termination and max-iteration exhaustion for both while and do-while.
Show a summary per file
File Description
src/specify_cli/workflows/engine.py Copies namespaced loop-iteration nested-step results back to the original step IDs so conditions read the latest values.
tests/test_workflows.py Adds regression tests ensuring loop conditions observe updated nested-step outputs across iterations.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 5

Comment thread tests/test_workflows.py Outdated
Comment thread tests/test_workflows.py Outdated
Comment thread tests/test_workflows.py Outdated
Comment thread tests/test_workflows.py Outdated
Comment thread src/specify_cli/workflows/engine.py Outdated
- Rewrite shell scripts in tests to use Python via script files
  instead of POSIX syntax, so they pass on Windows CI.
- Snapshot iteration-0 nested-step results under a namespaced key
  (parent:child:0) before the first copy-back overwrite, preserving
  complete per-iteration history for debugging.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread src/specify_cli/workflows/engine.py Outdated
Move the status check before the copy-back so that partial results
from paused or failed nested steps (e.g., a gate awaiting input)
do not overwrite the unprefixed key. This preserves correct resume
behavior.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 4

Comment thread tests/test_workflows.py
Comment thread tests/test_workflows.py
Comment thread tests/test_workflows.py
Comment thread tests/test_workflows.py
Quote both the Python executable and script file paths in the
run: commands to handle spaces in paths on Windows.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread src/specify_cli/workflows/engine.py Outdated
Instead of namespacing step IDs for execution and copying results
back, execute the loop body with original (unprefixed) step IDs so
results naturally land at the right keys.  Snapshot previous
iteration results to namespaced keys (parent:child:N) for history
only.

This fixes multi-step loop bodies where step B references step A's
output within the same iteration — previously step B would see
stale data until the copy-back ran after the entire iteration.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread src/specify_cli/workflows/engine.py Outdated
Comment thread src/specify_cli/workflows/engine.py Outdated
Revert to namespaced step IDs for execution (preserving unique
log entries and state keys per iteration) but copy each step's
result back to the unprefixed key immediately after it completes.

This preserves backward compatibility (same namespaced key format,
same log IDs) while fixing both the condition evaluation bug and
inter-step references within multi-step loop bodies.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread src/specify_cli/workflows/engine.py Outdated
Comment thread tests/test_workflows.py
Copy link
Copy Markdown

@doquanghuy doquanghuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mnriem! I checked out the branch locally and ran the new regression tests — all 4 pass. The fix logic is sound: per-step aliasing of the namespaced ID back to the unprefixed key cleanly handles the condition-resolver gap I reported in #2592.

A few observations for consideration, none are blockers:

1. Test placement (worth fixing)

The 4 new tests are inserted into the TestWorkflowValidation class (at lines 1892, 1951, 1996, 2047 — TestWorkflowValidation spans lines 1208 → ~2099 in this PR's view), but they exercise engine execution behaviour, not validation. Future maintainers grepping for loop tests in TestWhileStep (line 910) or TestDoWhileStep (line 965) won't find them, and the file's validation-vs-execution boundary gets muddied.

A small move to those existing classes (or a new TestLoopIterationOutput adjacent to them) keeps the file organisation clean.

2. Coverage of a claim in the PR description

The PR description states:

Inter-step references within multi-step loop bodies work correctly (step B sees step A's output from the current iteration, not a stale previous one)

The logic supports this — the per-step for ns in result.next_steps aliases after each step before the next runs — but no test exercises a multi-step body. A small steps: [stepA, stepB] test where stepB reads {{ steps.stepA.output.stdout }} would lock that contract explicitly.

3. The canonical bug-report scenario goes untested

My issue's reproducer used a gate step inside the loop (pause-resume across iterations). The PR's tests use only shell steps, so the pause-then-resume path through the new aliasing logic isn't exercised. Not a regression risk (resume re-runs the loop from scratch, so any aliased PAUSED-state value gets overwritten on the next iter-0), but the canonical use case from the bug report goes uncovered. A monkey-patched gate test would lock that no surprises lurk in the pause-then-alias ordering.

4. Minor: PR description vs. code

PR description:

Pause/fail is checked after each step — no aliasing of partial results

But the code does alias before checking status:

self._execute_steps([ns_copy], context, state, registry, step_offset=-1)
if orig and ns_copy["id"] in context.steps:
    context.steps[orig] = context.steps[ns_copy["id"]]
    state.step_results[orig] = context.steps[ns_copy["id"]]
if state.status in (RunStatus.PAUSED, RunStatus.FAILED, RunStatus.ABORTED):
    return

This is fine in practice — _execute_steps writes context.steps[step_id] at engine.py:606 before the pause check at :619, so the alias picks up a valid (if PAUSED-state) result, and resume re-runs the loop from scratch anyway. But the description implies a stricter guarantee than the code. Either tighten the wording or guard the alias with a status check (if state.status not in (PAUSED, FAILED, ABORTED):) — both reach the same end state.

Otherwise, the fix is exactly what #2592 asked for. Thanks for jumping on it!

AI disclosure: drafted with Claude Opus, human-reviewed.

- Move per-step aliasing below the PAUSED/FAILED/ABORTED status
  check so partial results from incomplete steps are not aliased
  back to the unprefixed key.
- Add test_while_loop_multi_step_body_inter_step_refs to exercise
  a multi-step loop body where step B reads step A's output within
  the same iteration, verifying per-step aliasing works correctly.

Addresses feedback from @doquanghuy (items 2 & 4) and Copilot
review on commit 9d0a222.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread src/specify_cli/workflows/engine.py Outdated
Comment thread tests/test_workflows.py Outdated
- Use enumerate() for stable fallback IDs when loop body steps lack
  an explicit id (step-0, step-1, etc. instead of always step-0).
- Rewrite multi-step body test so step B uses expression
  substitution ({{ steps.step-a.output.stdout }}) instead of
  reading the counter file directly, making it a true regression
  test for per-step aliasing.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 0 new

@mnriem mnriem merged commit 616eba6 into github:main May 21, 2026
11 checks passed
@mnriem mnriem deleted the fix/2592-while-loop-stale-condition branch May 21, 2026 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: while/do-while loop condition reads stale iteration-0 step output

3 participants