test(amber-python): add unit tests for evaluate-expression and retry-current-tuple handlers#4520
Merged
Yicong-Huang merged 6 commits intoApr 27, 2026
Conversation
…current-tuple handlers Two debugger-adjacent worker RPC handlers had no Python coverage: - EvaluateExpressionHandler (frontend "watch variable" feature): 4 cases — return value pass-through, runtime context exposes self/tuple_/input_, context is read fresh on each call (not snapshot at construction), tolerates None tuple/port before any input has arrived. - RetryCurrentTupleHandler (debugger "step over an exception"): 6 cases — chains the current tuple onto the front of the input iterator, resumes USER + EXCEPTION pause types in order, leaves DEBUG_PAUSE alone (so an active session is not silently dropped), no-ops when state is COMPLETED, handles an exhausted remaining iterator. No production code is touched. Async handlers are driven via asyncio.run to avoid pulling in pytest-asyncio. Closes apache#4516
…ndlers Self-review surfaced two gaps: - EvaluateExpressionHandler: when ExpressionEvaluator.evaluate raises (bad syntax, attribute error, etc.), the handler must propagate the exception so the RPC layer can surface it to the frontend, not swallow it. Added test_evaluator_exception_propagates. - RetryCurrentTupleHandler: previously only RUNNING and COMPLETED states were tested. The completion guard is `if not confirm_state( COMPLETED)`, so every other state (PAUSED, READY, UNINITIALIZED) must take the chain+resume path. PAUSED is the most likely real-world entry point — added test_paused_state_still_chains_and_ resumes. 12 cases total now (up from 10). No production code is touched. Refs apache#4516
This was referenced Apr 26, 2026
Yicong-Huang
added a commit
that referenced
this pull request
Apr 26, 2026
…gn_ecm (#4526) ### What changes were proposed in this PR? Fixes the recurring CI flake in `core/runnables/test_main_loop.py::TestMainLoop::test_main_loop_thread_can_align_ecm` and tightens the assertion to verify the actual production priority guarantee. **Why it flaked**: the test put a data tuple and an alignment-completing ECM into `input_queue` back-to-back, then assumed `output_queue.get()` would yield the `DataElement` first followed by the NoOperation reply. `output_queue` is a `LinkedBlockingMultiQueue` whose sub-queues are keyed by channel — control channels at priority 1, data channels at priority 2 (`internal_queue.py:80`). MainLoop produces in this order: data → NoOperation reply, but whether the test pops them in that order depends on whether MainLoop has finished both productions by the time the test calls `.get()`: - Fast machine: only data is in the queue → data first → ✅ - Slow CI: both items queued → priority returns the control reply first → ❌ The production semantics are intentional and correct: - Control to coordinator outranks data on egress (cross-channel priority — priority 1 vs 2 sub-queues). - Within a channel sub-queue, FIFO is preserved (so an ECM forwarded on a data channel stays behind the data tuples on that same channel). **Fix**: the test now expresses the priority semantics directly: 1. Wait until both expected channels have their item in `output_queue`'s sub-queues — the data channel to the downstream worker, and the control channel back to "sender". Sub-queues are added lazily on first put, so the wait safely treats a missing key as size zero. 2. With both items queued, assert the priority pop order: `output_queue.get()` first returns the `DCMElement` (NoOperation reply, control sub-queue priority 1), then the `DataElement` (data sub-queue priority 2). 3. Bounded with a 5s timeout so a regression that drops one of the channels fails with a descriptive message instead of hanging. If a future change flips priorities or routes the NoOperation reply to a different channel, the first `get()` now fails fast with "expected control reply first". No production code change. ### Any related issues, documentation, discussions? Closes #4524. Has been hitting unrelated PRs (#4512, #4520). ### How was this PR tested? Ran 30 consecutive iterations locally — 30 PASS, 0 FAIL. `ruff format --check .` and `ruff check .` clean. ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Code (Opus 4.7)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this PR?
Adds unit tests for the two debugger-adjacent worker RPC handlers that had no Python coverage:
test_evaluate_expression_handler.py(4 cases) — coversEvaluateExpressionHandler.evaluate_python_expression, which backs the frontend's "watch variable" feature. Verifies the evaluator's return is passed through unchanged, the runtime context exposes the executor asself/ current tuple astuple_/ current port id asinput_, the context is read fresh on each call (not snapshot at handler construction), and the handler toleratesNonetuple/port (worker before any input has arrived).test_replay_current_tuple_handler.py(6 cases) — coversRetryCurrentTupleHandler.retry_current_tuple(used by debugger "step over an exception" flows). Verifies it chains the current tuple onto the front of the input iterator, resumesUSER_PAUSE+EXCEPTION_PAUSEin order, does not resumeDEBUG_PAUSE(so an active debugging session is not silently dropped), no-ops when the worker isCOMPLETED, and still chains correctly when the remaining iterator is empty.No production code is touched. Async handlers are driven via
asyncio.runto avoid pulling inpytest-asyncio, matching the pattern from #4510 / #4512.Any related issues, documentation, discussions?
Closes #4516. Same gap pattern as #4509.
How was this PR tested?
ruff format --check .andruff check .clean locally.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.7)