Skip to content

Conversation

@lionel-
Copy link
Contributor

@lionel- lionel- commented Feb 9, 2026

Branched from #1032 (Merge CommManager into Shell and IOPub)
Closes #1026

This PR is the payoff for the IOPub/Comm unification of #1032. It leverages deterministic message ordering to replace fuzzy message matching with strict sequential assertions in the test infrastructure. It also fixes several bugs made obvious by the new assertions, and removes emission of debug events during auto-stepping so that code injected for breakpoints is invisible to the frontend.

  • Replaces fuzzy matching with strict sequential assertions, making tests both more readable and more accurate. Before:

    frontend.recv_iopub_until(|acc| {
        acc.has_comm_method_count("start_debug", 2) &&
            acc.has_comm_method_count("stop_debug", 2) &&
            acc.saw_idle()
    });

    After (the different counts are due to other fixes, see below):

    frontend.recv_iopub_start_debug();
    frontend.assert_stream_stdout_contains("Called from:");
    frontend.assert_stream_debug_at(&file);
    frontend.recv_iopub_idle();
  • Eliminates workarounds like recv_iopub_busy_skip_streams, recv_iopub_execute_input_skip_streams, recv_iopub_async, and the entire MessageAccumulator / recv_iopub_until pattern

As part of a complete test review I also fixed:

  • Off-by-one line number issue in sourced code (missing initial #line directive)
  • Missing source file assertions on stack frames
  • Superfluous DAP events from auto-stepping. Code injected for breakpoints no longer emits start_debug/stop_debug cycles

Stream buffering

The one remaining source of non-determinism is R's stream output. R may batch or split stdout/stderr messages arbitrarily, independently of the ordering of non-stream messages. The infrastructure around stream assertions was reworked to robustly prevent test flakiness caused by stream buffering.

All recv_iopub_* methods now automatically buffer Stream messages and ignore them. The accumulated streams are now tested/asserted separately via assert_stream_stdout_contains() / assert_stream_stderr_contains(). Stream assertions are still order-sensitive: each match drains the buffer up to that point. But a stray stream can no longer cause other assertions to fail.

recv_iopub_idle() acts as a synchronization point: at that point, we know for sure R has finished streaming all output emitted during the request handling. The method panics if streams were received but not asserted, then clears the buffers.

There must be at least one stream assertion per request handling if output is emitted. Unfortunately this setup still allows unexpected stream output to go unnoticed as long as there is one stream assertion in the test. Due to arbitrary splitting though, I don't think we can do better. Unless we go for full snapshots, but this approach has issues of its own (scrubbing of stateful output, and stability across platforms and versions of R).

Auto-stepping before debug_start()

Previously, auto-stepping through injected breakpoint wrappers happened after debug_start() had already been sent to the frontend. Every breakpoint hit produced visible intermediate events (start_debug, Stopped, Continued, Continued, start_debug, Stopped). Tests had to account for all of these with recv_auto_step_through() and multiple start/stop cycles.

Now maybe_auto_step() is called before debug_start(). If we detect we're at injected code, we immediately return n to R without emitting any debug notifications. The frontend only sees a single start_debug + Stopped at the user expression.

Similarly, when pending_inputs is non-empty (multiple expressions queued), we skip debug_start() for intermediate browser prompts.

The main insight is that we never start a new debug session unless we're ready to close the active execution request, and we never close the active execution request until R has stopped at the final user-visible location.

@lionel- lionel- requested a review from DavisVaughan February 9, 2026 17:22
@lionel- lionel- force-pushed the task/iopub-comm-unification branch from b440eac to 1f2a4a4 Compare February 9, 2026 17:30
@lionel- lionel- force-pushed the task/dap-protocol-tests-sync-update branch 2 times, most recently from 1b60b80 to 8802e70 Compare February 9, 2026 19:47
Copy link
Contributor

@DavisVaughan DavisVaughan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't actually build locally and try out any debugging since this was mostly concerned with test improvements

Comment on lines +6 to 8
#line 1 "file:///test.R"
base::.ark_auto_step(base::.ark_breakpoint(browser(), "file:///test.R", "1"))
#line 1 "file:///test.R"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's supposed to be on line 6 and 8?

serde_json = "1.0.94"
stdext = { path = "../stdext" }
tempfile = "3"
regex = "1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetical order?

Comment on lines +147 to +148
/// On Windows ARM, ZMQ poll with timeout blocks forever instead of
/// respecting the timeout. Use non-blocking poll with manual timing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WAT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows ARM is very worrying. Couldn't find any upstream issues though!

Here is another issue (this time flaky) I just observed:

https://github.com/posit-dev/ark/actions/runs/21896771039/job/63214590132?pr=1037#step:9:864

    failures:
        test_dap_evaluate_in_different_frames

    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 7 filtered out; finished in 6.48s

  stderr ───

    thread 'test_dap_evaluate_in_different_frames' (3508) panicked at crates\ark\tests\dap_evaluate.rs:107:19:
    Failed to receive DAP message: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. (os error 10060)
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

        FAIL [   6.611s] (538/765) ark::dap_evaluate test_dap_evaluate_error

/// Accumulated stderr stream content
stream_stderr: RefCell<String>,
/// Put-back queue for non-stream messages encountered during stream assertions
pending_messages: RefCell<VecDeque<Message>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably say in the name that they are iopub messages

Comment on lines +229 to +232
trace_iopub_msg(&msg);
match msg {
Message::Stream(ref data) => self.buffer_stream(&data.content),
other => self.pending_messages.borrow_mut().push_back(other),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with how this trace_iopub_msg() stuff works or why its useful, but isn't it weird that you can trace twice for some messages?

For a message you put in pending_messages, it is traced here and when you pop it off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same in assert_stream_matches_re

Comment on lines +379 to +380
stdout: std::mem::take(&mut *self.stream_stdout.borrow_mut()),
stderr: std::mem::take(&mut *self.stream_stderr.borrow_mut()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stdout: std::mem::take(&mut *self.stream_stdout.borrow_mut()),
stderr: std::mem::take(&mut *self.stream_stderr.borrow_mut()),
stdout: std::mem::take(&mut self.stream_stdout.borrow_mut()),
stderr: std::mem::take(&mut self.stream_stderr.borrow_mut()),

i think


#[deprecated = "Use assert_stream_stdout_contains() instead"]
#[allow(unused)]
pub fn recv_iopub_stream_stdout(&self, _expect: &str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, should we just remove them then?

Comment on lines +424 to +427
// Overriding methods for base DummyFrontend's `recv_iopub_` methods. These
// use `recv_iopub_next()` to auto-skip streams. Question: Maybe this
// behaviour should live in the base dummy frontend? The stream issues are
// likely not R specific.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea having to override things when we also own the upstream DummyFrontend is feeling very complicated.

Thinking about touching any of this code a few months from now is already giving me anxiety haha

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep the only thing that gave me pause is the possibility of an amalthea kernel with more predictable streams? But that seems unlikely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants