[datafusion-cli] Implement average LIST duration for object store profiling #19127

peterxcli · 2025-12-06T09:48:59Z

Which issue does this PR close?

Closes [datafusion-cli] Implement average LIST duration for object store profiling #18138

Rationale for this change

The list operation returns a stream, so it previously recorded duration: None, missing performance insights. Time-to-first-item is a useful metric for list operations, indicating how quickly results start. This adds duration tracking by measuring time until the first item is yielded (or the stream ends).

What changes are included in this PR?

Added TimeToFirstItemStream: A stream wrapper that measures elapsed time from creation until the first item is yielded (or the stream ends if empty).
Updated instrumented_list: Wraps the inner stream with TimeToFirstItemStream to record duration.
Changed requests field: Switched from Mutex<Vec<RequestDetails>> to Arc<Mutex<Vec<RequestDetails>>> to allow sharing across async boundaries (needed for the stream wrapper).
Updated tests: Modified instrumented_store_list to consume at least one stream item and verify that duration is now Some(Duration) instead of None.
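
The wrapper described in this list can be sketched with the standard library only. This is a synchronous analog that wraps an `Iterator` rather than an async `Stream`, so it is not the PR's implementation. `RequestDetails` is trimmed to a single field, and `TimeToFirstItem` and this `instrumented_list` helper are hypothetical names for illustration. It also uses `std::sync::Mutex`, whose `lock()` returns a `Result`, which may differ from the mutex type used in the PR.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Hypothetical, trimmed-down stand-in for the PR's `RequestDetails`.
#[derive(Debug, Default)]
struct RequestDetails {
    duration: Option<Duration>,
}

/// Synchronous analog of `TimeToFirstItemStream`: wraps an iterator and
/// records the time from creation until the first `next()` call returns,
/// whether that call yields an item or signals an empty listing.
struct TimeToFirstItem<I> {
    inner: I,
    start: Instant,
    request_index: usize,
    // Shared with the store so the wrapper can write back after the
    // call that created it has returned.
    requests: Arc<Mutex<Vec<RequestDetails>>>,
    first_item_yielded: bool,
}

impl<I: Iterator> Iterator for TimeToFirstItem<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.inner.next();
        if !self.first_item_yielded {
            self.first_item_yielded = true;
            let elapsed = self.start.elapsed();
            let mut requests = self.requests.lock().unwrap();
            if let Some(request) = requests.get_mut(self.request_index) {
                request.duration = Some(elapsed);
            }
        }
        item
    }
}

/// Registers a request with `duration: None`, then hands back the wrapped
/// iterator that fills the duration in later.
fn instrumented_list<I: Iterator>(
    inner: I,
    requests: &Arc<Mutex<Vec<RequestDetails>>>,
) -> TimeToFirstItem<I> {
    let request_index = {
        let mut guard = requests.lock().unwrap();
        guard.push(RequestDetails::default());
        guard.len() - 1
    };
    TimeToFirstItem {
        inner,
        start: Instant::now(),
        request_index,
        requests: Arc::clone(requests),
        first_item_yielded: false,
    }
}
```

The `Arc` is what lets the wrapper outlive the call that created it while still writing into the same request log, which is the reason given above for changing the `requests` field.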

Are these changes tested?

Yes. The existing test instrumented_store_list was updated to:

Consume at least one item from the stream using stream.next().await
Assert that request.duration.is_some() (previously is_none())

All tests pass, including the updated list test and other instrumented operation tests.

Are there any user-facing changes?

Users with profiling enabled will see duration values for list operations instead of nothing.


alamb · 2025-12-08T21:12:21Z

FYI @BlakeOrth -- are you available to review this PR?

BlakeOrth · 2025-12-08T22:25:27Z

@alamb Yes, I will allocate time for a review.

BlakeOrth

Thank you for taking this on! Overall I think the strategy of wrapping the stream seems sound.

I do have one small concern about how the timing is being tracked; I've left an inline comment with more details.

BlakeOrth · 2025-12-08T23:40:20Z

datafusion-cli/src/object_storage/instrumented.rs

+        if !self.first_item_yielded && poll_result.is_ready() {
+            self.first_item_yielded = true;
+            let elapsed = self.start.elapsed();


I'm somewhat concerned this elapsed calculation could end up generating misleading results in some cases. The concern stems from the fact that self.start is set when the stream is created. While this will probably be pretty accurate in many scenarios, imagine a scenario where something like the following happens:

    let mut list_stream = TimeToFirstItemStream::new(stream, Instant::now(), 0, requests);
    // The stream is created here, but has never been polled because the user has yet
    // to await the stream. However, the "timer" is already running.
    some_long_running_method().await;
    let item = list_stream.next().await.unwrap();

In this case the elapsed duration would effectively be measuring the time of both some_long_running_method() as well as the time it took to yield the first element on the stream.
I'm wondering if we can set self.start once on the first call to poll_next(...) and then set elapsed on the first time an element hits Poll::Ready (as you've already done here) to get more accurate results.
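
A minimal sketch of that suggestion, under stated assumptions: the local `Stream` trait only mirrors the shape of `futures::Stream` so the example builds with the standard library alone, and `LazyTimeToFirstItem`, `CountStream`, `NoopWake`, and `time_to_first_item_after` are hypothetical names, not code from the PR. The timer starts on the first `poll_next` call and stops the first time a poll returns `Poll::Ready`.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Local stand-in with the same shape as `futures::Stream`.
trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

struct LazyTimeToFirstItem<S> {
    inner: S,
    /// `None` until the first poll, so time spent before polling is excluded.
    start: Option<Instant>,
    time_to_first_item: Option<Duration>,
}

impl<S> LazyTimeToFirstItem<S> {
    fn new(inner: S) -> Self {
        Self { inner, start: None, time_to_first_item: None }
    }
}

impl<S: Stream + Unpin> Stream for LazyTimeToFirstItem<S> {
    type Item = S::Item;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let this = self.get_mut();
        // Start the timer on the first poll only.
        let start = *this.start.get_or_insert_with(Instant::now);
        let poll_result = Pin::new(&mut this.inner).poll_next(cx);
        // Stop it the first time the inner stream is ready (item or end).
        if this.time_to_first_item.is_none() && poll_result.is_ready() {
            this.time_to_first_item = Some(start.elapsed());
        }
        poll_result
    }
}

/// Toy stream: pending on the first poll, then yields 0..end.
/// (A real stream would arrange a wake-up before returning `Pending`.)
struct CountStream {
    polled_once: bool,
    next: u32,
    end: u32,
}

impl CountStream {
    fn new(end: u32) -> Self {
        Self { polled_once: false, next: 0, end }
    }
}

impl Stream for CountStream {
    type Item = u32;

    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        let this = self.get_mut();
        if !this.polled_once {
            this.polled_once = true;
            return Poll::Pending;
        }
        if this.next < this.end {
            this.next += 1;
            Poll::Ready(Some(this.next - 1))
        } else {
            Poll::Ready(None)
        }
    }
}

/// Waker that does nothing; the sketch polls by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Creates the stream, waits `delay` before polling it, and returns the
/// recorded time to first item.
fn time_to_first_item_after(delay: Duration) -> Duration {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut stream = LazyTimeToFirstItem::new(CountStream::new(2));

    // Stands in for `some_long_running_method().await`.
    std::thread::sleep(delay);

    // First poll starts the timer and sees `Pending`.
    assert!(Pin::new(&mut stream).poll_next(&mut cx).is_pending());
    // Second poll yields the first item and stops the timer.
    assert_eq!(Pin::new(&mut stream).poll_next(&mut cx), Poll::Ready(Some(0)));

    stream.time_to_first_item.expect("set on first Ready")
}
```

With this shape, the delay before the first poll is not part of the recorded duration, whereas a `start` captured in the constructor would include it.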

BlakeOrth · 2025-12-08T23:58:35Z

datafusion-cli/src/object_storage/instrumented.rs

+            let mut requests = self.requests.lock();
+            if let Some(request) = requests.get_mut(self.request_index) {
+                request.duration = Some(elapsed);
+            }


Based on the current implementation I believe this strategy is currently "safe" (e.g. we won't accidentally modify the duration of a different request, leading to errant data). However, it does rely on the assumption that self.requests never has items removed from the middle of the Vec.

It might be useful to find a place to leave a comment noting that requests should be append-only to make it less likely for this assumption to be broken in the future.
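
To illustrate why the append-only assumption matters, here is a small sketch with a simplified, hypothetical `RequestDetails` (the `path` field and both function names are assumptions for illustration only). It shows a stored index pointing at a different request once an earlier entry is removed.

```rust
use std::time::Duration;

/// Hypothetical, simplified request record.
#[derive(Debug, PartialEq)]
struct RequestDetails {
    path: &'static str,
    duration: Option<Duration>,
}

/// Mirrors the index-based update in the reviewed snippet.
fn record_duration(requests: &mut [RequestDetails], request_index: usize, elapsed: Duration) {
    if let Some(request) = requests.get_mut(request_index) {
        request.duration = Some(elapsed);
    }
}

/// Removes an earlier entry before the LIST duration is recorded, which
/// violates the append-only assumption.
fn demo_index_drift() -> Vec<RequestDetails> {
    let mut requests = vec![
        RequestDetails { path: "get", duration: None },
        RequestDetails { path: "list", duration: None },
        RequestDetails { path: "put", duration: None },
    ];
    // Index captured when the LIST request was pushed.
    let list_index = 1;

    // Everything after position 0 shifts left by one.
    requests.remove(0);

    // The duration now lands on "put" instead of "list".
    record_duration(&mut requests, list_index, Duration::from_millis(5));
    requests
}
```

Appending never shifts existing entries, so indices captured at push time stay valid as long as nothing is removed ahead of them.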

Implement TimeToFirstItemStream for measuring response time in InstrumentedObjectStore.

7b2e597

BlakeOrth reviewed Dec 9, 2025
