add query metrics for vsf mode #18727

clintropolis · 2025-11-07T01:33:58Z

Description

changes:

* added `canLoadSegments()` method to `SegmentManager` and `SegmentCacheManager`
* added `query/load/batch/time` metric for 'wall time' spent waiting on all segments to load for a query
* added `query/load/time/avg` metric for average per segment load time for a query
* added `query/load/time/max` metric for max per segment load time for a query
* added `query/load/wait/avg` metric for average per segment wait time to begin loading for a query
* added `query/load/wait/max` metric for max per segment wait time to begin loading for a query
* added `query/load/count` metric for number of segments loaded on demand for a query
* added `query/load/bytes/total` metric for amount of data loaded on demand for a query
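
To make the relationship between the per-segment metrics concrete, here is a minimal sketch of how the avg/max/count/bytes values could be derived from per-segment samples. The classes `SegmentLoadSample`, `LoadSummary`, and `LoadMetricsSketch` are hypothetical and are not part of this PR or of Druid; milliseconds are assumed as the time unit. `query/load/batch/time` is left out because it is wall-clock time around the whole batch, not an aggregate of per-segment values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical per-segment measurement; not a Druid class.
final class SegmentLoadSample
{
  final long waitMillis; // time spent waiting before the load began
  final long loadMillis; // time spent loading the segment
  final long bytes;      // bytes loaded on demand for this segment

  SegmentLoadSample(long waitMillis, long loadMillis, long bytes)
  {
    this.waitMillis = waitMillis;
    this.loadMillis = loadMillis;
    this.bytes = bytes;
  }
}

// Hypothetical holder for the aggregated values, one field per metric.
final class LoadSummary
{
  long count;       // query/load/count
  long bytesTotal;  // query/load/bytes/total
  long loadTimeAvg; // query/load/time/avg
  long loadTimeMax; // query/load/time/max
  long waitTimeAvg; // query/load/wait/avg
  long waitTimeMax; // query/load/wait/max
}

final class LoadMetricsSketch
{
  // Aggregates per-segment samples for one query; all fields stay 0 for an empty list.
  static LoadSummary summarize(List<SegmentLoadSample> samples)
  {
    LoadSummary summary = new LoadSummary();
    long loadSum = 0;
    long waitSum = 0;
    for (SegmentLoadSample sample : samples) {
      summary.count++;
      summary.bytesTotal += sample.bytes;
      loadSum += sample.loadMillis;
      waitSum += sample.waitMillis;
      summary.loadTimeMax = Math.max(summary.loadTimeMax, sample.loadMillis);
      summary.waitTimeMax = Math.max(summary.waitTimeMax, sample.waitMillis);
    }
    if (summary.count > 0) {
      // Integer division: averages are truncated to whole milliseconds.
      summary.loadTimeAvg = loadSum / summary.count;
      summary.waitTimeAvg = waitSum / summary.count;
    }
    return summary;
  }

  public static void main(String[] args)
  {
    LoadSummary summary = summarize(Arrays.asList(
        new SegmentLoadSample(5, 10, 100),
        new SegmentLoadSample(15, 30, 300)
    ));
    System.out.println("query/load/count = " + summary.count);
    System.out.println("query/load/bytes/total = " + summary.bytesTotal);
    System.out.println("query/load/time/avg = " + summary.loadTimeAvg);
    System.out.println("query/load/wait/max = " + summary.waitTimeMax);
  }
}
```

Reporting both avg and max per query makes it possible to tell a uniformly slow load apart from a single straggler segment.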


server/src/main/java/org/apache/druid/segment/loading/SegmentCacheManager.java

-  }
+  boolean canLoadSegmentsOnDemand();
+
+  boolean canLoadSegmentOnDemand(DataSegment segment);


jtuglu1 · 2025-11-07T09:06:39Z

While we're at it, do you think we could add an explicit segment scan time metric? That is, the time it takes to fully load a segment from disk into memory (the Segment object)?

query/wait/time is useful, but IMO not granular enough, as it also takes into account other things like:

* Other task runner creation/iteration overhead (Java/heap alloc overhead)
* Processing pool queueing time (this can dominate the latency)

This would help a LOT with understanding various bottlenecks of the system (and help break that kitchen sink query/wait/time metric into more granular parts).

gianm

I think we should change the name of the metric query/load/time. It sounds very "clean" but it's actually the least clean metric in terms of future interpretability. Currently it means something like the amount of time the query was blocked waiting for segments to load. But in the future, we should be chunking and pipelining loading and execution. This will make the meaning of query/load/time increasingly murky.

How about calling it query/load/batch/time? That way, it's up to the execution model to define what a "batch" means and the interpretation of what it means to spend time "loading a batch".

The rest of the metrics generally LGTM.

clintropolis · 2025-11-13T02:35:05Z

> While we're at it, do you think we could add an explicit segment scan time metric? That is, the time it takes to fully load a segment from disk into memory (the Segment object)?

We discussed this a bit offline, but for the sake of visibility for everyone else: query/segment/time is basically this, though it does not separate time spent reading from disk from time spent processing, since the two happen incrementally and interleaved as the query engine reads data and the mmapped file is paged into memory from disk. I'm not sure of the best way to collect that measurement, but it would sure be useful.

Also, to clarify: query/wait/time is the time a segment spent waiting for a slot on the processing pool, before it does any work or even reads anything from the segment. query/segment/time starts where query/wait/time stops and ends when the internal sequence is fully consumed (so in some sense it also depends on how fast the caller consumes the sequence). query/segmentAndCache/time includes the time to check the results cache plus query/segment/time if there are no cached results.
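
As a rough illustration of those boundaries, the sketch below computes the three values from timestamps. The class `SegmentTimingSketch`, its method names, and the timestamp parameters are hypothetical and only mirror the description above; this is not how Druid's query runners are actually structured.

```java
// Hypothetical illustration of the metric boundaries; not Druid code.
final class SegmentTimingSketch
{
  // query/wait/time: from submission to the processing pool until a slot is obtained.
  static long waitTime(long submittedAtMillis, long startedAtMillis)
  {
    return startedAtMillis - submittedAtMillis;
  }

  // query/segment/time: starts where the wait ends and stops when the
  // result sequence has been fully consumed by the caller.
  static long segmentTime(long startedAtMillis, long consumedAtMillis)
  {
    return consumedAtMillis - startedAtMillis;
  }

  // query/segmentAndCache/time: the cache check, plus the segment time
  // only when there were no cached results.
  static long segmentAndCacheTime(long cacheCheckMillis, long segmentTimeMillis, boolean cacheHit)
  {
    return cacheHit ? cacheCheckMillis : cacheCheckMillis + segmentTimeMillis;
  }

  public static void main(String[] args)
  {
    long wait = waitTime(100, 130);
    long segment = segmentTime(130, 200);
    System.out.println("query/wait/time = " + wait);
    System.out.println("query/segment/time = " + segment);
    System.out.println("query/segmentAndCache/time = " + segmentAndCacheTime(5, segment, false));
  }
}
```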

github-actions bot added the Area - Segment Format and Ser/De label Nov 7, 2025

github-advanced-security bot found potential problems Nov 7, 2025


server/src/main/java/org/apache/druid/segment/loading/SegmentCacheManager.java

}

boolean canLoadSegmentsOnDemand();

boolean canLoadSegmentOnDemand(DataSegment segment);

Check notice (Code scanning / CodeQL): Useless parameter

Note: The parameter 'segment' is never used.

gianm reviewed Nov 11, 2025


clintropolis added the jacoco:skip label Nov 12, 2025

clintropolis added 3 commits November 12, 2025 11:31

better metric name

39f538e

Merge remote-tracking branch 'upstream/master' into vsf-query-metrics

dabec8a

another name

f575697

clintropolis merged commit bbc62dc into apache:master Nov 13, 2025
103 of 104 checks passed

clintropolis deleted the vsf-query-metrics branch November 13, 2025 05:32

kgyrtkirk added this to the 36.0.0 milestone Jan 19, 2026
