feat!: track cumulative wall time in analyze plan #5505
Conversation
This adds a cumulative_wall metric to execution stats. This is useful for more clearly understanding the execution times experienced by the user. Unlike existing timers, these include IO. When IO is performed across children in parallel, the computed time range is based on the collective minimum and maximum bounds of the children.
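The parallel-children merging rule described above can be sketched roughly as follows. This is an illustrative sketch with hypothetical names (`TimeRange`, `cumulative_wall`), not the actual Lance implementation: the point is that overlapping IO windows are merged by taking the collective minimum start and maximum end, rather than summed.

```rust
use std::time::Duration;

/// A node's observed execution window, in nanoseconds from a common origin.
/// (Hypothetical type for illustration.)
struct TimeRange {
    start_ns: u64,
    end_ns: u64,
}

/// Merge the windows of children that executed in parallel: the cumulative
/// wall time spans from the earliest child start to the latest child end,
/// so concurrent IO is not double-counted the way summing elapsed times
/// per child would.
fn cumulative_wall(children: &[TimeRange]) -> Option<Duration> {
    let min_start = children.iter().map(|r| r.start_ns).min()?;
    let max_end = children.iter().map(|r| r.end_ns).max()?;
    Some(Duration::from_nanos(max_end.saturating_sub(min_start)))
}
```

For example, two children running in parallel over the windows [0, 10] and [5, 16] contribute 16ns of wall time, not 21ns.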
I have a simple script here: On this code it prints this: Prior to this commit it would print this:

The problem is that this doesn't actually convey that the query took 16s, which is really important information for analysis. If there is appetite for it, I would like to go a step further and replace "cumulative_cpu" with "cumulative_wall" completely. We could even consider calling it just "elapsed", as it would be pretty clear from context: (hand edited)

There are a couple of issues in this plan I am still hunting down; I need to understand why the timings don't add up completely. But overall this is more compact than what we have today, and it indicates much more clearly to the user that the time is in LanceRead. The headline number is also the timing experienced by the user, so nobody experiences a 16s query and sees 452ms in the reporting.
Updated to remove cumulative_cpu and reorder the formatting. Here is the new plan:
Marked as breaking due to the removal of cumulative_cpu. However, I doubt there is a real API commitment around the stability of plan representations, and if so, we can remove the marker.
Here is a more interesting plan, for hybrid search:
```rust
for child in plan.children() {
    cumulative_cpu += self.calculate_cumulative_cpu(child);
    let child_metrics = self.calculate_metrics(child);
    min_start = Self::min_option(min_start, child_metrics.min_start);
}
```
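The `Self::min_option` helper referenced in the snippet presumably combines two optional timestamps while treating `None` as "no observation yet" rather than as zero. A minimal sketch of such a helper (assumed behavior; the actual signature in the PR may differ):

```rust
/// Return the smaller of two optional values, ignoring `None` instead of
/// letting it win the comparison. (Sketch of an assumed helper.)
fn min_option<T: Ord>(a: Option<T>, b: Option<T>) -> Option<T> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, None) => x,
        (None, y) => y,
    }
}
```

A matching `max_option` for end times would mirror this with `x.max(y)`.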
Is there a real possibility that the child start time could be before the parent?
I wouldn't generally expect it.
Cumulative elapsed time does seem important to report. I guess cumulative CPU may not be as important, since the mental workflow is more like "X took a lot of time" -> "why is that?" -> "there was a lot of CPU on step Y", which cumulative CPU doesn't help that much with.
@cmccabe yeah, that's my thought too. I think usually the reason you're looking at one of these things is that you experienced a slow wall time, so the wall time will generally lead you to the node you want to focus on. At that point the elapsed CPU (and potentially the cumulative elapsed time of the subplan) is useful, but I didn't want to clutter the whole tree with two cumulative timers.
BREAKING CHANGE: removes cumulative_cpu metrics from analyze plan output.
This PR also removes the timer "cumulative_cpu", to reduce the number of items presented across all operators.