Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Simplify tracing in exec plan #33737

Closed
westonpace opened this issue Jan 18, 2023 · 0 comments · Fixed by #33738
Closed

[C++] Simplify tracing in exec plan #33737

westonpace opened this issue Jan 18, 2023 · 0 comments · Fixed by #33738

Comments

@westonpace
Copy link
Member

westonpace commented Jan 18, 2023

Describe the enhancement requested

The old tracing model starts a span when a node starts and ends the span when the node marks itself finished. Some nodes start an additional InputReceived span with the above mentioned span as parent. This makes it rather difficult to tell where time is actually being spent because large blocks of the span represent idle time. It does not accurately reflect time spent.

I've would like to change the model to use async scheduler tasks as spans. In practice, this would mean that there is now a span per fragment. It may have child spans for each of the nodes that runs on the fragment (simple nodes may just mark their execution as an event). This also will allow us to get rid of the ExecNode::finsihed_ future as they are no longer really necessary (they can show up as "waiting for finish" spans that don't really provide any useful information).

Component(s)

C++

@westonpace westonpace self-assigned this Jan 18, 2023
westonpace added a commit that referenced this issue Jan 23, 2023
This PR does two things.  First, it requires that all "tasks" (for the AsyncTaskScheduler, not the executor) have a name.  Second, it simplifies and cleans up the way that exec nodes report their tracing using a `TracedNode` helper class.

* Closes: #33737

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
@westonpace westonpace added this to the 12.0.0 milestone Jan 23, 2023
sjperkins pushed a commit to sjperkins/arrow that referenced this issue Feb 10, 2023
This PR does two things.  First, it requires that all "tasks" (for the AsyncTaskScheduler, not the executor) have a name.  Second, it simplifies and cleans up the way that exec nodes report their tracing using a `TracedNode` helper class.

* Closes: apache#33737

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant