dataflow: add dataflow rendering to EXPLAIN plans #7301

Closed
2 tasks
Tracked by #13299
maddyblue opened this issue Jul 3, 2021 · 4 comments
Labels
A-optimization Area: query optimization and transformation
C-feature Category: new feature or request

Comments

@maddyblue
Contributor

maddyblue commented Jul 3, 2021

If a view has an index on a column, a point select on that column will be significantly faster, but EXPLAIN does not surface the use of the index. We should add this information to EXPLAIN. In this case, looking at the rendered dataflow will show an index instead of a scan. With this information, users will be better able to improve their queries on their own.
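
For illustration, here is a minimal sketch of the scenario described above. The table, view, and index names are hypothetical, and the exact EXPLAIN syntax may differ between versions, so treat this as a sketch under those assumptions rather than a verified reproduction:

```sql
-- Hypothetical names (t, v, v_a_idx), for illustration only.
CREATE TABLE t (a int, b int);
CREATE VIEW v AS SELECT a, b FROM t;

-- Index the view on the column used by the point select.
CREATE INDEX v_a_idx ON v (a);

-- Point select on the indexed column; this is the query that gets faster.
SELECT * FROM v WHERE a = 1;

-- The plan for the same query. Per this issue, it does not make clear
-- whether v_a_idx is used; the rendered dataflow would.
EXPLAIN SELECT * FROM v WHERE a = 1;
```

The request in this issue is for the last statement to make the index use visible.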

Update: 2022/02/08

We added a rudimentary version of this in #8515. We held off on documenting it because it's very verbose and we expected to follow up sooner. We didn't follow up and it's still verbose, but it has already come in handy a few times, so I'm proposing the following sub-tasks:

  • Rename explain physical plan to explain dataflow plan
  • Add docs under docs/sql/explain following the existing structure (sections "Reading ...", "Operators in", and an extra warning that it's experimental)

There were various discussions in the past about taking a holistic look at explain output, but we didn't find a proper owner (cc @JLDLaughlin @andrioni). I think it's fine to do the minimal things above even if we expect things to change in the future.

maddyblue added the C-feature Category: new feature or request label Jul 3, 2021
uce added this to Needs Triage in Compute Aug 5, 2021
@uce
Contributor

uce commented Aug 5, 2021

This is mostly ready on the dataflow side but requires adding the SQL commands and pretty printing (we probably need to discuss what we actually want in there).

uce moved this from Needs Triage to Icebox in Compute Aug 5, 2021
uce added this to the 1.0 milestone Aug 5, 2021
uce moved this from Icebox to To do in Compute Sep 10, 2021
asenac mentioned this issue Oct 4, 2021
2 tasks
@uce
Contributor

uce commented Nov 3, 2021

The initial work has been done, but we'll keep this open to scope out the follow-up tasks.

aalexandrov added the A-optimization Area: query optimization and transformation label Nov 4, 2021
@aalexandrov
Contributor

@uce: can you add labels for the areas where the outstanding tasks fall? I've tentatively added A-optimization here, but I'm not sure whether there is anything more to be done there.

@aalexandrov
Contributor

A variant of this was done in #13137.


5 participants