-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the output of SQL explain message #11908
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, @LakshSingla !
On the whole, the PR looks good.
There are a few things that need changing/explanation.
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
We can support both the formats simultaneously in Calcite's syntax using
So potentially, |
Commented out the |
@LakshSingla - just a quick question - will this be backward-compatible? |
@dbardbar - will you still need that custom code once this change is merged? This change is trying to solve the same problem of not being able to see the final native query. |
hmm. I guess you would still want to rid of the extra fields such as row signature etc |
@LakshSingla - the native query can be extracted from the response today, but it does require some logic to extract it. Not the prettiest piece of code, but not that complicated. |
@abhishekagarwal87 - for our use-case we want to extract from the native query on the part related to the 'WHERE' clause, so anyway we'll need some basic handling, even with the new code. With the new code, the extraction will be a bit easier. |
This looks nice 👍 Since this changes the output of a query, i'm +1 for adding a feature flag to allow the previous results to be returned. Since this output seems totally better I think it is ok to default it to the new stuff. It might be nice to also add a context parameter so that it could be overridden per query to make it easier for developers to migrate their apps to the new output while debugging without having to set it for the entire cluster. |
In accordance with the above suggestions, I have added a config option in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me overall. just one minor comment.
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
Should the property |
Yes. You can put it in |
Updated the description with pros and cons of the newer approach. |
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
Very excited for this change! I got 2 questions: (1) is it possible to provide the signature as JSON and not as a string? instead of provide it as the signature strings are really annoying to parse and these signatures are really important (sometimes more so than the query) (2) you included screenshots of the old format in the console query view, but how does it work with the explain dialog? Does it need to be updated as part of this PR or right afterwards? Just to be clear I am talking about this dialog: As you can see the current dialog parses the old format thus relying on it. I would LOVE nothing more than to switch to this new format (providing pt.1) but does this PR break this dialog? Ideally there would be a context flag to trigger the new format but the old format would be the default (at least for a few releases). |
it looks like |
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java
Outdated
Show resolved
Hide resolved
Thanks for the comment @vogievetsky
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
#11959 got merged faster than I was expecting 😅, so maybe consider the rename/inversion when fixing up conflicts
docs/configuration/index.md
Outdated
@@ -1827,7 +1827,7 @@ The Druid SQL server is configured through the following properties on the Broke | |||
|`druid.sql.planner.metadataSegmentCacheEnable`|Whether to keep a cache of published segments in broker. If true, broker polls coordinator in background to get segments from metadata store and maintains a local cache. If false, coordinator's REST API will be invoked when broker needs published segments info.|false| | |||
|`druid.sql.planner.metadataSegmentPollPeriod`|How often to poll coordinator for published segments list if `druid.sql.planner.metadataSegmentCacheEnable` is set to true. Poll period is in milliseconds. |60000| | |||
|`druid.sql.planner.authorizeSystemTablesDirectly`|If true, Druid authorizes queries against any of the system schema tables (`sys` in SQL) as `SYSTEM_TABLE` resources which require `READ` access, in addition to permissions based content filtering.|false| | |||
|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|false| | |||
|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|true| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the default has been flipped, I wonder if we should invert this setting and call something like `useJsonStringExplain" or something similar that defaults to false. (I just did a similar thing when swapping the default to keep old behavior in #11184, swapping a "use legacy" to "use new thing", which I think maybe is better since is a bit odd for "use legacy" to be the default of something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If in the future, the newer explain plan is to be used, then I think useLegacyDruidExplain
would work better, since we won't have to change the feature flag then. But it does look weird right now, since it hasn't been deprecated yet. I am fine with keeping it either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The choice is between legacy vs new plan so from that the flag name does sound good to me.
Just merged in the changes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice 👍
Minor comment:
which frees the referenced code from usage of |
I think that the current approach is fine since the new approach means creating a new type. If in future, we do similar deserialization in other places, we can use a custom type specifically for explain output. |
Thank you @LakshSingla. I have merged your change. |
This is a follow up to the PR #11908. This fixes the bug in top level union all queries when there are more than 2 SQL subqueries are present.
Description
Currently, when we try to do
EXPLAIN PLAN FOR
, it returns the structure of the SQL parsed (via Calcite's internal planner util), which is verbose (since it tries to explain about the nodes in the SQL, instead of the Druid Query), and not representative of the native Druid query which will get executed on the broker side.This PR aims to change the format when user tries to
EXPLAIN PLAN FOR
for queries which are executed by converting them into Druid's native queries (i.e. notsys
schemas).The explanation now will be a list of columns with information about
resources
(no change) andexplanation
.Explanation is a string representing an array of queries & their signatures (in JSON format). Final shape of the
explanation
will look like:Examples:
Query Shape:
EXPLAIN PLAN FOR ( SELECT dim1, dim2 FROM druid.foo )
BEFORE
Output in console
Plan (expanded)
AFTER
Query Shape:
EXPLAIN PLAN FOR ( SELECT dim1 FROM druid.foo UNION ALL SELECT dim2 FROM druid.foo2)
BEFORE
Output in console
Plan (expanded)
AFTER
Some parts truncated
Query Shape:
EXPLAIN PLAN FOR ( SELECT a.dim1, COUNT(*) FROM druid.foo a INNER JOIN ( SELECT dim1, dim2 FROM druid.foo UNION ALL SELECT dim1, dim2 FROM druid.foo2 ) b ON a.dim1 = b.dim1 WHERE a.dim1 = 1.0 b.dim1 = 2.0 GROUP BY a.dim1 )
BEFORE
Output in console
Plan(expanded)
AFTER
Some parts truncated
The older format vs the newer format
Older Format:
RelNodes
formed from the SQL statement. This could help in understanding how the given query was arrangedNewer Format
EXPLAIN PLAN FOR
semantics of other databases like SQL/Oracle etc. (Can be said for the older version as well)We are using a external context flag to switch between legacy and native mode, and not relying on
EXPLAIN PLAN FOR ... AS JSON
since the latter should ideally only vary the format of the EXPLAIN PLAN and not the output itself (which is being done here).Key changed/added classes in this PR
DruidPlanner#planExplanation
.RowSignatureTest
for serializabilityThis PR has: