
Improve the output of SQL explain message #11908

Merged · 22 commits · Nov 25, 2021

Conversation


@LakshSingla commented Nov 11, 2021

Description

Currently, EXPLAIN PLAN FOR returns the structure of the parsed SQL (via Calcite's internal planner utilities). That output is verbose (it describes the relational nodes derived from the SQL rather than the Druid query) and is not representative of the native Druid query that will actually be executed on the Broker side.

This PR changes the EXPLAIN PLAN FOR output for queries that are executed by converting them into Druid's native queries (i.e., everything except sys schema queries).

The explanation is now a list of columns with information about resources (unchanged) and an explanation.
The explanation is a string representing a JSON array of the native queries and their output signatures. The final shape of the explanation looks like:

[
  { 
    "query" : <native_druid_query>,
    "signature": <signature of the query>
  },
  {
    "query" : <native_druid_query>,
    "signature": <signature of the query>
  }
]
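As a rough illustration (not part of this PR), the fixed JSON shape above makes the explanation easy to consume programmatically. A minimal Python sketch; the payload below is a hypothetical placeholder mirroring the shape above, not real Druid output:

```python
import json

# Sample EXPLAIN output in the new format; hypothetical placeholder
# values mirroring the [{"query": ..., "signature": ...}] shape above.
plan_text = """
[
  {
    "query": {"queryType": "scan", "dataSource": {"type": "table", "name": "foo"}},
    "signature": "{dim1:STRING, dim2:STRING}"
  }
]
"""

entries = json.loads(plan_text)
for entry in entries:
    native_query = entry["query"]   # the native Druid query object
    signature = entry["signature"]  # the output row signature
    print(native_query["queryType"], signature)
```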

Examples:

  1. Simple query
    Query Shape:
    EXPLAIN PLAN FOR ( SELECT dim1, dim2 FROM druid.foo )

BEFORE

(screenshots omitted: console output and expanded plan)

AFTER

[
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
      "virtualColumns": [],
      "resultFormat": "compactedList",
      "batchSize": 20480,
      "order": "none",
      "filter": null,
      "columns": [
        "dim1",
        "dim2"
      ],
      "legacy": false,
      "context": {
        "defaultTimeout": 300000,
        "maxScatterGatherBytes": 9223372036854776000,
        "sqlCurrentTimestamp": "2000-01-01T00:00:00Z",
        "sqlQueryId": "dummy",
        "vectorize": "false",
        "vectorizeVirtualColumns": "false"
      },
      "descending": false,
      "granularity": {
        "type": "all"
      }
    },
    "signature": "{dim1:STRING, dim2:STRING}"
  }
]
  2. UNION ALL which generates multiple native queries
    Query Shape:
    EXPLAIN PLAN FOR ( SELECT dim1 FROM druid.foo UNION ALL SELECT dim2 FROM druid.foo2)

BEFORE

(screenshots omitted: console output and expanded plan)

AFTER

Some parts truncated

[
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
...
      }
    },
    "signature": "{dim1:STRING}"
  },
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo2"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
    ...
      }
    },
    "signature": "{dim2:STRING}"
  }
]
  3. JOIN on a Table datasource and a Union query
    Query Shape:
    EXPLAIN PLAN FOR ( SELECT a.dim1, COUNT(*) FROM druid.foo a INNER JOIN ( SELECT dim1, dim2 FROM druid.foo UNION ALL SELECT dim1, dim2 FROM druid.foo2 ) b ON a.dim1 = b.dim1 WHERE a.dim1 = 1.0 OR b.dim1 = 2.0 GROUP BY a.dim1 )

BEFORE

(screenshots omitted: console output and expanded plan)

AFTER

Some parts truncated

[
  {
    "query": {
      "queryType": "groupBy",
      "dataSource": {
        "type": "join",
        "left": {
          "type": "table",
          "name": "foo"
        },
        "right": {
          "type": "query",
          "query": {
            "queryType": "scan",
            "dataSource": {
              "type": "union",
              "dataSources": [
                {
                  "type": "table",
                  "name": "foo4"
                },
                {
                  "type": "table",
                  "name": "foo"
                }
              ]
            },
            "intervals": {
              "type": "intervals",
              "intervals": [
                "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
              ]
            },
...
            },
            "descending": false,
            "granularity": {
              "type": "all"
            }
          }
        },
        "rightPrefix": "j0.",
        "condition": "(\"dim1\" == \"j0.dim1\")",
        "joinType": "INNER",
        "leftFilter": null
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
      "virtualColumns": [],
      "filter": {
        "type": "or",
     ...
          }
        ]
      },
      "granularity": {
        "type": "all"
      },
      "dimensions": [
        {
          "type": "default",
          "dimension": "dim1",
          "outputName": "d0",
          "outputType": "STRING"
        }
      ],
      "aggregations": [
        {
          "type": "count",
          "name": "a0"
        }
      ],
...
      "descending": false
    },
    "signature": "{d0:STRING, a0:LONG}"
  }
]

The older format vs the newer format

Older format:

  1. Gave the structure of the RelNodes formed from the SQL statement, which could help in understanding how the given query was arranged.
  2. Was verbose and repetitive. See the examples above.

Newer format:

  1. Gives the native queries as-is, and therefore a clearer understanding of what will actually run under the hood.
  2. Has a fixed JSON structure that can be parsed easily.
  3. Doesn't match the EXPLAIN PLAN FOR semantics of other databases such as Oracle (though the same can be said of the older version).
  4. Since this changes a public-facing default, users relying on the structure of the old format will need to update their applications.

We are using an external context flag to switch between legacy and native modes, rather than relying on EXPLAIN PLAN FOR ... AS JSON, since the latter should ideally only change the format of the explain plan, not its content (which is what changes here).


Key changed/added classes in this PR
  • Modified the implementation of DruidPlanner#planExplanation.
  • Made RowSignature JSON-serializable.
  • Added RowSignatureTest for serializability.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@kfaraz left a comment:

Thanks for the changes, @LakshSingla !
On the whole, the PR looks good.

There are a few things that need changing/explanation.

@LakshSingla commented Nov 11, 2021:

We can support both formats simultaneously in Calcite's syntax using SqlExplainLevel and SqlExplain.Depth. From Calcite's docs:

explain:
      EXPLAIN PLAN
      [ WITH TYPE | WITH IMPLEMENTATION | WITHOUT IMPLEMENTATION ]
      [ EXCLUDING ATTRIBUTES | INCLUDING [ ALL ] ATTRIBUTES ]
      [ AS JSON | AS XML | AS DOT ]
      FOR { query | insert | update | merge | delete }

So potentially,
EXPLAIN PLAN FOR {query} could give the older output, and, say, EXPLAIN PLAN EXCLUDING ATTRIBUTES FOR {query} the newer one. (This can be altered to whatever makes more sense semantically.)

@LakshSingla marked this pull request as ready for review November 11, 2021 20:15
@LakshSingla:

Commented out the explanation of the original tests in case we decide to follow up with the above suggestion. Will remove it if not required.

@dbardbar:

@LakshSingla - just a quick question - will this be backward-compatible?
We have a use case where we use EXPLAIN to convert SQL to native queries automatically, so I'm wondering whether I'll need to adapt my code to the newly proposed format.

@abhishekagarwal87:

@dbardbar - will you still need that custom code once this change is merged? This change is trying to solve the same problem of not being able to see the final native query.

@abhishekagarwal87:

Hmm. I guess you would still want to get rid of the extra fields such as the row signature, etc.

@dbardbar:

@LakshSingla - the native query can be extracted from the response today, but it does require some logic to extract it. Not the prettiest piece of code, but not that complicated.
If I understand your proposal correctly, it seems like it will make our lives easier and simplify our parser, but making a breaking change does have its drawbacks.
If your new code returns the new response only based on some new flag (global, or on the request), then that would be great.

@dbardbar:

@abhishekagarwal87 - for our use case we want to extract the part of the native query related to the WHERE clause, so we'll need some basic handling anyway, even with the new code. With the new code, the extraction will be a bit easier.

@clintropolis:

This looks nice 👍

Since this changes the output of a query, i'm +1 for adding a feature flag to allow the previous results to be returned. PlannerConfig would probably be the most appropriate place, maybe something like druid.sql.useLegacyDruidExplain?

Since this output seems totally better I think it is ok to default it to the new stuff. It might be nice to also add a context parameter so that it could be overridden per query to make it easier for developers to migrate their apps to the new output while debugging without having to set it for the entire cluster.

@LakshSingla commented Nov 18, 2021:

In accordance with the above suggestions, I have added a config option to PlannerConfig that allows switching between the explain plan outputs. It can also be overridden on a per-query basis.
By default, the newer output is shown. This default can be changed by setting druid.sql.planner.useLegacyDruidExplain = true (the default is false).
Irrespective of the default set in the properties, the explain plan output can also be changed per query by setting useLegacyDruidExplain to true or false in the query's context.
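As a sketch of how the per-query override above might look in a request body (assuming Druid's standard SQL HTTP API shape with top-level `query` and `context` fields; the query text is illustrative):

```python
import json

# Request body for Druid's SQL endpoint (POST /druid/v2/sql).
# The context key below overrides the cluster-wide
# druid.sql.planner.useLegacyDruidExplain default for this query only.
request_body = {
    "query": "EXPLAIN PLAN FOR SELECT dim1, dim2 FROM druid.foo",
    "context": {
        "useLegacyDruidExplain": False,  # request the new JSON-native output
    },
}

payload = json.dumps(request_body)
print(payload)
```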

@abhishekagarwal87 left a comment:

Looks good to me overall. Just one minor comment.

@LakshSingla:

Should the property druid.sql.planner.useLegacyDruidExplain and the overriding context key be documented somewhere?

@abhishekagarwal87:

Yes. You can put it in docs/querying/query-context.md

@LakshSingla:

Updated the description with pros and cons of the newer approach.

@vogievetsky commented Nov 24, 2021:

Very excited for this change! I've got two questions:

(1) is it possible to provide the signature as JSON and not as a string?

instead of "signature": "{d0:STRING, a0:LONG}"

provide it as "signature": [{"name": "d0", "type": "STRING"}, {"name": "a0", "type": "LONG"}]

the signature strings are really annoying to parse and these signatures are really important (sometimes more so than the query)
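To illustrate that parsing burden, a rough Python sketch converting the legacy signature string into the structured form proposed above; `parse_signature` is a hypothetical helper (not Druid code) and assumes the flat `{name:TYPE, ...}` shape from this PR's examples — complex or nested types would need more care:

```python
def parse_signature(sig: str):
    """Parse a legacy signature string like "{d0:STRING, a0:LONG}"
    into a list of {"name": ..., "type": ...} dicts.
    Assumes the simple flat shape shown in this PR's examples."""
    inner = sig.strip().lstrip("{").rstrip("}")
    columns = []
    for part in inner.split(","):
        name, _, col_type = part.strip().partition(":")
        columns.append({"name": name, "type": col_type})
    return columns

print(parse_signature("{d0:STRING, a0:LONG}"))
# → [{'name': 'd0', 'type': 'STRING'}, {'name': 'a0', 'type': 'LONG'}]
```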

(2) you included screenshots of the old format in the console query view, but how does it work with the explain dialog? Does it need to be updated as part of this PR or right afterwards?

Just to be clear, I am talking about this dialog (screenshots of the Explain dialog omitted).

As you can see, the current dialog parses the old format and thus relies on it. I would LOVE nothing more than to switch to this new format (provided pt. 1), but does this PR break this dialog? Ideally there would be a context flag to trigger the new format, but the old format would remain the default (at least for a few releases).

@clintropolis:

it looks like RowSignature is made JSON friendly in https://github.com/apache/druid/pull/11959/files#diff-efdda11ff1dc815218691ec3a5c8d487bd01cd5c297de95690b7b83cea1eb5b8R82 so might want to do that in a follow-up after #11959 goes in.

@LakshSingla:

Thanks for the comment @vogievetsky

  1. As mentioned by @clintropolis, Gian's PR should make RowSignature serializable. Until then, I have added a temporary JsonValue method that lets Jackson serialize it. I have kept the output of my change similar to Gian's, so ideally no test cases need to be updated and the DruidPlanner code works without tweaks. (This will introduce some merge conflicts in RowSignature and RowSignatureTest with SQL INSERT planner support #11959, but resolving them should be simple: discard the older changes in favor of the new ones.)
  2. I haven't tested it, but the new output would break the console. I have changed the default behavior to show the legacy output (i.e., the original behaviour), so no breaking changes are introduced in this PR and there is no immediate need to update the UI code.

@clintropolis left a comment:

👍

#11959 got merged faster than I was expecting 😅, so maybe consider the rename/inversion when fixing up conflicts

@@ -1827,7 +1827,7 @@ The Druid SQL server is configured through the following properties on the Broke
|`druid.sql.planner.metadataSegmentCacheEnable`|Whether to keep a cache of published segments in broker. If true, broker polls coordinator in background to get segments from metadata store and maintains a local cache. If false, coordinator's REST API will be invoked when broker needs published segments info.|false|
|`druid.sql.planner.metadataSegmentPollPeriod`|How often to poll coordinator for published segments list if `druid.sql.planner.metadataSegmentCacheEnable` is set to true. Poll period is in milliseconds. |60000|
|`druid.sql.planner.authorizeSystemTablesDirectly`|If true, Druid authorizes queries against any of the system schema tables (`sys` in SQL) as `SYSTEM_TABLE` resources which require `READ` access, in addition to permissions based content filtering.|false|
-|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|false|
+|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|true|
@clintropolis:

Since the default has been flipped, I wonder if we should invert this setting and call it something like `useJsonStringExplain`, defaulting to false. (I just did a similar thing when swapping a default to keep old behavior in #11184, turning a "use legacy" flag into a "use new thing" flag, which I think is better, since it's a bit odd for "use legacy" to be the default of something.)

@LakshSingla:

If, in the future, the newer explain plan becomes the default again, then I think useLegacyDruidExplain would work better, since we won't have to change the feature flag. But it does look weird right now, since the legacy format hasn't been deprecated yet. I am fine with keeping it either way.

@abhishekagarwal87 commented Nov 25, 2021:

I agree. The choice is between the legacy and the new plan, so the flag name does sound good to me.

@LakshSingla:

Just merged in the changes!

@clintropolis left a comment:

nice 👍

@LakshSingla:

Minor comment:
Currently, serializing the list of queries to JSON is done by converting it into an instance of ArrayNode, since a simple List<Map<String, Object>> was losing the queryType information (referenced code, Stack Overflow). I have recently found a way to avoid this by creating an inner class:

private static class ExplainOutputNode
{
  @JsonProperty
  final Query<?> query;

  @JsonProperty
  final RowSignature signature;

  public ExplainOutputNode(RowSignature signature, Query<?> query)
  {
    this.signature = signature;
    this.query = query;
  }
}

String outputString = jsonMapper
    .writerFor(new TypeReference<List<ExplainOutputNode>>() {})
    .writeValueAsString(queryList);

which frees the referenced code from using ArrayNode and ObjectNode.
Should that be the preferred approach over the current one?

@abhishekagarwal87:

> (quoting @LakshSingla's ExplainOutputNode comment above)

I think the current approach is fine, since the new approach means creating a new type. If, in the future, we do similar serialization in other places, we can use a custom type specifically for explain output.

@abhishekagarwal87 merged commit c381cae into apache:master Nov 25, 2021
@abhishekagarwal87:

Thank you @LakshSingla. I have merged your change.

abhishekagarwal87 pushed a commit that referenced this pull request Dec 1, 2021
…2009)

This is the UI follow-up to the work done in #11908

Updated the Explain dialog to use the new output format.
abhishekagarwal87 pushed a commit that referenced this pull request Dec 6, 2021
This is a follow-up to PR #11908. It fixes a bug in top-level UNION ALL queries when more than two SQL subqueries are present.
@abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022
@LakshSingla deleted the sql-explain branch March 22, 2024 16:33