Add SQL query planning time metric #12923

Merged
1 commit merged into apache:master on Aug 22, 2022

Conversation

rohangarg
Member

Adds a sqlQuery/planningTimeMs metric per SQL query, which measures the time taken to build a native query from the SQL query.
When comparing the performance of the native and SQL APIs for the same query, users tend to check the extra time taken by the SQL planning stage. Currently, that time is estimated as the difference between running the SQL query and then running the native query obtained via EXPLAIN; to find the exact difference, the query/time metrics of the SQL and native queries are compared.
Further, for SQL query workloads there is no easy visibility into the overall planning time.
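
To make the idea concrete, below is a minimal, self-contained Java sketch (illustrative only, not the actual patch; the class and helper names are made up): the planning step is timed in nanoseconds and reported, alongside the total query time, in milliseconds.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class PlanningTimeSketch
{
  public static void main(String[] args)
  {
    // Stand-in for the real SQL-to-native planning work.
    final long planningStartNs = System.nanoTime();
    doPlanning();
    final long planningTimeNanos = System.nanoTime() - planningStartNs;

    // Pretend the rest of the query took another 5 ms.
    final long queryTimeNs = planningTimeNanos + TimeUnit.MILLISECONDS.toNanos(5);

    // Track in nanoseconds internally; convert to milliseconds only when emitting.
    final Map<String, Object> statsMap = new LinkedHashMap<>();
    statsMap.put("sqlQuery/time", TimeUnit.NANOSECONDS.toMillis(queryTimeNs));
    statsMap.put("sqlQuery/planningTimeMs", TimeUnit.NANOSECONDS.toMillis(planningTimeNanos));

    System.out.println(statsMap);
  }

  private static void doPlanning()
  {
    // Placeholder for converting the SQL query into a native query.
  }
}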

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.


final Map<String, Object> statsMap = new LinkedHashMap<>();
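// Convert the internally-tracked nanosecond timings to milliseconds before exposing them in the stats map.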
statsMap.put("sqlQuery/time", TimeUnit.NANOSECONDS.toMillis(queryTimeNs));
statsMap.put("sqlQuery/planningTimeMs", TimeUnit.NANOSECONDS.toMillis(planningTimeNanos));
Member

I'm curious: if this metric is emitted in milliseconds, why do we need to hold it in nanoseconds in the code?

Contributor

I guess it's because of the better accuracy.

2089 / 1000 - 1989 / 1000 = 1 (truncate each value first, then subtract)
(2089 - 1989) / 1000 = 0 (subtract first, then truncate)

0 seems better here given the minuscule difference between the two values.
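
As a small illustration of the same point (hypothetical values; nanoseconds and TimeUnit used only for concreteness):

import java.util.concurrent.TimeUnit;

public class TruncationExample
{
  public static void main(String[] args)
  {
    final long startNs = 1_989L;
    final long endNs = 2_089L;

    // Truncate each value to the coarser unit first, then subtract: 2 - 1 = 1.
    final long truncateFirst =
        TimeUnit.NANOSECONDS.toMicros(endNs) - TimeUnit.NANOSECONDS.toMicros(startNs);

    // Subtract at full resolution, then truncate the difference: 100 / 1000 = 0.
    final long subtractFirst = TimeUnit.NANOSECONDS.toMicros(endNs - startNs);

    System.out.println(truncateFirst + " vs " + subtractFirst);
  }
}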

Member Author

Also, my thinking was that since it doesn't need any extra space, it is better to store the time at the highest resolution. We can then emit it in whatever resolution callers need, and also use the highest-accuracy values for any arithmetic that may be needed in the future.

Member

But I don't see any chance that we would change this metric from ms to ns in the future, since that would cause an incompatibility.

I think the SQL planning phase should be very short. Does this phase always take milliseconds to complete? I'd guess it usually finishes in microseconds, and if that's true, the emitted metric would be zero in most cases.

Member Author

Yes, I didn't mean to say that we'd update the current metric. Rather, in case anyone tries to use it for a new metric or derive a metric from the planning time, they would have the value at the highest accuracy.

In my local tests with a SELECT 1 query, even after multiple quick invocations the lowest planning time I saw was 4 ms. The lowest query/time observed was 6 ms. Have you seen sub-millisecond planning in production clusters, which my local test might be missing?

Contributor

FWIW, I have seen many queries take significant planning time, for example queries with a large IN clause or with many UNION operators. When debugging a slow query, we often don't have direct information about the time spent in planning; we tend to infer it by looking at other metrics and comparing them with the overall execution time. Having this metric available firsthand would help with that troubleshooting. If the planning time is too high, we would either fix it through a code change or adjust the query in some way.

So IMO there is definitely utility in having this metric.

Member

@rohangarg I was just guessing based on my experience; I don't have evidence to prove it.

@abhishekagarwal87 abhishekagarwal87 merged commit 3c129f6 into apache:master Aug 22, 2022
@abhishekagarwal87 abhishekagarwal87 added this to the 24.0.0 milestone Aug 26, 2022