-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove select query #8739
remove select query #8739
Conversation
Very supportive of this in general! |
Closing/opening to restart teamcity |
…eath-destroyer-of-select
Not sure why it is failing teamcity consistently with the same errors in files I didn't modify, merged latest master in to see if it changes things. |
I did not add any special handling for this, so it will show the default error:
I guess I could add a |
Ah, the teamcity failures are actually legitimate. The two errors are methods that only the select query engine was using: a variant of |
Ok, added back a
All it cost was adding 235 lines of code 😞 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on the idea of removing the select query; it's been deprecated for a while, causes clusters to crash (ouch), and the scan query is good enough now to mostly replace it. Suggested some wording changes.
docs/querying/select-query.md
Outdated
@@ -24,235 +24,11 @@ sidebar_label: "Select" | |||
--> | |||
|
|||
|
|||
> We encourage you to use the [Scan query](../querying/scan-query.md) type rather than Select whenever possible. | |||
> In situations involving larger numbers of segments, the Select query can have very high memory and performance overhead. | |||
> The native Druid Select query has been removed in favor of the [Scan query](../querying/scan-query.md), which should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The >
make this look weird when rendered (a note/quote block). Also the sentences read strangely together. I'd suggest:
Older versions of Apache Druid (incubating) included a Select query type. Since Druid 0.17.0, it has been removed and replaced by the Scan query, which offers improved memory usage and performance. This solves issues that users had with Select queries causing Druid to run out of memory or slow down.
The Scan query has a different syntax, but supports many of the features of the Select query, including time ordering and limiting. Scan does not include the Select query's pagination feature; however, in many cases pagination is unnecessary with Scan due to its ability to return a virtually unlimited number of results in one call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with your suggestion 👍
@@ -1544,7 +1544,7 @@ You can optionally only configure caching to be enabled on the Broker by setting | |||
|`druid.broker.cache.useResultLevelCache`|true, false|Enable result level caching on the Broker.|false| | |||
|`druid.broker.cache.populateResultLevelCache`|true, false|Populate the result level cache on the Broker.|false| | |||
|`druid.broker.cache.resultLevelCacheLimit`|positive integer|Maximum size of query response that can be cached.|`Integer.MAX_VALUE`| | |||
|`druid.broker.cache.unCacheable`|All druid query types|All query types to not cache.|`["groupBy", "select"]`| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
…eath-destroyer-of-select
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Leaving open for the next committer to merge, as this needs one more design review.
When writing the release notes, make sure to mention that users of Druid SQL that do select queries must upgrade first to 0.15.0 or 0.16.0, then to 0.17.0, because 0.15.0 is the version that switched SQL selects to Scan. Upgrading directly from 0.14 to 0.17 will cause SQL selects to not work right during the rolling upgrade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clintropolis nice work! I like removing stuffs.
I can not tell if you already do this from the diff but I think not: would it make sense to have a useful error message if someone does try to issue a select query explaining that it has been removed?
I'm wondering whether we really need this. Should we remove SelectQuery
at some point anyway because we cannot keep all legacy classes forever? A nice error message is always good for Druid users, but I guess this may not be needed since we will call out in the release notes and keep the doc for removed select query.
*/ | ||
@JsonTypeName("select") | ||
public class SelectQuery extends BaseQuery<Result<SelectResultValue>> | ||
public class SelectQuery implements Query<Object> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add @Deprecated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
I am in this camp as well, originally the PR just removed it all, but I guess this is a nicer user experience. I do think we could remove this placeholder in a release or two after the next one. |
I would suggest keeping the error message for at least a few releases. The Select query, even though flawed, was somewhat popular and I think this message will reduce confusion in the user base. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 after CI.
Description
This PR removes the Select query type, in favor of the Scan query. Since the introduction of time ordered scan query with #7133, our documentation has already been recommending people use scan queries instead of select due to the heavy memory impact of the select query, and Druid SQL has not generated select queries at all since #7373. Functionally, scan is equivalent, less the paging functionality, however the more efficient nature of the scan query makes the paging functionality un-necessary.
Additionally, I feel that removing the choice between 2 very similar query types should be more user friendly, and for code hygiene and lower maintenance for the query processing layer on our end.
This is obviously incompatible with previous releases so will need to be called out prominently in the release notes should we merge this PR.
This PR has: