Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix KQL and PRQL distributed tables queries #59674

Closed

Conversation

kothiga
Copy link
Contributor

@kothiga kothiga commented Feb 6, 2024

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fixed a minor bug that prevented distributed table queries sent from either KQL or PRQL dialect clients to be executed on replicas.

Documentation entry for user-facing changes

This PR address #56289 and #57378 which found that queries using a distributed table with the kusto dialect resulted in a syntax error.

Root cause of this issue seems to be with how the queries are passed onto replicas for execution. A query such as

distributed_table | limit 2

is transformed to

SELECT * FROM distributed_table LIMIT 2

When executing this, the query is sent to connected replicas, which includes a modified copy of the settings. Within this, was the inclusion of the dialect change, which caused the replica to try and execute an SQL query with the KQL parser. Modifying the dialect prior to distribution resolves this issue.

This PR depends on #59626 being merged prior.

Edit: As observed below, this bug also extends to PRQL for the same reason. Adjustments have been made to just blanket set the dialect to ClickHouse-SQL before querying replicas.

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@nikitamikhaylov nikitamikhaylov added the can be tested Allows running workflows for external contributors label Feb 6, 2024
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-bugfix Pull request with bugfix, not backported by default label Feb 6, 2024
@robot-ch-test-poll1
Copy link
Contributor

robot-ch-test-poll1 commented Feb 6, 2024

This is an automated comment for commit e23dd5b with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks
Check nameDescriptionStatus
A SyncThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Docs checkBuilds and tests the documentation✅ success
Style checkRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report✅ success
Check nameDescriptionStatus
Bugfix validationChecks that either a new test (functional or integration) or there some changed tests that fail with the binary built on master branch❌ failure
CI runningA meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR⏳ pending
ClickHouse build checkBuilds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often has enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process⏳ pending
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here❌ failure
Mergeable CheckChecks if all other necessary checks are successful❌ failure

@@ -177,6 +177,10 @@ void HedgedConnections::sendQuery(
{
Settings modified_settings = settings;

// Kusto queries to replicas are transformed to ClickHouse-SQL. Change the setting before sending.
if (modified_settings.dialect == Dialect::kusto)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong. Why only Kusto?
What about the PRQL dialect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will investigate. I hadn't seen any issues submitted for PRQL having the same issue with distributed tables, and thought maybe the dialect was doing something else making it unaffected. I will report what I find.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that this issue also effects PRQL. Ex. error output

Received exception from server (version 23.12.2):
Code: 62. DB::Exception: Received from localhost:9000. DB::Exception: Received from clickhouse03:9000. DB::Exception: PRQL syntax error: 'Error:
   ╭─[:1:1]
   │
 1 │ SELECT `shared_test_table`.`id` FROM `default`.`shared_test_table`
   │ ───┬──
   │    ╰──── Unknown name
───╯
'. (SYNTAX_ERROR)

Here, we can see the same behavior, of the transformed SQL statement trying to be executed in the incorrect dialect.

I will extend my change and add appropriate tests.

Copy link
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests.

@alexey-milovidov
Copy link
Member

Also, write a test with PRQL.

@alexey-milovidov alexey-milovidov self-assigned this Feb 6, 2024
@kothiga kothiga changed the title Fix KQL distributed tables queries Fix KQL and PRQL distributed tables queries Feb 7, 2024
@@ -126,6 +126,10 @@ void MultiplexedConnections::sendQuery(

Settings modified_settings = settings;

// Kusto and PRQL queries to replicas are transformed to ClickHouse-SQL. Ensure the setting before sending.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is non-generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the change that was made in 01e203d in an earlier commit, and ultimately changed it to what I had because of backwards compatibility tests throwing a Unknown setting dialect.

I definitely agree that the generic form of blanket setting the dialect is the better way to go.

@@ -177,6 +177,10 @@ void HedgedConnections::sendQuery(
{
Settings modified_settings = settings;

// Kusto and PRQL queries to replicas are transformed to ClickHouse-SQL. Ensure the setting before sending.
if (modified_settings.dialect == Dialect::kusto || modified_settings.dialect == Dialect::prql)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified the code.

@alexey-milovidov
Copy link
Member

Sorry, editing through web UI was a bad idea. Continued here: #60470

alexey-milovidov added a commit that referenced this pull request Feb 28, 2024
@alexey-milovidov
Copy link
Member

Done.

kothiga pushed a commit to ClibMouse/ClickHouse that referenced this pull request Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants