Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherry pick #55678 to 23.9: Fix filtering by virtual columns with OR filter in query (resubmit) #55755

Conversation

robot-ch-test-poll3
Copy link
Contributor

Original pull-request #55678

This pull-request is a first step of an automated backporting.
It contains changes like after calling a local command git cherry-pick.
If you intend to continue backporting this changes, then resolve all conflicts if any.
Otherwise, if you do not want to backport them, then just close this pull-request.

The check results does not matter at this step - you can safely ignore them.

Note

This pull-request will be merged automatically as it reaches the mergeable state, do not merge it manually.

If the PR was closed and then reopened

If it stuck, check #55678 for pr-backports-created and delete it if necessary. Manually merging will do nothing, since pr-backports-created prevents the original PR #55678 from being processed.

If you want to recreate the PR: delete the pr-cherrypick label and delete this branch.
You may also need to delete the pr-backports-created label from the original PR.

azat and others added 3 commits October 16, 2023 15:42
The problem with the initial implementation #52653 was:
- OR can have multiple arguments
- It simply not correct to assume that if there are two arguments this is OK.
  Consider the following example:

    "WHERE (column_not_from_partition_by = 1) OR false OR false"

  Will be converted to:

    "WHERE false OR false"

And it will simply read nothing.

Yes, we could apply some optimization for bool, but this will not always
work, since to optimize things like "0 = 1" we need to execute it.

And the only way to make handle this correctly (with ability to ignore
some commands during filtering) is to make is_constant() function return
has it use something from the input block, so that we can be sure, that
we have some sensible, and not just "false".

Plus we cannot simply ignore the difference of the input and output
arguments of handling OR, we need to add always-true (1/true) if the
size is different, since otherwise it could break invariants (see
comment in the code).

This includes (but not limited to):
- _part* filtering for MergeTree
- _path/_file for various File/HDFS/... engines
- _table for Merge
- ...

P.S. analyzer does not have this bug, since it execute expression as
whole, and this is what filterBlockWithQuery() should do actually
instead, but this will be a more complex patch.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
(cherry picked from commit b107712)
After the previous patch "x OR 1" will not execute "x", and because of
this test_system_merges::test_mutation_simple started to fail since
"sleep" function did not executed.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Fix filtering by virtual columns with OR filter in query (resubmit)
@robot-ch-test-poll3 robot-ch-test-poll3 added pr-cherrypick Cherry-pick of merge-commit before backporting. Do not use manually - automated use only! do not test disable testing on pull request labels Oct 17, 2023
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 merged commit 7facf74 into backport/23.9/55678 Oct 17, 2023
2 checks passed
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 deleted the cherrypick/23.9/79eccfb6421eeaf7d3cf0ceb3af619fb45962a09 branch October 17, 2023 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do not test disable testing on pull request pr-cherrypick Cherry-pick of merge-commit before backporting. Do not use manually - automated use only!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants