Continue of #42008, rewrite countDistinctIf with count_distinct_implementation #46051

Merged
merged 20 commits into from Aug 6, 2023
11 changes: 11 additions & 0 deletions docs/en/operations/settings/settings.md
@@ -4578,6 +4578,17 @@ Type: Int64

Default: 0

## rewrite_count_distinct_if_with_count_distinct_implementation

Allows you to rewrite `countDistinctIf` with the [count_distinct_implementation](#settings-count_distinct_implementation) setting.

Possible values:

- true — Allow.
- false — Disallow.

Default value: `false`.

## precise_float_parsing {#precise_float_parsing}

Switches [Float32/Float64](../../sql-reference/data-types/float.md) parsing algorithms:
1 change: 1 addition & 0 deletions src/Core/Settings.h
@@ -536,6 +536,7 @@ class IColumn;
M(Bool, optimize_move_functions_out_of_any, false, "Move functions out of aggregate functions 'any', 'anyLast'.", 0) \
M(Bool, optimize_normalize_count_variants, true, "Rewrite aggregate functions that semantically equals to count() as count().", 0) \
M(Bool, optimize_injective_functions_inside_uniq, true, "Delete injective functions of one argument inside uniq*() functions.", 0) \
M(Bool, rewrite_count_distinct_if_with_count_distinct_implementation, false, "Rewrite countDistinctIf with count_distinct_implementation configuration", 0) \
M(Bool, convert_query_to_cnf, false, "Convert SELECT query to CNF", 0) \
M(Bool, optimize_or_like_chain, false, "Optimize multiple OR LIKE into multiMatchAny. This optimization should not be enabled by default, because it defies index analysis in some cases.", 0) \
M(Bool, optimize_arithmetic_operations_in_aggregate_functions, true, "Move arithmetic operations out of aggregation functions", 0) \
1 change: 1 addition & 0 deletions src/Core/SettingsChangesHistory.h
@@ -80,6 +80,7 @@ namespace SettingsChangesHistory
/// It's used to implement `compatibility` setting (see https://github.com/ClickHouse/ClickHouse/issues/35972)
static std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges> settings_changes_history =
{
{"23.8", {{"rewrite_count_distinct_if_with_count_distinct_implementation", false, true, "Rewrite countDistinctIf with count_distinct_implementation configuration"}}},
{"23.7", {{"function_sleep_max_microseconds_per_block", 0, 3000000, "In previous versions, the maximum sleep time of 3 seconds was applied only for `sleep`, but not for `sleepEachRow` function. In the new version, we introduce this setting. If you set compatibility with the previous versions, we will disable the limit altogether."}}},
{"23.6", {{"http_send_timeout", 180, 30, "3 minutes seems crazy long. Note that this is timeout for a single network write call, not for the whole upload operation."},
{"http_receive_timeout", 180, 30, "See http_send_timeout."}}},
9 changes: 9 additions & 0 deletions src/Interpreters/TreeRewriter.cpp
@@ -110,6 +110,9 @@ using CustomizeCountDistinctVisitor = InDepthNodeVisitor<OneTypeMatcher<Customiz
char countifdistinct[] = "countifdistinct";
using CustomizeCountIfDistinctVisitor = InDepthNodeVisitor<OneTypeMatcher<CustomizeFunctionsData<countifdistinct>>, true>;

char countdistinctif[] = "countdistinctif";
using CustomizeCountDistinctIfVisitor = InDepthNodeVisitor<OneTypeMatcher<CustomizeFunctionsData<countdistinctif>>, true>;

char in[] = "in";
using CustomizeInVisitor = InDepthNodeVisitor<OneTypeMatcher<CustomizeFunctionsData<in>>, true>;

@@ -1368,6 +1371,12 @@ void TreeRewriter::normalize(
CustomizeIfDistinctVisitor::Data data_distinct_if{"DistinctIf"};
CustomizeIfDistinctVisitor(data_distinct_if).visit(query);

if (settings.rewrite_count_distinct_if_with_count_distinct_implementation)
{
CustomizeCountDistinctIfVisitor::Data data_count_distinct_if{settings.count_distinct_implementation.toString() + "If"};
CustomizeCountDistinctIfVisitor(data_count_distinct_if).visit(query);
}

ExistsExpressionVisitor::Data exists;
ExistsExpressionVisitor(exists).visit(query);
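
The rename performed by the new visitor can be illustrated with a small standalone sketch. It is not the ClickHouse implementation: `FunctionNode`, `toLower`, and `rewriteCountDistinctIf` are hypothetical names invented for this example, and the case-insensitive match is an assumption based on the visitor being keyed on the lower-case string `countdistinctif`. The sketch only shows the idea of replacing the function name with the configured implementation plus the `If` suffix when the setting is enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST function node; this is not ClickHouse's ASTFunction.
struct FunctionNode
{
    std::string name;
    std::vector<FunctionNode> arguments;
};

// Returns a lower-case copy, used for case-insensitive name matching (an assumption).
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Renames every countDistinctIf node to <implementation>If, visiting arguments first.
// Does nothing when `enabled` is false, mirroring the guard on the new setting.
void rewriteCountDistinctIf(FunctionNode & node, const std::string & implementation, bool enabled)
{
    assert(!implementation.empty());
    if (!enabled)
        return;

    for (auto & argument : node.arguments)
        rewriteCountDistinctIf(argument, implementation, enabled);

    if (toLower(node.name) == "countdistinctif")
        node.name = implementation + "If";
}
```

With `implementation` set to `uniqExact`, a node named `countDistinctIf` becomes `uniqExactIf`, which matches the second `EXPLAIN SYNTAX` result in the reference output of the test.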

@@ -0,0 +1,6 @@
2
SELECT countDistinctIf(number % 10, (number % 5) = 2)
FROM numbers(1000)
2
SELECT uniqExactIf(number % 10, (number % 5) = 2)
FROM numbers(1000)
@@ -0,0 +1,8 @@
-- Tags: no-parallel
SELECT countDistinctIf(number % 10, number % 5 = 2) FROM numbers(1000);
EXPLAIN SYNTAX SELECT countDistinctIf(number % 10, number % 5 = 2) FROM numbers(1000);

-- disabled by default
SET rewrite_count_distinct_if_with_count_distinct_implementation = 1;
SELECT countDistinctIf(number % 10, number % 5 = 2) FROM numbers(1000);
EXPLAIN SYNTAX SELECT countDistinctIf(number % 10, number % 5 = 2) FROM numbers(1000);