Consider the following SQL script:
drop table if exists data;
drop table if exists dist;
set max_bytes_before_external_group_by = 1;
set group_by_two_level_threshold_bytes = 1;
set max_untracked_memory = '1Mi';
set memory_profiler_step = '1Mi';
create table data (key String) Engine=Memory;
create table dist (key LowCardinality(String)) engine=Distributed(test_cluster_two_shards, currentDatabase(), data);
insert into data values ('foo');
select * from dist group by key;
The last query returns (non-deterministically) the following result, which is wrong (the key is duplicated):
┌─key─┐
│ foo │
└─────┘
┌─key─┐
│ foo │
└─────┘
In the logs we can see that identical keys go to different buckets:
2023.12.13 14:18:23.348050 [ 2207426 ] {8c005767-91d3-44ae-8d67-c5b4ba610f79} <Trace> Aggregator: Merging partially aggregated blocks (bucket = 174).
2023.12.13 14:18:23.348077 [ 2207426 ] {8c005767-91d3-44ae-8d67-c5b4ba610f79} <Debug> Aggregator: Merged partially aggregated blocks for bucket #174. Got 1 rows, 1.00 B from 1 source rows in 1.7239e-05 sec. (58008.005 rows/sec., 56.65 KiB/sec.)
2023.12.13 14:18:23.348123 [ 2207477 ] {8c005767-91d3-44ae-8d67-c5b4ba610f79} <Trace> Aggregator: Merging partially aggregated blocks (bucket = 222).
2023.12.13 14:18:23.348153 [ 2207477 ] {8c005767-91d3-44ae-8d67-c5b4ba610f79} <Debug> Aggregator: Merged partially aggregated blocks for bucket #222. Got 1 rows, 1.00 B from 1 source rows in 1.9668e-05 sec. (50844.011 rows/sec., 49.65 KiB/sec.)
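A small sketch of how the bucket numbers can be pulled out of such log lines to confirm the symptom. With a single distinct key, only one bucket should produce a row. The regular expression and the helper name `rows_per_bucket` are my own and match only the message format quoted above.

```python
import re

# The two <Debug> lines quoted above, with timestamps, thread ids,
# the query id and the trailing rate figures dropped for brevity.
LOG = """\
<Debug> Aggregator: Merged partially aggregated blocks for bucket #174. Got 1 rows, 1.00 B from 1 source rows in 1.7239e-05 sec.
<Debug> Aggregator: Merged partially aggregated blocks for bucket #222. Got 1 rows, 1.00 B from 1 source rows in 1.9668e-05 sec.
"""

# Captures the bucket number and the number of rows it produced.
PATTERN = re.compile(r"for bucket #(\d+)\. Got (\d+) rows")


def rows_per_bucket(log_text):
    """Return {bucket number: rows produced} for every 'Merged ...' line."""
    return {int(bucket): int(rows) for bucket, rows in PATTERN.findall(log_text)}


buckets = rows_per_bucket(LOG)
non_empty = sorted(b for b, rows in buckets.items() if rows > 0)
print(non_empty)  # [174, 222]: two buckets each hold a row for one distinct key
```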
Unfortunately, I couldn't reproduce it with a more realistic example. Also note that if LowCardinality(String) is changed to String, the result is correct, so there is probably some issue with the conversion. However, the same aggregation method is selected on both shards: Aggregation method: key_string.
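To illustrate why keys landing in different buckets would produce a duplicated row, here is a toy model in Python. It is not ClickHouse code. The bucket count, both bucket functions, and the idea that the two sides hash different representations of the same key are assumptions made only for the illustration. The only property taken from the report is that partial results are merged bucket by bucket.

```python
from collections import defaultdict

NUM_BUCKETS = 256  # assumption for this toy model


def bucket_raw(key: str) -> int:
    """Hypothetical bucket function: hashes the key bytes only."""
    return sum(key.encode()) % NUM_BUCKETS


def bucket_with_length(key: str) -> int:
    """Hypothetical bucket function: same key, but the length is mixed in."""
    data = key.encode()
    return (sum(data) + len(data)) % NUM_BUCKETS


def merge_by_bucket(partials):
    """Merge partial aggregates one bucket at a time.

    `partials` is a list of (bucket_function, keys) pairs, one per shard.
    Keys are deduplicated inside a bucket but never across buckets.
    """
    buckets = defaultdict(set)
    for bucket_of, keys in partials:
        for key in keys:
            buckets[bucket_of(key)].add(key)
    # The final result is the concatenation of the per-bucket results.
    return [key for b in sorted(buckets) for key in sorted(buckets[b])]


# Both shards agree on the bucket: the key is merged into a single row.
print(merge_by_bucket([(bucket_raw, ["foo"]), (bucket_raw, ["foo"])]))
# The shards disagree on the bucket: the same key comes out twice.
print(merge_by_bucket([(bucket_raw, ["foo"]), (bucket_with_length, ["foo"])]))
```

In this model the duplicate appears only when the two sides disagree on the bucket of an identical key, which matches the two separate buckets seen in the logs.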