Fix SummingMergeTree aggregation for Nested LowCardinality columns #90927

Merged
Avogar merged 1 commit into ClickHouse:master from bobrik:ivan/summing-merge-tree-nested-low-cardinality on Dec 4, 2025

@bobrik (Contributor) commented Nov 26, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixes SummingMergeTree aggregation for Nested LowCardinality columns

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

With `String` we see the expected behavior of the `Nested` column:

```
:) create table sums (key LowCardinality(String), sumOfSums UInt64, sumsMap Nested (key String, sum UInt64)) ENGINE = SummingMergeTree primary key (key)

:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 3, ['a', 'b'], [1, 2])
:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 7, ['a', 'b'], [3, 4])

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │         7 │ ['a','b']   │ [3,4]       │
2. │ lol │         3 │ ['a','b']   │ [1,2]       │
   └─────┴───────────┴─────────────┴─────────────┘

:) optimize table sums

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │        10 │ ['a','b']   │ [4,6]       │
   └─────┴───────────┴─────────────┴─────────────┘
```

Before this patch, with `LowCardinality(String)` we see incorrect summing of the `Nested` column (`sumOfSums` is summed, but `sumsMap.sum` keeps the values of a single row):

```
:) create table sums (key LowCardinality(String), sumOfSums UInt64, sumsMap Nested (key LowCardinality(String), sum UInt64)) ENGINE = SummingMergeTree primary key (key)

:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 3, ['a', 'b'], [1, 2])
:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 7, ['a', 'b'], [3, 4])

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │         7 │ ['a','b']   │ [3,4]       │
2. │ lol │         3 │ ['a','b']   │ [1,2]       │
   └─────┴───────────┴─────────────┴─────────────┘

:) optimize table sums

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │        10 │ ['a','b']   │ [1,2]       │
   └─────┴───────────┴─────────────┴─────────────┘
```

With this patch applied `LowCardinality(String)` behaves exactly like `String`, as it should:

```
:) create table sums (key LowCardinality(String), sumOfSums UInt64, sumsMap Nested (key LowCardinality(String), sum UInt64)) ENGINE = SummingMergeTree primary key (key)

:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 3, ['a', 'b'], [1, 2])
:) insert into sums (key, sumOfSums, sumsMap.key, sumsMap.sum) values ('lol', 7, ['a', 'b'], [3, 4])

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │         3 │ ['a','b']   │ [1,2]       │
2. │ lol │         7 │ ['a','b']   │ [3,4]       │
   └─────┴───────────┴─────────────┴─────────────┘

:) optimize table sums

:) select * from sums
   ┌─key─┬─sumOfSums─┬─sumsMap.key─┬─sumsMap.sum─┐
1. │ lol │        10 │ ['a','b']   │ [4,6]       │
   └─────┴───────────┴─────────────┴─────────────┘
```
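
The expected result above can be cross-checked with a small model of the merge. This is only a sketch in Python, not ClickHouse's implementation: it assumes that rows sharing a primary key are collapsed, that plain numeric columns are summed, and that the `sumsMap` arrays are summed per nested key with the resulting keys sorted. The function name `merge_rows` and the tuple layout are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def merge_rows(rows):
    """Collapse rows that share a primary key (simplified model, not ClickHouse code).

    Each row is (key, sumOfSums, sumsMap.key, sumsMap.sum).
    """
    totals = {}  # primary key -> running sumOfSums
    nested = {}  # primary key -> {nested key -> running sum}
    for key, sum_of_sums, map_keys, map_sums in rows:
        totals[key] = totals.get(key, 0) + sum_of_sums
        bucket = nested.setdefault(key, defaultdict(int))
        # Sum the nested values per nested key, regardless of the key column's type.
        for k, v in zip(map_keys, map_sums):
            bucket[k] += v
    result = []
    for key in sorted(totals):
        map_keys = sorted(nested[key])
        result.append((key, totals[key], map_keys, [nested[key][k] for k in map_keys]))
    return result


# The two inserts from the examples above.
rows = [
    ("lol", 3, ["a", "b"], [1, 2]),
    ("lol", 7, ["a", "b"], [3, 4]),
]
merged = merge_rows(rows)
print(merged)  # [('lol', 10, ['a', 'b'], [4, 6])]
```

In this model the type of the nested key plays no role, which matches the point of the patch: `LowCardinality(String)` should give the same `[4,6]` result as `String`.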
@tuanpach tuanpach added the can be tested Allows running workflows for external contributors label Nov 26, 2025
clickhouse-gh bot (Contributor) commented Nov 26, 2025

Workflow [PR], commit [45a93b6]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel) | | failure | | |
| | 03312_explain_syntax_analyzer | FAIL | cidb | |
| | Fatal messages (in clickhouse-server.log or clickhouse-server.err.log) | FAIL | cidb | |
| Integration tests (arm_binary, distributed plan, 3/4) | | failure | | |
| | test_s3_plain_rewritable/test.py::test[s3_plain_rewritable-data/] | FAIL | cidb | |
| Integration tests (amd_tsan, 5/6) | | failure | | |
| | test_restore_db_replica/test.py::test_query_after_restore_db_replica[alter table-no exists table-no restart] | FAIL | cidb, flaky | |
| | test_dictionaries_update_and_reload/test.py::test_reload_while_loading | FAIL | cidb | |
| BuzzHouse (amd_debug) | | failure | | |
| | Logical error: 'Inconsistent AST formatting: the query: | FAIL | cidb | |
| BuzzHouse (arm_asan) | | failure | | |
| | Segmentation fault (STID: 1486-4a2a) | FAIL | cidb | |
| BuzzHouse (amd_ubsan) | | failure | | |
| | UndefinedBehaviorSanitizer: undefined behavior (STID: 1486-329a) | FAIL | cidb | |
| Performance Comparison (amd_release, master_head, 3/6) | | failure | | |
| | Start | failure | | |
| Performance Comparison (amd_release, master_head, 4/6) | | failure | | |
| | Start | failure | | |
| Performance Comparison (amd_release, master_head, 5/6) | | failure | | |
| | Start | failure | | |

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Nov 26, 2025
@Avogar Avogar self-assigned this Nov 27, 2025
bobrik (Contributor, Author) commented Nov 27, 2025

I'm going to need some help with CI failures.

Some performance builds failed to start, which doesn't seem related to these changes. Perhaps they just need a restart?

2025.11.27 02:53:31.539705 [ 1461 ] {} <Error> void DB::KeeperDispatcher::initialize(const Poco::Util::AbstractConfiguration &, bool, bool, const MultiVersion<Macros>::Version &): Code: 568. DB::Exception: Cannot create interserver listener on port 9234 after trying both IPv6 and IPv4. (RAFT_ERROR), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000001504bc1f
1. DB::Exception::Exception(String&&, int, String, bool) @ 0x000000000ccbfd0e
3. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000ccbf700
4. DB::Exception::Exception<int>(int, FormatStringHelperImpl<std::type_identity<int>::type>, int&&) @ 0x000000000e1d2e2b
5. DB::KeeperServer::launchRaftServer(Poco::Util::AbstractConfiguration const&, bool) @ 0x000000001c2e3bd2
6. DB::KeeperServer::startup(Poco::Util::AbstractConfiguration const&, bool) @ 0x000000001c2e7376
7. DB::KeeperDispatcher::initialize(Poco::Util::AbstractConfiguration const&, bool, bool, std::shared_ptr<DB::Macros const> const&) @ 0x000000001c2c246b
8. DB::Context::initializeKeeperDispatcher(bool) const @ 0x0000000019b3b8d1
9. DB::Server::main(std::vector<String, std::allocator<String>> const&) @ 0x00000000153ed506
10. Poco::Util::Application::run() @ 0x0000000020cc0de6
11. DB::Server::run() @ 0x00000000153d5b02
12. mainEntryClickHouseServer(int, char**) @ 0x00000000153d2613
13. main @ 0x000000000ccb7cdb
14. __pow_finite @ 0x0000000000029d90
15. __libc_start_main @ 0x0000000000029e40
16. _start @ 0x0000000007cdcf2e
 (version 25.12.1.185 (official build))

In Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel), 03312_explain_syntax_analyzer failed because the server was down, and the server logs contain this:

clickhouse-server.err.log:2025.11.27 02:16:01.117016 [ 40228 ] {49a1818e-70c4-489a-9da3-c1f1861564dc} <Fatal> : Logical error: 'std::exception. Code: 1001, type: std::out_of_range, e.what() = `InlinedVector::at(size_type)` failed bounds check (version 25.12.1.187), Stack trace:
clickhouse-server.err.log:2025.11.27 02:16:01.126858 [ 40228 ] {49a1818e-70c4-489a-9da3-c1f1861564dc} <Fatal> : Stack trace (when copying this message, always include the lines below):
clickhouse-server.err.log:2025.11.27 02:16:01.127007 [ 1147 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
clickhouse-server.err.log:2025.11.27 02:16:01.127022 [ 1147 ] {} <Fatal> BaseDaemon: (version 25.12.1.187, build id: 92709D9029FF74BD4208050F633690CC231E3D16, git hash: fa4916fb8eb697f92f36010c89655865d997e7b8, architecture: x86_64) (from thread 40228) Received signal 6
clickhouse-server.err.log:2025.11.27 02:16:01.127027 [ 1147 ] {} <Fatal> BaseDaemon: Signal description: Aborted
clickhouse-server.err.log:2025.11.27 02:16:01.127029 [ 1147 ] {} <Fatal> BaseDaemon: 
clickhouse-server.err.log:2025.11.27 02:16:01.127038 [ 1147 ] {} <Fatal> BaseDaemon: Stack trace: 0x00007f54221a69fd 0x00007f5422152476 0x00007f54221387f3 0x000055860c70e309 0x000055860c70e5ec 0x000055860c7161e4 0x0000558612727be1 0x00005586127343c6 0x000055861272ad73 0x0000558615ac0d28 0x0000558615ade3d6 0x000055861af4d6c7 0x000055861af4dc7e 0x000055861aee90ff 0x000055861aee67cf 0x00007f54221a4ac3 0x00007f54222368c0
clickhouse-server.err.log:2025.11.27 02:16:01.127043 [ 1147 ] {} <Fatal> BaseDaemon: ########################################
clickhouse-server.err.log:2025.11.27 02:16:01.127070 [ 1147 ] {} <Fatal> BaseDaemon: (version 25.12.1.187, build id: 92709D9029FF74BD4208050F633690CC231E3D16, git hash: fa4916fb8eb697f92f36010c89655865d997e7b8) (from thread 40228) (query_id: 49a1818e-70c4-489a-9da3-c1f1861564dc) (query: SELECT dictGet();) Received signal Aborted (6)
clickhouse-server.err.log:2025.11.27 02:16:01.127085 [ 1147 ] {} <Fatal> BaseDaemon: 
clickhouse-server.err.log:2025.11.27 02:16:01.127096 [ 1147 ] {} <Fatal> BaseDaemon: Stack trace: 0x00007f54221a69fd 0x00007f5422152476 0x00007f54221387f3 0x000055860c70e309 0x000055860c70e5ec 0x000055860c7161e4 0x0000558612727be1 0x00005586127343c6 0x000055861272ad73 0x0000558615ac0d28 0x0000558615ade3d6 0x000055861af4d6c7 0x000055861af4dc7e 0x000055861aee90ff 0x000055861aee67cf 0x00007f54221a4ac3 0x00007f54222368c0
clickhouse-server.err.log:2025.11.27 02:16:01.127147 [ 1147 ] {} <Fatal> BaseDaemon: 3. pthread_kill @ 0x00000000000969fd
clickhouse-server.err.log:2025.11.27 02:16:01.127164 [ 1147 ] {} <Fatal> BaseDaemon: 4. raise @ 0x0000000000042476
clickhouse-server.err.log:2025.11.27 02:16:01.127179 [ 1147 ] {} <Fatal> BaseDaemon: 5. __lgamma_r_finite @ 0x00000000000287f3
clickhouse-server.err.log:2025.11.27 02:16:01.136183 [ 1147 ] {} <Fatal> BaseDaemon: 6. ./ci/tmp/build/./src/Common/Exception.cpp:52: DB::abortOnFailedAssertion(String const&, void* const*, unsigned long, unsigned long) @ 0x000000001183f309
clickhouse-server.err.log:2025.11.27 02:16:01.144557 [ 1147 ] {} <Fatal> BaseDaemon: 7. ./ci/tmp/build/./src/Common/Exception.cpp:58: ? @ 0x000000001183f5ec
clickhouse-server.err.log:2025.11.27 02:16:01.154592 [ 1147 ] {} <Fatal> BaseDaemon: 8. ./ci/tmp/build/./src/Common/Exception.cpp:583: DB::getCurrentExceptionMessageAndPattern(bool, bool, bool) @ 0x00000000118471e4
clickhouse-server.err.log:2025.11.27 02:16:01.202890 [ 1147 ] {} <Fatal> BaseDaemon: 9. ./ci/tmp/build/./src/Interpreters/executeQuery.cpp:906: DB::logExceptionBeforeStart(String const&, unsigned long, std::shared_ptr<DB::Context const>, std::shared_ptr<DB::IAST>, std::shared_ptr<DB::OpenTelemetry::SpanHolder> const&, unsigned long, bool) @ 0x0000000017858be1
clickhouse-server.err.log:2025.11.27 02:16:01.240811 [ 1147 ] {} <Fatal> BaseDaemon: 10. ./ci/tmp/build/./src/Interpreters/executeQuery.cpp:1898: DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, std::unique_ptr<DB::ReadBuffer, std::

The addresses aren't symbolicated, so I'm not sure how to interpret this.

Integration tests (amd_tsan, 5/6) has test_s3_plain_rewritable/test.py::test[s3_plain_rewritable-data/] that failed, and I'm not sure why.

There are a few fuzzing failures:

  • BuzzHouse (amd_debug) with Inconsistent AST formatting seems common in other PRs
  • BuzzHouse (arm_asan) fails here:

inlined from ./src/IO/VarInt.h:32: DB::writeVarUInt(unsigned long, DB::WriteBuffer&)

  • BuzzHouse (amd_ubsan) fails there too:

#0 0x559113842737 in DB::writeVarUInt(unsigned long, DB::WriteBuffer&) ci/tmp/build/./src/IO/VarInt.h:32:22

These failures appear in other PRs too, so they seem unrelated to the changes here.

It would be nice to restart the failed bits and get some guidance on whether there's something for me to fix otherwise.

Avogar (Member) commented Dec 4, 2025

Failed tests are most likely unrelated to the changes, but let me check

Avogar (Member) commented Dec 4, 2025

@Avogar Avogar added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Dec 4, 2025
@Avogar Avogar added this pull request to the merge queue Dec 4, 2025
Merged via the queue into ClickHouse:master with commit ae74a95 Dec 4, 2025
120 of 130 checks passed
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Dec 4, 2025
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Dec 4, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 4, 2025
Cherry pick #90927 to 25.3: Fix SummingMergeTree aggregation for Nested LowCardinality columns
robot-clickhouse added a commit that referenced this pull request Dec 4, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 4, 2025
Cherry pick #90927 to 25.8: Fix SummingMergeTree aggregation for Nested LowCardinality columns
robot-clickhouse added a commit that referenced this pull request Dec 4, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 4, 2025
Cherry pick #90927 to 25.9: Fix SummingMergeTree aggregation for Nested LowCardinality columns
robot-clickhouse added a commit that referenced this pull request Dec 4, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 4, 2025
Cherry pick #90927 to 25.10: Fix SummingMergeTree aggregation for Nested LowCardinality columns
robot-clickhouse added a commit that referenced this pull request Dec 4, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 4, 2025
Cherry pick #90927 to 25.11: Fix SummingMergeTree aggregation for Nested LowCardinality columns
robot-clickhouse added a commit that referenced this pull request Dec 4, 2025
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Dec 4, 2025
clickhouse-gh bot added a commit that referenced this pull request Dec 4, 2025
Backport #90927 to 25.10: Fix SummingMergeTree aggregation for Nested LowCardinality columns
clickhouse-gh bot added a commit that referenced this pull request Dec 4, 2025
Backport #90927 to 25.11: Fix SummingMergeTree aggregation for Nested LowCardinality columns
clickhouse-gh bot added a commit that referenced this pull request Dec 4, 2025
Backport #90927 to 25.8: Fix SummingMergeTree aggregation for Nested LowCardinality columns
clickhouse-gh bot added a commit that referenced this pull request Dec 4, 2025
Backport #90927 to 25.9: Fix SummingMergeTree aggregation for Nested LowCardinality columns
@bobrik bobrik deleted the ivan/summing-merge-tree-nested-low-cardinality branch December 5, 2025 06:08
Avogar added a commit that referenced this pull request Dec 9, 2025
Backport #90927 to 25.3: Fix SummingMergeTree aggregation for Nested LowCardinality columns