Skip to content

Don't delete default rows in coalescing merge tree#83892

Merged
scanhex12 merged 12 commits intoClickHouse:masterfrom
scanhex12:fix_sum_mt
Jul 24, 2025
Merged

Don't delete default rows in coalescing merge tree#83892
scanhex12 merged 12 commits intoClickHouse:masterfrom
scanhex12:fix_sum_mt

Conversation

@scanhex12
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

This closes #81303

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jul 17, 2025

Workflow [PR], commit [22b37f3]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Jul 17, 2025
@scanhex12 scanhex12 changed the title Don't delete default rows in summing merge tree Don't delete default rows in summing/coalescing merge tree Jul 17, 2025
@UnamedRus
Copy link
Copy Markdown
Contributor

For SummingMergeTree it was kinda intended and expected behavior, not sure that we should change it in such way.

@scanhex12 scanhex12 marked this pull request as draft July 17, 2025 10:54
@scanhex12 scanhex12 marked this pull request as ready for review July 17, 2025 11:20
@scanhex12 scanhex12 requested a review from jkartseva July 17, 2025 11:46
@den-crane
Copy link
Copy Markdown
Contributor

den-crane commented Jul 17, 2025

@scanhex12 what about this

CREATE TABLE test_table
(
    key UInt32,
    A Nullable(UInt32),
    B Nullable(String)
)
ENGINE = CoalescingMergeTree()
ORDER BY key;

INSERT INTO test_table Values(1, 1,  'x');
SELECT * FROM test_table final;
   +-key-+-A-+-B-+
1. |   1 | 1 | x |
   +-----+---+---+

INSERT INTO test_table Values(1, 2,  'y');
SELECT * FROM test_table final;
   +-key-+-A-+-B-+
1. |   1 | 2 | x |                               -----<<<--- expected 'y'
   +-----+---+---+

INSERT INTO test_table Values(1, 3,  'z');
SELECT * FROM test_table final;
   +-key-+-A-+-B-+
1. |   1 | 3 | x |                               -----<<<--- expected 'z'
   +-----+---+---+

https://fiddle.clickhouse.com/16be1755-560e-44f0-be6f-9220bad086f5

@scanhex12 scanhex12 changed the title Don't delete default rows in summing/coalescing merge tree Don't delete default rows in coalescing merge tree Jul 17, 2025
@jkartseva jkartseva self-assigned this Jul 17, 2025
@den-crane
Copy link
Copy Markdown
Contributor

den-crane commented Jul 17, 2025

Also the documentation is very vague about requirements of Nullable columns and about a behaviour without Nullable types.

Examples are misleading

CREATE TABLE test_table
(
    key UInt32,
    value UInt32
)
ENGINE = CoalescingMergeTree()
ORDER BY key

INSERT INTO test_table VALUES(1,NULL)      -- in reality Null will be converted to 0 by the insert.

I would expect different results:

CREATE TABLE test_table
(
    key UInt32,
    value UInt32
)
ENGINE = CoalescingMergeTree()
ORDER BY key;

INSERT INTO test_table VALUES(1,6);   
INSERT INTO test_table VALUES(1,NULL); 

select * from test_table final;
┌─key─┬─value─┐
│   1 │     0 │
└─────┴───────┘


CREATE TABLE test_table
(
    key UInt32,
    value Nullable(UInt32)
)
ENGINE = CoalescingMergeTree()
ORDER BY key;

INSERT INTO test_table VALUES(1,6);   
INSERT INTO test_table VALUES(1,NULL); 

select * from test_table final;
┌─key─┬─value─┐
│   1 │     6 │
└─────┴───────┘

@jkartseva
Copy link
Copy Markdown
Member

jkartseva commented Jul 18, 2025

I think that the results should be consistent with the coalesce function:

ip-172-31-8-86.us-west-1.compute.internal :) select coalesce(6, NULL)

SELECT coalesce(6, NULL)

Query id: 20164203-d7bb-4817-bde9-428a66e0a563

   ┌─coalesce(6, NULL)─┐
1. │                 6 │
   └───────────────────┘

1 row in set. Elapsed: 0.002 sec.

ip-172-31-8-86.us-west-1.compute.internal :)

(with the difference that the latest is used instead of the leftmost)

@jkartseva
Copy link
Copy Markdown
Member

The behavior is different depending on whether the data is inserted in a single batch or not: https://pastila.nl/?02fba128/cc7bbde5979bde7d70370afe6644caf3#oD+cHhOAh/vr4virt1xzHg==

@den-crane
Copy link
Copy Markdown
Contributor

den-crane commented Jul 18, 2025

One more example.
I would expect similar behavior (at least it's a desired behavior of an upsert) from AggregatingMergeTree+SimpleAggregateFunction(anyLast and CoalescingMergeTree

https://fiddle.clickhouse.com/3a33d254-c620-4401-acbf-a246cdcc4949

@den-crane
Copy link
Copy Markdown
Contributor

@den-crane
Copy link
Copy Markdown
Contributor

@timxian
Copy link
Copy Markdown

timxian commented Jul 22, 2025

May I know when this fix will be released?

@scanhex12
Copy link
Copy Markdown
Member Author

@timxian @jkartseva @den-crane I added all your cases and updated the documentation, now everything should work. Please check.

Copy link
Copy Markdown
Member

@jkartseva jkartseva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@scanhex12 scanhex12 enabled auto-merge July 24, 2025 10:42
@scanhex12 scanhex12 added this pull request to the merge queue Jul 24, 2025
Merged via the queue into ClickHouse:master with commit ff1d2c8 Jul 24, 2025
124 checks passed
@scanhex12 scanhex12 deleted the fix_sum_mt branch July 24, 2025 14:46
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jul 24, 2025
@bharatnc
Copy link
Copy Markdown
Contributor

@scanhex12 I'll add a backport for this bug-fix so that it can be fixed in v25.6 and later (when CoalescingMergeTree seems to be first available).

@bharatnc bharatnc added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Jul 31, 2025
@robot-ch-test-poll robot-ch-test-poll added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Jul 31, 2025
robot-clickhouse-ci-1 added a commit that referenced this pull request Jul 31, 2025
Cherry pick #83892 to 25.7: Don't delete default rows in coalescing merge tree
clickhouse-gh bot added a commit that referenced this pull request Jul 31, 2025
Backport #83892 to 25.7: Don't delete default rows in coalescing merge tree
robot-ch-test-poll4 added a commit that referenced this pull request Aug 2, 2025
Cherry pick #83892 to 25.6: Don't delete default rows in coalescing merge tree
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Aug 2, 2025
clickhouse-gh bot added a commit that referenced this pull request Aug 8, 2025
Backport #83892 to 25.6: Don't delete default rows in coalescing merge tree
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-bugfix Pull request with bugfix, not backported by default pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CoalescingMergeTree is unable to store default values, like zeros or empty strings

8 participants