Fix coalescing merge tree segfault for large strings #85709

Merged
scanhex12 merged 2 commits into ClickHouse:master from scanhex12:fix_arena_mt
Aug 19, 2025

@scanhex12
Member

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix coalescing merge tree segfault for large strings. This closes #84582

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Contributor

clickhouse-gh bot commented Aug 15, 2025

Workflow [PR], commit [095d70a]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stateless tests (amd_tsan, sequential, 1/2) | | failure | | |
| | 03532_crash_in_aggregation_because_of_lost_exception | FAIL | | |
| | Killed by signal (in clickhouse-server.log or clickhouse-server.err.log) | FAIL | | |
| Stateless tests (amd_tsan, s3 storage, parallel) | | failure | | |
| | 02443_detach_attach_partition | FAIL | | |
| Stateless tests (arm_binary, parallel) | | failure | | |
| | 01825_new_type_json_ghdata_insert_select | FAIL | | |

@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Aug 15, 2025
@antaljanosbenjamin antaljanosbenjamin self-assigned this Aug 15, 2025
@antaljanosbenjamin
Member

  • 01825_new_type_json_ghdata_insert_select is flaky
  • 02443_detach_attach_partition is flaky

For the OOM kill I have to check the logs.

@antaljanosbenjamin
Member

antaljanosbenjamin commented Aug 19, 2025

As @nikitamikhaylov pointed out, these memory errors come from the CI Logs cluster, not from the local server.

So the memory tracker was already killing things from the very beginning:

2025.08.18 20:32:27.232460 [ 1628 ] {} <Information> Application: Will watch for the process with pid 1634
2025.08.18 20:32:27.233617 [ 1634 ] {} <Information> Application: Forked a child process to watch
2025.08.18 20:32:27.234095 [ 1634 ] {} <Information> StatusFile: Writing pid 1634 to /etc/clickhouse-server/clickhouse-server.pid
2025.08.18 20:32:27.234728 [ 1634 ] {} <Debug> CrashWriter: Sending crash reports is initialized with https://crash.clickhouse.com/ endpoint (anonymized)
2025.08.18 20:32:27.234913 [ 1634 ] {} <Debug> Application: Sending logical errors is enabled
2025.08.18 20:32:27.737934 [ 1634 ] {} <Information> Application: Starting ClickHouse 25.8.1.1 (revision: 54501, git hash: a06ab022194df35e98ef285cabcbfe4b444103c3, build id: 0559120BED547127CDC20CB27639C1C036DA93FF), PID 1634
...
2025.08.18 20:32:49.825299 [ 2204 ] {BgDistSchPool::0cb96e5d-0265-4976-be16-20bc42b2bebb} <Error> system.opentelemetry_span_log_sender.DistributedInsertQueue.default: Code: 241. DB::Exception: Received from kng4alm55c.us-east-2.aws.clickhouse-staging.com:9440. DB::Exception: User memory limit exceeded: would use 29.84 GiB (attempt to allocate chunk of 1.19 MiB bytes), maximum: 29.80 GiB. OvercommitTracker decision: Memory overcommit has not freed enough memory. Stack trace:

Therefore I think the issue is not connected to the fix, but I will restart the job to be more confident.
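
The reasoning above rests on the `Received from <host>:<port>` marker in the log line, which suggests the memory limit exception was relayed from a remote server rather than raised locally. A minimal sketch of that check follows, assuming log lines shaped like the excerpt above; the function name `classify_memory_errors` and both regular expressions are hypothetical helpers for illustration, not part of ClickHouse or its CI tooling.

```python
import re

# Assumption: a relayed exception carries "Received from host:port." in its
# message, as in the log excerpt above.
REMOTE_RE = re.compile(r"Received from (?P<host>[^\s:]+):(?P<port>\d+)\.")
# Code 241 is the error code shown in the excerpt for the memory limit error.
MEMORY_RE = re.compile(r"Code: 241\b")


def classify_memory_errors(lines):
    """Split memory-limit log lines into remote hosts and local lines."""
    remote_hosts, local_lines = [], []
    for line in lines:
        if not MEMORY_RE.search(line):
            continue  # not a memory-limit error, ignore
        match = REMOTE_RE.search(line)
        if match:
            remote_hosts.append(match.group("host"))
        else:
            local_lines.append(line)
    return remote_hosts, local_lines
```

If every matching line lands in the remote list, the errors most likely originate on the remote cluster and not in the server under test, which is the conclusion drawn here.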

I think the new test should have run in that batch, but it hasn't run yet. If it fails again, we might have a problem.

@scanhex12 scanhex12 added this pull request to the merge queue Aug 19, 2025
Merged via the queue into ClickHouse:master with commit b499c8d Aug 19, 2025
237 of 242 checks passed
@scanhex12 scanhex12 deleted the fix_arena_mt branch August 19, 2025 09:26
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Aug 19, 2025
@robot-ch-test-poll2 robot-ch-test-poll2 added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Aug 19, 2025
robot-ch-test-poll4 added a commit that referenced this pull request Aug 19, 2025
Cherry pick #85709 to 24.8: Fix coalescing merge tree segfault for large strings
robot-ch-test-poll4 added a commit that referenced this pull request Aug 19, 2025
Cherry pick #85709 to 25.3: Fix coalescing merge tree segfault for large strings
robot-ch-test-poll4 added a commit that referenced this pull request Aug 19, 2025
Cherry pick #85709 to 25.5: Fix coalescing merge tree segfault for large strings
robot-ch-test-poll4 added a commit that referenced this pull request Aug 19, 2025
Cherry pick #85709 to 25.6: Fix coalescing merge tree segfault for large strings
robot-ch-test-poll4 added a commit that referenced this pull request Aug 19, 2025
Cherry pick #85709 to 25.7: Fix coalescing merge tree segfault for large strings
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Aug 19, 2025
clickhouse-gh bot added a commit that referenced this pull request Aug 19, 2025
Backport #85709 to 25.6: Fix coalescing merge tree segfault for large strings
clickhouse-gh bot added a commit that referenced this pull request Aug 19, 2025
Backport #85709 to 25.7: Fix coalescing merge tree segfault for large strings
Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-backports-created-cloud: deprecated label, NOOP
  • pr-critical-bugfix
  • pr-must-backport: Pull request should be backported intentionally. Use this label with great care!
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo
  • v25.6-must-backport

Development

Successfully merging this pull request may close these issues.

CoalescingMergeTree insert UTF-8 string SEGV

5 participants