Skip to content

Add ability limit amount of jemalloc profile flushes on MEMORY_LIMIT_EXCEEDED per time interval#101396

Merged
azat merged 2 commits intoClickHouse:masterfrom
azat:jemalloc_flush_profile_on_memory_exceeded_interval
Apr 2, 2026
Merged

Add ability limit amount of jemalloc profile flushes on MEMORY_LIMIT_EXCEEDED per time interval#101396
azat merged 2 commits intoClickHouse:masterfrom
azat:jemalloc_flush_profile_on_memory_exceeded_interval

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Mar 31, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add ability limit amount of jemalloc profile flushes on MEMORY_LIMIT_EXCEEDED per time interval

@azat azat requested a review from antonio2368 March 31, 2026 14:04
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 31, 2026

Workflow [PR], commit [f1f37e2]

Summary:

job_name test_name status info comment
Stateless tests (amd_asan_ubsan, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
Stateless tests (amd_tsan, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
Stateless tests (amd_debug, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
Stress test (arm_msan) failure
Server died FAIL cidb
MemorySanitizer: use-of-uninitialized-value (STID: 2410-47e0) FAIL cidb

AI Review

Summary

This PR adds a server setting jemalloc_flush_profile_on_memory_exceeded_interval and wires it through Server into MemoryTracker to rate-limit jemalloc profile dumps during MEMORY_LIMIT_EXCEEDED. The implementation is small and localized, and I did not find new correctness, safety, or performance issues introduced by the current PR head.

Missing context
  • ⚠️ CI is still running (several required checks are pending), so runtime validation is not complete yet.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Mar 31, 2026
Comment thread src/Common/MemoryTracker.cpp Outdated
@antonio2368 antonio2368 self-assigned this Mar 31, 2026
Comment thread src/Common/MemoryTracker.cpp Outdated
@azat azat force-pushed the jemalloc_flush_profile_on_memory_exceeded_interval branch from 8a741e9 to 4386a95 Compare April 1, 2026 09:36
Comment thread src/Common/MemoryTracker.cpp Outdated
{
auto flushed_profile = DB::Jemalloc::flushProfile(flush_prefix);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flushProfile calls checkProfilingEnabled and can throw when ClickHouse is started without jemalloc profiling (opt.prof=false).

In this MEMORY_LIMIT_EXCEEDED path, that would replace the intended memory-limit exception with BAD_ARGUMENTS, which is misleading and can break callers that rely on MEMORY_LIMIT_EXCEEDED handling.

Please keep the old guard (prof.active / opt.prof) before calling flushProfile, or catch and log profiling-related exceptions here so the original memory-limit flow is preserved.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it is better to keep it, since we do not enable jemalloc for non-Linux, I will do this separately, for now I will get the original check back.

@azat azat enabled auto-merge April 2, 2026 10:25
@azat azat force-pushed the jemalloc_flush_profile_on_memory_exceeded_interval branch from 4386a95 to f1f37e2 Compare April 2, 2026 10:37
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 2, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.10% +0.10%
Functions 90.90% 90.90% +0.00%
Branches 76.50% 76.60% +0.10%

Changed lines: 72.41% (42/58) | lost baseline coverage: 13 line(s) · Uncovered code

Full report · Diff report

@azat azat added this pull request to the merge queue Apr 2, 2026
Merged via the queue into ClickHouse:master with commit 1d1c667 Apr 2, 2026
158 of 163 checks passed
@azat azat deleted the jemalloc_flush_profile_on_memory_exceeded_interval branch April 2, 2026 17:28
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants