Reduce query memory tracking drift for executable UDF #83929
Merged
azat merged 1 commit into ClickHouse:master on Jul 18, 2025
Conversation
Contributor
azat
reviewed
Jul 17, 2025
Member
azat
left a comment
Thanks, looks good! But the test requires some adjustments
Currently, executable functions that are called over multiple blocks during a single query execution produce significant overhead in the query memory tracker. This is mostly caused by the tasks that send data to shell commands: `TimeoutWriteBufferFromFileDescriptor`, which is part of `SendDataTask`, is created in the query thread and increments the query's memory counter, but the `SendDataTask`s are then executed and destroyed in "anonymous" threads from the global pool, which decrement only the global memory counter. As a result, the memory tracked at the query level diverges significantly from the memory tracked globally, which may lead to MEMORY_LIMIT_EXCEEDED errors.

This patch partially fixes the memory drift, specifically the part produced by `TimeoutWriteBufferFromFileDescriptor` objects, by attaching the data-sending threads to the query thread group.
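For readers unfamiliar with the mechanism, the drift can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTracker`, `current_query`, `trackAlloc`, `trackFree`, and `runOnce` are hypothetical names invented for this illustration and are not ClickHouse APIs. It assumes a per-thread pointer decides which query (if any) an allocation or deallocation is charged to.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Toy model only: every name here is hypothetical, not a ClickHouse API.
struct ToyTracker
{
    std::atomic<int64_t> bytes{0};
};

constexpr int64_t buffer_size = 1 << 20;

ToyTracker global_tracker;                          // server-wide counter
thread_local ToyTracker * current_query = nullptr;  // non-null when the thread is attached to a query

void trackAlloc(int64_t size)
{
    global_tracker.bytes += size;
    if (current_query)
        current_query->bytes += size;
}

void trackFree(int64_t size)
{
    global_tracker.bytes -= size;
    if (current_query)
        current_query->bytes -= size;
}

// Allocates a buffer in the "query thread" and frees it in a worker thread.
// Returns the bytes still attributed to the query afterwards (the drift).
int64_t runOnce(bool attach_worker)
{
    ToyTracker query;

    current_query = &query;   // the query thread is attached to the query
    trackAlloc(buffer_size);  // the buffer is created here

    std::thread worker([&query, attach_worker]
    {
        if (attach_worker)
            current_query = &query;  // stands in for attaching to the query thread group
        trackFree(buffer_size);      // the buffer is destroyed in the worker
        current_query = nullptr;
    });
    worker.join();

    current_query = nullptr;
    return query.bytes.load();
}
```

In this model, `runOnce(false)` leaves the full buffer size charged to the query even though the global counter has returned to zero, while `runOnce(true)` leaves both counters balanced. Repeating the unattached case once per block is what would make the query-level value grow over a long query.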
d22ad6b to
fb79ace
Compare
Contributor
Author
Looks unrelated, but I'll still check what caused the failure.
azat
approved these changes
Jul 18, 2025
Merged via the queue into ClickHouse:master with commit 31e0fc2 on Jul 18, 2025
120 of 123 checks passed
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Reduce query memory tracking overhead for executable user-defined functions.
Documentation entry for user-facing changes