Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory accounting in case of MEMORY_LIMIT_EXCEEDED errors #40249

Merged
merged 4 commits into from Aug 17, 2022

Conversation

azat
Copy link
Collaborator

@azat azat commented Aug 15, 2022

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix memory accounting in case of MEMORY_LIMIT_EXCEEDED errors (previously [peak] memory usage was takes failed allocations into account)

In case of memory allocation had been failed, it should not be take into
account anywhere, otherwise it will report incorrect memory usage for
query and in logs.

I found this while was digging into OOM in stress tests, and saw memory
allocations that had been significantly greater the memory limit (even
with overcommit), where something interesting happened,
consider the query with failed allocation for 1TiB,
but since ThreadStatus::untracked_memory still includes it,
it will be "freed" in ThreadStatus dtor, however it will not free but allocate,
since untracked_memory > 0, and since the code called from dtor,
exceptions from MemoryTracker will be locked, and amount/peak will be updated.

@azat azat marked this pull request as ready for review August 15, 2022 21:48
@robot-ch-test-poll robot-ch-test-poll added the pr-improvement Pull request with some product improvements label Aug 15, 2022
@KochetovNicolai
Copy link
Member

#39857

azat and others added 2 commits August 16, 2022 18:08
In case of memory allocation had been failed, it should not be take into
account anywhere, otherwise it will report incorrect memory usage for
query and in logs.

I found this while was digging into OOM in stress tests, and saw memory
allocations that had been significantly greater the memory limit (even
with overcommit).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
@azat azat force-pushed the fix-memory-account-on-error branch from 0673f92 to 8e4c967 Compare August 16, 2022 16:10
@azat azat marked this pull request as draft August 16, 2022 22:11
@KochetovNicolai KochetovNicolai marked this pull request as ready for review August 17, 2022 09:46
@KochetovNicolai
Copy link
Member

ASTFuzzer: #39991

@KochetovNicolai KochetovNicolai added the skip mergeable check The label to omit checks for required statuses. Does not help with `pr-feature` new docs check label Aug 17, 2022
@KochetovNicolai KochetovNicolai merged commit 8ec3f08 into ClickHouse:master Aug 17, 2022
@KochetovNicolai
Copy link
Member

@azat I have muted the test cause it is flaky. If you run it a few times in a row, it fails. Like here. I do not know why -- { serverError 241 } cannot mute exception.

@azat
Copy link
Collaborator Author

azat commented Aug 17, 2022

@azat I have muted the test cause it is flaky. If you run it a few times in a row, it fails. Like here. I do not know why -- { serverError 241 } cannot mute exception.

Yeah, I was on it actually (that's why I marked it as draft).
The problem here is not in serverError, but instead that the query indeed uses more memory.
Anyway, the problem should be in a different place, I will came back with a fix later.

/// more. It could be useful to enlarge Exception message in rethrow logic.
Int64 tmp = current_thread->untracked_memory;
current_thread->untracked_memory_limit_increase = current_thread->untracked_memory_limit;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KochetovNicolai found and interesting case here - #40424 (comment)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm suggesting #40321 anyway

azat added a commit to azat/ClickHouse that referenced this pull request Aug 22, 2022
It looks useless nowadays, because operator new cannot throw
MEMORY_LIMIT_EXCEEDED today, and so any code that works on Exception is
likely safe.

Refs: ClickHouse#40249
Refs: ClickHouse#24483
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-improvement Pull request with some product improvements skip mergeable check The label to omit checks for required statuses. Does not help with `pr-feature` new docs check
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants