
Try fix 02151_hash_table_sizes_stats.sh test #48178

Merged (2 commits, Mar 31, 2023)

Conversation

nickitat (Member)

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

@nickitat requested a review from tavplubix on March 29, 2023 15:45
@nickitat linked an issue on Mar 29, 2023 that may be closed by this pull request
tavplubix (Member) commented on Mar 29, 2023:

Our memory trackers work with sanitizers as well, so why does the OOM happen?

@tavplubix self-assigned this on Mar 29, 2023
nickitat (Member, Author) commented:

I have no idea; let's at least reduce the amount of noise.

tavplubix (Member) commented on Mar 29, 2023:

Let's try to debug it. OOMs like this usually happen when we allocate a huge chunk of memory without checking that it doesn't exceed the configured limits, i.e. when CurrentMemoryTracker::allocNoThrow is called instead of CurrentMemoryTracker::alloc for a huge allocation. That is exactly what happens when we use plain new or malloc for huge allocations without additional checks (malloc and new have to go through allocNoThrow because they cannot throw). I quickly looked through Aggregator and found two suspicious places (the same statement at two different call sites; see the sketch at the end of this comment for one way such allocations could be guarded):

std::unique_ptr<AggregateDataPtr[]> places(new AggregateDataPtr[row_end]);

std::unique_ptr<AggregateDataPtr[]> places(new AggregateDataPtr[row_end]);

Here row_end is effectively the block size and can be quite large. I tried increasing max_block_size and got an interesting result:
https://pastila.nl/?00b0c940/f011d9e7124b0c30b20cdb4be9872be6

At first, it logs

[ 982431 ] ... <Debug> MemoryTracker: Current memory usage (for query): 16.00 GiB.

and after that it throws an exception in another thread:

Memory limit (for user) exceeded: would use 16.00 GiB (attempt to allocate chunk of 17179953272 bytes), maximum: 953.67 MiB

The maximum was about 1 GiB, but it looks like the Aggregator had already allocated 16 GiB.
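For what it's worth, the numbers are consistent with the places array above: the failed chunk divided by sizeof(AggregateDataPtr) (8 bytes on 64-bit) comes out to just over 2^31, i.e. row_end was roughly 2^31 rows, which is plausible after cranking up max_block_size. A quick back-of-the-envelope check (my arithmetic, not taken from the logs):

#include <cstdio>

int main()
{
    const unsigned long long chunk_bytes = 17179953272ULL; // from the exception message
    const unsigned long long ptr_size = 8;                 // sizeof(AggregateDataPtr) on 64-bit
    // Prints 2147494159, i.e. just over 2^31 (= 2147483648) pointers.
    printf("%llu rows\n", chunk_bytes / ptr_size);
    return 0;
}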

Could this be the reason? And do we have other (less obvious) places in Aggregator where huge allocations may happen without proper checks?
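If that diagnosis is right, one way to make such allocations fail cleanly is to probe the throwing tracker path right before the data-dependent new[]. A minimal sketch of the idea, assuming the CurrentMemoryTracker::alloc/free interface from src/Common/CurrentMemoryTracker.h (an illustration, not the fix in this PR):

#include <Common/CurrentMemoryTracker.h> // ClickHouse-internal header (assumed)

// Probe the throwing path before a potentially huge, data-dependent allocation,
// so an over-limit request fails with a proper MEMORY_LIMIT_EXCEEDED exception
// instead of slipping through the no-throw path taken by plain operator new.
const Int64 bytes = static_cast<Int64>(row_end) * sizeof(AggregateDataPtr);
CurrentMemoryTracker::alloc(bytes); // throws if the limit would be exceeded
CurrentMemoryTracker::free(bytes);  // undo the probe; operator new accounts
                                    // the same bytes again via allocNoThrow
std::unique_ptr<AggregateDataPtr[]> places(new AggregateDataPtr[row_end]);

The alloc/free pair only enforces the limit check; the actual accounting still happens inside the overloaded operator new.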

tavplubix (Member) left a review comment:

We can merge it to suppress the issue temporarily, but the root cause must be investigated. We definitely have some bugs with memory accounting in Aggregator.

Labels: pr-not-for-changelog (this PR should not be mentioned in the changelog)

Successfully merging this pull request may close the following issue: 02151_hash_table_sizes_stats triggers OOM

3 participants