release-20.2: colexec: partially fix memory accounting in the external sorter #61016

Merged
merged 2 commits into release-20.2 on Feb 24, 2021

Conversation

yuzefovich
Member

Backport 1/4 commits from #60284.
Backport 1/2 commits from #60593.

/cc @cockroachdb/release


colexec: remove double-counting of some memory in external sort

Previously, we would register the memory usage of the partition that is
currently being spilled twice:

  • in a standalone memory account in inputPartitioningOperator, which is
    needed to know when to start a new partition
  • in the in-memory sorter used by the external sort to sort each
    partition, during the "spooling" phase.

However, the first usage is "fake": we don't actually buffer all of the
data flowing through the input partitioning operator, so we would
effectively double-count some memory. This commit fixes that behavior by
removing the memory account altogether, because a simple variable is
sufficient to determine where the partitions' boundaries are.
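The idea can be sketched as follows. This is a simplified, hypothetical illustration — the names `inputPartitioner`, `observeBatch`, and `maxPartitionSize` are invented here and are not the actual CockroachDB code — showing how a plain counter, rather than a registered memory account, suffices to decide partition boundaries:

```go
package main

import "fmt"

// inputPartitioner is a toy sketch of tracking partition boundaries with
// a plain variable. Because the partitioner does not actually buffer the
// data flowing through it, registering these bytes with a memory account
// would double-count memory already accounted for by the in-memory sorter.
type inputPartitioner struct {
	// bytesInPartition counts data seen in the current partition;
	// it is not registered with any memory account.
	bytesInPartition int64
	// maxPartitionSize is the threshold at which a new partition starts.
	maxPartitionSize int64
}

// observeBatch records the footprint of a batch flowing through and
// reports whether the current partition should be closed.
func (p *inputPartitioner) observeBatch(batchSize int64) bool {
	p.bytesInPartition += batchSize
	if p.bytesInPartition >= p.maxPartitionSize {
		p.bytesInPartition = 0
		return true
	}
	return false
}

func main() {
	p := &inputPartitioner{maxPartitionSize: 100}
	for i, size := range []int64{40, 40, 40, 90} {
		fmt.Printf("batch %d: start new partition = %v\n", i, p.observeBatch(size))
	}
}
```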

Release note: None

colexec: register memory used by dequeued batches from partitioned queue

Previously, we forgot to perform memory accounting for the batches that
are dequeued from the partitions in the external sort (which can be
substantial when we're merging multiple partitions at once and the
tuples are wide) and in the hash-based partitioner. This is now fixed.
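A minimal sketch of the kind of accounting step that was missing; the types and names here (`memoryAccount`, `batch`, `dequeue`) are invented for illustration and stand in for CockroachDB's actual memory-monitoring infrastructure:

```go
package main

import "fmt"

// memoryAccount is a toy stand-in for a real byte-tracking account.
type memoryAccount struct{ used int64 }

func (a *memoryAccount) grow(n int64)   { a.used += n }
func (a *memoryAccount) shrink(n int64) { a.used -= n }

// batch is a hypothetical placeholder for a dequeued columnar batch.
type batch struct{ footprint int64 }

// dequeue pops a batch read back from a spilled partition and, crucially,
// registers its memory footprint with the account — the step that was
// previously forgotten for batches dequeued during merging.
func dequeue(partition []batch, acc *memoryAccount) ([]batch, batch) {
	b := partition[0]
	acc.grow(b.footprint)
	return partition[1:], b
}

func main() {
	acc := &memoryAccount{}
	partition := []batch{{footprint: 4096}, {footprint: 8192}}
	partition, _ = dequeue(partition, acc)
	partition, _ = dequeue(partition, acc)
	fmt.Printf("accounted bytes after dequeuing: %d\n", acc.used)
}
```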

Additionally, this commit retains references to some internal operators
in the external sort in order to reuse the memory under the dequeued
batches (this will be beneficial if we perform repeated merging).

Also, this commit fixes an issue with repeated re-initialization of the
disk-backed operators in the disk spiller when the latter has been reset
(the problem led to redundant allocations and prevented reuse of the
available memory).

Accounting was slightly complicated by the fact that we were using the
same allocator for all usages. That would be quite wrong because in the
merge phase we have two distinct memory usages with different
lifecycles: the memory under the dequeued batches is kept (and reused
later), whereas the memory under the output batch of the ordered
synchronizer is released. We now handle these lifecycles correctly by
using separate allocators.
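The separate-allocator idea might be sketched like this; `allocator` here is a toy stand-in invented for illustration, not the real CockroachDB allocator type:

```go
package main

import "fmt"

// allocator is a toy stand-in that tracks bytes registered against a
// memory budget; it is not the actual CockroachDB allocator.
type allocator struct {
	name string
	used int64
}

func (a *allocator) adjust(delta int64) { a.used += delta }

func main() {
	// Two allocators with distinct lifecycles, mirroring the fix described
	// above: one owns the dequeued batches, whose memory is retained and
	// reused across merges; the other owns the ordered synchronizer's
	// output batch, whose memory is released after each merge.
	mergeAlloc := &allocator{name: "merge"}
	outputAlloc := &allocator{name: "output"}

	mergeAlloc.adjust(1 << 20)  // dequeued batches: kept for reuse
	outputAlloc.adjust(1 << 16) // output batch: transient

	// After the merge completes, only the output allocator is released;
	// the dequeued batches stay accounted for because they are reused.
	outputAlloc.adjust(-(1 << 16))

	fmt.Printf("%s: %d bytes, %s: %d bytes\n",
		mergeAlloc.name, mergeAlloc.used, outputAlloc.name, outputAlloc.used)
}
```

With a single allocator, releasing the output batch's bytes would be indistinguishable from releasing the retained batches' bytes, which is why the two lifecycles need separate accounting.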

Release note (bug fix): CockroachDB previously didn't account for some
RAM used when disk-spilling operations (like sorts and hash joins) were
using the temporary storage in the vectorized execution engine. This
could result in OOM crashes, especially when the rows are large in size.


@yuzefovich
Member Author

I picked these two commits for a backport because the first one has very low risk but is a very nice cleanup (and makes backporting the second easier), and the second commit fixes an outright omission of some memory accounting.

@yuzefovich yuzefovich merged commit 32b5305 into cockroachdb:release-20.2 Feb 24, 2021
@yuzefovich yuzefovich deleted the backport20.2-60284 branch February 24, 2021 16:29