Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid humongous blocks #103340

Merged
merged 5 commits into from Dec 18, 2023
Merged

Avoid humongous blocks #103340

merged 5 commits into from Dec 18, 2023

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Dec 12, 2023

The HeapAttackIT#testFetchTooManyMvLongs was OOM, even though plenty of memory was available and all memory usage was accurately tracked. The issue lies in the wasteful space occupied by humongous objects, specifically blocks with large backing arrays. The space left in regions by these humongous objects remains unused. This PR addresses the problem by forcing blocks to use BigArrays once the memory used by their primitive backing array exceeds a specified threshold - 512KB in this PR.

*/
public final class BooleanBigArrayBlock extends AbstractArrayBlock implements BooleanBlock {

private static final long BASE_RAM_BYTES_USED = 0; // TODO: fix this
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will address this in a follow-up for both BigArrays Vectors and Blocks.

);
int currentLength = array.length;
// TODO: should be ANDed instead?
if (currentLength - valueCount > 1024 || valueCount < (currentLength / 2)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be ANDed instead of ORed. Let's discuss this, and I can make the change in a follow-up if we agree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ happy to separate it.

@@ -344,7 +344,6 @@ public void testFetchMvLongs() throws IOException {
assertMap(map, matchesMap().entry("columns", columns));
}

@AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/100528")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can unmute this test now.

@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn marked this pull request as ready for review December 12, 2023 20:05
@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Dec 12, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is a good change. it would be interesting to see what impact it might have on performance, but we can check that post-merge in the nightly benchmarks, since the reliability is the higher-order bit here.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 13, 2023

it would be interesting to see what impact it might have on performance, but we can check that post-merge in the nightly benchmarks, since the reliability is the higher-order bit here

Thanks Chris! Yes, that's very possible. I will keep an eye on the benchmark. If there is a performance impact, then I think we switch to BigArrays blocks only in the build method.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should limit the number of elements in the block rather than move to big arrays? We don't have a good way to cut existing blocks but maybe that'd be better?

private double[] values;
private interface DoubleValues extends Releasable {
void setValue(int index, double value);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry about these things slowing things down by making the write bimorphic instead of an array access.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed 5e5330d.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 13, 2023

Maybe we should limit the number of elements in the block rather than move to big arrays?

Thanks, Nik. I think we prefer not to fail the query in this case and we don't have a correct estimate upfront for multi-valued fields. I think we can avoid the write bimorphic. Since the number of concurrent builders is bounded by the number of execution pipepline, it's okay to always use primitive arrays for the block builders. And in the build method, we can decide to switch to BigsArray if its size greater than the threshold. I will update this PR.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only got to take a short look at this PR, but I think the approach is fine. Left a minor remark only.

@dnhatn dnhatn requested a review from nik9000 December 13, 2023 17:27
@alex-spies
Copy link
Contributor

Maybe we should limit the number of elements in the block rather than move to big arrays?

Thanks, Nik. I think we prefer not to fail the query in this case and we don't have a correct estimate upfront for multi-valued fields. I think we can avoid the write bimorphic. Since the number of concurrent builders is bounded by the number of execution pipepline, it's okay to always use primitive arrays for the block builders. And in the build method, we can decide to switch to BigsArray if its size greater than the threshold. I will update this PR.

Hmm, but do we really need to fail the query?

I guess, ideally we would want to emit multiple blocks once a certain total position count of the block is reached, no? (Currently, I think we do this based on position count only, which doesn't work with MVs.)

With BigArray-backed blocks, isn't there a risk that some downstream operator makes wasteful use of memory because it expects a reasonably-sized block that it can process in one go? testFetchTooManyMvLongs only tests a simple query FROM mv_longs, but I'm not sure if that really guarantees that a more complex query couldn't do something bad with humongous blocks, no?

In any case, IMO this is fine as it avoids OOM in more cases, but maybe a longer-term solution would be to cut blocks after a certain size when working with MVs.

@dnhatn
Copy link
Member Author

dnhatn commented Dec 18, 2023

Thanks, everyone! I am going to merge this PR and will keep an eye on the benchmark for the next several days.

@dnhatn dnhatn merged commit 4a583d9 into elastic:main Dec 18, 2023
15 checks passed
@dnhatn dnhatn deleted the big-arrays branch December 18, 2023 15:34
javanna pushed a commit to javanna/elasticsearch that referenced this pull request Dec 21, 2023
The HeapAttackIT#testFetchTooManyMvLongs was OOM, even though plenty of 
memory was available and all memory usage was accurately tracked. The
issue lies in the wasteful space occupied by humongous objects,
specifically blocks with large backing arrays. The space left in regions
by these humongous objects remains unused. This PR addresses the problem
by forcing blocks to use BigArrays once the memory used by their
primitive backing array exceeds a specified threshold - 512KB in this
PR.
navarone-feekery pushed a commit to navarone-feekery/elasticsearch that referenced this pull request Dec 22, 2023
The HeapAttackIT#testFetchTooManyMvLongs was OOM, even though plenty of 
memory was available and all memory usage was accurately tracked. The
issue lies in the wasteful space occupied by humongous objects,
specifically blocks with large backing arrays. The space left in regions
by these humongous objects remains unused. This PR addresses the problem
by forcing blocks to use BigArrays once the memory used by their
primitive backing array exceeds a specified threshold - 512KB in this
PR.
jbaiera pushed a commit to jbaiera/elasticsearch that referenced this pull request Jan 10, 2024
The HeapAttackIT#testFetchTooManyMvLongs was OOM, even though plenty of 
memory was available and all memory usage was accurately tracked. The
issue lies in the wasteful space occupied by humongous objects,
specifically blocks with large backing arrays. The space left in regions
by these humongous objects remains unused. This PR addresses the problem
by forcing blocks to use BigArrays once the memory used by their
primitive backing array exceeds a specified threshold - 512KB in this
PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:QL (Deprecated) Meta label for query languages team v8.13.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants