ESQL: Limit memory usage of `fold` (#118602) #120100

nik9000 · 2025-01-14T11:08:41Z

`fold` can be surprisingly heavy! The maximally efficient/paranoid thing would be to fold each expression one time, in the constant folding rule, and then store the result as a `Literal`. But this PR doesn't do that because it's a big change. Instead, it creates the infrastructure for tracking memory usage for folding and plugs it into as many places as possible. That's not perfect, but it's better.

This infrastructure limits the allocations of `fold`, similar to the `CircuitBreaker` infrastructure we use for values, but it's different in a critical way: you don't manually free any of the values. This is important because the plan itself isn't `Releasable`, which is required when using a real `CircuitBreaker`. We could have tried to make the plan releasable, but that'd be a huge change.

Right now there's a single limit of 5% of heap per query. We create the limit at the start of query planning and use it throughout planning.
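To make the "track but never free" idea concrete, here is a minimal sketch of a per-query allocation budget. The class name `FoldBudget` and its methods are hypothetical and are not taken from this PR's code; the sketch only assumes the behavior described above: a limit sized at 5% of heap, created once per query, charged on each fold, and with no release path.

```java
/**
 * Hypothetical per-query budget for memory allocated while folding.
 * Illustration only; the names here are assumptions, not the PR's classes.
 */
final class FoldBudget {
    private final long limitBytes;
    private long usedBytes;

    FoldBudget(long limitBytes) {
        if (limitBytes < 0) {
            throw new IllegalArgumentException("limit must not be negative");
        }
        this.limitBytes = limitBytes;
    }

    /** Budget sized as 5% of the JVM's maximum heap, created once per query. */
    static FoldBudget fivePercentOfHeap() {
        return new FoldBudget(Runtime.getRuntime().maxMemory() / 20);
    }

    /**
     * Charge an allocation against the budget. There is deliberately no
     * matching release method: the whole budget is simply dropped when
     * planning for the query finishes.
     */
    void track(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must not be negative");
        }
        if (bytes > limitBytes - usedBytes) {
            throw new IllegalStateException(
                "folding used more than [" + limitBytes + "] bytes"
            );
        }
        usedBytes += bytes;
    }

    long remaining() {
        return limitBytes - usedBytes;
    }
}
```

Because nothing is ever handed back, callers don't need the plan to be `Releasable`; the trade-off is that the budget counts cumulative allocations during planning rather than live memory.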

There are about 40 places that don't yet use it. We should get them plugged in as quickly as we can manage. After that, we should look to the maximally efficient/paranoid thing that I mentioned about waiting for constant folding. That's an even bigger change, one I'm not equipped to make on my own.


nik9000 added backport :Analytics/ES|QL AKA ESQL v8.18.0 labels Jan 14, 2025

nik9000 mentioned this pull request Jan 14, 2025

ESQL: Limit memory usage of fold #118602

Merged

nik9000 enabled auto-merge (squash) January 14, 2025 15:13

alex-spies added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jan 15, 2025

alex-spies approved these changes Jan 15, 2025


nik9000 merged commit a61670e into elastic:8.x Jan 15, 2025
15 checks passed

nik9000 deleted the fold_ctx_2_8x branch January 15, 2025 13:34
