Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve internal row to columnar host memory by using a combined spillable buffer #10450

Merged
merged 1 commit into from
Feb 21, 2024

Conversation

jbrennan333
Copy link
Collaborator

This improves the host memory oom handling for InternalRowToColumnarBatchIterator by using a single spillable host buffer for both the data and offsets, and then slicing it to pull out the individual host buffers whenever we need to operate on them.

This simplifies the code and makes the host memory handling work better. Previously we were allocating one host buffer and making it spillable, and then allocating another buffer and making it spillable, all inside a single withRetry block. By doing this, we were sometimes able to allocate the second only because we spilled the first, and we really need both or nothing. When we later tried to get both host buffers from the two separate spillable buffers, we more frequently hit OOM with no retries.

Prior to this change, I needed about 32GB host memory limit run q75 and q10 with ShuffleExchangeExec operator disabled on my desktop at scale 100 with 16 executor cores. With this change, I can run them with a 12GB host memory limit.

I've run unit tests and integration tests for this and tested q10 and q75 locally.
I am in the process of running the full nds benchmark with host memory restricted.

…lable buffer

Signed-off-by: Jim Brennan <jimb@nvidia.com>
@jbrennan333 jbrennan333 added improve reliability Features to improve reliability or bugs that severly impact the reliability of the plugin labels Feb 20, 2024
@jbrennan333 jbrennan333 self-assigned this Feb 20, 2024
@jbrennan333 jbrennan333 linked an issue Feb 20, 2024 that may be closed by this pull request
@jbrennan333 jbrennan333 linked an issue Feb 20, 2024 that may be closed by this pull request
@jbrennan333
Copy link
Collaborator Author

build

@jbrennan333
Copy link
Collaborator Author

I have run power run at scale 100 on my desktop with ShuffleExchangeExec disabled (to force more row to columnar conversions) and with host memory limited to 24g. I am not seeing any ooms in InternalRowToColumnarBatchIterator, but I do see some in other places (ColumnarToRowIterator.loadNextBatch and AcceleratedColumnarToRowIterator.setupBatchAndClose). I don't see these failures if I don't disable ShuffleExchangeExec.

@jbrennan333
Copy link
Collaborator Author

I was able to run the full power-run at scale 100 with host memory limited to 8g with no ooms (this was without disabling ShuffleExchangeExec, so it doesn't test this change much - just following up on my previous comment.

@jbrennan333 jbrennan333 merged commit d2c9ed4 into NVIDIA:branch-24.04 Feb 21, 2024
40 of 41 checks passed
@sameerz
Copy link
Collaborator

sameerz commented Feb 24, 2024

Was there any performance impact seen when running the NDS tests?

@jbrennan333
Copy link
Collaborator Author

Was there any performance impact seen when running the NDS tests?

This code is not normally exercised with nds, unless I disable the ShuffleExchangeExec operator. I did not do a performance comparison for this. I think it is unlikely that there would be any noticeable change except in restricted host memory situations, where this will do better because it reduces thrashing when tasks get only one of the buffers they need.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improve reliability Features to improve reliability or bugs that severly impact the reliability of the plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add Host Memory Retry for Row to Columnar Conversion
3 participants