feat(spill): use memory pool for arrow operations #270
Merged
lxy-9602 reviewed May 11, 2026
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Routes Arrow IPC spill read/write and in-memory Arrow compute sorting through the project MemoryPool so Arrow allocations are consistently tracked during merge-tree external sorting / spill workflows.
Changes:
- Thread `MemoryPool` into `SpillWriter` creation and configure Arrow IPC write options to use it.
- Configure Arrow IPC read options to use the reader’s Arrow memory pool.
- Use an `arrow::compute::ExecContext` backed by the project pool for `SortIndices`.
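The effect of the changes listed above can be illustrated with a toy model. In Arrow C++, `IpcWriteOptions` and `IpcReadOptions` each carry a `memory_pool` field and `arrow::compute::ExecContext` takes a pool in its constructor; when none is supplied, Arrow falls back to its default pool. The sketch below does not use Arrow or the project's real classes: `CountingPool`, `DefaultPool`, `WriteOptions`, and `WriteBatch` are hypothetical names invented for illustration, and the scratch-buffer write is an assumption about the shape of the workload, not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a tracked memory pool (not Arrow's or the project's class).
class CountingPool {
 public:
  void* Allocate(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p != nullptr) {
      in_use_ += bytes;
      if (in_use_ > peak_) peak_ = in_use_;
    }
    return p;
  }
  void Free(void* p, std::size_t bytes) {
    assert(bytes <= in_use_);  // accounting must never go negative
    std::free(p);
    in_use_ -= bytes;
  }
  std::size_t bytes_in_use() const { return in_use_; }
  std::size_t peak_bytes() const { return peak_; }

 private:
  std::size_t in_use_ = 0;
  std::size_t peak_ = 0;
};

// Process-wide fallback pool, playing the role of a library default pool.
inline CountingPool* DefaultPool() {
  static CountingPool pool;
  return &pool;
}

// Hypothetical options struct: uses the fallback pool unless the caller overrides it.
struct WriteOptions {
  CountingPool* memory_pool = DefaultPool();
};

// Simulates a spill write that needs a scratch buffer; returns the bytes "written".
inline std::size_t WriteBatch(const std::vector<std::int64_t>& rows,
                              const WriteOptions& options) {
  const std::size_t bytes = rows.size() * sizeof(std::int64_t);
  void* scratch = options.memory_pool->Allocate(bytes);
  if (scratch == nullptr) return 0;
  // A real writer would serialize `rows` into `scratch` here.
  options.memory_pool->Free(scratch, bytes);
  return bytes;
}
```

With default-constructed options the allocation lands in the fallback pool and a separately created project pool sees nothing; setting `memory_pool` on the options makes the same write show up in the project pool's counters, which is the accounting behavior this PR aims for.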
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/paimon/core/mergetree/spill_writer.h | Extends SpillWriter API to accept a MemoryPool and stores an Arrow pool pointer. |
| src/paimon/core/mergetree/spill_writer.cpp | Creates Arrow pool from project pool and sets Arrow IPC write options to use it. |
| src/paimon/core/mergetree/spill_reader.cpp | Sets Arrow IPC read options to use the reader’s memory pool. |
| src/paimon/core/mergetree/external_sort_buffer.cpp | Passes the buffer’s pool into SpillWriter::Create. |
| src/paimon/core/mergetree/spill_reader_writer_test.cpp | Updates test helper to pass pool_ into SpillWriter::Create. |
| src/paimon/core/io/key_value_in_memory_record_reader.h | Stores an Arrow memory pool for Arrow compute operations. |
| src/paimon/core/io/key_value_in_memory_record_reader.cpp | Uses pooled ExecContext for arrow::compute::SortIndices. |
440583a to 5547af2
Purpose
Linked issue: N/A
Route Arrow sort and spill IPC operations through the project `MemoryPool` so allocations are accounted consistently during in-memory key-value sorting and external spill read/write paths.
Tests
`cmake --build build --target paimon-core-test`
API and Format
No public API, storage format, or protocol changes.
Documentation
No new feature documentation required.
Generative AI tooling
Generated-by: OpenAI Codex