Skip to content

feat(spill): use memory pool for arrow operations#270

Merged
lucasfang merged 7 commits into
mainfrom
codex/arrow-memory-pool
May 12, 2026
Merged

feat(spill): use memory pool for arrow operations#270
lucasfang merged 7 commits into
mainfrom
codex/arrow-memory-pool

Conversation

@zjw1111
Copy link
Copy Markdown
Collaborator

@zjw1111 zjw1111 commented May 11, 2026

Purpose

Linked issue: N/A

Route Arrow sort and spill IPC operations through the project MemoryPool so allocations are accounted consistently during in-memory key-value sorting and external spill read/write paths.

Tests

  • cmake --build build --target paimon-core-test

API and Format

No public API, storage format, or protocol changes.

Documentation

No new feature documentation required.

Generative AI tooling

Generated-by: OpenAI Codex

@zjw1111 zjw1111 marked this pull request as ready for review May 11, 2026 06:40
Copilot AI review requested due to automatic review settings May 11, 2026 06:40
@zjw1111 zjw1111 changed the title fix(mergetree): use memory pool for arrow operations feat(spill): use memory pool for arrow operations May 11, 2026
Comment thread src/paimon/core/mergetree/spill_writer.cpp
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Routes Arrow IPC spill read/write and in-memory Arrow compute sorting through the project MemoryPool so Arrow allocations are consistently tracked during merge-tree external sorting / spill workflows.

Changes:

  • Thread MemoryPool into SpillWriter creation and configure Arrow IPC write options to use it.
  • Configure Arrow IPC read options to use the reader’s Arrow memory pool.
  • Use an arrow::compute::ExecContext backed by the project pool for SortIndices.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/paimon/core/mergetree/spill_writer.h Extends SpillWriter API to accept a MemoryPool and stores an Arrow pool pointer.
src/paimon/core/mergetree/spill_writer.cpp Creates Arrow pool from project pool and sets Arrow IPC write options to use it.
src/paimon/core/mergetree/spill_reader.cpp Sets Arrow IPC read options to use the reader’s memory pool.
src/paimon/core/mergetree/external_sort_buffer.cpp Passes the buffer’s pool into SpillWriter::Create.
src/paimon/core/mergetree/spill_reader_writer_test.cpp Updates test helper to pass pool_ into SpillWriter::Create.
src/paimon/core/io/key_value_in_memory_record_reader.h Stores an Arrow memory pool for Arrow compute operations.
src/paimon/core/io/key_value_in_memory_record_reader.cpp Uses pooled ExecContext for arrow::compute::SortIndices.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/paimon/core/mergetree/spill_writer.cpp Outdated
Comment thread src/paimon/core/mergetree/spill_writer.h
Comment thread src/paimon/core/io/key_value_in_memory_record_reader.h
Comment thread src/paimon/core/io/key_value_in_memory_record_reader.cpp
Comment thread src/paimon/core/mergetree/spill_writer.cpp
@zjw1111 zjw1111 force-pushed the codex/arrow-memory-pool branch from 440583a to 5547af2 Compare May 11, 2026 10:12
@alibaba alibaba deleted a comment from CLAassistant May 11, 2026
Copy link
Copy Markdown
Collaborator

@lxy-9602 lxy-9602 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Collaborator

@lucasfang lucasfang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@lucasfang lucasfang merged commit 36e1e94 into main May 12, 2026
19 checks passed
@zjw1111 zjw1111 deleted the codex/arrow-memory-pool branch May 21, 2026 05:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants