[C++] Shrink compressed Arrow IPC buffer #36293

Closed
marin-ma opened this issue Jun 26, 2023 · 0 comments · Fixed by #36294

Describe the enhancement requested

We use the Arrow IPC data format for shuffle operations in our project https://github.com/oap-project/gluten. To evaluate performance, we ran TPC-H benchmarks at SF2T on a bare-metal machine. By shrinking each buffer immediately after compressing it, we observed that average memory usage decreased by 31% and total runtime decreased by 7%.

Component(s)

C++

pitrou pushed a commit that referenced this issue Jun 28, 2023
…er and shrink after compression (#36294)

### Rationale for this change

Described in issues #36293 and #34025.

### What changes are included in this PR?

* Allocate buffer for compressed data using the memory pool given by the user
* Shrink compressed data buffer after compression to conserve memory, as the compressed data might be much smaller than the theoretical max compressed data size
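
The allocate-then-shrink pattern described in the list above can be sketched in plain C++. This is only an illustration, not the code from the PR: the `Buffer` struct, `MaxCompressedLen`, `RleCompress`, and `CompressAndShrink` are hypothetical names, and a toy run-length encoder stands in for a real codec. It uses no Arrow APIs and leaves out the memory pool part of the change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical buffer type that tracks bytes in use and bytes allocated.
struct Buffer {
  std::unique_ptr<std::uint8_t[]> data;
  std::size_t size = 0;      // bytes holding compressed data
  std::size_t capacity = 0;  // bytes actually allocated
};

// Toy codec: run-length encoding as (count, value) pairs.
// Worst case (no repeated bytes) is twice the input size.
std::size_t MaxCompressedLen(std::size_t input_len) { return 2 * input_len; }

std::size_t RleCompress(const std::uint8_t* in, std::size_t n, std::uint8_t* out) {
  std::size_t written = 0;
  std::size_t i = 0;
  while (i < n) {
    const std::uint8_t value = in[i];
    std::size_t run = 1;
    // Extend the run while bytes match; a count must fit in one byte.
    while (i + run < n && in[i + run] == value && run < 255) ++run;
    out[written++] = static_cast<std::uint8_t>(run);
    out[written++] = value;
    i += run;
  }
  return written;
}

Buffer CompressAndShrink(const std::vector<std::uint8_t>& input) {
  Buffer buf;
  // 1. Reserve the theoretical maximum, since the real size is unknown.
  buf.capacity = MaxCompressedLen(input.size());
  buf.data = std::make_unique<std::uint8_t[]>(buf.capacity);

  // 2. Compress into the oversized buffer.
  buf.size = RleCompress(input.data(), input.size(), buf.data.get());
  assert(buf.size <= buf.capacity);

  // 3. Shrink to the bytes actually written so the slack is released.
  //    (This sketch copies; a real allocator may shrink in place.)
  if (buf.size < buf.capacity) {
    auto exact = std::make_unique<std::uint8_t[]>(buf.size);
    std::memcpy(exact.get(), buf.data.get(), buf.size);
    buf.data = std::move(exact);
    buf.capacity = buf.size;
  }
  return buf;
}
```

With this toy codec, compressing 1000 identical bytes reserves 2000 bytes up front but keeps only 8 after the shrink step. That gap between the worst-case bound and the actual compressed size is the memory the change aims to give back.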

### Are these changes tested?

Covered by existing tests.

### Are there any user-facing changes?

No.

* Closes: #36293
* Closes: #34025

Authored-by: Rong Ma <rong.ma@intel.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
pitrou added this to the 13.0.0 milestone Jun 28, 2023