ARROW-16187: [Go][Parquet] Properly utilize BufferedStream and buffer size when reading #12876

Closed

zeroshade
Member

Currently, the BufferSize in the ReaderProperties isn't used properly when BufferedStream is enabled. This PR fixes that, so enabling BufferedStream reading via the properties correctly uses the given buffer size when reading. The default buffer size is currently 16K. With BufferedStream enabled, reads larger than that ignore the buffering and pull directly from the underlying reader; with BufferedStream disabled, the entire page, or whatever else is being read, is pulled from the reader.

The buffer size can be set larger so that controlled reads improve performance on high-latency readers without spending the memory needed to read the entire column, page, or row group at once.
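To illustrate the trade-off described above, here is a minimal sketch that uses only Go's standard `bufio` package, not the Arrow Parquet reader or the code in this PR. The names `countingReader`, `stats`, and `readStats` are hypothetical helpers invented for the example. It relies on the documented behavior of `bufio.Reader.Read`, which reads directly from the underlying reader when the destination slice is at least as large as the internal buffer; that is analogous to, but not necessarily identical with, the behavior this PR addresses.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// countingReader stands in for a high-latency source: it records how many
// Read calls reach it and the largest single request it was asked for.
type countingReader struct {
	r      io.Reader
	calls  int
	maxReq int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	if len(p) > c.maxReq {
		c.maxReq = len(p)
	}
	return c.r.Read(p)
}

// stats summarizes the traffic that reached the underlying reader.
type stats struct {
	calls  int
	maxReq int
}

// readStats consumes all of data through a bufio.Reader with the given
// buffer size, issuing reads of at most chunk bytes, and reports what the
// underlying reader saw.
func readStats(data []byte, bufSize, chunk int) stats {
	src := &countingReader{r: bytes.NewReader(data)}
	br := bufio.NewReaderSize(src, bufSize)
	p := make([]byte, chunk)
	for remaining := len(data); remaining > 0; {
		n := chunk
		if remaining < n {
			n = remaining
		}
		if _, err := io.ReadFull(br, p[:n]); err != nil {
			panic(err)
		}
		remaining -= n
	}
	return stats{calls: src.calls, maxReq: src.maxReq}
}

func main() {
	data := make([]byte, 64<<10) // 64 KiB of input

	// Small reads through a 16 KiB buffer: the source is hit once per refill.
	fmt.Printf("%+v\n", readStats(data, 16<<10, 1<<10))
	// Reads larger than the buffer bypass it and go straight to the source.
	fmt.Printf("%+v\n", readStats(data, 16<<10, 32<<10))
	// A larger buffer trades memory for fewer trips to the source.
	fmt.Printf("%+v\n", readStats(data, 64<<10, 1<<10))
}
```

In this sketch the second case shows the bypass: the underlying reader receives a request larger than the configured buffer. The third case shows why raising the buffer size helps on a high-latency reader, since fewer calls reach the source while memory use stays bounded by the buffer size.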

@emkornfield
Contributor

+1 LGTM

zeroshade force-pushed the arrow-16187-parquet-buffered-stream branch from c52e860 to 0d03bbd on April 21, 2022 21:26
zeroshade closed this in 8bd5514 on Apr 21, 2022
zeroshade deleted the arrow-16187-parquet-buffered-stream branch on April 21, 2022 22:34

ursabot commented Apr 25, 2022

Benchmark runs are scheduled for baseline = cbf7d76 and contender = 8bd5514. 8bd5514 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.34% ⬆️0.04%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/580| 8bd5514f ec2-t3-xlarge-us-east-2>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/568| 8bd5514f test-mac-arm>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/566| 8bd5514f ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/578| 8bd5514f ursa-thinkcentre-m75q>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/579| cbf7d76a ec2-t3-xlarge-us-east-2>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/567| cbf7d76a test-mac-arm>
[Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/565| cbf7d76a ursa-i9-9960x>
[Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/577| cbf7d76a ursa-thinkcentre-m75q>
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

3 participants