[C++] Enable fine-grained I/O (coalescing) in IPC reader #28430

asfimport · 2021-05-07T13:44:55Z

ARROW-11772 enables I/O coalescing in the IPC reader, but the reader operates at the granularity of an entire record batch: even if you're loading only a few columns, the entire record batch is read. On a high-latency file system (e.g. S3), we may be able to improve performance further by traversing the schema and reading only the buffers we actually need. This can be combined with coalescing to reduce the number of I/O calls that need to be made.
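To illustrate the coalescing half of the idea, here is a minimal standalone sketch, not the Arrow implementation. `ByteRange`, `CoalesceRanges`, and `hole_size_limit` are hypothetical names chosen for this example. Given the byte ranges of the buffers that are actually needed, it merges ranges that are separated by at most a small gap, so that several small reads become one larger one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical byte range within a file; not an Arrow type.
struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Merge ranges whose gap is at most `hole_size_limit` bytes, so that
// several small reads become one larger read. Overlapping ranges are
// merged as well. Empty ranges are dropped.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges,
                                      int64_t hole_size_limit) {
  assert(hole_size_limit >= 0);
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange& r : ranges) {
    if (r.length == 0) continue;  // nothing to read for an empty buffer
    if (!merged.empty()) {
      ByteRange& last = merged.back();
      const int64_t last_end = last.offset + last.length;
      if (r.offset - last_end <= hole_size_limit) {
        // Close enough (or overlapping): extend the previous range.
        const int64_t new_end = std::max(last_end, r.offset + r.length);
        last.length = new_end - last.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

With a larger `hole_size_limit`, more unneeded bytes are read but fewer requests are issued, which is usually the better trade on a store like S3 where per-request latency dominates.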

(Maybe there's another savings here in that instead of traversing the schema every time to figure out the buffer layout, we can do that only once up front and then reuse the layout subsequently?)
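As a rough sketch of the "compute the layout once" idea, the following standalone code walks a simplified schema a single time and records which flattened buffer indices belong to the selected top-level fields. The result could then be reused for every record batch. `FieldLayout`, `CountBuffers`, and `SelectedBufferIndices` are hypothetical names, not Arrow APIs, and the sketch assumes that a record batch lists its buffers in flattened depth-first field order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a schema field: the number of
// buffers the field itself owns, plus its child fields.
struct FieldLayout {
  int num_buffers;
  std::vector<FieldLayout> children;
};

// Count the buffers of a field and all of its descendants.
int CountBuffers(const FieldLayout& field) {
  int total = field.num_buffers;
  for (const FieldLayout& child : field.children) {
    total += CountBuffers(child);
  }
  return total;
}

// Walk the top-level fields once and collect the flattened buffer indices
// that belong to selected fields (assumes depth-first buffer order).
std::vector<int> SelectedBufferIndices(const std::vector<FieldLayout>& fields,
                                       const std::vector<bool>& selected) {
  assert(fields.size() == selected.size());
  std::vector<int> indices;
  int next = 0;  // index of the first buffer of the current field
  for (std::size_t i = 0; i < fields.size(); ++i) {
    const int count = CountBuffers(fields[i]);
    if (selected[i]) {
      for (int b = 0; b < count; ++b) indices.push_back(next + b);
    }
    next += count;
  }
  return indices;
}
```

Because the schema is the same for every batch in a file, the index list depends only on the schema and the column selection, so it can be computed up front and cached alongside the reader.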

While ArrayLoader already appears to perform this optimization, it's being handed an in-memory buffer in the first place, so no savings are accomplished.

Reporter: David Li / @lidavidm
Assignee: Yue Ni / @niyue

Related issues:

[C++][Dataset] Projection pushdown in IPC (feather) format (is duplicated by)
[C++] Add asynchronous read to ipc::RecordBatchFileReader (relates to)
Read out only the required columns from a Feather file on Disk (is related to)
[C++] Enable fine grained IO for async IPC reader (is related to)

PRs and other links:

GitHub Pull Request #11486

Note: This issue was originally created as ARROW-12683. Please see the migration documentation for further details.

asfimport · 2021-11-03T14:25:16Z

David Li / @lidavidm:
Issue resolved by pull request 11486
#11486

asfimport closed this as completed Nov 3, 2021
