When writing RecordBatches using ArrowStreamWriter, if the ArrowBuffers being written aren't all 8-byte aligned, the serialized RecordBatch won't conform to the Arrow specification. This causes readers in other languages to throw an error when reading Arrow streams written by the C# writer.
The relevant requirement reads: It is required to have all the contiguous memory buffers in an IPC payload aligned at 8-byte boundaries. In other words, each buffer must start at an aligned 8-byte offset. Additionally, each buffer should be padded to a multiple of 8 bytes.
For example, if reading the stream from Python or C++, an error is raised here:
arrow/cpp/src/arrow/ipc/reader.cc, lines 107 to 110 in f77c342
A similar error is raised when Java tries to read the stream.
We should ensure that every buffer written to the stream is padded to a multiple of 8 bytes, regardless of its length, as specified in https://arrow.apache.org/docs/format/Layout.html#requirements-goals-and-non-goals
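To illustrate the padding rule, here is a minimal, language-neutral sketch in Python. It is not the C# writer's code: `padded_length` and `write_buffers` are hypothetical helper names, and the sketch assumes the message body is a plain concatenation of buffers, with each recorded length being the unpadded size.

```python
ALIGNMENT = 8  # bytes; the boundary this issue says IPC buffers must honor


def padded_length(length: int, alignment: int = ALIGNMENT) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def write_buffers(buffers):
    """Concatenate buffers, zero-padding each one to a multiple of 8 bytes.

    Returns the body bytes and a list of (offset, length) pairs, where
    length is the unpadded size of each buffer. (Hypothetical helper.)
    """
    body = bytearray()
    layout = []
    for buf in buffers:
        offset = len(body)  # always a multiple of 8, because every
        body += buf         # previous buffer was padded
        body += b"\x00" * (padded_length(len(buf)) - len(buf))
        layout.append((offset, len(buf)))
    return bytes(body), layout


# Buffers of 3, 0, 8 and 9 bytes occupy 8, 0, 8 and 16 bytes once padded.
body, layout = write_buffers([b"\x01\x02\x03", b"", b"\x00" * 8, b"abcdefghi"])
print(layout)     # [(0, 3), (8, 0), (8, 8), (16, 9)]
print(len(body))  # 32
```

Because every buffer is padded before the next one is appended, each offset is a multiple of 8 regardless of the individual buffer lengths, which is the property the readers check.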
Reporter: Eric Erhardt / @eerhardt
Assignee: Eric Erhardt / @eerhardt
Note: This issue was originally created as ARROW-5908. Please see the migration documentation for further details.