-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader #2616
Conversation
…tables from the same source stream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The metadata changes look good to me
@@ -53,7 +52,7 @@ export class RecordBatchMetadata extends Message { | |||
public buffers: BufferMetadata[]; | |||
constructor(version: MetadataVersion, length: Long | number, nodes: FieldMetadata[], buffers: BufferMetadata[], bodyLength?: Long | number) { | |||
if (bodyLength === void(0)) { | |||
bodyLength = buffers.reduce((s, b) => align(s + b.length + (b.offset - s), 8), 0); | |||
bodyLength = buffers.reduce((bodyLength, buffer) => bodyLength + buffer.length, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are your buffer lengths here guaranteed to a multiple of 8? Otherwise should round up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this size is rounded up in the writer here
Cool, @TheNeuralBit does this look good? |
👍 LGTM |
Fixes https://issues.apache.org/jira/browse/ARROW-3336. Realized too late I branched from the #2616, so that's why there's some extra commits at the front. Check out the relevant tests here: https://github.com/trxcllnt/arrow/blob/860f61046fca0081dbe0ce986a97408ed5934e22/js/test/unit/writer-tests.ts#L26. This test covers the primitive, nested, and dictionary vectors, but it'd be worth adding more (for example, Unions). Author: ptaylor <paul.e.taylor@me.com> Closes #2638 from trxcllnt/js-fix-writing-sliced-arrays and squashes the following commits: bcd58ca <ptaylor> fix writer-tests.ts import fa4d8f5 <ptaylor> only use vector offset to serialize bitmaps and data buffers of variable-width vectors, as all other buffers have already been adjusted as part of the original call to Vector#slice() 5ece080 <ptaylor> reset internal children Array in case the childData for each has been sliced e093f7c <ptaylor> add test validating writer serializes sliced arrays correctly f9dd91b <ptaylor> move vector comparison extension into separate module 380d655 <ptaylor> fix table constructor args
Yield all messages from the stream reader to support reading multiple tables from the same source stream. This is something a stream-reader refactor should support out of the box, but this PR adds this ability to master until that refactor is done. I added an integration test that shows what such a reader would look like today.