Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader #2616

Closed

Conversation

trxcllnt
Copy link
Contributor

Yield all messages from the stream reader to support reading multiple tables from the same source stream. This is something a stream-reader refactor should support out of the box, but this PR adds this ability to master until that refactor is done. I added an integration test that shows what such a reader would look like today.

@trxcllnt trxcllnt changed the title ARROW-3304: [JS]: yield all messages emit all messages from the stream reader ARROW-3304: [JS] yield all messages emit all messages from the stream reader Sep 24, 2018
@trxcllnt trxcllnt changed the title ARROW-3304: [JS] yield all messages emit all messages from the stream reader ARROW-3304: [JS] yield all messages from the stream reader Sep 24, 2018
Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata changes look good to me

@@ -53,7 +52,7 @@ export class RecordBatchMetadata extends Message {
public buffers: BufferMetadata[];
constructor(version: MetadataVersion, length: Long | number, nodes: FieldMetadata[], buffers: BufferMetadata[], bodyLength?: Long | number) {
if (bodyLength === void(0)) {
bodyLength = buffers.reduce((s, b) => align(s + b.length + (b.offset - s), 8), 0);
bodyLength = buffers.reduce((bodyLength, buffer) => bodyLength + buffer.length, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are your buffer lengths here guaranteed to a multiple of 8? Otherwise should round up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this size is rounded up in the writer here

@wesm
Copy link
Member

wesm commented Sep 26, 2018

Cool, @TheNeuralBit does this look good?

@TheNeuralBit TheNeuralBit changed the title ARROW-3304: [JS] yield all messages from the stream reader ARROW-3256,3304: [JS] fix file footer inconsistency, yield all messages from the stream reader Sep 26, 2018
@TheNeuralBit
Copy link
Member

👍 LGTM

@wesm wesm closed this in c1cf985 Sep 27, 2018
trxcllnt added a commit that referenced this pull request Sep 27, 2018
Fixes https://issues.apache.org/jira/browse/ARROW-3336. Realized too late I branched from the #2616, so that's why there's some extra commits at the front.

Check out the relevant tests here: https://github.com/trxcllnt/arrow/blob/860f61046fca0081dbe0ce986a97408ed5934e22/js/test/unit/writer-tests.ts#L26. This test covers the primitive, nested, and dictionary vectors, but it'd be worth adding more (for example, Unions).

Author: ptaylor <paul.e.taylor@me.com>

Closes #2638 from trxcllnt/js-fix-writing-sliced-arrays and squashes the following commits:

bcd58ca <ptaylor> fix writer-tests.ts import
fa4d8f5 <ptaylor> only use vector offset to serialize bitmaps and data buffers of variable-width vectors, as all other buffers have already been adjusted as part of the original call to Vector#slice()
5ece080 <ptaylor> reset internal children Array in case the childData for each has been sliced
e093f7c <ptaylor> add test validating writer serializes sliced arrays correctly
f9dd91b <ptaylor> move vector comparison extension into separate module
380d655 <ptaylor> fix table constructor args
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants