Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reading dictionaries from nested structs in ipc StreamReader #1550

Merged
merged 3 commits into from Apr 13, 2022

Conversation

dispanser
Copy link
Contributor

Which issue does this PR close?

Closes #1549.

Rationale for this change

Fixing a bug in the ipc stream reader, aligning with the FileReader implementation.

What changes are included in this PR?

The actual bugfix is 4 characters, but this PR also includes a test to reproduce the bug.

Are there any user-facing changes?

No documetation changes needed.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 12, 2022
@alamb
Copy link
Contributor

alamb commented Apr 12, 2022

Thank you @dispanser ! @viirya I wonder if you might have a chance to review this PR?

@viirya
Copy link
Member

viirya commented Apr 12, 2022

Thanks @dispanser @alamb. Yeah, I will review this today.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix is correct as we do this to fill dictionaries_by_field.

for (i, field) in schema.all_fields().iter().enumerate() {
   dictionaries_by_field[i] = ...
}

Actually, in FileReader, it does the same:

let mut dictionaries_by_field = vec![None; schema.all_fields().len()];

arrow/src/ipc/reader.rs Outdated Show resolved Hide resolved
Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, only one minor comment.

@codecov-commenter
Copy link

Codecov Report

Merging #1550 (cd838e5) into master (68038f5) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head cd838e5 differs from pull request most recent head 30f7b30. Consider uploading reports for the commit 30f7b30 to get more accurate results

@@            Coverage Diff             @@
##           master    #1550      +/-   ##
==========================================
+ Coverage   82.82%   82.84%   +0.01%     
==========================================
  Files         190      190              
  Lines       54941    54966      +25     
==========================================
+ Hits        45507    45535      +28     
+ Misses       9434     9431       -3     
Impacted Files Coverage Δ
arrow/src/ipc/reader.rs 88.28% <100.00%> (+0.48%) ⬆️
arrow/src/array/transform/mod.rs 86.46% <0.00%> (+0.11%) ⬆️
parquet/src/encodings/encoding.rs 93.56% <0.00%> (+0.18%) ⬆️
parquet_derive/src/parquet_field.rs 66.43% <0.00%> (+0.22%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68038f5...30f7b30. Read the comment docs.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you @dispanser

@tustvold tustvold merged commit ffb9b0b into apache:master Apr 13, 2022
@dispanser dispanser deleted the bugfix/fix-nested-dict-ipc-stream branch April 14, 2022 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Read dictionary from nested struct in ipc stream reader panics
5 participants