This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Fixed error in reading nested parquet structs #1015

Merged: jorgecarleitao merged 3 commits into main from fix_parquet_nest_read on May 27, 2022

Conversation

jorgecarleitao
Owner

Closes #1014 - thanks @ahmedriza for the report!

jorgecarleitao added the bug ("Something isn't working") label on May 27, 2022
jorgecarleitao changed the title from "Fix error in reading nested structs" to "Fix error in reading nested parquet structs" on May 27, 2022
codecov bot commented May 27, 2022

Codecov Report

Merging #1015 (fb6765c) into main (bbe7209) will increase coverage by 0.00%.
The diff coverage is 73.40%.

@@           Coverage Diff           @@
##             main    #1015   +/-   ##
=======================================
  Coverage   71.68%   71.69%           
=======================================
  Files         359      359           
  Lines       19860    19880   +20     
=======================================
+ Hits        14237    14252   +15     
- Misses       5623     5628    +5     
Impacted Files                                          Coverage Δ
src/io/parquet/read/deserialize/binary/mod.rs           100.00% <ø> (ø)
src/io/parquet/read/deserialize/boolean/mod.rs          100.00% <ø> (ø)
src/io/parquet/read/deserialize/primitive/mod.rs        100.00% <ø> (ø)
src/io/parquet/read/deserialize/mod.rs                  47.73% <57.69%> (+1.29%) ⬆️
src/io/parquet/read/deserialize/nested_utils.rs         74.47% <91.30%> (+0.25%) ⬆️
src/io/parquet/read/deserialize/struct_.rs              87.50% <93.75%> (+4.16%) ⬆️
src/io/parquet/read/deserialize/binary/nested.rs        61.25% <100.00%> (ø)
src/io/parquet/read/deserialize/boolean/nested.rs       62.74% <100.00%> (ø)
...rc/io/parquet/read/deserialize/primitive/nested.rs   73.61% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

ahmedriza commented May 27, 2022

Thanks a lot @jorgecarleitao, I think this is still in progress. The reading looks good now. Getting an assertion failure during the write:

Writing /tmp/two_level_nested_verify.parquet
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `3`,
 right: `1`', /home/ahmed/.cargo/git/checkouts/arrow2-8a2ad61d97265680/f10a626/src/io/parquet/write/pages.rs:210:5

@jorgecarleitao
Owner Author

That is expected. I need to improve the error message, but the issue is that we pass vec![Encoding::Plain] (a single encoding) to a field that maps to multiple parquet columns; we require one encoding per parquet column.
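
The "one encoding per parquet column" requirement can be illustrated with a minimal, self-contained sketch. The types and functions below (`DataType`, `Encoding`, `leaf_columns`, `plain_encodings`, `check_encodings`) are hypothetical stand-ins written for this example; they are not arrow2's actual API. The sketch assumes that each primitive leaf of a nested type becomes one parquet column:

```rust
// Hypothetical, simplified stand-ins for illustration only; NOT arrow2's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Encoding {
    Plain,
}

#[derive(Debug, Clone)]
enum DataType {
    Int64,
    Utf8,
    Struct(Vec<DataType>),
    List(Box<DataType>),
}

/// Counts the leaf columns of a (possibly nested) type, assuming each
/// primitive leaf is written as exactly one parquet column.
fn leaf_columns(data_type: &DataType) -> usize {
    match data_type {
        DataType::Int64 | DataType::Utf8 => 1,
        DataType::Struct(children) => children.iter().map(leaf_columns).sum(),
        DataType::List(inner) => leaf_columns(inner),
    }
}

/// Builds one `Encoding::Plain` per leaf column of the field.
fn plain_encodings(data_type: &DataType) -> Vec<Encoding> {
    vec![Encoding::Plain; leaf_columns(data_type)]
}

/// Returns a descriptive error instead of panicking when the number of
/// encodings does not match the number of leaf columns.
fn check_encodings(data_type: &DataType, encodings: &[Encoding]) -> Result<(), String> {
    let expected = leaf_columns(data_type);
    if encodings.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "expected {} encodings (one per parquet column), got {}",
            expected,
            encodings.len()
        ))
    }
}
```

Under these assumptions, a struct field with three primitive leaves needs three encodings, so passing a single-element vector would fail the check with a readable message rather than an assertion failure.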

jorgecarleitao merged commit 7cc874f into main on May 27, 2022
jorgecarleitao deleted the fix_parquet_nest_read branch on May 27, 2022 at 17:54
jorgecarleitao changed the title from "Fix error in reading nested parquet structs" to "Fixed error in reading nested parquet structs" on May 27, 2022
Successfully merging this pull request may close these issues.

Error in reading Nested Parquet