Skip to content

ARROW-6895: [C++][Parquet] Do not reset dictionary in ByteArrayDictionaryRecordReader during incremental reads#6460

Closed
adamhooper wants to merge 3 commits intoapache:masterfrom
adamhooper:ARROW-6895
Closed

ARROW-6895: [C++][Parquet] Do not reset dictionary in ByteArrayDictionaryRecordReader during incremental reads#6460
adamhooper wants to merge 3 commits intoapache:masterfrom
adamhooper:ARROW-6895

Conversation

@adamhooper
Copy link
Contributor

Previously, ByteArrayDictionaryRecordReader would clear the dictionary
after returning a chunk. Now, it persists the dictionary in case a
future chunk needs it.

Sorry -- I feel I'm missing the expertise to write a unit test. But I can attest that in Arrow 0.16 the code snippet in ARROW-6895 segfaults on the bad.parquet attached to that issue; and after this patch, it works correctly.

@github-actions
Copy link

@wesm wesm changed the title ARROW-6895: [C++] handle dictionary deltas ARROW-6895: [C++][Parquet] Do not reset dictionary in ByteArrayDictionaryRecordReader during incremental reads Feb 20, 2020
@wesm
Copy link
Member

wesm commented Feb 20, 2020

I changed the PR title, it seems like (unless I've misunderstood) the issue is about incremental reads using this code path, and our current unit tests only test all-or-nothing reads of row groups. So we should write some unit tests with incremental reads that exhibit the problem

adamhooper and others added 3 commits March 25, 2020 20:21
Previously, ByteArrayDictionaryRecordReader would clear the dictionary
after returning a chunk. Now, it persists the dictionary in case a
future chunk needs it.
@wesm
Copy link
Member

wesm commented Mar 26, 2020

I just added a unit test to verify the fix. Thanks @adamhooper for the fix

@wesm
Copy link
Member

wesm commented Mar 27, 2020

+1. The CI failures are unrelated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments