[HUDI-6896] HoodieAvroHFileReader.RecordIterator iteration never terminates #9789
Merged
bvaradar merged 1 commit into apache:master on Oct 30, 2023
codope (Member) reviewed on Sep 26, 2023 and left a comment:
Good catch! Is it possible to add a unit test to test the state change of eof?
Collaborator
danny0405 reviewed on Sep 27, 2023:
// NOTE: This is required for idempotency
if (eof) {
  return false;
}
Contributor
Seems a fundamental change, why the HFile reader can work now?
Contributor
Under what condition does the infinite iteration happen? How to reproduce it in a test?
Contributor
This is being handled correctly in KeyPrefixIterator but is missing in this RecordIterator. The fix is to bring the change also to this class.
bvaradar approved these changes on Oct 30, 2023
nsivabalan pushed a commit that referenced this pull request on Nov 21, 2023
Change Logs
org.apache.hudi.io.storage.HoodieAvroHFileReader.RecordIterator#hasNext calls org.apache.hadoop.hbase.io.hfile.HFileScanner#isSeeked to decide whether the scanner still needs to be positioned at the first entry of the file.
If isSeeked returns false, the scanner seeks to the start of the file.
After the end of the file is reached, isSeeked still returns false, so the next call to hasNext seeks to the start of the file again, which leads to an infinite loop.
Documentation for HFileScanner#isSeeked
True is scanner has had one of the seek calls invoked; i.e. seekBefore(Cell) or seekTo() or seekTo(Cell). Otherwise returns false.
The PR adds a flag (eof) that is set once the end of the file is reached; hasNext returns false whenever the flag is true.
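The behaviour described above can be reproduced without HBase. The following is only a minimal sketch, not the Hudi implementation: SketchScanner, ListScanner, GuardedIterator and drain are hypothetical names invented for illustration, and ListScanner assumes (per the description above) that isSeeked reports false again once the scanner has run off the end of its data. Only the eof check at the top of hasNext mirrors the snippet under review.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class EofGuardSketch {

  /** Hypothetical stand-in for the scanner calls named above; NOT the HBase interface. */
  interface SketchScanner {
    boolean isSeeked();
    boolean seekTo();   // position at the first entry; false if there is none
    boolean next();     // advance by one entry; false if already at the last one
    String current();
  }

  /** Fake scanner over a list. Assumption: isSeeked() is false again after exhaustion. */
  static final class ListScanner implements SketchScanner {
    private final List<String> rows;
    private int pos = -1; // -1 means "not seeked"

    ListScanner(List<String> rows) { this.rows = rows; }

    public boolean isSeeked() { return pos >= 0; }

    public boolean seekTo() {
      if (rows.isEmpty()) return false;
      pos = 0;
      return true;
    }

    public boolean next() {
      if (pos + 1 < rows.size()) { pos++; return true; }
      pos = -1; // assumed behaviour: running off the end clears the seek state
      return false;
    }

    public String current() { return rows.get(pos); }
  }

  /** Iterator with the eof guard; without it, hasNext() would re-seek to the start forever. */
  static final class GuardedIterator implements Iterator<String> {
    private final SketchScanner scanner;
    private String next;
    private boolean eof;

    GuardedIterator(SketchScanner scanner) { this.scanner = scanner; }

    @Override
    public boolean hasNext() {
      if (eof) return false;          // required for idempotency
      if (next != null) return true;  // a record is already buffered
      boolean hasRecords = scanner.isSeeked() ? scanner.next() : scanner.seekTo();
      if (!hasRecords) {
        eof = true;                   // remember exhaustion so later calls do not re-seek
        return false;
      }
      next = scanner.current();
      return true;
    }

    @Override
    public String next() {
      if (!hasNext()) throw new NoSuchElementException();
      String record = next;
      next = null;
      return record;
    }
  }

  /** Drains the iterator, stopping after cap records so a regression cannot hang the caller. */
  static List<String> drain(List<String> rows, int cap) {
    Iterator<String> it = new GuardedIterator(new ListScanner(rows));
    List<String> out = new ArrayList<>();
    while (it.hasNext() && out.size() < cap) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(drain(Arrays.asList("a", "b"), 10)); // prints [a, b]
  }
}
```

Removing the `if (eof) return false;` line in this sketch makes drain return cap records instead of two, which is one way a unit test could detect the non-terminating iteration without hanging.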
Impact
NA
Risk level (write none, low medium or high below)
low
Documentation Update
NA
Contributor's checklist