-
Notifications
You must be signed in to change notification settings - Fork 453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MultiReaderIterator EOF check + test #211
Conversation
@@ -169,7 +169,7 @@ func (it *multiReaderIterator) moveIteratorToValidNext(iter Iterator) bool { | |||
} | |||
|
|||
err := iter.Err() | |||
if it.err == nil && err != nil { | |||
if it.err == nil && err != nil && err != io.EOF { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as our use case is concerned, the underlying reader iterators should never run out of bytes because the encoded byte stream associated with each iterator should always end with an end-of-stream marker (see https://github.com/m3db/m3db/blob/master/encoding/scheme.go#L32), at which point the iterator knows it should stop reading bytes from the stream. IOW, an EOF error from the underlying stream is indicative of an underlying data issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, where did you run into this @prateek ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! I ran into this while working on tests for updating series' data. The test file in the PR replicates the error condition.
Is the issue an artifact of how I'm constructing the MultiReaderIterator from the Encoder?
This was an issue with how I was constructing encoder. The write-path in the test had intOptimized set to false, but the read-path had it set to true. Does it make sense to put that value in the the header/footer of the encoder to check against at decode time? |
Not sure getting a slightly better error message is worth the extra overhead: a few more bits per encoded byte stream, slightly more complicated decoder logic, etc. On top of that, making it backward-compatible won't be trivial considering we already have M3DB running in prod with encoded data flushed onto disks, and wiping the data out and restarting unfortunately takes time |
Yep agreed. To my knowledge, the only place we've run into this so far is my tests so I definitely don't think it's worth the hassle to make the errors explicit. Can revisit if that changes. |
Addressing a weird edge case I ran into while working on some unrelated code. Looks like the MultiReaderIterator propagates EOF errs from ReaderIterator(s) below it. IMO, that shouldn't be exposed as an error on the ReaderIterator interface but one of the tests for that object had an explicit check for it. So I moved the err check to the level above it (MulitReaderIterator)
Created a new test to capture the behaviour. It's placed in integration because it depended on both storage and encoding packages. Happy to move to a better place if there is one.
/cc @ben-lerner @robskillington @xichen2020