bytes: bytes.Reader returns EOF on zero-byte Read, which doesn't conform with io.Reader interface documentation #40385
Comments
@gopherbot please remove Documentation |
It seems to me that the behavior of You are asking for a special case in |
I am hoping for a discussion on whether this behaviour is expected or natural. Getting an EOF error on zero-byte read looks like a replacement for a missing EOF() method. However, I agree that changing multiple Read methods is not a favourable option, because it could break a lot of stuff. Edit. |
I think there can be discussion on the topic. My comment above was my attempt at discussion, by pointing out the issues that I found relevant. Sorry if I seem to be preventing discussion. |
Is there a reason why there is no Add. |
On something like a Unix pipe, an |
As far as I remember, |
Read()-ing seems to be inconsistent in the standard library:
In the |
The io.Reader interface is like no other in the Go ecosystem. Read is the only method where the caller must examine the other values returned from a function/method call before examining the error value. |
It is sad that I had to figure it out at runtime. When writing the unit tests, I was not expecting the last field to have zero length, but that's on me. |
It seems incorrect to make the EOF condition directly a function of the input's length. Your example reads from a data source that contains no data. It reads 0 bytes from that data source and triggers the end of file condition because it knows that the next read will also return To me it does not make sense for the first read to be successful on a data source that contains nothing. The same |
I appreciate the way the discussion goes. Let's say we want the Reader to report And then there is this:
A zero-length slice that is out-of-bound by a byte is just an empty slice, which I find to be similar to reading zero bytes from the end of a bytes buffer. It's kind of the no-op I expect, when making zero-length reads... |
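The slicing behaviour mentioned above can be checked with a minimal sketch. The helper name emptyTail is hypothetical and chosen only for illustration; it relies on the Go rule that slicing at exactly len(data) is valid and yields an empty slice.

```go
package main

import "fmt"

// emptyTail returns the zero-length slice that starts at len(data).
// Slicing at exactly len(data) is valid; only larger indices panic.
func emptyTail(data []byte) []byte {
	return data[len(data):]
}

func main() {
	data := []byte("abc")
	tail := emptyTail(data)

	// Copying into the empty tail is a no-op that copies zero bytes.
	n := copy(tail, []byte("xyz"))
	fmt.Println(len(tail), n)
}
```

The slice expression itself reports no error for the empty tail, which is the analogy the comment draws with a zero-byte Read at the end of the data.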
I have decided to do some tests:
Contrary to my belief, there were no discrepancies between OSes. |
There are many different kinds of readers. The Currently that contract permits returning The options I see here are:
Does anybody see any other options? Thanks. |
I vote for 2. /cc @minux who spent a lot of time arguing for this a few years back. |
It would be nice to have a warning in the |
Why would this need a warning? It seems like the correct behaviour. |
For me at least, |
I have code that reads strings in the format #bytes, [bytes]. I read #bytes, create a byte array of that size, then read into it. When a string of length 0 appears in the middle of the reader, the read succeeds, but when a string of length 0 is at the end of the reader, it does not. IMHO this indicates a design problem: reading 0 bytes is not special and should always succeed, independent of the read position. |
@hrissan As far as I can see the choices are as listed at #40385 (comment). What do you recommend? |
@ianlancetaylor The format some_encoding(#bytes), [bytes] is common, and any well-tested parser (and every fuzz-tested parser) already has "if #bytes != 0" protection around reading the [bytes] part, so such parsers already behave as if approach 2) were implemented. Any code without "if #bytes != 0" protection around a potentially zero-byte read behaves differently in the middle and at the end of bytes.Reader, and of every other reader that has this bug. It seems to me that most, if not all, code without such protection is already incorrect because of this. So approach 1) would break code that is already incorrect, which might actually be good for that code. BTW, all similar code that uses readers which return an error on every attempt to read 0 bytes, independent of stream position, already has "if #bytes != 0" protection and would not break under approach 1). |
Right now, today, the behavior of |
I wasn't expecting another user to have exactly the same issue as mine, so soon. I will invoke the timeless mantra |
@metala Ian explained that bytes.Reader is not broken as described by the io.Reader contract. What warning do you think should be added? |
@davecheney Yes, indeed. I am not arguing that About the warning / notice, I am thinking of something like: |
@metala I don't understand how |
@davecheney I am not sure what you want, but let's take these two cases: // Deserialisation using bytes.Reader // Deserialisation using bytes.Buffer... the same code, single line changed. The deserialisation of each field follows the same structure: read the field length, if necessary, then read the content. If there is an error or the length is different, return an error. |
There is an error in your code
The io.Reader contract states the caller must process https://godoc.org/io#ReadFull might be a better choice for your application. |
You are probably referring to this paragraph, which is either ambiguous or it doesn't cover the case
Since the input is a byte slice, and not a stream, it doesn't make much sense to waste CPU cycles on |
This argument is specious; if you're worried about the overhead of |
You are right, but I did not think about that when I wrote the code. However, it's a bit cleaner to use a reader instead of incrementing and passing indices. |
In fact, I don't care why a read on an empty slice can return immediately; it has hardly any effect on me or on most developers. For a function like this
I don't care about anything like POSIX, the Linux man page, or any other standard. I just think that the whole standard library should be held to uniform standards. And because the caller only knows that this is an io.Reader, the caller doesn't care about the underlying implementation of this reader. |
@dashjay There is no expectation that all Readers behave the same way, even all Readers in the standard library. Different Readers are free to employ different buffering and error handling strategies. |
Change https://go.dev/cl/498355 mentions this issue: |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I was deserialising binary data and I hit an unexpected io.EOF when reading zero bytes. Here is a minimal example that illustrates the problem.
What did you expect to see?
What did you see instead?
On conformity with io.Reader and other standards
An excerpt from the documentation of io.Reader with the important parts emboldened:
An excerpt from the man page of read(3):
Related issues
It looks like this issue is in stark contrast to Go2 proposal issue #27531.
There is also an interesting discussion (#5310) about returning (0, nil) from a Read(p []byte).