proposal: io: add ReadPeeker and implement Peek in bytes.Reader #63548
Comments
The point of To put it another way, if you feel a need to call a |
It's a common optimisation pattern to check for specific types inside functions that work with interfaces. Any function that accepts an io.Reader could therefore have a fast path for bytes.Reader, if it were possible to get at the byte slice from the Reader.
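For readers unfamiliar with the pattern, here is a minimal sketch of such a type-asserting fast path. It uses *bytes.Buffer, which already exposes its unread bytes through Bytes, because *bytes.Reader has no equivalent accessor today. The names countNewlines, viaBuffer and viaReader are made up for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// countNewlines counts '\n' bytes in r and reports whether the copy-free
// fast path was taken.
func countNewlines(r io.Reader) (int, bool, error) {
	if b, ok := r.(*bytes.Buffer); ok {
		// Fast path: Bytes exposes the unread data without copying.
		n := bytes.Count(b.Bytes(), []byte{'\n'})
		b.Reset() // mark everything as consumed
		return n, true, nil
	}
	// Slow path: every other reader is copied through a scratch buffer.
	buf := make([]byte, 4096)
	n := 0
	for {
		m, err := r.Read(buf)
		n += bytes.Count(buf[:m], []byte{'\n'})
		if err == io.EOF {
			return n, false, nil
		}
		if err != nil {
			return n, false, err
		}
	}
}

// viaBuffer and viaReader are hypothetical helpers for the demonstration.
func viaBuffer(s string) (int, bool) {
	n, fast, _ := countNewlines(bytes.NewBufferString(s))
	return n, fast
}

func viaReader(s string) (int, bool) {
	n, fast, _ := countNewlines(bytes.NewReader([]byte(s)))
	return n, fast
}

func main() {
	fmt.Println(viaBuffer("a\nb\nc\n"))
	fmt.Println(viaReader("a\nb\nc\n"))
}
```

Both calls count the same newlines, but only the *bytes.Buffer input avoids the scratch-buffer copy; the *bytes.Reader input goes through the generic loop.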
Can you point us to some existing code that would take advantage of this optimization? Thanks.
Pretty much any streaming parser. Even the stdlib's bufio.Reader could special-case a bytes.Reader source and avoid copying everything.
You could implement this today using |
So I've tried to showcase this in master...mhr3:go:bytes-reader-next. I've added bytes.Reader.Next and adapted bufio.Reader to use the byte slice directly. While not exactly pretty (and I'm not proposing the stdlib should do this), it shows how such an API could be used and, mainly, that this is not possible without the new API. Also, the result of benchstat for a very simple benchmark in the bufio package (master vs this branch):
While preparing this benchmark result, I also realised strings.Reader should have the same Next() method.
The At present, the That said, it'd be nice if there was a unified API for copy-less reads. Since |
I think the Next() method is the better fit here (no IO needs to happen, so there's no potential for an error, plus it's simpler), but ultimately I don't have a strong opinion as long as there's a way to do this. Buffered() + Peek() + Discard() would do the trick too.
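As a point of reference, the Buffered + Peek + Discard combination already exists on bufio.Reader. The sketch below shows how the three calls fit together for a copy-free view of the data; readMagic is a hypothetical helper, and the conversion to string copies only so the result can outlive the buffer.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readMagic views the first n bytes without consuming them, then discards
// them and views whatever else is already buffered.
func readMagic(input string, n int) (string, string, error) {
	br := bufio.NewReader(strings.NewReader(input))

	// p aliases br's internal buffer and is only valid until the next read.
	p, err := br.Peek(n)
	if err != nil {
		return "", "", err
	}
	magic := string(p) // copy out before the slice is invalidated

	// Advance past the bytes we just looked at.
	if _, err := br.Discard(n); err != nil {
		return "", "", err
	}

	// Buffered reports how many bytes can be peeked without further IO.
	tail, err := br.Peek(br.Buffered())
	if err != nil {
		return "", "", err
	}
	return magic, string(tail), nil
}

func main() {
	magic, rest, err := readMagic("GOPHrest", 4)
	fmt.Println(magic, rest, err)
}
```

Unlike a Next() method on an in-memory reader, every one of these calls can return an error, which is the extra complexity the comment above refers to.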
This proposal has been added to the active column of the proposals project
@dsnet, is Buffered really necessary? Can we use just Peek and Discard?
Looking through my code, I couldn't convince myself The Thus, there is more utility to |
We could alter the behavior of |
Here's a concrete proposal:
I decided to put |
I believe Peek should definitely be coupled with Discard (/Advance/Skip); special behaviour for passing a peeked buffer into Read sounds very strange. I would rather have the interface specify that a peeked buffer is only valid until the next Read().
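To make the Peek-plus-Discard coupling concrete, here is a rough sketch of what such an interface could look like. The method set is an assumption for illustration, modelled on bufio.Reader, and is not the definition proposed in this thread; consumePrefix and tryPrefix are hypothetical helpers.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// ReadPeeker is a hypothetical interface; no such type exists in package io.
// A slice returned by Peek is assumed valid only until the next read call.
type ReadPeeker interface {
	io.Reader
	Peek(n int) ([]byte, error)
	Discard(n int) (int, error)
}

// *bufio.Reader already has this method set.
var _ ReadPeeker = (*bufio.Reader)(nil)

// consumePrefix consumes prefix from r only if the stream starts with it.
func consumePrefix(r ReadPeeker, prefix string) (bool, error) {
	p, err := r.Peek(len(prefix))
	if err != nil && err != io.EOF {
		return false, err
	}
	if string(p) != prefix {
		return false, nil // nothing consumed
	}
	if _, err := r.Discard(len(prefix)); err != nil {
		return false, err
	}
	return true, nil
}

// tryPrefix is a hypothetical driver that returns the unconsumed remainder.
func tryPrefix(input, prefix string) (bool, string) {
	br := bufio.NewReader(strings.NewReader(input))
	ok, err := consumePrefix(br, prefix)
	if err != nil {
		return false, ""
	}
	rest, _ := io.ReadAll(br)
	return ok, string(rest)
}

func main() {
	fmt.Println(tryPrefix("#!/bin/sh", "#!"))
	fmt.Println(tryPrefix("hello", "#!"))
}
```

The point of pairing the two methods is that looking at the data and consuming it are separate steps, so a mismatch leaves the stream untouched.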
Related: https://dave.cheney.net/paste/gophercon-sg-2023.html / https://github.com/pkg/json/blob/main/reader.go which draws on https://philpearl.github.io/post/reader/ I wonder whether, in addition to new bytes.Reader methods and an |
It doesn't feel like we've found the right path forward yet. This all feels very kludgy.
@rsc agreed re: the proposed solutions. But do you think that the problem at least is clear?
As @dsnet pointed out, the two methods that are necessary for these use cases are:
Therefore, how about we scope this down to just adding And one last point - |
I don't think a next call is sufficient. Some form of peek+discard pattern is needed by many applications beyond just optimization purposes. For example, in |
Well, |
An On the other hand, most implementations of an
My point was that Len+Next+io.Seeker definitely satisfies what you'd need to do. I understand that you'd still want to have a proper interface for these PeekReaders, but a consensus on that wasn't reached, so I'm suggesting something simpler.
I think the discrepancy stems from the fact that I'm trying to tackle this problem for I'm arguing against |
I agree we need an established convention here. Does anyone see any strong objections to ReadPeeker?
Should we rename this issue if we're thinking more broadly about a generic interface?
Looks like bufio.Reader.Read does not do the Peek optimization. That's fine - a future version can. Are there any Readers with the Peek optimization now?
This discussion has died out a bit over the holidays. Does ReadPeeker address the original need raised in the proposal? Is it good enough? Also maybe it should be PeekReader.
TBH I'm not super happy with the idea of adding |
IIUC, you're arguing against the addition of I also feel uncomfortable exposing a degree of type unsafety to That said, I'd still like to see a common interface for copy-less and exact read operations. There is both a performance and semantic benefit to this API. A number of APIs (e.g., "compress/flate") assert for [*] Technically, you can access the underlying the |
Correct.
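@Merovius we can already get a reference to the buffer of a bytes.Reader, for example through WriteTo:

```go
package main

import (
	"bytes"
	"fmt"
)

// badWriter overwrites the slice it is handed instead of only reading it.
type badWriter struct{}

func (badWriter) Write(b []byte) (int, error) {
	return copy(b, "bonjour"), nil
}

func main() {
	b := []byte("hello !")
	r := bytes.NewReader(b)
	// WriteTo hands the Reader's underlying slice straight to Write.
	_, err := r.WriteTo(badWriter{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

I'm not arguing this code is somehow valid, but that Peek doesn't do anything new.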
Looking back at #63548 (comment), it is a bit annoying for clients that Peek can return ErrBufferFull or whatever for a "too big" n, where "too big" is not defined. Code trying to use this seems to always have to wrap Peek in a potential fallback to Read. That seems awkward. What if an implementation can only Peek 2 bytes? Or 1024 bytes but you need 1536? If we had a way to get the maximum Peek value maybe that would be less awkward, but looking back at the original proposal with Buffered and Discard, in bufio.Reader, Buffered is not an upper bound on the amount you can Peek, because Peek will try to refill the buffer if it needs to. So Buffered wouldn't work as that method. This all seems not quite there yet.
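The awkwardness can be seen with today's bufio.Reader. The sketch below wraps Peek in a fallback to a copying read when n exceeds the buffer size, and separately shows that Buffered is not an upper bound for Peek. withBytes, demo and bufferedAroundPeek are hypothetical names chosen for this illustration.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// withBytes hands the next n bytes of br to fn and consumes them. It
// reports whether fn saw br's own buffer (true) or a copy (false).
func withBytes(br *bufio.Reader, n int, fn func([]byte)) (bool, error) {
	p, err := br.Peek(n)
	if err == nil {
		fn(p) // p aliases br's buffer, so fn must not retain it
		_, err = br.Discard(n)
		return true, err
	}
	if !errors.Is(err, bufio.ErrBufferFull) {
		return false, err
	}
	// n is larger than br's buffer: fall back to a copying read.
	buf := make([]byte, n)
	if _, err := io.ReadFull(br, buf); err != nil {
		return false, err
	}
	fn(buf)
	return false, nil
}

// demo runs withBytes against a reader with the given buffer size.
func demo(input string, bufSize, n int) (string, bool, error) {
	br := bufio.NewReaderSize(strings.NewReader(input), bufSize)
	var got string
	zeroCopy, err := withBytes(br, n, func(b []byte) { got = string(b) })
	return got, zeroCopy, err
}

// bufferedAroundPeek returns Buffered before and after a Peek of n bytes.
func bufferedAroundPeek(input string, n int) (int, int) {
	br := bufio.NewReader(strings.NewReader(input))
	before := br.Buffered()
	_, _ = br.Peek(n) // Peek fills the buffer as a side effect
	return before, br.Buffered()
}

func main() {
	input := "0123456789abcdefghijklmnopqrstuvwxyz"
	fmt.Println(demo(input, 16, 8))  // fits in the 16-byte buffer
	fmt.Println(demo(input, 16, 24)) // too big, falls back to a copy
	fmt.Println(bufferedAroundPeek(input, 4))
}
```

A caller cannot tell in advance which branch it will get, and Buffered reports 0 on a fresh reader even though a Peek succeeds right afterwards.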
It seems like maybe we should decline this and wait for more inspiration, maybe try again in a year or so.
Looking through my programs using a
I'm not sure what the utility of this API would be as an implementation usually doesn't care about what the maximum peek value may be, but just that it can peek enough bytes (perhaps everything available) and lets the underlying Also, being overly concerned with the "maximum" |
This still seems awkward. I don't think we've found the right API yet.
Based on the discussion above, this proposal seems like a likely decline.
No change in consensus, so declined.
Is there any reason why bytes.Reader doesn't have a Bytes() or Next(int) method similar to bytes.Buffer?
Such a method would allow zero-copy usage of bytes.Reader, because currently all public methods require a separate buffer to copy the data into. Plus, these methods already exist on bytes.Buffer, so gophers are already familiar with them.
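For comparison, a small sketch of the difference being described: bytes.Buffer.Next returns a slice that shares memory with the buffer's contents, while bytes.Reader.Read fills a caller-supplied slice. The function names nextAliases and readCopies are made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// nextAliases reports whether the slice returned by bytes.Buffer.Next
// shares memory with the data the Buffer was created from.
func nextAliases() bool {
	data := []byte("hello world")
	buf := bytes.NewBuffer(data)
	head := buf.Next(5) // view of "hello", no copy
	head[0] = 'J'       // the write is visible through data
	return data[0] == 'J'
}

// readCopies reports whether bytes.Reader.Read leaves the source untouched
// when the destination slice is modified.
func readCopies() bool {
	data := []byte("hello world")
	r := bytes.NewReader(data)
	dst := make([]byte, 5) // separate buffer required by Read
	if _, err := r.Read(dst); err != nil {
		return false
	}
	dst[0] = 'J' // only the copy changes
	return data[0] == 'h'
}

func main() {
	fmt.Println(nextAliases())
	fmt.Println(readCopies())
}
```

The aliasing in the first function is exactly what makes Next copy-free, and also why handing out the underlying slice raises the mutability concerns discussed in the comments.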