bufio: revert central buffer pool? figure out a plan for garbage. #6086

Closed
rsc opened this Issue Aug 9, 2013 · 7 comments

Comments

Projects
None yet
3 participants
Contributor

rsc commented Aug 9, 2013

CLs 8819049 and 9459044 should probably be reverted, and possibly replaced by something
else.

[Copy of response on CL 12687043.]

To expand on my comment, Go doesn't have Free because we have a garbage collector. The
garbage collector protects programmers from use-after-free bugs in their own code and in
code they are linking against.

It looks like an implicit "free" slipped in while I wasn't looking, in CL
8819049, and this just uses it in one more place. I believe that CL should be reverted,
along with CL 9459044. Unlike fmt's little reused buffer, there is API in both
bufio.Reader and bufio.Writer that exposes slices into the buffers to client code. A
badly written client might continue to read from or write to the slice after the buffer
is "freed", causing failures in completely unrelated pieces of code that just
happen to also use bufio.

I understand that you are after performance and buffer reuse, but there are safer ways
to achieve that when clients need it.

For example you could define a Reset(r io.Reader) on bufio.Reader and a Reset(w
io.Writer) on bufio.Writer, to allow clients to reuse a single bufio.Reader or
bufio.Writer for multiple of their own streams. If the client screws up, they shoot
themselves in the foot. With the current scheme, in which a client can "free"
a buffer into the general pool, a bad piece of code can shoot all the other code in the
foot.

This kind of thing is exactly why we removed runtime.Free.
Owner

bradfitz commented Aug 9, 2013

Comment 1:

I'm pretty sure I was careful about this. ReadBytes and ReadLine never free the buffer. 
They're already documented that their buffers are invalid after their next interactions
with the bufio.Reader.  Any data race or bug they had before is still a bug.
Could you explain the order of calls you think would result in a new bug?
Contributor

rsc commented Aug 9, 2013

Comment 2:

I would expect something like this to be problematic:
package main
import (
"bufio"
"strings"
"fmt"
)
func main() {
r := bufio.NewReader(strings.NewReader("hello world\n"))
l, _ := r.ReadSlice('\n')
r.ReadByte() // sees EOF, kills buffer
 r1 := bufio.NewReader(strings.NewReader("hello world\n"))
l1, _ := r1.ReadSlice('\n')
copy(l, []byte("xxxxxxxxxxxx"))
fmt.Printf("%s", l1) // prints xxxxxxxxx because l1 == l
}
That said, it looks to me like the only possible way for the buffer reuse
to trigger in bufio is for a reader to return n > 0 and io.EOF at the same
time. Since strings.NewReader doesn't do this, the buffer is not reused and
the example behaves correctly.
When you checked in the reader change you said that BenchmarkReaderEmpty
went from 3 allocs to 2. It is back at 3.
Especially given that the code is not triggering and therefore not being
depended upon, I think we should revert it.
In the Writer case, ReadFrom exposes the buffer slice.
Owner

bradfitz commented Aug 9, 2013

Comment 3:

Your example program was already buggy.  ReadSlice documentation says:
http://golang.org/pkg/bufio/#Reader.ReadSlice -- "The bytes stop being valid at the next
read call.  ... Because the data returned from ReadSlice will be overwritten by the next
I/O operation, most clients should use ReadBytes or ReadString instead. "
Regarding:
> When you checked in the reader change you said that
> BenchmarkReaderEmpty went from 3 allocs to 2. It is back at 3.
Something else changed int the meantime.  I added io.WriteString to ioutil.Discard and
it's back to 2.  But it's still not as aggressive with recycling as it could be.  This
did matter and show up in benchmarks (net/http stuff probably) or I wouldn't have
submitted it.
I would fix bufio's buffer recycling to be more aggressive, but I don't want to waste
any more of my time on it you're just going to remove it, though.
Finally,
> In the Writer case, ReadFrom exposes the buffer slice.
We always expose slices in Read([]byte) and Write([]byte) and it's always considered
wrong to retain references to them after the Read or Write returns.  How is this new?
In any case, I could detect that rare case in Write and not recycle.
But before I fix anything, let's figure out a plan.
I think bufio is safe now.  If you find bugs that I didn't think of and they're not
fixable, we need a way to address the garbage issues:  maybe
bufio.Reader.SwitchTo(io.Reader) and bufio.Writer.SwitchTo(io.Writer).  That would let
me remove all this stuff from net/http/server.go:
// A switchReader can have its Reader changed at runtime.
// It's not safe for concurrent Reads and switches.
type switchReader struct {
        io.Reader
}
// A switchWriter can have its Writer changed at runtime.
// It's not safe for concurrent Writes and switches.
type switchWriter struct {
        io.Writer
}
Contributor

rsc commented Aug 9, 2013

Comment 4:

I think you should add Reader.Reset and Writer.Reset regardless (SwitchTo
is too weird a name, you're just resetting all the buffer state and giving
a new reader/writer.)
The current bufio.Reader only recycles a buffer if the underlying reader is
the kind that returns n > 0, err == io.EOF on the final read, because
putBuf is only called on the n > 0 return cases, and it requires err ==
io.EOF. Put in some prints in the bufio test and you'll see. It triggers
for the dataAndEOFReader in the test, but that's about it. I can't
demonstrate any possible problems because in real use cases it just doesn't
happen.
We expose slices in Read and Write all the time. What we don't then do, as
far as I know, is give the buffer to completely unrelated code to reuse. If
the decision to give it away is buggy, now completely unrelated code is
paying the price.
Owner

bradfitz commented Aug 9, 2013

Comment 5:

> I think you should add Reader.Reset and Writer.Reset regardless
I'll send that first.  That will let me simplify net/http.
> The current bufio.Reader only recycles a buffer if the underlying reader ...
Yes, the current bufio recycling isn't as aggressive as it could be.  I've noticed that
as well.  I would like to improve it, if we can keep it.
Can you think of a case where the buffer recycling is actually problematic?  The example
in comment #2 is already buggy.
Owner

bradfitz commented Aug 9, 2013

Comment 6:

I did them both in one CL.  Dropped the buffer reuse and added Reset:
https://golang.org/cl/12603049
I'll do net/http sometime later.

Owner changed to @bradfitz.

Status changed to Started.

Owner

bradfitz commented Aug 11, 2013

Comment 7:

This issue was closed by revision ede9aa9.

Status changed to Fixed.

@rsc rsc added fixed labels Aug 11, 2013

@rsc rsc added this to the Go1.2 milestone Apr 14, 2015

@rsc rsc removed the go1.2 label Apr 14, 2015

@gopherbot gopherbot locked and limited conversation to collaborators Jun 24, 2016

This issue was closed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.