Description
The Reader type is initialized with a buffer size. Once set, there is no API to adjust this.
This can make it challenging when parsing a file with a combination of Peek and Discard calls.
One such pattern is something like:
var r *bufio.Reader = ...
for {
	buf, err := r.Peek(headerSize)
	if err != nil {
		return err
	}
	r.Discard(len(buf))
	hdr := parseHeader(buf)
	buf, err = r.Peek(hdr.Size)
	if err != nil {
		return err
	}
	r.Discard(len(buf))
	processData(buf)
}

The code above works just fine, but fails unexpectedly when the payload size exceeds the internal buffer of the Reader.
The current workaround is to check for ErrBufferFull, allocate a separate buffer, and read the expected number of bytes into that buffer. The workaround looks like:
var r *bufio.Reader = ...
for {
	...
	buf, err := r.Peek(hdr.Size)
-	if err != nil {
-		return err
-	}
-	r.Discard(len(buf))
+	switch {
+	case err == nil:
+		r.Discard(len(buf))
+	case err == bufio.ErrBufferFull:
+		buf = make([]byte, hdr.Size)
+		if _, err := io.ReadFull(r, buf); err != nil {
+			return err
+		}
+	default:
+		return err
+	}
	processData(buf)
}

It is unfortunate that I need to allocate a relatively large buffer in order to read large records, and that I'm not able to reuse that buffer for subsequent Reader operations.
I propose adding Reader.Buffer, with semantics similar to Scanner.Buffer:
Buffer sets the buffer to use when reading and the maximum size of buffer that may be allocated during reading. The maximum Peek or ReadSlice size is the larger of max and cap(buf). If max <= cap(buf), Reader will use this buffer only and do no allocation.
Buffer panics if it is called after reading has started.
Using this new API, the code above would look like:
var r *bufio.Reader = ...
+ r.Buffer(nil, 64<<20) // we never expect records to exceed 64MiB
for {
	...
}

/cc @josharian