bufio: unexpected behavior of (*Scanner).Scan(), (*Scanner).Bytes() after ErrTooLong #65257

@elmeyer

Go version

go version go1.21.6 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/larsmeyer/Library/Caches/go-build'
GOENV='/Users/larsmeyer/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/larsmeyer/go/pkg/mod'
GONOPROXY='<redacted>'
GONOSUMDB='<redacted>'
GOOS='darwin'
GOPATH='/Users/larsmeyer/go'
GOPRIVATE='<redacted>'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.21.6/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.21.6/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.6'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/Users/larsmeyer/src/bufiotest/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/z6/xqtk238d3fv7krs5hmpsnlr40000gn/T/go-build33939231=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I created a Scanner whose input contains a line longer than MaxScanTokenSize.

Attaching as ZIP because the code is too large for go.dev/play:
bufiotest.zip

What did you see happen?

When it encounters the line, Scan() returns false and Err() returns ErrTooLong, both as expected. However, Bytes() unexpectedly returns nil. Only a subsequent call to Scan(), which returns true one final time, makes Bytes() non-nil; it then returns the last MaxScanTokenSize bytes read.

What did you expect to see?

Bytes() returning the last read MaxScanTokenSize bytes immediately when Scan() first returns false.

I do not understand the rationale for the return false here:

go/src/bufio/scan.go

Lines 193 to 196 in cc85462

if len(s.buf) >= s.maxTokenSize || len(s.buf) > maxInt/2 {
	s.setErr(ErrTooLong)
	return false
}

From my reading, other errors, such as I/O errors from the underlying io.Reader, do not return immediately; they only break out of the read loop. As a result, Bytes() returns the remaining buffer as soon as Scan() first returns false:

go/src/bufio/scan.go

Lines 220 to 223 in cc85462

if err != nil {
	s.setErr(err)
	break
}
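For comparison, here is a small sketch that exercises the I/O-error path with a reader that yields a few bytes and then fails. The failAfter type, the errBoom value, and the probeIOError helper are hypothetical names introduced only for this illustration. From the scan.go logic quoted above, my expectation is that the pending bytes are delivered by a Scan() call that still returns true, with the error already visible through Err(), and that the next Scan() returns false. This is worth confirming on your own Go version.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
)

// errBoom stands in for an arbitrary I/O error (hypothetical).
var errBoom = errors.New("boom")

// failAfter hands out its data, then returns err on every later Read.
type failAfter struct {
	data []byte
	err  error
}

func (f *failAfter) Read(p []byte) (int, error) {
	if len(f.data) == 0 {
		return 0, f.err
	}
	n := copy(p, f.data)
	f.data = f.data[n:]
	return n, nil
}

// ioResult records what the Scanner reports around the read error.
type ioResult struct {
	FirstScan  bool   // result of the first Scan() call
	FirstText  string // Text() after the first call
	ErrIsBoom  bool   // whether Err() already reports errBoom at that point
	SecondScan bool   // result of the second Scan() call
}

func probeIOError() ioResult {
	// An unterminated line followed by a read error.
	s := bufio.NewScanner(&failAfter{data: []byte("partial"), err: errBoom})

	var r ioResult
	r.FirstScan = s.Scan()
	r.FirstText = s.Text()
	r.ErrIsBoom = errors.Is(s.Err(), errBoom)
	r.SecondScan = s.Scan()
	return r
}

func main() {
	fmt.Printf("%+v\n", probeIOError())
}
```

Comparing this output with the ErrTooLong case shows where the two error paths differ in how they expose the buffered data.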

Labels: NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one).
