
strings: IndexAny performance regression vs 1.9 #22750

@TocarIP

What version of Go are you using (go version)?

go version devel +ef0e2af Mon Nov 6 15:55:31 2017 +0000 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/nfs/site/home/itocar/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/localdisk/itocar/gopath/"
GORACE=""
GOROOT="/localdisk/itocar/golang"
GOTMPDIR=""
GOTOOLDIR="/localdisk/itocar/golang/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build447443283=/tmp/go-build -gno-record-gcc-switches"

/proc/cpuinfo:
model name : Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz

What did you do?
I compared strings/IndexAnyASCII/1 performance to 1.9 and found a regression.
It is most prominent on the IndexAnyASCII/1:4 sub-benchmark, but the others also show it.

src/strings/IndexAnyASCII/1:4 15.0ns ± 0% 17.1ns ± 0% +14.00% (p=0.000 n=8+8)
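
For anyone who wants to reproduce the measurement outside the Go tree, here is a minimal sketch. It assumes that a sub-benchmark name `k:j` means a haystack of `k` non-matching bytes searched for a set of `j` ASCII characters. The helper `benchIndexAny` and the `sink` variable are made up for this illustration and are not part of the strings test suite.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from discarding the IndexAny result.
var sink int

// benchIndexAny is a hypothetical helper: it times strings.IndexAny on a
// haystack of k '#' bytes (which never match) against a set of j ASCII chars.
func benchIndexAny(k, j int) testing.BenchmarkResult {
	x := strings.Repeat("#", k)
	cs := "0123456789abcdef"[:j]
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = strings.IndexAny(x, cs)
		}
	})
}

func main() {
	// Assumed equivalent of the 1:4 case discussed above.
	r := benchIndexAny(1, 4)
	fmt.Printf("IndexAnyASCII/1:4\t%s\n", r)
}
```

Building this once with 1.9 and once with tip, then comparing the ns/op figures, should show whether the same gap appears on other machines.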

Bisecting points to c82ee79 and b78b54f.

b78b54f looks mostly unrelated, but
c82ee79 changes the generated code of IndexAny.
Removing the useless check makes sense to me, but it also caused changes inside the main loop.
The main problem I see is an extra LoadReg in the loop.

jmpq   df
mov    0xb8(%rsp),%r8        // this was done outside of the loop, before
cmp    %r8,%rdi
jge    dc
mov    0xb0(%rsp),%r9
movzbl (%r9,%rdi,1),%r10d
cmp    $0x80,%r10d

It is placed there by the regalloc pass.
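
To make the listing easier to follow, here is a hand-written sketch of the kind of loop it appears to come from. This is my reading, not the actual strings source: the compare against `$0x80` looks like the `utf8.RuneSelf` check of a range-over-string loop, and the two loads from the stack look like the string's length and data pointer. The function name `indexAnyNaive` is invented for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// indexAnyNaive is an illustrative stand-in for a non-ASCII-set search path.
// It spells out by hand what a range loop over s does.
func indexAnyNaive(s, chars string) int {
	for i := 0; i < len(s); { // loop bound: the length that ideally stays in a register
		c := rune(s[i]) // byte load, like movzbl (%r9,%rdi,1),%r10d
		width := 1
		if c >= utf8.RuneSelf { // like cmp $0x80,%r10d
			c, width = utf8.DecodeRuneInString(s[i:])
		}
		for _, m := range chars {
			if c == m {
				return i
			}
		}
		i += width
	}
	return -1
}

func main() {
	fmt.Println(indexAnyNaive("#", "0123"))   // prints -1
	fmt.Println(indexAnyNaive("ab3", "0123")) // prints 2
}
```

In a loop like this, the length and pointer of `s` do not change between iterations, so reloading them from the stack on every pass is pure overhead. For a one-byte haystack that overhead is a large share of the total work.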
