runtime: optimize memmove for 1-16 MB overlapping case on AMD64 #49058

Open
@weishi-deng

What version of Go are you using (go version)?

$ go1.16.4

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/golang"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build287939000=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I ran test cases that call 'runtime.memmove' with overlapping source and destination regions and data sizes between 1MB and 16MB.
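A minimal sketch of what such a test case might look like, not the reporter's actual benchmark. It assumes that the built-in copy on byte slices ends up in 'runtime.memmove', and it uses a hypothetical 64-byte forward shift to make source and destination overlap; the helper names overlapCopy and benchOverlap are invented for illustration. The issue does not state the overlap direction or offset, so both are assumptions.

```go
package main

import (
	"fmt"
	"testing"
)

// overlapCopy moves the first n bytes of buf forward by shift bytes.
// Source buf[:n] and destination buf[shift:shift+n] overlap whenever
// shift < n. It returns the number of bytes copied.
func overlapCopy(buf []byte, n, shift int) int {
	return copy(buf[shift:shift+n], buf[:n])
}

// benchOverlap times overlapCopy for one size using testing.Benchmark,
// which can be called from an ordinary program as well as from go test.
func benchOverlap(n, shift int) testing.BenchmarkResult {
	buf := make([]byte, n+shift)
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(n)) // report throughput in MB/s
		for i := 0; i < b.N; i++ {
			overlapCopy(buf, n, shift)
		}
	})
}

func main() {
	const shift = 64 // hypothetical offset, not taken from the issue
	for _, mb := range []int{1, 2, 4, 8, 16} {
		r := benchOverlap(mb<<20, shift)
		fmt.Printf("%2d MB overlapping copy: %s\n", mb, r)
	}
}
```

Running the same loop with separate source and destination buffers would give a non-overlapping baseline for comparison.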

What did you expect to see?

'runtime.memmove' chooses the more efficient of the two copy strategies (non-temporal stores or temporal stores) for the data.

What did you see instead?

When the source and destination overlap and the size is between 1MB and 16MB, copying with non-temporal stores is slower than copying with temporal stores, yet 'runtime.memmove' still uses non-temporal stores.

Labels: NeedsInvestigation, Performance, compiler/runtime
