Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: add some code motion optimizations #56620

Closed
erifan opened this issue Nov 7, 2022 · 11 comments
Closed

cmd/compile: add some code motion optimizations #56620

erifan opened this issue Nov 7, 2022 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@erifan
Copy link

erifan commented Nov 7, 2022

What version of Go are you using (go version)?

$ go version
go version devel go1.20-3b6b39c869 Mon Nov 7 09:02:20 2022 +0800 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/erifan01/Library/Caches/go-build"
GOENV="/Users/erifan01/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/usr/local/gopath/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/usr/local/gopath"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/erifan01/go-master"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/erifan01/go-master/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="devel go1.20-3b6b39c869 Mon Nov 7 09:02:20 2022 +0800"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/erifan01/go-master/src/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/vm/32796pv57pg0hxrpyyvkww040000gt/T/go-build2092224947=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Run the following benchmark:
$ go test -run ^$ -count 5 -bench BenchmarkEfaceEq -cpuprofile cpu.out

type TS uint16

var (
        e  any
        ts TS
)

func BenchmarkEfaceEq(b *testing.B) {
        for n := 0; n < b.N; n++ {
                _ = e != ts
        }
}

The core code is the following lines:

                _ = e != ts
  0x1000efd94           f0000abb                ADRP 1404928(PC), R27
  0x1000efd98           794a2f63                MOVHU 1302(R27), R3
  0x1000efd9c           79005fe3                MOVH R3, 46(RSP)
  0x1000efda0           d000091b                ADRP 1187840(PC), R27
  0x1000efda4           f946eb63                MOVD 3536(R27), R3
  0x1000efda8           d000091b                ADRP 1187840(PC), R27      // hotspot but useless instruction for the test
  0x1000efdac           f946ef64                MOVD 3544(R27), R4         // hotspot but useless instruction for the test
  0x1000efdb0           b00001e5                ADRP 249856(PC), R5
  0x1000efdb4           913600a5                ADD $3456, R5, R5
  0x1000efdb8           eb05007f                CMP R5, R3
  0x1000efdbc           54fffe41                BNE -14(PC)

However, the hot-spot instructions for this test are two useless instructions for the test because the test does not go to that code path ( a call to runtime.efaceeq). If we can move these instructions out of the execution path of this test and into another branch, I believe the performance of this test can be greatly improved.

This is just a demo case, I believe there are more places where code motion/sinking optimization is needed. But currently there is no such optimization in Go?

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Nov 7, 2022
@mknyszek mknyszek added this to the Backlog milestone Nov 7, 2022
@mknyszek
Copy link
Contributor

mknyszek commented Nov 7, 2022

CC @golang/compiler

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 7, 2022
@randall77
Copy link
Contributor

This looks like a restriction of the tighten pass, where it can't handle moving ops that use memory (a load, in this case).
See cmd/compile/internal/ssa/tighten.go:29.
That restriction could be lifted in some cases, like this one. We'd just need to compute memory states at the start of each block first.

@randall77 randall77 modified the milestones: Backlog, Unplanned Nov 7, 2022
@randall77 randall77 added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 7, 2022
@erifan
Copy link
Author

erifan commented Nov 8, 2022

Indeed, I didn't see that we already have some code motion optimizations. Thanks @randall77

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/458755 mentions this issue: cmd/compile: enhance tighten pass for memory values

@mvdan
Copy link
Member

mvdan commented May 3, 2023

Reopening since I had to revert the CL.

@mvdan mvdan reopened this May 3, 2023
@erifan
Copy link
Author

erifan commented May 4, 2023

The failure is due to this assumption being broken:
1. The start memory state of a block is InitMem, a Phi node of type mem or an only incoming memory value.

For the failed case, b13 is as follow:
image

So b13 has two incoming memory values.

It is broken by CL 478475

@randall77 Does the assumption above still hold?

@randall77
Copy link
Contributor

My guess is that's a bug in 478475. I'll take a look tomorrow.

@erifan
Copy link
Author

erifan commented May 4, 2023

Thanks, this is a reproducer.

package t

import(
        "encoding/binary"
)

type digest struct {
        h   [5]uint32
        x   [chunk]byte
        nx  int
        len uint64
}

const (
        chunk = 64
        magic         = "sha\x01"
        marshaledSize = len(magic) + 5*4 + chunk + 8
)

func (d *digest) MarshalBinary() ([]byte, error) {
        b := make([]byte, 0, marshaledSize)
        b = append(b, magic...)
        b = binary.BigEndian.AppendUint32(b, d.h[0])
        b = binary.BigEndian.AppendUint32(b, d.h[1])
        b = binary.BigEndian.AppendUint32(b, d.h[2])
        b = binary.BigEndian.AppendUint32(b, d.h[3])
        b = binary.BigEndian.AppendUint32(b, d.h[4])
        b = append(b, d.x[:d.nx]...)
        b = b[:len(b)+len(d.x)-d.nx] // already zero
        b = binary.BigEndian.AppendUint64(b, d.len)
        return b, nil
}

@randall77
Copy link
Contributor

Fix at CL 492616. @erifan Once that is in, you can retry your CL.

@erifan
Copy link
Author

erifan commented May 5, 2023

Ok, thanks.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/492895 mentions this issue: cmd/compile: enhance tighten pass for memory values

@golang golang locked and limited conversation to collaborators May 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FeatureRequest FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

5 participants