Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: function argument value escapes no matter the run-time condition #42781

Open
Julio-Guerra opened this issue Nov 23, 2020 · 6 comments
Open

Comments

@Julio-Guerra
Copy link

@Julio-Guerra Julio-Guerra commented Nov 23, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15.5 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/julio/.cache/go-build"
GOENV="/home/julio/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/julio/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/julio/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.15"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.15/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build392370816=/tmp/go-build -gno-record-gcc-switches"

What did you do?

While benchmarking a time histogram implementation, I found an unexpected allocation in a IncrementOrStore() method implementation, happening directly in the prolog code the compiler generated, where the return value is allocated regardless of whether it's needed or not.

I could reproduce the issue with the following dummy function:

func foo(cond bool, v int) *int {
    if cond {
        return &v
    }
    return nil
}

The compiled function of the previous example will allocate the int value in its function prolog rather than in the true-condition path:

TEXT    "".foo(SB), ABIInternal, $24-24
MOVQ    (TLS), CX
CMPQ    SP, 16(CX)
PCDATA  $0, $-2
JLS     102
PCDATA  $0, $-1
SUBQ    $24, SP
MOVQ    BP, 16(SP)
LEAQ    16(SP), BP
FUNCDATA        $0, gclocals·54241e171da8af6ae173d69da0236748(SB)
FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
LEAQ    type.int(SB), AX
MOVQ    AX, (SP)
PCDATA  $1, $0
CALL    runtime.newobject(SB) // <--- The int value is allocated in the function prolog, no matter the condition
MOVQ    8(SP), AX
MOVQ    "".v+40(SP), CX
MOVQ    CX, (AX)
MOVBLZX "".cond+32(SP), CX
NOP
TESTB   CL, CL
JEQ     83
MOVQ    AX, "".~r2+48(SP)
MOVQ    16(SP), BP
ADDQ    $24, SP
RET
MOVQ    $0, "".~r2+48(SP)
MOVQ    16(SP), BP
ADDQ    $24, SP
RET
NOP
PCDATA  $1, $-1
PCDATA  $0, $-2
CALL    runtime.morestack_noctxt(SB)
PCDATA  $0, $-1
JMP     0

The workaround I use for now is to rather return the value from the condition block scope:

func foo(cond bool, v int) *int {
    if cond {
+       v := v // new declaration in this scope to avoid using function scope
        return &v
    }
    return nil
}

Resulting in:

TEXT    "".foo(SB), ABIInternal, $24-24
MOVQ    (TLS), CX
CMPQ    SP, 16(CX)
PCDATA  $0, $-2
JLS     101
PCDATA  $0, $-1
SUBQ    $24, SP
MOVQ    BP, 16(SP)
LEAQ    16(SP), BP
FUNCDATA        $0, gclocals·54241e171da8af6ae173d69da0236748(SB)
FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
MOVBLZX "".cond+32(SP), AX
TESTB   AL, AL
JEQ     82
LEAQ    type.int(SB), AX
MOVQ    AX, (SP)
PCDATA  $1, $0
CALL    runtime.newobject(SB)  // <--- The int value is no longer allocated by the function prolog, and according to the condition
MOVQ    8(SP), AX
MOVQ    "".v+40(SP), CX
MOVQ    CX, (AX)
MOVQ    AX, "".~r2+48(SP)
MOVQ    16(SP), BP
ADDQ    $24, SP
RET
MOVQ    $0, "".~r2+48(SP)
MOVQ    16(SP), BP
ADDQ    $24, SP
RET
NOP
PCDATA  $1, $-1
PCDATA  $0, $-2
CALL    runtime.morestack_noctxt(SB)
PCDATA  $0, $-1
JMP     0
@cuonglm
Copy link
Contributor

@cuonglm cuonglm commented Nov 23, 2020

Would you mind elaborating more details? Both your programs above will be heap allocated the v parameter.

@Julio-Guerra
Copy link
Author

@Julio-Guerra Julio-Guerra commented Nov 23, 2020

Would you mind elaborating more details? Both your programs above will be heap allocated the v parameter.

I update the issue with the generated asm, and highlighting the change in the allocation position: one is always, while the other is conditional.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 23, 2020

Yes, the escape analysis is per-variable and is not flow-sensitive. Making it flow sensitive would require identifying when a variable can safely be split--basically, automatically inserting the v := v that you use in your second example. This seems doable in principle, but it doesn't seem easy. I'm not sure that it would help many real programs.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 23, 2020

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Nov 23, 2020

This is one case where if we do escape analysis in SSA it probably can be done fairly easily. (We've been thinking about it for a while. But it needs a lot of work.)

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Nov 24, 2020

I'd prefer you didn't call it "wrong".
Lots of code is slower than it could be if we focused our attention on that one single piece of code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.