New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unexplained allocation for convT2I #15451

Open
robpike opened this Issue Apr 26, 2016 · 3 comments

Comments

Projects
None yet
4 participants
@robpike
Contributor

robpike commented Apr 26, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?

go version devel +2bf7034 Mon Apr 25 16:18:10 2016 +0000 darwin/amd64

  1. What operating system and processor architecture are you using (go env)?

`````` darwin/amd64```

  1. What did you do?

See attached typescript. I do not understand why line 28, the call to sort.Sort, is allocating. It is in a call to convT2I but a, a.r, and the slice a.r[i] points to are already on the heap. It seems like escape analysis and/or optimizations are missing a chance to avoid an allocation.

# _/Users/r/bug
./x_test.go:35: can inline sortable.Len
./x_test.go:36: can inline sortable.Less
./x_test.go:37: can inline sortable.Swap
./x_test.go:25: make([]int, 5) escapes to heap
./x_test.go:28: sortable(r) escapes to heap
./x_test.go:23: new(A) escapes to heap
./x_test.go:12: t.common escapes to heap
./x_test.go:8: leaking param: t
./x_test.go:12: "allocs:" escapes to heap
./x_test.go:12: allocs escapes to heap
./x_test.go:14: t.common escapes to heap
./x_test.go:14: "expected 23 allocations, got " escapes to heap
./x_test.go:14: allocs escapes to heap
./x_test.go:9: TestParseAllocs func literal does not escape
./x_test.go:12: TestParseAllocs ... argument does not escape
./x_test.go:14: TestParseAllocs ... argument does not escape
./x_test.go:35: sortable.Len s does not escape
./x_test.go:36: sortable.Less s does not escape
./x_test.go:37: sortable.Swap s does not escape
<autogenerated>:1: inlining call to sortable.Len
<autogenerated>:1: (*sortable).Len .this does not escape
<autogenerated>:2: inlining call to sortable.Less
<autogenerated>:2: (*sortable).Less .this does not escape
<autogenerated>:3: inlining call to sortable.Swap
<autogenerated>:3: (*sortable).Swap .this does not escape
<autogenerated>:4: leaking param: .this
<autogenerated>:5: leaking param: .this
<autogenerated>:6: leaking param: .this
# testmain
/var/folders/g9/s3yf6f1n54bgn16vwpxnbxvm0004fc/T/go-build913880170/_/Users/r/bug/_test/_testmain.go:52: inlining call to testing.MainStart
/var/folders/g9/s3yf6f1n54bgn16vwpxnbxvm0004fc/T/go-build913880170/_/Users/r/bug/_test/_testmain.go:37: leaking param: pat
/var/folders/g9/s3yf6f1n54bgn16vwpxnbxvm0004fc/T/go-build913880170/_/Users/r/bug/_test/_testmain.go:37: leaking param: str
/var/folders/g9/s3yf6f1n54bgn16vwpxnbxvm0004fc/T/go-build913880170/_/Users/r/bug/_test/_testmain.go:52: main &testing.M literal does not escape
--- FAIL: TestParseAllocs (0.00s)
    x_test.go:12: allocs: 11
    x_test.go:14: expected 23 allocations, got  11
FAIL
exit status 1
FAIL    _/Users/r/bug   0.031s
dunnart=% go test -memprofile=foo -memprofilerate=1
PASS
ok      _/Users/r/bug   0.034s
dunnart=% go tool pprof --alloc_objects bug.test foo 
Entering interactive mode (type "help" for commands)
(pprof) list foo
Total: 1145
ROUTINE ======================== _/Users/r/bug.foo in /Users/r/bug/x_test.go
      1111       1111 (flat, cum) 97.03% of Total
         .          .     18:type A struct {
         .          .     19:   r [5][]int
         .          .     20:}
         .          .     21:
         .          .     22:func foo() *A {
       101        101     23:   a := new(A)
         .          .     24:   for i := 0; i < 5; i++ {
       505        505     25:       a.r[i] = make([]int, 5)
         .          .     26:   }
         .          .     27:   for _, r := range a.r {
       505        505     28:       sort.Sort(sortable(r))
         .          .     29:   }
         .          .     30:   return a
         .          .     31:}
         .          .     32:
         .          .     33:type sortable []int
(pprof) disasm foo
Total: 1145
ROUTINE ======================== _/Users/r/bug.foo
      1111       1111 (flat, cum) 97.03% of Total
         .          .      66740: GS MOVQ GS:0x8a0, CX
         .          .      66749: LEAQ -0x60(SP), AX
         .          .      6674e: CMPQ 0x10(CX), AX
         .          .      66752: JBE 0x668d0
         .          .      66758: SUBQ $0xe0, SP
         .          .      6675f: LEAQ 0x7cb9a(IP), AX
         .          .      66766: MOVQ AX, 0(SP)
       101        101      6676a: CALL runtime.newobject(SB)
         .          .      6676f: MOVQ 0x8(SP), AX
         .          .      66774: MOVQ AX, 0x48(SP)
         .          .      66779: XORL CX, CX
         .          .      6677b: MOVQ CX, 0x30(SP)
         .          .      66780: CMPQ $0x5, CX
         .          .      66784: JGE 0x667fa
         .          .      66786: LEAQ 0x70333(IP), DX
         .          .      6678d: MOVQ DX, 0(SP)
         .          .      66791: MOVQ $0x5, 0x8(SP)
         .          .      6679a: MOVQ $0x5, 0x10(SP)
       505        505      667a3: CALL runtime.makeslice(SB)
         .          .      667a8: MOVQ 0x20(SP), AX
         .          .      667ad: MOVQ 0x28(SP), CX
         .          .      667b2: MOVQ 0x18(SP), DX
         .          .      667b7: MOVQ 0x48(SP), BX
         .          .      667bc: TESTL AL, 0(BX)
         .          .      667be: MOVQ 0x30(SP), BP
         .          .      667c3: LEAQ 0(BP)(BP*2), R8
         .          .      667c8: MOVQ AX, 0x8(BX)(R8*8)
         .          .      667cd: MOVQ CX, 0x10(BX)(R8*8)
         .          .      667d2: LEAQ 0(BX)(R8*8), AX
         .          .      667d6: MOVL 0x13a034(IP), CX
         .          .      667dc: TESTL CL, CL
         .          .      667de: JNE 0x668b3
         .          .      667e4: MOVQ DX, 0(BX)(R8*8)
         .          .      667e8: LEAQ 0x1(BP), CX
         .          .      667ec: MOVQ BX, AX
         .          .      667ef: MOVQ CX, 0x30(SP)
         .          .      667f4: CMPQ $0x5, CX
         .          .      667f8: JL 0x66786
         .          .      667fa: MOVQ 0(AX), CX
         .          .      667fd: MOVQ CX, 0x68(SP)
         .          .      66802: LEAQ 0x8(AX), SI
         .          .      66806: LEAQ 0x70(SP), DI
         .          .      6680b: CALL 0x507ae
         .          .      66810: XORL CX, CX
         .          .      66812: LEAQ 0x68(SP), DX
         .          .      66817: MOVQ CX, 0x38(SP)
         .          .      6681c: MOVQ DX, 0x40(SP)
         .          .      66821: CMPQ $0x5, CX
         .          .      66825: JGE 0x668a3
         .          .      66827: MOVQ 0x10(DX), BX
         .          .      6682b: MOVQ 0x8(DX), BP
         .          .      6682f: MOVQ 0(DX), SI
         .          .      66832: MOVQ SI, 0x50(SP)
         .          .      66837: MOVQ BP, 0x58(SP)
         .          .      6683c: MOVQ BX, 0x60(SP)
         .          .      66841: LEAQ 0x10d518(IP), BX
         .          .      66848: MOVQ BX, 0(SP)
         .          .      6684c: LEAQ 0x50(SP), BX
         .          .      66851: MOVQ BX, 0x8(SP)
         .          .      66856: MOVQ $0x0, 0x10(SP)
       505        505      6685f: CALL runtime.convT2I(SB)
         .          .      66864: MOVQ 0x20(SP), AX
         .          .      66869: MOVQ 0x18(SP), CX
         .          .      6686e: MOVQ CX, 0(SP)
         .          .      66872: MOVQ AX, 0x8(SP)
         .          .      66877: CALL sort.Sort(SB)
         .          .      6687c: MOVQ 0x40(SP), BX
         .          .      66881: LEAQ 0x18(BX), DX
         .          .      66885: MOVQ 0x38(SP), BX
         .          .      6688a: LEAQ 0x1(BX), CX
         .          .      6688e: MOVQ 0x48(SP), AX
         .          .      66893: MOVQ CX, 0x38(SP)
         .          .      66898: MOVQ DX, 0x40(SP)
         .          .      6689d: CMPQ $0x5, CX
         .          .      668a1: JL 0x66827
         .          .      668a3: MOVQ AX, 0xe8(SP)
         .          .      668ab: ADDQ $0xe0, SP
         .          .      668b2: RET
         .          .      668b3: MOVQ AX, 0(SP)
         .          .      668b7: MOVQ DX, 0x8(SP)
         .          .      668bc: CALL runtime.writebarrierptr(SB)
         .          .      668c1: MOVQ 0x48(SP), BX
         .          .      668c6: MOVQ 0x30(SP), BP
         .          .      668cb: JMP 0x667e8
         .          .      668d0: CALL runtime.morestack_noctxt(SB)
         .          .      668d5: JMP _/Users/r/bug.foo(SB)
         .          .      668da: INT $0x3
         .          .      668db: INT $0x3
         .          .      668dc: INT $0x3
         .          .      668dd: INT $0x3
         .          .      668de: INT $0x3
(pprof) quit
dunnart=% cat x_test.go
package foo

import (
    "sort"
    "testing"
)

func TestParseAllocs(t *testing.T) {
    allocs := testing.AllocsPerRun(100, func() {
        foo()
    })
    t.Log("allocs:", allocs)
    if allocs != 11 {
        t.Fatal("expected 11 allocations, got ", allocs)
    }
}

type A struct {
    r [5][]int
}

func foo() *A {
    a := new(A)
    for i := 0; i < 5; i++ {
        a.r[i] = make([]int, 5)
    }
    for _, r := range a.r {
        sort.Sort(sortable(r))
    }
    return a
}

type sortable []int

func (s sortable) Len() int           { return len(s) }
func (s sortable) Less(i, j int) bool { return s[i] < s[j] }
func (s sortable) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }
dunnart=% 
@randall77

This comment has been minimized.

Contributor

randall77 commented Apr 26, 2016

package main

import "sort"

func f1(a []int) {
    sort.Sort(sortable(a))
}
func f2(a []int) {
    g(sortable(a))
}
//go:noescape
func g(s sort.Interface)

f1 allocates and f2 doesn't. (You can tell just by looking at the generated assembly and see what the 3rd arg to convT2I is. If it is 0, then it allocates. If it is a pointer to a stack temp, it doesn't.)

With -m, we get (among other reports):

tmp3.go:6: sortable(a) escapes to heap
tmp3.go:10: f2 sortable(a) does not escape

So I think this has something to do with the analysis. Maybe because quickSort is recursive and escape analysis gives up?

@dr2chase

@dr2chase

This comment has been minimized.

Contributor

dr2chase commented Apr 26, 2016

Was just looking at this, decided (in the process) that the explainer (-m -m) could use a bit more work (did not finger the call properly for the escape at line 28 -- it is to Sort).

And the short answer is "you passed the interface to Sort, and Sort was recorded as escUnknown, so it escaped". The longer answer is "and Sort contains a call to quickSort which is recursive, and escape analysis punts on recursive functions, and thought I did work on a CL for not punting, it was very, very fragile and did not produce much benefit, so I let it get pretty stale". I will try to revive that CL and see if it helps here at all -- if it does, that raises its priority, and if not, then not.

@dr2chase

This comment has been minimized.

Contributor

dr2chase commented Apr 27, 2016

Did a bit more work and brought the recursive-function-nest-escape-analyzer forward, and that by itself was not enough.

@bradfitz bradfitz added the Performance label May 4, 2016

@bradfitz bradfitz added this to the Unplanned milestone May 4, 2016

@josharian josharian changed the title from cmd/gc: unexplained allocation for convT2I to cmd/compile: unexplained allocation for convT2I Feb 7, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment