Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: Missed opportunity for post-inlining devirtualization #38992

Open
alex opened this issue May 11, 2020 · 8 comments
Open

cmd/compile: Missed opportunity for post-inlining devirtualization #38992

alex opened this issue May 11, 2020 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@alex
Copy link
Contributor

alex commented May 11, 2020

What version of Go are you using (go version)?

$ go version
1.14

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

amd64/linux

What did you do?

When compiling the following code, there are missed opportunities for devirtualization: https://go.godbolt.org/z/7tPdJK

package main

type funcTable interface {
    Cmp(a interface{}, b interface{}) bool
	MinMax(minVal interface{}, maxVal interface{}, val interface{}) (interface{}, interface{})
}

func Min(table funcTable, a interface{}, b interface{}) interface {} {
    if table.Cmp(a, b) {
        return a
    } else {
        return b
    }
}

func Max(table funcTable, a interface{}, b interface{}) interface {} {
    if table.Cmp(a, b) {
        return b
    } else {
        return a
    }
}

type int32FuncTable struct{}

func (_ int32FuncTable) Cmp(a interface{}, b interface{}) bool {
    return a.(int32) < b.(int32)
}

func (table int32FuncTable) MinMax(minVal interface{}, maxVal interface{}, val interface{}) (interface{}, interface{}) {
    return Min(table, minVal, val), Max(table, maxVal, val)
}

What did you expect to see?

I expected that in32FuncTable.MinMax contains no indirect calls: Min/Max are inlined, and recursively Cmp is devirtualized and inlined.

What did you see instead?

Virtual calls to Cmp are emitted in int32FuncTable_MinMax_pc0:

        movq    go.itab."".int32FuncTable,"".funcTable+24(SB), AX
        // ....
        call    AX
@randall77 randall77 added this to the Unplanned milestone May 11, 2020
@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 11, 2020
@alex
Copy link
Contributor Author

alex commented May 11, 2020

I guess there's really two pieces here:
a) That call should just be a fully direct call, since it's reading a function pointer from a known itab.
b) Then it should be inlined because the function is below the inlining threshold.

@josharian
Copy link
Contributor

Inlining in the compiler happens far before devirtualization. Changing that is difficult, unfortunately. Missed optimization opportunities due to compiler ordering problems are among the most difficult to address.

@alex
Copy link
Contributor Author

alex commented May 11, 2020

Yes agreed. You basically end up needing to run some variant of passes 2x. I can imagine an architecture which tries to minimize costs looking something like:

  • Inlining phase tags functions which are inlineable (this must already be happening, -m shows <source>:26:6: can inline int32FuncTable.Cmp).
  • When devirtualization devirtualizes a call, if was tagged as an inlineable function it gets stored somewhere (I don't know gc's IR well, so this may require solving the part (a) problem I flagged where gc is still emitting an indirect call even after it was devirtualized)
  • Post-devirtualization, a mini-inlining pass occurs which inlines everything that was stored by the devirtualization pass.

Someone will inevitably be back a month after this and ask you to run devirtualization again after the second inlining, because unless you just run the whole optimizer stack again and again until you reach a fixed point, you'll always miss something, but this approach seems tractable and like it probably balances compiler performance with emitted code performance.

@josharian
Copy link
Contributor

It is unfortunately far more complicated than that. The internal representation in the compiler changes dramatically between those passes. You can’t just run a second “mini-inlining” pass later. I’d like to move inlining later in the compiler in general, but that’s a major undertaking and raises a number of thorny issues.

@alex
Copy link
Contributor Author

alex commented May 11, 2020

Oof, yes, different IRs would significantly complicate this. I'll refrain from further speculation, since while I was a compiler engineer in a former life, I don't know gc's internals at all.

@josharian
Copy link
Contributor

Well, if you feel inspired to learn, we’re happy to help you find your way around. More hands are always welcome. :)

@alex
Copy link
Contributor Author

alex commented May 11, 2020

For additional context, this issue was extracted from a search for a less code-duplicative way to implement the patch which became xitongsys/parquet-go#258

@silbinarywolf
Copy link
Contributor

In my closed duplicate thread, I have proposed changes in devirtualize.go that could achieve what you're asking. There's likely a penalty on the calls to EditChildren however that could/should ideally be lessened to only replace changed Call nodes.

#60032

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

6 participants