Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: support inlining OCALLPART #18493

Open
josharian opened this issue Jan 2, 2017 · 6 comments

Comments

@josharian
Copy link
Contributor

commented Jan 2, 2017

context.WithCancel allocates three objects. One is a *cancelCtx, unavoidable. Another is a chan struct{}, hard to eliminate without runtime magic or a language change. The last is a closure:

	return &c, func() { c.cancel(true, Canceled) }

Canceled is a top-level var initialized with errors.New; it is an *errorString under the hood. Note that true and Canceled are both static data. And c has already leaked to the heap, courtesy of the &c return. The method cancel can be staticly resolved. So it seems like in principle it ought to be possible not to allocate for the closure; everything we need is either static or on the heap already.

Here's a straw man for how this might be accomplished. Generate a wrapper method:

func (c * cancelCtx) cancelTrueCanceled() { c.cancel(true, Canceled) }

and rewrite the original return to:

	return &c, c.cancelTrueCanceled

(This can also be done manually, if this first transformation is too hard.) Now we have a partial method call with no free variables instead of a closure.

Now represent this partial method call as a tuple: <receiver, method>. The receiver we already have. The method could be a function pointer. The receiver and function pointers are values and can be passed around safely on the stack. This is not dissimilar to what we do now, except that the existing formulation always requires heap allocation. In some (but not all) cases, the heap allocation could be avoided.

This is might be too big a change relative to the benefits it would bring, but thought I'd record it anyway. And maybe someone else sees a simpler way to eliminate this allocation.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Jan 2, 2017

I'd support doing this by hand at least, considering how often context is used.

@bradfitz bradfitz added this to the Go1.9 milestone Jan 2, 2017
@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2017

Yeah, I filed this because package context accounts for almost 25% of allocations in my GC-bound program.

The first part (introducing a wrapper method) we can do by hand. The second part (keeping things on the stack in some cases) requires a compiler change.

I wish there was a way to tackle the chan struct{} allocation as well, but all my ideas seem like non-starters: there's no way to embed a channel value in a struct, can't have a sync.Pool of channels because there's no way to un-close a channel, etc.

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2017

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Jan 27, 2017

I just spent a bit of time with this. The only path forward I see that doesn't involve pretty major upheaval is teaching the compiler to inline OCALLPART, in the hopes that context.WithCancel would be inlined, and that in most cases the returned CancelFunc wouldn't escape, so that the closure data could go on the heap. I'm not sure that that would suffice, but it'd be a first place to look. (It also currently requires the manual code transformation done above to return a method value.)

@josharian josharian modified the milestones: Unplanned, Go1.9 Feb 28, 2017
@gopherbot

This comment has been minimized.

Copy link

commented Feb 18, 2018

Change https://golang.org/cl/94764 mentions this issue: cmd/compile: fold bit masking on bits that have been shifted away.

@mdempsky mdempsky changed the title cmd/compile: eliminate some partial function / closure allocations cmd/compile: support inlining OCALLPART Apr 22, 2019
@mdempsky

This comment has been minimized.

Copy link
Member

commented Apr 22, 2019

Agreed that extending the inliner to support inlining OCALLPART seems like the way forward here. Retitling to reflect that.

Naively, I think this might be as easy as just removing the logic from hairyVisitor.visit that prevents inlining of functions containing OCALLPART. Off hand, I can't think of anything special we need to do with OCALLPART for inlining.

The other two more general related issues that would address this are:

  1. Fixing inlining of functions with function literals (#28727); fixing inlining of OCALLPART is probably a subset of this.

  2. Support pushing allocation of returned objects into caller frames (#22081), which would allow allocating the closure in the caller frame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.