New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: dead code elimination for side-effect free functions #14840

Open
mdempsky opened this Issue Mar 16, 2016 · 10 comments

Comments

Projects
None yet
5 participants
@mdempsky
Member

mdempsky commented Mar 16, 2016

Currently if we compile a Go program like:

package main
type dead int
func newDead() *dead { return new(dead) }
var x = newDead()
func main() {}

the linker can't dead code eliminate x or dead. This is because the "x = newDead()" initialization is compiled to an implicit init function, which causes the linker to pull in x and newDead.

(See https://golang.org/cl/20765 for a real world example.)

In general, just because x is otherwise unused, the linker can't get rid of the newDead() call because it might have side-effects. However, it should be possible for the compiler to help identify functions that are side-effect free, which could in turn let the linker be more aggressive about dead code elimination.

We would probably also need to tweak how cmd/compile generates package initializer functions for the linker to be able to eliminate individual initializers.

@mdempsky mdempsky added this to the Unplanned milestone Mar 16, 2016

@bradfitz

This comment has been minimized.

Member

bradfitz commented Mar 16, 2016

This could help with shrinking init-time map literal construction in e.g. the unicode package.

If the unicode.init funcs generating the unicode's various maps were flagged as side-effect-free, then the whole init funcs can be deadcode eliminated if nobody referred to those maps.

/cc @crawshaw @josharian

@randall77

This comment has been minimized.

Contributor

randall77 commented Mar 16, 2016

init methods are a problem because they may reference (& initialize)
otherwise dead globals. Anything we can do to initialize globals without
inits will help (see https://go-review.googlesource.com/c/17398/ for an
example). It doesn't look like this is an easy transformation here,
because newDead() could do anything.
If instead it was:

var x = new(dead)

maybe we could rewrite that in the compiler to

var _x dead
var x = &_x

the latter wouldn't need any init code, just relocations, so if x was
otherwise unused both x and _x would be stripped out at link time.

On Wed, Mar 16, 2016 at 1:49 PM, Brad Fitzpatrick notifications@github.com
wrote:

This could help with shrinking init-time map literal construction in e.g.
the unicode package.

If the unicode.init funcs generating the unicode's various maps were
flagged as side-effect-free, then the whole init funcs can be deadcode
eliminated if nobody referred to those maps.

/cc @crawshaw https://github.com/crawshaw @josharian
https://github.com/josharian


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#14840 (comment)

@mdempsky

This comment has been minimized.

Member

mdempsky commented Mar 16, 2016

@randall77 I think during escape analysis we could conservatively detect whether a function has side effects?

@randall77

This comment has been minimized.

Contributor

randall77 commented Mar 16, 2016

I don't think the current escape analysis can help, as it runs after the init functions are built. You'd want to decide the side-effect-freeness of the RHS of global initializers before the init function is built.
Probably a very simple analysis would capture most of the cases we care about.

And then you'd have to have an init function per global, have it associated with the global, and only link it in if the global was otherwise reachable.

@josharian

This comment has been minimized.

Contributor

josharian commented Mar 16, 2016

The compiler already has samesafeexpr. We could remove the "same" part and steal that.

I wonder whether have an init function per global (with all the associated overhead) would end up outweighing the other good done.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Mar 16, 2016

Overhead from too many init functions is a fair concern. I'd imagine a lot of the functions could be compiled as nosplit, which might reduce the overhead.

Also, traditionally ELF compilers generate .init sections with straight line instructions that are just pasted together, skipping function call overhead for simple functions. (The .init code still ends up forming a proper function because the linker will also include C runtime files like crti.o and crtn.o, which supply the appropriate function prologue/epilogue in their .init sections.)

@crawshaw

This comment has been minimized.

Contributor

crawshaw commented Mar 17, 2016

Anyone writing code for this? If not, I'll spend a couple of days on it.

Assuming each global with side-effect free initialization is split off, I suspect the linker will need to merge the safe init functions together after reachability analysis. But that should be safe to do as long as we distinguish the safe generated init code from general init functions.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Mar 17, 2016

All yours.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Mar 17, 2016

@crawshaw Go for it.

@crawshaw

This comment has been minimized.

Contributor

crawshaw commented Mar 21, 2016

I made a bit of progress on this, but after looking closer I don't know how it will help binary size.

The obvious target is the range tables in the unicode package. It would be nice to make it possible for the linker's deadcode pass to catch exported symbols like Nl, Nd, etc. But: all of these tables are included in the unicode.Categories map, which is used by the regexp package.

All the real world programs I looked at use package regexp. Even tiny ones like objdump. So while this will make the "hello world" program from #6853 look good, I don't see how it will help in practice.

If anyone can convince me otherwise I'll dust this off, but for now this looks like an academic exercise to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment