New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: improve escape analysis of make([]T, n) where n is non-constant #20533

Open
alandonovan opened this Issue May 30, 2017 · 9 comments

Comments

Projects
None yet
9 participants
@alandonovan
Contributor

alandonovan commented May 30, 2017

$ cat a.go
package p

var n = 3

func f() {
        slice := make([]*int, n)
        var i int
        slice[0] = &i
}

$ go tool compile  -m a.go
a.go:5:6: can inline f
a.go:6:15: make([]*int, n) escapes to heap
a.go:8:13: &i escapes to heap
a.go:7:6: moved to heap: i

In this program, the compiler's escape analysis judges that the array allocated by make escapes to the heap, when in fact it does not, presumably because its size is non-constant and thus it cannot be allocated on the stack (without alloca).

The lack of this optimization makes it hard to write a good bytecode interpreter in Go because the interpreter's operand stack has a non-constant size, and is thus heap-allocated, even though it is guaranteed by construction not to escape. Consequently, the interpreter incurs a heap allocation at the start of each function, and then a GC write barrier each time it stores an operand to the stack, which is a common operation.

Perhaps the notions of "escapes to heap" and "requires a write barrier" could be decoupled so that a non-constant-sized non-escaping heap variable could avoid write barriers. Or perhaps the compiler could use alloca to allocate non-constant-size non-escaping variables on the stack.

@randall77

This comment has been minimized.

Contributor

randall77 commented May 30, 2017

Implementing alloca wouldn't be impossible now that we have frame pointers everywhere. It's still tricky, though, because stack space is limited.
If we allocated unescaping things on the heap, we could have gc treat them as some sort of "extended stack" where write barriers were not necessary. Also tricky but probably doable.

@aclements

@josharian josharian changed the title from gc: improve escape analysis of make([]T, n) where n is non-constant to cmd/compile: improve escape analysis of make([]T, n) where n is non-constant May 31, 2017

@valyala

This comment has been minimized.

Contributor

valyala commented Jun 2, 2017

Please, do not allocate big chunks of data on stack. Otherwise issues similar to #18625 and #19817 will appear. Additionally, it would be great to have stack size profiler described at #20010 in order to detect stack (ab)users if optimizations similar to this one will go into go :)

Probably it would be better to add new type of memory - scope-allocated, which is allocated from a special heap and automatically freed on exit from the corresponding scope if the compiler can prove the memory doesn't escape from the scope.

@aclements

This comment has been minimized.

Member

aclements commented Jun 5, 2017

Probably it would be better to add new type of memory - scope-allocated, which is allocated from a special heap and automatically freed on exit from the corresponding scope if the compiler can prove the memory doesn't escape from the scope.

@RLH and I have been talking about doing this sort of thing for a while. There are several cases where escape analysis can in principle determine that an object's extent does not exceed its scope but it is still forced to heap-allocate it. It wouldn't necessarily require a special heap. We've been talking about adding an explicit runtime.free function the compiler could generate calls to. The design of the memory allocator makes this reasonable to support right in the regular heap.

@joshlf

This comment has been minimized.

joshlf commented Jan 24, 2018

@ezrosent and I were recently discussing a similar idea - that you could prove that a given pointer to a heap-allocated object is the only pointer to it, and thus even if it gets sent across a channel or in some other way escapes its scope, it could still be deterministically freed. The notion of such a special heap or an explicit runtime.free would definitely help with this optimization.

warpfork added a commit to polydawn/refmt that referenced this issue Mar 3, 2018

First pass of omitEmpty support.
Structs now inspect the value before each key, because yielding of the key
must of course be skipped if the value is to be skipped.

And yet, we're not done here, and that test is commented out for a reason!
This is more complicated than for e.g. stdlib json.Marshal -- we have to
emit length information at the beginning of an object.

And this, in turn, is capital-H Hard.  Emitting the correct length information
*up front* will require significantly more code changes, and they're a tad
controvertial.  We have to inspect *all* the fields to see if they're going
to be skipped.  And strangely, I think we're going to have to do that twice.
Checking for the fields to skip must happen at the top, that much is clear;
but then to remember which ones we already know will be skipped would require
O(n) memory in the length of the struct... which would imply a heap allocation
to track!  (Worrying about heap allocs is not news in the refmt project because
of our stepfunc design, but it's interesting to note we'd be in trouble anyway:
Go actually always lets runtime-sized slice creation escape to heap:
golang/go#20533 .)  So.  An O(2n) runtime is going
to be a better trade than slipping from constant to O(n) memory.  Hrmph.
Anyway, that bit will be in the next commits.

Signed-off-by: Eric Myhre <hash@exultant.us>

warpfork added a commit to polydawn/refmt that referenced this issue Mar 3, 2018

First pass of omitEmpty support.
Structs now inspect the value before each key, because yielding of the key
must of course be skipped if the value is to be skipped.

And yet, we're not done here, and that test is commented out for a reason!
This is more complicated than for e.g. stdlib json.Marshal -- we have to
emit length information at the beginning of an object.

And this, in turn, is capital-H Hard.  Emitting the correct length information
*up front* will require significantly more code changes, and they're a tad
controvertial.  We have to inspect *all* the fields to see if they're going
to be skipped.  And strangely, I think we're going to have to do that twice.
Checking for the fields to skip must happen at the top, that much is clear;
but then to remember which ones we already know will be skipped would require
O(n) memory in the length of the struct... which would imply a heap allocation
to track!  (Worrying about heap allocs is not news in the refmt project because
of our stepfunc design, but it's interesting to note we'd be in trouble anyway:
Go actually always lets runtime-sized slice creation escape to heap:
golang/go#20533 .)  So.  An O(2n) runtime is going
to be a better trade than slipping from constant to O(n) memory.  Hrmph.
Anyway, that bit will be in the next commits.

Signed-off-by: Eric Myhre <hash@exultant.us>

@ALTree ALTree added this to the Unplanned milestone Mar 28, 2018

@FlorianUekermann

This comment has been minimized.

Contributor

FlorianUekermann commented Jul 10, 2018

@ezrosent and I were recently discussing a similar idea - that you could prove that a given pointer to a heap-allocated object is the only pointer to it, and thus even if it gets sent across a channel or in some other way escapes its scope, it could still be deterministically freed.

I'm pretty sure this is quite hard to prove in cases like the one you describe (while limiting compile time), so you would have to resort to something like reference counting (not that I'm opposed to reference counting in general).
However, in simpler cases like var t = new(T) or var t = NewT() should be possible to prove whether the pointer escapes or not.

There is already escape analysis for function arguments (which had a huge impact on the performance of the fmt package when it was introduced), so the kind of solution proposed by @valyala and @aclements should be quite possible. Can anyone confirm whether there has been any work in this direction?

@joshlf

This comment has been minimized.

joshlf commented Jul 10, 2018

I'm pretty sure this is quite hard to prove in cases like the one you describe (while limiting compile time), so you would have to resort to something like reference counting (not that I'm opposed to reference counting in general).

Algorithmically, the idea is to do something roughly analogous to Rust's ownership tracking. Essentially, you start off with the assumption that all objects have a single unique owner - and thus, that they can be deallocated when they go out of scope - and then you work forwards from the allocation point to see if it's ever the case that the same pointer gets sent to two or more different places (sent across two channels, sent across a channel and stored in a data structure, etc), in which case the unique ownership property is broken. If the property is never broken, then you can free the object when it goes out of scope.

I haven't actually prototyped this, but my guess is that that algorithm would be pretty reasonable in terms of execution time. Total speculation, though.

@FlorianUekermann

This comment has been minimized.

Contributor

FlorianUekermann commented Jul 10, 2018

I think we're going off topic here and should probably continue on the mailing list. Let me just point out why:

Algorithmically, the idea is to do something roughly analogous to Rust's ownership tracking.

Not really. I wouldn't take Rust as and indication that this is possible in Go. Rust puts the burden of deciding ownership on the developer, not the compiler, by having different kinds of references and strict rules on how you can use them. Their compiler doesn't have to do analysis like this.

and then you work forwards from the allocation point to see if it's ever the case that the same pointer gets sent to two...

You mean, backwards from the receiving end I guess, since you want to know there if you are the sole owner of everything that comes in or if it could have escaped. This is both complex and ends up in the "whole program analysis" category. Especially the latter doesn't play well with Go's compile time goals and seems quite different than the more localized escape analysis discussed in this issue.

Don't get me wrong. I would love to see something like this too, even if it only works in very specific cases. I just don't think that it is reasonable to expect that to be accomplished on the same timescale as the original issue.

@joshlf

This comment has been minimized.

joshlf commented Jul 10, 2018

I think we're going off topic here and should probably continue on the mailing list.

Agreed. Suffice it to say that I think there's promise here, but most of my thinking on this was done five months ago, so I don't remember why I arrive at that conclusion. If folks are interested, I'd be happy to discuss this in more detail somewhere, but I certainly don't have time to make a prototype now, so I'm not going to drive this myself. Sorry to hijack the thread :)

@gopherbot

This comment has been minimized.

gopherbot commented Dec 4, 2018

Change https://golang.org/cl/152478 mentions this issue: cmd/compile: use Node.Name.Defn in optimizations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment