cmd/compile: automatically stack-allocate small non-escaping slices of dynamic size #27625

Open
rasky opened this Issue Sep 11, 2018 · 6 comments

@rasky
Member

rasky commented Sep 11, 2018

This commit:
95a11c7

shows a real-world performance gain triggered by moving a small non-escaping slice to the stack. It is my understanding that the Go compiler always allocated the slice on the heap because the length was not known at compile time.

Would it make sense to attempt a similar code transformation for many/all non-escaping slices? What would be the cons? Any suggestions on how to identify which slices could benefit from this transformation and which would possibly just create overhead?

@rasky rasky added the Performance label Sep 11, 2018

@randall77

Contributor

randall77 commented Sep 11, 2018

I think it's almost always a win to start on the stack, if we can.

This example is tricky, and probably very common:

var b []byte
for ... {
    b = append(b, ...)
}

How do we preallocate some space for b? We'd want to do:

var bStore [32]byte  // on stack
b := bStore[:0]

But that isn't right if the for loop runs for 0 iterations. The result must be nil (and have 0 capacity).
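
To make the nil problem concrete, here is a minimal sketch comparing the two starting points when the loop body never runs. The function names `appendFromNil` and `appendFromStore` are made up for illustration, and the slices are inspected inside the functions rather than returned so that they do not escape.

```go
package main

import "fmt"

// appendFromNil mirrors the loop from the comment: b starts as a nil slice.
func appendFromNil(n int) (isNil bool, capacity int) {
	var b []byte
	for i := 0; i < n; i++ {
		b = append(b, byte(i))
	}
	return b == nil, cap(b)
}

// appendFromStore is the hand-written preallocating variant: b starts as an
// empty but non-nil slice backed by a fixed-size array.
func appendFromStore(n int) (isNil bool, capacity int) {
	var bStore [32]byte
	b := bStore[:0]
	for i := 0; i < n; i++ {
		b = append(b, byte(i))
	}
	return b == nil, cap(b)
}

func main() {
	nilIsNil, nilCap := appendFromNil(0)
	storeIsNil, storeCap := appendFromStore(0)
	fmt.Println(nilIsNil, nilCap)     // true 0
	fmt.Println(storeIsNil, storeCap) // false 32
}
```

With zero iterations the two variants are observably different (`b == nil` and `cap(b)` both change), which is why the compiler could not apply the rewrite blindly.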

@mvdan

Member

mvdan commented Sep 11, 2018

I wonder if always applying this transformation, even if it never hurt performance, would make binaries noticeably bigger.

Would this be done for all allocations of small non-escaping slices? Or only for those where the capacity is known at compile time to be small?

@randall77

Contributor

randall77 commented Sep 11, 2018

We already allocate small non-escaping slices on the stack if their capacity is known at compile time.
This issue would be about when the size is not known at compile time.
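
One way to observe the difference between the two cases is to count allocations with `testing.AllocsPerRun`. The sketch below is only a measurement harness: the helper names `sumConst` and `sumDynamic` and the package-level `size` and `sink` variables are assumptions for illustration. The numbers it prints depend on the Go version in use, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

var (
	size = 32 // read at run time, so the compiler cannot treat it as a constant
	sink int  // keeps results live so the calls are not optimized away
)

// sumConst uses a slice whose size is a compile-time constant.
func sumConst() int {
	buf := make([]byte, 32)
	for i := range buf {
		buf[i] = byte(i)
	}
	s := 0
	for _, v := range buf {
		s += int(v)
	}
	return s
}

// sumDynamic uses a slice whose size is only known at run time.
func sumDynamic(n int) int {
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = byte(i)
	}
	s := 0
	for _, v := range buf {
		s += int(v)
	}
	return s
}

func main() {
	constAllocs := testing.AllocsPerRun(1000, func() { sink += sumConst() })
	dynAllocs := testing.AllocsPerRun(1000, func() { sink += sumDynamic(size) })
	fmt.Println("constant size, allocs per run:", constAllocs)
	fmt.Println("dynamic size, allocs per run: ", dynAllocs)
}
```

Building with `go build -gcflags=-m` also prints the compiler's escape analysis decisions for each `make` call.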

@ALTree

Member

ALTree commented Sep 11, 2018

Is this #20533?

@randall77

Contributor

randall77 commented Sep 11, 2018

It's definitely similar. #20533 is going more down the road of really allocating n bytes when you do a := make([]byte, n) (alloca-style, or on the heap with explicit free). This one is about allocating a constant size buffer and only using it when n is small enough.

@rasky

Member

rasky commented Sep 11, 2018

Transforming a := make([]byte, n) into:

var a []byte
if n < 64 {
    a = make([]byte, n, 64)   // stack allocation
} else {
    a = make([]byte, n)       // heap allocation
}

can surely have some code size impact. In some cases, maybe the prove pass is able to remove one of the two branches, but I'm not holding my breath on that. I wonder if it's still worth it, performance-wise.

We should also explore doing this for slices of different types (while keeping total stack allocation within a certain limit).
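
For reference, the same idea can be written by hand in today's Go with an explicit backing array. This is a sketch of the manual pattern, not of what the compiler would emit: `fillAndSum` and `smallCap` are hypothetical names, and the check uses `n <= smallCap` instead of the `n < 64` in the comment, since a slice of length 64 still fits in the array.

```go
package main

import "fmt"

// smallCap is the fixed buffer size; 64 matches the threshold in the comment.
const smallCap = 64

// fillAndSum builds a byte slice of length n, fills it, and returns the sum of
// its elements along with the slice's capacity. The slice never leaves the
// function, so the small case can be served by the local array.
func fillAndSum(n int) (sum, capacity int) {
	var store [smallCap]byte // fixed-size backing array
	var a []byte
	if n <= smallCap {
		a = store[:n] // small case: reuse the fixed-size array
	} else {
		a = make([]byte, n) // large case: fall back to a regular allocation
	}
	for i := range a {
		a[i] = byte(i)
	}
	for _, v := range a {
		sum += int(v)
	}
	return sum, cap(a)
}

func main() {
	fmt.Println(fillAndSum(4))   // 6 64
	fmt.Println(fillAndSum(100)) // 4950 100
}
```

The cost is visible here too: the array reserves `smallCap` bytes of stack on every call, including calls that take the heap path, and both branches are present in the generated code.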

@rasky rasky changed the title from cmd/compile: automatically stack-allocate small non-escaping slices to cmd/compile: automatically stack-allocate small non-escaping slices of dynamic size Sep 11, 2018