cmd/compile: automatically stack-allocate small non-escaping slices of dynamic size #27625

Open
rasky opened this issue Sep 11, 2018 · 7 comments
@rasky
Member

@rasky rasky commented Sep 11, 2018

This commit:
95a11c7

shows a real-world performance gain from moving a small non-escaping slice to the stack. It is my understanding that the Go compiler always allocated the slice on the heap because its length was not known at compile time.

Would it make sense to attempt a similar code transformation for many or all non-escaping slices? What would be the downsides? Any suggestions on how to identify which slices would benefit from this transformation and which would just incur overhead?

@rasky rasky added the Performance label Sep 11, 2018
@randall77
Contributor

@randall77 randall77 commented Sep 11, 2018

I think it's almost always a win to start on the stack, if we can.

This example is tricky, and probably very common:

var b []byte
for ... {
    b = append(b, ...)
}

How do we preallocate some space for b? We'd want to do:

var bStore [32]byte  // on stack
b := bStore[:0]

But that isn't right if the for loop runs for 0 iterations. In that case the result must be nil (and have 0 capacity).
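A minimal sketch of the difference described above, with a concrete loop bound standing in for the elided loop condition. The names `nilStart`, `stackStart`, and `n` are made up for illustration, and whether `bStore` really stays on the stack is up to the compiler's escape analysis; the sketch only shows the observable nil and capacity difference.

```go
package main

import "fmt"

// nilStart mirrors the original loop: b stays nil if the loop body never runs.
func nilStart(n int) (isNil bool, capacity int) {
	var b []byte
	for i := 0; i < n; i++ {
		b = append(b, byte(i))
	}
	return b == nil, cap(b)
}

// stackStart is the hand-written rewrite that starts from a fixed backing
// array. With zero iterations b is non-nil and has capacity 32, which is
// the observable difference a compiler transformation would have to hide.
func stackStart(n int) (isNil bool, capacity int) {
	var bStore [32]byte // hypothetical preallocated buffer
	b := bStore[:0]
	for i := 0; i < n; i++ {
		b = append(b, byte(i))
	}
	return b == nil, cap(b)
}

func main() {
	fmt.Println(nilStart(0))   // prints "true 0"
	fmt.Println(stackStart(0)) // prints "false 32"
}
```

Any automatic version of this rewrite would need to preserve the first behavior while still using the preallocated buffer once the first append happens.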

@mvdan
Member

@mvdan mvdan commented Sep 11, 2018

I wonder if always applying this transformation, even if it never hurt performance, would make binaries noticeably bigger.

Would this be done for all allocations of small non-escaping slices? Or only for those where the capacity is known at compile time to be small?

@randall77
Contributor

@randall77 randall77 commented Sep 11, 2018

We already allocate small non-escaping slices on the stack if their capacity is known at compile time.
This issue would be about when the size is not known at compile time.
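One way to observe the distinction is to count allocations for a constant-size and a dynamic-size `make`. This is only a sketch: the function names, the 64-byte size, and the `size` and `sink` variables are assumptions for illustration, and the numbers printed depend on the Go version and on what escape analysis decides, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

// fixedSize makes a slice whose capacity is a compile-time constant.
func fixedSize() int {
	buf := make([]byte, 64)
	for i := range buf {
		buf[i] = byte(i)
	}
	total := 0
	for _, v := range buf {
		total += int(v)
	}
	return total
}

// dynamicSize makes a slice whose size is only known at run time.
func dynamicSize(n int) int {
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = byte(i)
	}
	total := 0
	for _, v := range buf {
		total += int(v)
	}
	return total
}

// size is a package-level variable so the compiler cannot treat it as a constant.
var size = 64

// sink keeps the results live so the calls are not optimized away.
var sink int

func main() {
	fixed := testing.AllocsPerRun(100, func() { sink += fixedSize() })
	dynamic := testing.AllocsPerRun(100, func() { sink += dynamicSize(size) })
	fmt.Printf("constant size: %v allocs/run\n", fixed)
	fmt.Printf("dynamic size:  %v allocs/run\n", dynamic)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which is another way to check where each slice ends up.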

@ALTree
Member

@ALTree ALTree commented Sep 11, 2018

Is this #20533?

@randall77
Contributor

@randall77 randall77 commented Sep 11, 2018

It's definitely similar. #20533 is going more down the road of really allocating n bytes when you do a := make([]byte, n) (alloca-style, or on the heap with an explicit free). This one is about allocating a constant-size buffer and only using it when n is small enough.

@rasky
Member Author

@rasky rasky commented Sep 11, 2018

Transforming a := make([]byte, n) into:

var a []byte
if n < 64 {
    a = make([]byte, n, 64)   // stack allocation
} else {
    a = make([]byte, n)       // heap allocation
}

can surely have some code size impact. In some cases the prove pass may be able to remove one of the two branches, but I'm not holding my breath on that. I wonder if it's still worth it, performance-wise.

We should also explore doing this for slices of different types (while keeping total stack allocation within a certain limit).
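For illustration, here is the transformation above written by hand as two functions that report only the length and capacity, so the slices do not escape. The names `direct`, `split`, and `smallCap` are hypothetical; the threshold 64 comes from the snippet above. Whether the small branch actually lands on the stack is decided by the compiler, so treat this as a sketch of the shape of the rewrite, not of its performance.

```go
package main

import "fmt"

// smallCap is the threshold used in the snippet above; the name is made up.
const smallCap = 64

// direct is the original form: the size is only known at run time.
func direct(n int) (length, capacity int) {
	a := make([]byte, n)
	return len(a), cap(a)
}

// split is the hand-written two-branch form.
func split(n int) (length, capacity int) {
	var a []byte
	if n < smallCap {
		a = make([]byte, n, smallCap) // constant capacity: candidate for the stack
	} else {
		a = make([]byte, n) // size unknown at compile time
	}
	return len(a), cap(a)
}

func main() {
	fmt.Println(direct(10))  // prints "10 10"
	fmt.Println(split(10))   // prints "10 64"
	fmt.Println(direct(100)) // prints "100 100"
	fmt.Println(split(100))  // prints "100 100"
}
```

Note that the two forms are not identical: in the small branch `cap(a)` is 64 instead of n, so an automatic transformation would have to either hide that difference or prove that the capacity is never observed.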

@rasky rasky changed the title cmd/compile: automatically stack-allocate small non-escaping slices cmd/compile: automatically stack-allocate small non-escaping slices of dynamic size Sep 11, 2018
@navytux
Contributor

@navytux navytux commented Apr 30, 2020

Transforming a := make([]byte, n) into ...

A recent example where this transformation was done by hand for performance: 17d5cef (CL 230657).

/cc @martisch
