New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: stack reuse for aggregate data #62077
Comments
Yeah, currently the compiler doesn't reuse stack slots for named variables and some temporaries. It might be a little trickier for data containing pointers, due to GC metadata. But (as you mentioned) for scalars, it should be fine.
I don't think whether it is aggregate matters. We could reuse stack slots for a local variable of type Do you have a sense how much it helps in real code? It will reduce some stack sizes, but I'm not sure how much it matters in practice. |
uber-go/zap#1310 implemented a fix for this that reduced stack usage for zap.Any from 4856 to 192 bytes. Due to stack goroutine stack growth/copying this was enough to notice a visible impact on profiles (~5% of total CPU in some workloads) |
yarpc/yarpc-go#2220 also hit a similar issue:
A large refactoring was initially proposed to avoid this, but in the end the fix was to reduce the size of |
See discussion in https://github.com/uber-go/zap/pull/1310/files#r1280107231. golang/go#62077 now exists, so we can link it.
…1387) See discussion in https://github.com/uber-go/zap/pull/1310/files#r1280107231. golang/go#62077 now exists, so we can link it.
We propose adding support for aggregate data in the stack reuse to the Go compiler. The Go 1.20 compiler cannot handle the local array effectively in the stack allocation. Consider an example as follows.
The corresponding disassembly is as below.
In the example provided, it is evident that a distinct temporary stack variable is allocated to store the returned Field from the callees, ensuring that they do not share the same stack memory. However, this approach is suboptimal due to the live ranges of these variables are not overlapped. Each temporary occupies its own exclusive space, leading to the necessity of sharing the stack slot. To achieve optimal code, the allocation should only cater to the maximum size among all possible temporaries across all switch cases, rather than summing up their individual sizes.
Multiple compilers for C/C++ and other programming languages have the capability to reuse the stack for scalar variables and/or arrays. For instance, LLVM's stack coloring pass allows for the reuse of the stack memory for local arrays.
In summary, we propose adding stack reuse for aggregate data in the go compiler. Inputs are welcome.
The text was updated successfully, but these errors were encountered: