cmd/compile: huge waste stack when make function calls #38588

Closed
zylthinking opened this issue Apr 22, 2020 · 3 comments
@zylthinking zylthinking commented Apr 22, 2020

When a function makes a call, SP is decreased by at least 0x60 bytes even if the function itself makes no use of the stack. The wasted stack space forces newstack to be called more frequently.

Contributor

@randall77 randall77 commented Apr 22, 2020

Please fill out the issue template. Tell us what you did, and how you came to your conclusions.
Among the things we need to take any action on this report - Go version, OS, processor architecture, and an example program.

@andybons changed the title from "cmd/compile huge waste stack when make function calls" to "cmd/compile: huge waste stack when make function calls" on Apr 22, 2020
Author

@zylthinking zylthinking commented Apr 23, 2020

func test() {
    fmt.Println(1)
}

Linux x64, Go 1.14.
With a function test like the one above, check the generated assembly and you will see:
SUBQ $0x60, SP
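
For anyone trying to reproduce this, a minimal complete program might look like the sketch below. The `main` function and the `//go:noinline` directive are additions that are not in the original report; the directive is there only so that `test` keeps its own frame instead of being inlined into `main`.

```go
package main

import "fmt"

// test is the function from the report. The noinline directive is an
// addition for this sketch so that test keeps its own stack frame.
//
//go:noinline
func test() {
	fmt.Println(1)
}

func main() {
	test()
}
```

After building, `go tool objdump -s 'main\.test' <binary>` should show the function prologue, including the `SUBQ` that reserves the frame. The exact constant may differ between Go versions.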

I don't know what the extra 80 bytes are for (96 bytes - 8 bytes for the argument - 8 bytes for the return address). Maybe rbp needs 8 bytes? Even so, at least 60+ bytes are wasted for nothing.

And 0x60 seems to be the smallest. I have seen 204 bytes being used in a function that has only 5 local variables, 8 bytes each.

Because of this, when the call chain gets deeper, the stack runs out quickly and newstack gets called to expand it, which is much slower because it has to allocate, copy memory, and update pointers. As a result, it took 50+% of the execution time of my code.

  assign.go:170         0xe00b20                64488b0c25f8ffffff      MOVQ FS:0xfffffff8, CX
  assign.go:170         0xe00b29                483b6110                CMPQ 0x10(CX), SP
  assign.go:170         0xe00b2d                0f8698000000            JBE 0xe00bcb
  assign.go:170         0xe00b33                4883ec60                SUBQ $0x60, SP
  assign.go:170         0xe00b37                48896c2458              MOVQ BP, 0x58(SP)
  assign.go:170         0xe00b3c                488d6c2458              LEAQ 0x58(SP), BP
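
The stack-growth cost described in this comment can be observed in isolation. The sketch below is not from the report: `descend` and `timeTwice` are hypothetical helpers, and the depth of 100000 is an arbitrary choice. It runs the same deep call chain twice on one fresh goroutine, so the first run has to grow the stack and the second run normally reuses the already-grown stack.

```go
package main

import (
	"fmt"
	"time"
)

// descend is a hypothetical helper that builds a call chain n frames deep.
// The noinline directive keeps every call as a real stack frame.
//
//go:noinline
func descend(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + descend(n-1)
}

// timeTwice runs descend twice on one new goroutine and times each run.
// A new goroutine starts with a small stack, so the first run pays for
// stack growth; the second run usually finds the stack already grown.
func timeTwice(depth int) (first, second time.Duration) {
	type result struct{ first, second time.Duration }
	done := make(chan result)
	go func() {
		start := time.Now()
		descend(depth)
		mid := time.Now()
		descend(depth)
		done <- result{mid.Sub(start), time.Since(mid)}
	}()
	r := <-done
	return r.first, r.second
}

func main() {
	first, second := timeTwice(100000)
	fmt.Println("first run (stack has to grow):", first)
	fmt.Println("second run (stack already grown):", second)
}
```

The timings vary by machine and Go version, so treat the difference between the two runs as a rough indication rather than a benchmark.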

@randall77 randall77 commented Apr 23, 2020

I see SUBQ $0x58, SP, for a total frame size of 96 bytes.

All that space makes sense to me.

  • 64 bytes for the outargs section of the frame, to call fmt.Fprintln (fmt.Println is inlined).
    • 16 for the io.Writer
    • 24 for the slice
    • 8 for the n return value
    • 16 for the error
  • 16 bytes for the backing store for the slice (a [1]interface{})
  • 8 bytes for the return address
  • 8 bytes for the frame pointer

We can't really make it smaller without going to a register-based calling convention (issue #18597). I'm going to close this as not actionable / working as intended.

@randall77 randall77 closed this Apr 23, 2020