Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/debug: SetMaxStack doesn't work for sizes >= 4 GB #41228

Closed
vjorlikowski opened this issue Sep 4, 2020 · 10 comments
Closed

runtime/debug: SetMaxStack doesn't work for sizes >= 4 GB #41228

vjorlikowski opened this issue Sep 4, 2020 · 10 comments
Labels
Milestone

Comments

@vjorlikowski
Copy link

@vjorlikowski vjorlikowski commented Sep 4, 2020

While experimenting with larger values of k for the "Man or Boy" test (http://rosettacode.org/wiki/Man_or_boy_test#Go), I hit the default maximum stack size limit.

"No problem", say I. "I'll import runtime/debug and increase the stack size."
This works great for k=24, but breaks with k=25...and then, I try to increase the stack size to 4 gigabytes.

Which breaks.

At one byte shy of 4GB for the stack, I get the following error:

vjo@glaucus/pts/9:go/man_or_boy % time ./test     
runtime: goroutine stack exceeds 4294967295-byte limit
runtime: sp=0xc080160368 stack=[0xc080160000, 0xc100160000]
fatal error: stack overflow

runtime stack:
runtime.throw(0x4be4f3, 0xe)
        /home/vjo/local_godev/go/src/runtime/panic.go:1116 +0x72
runtime.newstack()
        /home/vjo/local_godev/go/src/runtime/stack.go:1060 +0x78d
runtime.morestack()
        /home/vjo/local_godev/go/src/runtime/asm_amd64.s:449 +0x8f

At a full 4GB stack, I instead see:

vjo@glaucus/pts/9:go/man_or_boy % time ./test     
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0xbf8003e380 pc=0x464281]

runtime stack:
runtime.throw(0x4c3674, 0x2a)
        /home/vjo/local_godev/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /home/vjo/local_godev/go/src/runtime/signal_unix.go:704 +0x4ac
runtime.memmove(0xbf8003e370, 0xc080160370, 0x7ffffc90)
        /home/vjo/local_godev/go/src/runtime/memmove_amd64.s:422 +0x501
runtime.copystack(0xc000000180, 0x100000000)
        /home/vjo/local_godev/go/src/runtime/stack.go:884 +0x17e
runtime.newstack()
        /home/vjo/local_godev/go/src/runtime/stack.go:1069 +0x28d
runtime.morestack()
        /home/vjo/local_godev/go/src/runtime/asm_amd64.s:449 +0x8f

The referenced line is here:

VMOVNTDQ Y0, (DI)

I would test without using the AVX instructions for memmove - except, there's no way to force useAVXmemmove to be false.

The version of go I'm using is:

vjo@glaucus/pts/9:go/man_or_boy % go version
go version go1.15.1 linux/amd64
@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 4, 2020

You can force AVX memmove off by setting the GODEBUG environment variable to cpu.avx=off.

There are several places in the runtime that assume stacks are <4GB. SetMaxStack is really for decreasing the size, not increasing it. Probably the issue in this particular case is that stackalloc only takes a 32-bit size:

func stackalloc(n uint32) stack

That said, we should error if the requested size is infeasible, not just randomly crash later.

@randall77 randall77 changed the title AVX memmove fails with stack sizes >= 4 GB runtime/debug: SetMaxStack doesn't work for sizes >= 4 GB Sep 4, 2020
@vjorlikowski
Copy link
Author

@vjorlikowski vjorlikowski commented Sep 4, 2020

@randall77 Agreed; I had just figured out the very things you listed to try in your reply via my own testing, and realized it wasn't related to AVX at all.

Fair enough, re: assumptions of a <4GB stack, and agreed that raising an error (rather than crashing) is the correct way forward (at least until the <4GB assumptions are re-visited).

Thanks!

@randall77 randall77 added this to the Go1.16 milestone Sep 4, 2020
@andybons andybons added the NeedsFix label Sep 8, 2020
@tpaschalis
Copy link
Contributor

@tpaschalis tpaschalis commented Sep 9, 2020

I'd agree with the proposed solution of returning an error from SetMaxStack; I'll take a stab and open a CL if there are no objections.

Also, I see the implicit assumptions of the 4GB limit in the code, do we think that we should mention it somewhere alongside the initial 1GB max size?

@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 9, 2020

Yes, we should mention the limit in the SetMaxStack docs.

Fixing this will be a bit tricky because SetMaxStack can't report an error. What happens when we call it with a number that's too big? We can't return an error. We'd have to either panic or just not change the stack size. Or change it to the biggest feasible value.

Speaking of which, the biggest feasible value is probably 2GB.

@tpaschalis
Copy link
Contributor

@tpaschalis tpaschalis commented Sep 10, 2020

Just so that I'm not missing something, we can't change SetMaxStack's signature due to the compatibility promise, right?

Of the three options (panic, no-op, or change it to a pre-defined max), I'm not sure which is preferred approach, and y'all are certainly more experienced to gauge that.

For my 2c, I'd personally give a little edge to the panic option. Given it's in the debug package, I suppose most of the time end-users will use it interactively, well for debugging, so it might be worth to let them know upfront. In the other cases, they'd have to call SetMaxStack at least twice to see that the value did not change, and then refer to the docs to see why it didn't change.

@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 10, 2020

Just so that I'm not missing something, we can't change SetMaxStack's signature due to the compatibility promise, right?

Yes.

The current behavior is to always accept any value you give it, and then randomly crash if we try to allocate a stack that is too big. Because that's the case, the change of least surprise would be to accept any value we give it, but give a stack overflow error if we try to allocate a stack that is too big. So we run the same set of programs; we just give a better crash message when we crash. So I think the pre-defined max is the way to go. It should be ok to do SetMaxStack(1<<40) as long as you don't actually allocate a stack >2GB.

So the change could be as simple as in stack.go, line 1053:

if newsize > maxstacksize || newsize > 1<<31

(And maybe that latter bound needs to be different for 32 and 64 bit.)

@tpaschalis
Copy link
Contributor

@tpaschalis tpaschalis commented Sep 11, 2020

I see your point, and to be honest, it makes a lot more sense. I'd be more than happy to open a CL if you think this is the way to go.

In regards to this 'new' pre-defined maximum, do you think that a value of twice the initial maxstacksize is a reasonable? (i.e. 500MB for 32-bit, 2GB for 64-bit)

@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 11, 2020

I see your point, and to be honest, it makes a lot more sense. I'd be more than happy to open a CL if you think this is the way to go.

Go for it.

In regards to this 'new' pre-defined maximum, do you think that a value of twice the initial maxstacksize is a reasonable? (i.e. 500MB for 32-bit, 2GB for 64-bit)

Yes, that sounds fine.

I don't want to mention the hard limits in the docs, just something like "there may be a system-imposed maximum stack limit regardless of the value provided to SetMaxStack".

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 18, 2020

Change https://golang.org/cl/255997 mentions this issue: runtime: improve error messages after allocating a stack that is too big

@gopherbot gopherbot closed this in 331614c Sep 20, 2020
@tpaschalis
Copy link
Contributor

@tpaschalis tpaschalis commented Sep 20, 2020

@randall77 thanks a lot for the guidance and the confidence boost, I'm looking forward to picking up more issues!

Also, thanks @vjorlikowski for the report and @rasky for submitting the change!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.