Reduce default stack size to 512KB. #10019

juj · 2019-12-12T15:58:29Z

Reduce default main thread and pthread stack sizes to 512KB. This does not need to be as big as in native code, because control flow is not part of this stack.

This is something I have been meaning to propose for ages. E.g. in Visual Studio native ARM, x86 and x64 builds, the default stack size is 1MB.

WebAssembly/JS is very special compared to native stack sizes in that neither execution control flow nor regular local variables count toward this size: control flow is guarded inside the browser, and regular locals are handled by local variables in JS code or in wasm. Only variables that have their addresses taken, or large structs or arrays, use this limit.

Because of that, I have never seen applications need much of a stack at all. The largest usage I ever see comes from applications that put temporaries like char str[4096]; on their stack for string manipulation and similar. Even a 512KB stack seems much larger than is typically needed, and on the safe side. Something like 64KB is probably on the order of what applications commonly use.

And we have stack size checks in place, so when applications really need more, they should be able to catch any errors.

kripken

In general I agree with this, good idea. However, the fact that 2 tests in the test suite fail on it and required fixing makes me worried it will affect users. Curious to hear others' thoughts on the risk level here.

This should be mentioned in the Changelog.

bvibber · 2019-12-12T21:35:32Z

fwiw I didn't have any trouble with ogv.js with my modules built with this change (TOTAL_STACK and DEFAULT_PTHREAD_STACK_SIZE set to 512 KiB) or even with it cranked down to 64 KiB, but things fail weirdly if I turn it too far down like to 4 KiB.

With my modules built at -O3 I get no stack overflow warning or exception; I either get assertion failures in my code itself or it just hangs mysteriously. Folks making optimized builds probably will have a hard time debugging failures if they do happen without turning assertions back on.

kripken · 2019-12-12T22:12:38Z

I agree, and in addition, I noticed that on the wasm backend assertions don't turn on full stack checks currently. I'm working to fix that now. With that at least with assertions the errors should be very clear.

bvibber · 2019-12-12T22:19:58Z

(Crazy thought -- is it possible to move the stack to before the static data segment, so attempts to read or write beyond the end of the stack trigger a trap on out of range memory access wrapping around from 0?)

sbc100 · 2019-12-12T22:23:34Z

> (Crazy thought -- is it possible to move the stack to before the static data segment, so attempts to read or write beyond the end of the stack trigger a trap on out of range memory access wrapping around from 0?)

Funny you should say that. We have a lld option to enable just that and I got a request at the WebAssembly meetup last night to make it the default.

I think I might make it the default even if emscripten chooses to stay with stack second.

IIRC the rationale is that it can actually make the binary smaller, since the LEB encoding of the addresses of your static data can often fit in a byte or two rather than three. Strange optimization, I admit.

kripken · 2019-12-12T22:26:09Z

In emscripten perhaps we'd use that option when not optimizing for size?

In normal mode we call a JS import, but we can't import from JS in standalone mode. Instead, just trap in that case with an unreachable. (The error reporting is not as good in this case, but at least it catches all errors and halts, and the emitted wasm is valid for standalone mode.) Helps emscripten-core/emscripten#10019

juj · 2019-12-13T08:38:06Z

> This should be mentioned in the Changelog.

Added Changelog entry.

> With my modules built at -O3 I get no stack overflow warning or exception; I either get assertion failures in my code itself or it just hangs mysteriously. Folks making optimized builds probably will have a hard time debugging failures if they do happen without turning assertions back on.

> I agree, and in addition, I noticed that on the wasm backend assertions don't turn on full stack checks currently. I'm working to fix that now. With that at least with assertions the errors should be very clear.

Yeah, in optimized builds without the stack guards in place, things will certainly crash randomly. I don't think that is an issue, since developers have a habit of doing a debug build to troubleshoot random errors. Once Wasm backend has the check in place, perhaps this could land?

kripken · 2019-12-13T17:16:11Z

@sbc100 what's the default stack size in clang for wasm?

sbc100 · 2019-12-13T17:32:52Z

looks like we currently default to just a single wasm page (64k): https://github.com/llvm/llvm-project/blob/a2cd4600ec6710f3218f071128e2a81edd23a2b2/lld/wasm/Driver.cpp#L357

kripken · 2019-12-13T18:22:35Z

Have there been complaints? Maybe we should have the same default in both places. But 64k does sound low on the other hand.

sbc100 · 2019-12-13T18:50:22Z

I think most other wasm-ld users (i.e. wasi-sdk) are still only building fairly small projects, but no, no complaints yet.

We did get requests to make stack-first the default though, so that stack overflow would trap. So that might be a sign that people are hitting it.

juj · 2019-12-13T20:06:16Z

> I noticed that on the wasm backend assertions don't turn on full stack checks currently.

Btw, the current -s STACK_OVERFLOW_CHECK=1 check is an after-the-fact check and not a "check on each stack bump" test. Are the wasm backend stack checks done at the beginning of each function, when bumping the stack for the function's local stack frame? (Or are they an after-the-fact check?)

bvibber · 2019-12-13T20:16:21Z

The cookie check may also fail to register some overflows; if a function allocates a too-large buffer on the stack and then writes into only part of it, it may not overwrite the cookie value even though it overwrites part of static data.

(This seems to be what is happening with the module I found to fail with a 4 KiB stack, as it reads data into a char[8192] buffer.)
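The failure mode described above can be illustrated with a toy model in C. This is only a sketch under stated assumptions: the memory sizes, the cookie value, and all function names are hypothetical, and the byte array merely imitates a "stack second" linear memory layout; it is not Emscripten's actual implementation.

```c
#include <stdint.h>
#include <string.h>

/* Toy linear memory laid out "stack second": [static data][stack].
   The stack grows downward from MEM_SIZE toward STACK_END.
   All sizes and the cookie value are illustrative assumptions. */
enum { STATIC_SIZE = 8192, STACK_SIZE = 4096, MEM_SIZE = STATIC_SIZE + STACK_SIZE };
enum { STACK_END = STATIC_SIZE };            /* lowest valid stack address */
static const uint32_t COOKIE = 0x02135467u;  /* hypothetical guard value */
static uint8_t mem[MEM_SIZE];

/* Fill static data with a known pattern and place the cookie at the stack end. */
static void reset_memory(void) {
  memset(mem, 0xAA, STATIC_SIZE);
  memset(mem + STACK_END, 0, STACK_SIZE);
  memcpy(mem + STACK_END, &COOKIE, sizeof COOKIE);
}

/* After-the-fact check: is the cookie still in place? */
static int cookie_intact(void) {
  uint32_t v;
  memcpy(&v, mem + STACK_END, sizeof v);
  return v == COOKIE;
}

/* Ground truth the cookie check is supposed to protect. */
static int static_data_intact(void) {
  for (int i = 0; i < STATIC_SIZE; i++)
    if (mem[i] != 0xAA) return 0;
  return 1;
}

/* Model a function that reserves frame_size bytes with an unchecked bump and
   then writes only the first `written` bytes of its buffer.
   Requires written <= frame_size <= MEM_SIZE. */
static void run_frame(uint32_t frame_size, uint32_t written) {
  uint32_t sp = MEM_SIZE - frame_size;  /* may land inside static data */
  memset(mem + sp, 0x55, written);      /* buffer starts at the new sp */
}
```

Calling reset_memory() and then run_frame(8192, 1000) in this model corrupts static data while leaving the cookie untouched, because the partial write ends below the cookie's address; run_frame(8192, 8192) overwrites the cookie and would be detected.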

juj · 2019-12-14T19:26:07Z

I would love to reduce the default stack all the way to 64KB here as well..

kripken · 2019-12-16T21:24:06Z

I think maybe we can do this change, or even to 64K, after (1) the upstream backend changes to put the stack first, at least when not optimizing for size, so there are no surprise errors, and (2) we remove fastcomp entirely - as otherwise we can't reduce the default stack size there without risk, as we don't have the stack first there.

And in an assertions build we should check for memory traps and add extra logging if they seem like they could be stack overflows - that is, we can look at the position of the stack after such a trap, and if the pointer is in a suspicious place, warn.

I think with all those we should be fairly safe?

dschuff · 2019-12-20T17:26:29Z

@sbc100's comment above indicated that stack-first actually makes the binary smaller. So why wouldn't we just use the same layout all the time?

kripken · 2019-12-20T18:04:55Z

@dschuff I think it's more compact with the stack second. When it's first,

[..stack..] FIRST_GLOBAL

then the address of FIRST_GLOBAL might be 512K, which takes more than one byte to LEB encode. But when the stack is second,

FIRST_GLOBAL [..stack..]

then FIRST_GLOBAL might have address 8.

The key thing is that stack addresses are not hardcoded throughout the binary, but static global addresses are.
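To make the size argument concrete, here is a small C sketch that counts the bytes needed to encode an address as unsigned LEB128. The function name is hypothetical. Note as a simplification that i32.const immediates use signed LEB128, where the one-byte range is smaller, but the comparison between address 8 and address 512K comes out the same.

```c
#include <stdint.h>

/* Number of bytes needed to encode `value` as unsigned LEB128.
   Each encoded byte carries 7 payload bits. */
static unsigned uleb128_size(uint32_t value) {
  unsigned n = 1;
  while (value >= 0x80) {
    value >>= 7;
    n++;
  }
  return n;
}
```

With this helper, uleb128_size(8) is 1 and uleb128_size(512 * 1024) is 3, so every hardcoded reference to a global placed after a 512KB stack costs two extra bytes compared to a global at a low address.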

juj · 2020-02-24T12:18:45Z

> (1) the upstream backend changes to put the stack first, at least when not optimizing for size, so there are no surprise errors, and (2) we remove fastcomp entirely - as otherwise we can't reduce the default stack size there without risk, as we don't have the stack first there.

Having the stack second seems better to me as well: a lot of global references can occur in builds, so having them at low addresses is good.

Also, I don't think the location of the stack is a good measure of safety, since there are still pthreads to consider. The best would be to have good stack checking codegen emitted for stack bumps, i.e. when the stack is pushed to set up space for function local variables, the bump size is verified. That would guard against out-of-stack-space errors, and together with a cookie check (perhaps changed to a configurable-size guard area check?) it should help catch all types of stack errors.
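A check on each stack bump could look roughly like the following C sketch. The ShadowStack struct and checked_bump function are hypothetical names for illustration, not existing Emscripten or LLVM codegen; a real implementation would presumably trap or abort rather than return a status.

```c
#include <stdint.h>

/* Hypothetical model of a downward-growing data stack.
   Invariant: sp >= stack_end. */
typedef struct {
  uint32_t sp;         /* current stack pointer */
  uint32_t stack_end;  /* lowest address the stack may use */
} ShadowStack;

/* Reserve frame_size bytes for a function's stack frame.
   Returns 1 and moves sp on success; returns 0 and leaves sp
   unchanged if the frame would not fit. */
static int checked_bump(ShadowStack *s, uint32_t frame_size) {
  if (frame_size > s->sp - s->stack_end) {
    return 0;  /* would overflow: caught before any write happens */
  }
  s->sp -= frame_size;
  return 1;
}
```

Because the check compares the requested frame size against the remaining space before the pointer moves, it does not depend on what the function later writes, and it works the same way for a pthread stack as long as that thread's stack_end is known.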

juj · 2020-05-06T12:57:00Z

Closing this to not leave unlanded PRs lingering around.

xbcnn · 2022-02-28T10:25:23Z

@juj The default TOTAL_STACK is still 5MB in settings.js. Curious why the change was not merged?

sbc100 · 2022-02-28T16:37:53Z

Here is a more recent (and still open) PR attempting to reduce TOTAL_STACK, but currently stalled: #14177

juj · 2022-03-01T10:52:29Z

Progress indeed got stalled. Commented on 14177 above.

xbcnn · 2022-03-02T03:29:49Z

Hi @juj, I have some questions; could you shed some light on them?

> WebAssembly/JS is very special compared to native stack sizes in that neither execution control flow nor regular local variables count toward this size, but control flow is guarded inside browser

By execution control flow, do you mean the stack frames? I know that in a native x86 debugging environment there are EBP/ESP to follow when tracing all frames. In WebAssembly, how are call frames built, and is there any way to see that low-level data?

> and regular locals are handled by local variables in JS code or in wasm.

I have always wondered where the passed arguments are put, as I find nothing in the STACK area in the linear memory of wasm. (I use the C/C++ DevTools Support extension for debugging in the browser, and view memory with Memory Inspector.) So there is another stack maintained internally by the wasm engine in the browser, is this correct?

> Only variables that have their addresses taken, or large structs or arrays utilize this limit.

I understand address-taken variables. But as for large structs or arrays, how large is large? Is there any spec/docs about this, or any examples?

Thanks very much.

xbcnn · 2022-03-02T03:42:50Z

I'd like to know more details about how function calls are implemented in WebAssembly, such as how arguments are passed, how execution control is transferred, how local variables are allocated and freed, and how results are returned. Is there a way to investigate or debug this? @juj @sbc100, any guidance/information is welcome.

juj · 2022-03-02T08:05:44Z

> I know in native x86 debugging env, there are EBP/ESP to follow up tracing all frames. In WebAssembly, how calling frames are built, and any way we can see those lower data?

That's right, in Wasm there are two stacks. There is the internal control flow stack managed by the Wasm VM, which is completely hidden from Wasm user code: there is no way to see or manipulate any data in that stack, for browser security reasons. That stack contains the function call frames (the information that EBP/ESP track on x86).

This stack that is Emscripten/LLVM established is only for "fat local data". If you have a general int foo as a local in a function, it will not be part of this data stack, but it will become a local in Wasm code. If you have an array char str[256]; or if you take an address of that int foo, then that data will be placed in the data stack.

> I have always wondered where the passed arguments are put

The same story with passed arguments: they are part of the hidden Wasm VM stack, and not the LLVM data stack.

> But as for large structs or arrays, how large is it? Is there any spec/docs about this?

Anything that is accessed via addresses or array indexing will need to be part of the stack. The Wasm locals cannot be array indexed. If a local variable does not have its address taken and is not indexed via an array, it will become a Wasm code local var. If the variable's address is taken, or it is indexed via an array subscript, it must go on the LLVM data stack.
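As a rough illustration of that rule, here is a short C sketch with three hypothetical functions. Which storage a compiler actually picks depends on optimization level (for example, inlining can remove an address-taken local from the data stack again), so the comments describe the typical case under that assumption rather than a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* `total` and `i` never have their address taken and are not indexed,
   so they can live in Wasm locals; no data stack space is needed. */
static int sum_to(int n) {
  int total = 0;
  for (int i = 1; i <= n; i++) total += i;
  return total;
}

/* Helper that receives a pointer, forcing its caller to take an address. */
static void add_one(int *p) { *p += 1; }

/* `value` has its address taken, so (absent inlining) it needs a slot
   in linear memory, i.e. on the LLVM data stack. */
static int bump_via_pointer(int v) {
  int value = v;
  add_one(&value);
  return value;
}

/* `str` is indexed via an array subscript, so its 256 bytes are reserved
   on the data stack for the duration of the call. */
static size_t fill_and_count(char c) {
  char str[256];
  for (size_t i = 0; i < sizeof str - 1; i++) str[i] = c;
  str[sizeof str - 1] = '\0';
  return strlen(str);
}
```

In this sketch only bump_via_pointer and fill_and_count would consume data stack space, roughly 4 and 256 bytes plus alignment, which is why even deep call chains of functions like sum_to do not count against the TOTAL_STACK limit.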

> how function calls are implemented in WebAssembly, such as argument passing, execution control transfer, local variable alloc/free, and returning results

All of this information is internally implemented in a Wasm VM, and not exposed to user code to observe. The way it is implemented varies per each browser, so you may want to dig in to the browser source code to figure that out.

Emscripten/LLVM data stack position is tracked using a __stack_pointer global variable in the wasm module.

xbcnn · 2022-03-02T08:40:11Z

Thanks for the detailed replies, @juj. I got the following good points, which answer part of my confusion.

> in Wasm, there are two stacks.
> Anything that is accessed via addresses or array indexing will need to be part of the stack.

For Wasm function calling model, I think I have to dive into browser source (probably v8/src/wasm) to find out.

juj force-pushed the more_reasonable_default_stack_size branch from bf7aa73 to 9e109ab Compare December 12, 2019 16:51

kripken reviewed Dec 12, 2019

kripken mentioned this pull request Dec 12, 2019

Support stack overflow checks in standalone mode WebAssembly/binaryen#2525

Merged

juj added 3 commits December 14, 2019 09:53

Reduce default main thread and pthread stack size to 512KB. This does…

8500197

… not need to be as big as in native code, because control flow and regular local variables are not part of this stack.

Update tests to 512KB stack size

4103a16

Add ChangeLog entry

b013525

juj force-pushed the more_reasonable_default_stack_size branch from 37bd6df to b013525 Compare December 14, 2019 07:53

juj changed the base branch from incoming to master January 14, 2020 18:57

juj closed this May 6, 2020
