Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce default stack size to 512KB. #10019

Closed

Conversation

juj
Copy link
Collaborator

@juj juj commented Dec 12, 2019

Reduce default main thread and pthread stack sizes to 512KB. This does not need to be as big as in native code, because control flow is not part of this stack.

This is something I have been meaning to propose for ages. E.g. In Visual Studio native ARM, x86 and x64 built code, default stack size is 1MB.

WebAssembly/JS is very special compared to native stack sizes in that neither execution control flow and regular local variables count into this size, but control flow is guarded inside browser, and regular locals are handled by local variables in JS code or in wasm. Only variables that have their addresses taken, or large structs or arrays utilize this limit.

Because of that specialty, I have never seen applications to need much of a stack at all. The largest usage I ever see come from applications that do something like temp char str[4096];s on their stack for string manipulation and similar. Even 512KB stack size here seems much larger than is typically needed, and on the safe side. Something like 64KB is probably of the order that applications commonly use.

And we have stack size checks in place, so when applications really need more, they should be able to catch any errors.

@juj juj force-pushed the more_reasonable_default_stack_size branch from bf7aa73 to 9e109ab Compare Dec 12, 2019
Copy link
Member

@kripken kripken left a comment

In general I agree with this, good idea. However, that we have 2 tests in the test suite that fail on it, and required fixing, makes me worried it will affect users. Curious to hear other's thoughts on the risk level here.

This should be mentioned in the Changelog.

@brion
Copy link
Collaborator

@brion brion commented Dec 12, 2019

fwiw I didn't have any trouble with ogv.js with my modules built with this change (TOTAL_STACK and DEFAULT_PTHREAD_STACK_SIZE set to 512 KiB) or even with it cranked down to 64 KiB, but things fail weirdly if I turn it too far down like to 4 KiB.

With my modules built at -O3 I get no stack overflow warning or exception; I either get assertion failures in my code itself or it just hangs mysteriously. Folks making optimized builds probably will have a hard time debugging failures if they do happen without turning assertions back on.

@kripken
Copy link
Member

@kripken kripken commented Dec 12, 2019

I agree, and in addition, I noticed that on the wasm backend assertions don't turn on full stack checks currently. I'm working to fix that now. With that at least with assertions the errors should be very clear.

@brion
Copy link
Collaborator

@brion brion commented Dec 12, 2019

(Crazy thought -- is it possible to move the stack to before the static data segment, so attempts to read or write beyond the end of the stack trigger a trap on out of range memory access wrapping around from 0?)

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Dec 12, 2019

(Crazy thought -- is it possible to move the stack to before the static data segment, so attempts to read or write beyond the end of the stack trigger a trap on out of range memory access wrapping around from 0?)

Funny you should say that. We have a lld option to enable just that and I got a request at the WebAssembly meetup last night to make it the default.

I'm think I might make it the default even if emscripten chooses to say with stack second.

IIRC the rational that is can actually make the binary smaller, since the LEB encoding of the addresses of you static data can often fix in a byte or two rather than three. Strange optimization I admit ..

@kripken
Copy link
Member

@kripken kripken commented Dec 12, 2019

In emscripten perhaps we'd use that option when not optimizing for size?

kripken added a commit to WebAssembly/binaryen that referenced this issue Dec 13, 2019
In normal mode we call a JS import, but we can't import from JS
in standalone mode. Instead, just trap in that case with an
unreachable. (The error reporting is not as good in this case, but
at least it catches all errors and halts, and the emitted wasm is
valid for standalone mode.)

Helps emscripten-core/emscripten#10019
@juj
Copy link
Collaborator Author

@juj juj commented Dec 13, 2019

This should be mentioned in the Changelog.

Added Changelog entry.

With my modules built at -O3 I get no stack overflow warning or exception; I either get assertion failures in my code itself or it just hangs mysteriously. Folks making optimized builds probably will have a hard time debugging failures if they do happen without turning assertions back on.

I agree, and in addition, I noticed that on the wasm backend assertions don't turn on full stack checks currently. I'm working to fix that now. With that at least with assertions the errors should be very clear.

Yeah, in optimized builds without the stack guards in place, things will certainly crash randomly. I don't think that is an issue, since developers have a habit of doing a debug build to troubleshoot random errors. Once Wasm backend has the check in place, perhaps this could land?

@kripken
Copy link
Member

@kripken kripken commented Dec 13, 2019

@sbc100 what's the default stack size in clang for wasm?

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Dec 13, 2019

@kripken
Copy link
Member

@kripken kripken commented Dec 13, 2019

Have there been complaints? Maybe we should have the same default in both places. But 64k does sound low on the other hand.

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Dec 13, 2019

I think most other wasm-ld users (i.e. wasi-sdk) are still only building fairly small projects, but no, no complaints yet.

We did get requests to make stack-first the default though, so that stack overflow would trap. So that might be sign that people are hitting it.

@juj
Copy link
Collaborator Author

@juj juj commented Dec 13, 2019

I noticed that on the wasm backend assertions don't turn on full stack checks currently.

Btw, the current -s STACK_OVERFLOW_CHECK=1 check is an after the fact check and not a "check on each stack bump" test. Are the wasm backend stack checks in the beginning of each function when doing a bump for local function stack frame? (or an after the fact check?)

@brion
Copy link
Collaborator

@brion brion commented Dec 13, 2019

The cookie check may also fail to register some overflows; if a function allocates a too-large buffer on the stack and then writes into only part of it, it may not overwrite the cookie value even though it overwrites part of static data.

(This seems to be what is happening with the module I found to fail with a 4 KiB stack, as it reads data into a char[8192] buffer.)

juj added 3 commits Dec 14, 2019
… not need to be as big as in native code,

because control flow and regular local variables are not part of this stack.
@juj juj force-pushed the more_reasonable_default_stack_size branch from 37bd6df to b013525 Compare Dec 14, 2019
@juj
Copy link
Collaborator Author

@juj juj commented Dec 14, 2019

I would love to reduce the default stack all the way to 64KB here as well..

@kripken
Copy link
Member

@kripken kripken commented Dec 16, 2019

I think maybe we can do this change, or even to 64K, after (1) the upstream backend changes to put the stack first, at least when not optimizing for size, so there are no surprise errors, and (2) we remove fastcomp entirely - as otherwise we can't reduce the default stack size there without risk, as we don't have the stack first there.

And in an assertions build we should check for memory traps and add extra logging if they seem like they could be stack overflows - that is, we can look at the position of the stack after such a trap, and if the pointer is in a suspicious place, warn.

I think with all those we should be fairly safe?

@dschuff
Copy link
Member

@dschuff dschuff commented Dec 20, 2019

@sbc100's comment above indicated that stack-first actually makes the binary smaller. So why wouldn't we just use the same layout all the time?

@kripken
Copy link
Member

@kripken kripken commented Dec 20, 2019

@dschuff I think it's more compact with the stack second. When it's first,

[..stack..] FIRST_GLOBAL

then the address of FIRST_GLOBAL might be 512K, which takes more than one byte to LEB encode. But when the stack is second,

FIRST_GLOBAL [..stack..] 

then FIRST_GLOBAL might have address 8.

The key thing is that stack addresses are not hardcoded throughout the binary, but static global addresses are.

@juj juj changed the base branch from incoming to master Jan 14, 2020
@juj
Copy link
Collaborator Author

@juj juj commented Feb 24, 2020

(1) the upstream backend changes to put the stack first, at least when not optimizing for size, so there are no surprise errors, and (2) we remove fastcomp entirely - as otherwise we can't reduce the default stack size there without risk, as we don't have the stack first there.

Having stack seconds seems better to me as well, a lot of global references can occur in builds, so having them in low addresses is good.

Also I don't think the location of the stack is a good measure of safety, since there are still pthreads to consider about. The best would be to have good stack checking codegen emitted for stack bumps, i.e. when stack is pushed to set up space for function local variables, the bump size is verified. That would guard against out of stack space errors, and along with a cookie check (perhaps change to a configurable sized guard area check?) that should help resolve all types of stack errors.

@juj
Copy link
Collaborator Author

@juj juj commented May 6, 2020

Closing this to not leave unlanded PRs lingering around.

@juj juj closed this May 6, 2020
@xbcnn
Copy link
Contributor

@xbcnn xbcnn commented Feb 28, 2022

@juj The default TOTAL_STACK is still 5MB in settings.js. Curious that why the change not merged?

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Feb 28, 2022

Here is a more recent (and still open) PR attempting to reduce TOTAL_SIZE, but currently stalled: #14177

@juj
Copy link
Collaborator Author

@juj juj commented Mar 1, 2022

Progress indeed got stalled. Commented on 14177 above.

@xbcnn
Copy link
Contributor

@xbcnn xbcnn commented Mar 2, 2022

Hi, @juj, I have some questions that could you shed some light on:

WebAssembly/JS is very special compared to native stack sizes in that neither execution control flow and regular local variables count into this size, but control flow is guarded inside browser

Here execution control flow, do you mean the stack frames? I know in native x86 debugging env, there are EBP/ESP to follow up tracing all frames. In WebAssembly, how calling frames are built, and any way we can see those lower data?

and regular locals are handled by local variables in JS code or in wasm.

I always wondering that where the passed arguments were put, as I find nothing in the STAK area in linear memory of wasm. (I use C/C++ DevTools Support extension for debugging in Browser, and view memory with Memory Inspector). So there is another stack maintained by the wasm engine in browser internally, is this correct?

Only variables that have their addresses taken, or large structs or arrays utilize this limit.

I understand address taken variables. But as for large structs or arrays, how large is it? Is there any spec/docs about this? or any examples.

Thanks very much.

@xbcnn
Copy link
Contributor

@xbcnn xbcnn commented Mar 2, 2022

I'd like to know more details about how functions calling implemented in WebAssembly, such as how arguments passing, execution control transferring, local variables alloc/free process, and results returned. Where is way to investigate or debugging? @juj @sbc100, Any guidance/information is welcome.

@juj
Copy link
Collaborator Author

@juj juj commented Mar 2, 2022

know in native x86 debugging env, there are EBP/ESP to follow up tracing all frames. In WebAssembly, how calling frames are built, and any way we can see those lower data?

That's right, in Wasm, there are two stacks. There is the internal Wasm VM managed control flow stack, which is completely hidden from Wasm user code to observe and examine - there is no way to see or manipulate any data in that stack, it is completely hidden for browser security reasons. That stack would contain function call frames (the EBP registers).

This stack that is Emscripten/LLVM established is only for "fat local data". If you have a general int foo as a local in a function, it will not be part of this data stack, but it will become a local in Wasm code. If you have an array char str[256]; or if you take an address of that int foo, then that data will be placed in the data stack.

I always wondering that where the passed arguments were put

The same story with passed arguments: they are part of the hidden Wasm VM stack, and not the LLVM data stack.

But as for large structs or arrays, how large is it? Is there any spec/docs about this?

Anything that is accessed via addresses or array indexing will need to be part of the stack. The Wasm locals cannot be array indexed. If a local variable does not have its address taken and is not indexed via an array, it will become a Wasm code local var. If the variable's address is taken, or it is indexed via an array subscript, it must go on the LLVM data stack.

how functions calling implemented in WebAssembly, such as how arguments passing, execution control transferring, local variables alloc/free process, and results returned

All of this information is internally implemented in a Wasm VM, and not exposed to user code to observe. The way it is implemented varies per each browser, so you may want to dig in to the browser source code to figure that out.

Emscripten/LLVM data stack position is tracked using a __stack_pointer global variable in the wasm module.

@xbcnn
Copy link
Contributor

@xbcnn xbcnn commented Mar 2, 2022

Thanks for the detailed replies, @juj. I got following good points and that answers part of my confusions.

in Wasm, there are two stacks.
Anything that is accessed via addresses or array indexing will need to be part of the stack.

For Wasm function calling model, I think I have to dive into browser source (probably v8/src/wasm) to find out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants