Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove varargs from g_assert and g_assert_not_reachable (save 200+ bytes per frame in wasm interp). #17254

Merged
merged 1 commit into from
Oct 9, 2019

Conversation

jaykrell
Copy link
Contributor

@jaykrell jaykrell commented Oct 9, 2019

g_assertf unchanged.

I have a strong suspicion, this will save approximately 432 - sizeof (InterpFrame) bytes of stack
per function call in the WebAssembly interpreter.

It will not significantly benefit or hurt any other system.
Systems with disabled asserts will receive a small size improvement.

WebAssembly has no provision for varargs.
Emscripten emulates it.
It appears the emulation allocates room in the frame per outgoing varargs call.
No stack packing.
So each g_assert takes space.

Alternative would be to discourage or disable or remove asserts.

g_assertf unchanged.

I have a strong suspicion, this will save approximately 432 bytes of stack
per function call in the WebAssembly interpreter.

It will not significantly benefit or hurt any other system.
Systems with disabled asserts will receive a small size improvement.

WebAssembly has no provision for varargs.
Emscripten emulates it.
It appears the emulation allocates room in the frame per outgoing varargs call.
No stack packing.
So each g_assert takes space.

Alternative would be to discourage or disable or remove asserts.
@jaykrell jaykrell marked this pull request as ready for review October 9, 2019 14:27
@BrzVlad
Copy link
Member

BrzVlad commented Oct 9, 2019

How does one check the stack size on wasm ? Disassembly ?

@jaykrell
Copy link
Contributor Author

jaykrell commented Oct 9, 2019

So, this part of the stack, is just in global memory.
So while this is a good change, should make things a little faster, on wasm, I suspect it is not the stack that is being exhausted.

@jaykrell
Copy link
Contributor Author

jaykrell commented Oct 9, 2019

It is a little murky, because, hey, JIT can optimize, but we have this from objdump:

0052ba func[163] <interp_exec_method_full>:
 0052bb: 16 7f                      | local[0..21] type=i32
 0052bd: 03 7e                      | local[22..24] type=i64
 0052bf: 02 7d                      | local[25..26] type=f32
 0052c1: 02 7c                      | local[27..28] type=f64
 0052c3: 23 00                      | global.get 0
 0052c5: 41 b0 03                   | i32.const 432
 0052c8: 6b                         | i32.sub
 0052c9: 22 0f                      | local.tee 15
 0052cb: 21 0e                      | local.set 14
 0052cd: 20 0f                      | local.get 15
 0052cf: 24 00                      | global.set 0

That 432 is what was bugging me.
The global 0 is a sort of stack pointer -- for stuff not in registers and not for return addresses.
If you look at the uses then of local 14 and 15, you can see they are around the allocas, the child_frames use, and outgoing varargs calls.
You can see the outgoing varargs calls all use a different offset from local 14 -- no stack packing.
And they are all/mostly asserts.
And note that wasm has no varargs. It must be emulated, by passing pointer to parameters.
So it holds together.

Aside then, the recurse/nonrecursive split is seemingly very effective, producing this, saving about 116 bytes:

0052b9 func[163] <interp_exec_method_full>:
 0052ba: 05 7f                      | local[0..4] type=i32
 0052bc: 23 00                      | global.get 0
 0052be: 41 d0 00                   | i32.const 80
 0052c1: 6b                         | i32.sub
 0052c2: 22 04                      | local.tee 4
 0052c4: 21 05                      | local.set 5
 0052c6: 20 04                      | local.get 4
 0052c8: 24 00                      | global.set 0

however this is from an incorrect merge so not final.

As to how to find the runtime JITed code, I don't know.

#16172

@lewurm
Copy link
Contributor

lewurm commented Oct 9, 2019

I have a strong suspicion, this will save approximately 432 - sizeof (InterpFrame) bytes of stack
per function call in the WebAssembly interpreter.

Is that with a build where --enable-minimal=assert_messages is set? does it still do the varags stuff with that option? (Edit: This question doesn't make much sense; even with DISABLE_ASSERT_MESSAGES it used to use varargs.)

Anyway, nice find!

@vargaz
Copy link
Contributor

vargaz commented Oct 9, 2019

I can confirm that this change reduces stack size from 480 to 272 on wasm. Nice work.

@vargaz
Copy link
Contributor

vargaz commented Oct 9, 2019

@monojenkins build failed

@jaykrell
Copy link
Contributor Author

jaykrell commented Oct 9, 2019

This was Ken @kjpou1's private build. With extra printf even -- so 432 inflated some.
I think the disabled asserts don't help here. They are still separate call site, with different __LINE__ unless I misread that. They just save on the strings.

Note that assert could be better other ways.
It could be a global struct passed by address.
You could store into it upon failure.
Or it can be a static const struct, statically lnitialized, passed by address.
The string can be entirely constant, or split into a fixed number of pieces -- which this basically does.

@lewurm
Copy link
Contributor

lewurm commented Oct 9, 2019

@monojenkins squash

@jaykrell jaykrell changed the title Remove varargs from g_assert and g_assert_not_reachable. Remove varargs from g_assert and g_assert_not_reachable (save 200+ bytes per frame in wasm interp). Oct 9, 2019
@jaykrell
Copy link
Contributor Author

jaykrell commented Oct 9, 2019

Thanks to @kjpou1 for help & persistance here.

@monojenkins
Copy link
Contributor

Cannot squash because the following required status checks are not successful:

  • "OS X x64 Android SDK" state is "failure"

@lewurm
Copy link
Contributor

lewurm commented Oct 9, 2019

@monojenkins build OS X x64 Android SDK

@lewurm
Copy link
Contributor

lewurm commented Oct 9, 2019

@monojenkins build Windows x64 Interpreter

@monojenkins monojenkins merged commit 8718b75 into mono:master Oct 9, 2019
ManickaP pushed a commit to ManickaP/runtime that referenced this pull request Jan 20, 2020
…7254)

Remove varargs from g_assert and g_assert_not_reachable (save 200+ bytes per frame in wasm interp).

g_assertf unchanged.

I have a strong suspicion, this will save approximately 432 - sizeof (InterpFrame) bytes of stack
per function call in the WebAssembly interpreter.

It will not significantly benefit or hurt any other system.
Systems with disabled asserts will receive a small size improvement.

WebAssembly has no provision for varargs.
Emscripten emulates it.
It appears the emulation allocates room in the frame per outgoing varargs call.
No stack packing.
So each g_assert takes space.

Alternative would be to discourage or disable or remove asserts.


Commit migrated from mono/mono@8718b75
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants