Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM #21502

yamt · 2024-03-11T01:36:48Z

the new runtime code is basically a copy from:
https://github.com/yamt/garbage/blob/wasm-sjlj-alt2/wasm/longjmp/rt.c.

The corresponding LLVM change:
llvm/llvm-project#84137

Discussion:
https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit

yamt · 2024-03-11T02:08:23Z

to use this, i suppose we need one of the following:

make the LLVM use the new method unconditionally
or, make some changes in SUPPORT_LONGJMP and tools/building.py to specify the new LLVM option

sbc100

Thanks for working on this!

Would you mind adding these new functions to the existing emscripten_setjmp.c since that is where these functions currently live?

Also, can you match the formatting rules of that file which is basically LLVM coding style (you can use clang-format to do this for you I think).

sbc100 · 2024-03-11T21:31:00Z

system/lib/compiler-rt/wasm_setjmp.c

+__wasm_sjlj_setjmp(void *env, uint32_t label, void *func_invocation_id)
+{
+        struct jmp_buf_impl *jb = env;
+        if (label == 0) { /* ABI contract */


Maybe replace these two with you replace these with assert(...) ?

sbc100 · 2024-03-11T21:32:00Z

system/lib/compiler-rt/wasm_setjmp.c

+void
+__wasm_sjlj_setjmp(void *env, uint32_t label, void *func_invocation_id)
+{
+        struct jmp_buf_impl *jb = env;


Should we just declare the first argument as struct jmp_buf_impl *jb? (i.e. why case from void* here?)

i don't want to have jmp_buf_impl in the function prototype as it's an internal detail.

Fair enough. I was thinking that because this function is not actually declared anywhere else then it would not matter since this internal detail would remains solely within this .c file.

sbc100 · 2024-03-11T21:34:11Z

system/lib/compiler-rt/wasm_setjmp.c

+         * ideally, this can be replaced with multivalue.
+         */
+        struct arg {
+                void *env;


Why not declare this as struct jmp_buf_impl* ?

aheejin

Thanks! I have basically the same set of comments with @sbc100:

clang-format with our current style
We seem to use // style comments in our existing libraries. So maybe match that for consistency
I guess it'd be better to move it to the current file emscripten-setjmp.c at least for now. We can discuss more on where to move this for the upstreaming later, but I don't think we need to be blocked on that.

Also, unless you are planning for more changes, we can un-draft this PR I think, given that we want to land this before the LLVM PR?

aheejin

Some nits.

Also, I'd like to bikeshed a bit on the (final) names of the functions... I agree it's good to prefix all functions with __wasm_, especially for future upstreaming. Do we need _sjlj_ too, given that setjmp and longjmp already imply that?

I like these, WDYT?

__wasm_setjmp
__wasm_test_setjmp
__wasm_longjmp
- For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

aheejin · 2024-03-12T00:28:37Z

system/lib/compiler-rt/wasm_setjmp.c

+ * a plan is to replace the __USING_WASM_SJLJ__ part of emscripten_setjmp.c
+ * with this.


I don't think this comment is necessary. Also, __wasm_sjlj_setjmp and __wasm_sjlj_test can be used with Emscripten SjLj too. Only __wasm_sjlj_longjmp, which uses Wasm throw, is exclusive to Wasm SjLj.

aheejin · 2024-03-12T00:29:44Z

system/lib/compiler-rt/wasm_setjmp.c

+ * jmp_buf should have large enough size and alignment to contain
+ * this structure.


We seem to use // with comments.
Nit: Can you wrap to 80 cols? (This and other comment lines too) This looks a bit too short.

aheejin · 2024-03-12T00:30:12Z

system/lib/compiler-rt/wasm_setjmp.c

+ * with this.
+ */
+
+#ifdef __USING_WASM_SJLJ__


As I said above setjmp and test can be shared with Emscripten SjLj, so we shouldn't have this here. Can we move this before __wasm_sjlj_longjmp?

aheejin · 2024-03-12T00:30:55Z

system/lib/compiler-rt/wasm_setjmp.c

+         * > The longjmp function cannot cause the setjmp macro to return
+         * > the value 0; if val is 0, the setjmp macro returns the value 1.


Do we need these >s?

aheejin · 2024-03-12T00:31:33Z

system/lib/compiler-rt/wasm_setjmp.c

+ * a runtime implementation for
+ * https://github.com/llvm/llvm-project/pull/84137
+ * https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit


I don't think we need to have these links in the library

sbc100 · 2024-03-12T06:26:22Z

Some nits.

Also, I'd like to bikeshed a bit on the (final) names of the functions... I agree it's good to prefix all functions with __wasm_, especially for future upstreaming. Do we need _sjlj_ too, given that setjmp and longjmp already imply that?

I like these, WDYT?

__wasm_setjmp

__wasm_test_setjmp

How about __wasm_setjmp_test for this one?

aheejin · 2024-03-12T19:04:05Z

__wasm_setjmp_test sounds good too.

yamt · 2024-03-13T09:05:48Z

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

sbc100 · 2024-03-13T15:44:29Z

system/lib/compiler-rt/emscripten_setjmp.c

+    return jb->label;
+  }
+  return 0;
+}


Does the corresponding LLVM change not also change the ABI for emscripten exceptions too? i.e. not just when USING_WASM_SJLJ? (i.e. should all this new code live outside the #if block?

+1. These __wasm_setjmp and __wasm_setjmp_test can be used both for emscripten and Wasm SjLj. Actually, using different functions for the two of them will be messy and not allow us to delete the existing saveSetjmp and testSetjmp. This is currently within #ifdef __USING_WASM_SJLJ__. Only __wasm_longjmp is exclusive to Wasm SjLj.

Does the corresponding LLVM change not also change the ABI for emscripten exceptions too? i.e. not just when USING_WASM_SJLJ? (i.e. should all this new code live outside the #if block?

currently, no.

+1. These __wasm_setjmp and __wasm_setjmp_test can be used both for emscripten and Wasm SjLj. Actually, using different functions for the two of them will be messy and not allow us to delete the existing saveSetjmp and testSetjmp. This is currently within #ifdef __USING_WASM_SJLJ__. Only __wasm_longjmp is exclusive to Wasm SjLj.

maybe. i haven't looked at the emscrpiten sjlj closely. do you think it's enough to apply the same change mechanically?

aheejin · 2024-03-13T22:37:39Z

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

Does this mean you deleted the __wasm_longjmp? I can't find it in the diff anymore. Then do you have a plan to re-add it? If they are functionally the same, I think you can just replace it.

yamt · 2024-03-14T01:03:11Z

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

Does this mean you deleted the __wasm_longjmp? I can't find it in the diff anymore.

yes.

Then do you have a plan to re-add it? If they are functionally the same, I think you can just replace it.

ok.

sbc100 · 2024-03-15T04:42:32Z

system/lib/compiler-rt/emscripten_setjmp.c

-  __builtin_wasm_throw(C_LONGJMP, &__wasm_longjmp_args);
+  arg->env = env;
+  arg->val = val;
+  __builtin_wasm_throw(1, arg); /* 1 == C_LONGJMP */


Is this code going to work both before and after the llvm change?

Is this code going to work both before and after the llvm change?

i expect so.
at least it passed ./test/runner.py 'other.test_setjmp*' and ./test/runner.py 'other.test_setjmp*'.
do you have specific concerns?

I guess I don't see how this same function can work both before and after the llvm change. Won't llvm generate different env arguments before and after your change? Is the layout of env not changing with your llvm change?

Can you also run ./test/runner core0.test_setjmp*?

it passed ./test/runner core0.test_setjmp* as well. (it was what i meant. repeating two other.test_setjmp* was a copy-and-paste mistake.)

the original jmp_buf was large enough to store &jb->arg. it was just unused before this change.

yamt · 2024-03-15T05:14:30Z

@sbc100 can you advise me about the latest CI failure? it isn't obvious to me what they are complaining about.

======================================================================
FAIL [0.005s]: test_minimal_runtime_code_size_hello_webgl2_wasm2js (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/Users/distiller/project/test/common.py", line 630, in resulting_test
    return func(self, *args)
  File "/Users/distiller/project/test/test_other.py", line 10841, in test_minimal_runtime_code_size
    self.assertEqual(total_output_size, total_expected_size)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 831, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 824, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 22635 != 22627

======================================================================
FAIL [0.004s]: test_minimal_runtime_code_size_hello_webgl_wasm2js (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/Users/distiller/project/test/common.py", line 630, in resulting_test
    return func(self, *args)
  File "/Users/distiller/project/test/test_other.py", line 10841, in test_minimal_runtime_code_size
    self.assertEqual(total_output_size, total_expected_size)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 831, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 824, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 22137 != 22129

Add setjmp/longjump support based on Wasm EH proposal. It's provided as a separate library (libsetjmp) from libc so that runtimes w/o EH support can still load libc.so. To use this setjmp/longjmp implementation, an application should be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`. (You need an LLVM with the change mentioned below.) Also, you need a runtime with EH support to run such an application. If you want to use the latest EH instructions, you can use `binaryen --translate-eh-old-to-new` on your application. Note: You don't need to translate libsetjmp.a/so to the new EH. While LLVM currently produces bytecode for an old version of the EH proposal, luckily for us, the bytecode used in this library (ie. the tag definition and the "throw" instruction) is compatible with the latest version of the proposal. The runtime logic is basically copy-and-paste from: https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp The corresponding LLVM change: llvm/llvm-project#84137 (Note: you need this change to build setjmp/longjmp using code. otoh, you don't need this to build libsetjmp.) A similar change for emscripten: emscripten-core/emscripten#21502 An older version of this PR, which doesn't require LLVM changes: WebAssembly#467 Discussion: https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit An example to use the latest EH instructions: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm toywasm --wasi your_app.wasm ``` Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`. An example to use the older EH instructions, which LLVM currently produces: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp iwasm your_app.wasm ``` Note: use wasm-micro-runtime built with `-DWASM_ENABLE_EXCE_HANDLING=1`. Note: as of writing this, only the classic interpreter supports EH.

sbc100 · 2024-03-15T05:43:28Z

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

yamt · 2024-03-15T06:06:07Z

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

ok. let me try a rebase. i hope the force push won't make @aheejin too unhappy.

sbc100

Awesome! I'm excited to see this land

Add setjmp/longjump support based on Wasm EH proposal. It's provided as a separate library (libsetjmp) from libc so that runtimes w/o EH support can still load libc.so. To use this setjmp/longjmp implementation, an application should be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`. (You need an LLVM with the change mentioned below.) Also, you need a runtime with EH support to run such an application. If you want to use the latest EH instructions, you can use `binaryen --translate-eh-old-to-new` on your application. Note: You don't need to translate libsetjmp.a/so to the new EH. While LLVM currently produces bytecode for an old version of the EH proposal, luckily for us, the bytecode used in this library (ie. the tag definition and the "throw" instruction) is compatible with the latest version of the proposal. The runtime logic is basically copy-and-paste from: https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp The corresponding LLVM change: llvm/llvm-project#84137 (Note: you need this change to build setjmp/longjmp using code. otoh, you don't need this to build libsetjmp.) A similar change for emscripten: emscripten-core/emscripten#21502 An older version of this PR, which doesn't require LLVM changes: WebAssembly#467 Discussion: https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit An example to use the latest EH instructions: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm toywasm --wasi your_app.wasm ``` Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`. An example to use the older EH instructions, which LLVM currently produces: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp iwasm your_app.wasm ``` Note: use wasm-micro-runtime built with `-DWAMR_BUILD_EXCE_HANDLING=1`. Note: as of writing this, only the classic interpreter supports EH.

aheejin · 2024-03-15T23:09:35Z

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

ok. let me try a rebase. i hope the force push won't make @aheejin too unhappy.

You can do a merge (instead of a rebase) via git merge main in order not to force-push 😀 But anyway, I think it's fine either way in small PRs like this. Sorry if that caused any inconvenience.

also, replace __wasm_longjmp with my version as well.

aheejin

I ran all sjlj tests with this PR and llvm/llvm-project#84137 and all tests seem to pass except for test_longjmp_zero for Emscripten SjLj. Turns out emscripten_longjmp needs to adjust val to 1 when the input is 0 too. LGTM otherwise.

Then the question is why test_longjmp_zero was passing for emscripten SjLj until now. I investigated it a little, and the answer is, accidentally.

Feel free to skip the below if not interested. I'm writing this mostly for my own record-keeping.

https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L157-L169
In this code, it checks whether both __THREW__ and __threwValue are nonzero. Nonzero __THREW__ means an exception or a longjmp occurred. __threwValue is the second argument to longjmp. Somehow it checks whether __threwValue is 0, and if so, goes here:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L166-L167
Which causes the switch to go to the splitted next BB, which means, run as if a longjmp hasn't occurred:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L174

But I don't think this is correct. longjmp has occurred if __THREW__ is set, and is not dependent on whether __threwValue is 0 or not, but this code assumes longjmp occurred only when both of them are nonzero. I don't know why I wrote code like this, but I guess I was porting then fastcomp's implementation and moved this part just verbatim: https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278

Anyway, it causes us to go to the next code after longjmp. But the next instruction in LLVM IR after longjmp is always unreachable. So a part of the generated code looks like this:

if.end18:                                         ; preds = %if.else117, %if.end221
  %label23 = phi i32 [ %label22, %if.end221 ], [ -1, %if.else117 ]
  %longjmp_result24 = call i32 @getTempRet0()
  switch i32 %label23, label %if.end.split [
    i32 1, label %entry.split.split
  ]

;; This is the post-longjmp BB
if.end.split:                                     ; preds = %if.end18
  unreachable

This part matches here: https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169-L175

In this code, because we come from here, %label23 is -1, and we end up going to %if.end.split, which contains only unreachable. And %entry.split.split is one of the post-setjmp BBs. (Here there's only one setjmp, so there's only one entry) So looking at the code here it looks we have to end up trapping, even if that's not the correct output (because the correct output is when the input is 0, it should be fixed to 1). But because %if.end.split is unreachable, in isel, it converts that switch into

br label %entry.split.split

So even if the code tries to run as if longjmp has not occurred, it actually ends up running as longjmp has occurred and jumps to the post-setjmp BB. I know this is so (unnecessarily) accidentally messed up and I don't really recommend reading this at this point...

And if we go back to the post-setjmp BB, what's the result of that gonna be? If you run the else part, you don't end up running this setTempRet0: https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L165

And the result of longjmp is read here, but because we haven't run setTempRet0, the value we read will be whatever value that was set before:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169

And before, we happened to call setTempRet0 with the setjmpTableSize. Not sure why we were doing that. Even the comment says "unneeded":

emscripten/system/lib/compiler-rt/emscripten_setjmp.c

Line 48 in 918e131

setTempRet0(size); // FIXME: unneeded?

So as long as we call longjmp(buf, 0) after setjmp, if the second argument to longjmp is 0, we end up getting 4, which is setjmpTableSize, as a result. But this PR and the companion llvm PR removes setjmpTable and setjmpTableSize and the setTempRet0 call with them. So now getTempRet0 here ends up getting 0 (not because it was longjmp's second argument but because it was read before being set with any values)
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169

So test_longjmp_zero.c's this line returns 0:

emscripten/test/core/test_longjmp_zero.c

Line 14 in 918e131

int val = setjmp(b1);

and the next line if (val) is not taken, and goes to longjmp again, and setjmp return 0 again, ... so it goes into an infinite loop and hangs.

aheejin · 2024-03-18T07:39:47Z

system/lib/compiler-rt/emscripten_setjmp.c

+  // C standard says:
+  // The longjmp function cannot cause the setjmp macro to return
+  // the value 0; if val is 0, the setjmp macro returns the value 1.
  if (val == 0) {
    val = 1;
  }


We need this for emscripten_longjmp too. Can you paste it there too?

sbc100 · 2024-03-18T14:00:21Z

Thanks for the details analysis of the test_longjmp_zero issue @aheejin.

When I added that test I did notice the setjmp was returning 4 but I couldn't figure out why.

Do you think we could fix that issue first, before we land the ABI change? I assume it would be an standalone llvm change, at which point we could update the test_longjmp_zero test to actually assert that the return value is 1.

aheejin · 2024-03-18T17:51:09Z

No I think we should land this first, or add

if (val == 0)
  val = 1

separately to emscripten_longjmp first. What we do in LLVM is to add an unnecessary condition & __threwValue = 0) to the longjmp condition, but with the correct library this can't happen anyway. I'll fix LLVM anyway after these SjLj PRs land because it is unnecessary code.

sbc100 · 2024-03-18T17:56:40Z

No I think we should land this first, or add
if (val == 0)
  val = 1
separately to emscripten_longjmp first.

That sounds good. I couldn't find a place to do this myself when I added test_longjmp_zero .. it would be good get that done and add the extra assert to test_longjmp_zero.

yamt · 2024-03-21T00:09:17Z

No I think we should land this first, or add
if (val == 0)
  val = 1
separately to emscripten_longjmp first.
That sounds good. I couldn't find a place to do this myself when I added test_longjmp_zero .. it would be good get that done and add the extra assert to test_longjmp_zero.

#21577

aheejin

So I think we can merge this now? Will merge it. Thank you for working on this!

sbc100 · 2024-03-21T20:41:31Z

Awesome. I guess now the llvm side can land too?

aheejin · 2024-03-21T21:17:38Z

He pushed another conflicts-resolving commit so the LLVM CI is still running, so I think we can do that some hours later.

…jlj (#84137) Instead of maintaining per-function-invocation malloc()'ed tables to track which functions each label belongs to, store the equivalent info in jump buffers (jmp_buf) themselves. Also, use a less emscripten-looking ABI symbols: ``` saveSetjmp -> __wasm_setjmp testSetjmp -> __wasm_setjmp_test getTempRet0 -> (removed) __wasm_longjmp -> (no change) ``` While I want to use this for WASI, it should work for emscripten as well. An example runtime and a few tests: https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp wasi-libc version of the runtime: WebAssembly/wasi-libc#483 emscripten version of the runtime: emscripten-core/emscripten#21502 Discussion: https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit

Currently the code thinks a `longjmp` occurred if both `__THREW__` and `__threwValue` are nonzero. But `__threwValue` can be 0, and the `longjmp` library function should change it to 1 in case it is 0: https://en.cppreference.com/w/c/program/longjmp Emscripten libraries were not consistent about that, but after emscripten-core/emscripten#21493 and emscripten-core/emscripten#21502, we correctly pass 1 in case the input is 0. So there will be no case `__threwValue` is 0. And regardless of what `longjmp` library function does, treating `longjmp`'s 0 input to its second argument as "not longjmping" doesn't seem right. I'm not sure where that `__threwValue` checking came from, but probably I was porting then fastcomp's implementation and moved this part just verbatim: https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278 Just for the context, how this was discovered: emscripten-core/emscripten#21502 (review)

After llvm/llvm-project#84137 and emscripten-core#21502, these methods are no longer used.

After llvm/llvm-project#84137 and #21502, these methods are no longer used.

As part of emscripten-core#21502 some assertions were added to the wasm SjLj helpers. Because calls to these functions can be generated during LTO we exclude compiler-rt from LTO and always build it as normal object files. Because these normal object files could be pulled in after LTO takes place, they cannot themsleves refer to LTO objects. Sadly `assert` referees to printf and stdout stuff which is compiled as LTO. This leads the failures we are currently setting for lto0.test_longjmp_standalone_standalone: ``` wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(stderr.o): attempt to add bitcode file after LTO (__stderr_FILE) wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(fprintf.o): attempt to add bitcode file after LTO (fprintf) ``` Building with `-DNDEBUG` works around this issue by not actually including the assert code. Its also more correct to do so for compiler-rt which is not a debug library.

…86633) Currently the code thinks a `longjmp` occurred if both `__THREW__` and `__threwValue` are nonzero. But `__threwValue` can be 0, and the `longjmp` library function should change it to 1 in case it is 0: https://en.cppreference.com/w/c/program/longjmp Emscripten libraries were not consistent about that, but after emscripten-core/emscripten#21493 and emscripten-core/emscripten#21502, we correctly pass 1 in case the input is 0. So there will be no case `__threwValue` is 0. And regardless of what `longjmp` library function does, treating `longjmp`'s 0 input to its second argument as "not longjmping" doesn't seem right. I'm not sure where that `__threwValue` checking came from, but probably I was porting then fastcomp's implementation and moved this part just verbatim: https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278 Just for the context, how this was discovered: emscripten-core/emscripten#21502 (review)

As part of #21502 some assertions were added to the wasm SjLj helpers. Because calls to these functions can be generated during LTO we exclude compiler-rt from LTO and always build it as normal object files. Because these normal object files could be pulled in after LTO takes place, they cannot themselves refer to LTO objects. Sadly `assert` referees to printf and stdout stuff which is compiled as LTO. This leads the failures we are currently setting for lto0.test_longjmp_standalone_standalone: ``` wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(stderr.o): attempt to add bitcode file after LTO (__stderr_FILE) wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(fprintf.o): attempt to add bitcode file after LTO (fprintf) ``` Building with `-DNDEBUG` works around this issue by not actually including the assert code. Its also more correct to do so for compiler-rt which is not a debug library.

* Add libsetjmp.a/so Add setjmp/longjump support based on Wasm EH proposal. It's provided as a separate library (libsetjmp) from libc so that runtimes w/o EH support can still load libc.so. To use this setjmp/longjmp implementation, an application should be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`. (You need an LLVM with the change mentioned below.) Also, you need a runtime with EH support to run such an application. If you want to use the latest EH instructions, you can use `binaryen --translate-eh-old-to-new` on your application. Note: You don't need to translate libsetjmp.a/so to the new EH. While LLVM currently produces bytecode for an old version of the EH proposal, luckily for us, the bytecode used in this library (ie. the tag definition and the "throw" instruction) is compatible with the latest version of the proposal. The runtime logic is basically copy-and-paste from: https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp The corresponding LLVM change: llvm/llvm-project#84137 (Note: you need this change to build setjmp/longjmp using code. otoh, you don't need this to build libsetjmp.) A similar change for emscripten: emscripten-core/emscripten#21502 An older version of this PR, which doesn't require LLVM changes: #467 Discussion: https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit An example to use the latest EH instructions: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm toywasm --wasi your_app.wasm ``` Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`. An example to use the older EH instructions, which LLVM currently produces: ``` clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp iwasm your_app.wasm ``` Note: use wasm-micro-runtime built with `-DWAMR_BUILD_EXCE_HANDLING=1`. Note: as of writing this, only the classic interpreter supports EH. * Make libsetjmp build optional * CI: Disable libsetjmp for old LLVM * libc-top-half/musl/include/setjmp.h: fix a rebase botch

yamt force-pushed the wasm-sjlj-alt2 branch from 35f6cc7 to 1ed5f81 Compare March 11, 2024 01:37

yamt mentioned this pull request Mar 11, 2024

[WebAssembly] Implement an alternative translation for -wasm-enable-sjlj llvm/llvm-project#84137

Merged

sbc100 reviewed Mar 11, 2024

View reviewed changes

sbc100 requested a review from aheejin March 11, 2024 21:34

aheejin reviewed Mar 12, 2024

View reviewed changes

yamt force-pushed the wasm-sjlj-alt2 branch 2 times, most recently from dbbde24 to 57817d3 Compare March 13, 2024 07:20

yamt changed the title ~~Add a setjmp/longjmp runtime for a new mode proposed in LLVM~~ Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM Mar 13, 2024

yamt marked this pull request as ready for review March 13, 2024 09:04

sbc100 reviewed Mar 13, 2024

View reviewed changes

sbc100 reviewed Mar 15, 2024

View reviewed changes

yamt mentioned this pull request Mar 15, 2024

Add libsetjmp.a/so WebAssembly/wasi-libc#483

Merged

sbc100 approved these changes Mar 15, 2024

View reviewed changes

yamt force-pushed the wasm-sjlj-alt2 branch from fcb8ade to b1e809d Compare March 15, 2024 06:07

yamt added 3 commits March 17, 2024 21:45

provide __wasm_setjmp/__wasm_setjmp_test for emscripten sjlj as well

bd447a3

also, replace __wasm_longjmp with my version as well.

format

18e2fb2

C_LONGJMP definition, comments

3195086

yamt force-pushed the wasm-sjlj-alt2 branch from eb4d601 to 3195086 Compare March 17, 2024 12:46

aheejin reviewed Mar 18, 2024

View reviewed changes

toyobayashi mentioned this pull request Mar 20, 2024

wasi-libc is adding sjlj support toyobayashi/wasm-sjlj#2

Open

aheejin approved these changes Mar 21, 2024

View reviewed changes

aheejin merged commit 618ed94 into emscripten-core:main Mar 21, 2024
29 checks passed

aheejin mentioned this pull request Mar 26, 2024

[WebAssembly] Remove threwValue comparison after __wasm_setjmp_test llvm/llvm-project#86633

Merged

aheejin added a commit to aheejin/emscripten that referenced this pull request Mar 26, 2024

[SjLj] Remove old unused library methods

c543aaf

After llvm/llvm-project#84137 and emscripten-core#21502, these methods are no longer used.

aheejin mentioned this pull request Mar 26, 2024

[SjLj] Remove old unused library methods #21613

Merged

aheejin added a commit that referenced this pull request Mar 26, 2024

[SjLj] Remove old unused library methods (#21613)

29be801

After llvm/llvm-project#84137 and #21502, these methods are no longer used.

sbc100 mentioned this pull request Mar 27, 2024

Fix lto0.test_longjmp_standalone_standalone. NFC #21632

Merged

yamt mentioned this pull request Apr 1, 2024

Document a setjmp/longjmp convention WebAssembly/tool-conventions#225

Merged

sbc100 mentioned this pull request Apr 15, 2024

emsdk 3.1.57: undefined symbol: saveSetjmp/testSetjmp #21756

Closed

		* a plan is to replace the __USING_WASM_SJLJ__ part of emscripten_setjmp.c
		* with this.

		* jmp_buf should have large enough size and alignment to contain
		* this structure.

		* > The longjmp function cannot cause the setjmp macro to return
		* > the value 0; if val is 0, the setjmp macro returns the value 1.

Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM #21502

Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM #21502

Conversation

yamt commented Mar 11, 2024 • edited

yamt commented Mar 11, 2024

sbc100 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aheejin left a comment • edited

Choose a reason for hiding this comment

aheejin left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aheejin Mar 12, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sbc100 commented Mar 12, 2024

aheejin commented Mar 12, 2024

yamt commented Mar 13, 2024

Choose a reason for hiding this comment

aheejin Mar 13, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aheejin commented Mar 13, 2024

yamt commented Mar 14, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yamt commented Mar 15, 2024

sbc100 commented Mar 15, 2024

yamt commented Mar 15, 2024

sbc100 left a comment

Choose a reason for hiding this comment

aheejin commented Mar 15, 2024

aheejin left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sbc100 commented Mar 18, 2024

aheejin commented Mar 18, 2024 • edited

sbc100 commented Mar 18, 2024

yamt commented Mar 21, 2024

aheejin left a comment

Choose a reason for hiding this comment

sbc100 commented Mar 21, 2024

aheejin commented Mar 21, 2024

yamt commented Mar 11, 2024 •

edited

aheejin left a comment •

edited

aheejin left a comment •

edited

aheejin Mar 12, 2024 •

edited

aheejin Mar 13, 2024 •

edited

aheejin left a comment •

edited

aheejin commented Mar 18, 2024 •

edited