Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM #21502

Merged
merged 4 commits into from
Mar 21, 2024

Conversation

yamt
Copy link
Contributor

@yamt yamt commented Mar 11, 2024

@yamt
Copy link
Contributor Author

yamt commented Mar 11, 2024

to use this, i suppose we need one of the following:

  • make the LLVM use the new method unconditionally
  • or, make some changes in SUPPORT_LONGJMP and tools/building.py to specify the new LLVM option

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Would you mind adding these new functions to the existing emscripten_setjmp.c since that is where these functions currently live?

Also, can you match the formatting rules of that file which is basically LLVM coding style (you can use clang-format to do this for you I think).

__wasm_sjlj_setjmp(void *env, uint32_t label, void *func_invocation_id)
{
struct jmp_buf_impl *jb = env;
if (label == 0) { /* ABI contract */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace these two with you replace these with assert(...) ?

void
__wasm_sjlj_setjmp(void *env, uint32_t label, void *func_invocation_id)
{
struct jmp_buf_impl *jb = env;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just declare the first argument as struct jmp_buf_impl *jb? (i.e. why case from void* here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't want to have jmp_buf_impl in the function prototype as it's an internal detail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I was thinking that because this function is not actually declared anywhere else then it would not matter since this internal detail would remains solely within this .c file.

* ideally, this can be replaced with multivalue.
*/
struct arg {
void *env;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not declare this as struct jmp_buf_impl* ?

@sbc100 sbc100 requested a review from aheejin March 11, 2024 21:34
Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have basically the same set of comments with @sbc100:

  1. clang-format with our current style
  2. We seem to use // style comments in our existing libraries. So maybe match that for consistency
  3. I guess it'd be better to move it to the current file emscripten-setjmp.c at least for now. We can discuss more on where to move this for the upstreaming later, but I don't think we need to be blocked on that.

Also, unless you are planning for more changes, we can un-draft this PR I think, given that we want to land this before the LLVM PR?

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits.

Also, I'd like to bikeshed a bit on the (final) names of the functions... I agree it's good to prefix all functions with __wasm_, especially for future upstreaming. Do we need _sjlj_ too, given that setjmp and longjmp already imply that?

I like these, WDYT?

  • __wasm_setjmp
  • __wasm_test_setjmp
  • __wasm_longjmp
    • For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

Comment on lines 11 to 12
* a plan is to replace the __USING_WASM_SJLJ__ part of emscripten_setjmp.c
* with this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment is necessary. Also, __wasm_sjlj_setjmp and __wasm_sjlj_test can be used with Emscripten SjLj too. Only __wasm_sjlj_longjmp, which uses Wasm throw, is exclusive to Wasm SjLj.

Comment on lines 21 to 22
* jmp_buf should have large enough size and alignment to contain
* this structure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to use // with comments.
Nit: Can you wrap to 80 cols? (This and other comment lines too) This looks a bit too short.

* with this.
*/

#ifdef __USING_WASM_SJLJ__
Copy link
Member

@aheejin aheejin Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said above setjmp and test can be shared with Emscripten SjLj, so we shouldn't have this here. Can we move this before __wasm_sjlj_longjmp?

Comment on lines 77 to 78
* > The longjmp function cannot cause the setjmp macro to return
* > the value 0; if val is 0, the setjmp macro returns the value 1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these >s?

Comment on lines 7 to 9
* a runtime implementation for
* https://github.com/llvm/llvm-project/pull/84137
* https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to have these links in the library

@sbc100
Copy link
Collaborator

sbc100 commented Mar 12, 2024

Some nits.

Also, I'd like to bikeshed a bit on the (final) names of the functions... I agree it's good to prefix all functions with __wasm_, especially for future upstreaming. Do we need _sjlj_ too, given that setjmp and longjmp already imply that?

I like these, WDYT?

  • __wasm_setjmp
  • __wasm_test_setjmp

How about __wasm_setjmp_test for this one?

@aheejin
Copy link
Member

aheejin commented Mar 12, 2024

__wasm_setjmp_test sounds good too.

@yamt yamt force-pushed the wasm-sjlj-alt2 branch 2 times, most recently from dbbde24 to 57817d3 Compare March 13, 2024 07:20
@yamt yamt changed the title Add a setjmp/longjmp runtime for a new mode proposed in LLVM Add a setjmp/longjmp runtime for a new sjlj translation proposed in LLVM Mar 13, 2024
@yamt yamt marked this pull request as ready for review March 13, 2024 09:04
@yamt
Copy link
Contributor Author

yamt commented Mar 13, 2024

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

return jb->label;
}
return 0;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the corresponding LLVM change not also change the ABI for emscripten exceptions too? i.e. not just when USING_WASM_SJLJ? (i.e. should all this new code live outside the #if block?

Copy link
Member

@aheejin aheejin Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. These __wasm_setjmp and __wasm_setjmp_test can be used both for emscripten and Wasm SjLj. Actually, using different functions for the two of them will be messy and not allow us to delete the existing saveSetjmp and testSetjmp. This is currently within #ifdef __USING_WASM_SJLJ__. Only __wasm_longjmp is exclusive to Wasm SjLj.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the corresponding LLVM change not also change the ABI for emscripten exceptions too? i.e. not just when USING_WASM_SJLJ? (i.e. should all this new code live outside the #if block?

currently, no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. These __wasm_setjmp and __wasm_setjmp_test can be used both for emscripten and Wasm SjLj. Actually, using different functions for the two of them will be messy and not allow us to delete the existing saveSetjmp and testSetjmp. This is currently within #ifdef __USING_WASM_SJLJ__. Only __wasm_longjmp is exclusive to Wasm SjLj.

maybe. i haven't looked at the emscrpiten sjlj closely. do you think it's enough to apply the same change mechanically?

@aheejin
Copy link
Member

aheejin commented Mar 13, 2024

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

Does this mean you deleted the __wasm_longjmp? I can't find it in the diff anymore. Then do you have a plan to re-add it? If they are functionally the same, I think you can just replace it.

@yamt
Copy link
Contributor Author

yamt commented Mar 14, 2024

For now this can't be done because we have to have the current __wasm_longjmp. I suggest we land with a different name (the current __wasm_sjlj_longjmp seems fine) and change this (both in LLVM and emscripten) later.

i just used the same name as the change in this function is not essential.

Does this mean you deleted the __wasm_longjmp? I can't find it in the diff anymore.

yes.

Then do you have a plan to re-add it? If they are functionally the same, I think you can just replace it.

ok.

__builtin_wasm_throw(C_LONGJMP, &__wasm_longjmp_args);
arg->env = env;
arg->val = val;
__builtin_wasm_throw(1, arg); /* 1 == C_LONGJMP */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this code going to work both before and after the llvm change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this code going to work both before and after the llvm change?

i expect so.
at least it passed ./test/runner.py 'other.test_setjmp*' and ./test/runner.py 'other.test_setjmp*'.
do you have specific concerns?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't see how this same function can work both before and after the llvm change. Won't llvm generate different env arguments before and after your change? Is the layout of env not changing with your llvm change?

Can you also run ./test/runner core0.test_setjmp*?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it passed ./test/runner core0.test_setjmp* as well. (it was what i meant. repeating two other.test_setjmp* was a copy-and-paste mistake.)

the original jmp_buf was large enough to store &jb->arg. it was just unused before this change.

@yamt
Copy link
Contributor Author

yamt commented Mar 15, 2024

@sbc100 can you advise me about the latest CI failure? it isn't obvious to me what they are complaining about.

======================================================================
FAIL [0.005s]: test_minimal_runtime_code_size_hello_webgl2_wasm2js (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/Users/distiller/project/test/common.py", line 630, in resulting_test
    return func(self, *args)
  File "/Users/distiller/project/test/test_other.py", line 10841, in test_minimal_runtime_code_size
    self.assertEqual(total_output_size, total_expected_size)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 831, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 824, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 22635 != 22627

======================================================================
FAIL [0.004s]: test_minimal_runtime_code_size_hello_webgl_wasm2js (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/Users/distiller/project/test/common.py", line 630, in resulting_test
    return func(self, *args)
  File "/Users/distiller/project/test/test_other.py", line 10841, in test_minimal_runtime_code_size
    self.assertEqual(total_output_size, total_expected_size)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 831, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Users/distiller/emsdk/python/3.9.2_64bit/lib/python3.9/unittest/case.py", line 824, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 22137 != 22129

yamt added a commit to yamt/wasi-libc that referenced this pull request Mar 15, 2024
Add setjmp/longjump support based on Wasm EH proposal.

It's provided as a separate library (libsetjmp) from libc so that
runtimes w/o EH support can still load libc.so.

To use this setjmp/longjmp implementation, an application should
be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`.
(You need an LLVM with the change mentioned below.)

Also, you need a runtime with EH support to run such an application.

If you want to use the latest EH instructions, you can use
`binaryen --translate-eh-old-to-new` on your application.

Note: You don't need to translate libsetjmp.a/so to the new EH.
While LLVM currently produces bytecode for an old version of the EH
proposal, luckily for us, the bytecode used in this library (ie. the tag
definition and the "throw" instruction) is compatible with the latest
version of the proposal.

The runtime logic is basically copy-and-paste from:
    https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp

The corresponding LLVM change:
    llvm/llvm-project#84137
    (Note: you need this change to build setjmp/longjmp using code.
    otoh, you don't need this to build libsetjmp.)

A similar change for emscripten:
    emscripten-core/emscripten#21502

An older version of this PR, which doesn't require LLVM changes:
    WebAssembly#467

Discussion:
    https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit

An example to use the latest EH instructions:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm
toywasm --wasi your_app.wasm
```
Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`.

An example to use the older EH instructions, which LLVM currently produces:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
iwasm your_app.wasm
```
Note: use wasm-micro-runtime built with `-DWASM_ENABLE_EXCE_HANDLING=1`.
Note: as of writing this, only the classic interpreter supports EH.
@sbc100
Copy link
Collaborator

sbc100 commented Mar 15, 2024

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

@yamt
Copy link
Contributor Author

yamt commented Mar 15, 2024

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

ok. let me try a rebase. i hope the force push won't make @aheejin too unhappy.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! I'm excited to see this land

yamt added a commit to yamt/wasi-libc that referenced this pull request Mar 15, 2024
Add setjmp/longjump support based on Wasm EH proposal.

It's provided as a separate library (libsetjmp) from libc so that
runtimes w/o EH support can still load libc.so.

To use this setjmp/longjmp implementation, an application should
be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`.
(You need an LLVM with the change mentioned below.)

Also, you need a runtime with EH support to run such an application.

If you want to use the latest EH instructions, you can use
`binaryen --translate-eh-old-to-new` on your application.

Note: You don't need to translate libsetjmp.a/so to the new EH.
While LLVM currently produces bytecode for an old version of the EH
proposal, luckily for us, the bytecode used in this library (ie. the tag
definition and the "throw" instruction) is compatible with the latest
version of the proposal.

The runtime logic is basically copy-and-paste from:
    https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp

The corresponding LLVM change:
    llvm/llvm-project#84137
    (Note: you need this change to build setjmp/longjmp using code.
    otoh, you don't need this to build libsetjmp.)

A similar change for emscripten:
    emscripten-core/emscripten#21502

An older version of this PR, which doesn't require LLVM changes:
    WebAssembly#467

Discussion:
    https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit

An example to use the latest EH instructions:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm
toywasm --wasi your_app.wasm
```
Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`.

An example to use the older EH instructions, which LLVM currently produces:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
iwasm your_app.wasm
```
Note: use wasm-micro-runtime built with `-DWAMR_BUILD_EXCE_HANDLING=1`.
Note: as of writing this, only the classic interpreter supports EH.
@aheejin
Copy link
Member

aheejin commented Mar 15, 2024

Those "code_size" test failures are likely unrelated. They often happen then llvm changes effect code size. Can you trying rebasing or merging?

ok. let me try a rebase. i hope the force push won't make @aheejin too unhappy.

You can do a merge (instead of a rebase) via git merge main in order not to force-push 😀 But anyway, I think it's fine either way in small PRs like this. Sorry if that caused any inconvenience.

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran all sjlj tests with this PR and llvm/llvm-project#84137 and all tests seem to pass except for test_longjmp_zero for Emscripten SjLj. Turns out emscripten_longjmp needs to adjust val to 1 when the input is 0 too. LGTM otherwise.


Then the question is why test_longjmp_zero was passing for emscripten SjLj until now. I investigated it a little, and the answer is, accidentally.

Feel free to skip the below if not interested. I'm writing this mostly for my own record-keeping.

https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L157-L169
In this code, it checks whether both __THREW__ and __threwValue are nonzero. Nonzero __THREW__ means an exception or a longjmp occurred. __threwValue is the second argument to longjmp. Somehow it checks whether __threwValue is 0, and if so, goes here:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L166-L167
Which causes the switch to go to the splitted next BB, which means, run as if a longjmp hasn't occurred:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L174

But I don't think this is correct. longjmp has occurred if __THREW__ is set, and is not dependent on whether __threwValue is 0 or not, but this code assumes longjmp occurred only when both of them are nonzero. I don't know why I wrote code like this, but I guess I was porting then fastcomp's implementation and moved this part just verbatim: https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278

Anyway, it causes us to go to the next code after longjmp. But the next instruction in LLVM IR after longjmp is always unreachable. So a part of the generated code looks like this:

if.end18:                                         ; preds = %if.else117, %if.end221
  %label23 = phi i32 [ %label22, %if.end221 ], [ -1, %if.else117 ]
  %longjmp_result24 = call i32 @getTempRet0()
  switch i32 %label23, label %if.end.split [
    i32 1, label %entry.split.split
  ]

;; This is the post-longjmp BB
if.end.split:                                     ; preds = %if.end18
  unreachable

This part matches here: https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169-L175

In this code, because we come from here, %label23 is -1, and we end up going to %if.end.split, which contains only unreachable. And %entry.split.split is one of the post-setjmp BBs. (Here there's only one setjmp, so there's only one entry) So looking at the code here it looks we have to end up trapping, even if that's not the correct output (because the correct output is when the input is 0, it should be fixed to 1). But because %if.end.split is unreachable, in isel, it converts that switch into

br label %entry.split.split

So even if the code tries to run as if longjmp has not occurred, it actually ends up running as longjmp has occurred and jumps to the post-setjmp BB. I know this is so (unnecessarily) accidentally messed up and I don't really recommend reading this at this point...

And if we go back to the post-setjmp BB, what's the result of that gonna be? If you run the else part, you don't end up running this setTempRet0: https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L165

And the result of longjmp is read here, but because we haven't run setTempRet0, the value we read will be whatever value that was set before:
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169

And before, we happened to call setTempRet0 with the setjmpTableSize. Not sure why we were doing that. Even the comment says "unneeded":

setTempRet0(size); // FIXME: unneeded?

So as long as we call longjmp(buf, 0) after setjmp, if the second argument to longjmp is 0, we end up getting 4, which is setjmpTableSize, as a result. But this PR and the companion llvm PR removes setjmpTable and setjmpTableSize and the setTempRet0 call with them. So now getTempRet0 here ends up getting 0 (not because it was longjmp's second argument but because it was read before being set with any values)
https://github.com/llvm/llvm-project/blob/99a8ffe6b1e78d16f5dc528048a7c34a236b62bb/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp#L169

So test_longjmp_zero.c's this line returns 0:

int val = setjmp(b1);

and the next line if (val) is not taken, and goes to longjmp again, and setjmp return 0 again, ... so it goes into an infinite loop and hangs.

Comment on lines +122 to 127
// C standard says:
// The longjmp function cannot cause the setjmp macro to return
// the value 0; if val is 0, the setjmp macro returns the value 1.
if (val == 0) {
val = 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this for emscripten_longjmp too. Can you paste it there too?

@sbc100
Copy link
Collaborator

sbc100 commented Mar 18, 2024

Thanks for the details analysis of the test_longjmp_zero issue @aheejin.

When I added that test I did notice the setjmp was returning 4 but I couldn't figure out why.

Do you think we could fix that issue first, before we land the ABI change? I assume it would be an standalone llvm change, at which point we could update the test_longjmp_zero test to actually assert that the return value is 1.

@aheejin
Copy link
Member

aheejin commented Mar 18, 2024

No I think we should land this first, or add

if (val == 0)
  val = 1

separately to emscripten_longjmp first. What we do in LLVM is to add an unnecessary condition & __threwValue = 0) to the longjmp condition, but with the correct library this can't happen anyway. I'll fix LLVM anyway after these SjLj PRs land because it is unnecessary code.

@sbc100
Copy link
Collaborator

sbc100 commented Mar 18, 2024

No I think we should land this first, or add

if (val == 0)
  val = 1

separately to emscripten_longjmp first.

That sounds good. I couldn't find a place to do this myself when I added test_longjmp_zero .. it would be good get that done and add the extra assert to test_longjmp_zero.

@yamt
Copy link
Contributor Author

yamt commented Mar 21, 2024

No I think we should land this first, or add

if (val == 0)
  val = 1

separately to emscripten_longjmp first.

That sounds good. I couldn't find a place to do this myself when I added test_longjmp_zero .. it would be good get that done and add the extra assert to test_longjmp_zero.

#21577

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we can merge this now? Will merge it. Thank you for working on this!

@aheejin aheejin merged commit 618ed94 into emscripten-core:main Mar 21, 2024
29 checks passed
@sbc100
Copy link
Collaborator

sbc100 commented Mar 21, 2024

Awesome. I guess now the llvm side can land too?

@aheejin
Copy link
Member

aheejin commented Mar 21, 2024

He pushed another conflicts-resolving commit so the LLVM CI is still running, so I think we can do that some hours later.

aheejin pushed a commit to llvm/llvm-project that referenced this pull request Mar 26, 2024
…jlj (#84137)

Instead of maintaining per-function-invocation malloc()'ed tables to track which functions each label belongs to, store the equivalent info in jump buffers (jmp_buf) themselves.

Also, use a less emscripten-looking ABI symbols:
```
    saveSetjmp     -> __wasm_setjmp
    testSetjmp      -> __wasm_setjmp_test
    getTempRet0    -> (removed)
    __wasm_longjmp  -> (no change)
```

While I want to use this for WASI, it should work for emscripten as well.

An example runtime and a few tests:
https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp

wasi-libc version of the runtime:
WebAssembly/wasi-libc#483

emscripten version of the runtime:
emscripten-core/emscripten#21502

Discussion:
https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit
aheejin added a commit to aheejin/llvm-project that referenced this pull request Mar 26, 2024
Currently the code thinks a `longjmp` occurred if both `__THREW__` and
`__threwValue` are nonzero. But `__threwValue` can be 0, and the
`longjmp` library function should change it to 1 in case it is 0:
https://en.cppreference.com/w/c/program/longjmp

Emscripten libraries were not consistent about that, but after
emscripten-core/emscripten#21493 and
emscripten-core/emscripten#21502, we correctly
pass 1 in case the input is 0. So there will be no case `__threwValue`
is 0. And regardless of what `longjmp` library function does, treating
`longjmp`'s 0 input to its second argument as "not longjmping" doesn't
seem right.

I'm not sure where that `__threwValue` checking came from, but probably
I was porting then fastcomp's implementation and moved this part just
verbatim:
https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278

Just for the context, how this was discovered:
emscripten-core/emscripten#21502 (review)
aheejin added a commit to aheejin/emscripten that referenced this pull request Mar 26, 2024
aheejin added a commit that referenced this pull request Mar 26, 2024
sbc100 added a commit to sbc100/emscripten that referenced this pull request Mar 27, 2024
As part of emscripten-core#21502 some assertions were added to the wasm SjLj helpers.

Because calls to these functions can be generated during LTO we exclude
compiler-rt from LTO and always build it as normal object files.
Because these normal object files could be pulled in after LTO takes
place, they cannot themsleves refer to LTO objects.   Sadly `assert`
referees to printf and stdout stuff which is compiled as LTO.  This
leads the failures we are currently setting for
lto0.test_longjmp_standalone_standalone:

```
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(stderr.o): attempt to add bitcode file after LTO (__stderr_FILE)
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(fprintf.o): attempt to add bitcode file after LTO (fprintf)
```

Building with `-DNDEBUG` works around this issue by not actually
including the assert code.  Its also more correct to do so for
compiler-rt which is not a debug library.
sbc100 added a commit to sbc100/emscripten that referenced this pull request Mar 27, 2024
As part of emscripten-core#21502 some assertions were added to the wasm SjLj helpers.

Because calls to these functions can be generated during LTO we exclude
compiler-rt from LTO and always build it as normal object files.
Because these normal object files could be pulled in after LTO takes
place, they cannot themsleves refer to LTO objects.   Sadly `assert`
referees to printf and stdout stuff which is compiled as LTO.  This
leads the failures we are currently setting for
lto0.test_longjmp_standalone_standalone:

```
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(stderr.o): attempt to add bitcode file after LTO (__stderr_FILE)
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(fprintf.o): attempt to add bitcode file after LTO (fprintf)
```

Building with `-DNDEBUG` works around this issue by not actually
including the assert code.  Its also more correct to do so for
compiler-rt which is not a debug library.
aheejin added a commit to llvm/llvm-project that referenced this pull request Mar 27, 2024
…86633)

Currently the code thinks a `longjmp` occurred if both `__THREW__` and
`__threwValue` are nonzero. But `__threwValue` can be 0, and the
`longjmp` library function should change it to 1 in case it is 0:
https://en.cppreference.com/w/c/program/longjmp

Emscripten libraries were not consistent about that, but after
emscripten-core/emscripten#21493 and
emscripten-core/emscripten#21502, we correctly
pass 1 in case the input is 0. So there will be no case `__threwValue`
is 0. And regardless of what `longjmp` library function does, treating
`longjmp`'s 0 input to its second argument as "not longjmping" doesn't
seem right.

I'm not sure where that `__threwValue` checking came from, but probably
I was porting then fastcomp's implementation and moved this part just
verbatim:
https://github.com/emscripten-core/emscripten-fastcomp/blob/9bdc7bb4fc595fe05a021b06fe350e8494a741a1/lib/Target/JSBackend/CallHandlers.h#L274-L278

Just for the context, how this was discovered:
emscripten-core/emscripten#21502 (review)
sbc100 added a commit that referenced this pull request Mar 27, 2024
As part of #21502 some assertions were added to the wasm SjLj helpers.

Because calls to these functions can be generated during LTO we exclude
compiler-rt from LTO and always build it as normal object files.
Because these normal object files could be pulled in after LTO takes
place, they cannot themselves refer to LTO objects.   Sadly `assert`
referees to printf and stdout stuff which is compiled as LTO.  This
leads the failures we are currently setting for
lto0.test_longjmp_standalone_standalone:

```
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(stderr.o): attempt to add bitcode file after LTO (__stderr_FILE)
wasm-ld: error: /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc-debug.a(fprintf.o): attempt to add bitcode file after LTO (fprintf)
```

Building with `-DNDEBUG` works around this issue by not actually
including the assert code.  Its also more correct to do so for
compiler-rt which is not a debug library.
sunfishcode pushed a commit to WebAssembly/wasi-libc that referenced this pull request Apr 1, 2024
* Add libsetjmp.a/so

Add setjmp/longjump support based on Wasm EH proposal.

It's provided as a separate library (libsetjmp) from libc so that
runtimes w/o EH support can still load libc.so.

To use this setjmp/longjmp implementation, an application should
be compiled with `-mllvm -wasm-enable-sjlj` and linked with `-lsetjmp`.
(You need an LLVM with the change mentioned below.)

Also, you need a runtime with EH support to run such an application.

If you want to use the latest EH instructions, you can use
`binaryen --translate-eh-old-to-new` on your application.

Note: You don't need to translate libsetjmp.a/so to the new EH.
While LLVM currently produces bytecode for an old version of the EH
proposal, luckily for us, the bytecode used in this library (ie. the tag
definition and the "throw" instruction) is compatible with the latest
version of the proposal.

The runtime logic is basically copy-and-paste from:
    https://github.com/yamt/garbage/tree/wasm-sjlj-alt2/wasm/longjmp

The corresponding LLVM change:
    llvm/llvm-project#84137
    (Note: you need this change to build setjmp/longjmp using code.
    otoh, you don't need this to build libsetjmp.)

A similar change for emscripten:
    emscripten-core/emscripten#21502

An older version of this PR, which doesn't require LLVM changes:
    #467

Discussion:
    https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit

An example to use the latest EH instructions:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
wasm-opt --translate-eh-old-to-new -o your_app.wasm your_app.wasm
toywasm --wasi your_app.wasm
```
Note: use toywasm built with `-DTOYWASM_ENABLE_WASM_EXCEPTION_HANDLING=ON`.

An example to use the older EH instructions, which LLVM currently produces:
```
clang -mllvm -wasm-enable-sjlj -o your_app.wasm your_app.c -lsetjmp
iwasm your_app.wasm
```
Note: use wasm-micro-runtime built with `-DWAMR_BUILD_EXCE_HANDLING=1`.
Note: as of writing this, only the classic interpreter supports EH.

* Make libsetjmp build optional

* CI: Disable libsetjmp for old LLVM

* libc-top-half/musl/include/setjmp.h: fix a rebase botch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants