Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64: Take SIGFPE signals for divide traps #6026

Merged
merged 8 commits into from
Mar 16, 2023

Conversation

alexcrichton
Copy link
Member

Prior to this commit Wasmtime would configure avoid_div_traps=true unconditionally for Cranelift. This, for the division-based instructions, would change emitted code to explicitly trap on trap conditions instead of letting the div x86 instruction trap.

There's no specific reason for Wasmtime, however, to specifically avoid traps in the div instruction. This means that the extra generated branches on x86 aren't necessary since the div and idiv instructions already trap for similar conditions as wasm requires.

This commit instead disables the avoid_div_traps setting for Wasmtime's usage of Cranelift. Subsequently the codegen rules were updated slightly:

  • When avoid_div_traps=true, traps are no longer emitted for div instructions.
  • The udiv/urem instructions now list their trap as divide-by-zero instead of integer overflow.
  • The lowering for sdiv was updated to still explicitly check for zero but the integer overflow case is deferred to the instruction itself.
  • The lowering of srem no longer checks for zero and the listed trap for the div instruction is a divide-by-zero.

This means that the codegen for udiv and urem no longer have any branches. The codegen for sdiv removes one branch but keeps the zero-check to differentiate the two kinds of traps. The codegen for srem removes one branch but keeps the -1 check since the semantics of srem mismatch with the semantics of idiv with a -1 divisor (specifically for INT_MIN).

This is unlikely to have really all that much of a speedup but was something I noticed during #6008 which seemed like it'd be good to clean up. Plus Wasmtime's signal handling was already set up to catch SIGFPE, it was just never firing.

Prior to this commit Wasmtime would configure `avoid_div_traps=true`
unconditionally for Cranelift. This, for the division-based
instructions, would change emitted code to explicitly trap on trap
conditions instead of letting the `div` x86 instruction trap.

There's no specific reason for Wasmtime, however, to specifically avoid
traps in the `div` instruction. This means that the extra generated
branches on x86 aren't necessary since the `div` and `idiv` instructions
already trap for similar conditions as wasm requires.

This commit instead disables the `avoid_div_traps` setting for
Wasmtime's usage of Cranelift. Subsequently the codegen rules were
updated slightly:

* When `avoid_div_traps=true`, traps are no longer emitted for `div`
  instructions.
* The `udiv`/`urem` instructions now list their trap as divide-by-zero
  instead of integer overflow.
* The lowering for `sdiv` was updated to still explicitly check for zero
  but the integer overflow case is deferred to the instruction itself.
* The lowering of `srem` no longer checks for zero and the listed trap
  for the `div` instruction is a divide-by-zero.

This means that the codegen for `udiv` and `urem` no longer have any
branches. The codegen for `sdiv` removes one branch but keeps the
zero-check to differentiate the two kinds of traps. The codegen for
`srem` removes one branch but keeps the -1 check since the semantics of
`srem` mismatch with the semantics of `idiv` with a -1 divisor
(specifically for INT_MIN).

This is unlikely to have really all that much of a speedup but was
something I noticed during bytecodealliance#6008 which seemed like it'd be good to clean
up. Plus Wasmtime's signal handling was already set up to catch
`SIGFPE`, it was just never firing.
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen wasmtime:api Related to the API of the `wasmtime` crate itself labels Mar 15, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought on the plumbing below; also, if you want to branch-fold away the avoid_div_traps=true case as part of this PR or a followup, I think that would also be a welcome change (unless there's an environment where it is necessary that I'm missing).

cranelift/codegen/src/isa/x64/inst/emit.rs Outdated Show resolved Hide resolved
With no known users currently removing this should be possible and helps
simplify the x64 backend.
Remove the `validate_sdiv_divisor*` pseudo-instructions and clean up
some of the ISLE rules now that `div` is allowed to itself trap
unconditionally.
@alexcrichton
Copy link
Member Author

if you want to branch-fold away the avoid_div_traps=true case as part of this PR

Nah I think it makes sense to go ahead and do that here, and I feel it pretty nicely simplified the rules!

@alexcrichton
Copy link
Member Author

One minor non-optimal bit about this PR is that currently I've represented the trap as TrapCode inside of the div instructions, but it arguably should be Option<TrapCode> and None for cases such as divide-by-constant-that-isn't-zero. Dealing with Option is a bit of a hassle in ISLE right now though and the only "cost" of this is a slightly larger trap metadata section which doesn't seem like it'll break the bank or cause any major issues in the meantime.

Don't accidentally fold multiple traps together
@github-actions github-actions bot added cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:meta Everything related to the meta-language. labels Mar 15, 2023
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! Non-optional trapcode is fine, IMHO; we can always optimize later as you say.

r+ with CI fix (emit-tests that need to be updated with the new trapcode parameter)

@alexcrichton alexcrichton added this pull request to the merge queue Mar 16, 2023
Merged via the queue into bytecodealliance:main with commit 5ae8575 Mar 16, 2023
@alexcrichton alexcrichton deleted the x64-sigfpe branch March 16, 2023 00:59
@thibaultcha
Copy link
Contributor

thibaultcha commented May 15, 2023

After upgrading to Wasmtime 8.0.0, how can we revert back to the previous behavior? This patch causes Valgrind to complain about SIGFPE when before 8.0.0 everything was just fine. Just to be clear we don't want to raise SIGFPE when Wasmtime encounters a div by 0, we want it to throw a trap as usual and nothing more.

@cfallin
Copy link
Member

cfallin commented May 15, 2023

@thibaultcha there isn't really a good way to do that; this PR removed the code-paths that support inline checks rather than relying on signals, because it vastly simplifies things and we judged that it would be fine to use SIGFPE handling instead for the embedding scenarios we support.

It's possible we could revert this whole PR, but that does have a maintenance cost, so it would be useful to know more about the requirements of your case. Is the issue mainly Valgrind warnings (and if so, is there a way to define a suppression for them)? Or is there something more fundamental about your embedding environment where signals are not acceptable?

@alexcrichton
Copy link
Member Author

Out of curiosity, if SIGFPE is causing problems how does valgrind handle wasm performing out-of-bounds memory loads? Those are done with guard pages and signal handling right now, so if those are working then I'd expecte SIGFPE to work as well. Or are you perhaps configuring wasm to have no signals by using dynamic memory with explicit bounds checks?

(I'm also curious about what Chris mentioned in terms of suppressions and info about signals, too)

@thibaultcha
Copy link
Contributor

thibaultcha commented May 16, 2023

It seems we have sorted it out! It wasn't my intention to ask for a revert this patch; it was rather that since I read:

Wasmtime would configure avoid_div_traps=true

I was initially wondering if we as users could revert this setting to get back to the previous behavior and fix our issue.

We found that a Valgrind flag does the trick for this test and allows us to correctly handle SIGFPE with the new cranelift behavior: --vex-iropt-register-updates=allregs-at-each-insn.

Thank you both for the replies!

@cfallin
Copy link
Member

cfallin commented May 16, 2023

Great! I'm glad you've got a solution and let us know if you have any other issues.

We're always happy to take PRs to docs if you feel like there's a place where a Valgrind-related tip might have helped you (but no pressure at all, of course); otherwise at least these comments should be searchable if others hit the same thing in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants