Merge with upstream #174

Closed
wants to merge 18 commits into from

@dhil dhil commented May 17, 2024

Notable changes:

  • The WasmFX baseline implementation can now be toggled only by
    setting the environment variable WASMFX_IMPL=baseline. The
    rationale for this change is to work around the fact that Cargo
    features are additive, meaning the feature wasmfx_baseline would
    always be enabled by cargo build --all-features, which is what
    the CI runs.
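
A minimal sketch of environment-variable selection, as opposed to an additive Cargo feature. The names `WasmFxImpl`, `select_impl`, and `select_impl_from_env` are hypothetical, and whether the project reads `WASMFX_IMPL` at build time or at run time is not stated here; only the variable name and the value `baseline` come from the description above.

```rust
use std::env;

/// Which implementation to use. Hypothetical names, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WasmFxImpl {
    /// Whatever the non-baseline implementation is.
    Standard,
    /// The baseline implementation.
    Baseline,
}

/// Maps the raw value of `WASMFX_IMPL` to an implementation choice.
/// Only the exact string "baseline" selects the baseline implementation;
/// an unset variable or any other value falls back to the standard one.
pub fn select_impl(value: Option<&str>) -> WasmFxImpl {
    match value {
        Some("baseline") => WasmFxImpl::Baseline,
        _ => WasmFxImpl::Standard,
    }
}

/// Reads `WASMFX_IMPL` from the process environment.
pub fn select_impl_from_env() -> WasmFxImpl {
    select_impl(env::var("WASMFX_IMPL").ok().as_deref())
}
```

Because the choice depends on a single explicit value rather than on the union of enabled features, `cargo build --all-features` cannot switch it on by accident.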

afonso360 and others added 18 commits May 15, 2024 21:47
)

* egraphs: Const propagate bitwise float operations

* egraphs: Move float bitwise operations to prelude.isle
This fixes an accidental regression from bytecodealliance#8616 where page alignment was
implicitly happening due to how configuration was processed but it
wasn't re-added in the refactoring.
* CI: shard testing and checking crates across multiple jobs

* prtest:full
…8635)

* Cranelift: add alignment parameter to stack slots.

Fixes bytecodealliance#6716.

Currently, stack slots are aligned only to a machine-word
boundary. This is insufficient for some use-cases: for example, storing
SIMD data or structs that require a larger alignment.

This PR adds a parameter to the `StackSlotData` to specify alignment,
and the associated logic to the CLIF parser and printer. It updates the
shared ABI code to compute the stackslot layout taking the alignment
into account. In order to ensure the alignment is always a power of two,
it is stored as a shift amount (log2 of actual alignment) in the IR.
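
To illustrate why a shift amount guarantees a power-of-two alignment, here is a small sketch. `SlotData`, `align_up`, and `layout` are hypothetical stand-ins, not Cranelift's `StackSlotData` or its ABI code, and shift amounts are assumed to be well below 32.

```rust
/// Hypothetical stand-in for stack slot data; not Cranelift's real type.
#[derive(Debug, Clone, Copy)]
pub struct SlotData {
    /// Size of the slot in bytes.
    pub size: u32,
    /// log2 of the required alignment, so the alignment is always 2^align_shift.
    pub align_shift: u8,
}

impl SlotData {
    /// Actual alignment in bytes, recovered from the stored shift amount.
    pub fn alignment(&self) -> u32 {
        1u32 << self.align_shift
    }
}

/// Rounds `offset` up to the next multiple of `2^align_shift`.
pub fn align_up(offset: u32, align_shift: u8) -> u32 {
    let mask = (1u32 << align_shift) - 1;
    (offset + mask) & !mask
}

/// Assigns each slot an offset that respects its alignment.
/// Returns the per-slot offsets and the total size of the laid-out area.
pub fn layout(slots: &[SlotData]) -> (Vec<u32>, u32) {
    let mut offsets = Vec::with_capacity(slots.len());
    let mut next = 0u32;
    for slot in slots {
        let start = align_up(next, slot.align_shift);
        offsets.push(start);
        next = start + slot.size;
    }
    (offsets, next)
}
```

Storing the shift rather than the byte count means no value of the field can encode a non-power-of-two alignment, so the mask arithmetic in `align_up` is always valid.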

* Apply suggestions from code review

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

* Update filetest.

* Update alignment to ValRaw vector.

* Fix printer test.

* cargo-fmt from suggestion update.

---------

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Use WASM function names in compiled objects

Instead of generating symbol names in the format
"wasm[$MODULE_ID]::function[$FUNCTION_INDEX]", generate (if possible)
something more readable, such as "wasm[$MODULE_ID]::$FUNCTION_NAME".
This helps when debugging or profiling the generated code.
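
The two formats quoted above can be sketched as follows. The function `symbol_name` and its parameters are hypothetical, and falling back to the index-based form for an empty name is an assumption of this sketch.

```rust
/// Builds a symbol name for a compiled wasm function.
/// Hypothetical helper: uses the readable form when a name is available
/// and the index-based form otherwise.
pub fn symbol_name(module_id: u32, func_index: u32, func_name: Option<&str>) -> String {
    match func_name {
        // Readable form: wasm[$MODULE_ID]::$FUNCTION_NAME
        Some(name) if !name.is_empty() => format!("wasm[{module_id}]::{name}"),
        // Fallback form: wasm[$MODULE_ID]::function[$FUNCTION_INDEX]
        _ => format!("wasm[{module_id}]::function[{func_index}]"),
    }
}
```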

Co-authored-by: Jamey Sharp <jsharp@fastly.com>

* Ensure symbol names are cleaned up and have function indexes

Filter symbol names to include only characters that are usually used
for function names, and that might be produced by name mangling.
Replace everything else with a question mark (and all repeated question
marks by a single one), and then truncate to a length of 96 characters.

This should be enough not only to avoid passing user-controlled strings
to tools such as "perf" and "objdump", but also to make it easier to
disambiguate symbols that might have the same name but different
indices.
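
The cleaning steps described above (filter, replace with a question mark, collapse repeats, truncate to 96 characters) might look roughly like this. `clean_symbol` is a hypothetical name, and the allowed set used here (ASCII alphanumerics plus `_`) is an assumption; the commit does not spell out the exact set.

```rust
/// Maximum length of a cleaned symbol name, per the commit message.
pub const MAX_SYMBOL_LEN: usize = 96;

/// Cleans a user-controlled function name for use in a symbol.
/// Disallowed characters become '?', runs of '?' collapse to one,
/// and the result is cut off at `MAX_SYMBOL_LEN` characters.
pub fn clean_symbol(name: &str) -> String {
    let mut out = String::with_capacity(name.len().min(MAX_SYMBOL_LEN));
    let mut last_was_placeholder = false;
    for c in name.chars() {
        // The output is pure ASCII, so byte length equals character count.
        if out.len() >= MAX_SYMBOL_LEN {
            break;
        }
        if c.is_ascii_alphanumeric() || c == '_' {
            out.push(c);
            last_was_placeholder = false;
        } else if !last_was_placeholder {
            out.push('?');
            last_was_placeholder = true;
        }
    }
    out
}
```

Truncating while building the string, instead of slicing afterwards, avoids any risk of cutting at an invalid index.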

* Make symbol cleaning slightly more efficient

* Update symbol names to be closer to what tests expect

* Ensure only alphanumeric ASCII characters are allowed in a symbol name

* Ensure sliced symbol name is within its bounds

* Update test expectations after adding function name to symbol name

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
…liance#8631)

* gen_nominal_sp_adj now returns a smallvec

* Remove the virtual sp offset from the x64 backend

* Remove the virtual sp offset from the aarch64 backend

* Remove the virtual sp offset from the riscv64 backend

* Remove the virtual sp offset from the s390x backend

* Remove gen_nominal_sp_adj, and argument area management functions

* Remove get_virtual_sp_offset_from_state

* Code review suggestions
* occurred

* winch typos

* tests typos

* cli typos

* fuzz typos

* examples typos

* docs typos

* crates/wasmtime typos

* crates/environ typos

* crates/cranelift typos

* crates/test-programs typos

* crates/c-api typos

* crates/cache typos

* crates other typos

* cranelift/codegen/src/isa typos

* cranelift/codegen/src other typos

* cranelift/codegen other typos

* cranelift other typos

* ci js typo

* .github workflows typo

* RELEASES typo

* Fix clang-format documentation line

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
)

* cranelift: expand umbrella crate with more crates

* Break the dependency cycle between cranelift-jit and cranelift

---------

Co-authored-by: Trevor Elliott <telliott@fastly.com>
…iance#8628)

* Use bytes for maximum size of linear memory with pooling

This commit changes configuration of the pooling allocator to use a
byte-based unit rather than a page based unit. The previous
`PoolingAllocatorConfig::memory_pages` configuration option configures
the maximum size that a linear memory may grow to at runtime. This is an
important factor in calculation of stripes for MPK and is also a
coarse-grained knob apart from `StoreLimiter` to limit memory
consumption. This configuration option has been renamed to
`max_memory_size` and documented that it's in terms of bytes rather than
pages as before.

Additionally, the documented constraint that `max_memory_size` must be
smaller than `static_memory_bound` is now enforced as a minor clean-up
in this PR.
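
As a worked example of the unit change, the sketch below converts a page count to bytes and checks the constraint. The helper names are hypothetical and this is not Wasmtime's configuration code; the 64 KiB page size is the standard WebAssembly page size, and treating equality with `static_memory_bound` as acceptable is an assumption.

```rust
/// Size of one WebAssembly linear-memory page in bytes (64 KiB).
pub const WASM_PAGE_SIZE: u64 = 65_536;

/// Converts an old page-based limit into the new byte-based unit.
/// Returns `None` if the multiplication would overflow.
pub fn pages_to_bytes(pages: u64) -> Option<u64> {
    pages.checked_mul(WASM_PAGE_SIZE)
}

/// Checks that the byte-based maximum does not exceed the static bound.
/// Whether equality is permitted is an assumption of this sketch.
pub fn validate_limits(max_memory_size: u64, static_memory_bound: u64) -> Result<(), String> {
    if max_memory_size > static_memory_bound {
        Err(format!(
            "max_memory_size ({max_memory_size} bytes) exceeds static_memory_bound ({static_memory_bound} bytes)"
        ))
    } else {
        Ok(())
    }
}
```

For instance, a former limit of 160 pages corresponds to 160 × 65,536 = 10,485,760 bytes (10 MiB).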

* Review comments

* Fix benchmark build
…nce#8629)

* Remove the native ABI calling convention from Wasmtime

This commit proposes removing the "native abi" calling convention used
in Wasmtime. For background this ABI dates back to the origins of
Wasmtime. Originally Wasmtime only had `Func::call` and eventually I
added `TypedFunc` with `TypedFunc::call` and `Func::wrap` for a faster
path. At the time given the state of trampolines it was easiest to call
WebAssembly code directly without any trampolines using the native ABI
that wasm used at the time. This is the original source of the native
ABI and it's persisted over time under the assumption that it's faster
than the array ABI due to keeping arguments in registers rather than
spilling them to the stack.

Over time, however, this design decision of using the native ABI has not
aged well. Trampolines have changed quite a lot in the meantime and it's
no longer possible for the host to call wasm without a trampoline, for
example. Compilations nowadays maintain both native and array
trampolines for wasm functions in addition to host functions. There's a
large split between `Func::new` and `Func::wrap`. Overall, there's quite
a lot of weight that we're pulling for the design decision of using the
native ABI.

Functionally this hasn't ever really been the end of the world.
Trampolines aren't a known issue in terms of performance or code size.
There's no known faster way to invoke WebAssembly from the host (or
vice-versa). One major downside of this design, however, is that
`Func::new` requires Cranelift as a backend to exist. This is due to the
fact that it needs to synthesize various entries in the matrix of ABIs
we have that aren't available at any other time. While this is itself
not the worst of issues it means that the C API cannot be built without
a compiler because the C API does not have access to `Func::wrap`.

Overall I'd like to reevaluate given where Wasmtime is today whether it
makes sense to keep the native ABI trampolines. Sure they're supposed to
be fast, but are they really that much faster than the array-call ABI as
an alternative? This commit is intended to measure this.

This commit removes the native ABI calling convention entirely. For
example `VMFuncRef` is now one pointer smaller. All of `TypedFunc` now
uses `*mut ValRaw` for loads/stores rather than dealing with ABI
business. The benchmarks with this PR are:

* `sync/no-hook/core - host-to-wasm - typed - nop` - 5% faster
* `sync/no-hook/core - host-to-wasm - typed - nop-params-and-results` - 10% slower
* `sync/no-hook/core - wasm-to-host - typed - nop` - no change
* `sync/no-hook/core - wasm-to-host - typed - nop-params-and-results` - 7% faster

These numbers are a bit surprising as I would have suspected no change
in both "nop" benchmarks as well as both being slower in the
params-and-results benchmarks. Regardless it is apparent that this is
not a major change in terms of performance given Wasmtime's current
state. In general my hunch is that there are more expensive sources of
overhead than reads/writes from the stack when dealing with wasm values
(e.g. trap handling, store management, etc).

Overall this commit feels like a large simplification of what we
currently do in `TypedFunc`:

* The number of ABIs that Wasmtime deals with is reduced by one. ABIs
  are pretty much always tricky and having fewer moving parts should
  help improve the understandability of the system.
* All of the `WasmTy` trait methods and `TypedFunc` infrastructure is
  simplified. Traits now work with simple `load`/`store` methods rather
  than various other flavors of conversion.
* The multi-return-value handling of the native ABI is all gone now
  which gave rise to significant complexity within Wasmtime's Cranelift
  translation layer in addition to the `TypedFunc` backing traits.
* This aligns components and core wasm where components always use the
  array ABI and now core wasm additionally will always use the array ABI
  when communicating with the host.
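
To make the array-ABI idea concrete, here is a heavily simplified sketch of the `load`/`store` style. `RawSlot`, `SlotValue`, `add_i32`, and `call_add` are hypothetical; Wasmtime's real `ValRaw` is a different type and its real entry points take raw pointers, so this shows only the shape of the approach.

```rust
/// Hypothetical stand-in for a raw value slot: one 64-bit cell per value.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct RawSlot(pub u64);

/// Hypothetical trait: each value type knows how to write itself to a slot
/// and read itself back, with no per-type calling convention involved.
pub trait SlotValue: Sized {
    fn store(self, slot: &mut RawSlot);
    fn load(slot: &RawSlot) -> Self;
}

impl SlotValue for i32 {
    fn store(self, slot: &mut RawSlot) {
        // Zero-extend the 32-bit pattern into the 64-bit cell.
        slot.0 = self as u32 as u64;
    }
    fn load(slot: &RawSlot) -> Self {
        slot.0 as u32 as i32
    }
}

impl SlotValue for f64 {
    fn store(self, slot: &mut RawSlot) {
        slot.0 = self.to_bits();
    }
    fn load(slot: &RawSlot) -> Self {
        f64::from_bits(slot.0)
    }
}

/// Array-call style callee: parameters are read from the front of the
/// slice and results are written back into the same storage.
pub fn add_i32(args_and_results: &mut [RawSlot]) {
    let a = i32::load(&args_and_results[0]);
    let b = i32::load(&args_and_results[1]);
    a.wrapping_add(b).store(&mut args_and_results[0]);
}

/// Typed wrapper on the caller side: store arguments, call, load the result.
pub fn call_add(a: i32, b: i32) -> i32 {
    let mut slots = [RawSlot::default(); 2];
    a.store(&mut slots[0]);
    b.store(&mut slots[1]);
    add_i32(&mut slots);
    i32::load(&slots[0])
}
```

Every function shares one signature regardless of its wasm type, which is why multi-value returns need no special handling: results simply occupy more slots.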

I'll note that this still leaves a major ABI "complexity" in that
native functions do not have a wasm ABI function pointer until
they're "attached" to a `Store` with a `Module`. That's required to
avoid needing Cranelift for creating host functions and that property is
still true today. This is a bit simpler to understand though now that
`Func::new` and `Func::wrap` are treated uniformly rather than one being
special-cased.

* Fix miri unsafety

prtest:full
… Fiber library. Now it fails gracefully with a trap.
This commit makes the WasmFX baseline implementation toggleable only
by setting the environment variable `WASMFX_IMPL=baseline`. The
rationale for this change is to work around the fact that Cargo
features are additive, meaning the feature `wasmfx_baseline` would
always be enabled by `cargo build --all-features`, which is what
the CI runs.
@dhil dhil closed this May 20, 2024
@dhil dhil deleted the wasmfx-merge3 branch May 20, 2024 11:21

9 participants