[pull] main from bytecodealliance:main #24

pull · 2023-10-10T22:36:16Z

See Commits and Changes for more details.


This removes about a million lines from our estimated audit backlog according to `cargo vet suggest`. If I understand the Criterion documentation correctly, I believe this means that generating HTML reports from Criterion benchmarks now requires having gnuplot installed, because it can't fall back to using the pure-Rust "plotters" crate.

Currently the CMakeLists.txt is designed to only be consumed by a parent project, which limits its flexibility. Adding a `project()` function allows it to also be built as its own project. Also set `USES_TERMINAL_BUILD` to `TRUE` in `ExternalProject_Add()`, which allows getting the installation progress output from `cargo build`.

* winch(amd64): more fpu Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com> * Add tests Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com> --------- Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>

* Gate type-builder types from `wasmtime-environ` on `compile`

  This commit gates the `*Builder` types related to building sets of types in the `wasmtime-environ` crate on the `compile` feature. This helps bring in less code when the feature is disabled and helps exclude some dependencies for the upcoming `no_std` migration as well. This commit doesn't change anything, it's just moving code around.

* Remove no-longer-needed import

prtest:full

* Add an example for wasi-nn WinML backend.

  This example is similar to classification-component-onnx, but it's not a wasm component, and it's for the WinML backend. It also includes step-by-step instructions for running this example.

* Touch up the documentation for this example

  This change removes some duplicated information and tweaks some of the wording.

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>

* separate out stable map and set to prevent access to internals * separate out disjointset to prevent access to internals * fix tests

Now all registers passed to the operand collector are mutably borrowed directly out of their original locations in the Inst, so it is possible to update them in place. As an initial demonstration of the utility of this change, the results of the VReg renamer are applied directly to the instructions during operand collection, and then all VReg aliases are cleared after operand collection. Most of this commit consists of deleting noise from the many `get_operands` implementations in all the backends: most ampersands and asterisks, and all uses of the `ref` keyword.

This commit adds support for defining array types from Wasm or the host, and managing them inside the engine's types registry. It does not introduce support for allocating or manipulating array values. That functionality will come in future pull requests.

We know the type of each VReg at the moment when we allocate it, so we never need to set the type again. That also means we don't need to deduplicate reftyped VRegs as we go, although it's still possible to have duplicates after aliases are resolved.

Stashing the source location in a side location is unnecessary: we already have it where we need it.

Just a small follow up to #8481

The operand collector and the instruction emitter for Inst::Mov both placed the `rm` register before `rd`, so the emitted code was correct, but the pretty-printer used the opposite order and so printed the operands backwards. Note that the VCode disassembly disagreed with Capstone's disassembly of the emitted machine code.

* typos in comments * typos in comments and one fuzzing error message * revert typo under the deps directory

This paves the way for more implementations of this OperandVisitor trait which can do different things with the operands. Of particular note, this commit introduces a second implementation which is used only in the s390x backend and only to implement a debug assertion. Previously, s390x used an OperandCollector::no_reuse_def method to implement this assertion, but I didn't want to require that all implementors of the new trait have to provide that method, so this captures the same check but keeps it local to where it's needed.

Instead of

> Performing build step for 'wasmtime-crate'
> 'WASMTIME_CARGO_BINARY-NOTFOUND' is not recognized as an internal or external command, operable program or batch file.

this will now instead output

> "cargo" was not found. Ensure "cargo" is in PATH. Aborting...

* move fx hash to workspace level dep * change internal fxhash to use fxhash crate * remove unneeded HashSet import * change fxhash crate to rustc hash * undo migration to rustc hash * manually implement hash function from fxhash * change to rustc hash

Now that #8486 landed, allowing us to resolve aliases in machine instructions, we have ensured that all VReg aliases are resolved by the time we're done building the VCode. Therefore we only need to keep track of the aliases map before that. The VReg allocator is also dropped when we finish building the VCode, and it makes sense to track aliases there. This lets us maintain an invariant, that PCC facts are only stored on VRegs which are not aliased, while only reasoning locally within VRegAllocator. I've changed the trace-log output to print the VCode immediately before it's finalized, along with key details in the VRegAllocator. This allows seeing the instructions before aliases are rewritten, although they're in reverse order at that point. There's another trace-log message somewhere else which logs the finalized VCode, so you can see both. Previously, the initial capacity of the vreg_aliases map was set to ten times the number of basic blocks in the function. However we can make a better estimate based on the number of SSA values in the function, and use that to preallocate storage for other things in VRegAllocator too. Keeping the aliases outside the VCode fixes previous borrow-checker challenges, which is a nice bonus.

This establishes the property that the VCode's various lists of ranges each fully cover the index range of another list. Previously, the block_succ_range list covered the first half of block_succs_preds, and the block_pred_range list covered the second half. While I was in the area, I replaced the O(n log n) sort in compute_preds_from_succs with a linear-time counting sort, which uses less temporary storage and directly computes the ranges we want as a byproduct.
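The counting-sort idea can be sketched as follows. This is a minimal illustration, not Cranelift's actual `compute_preds_from_succs`: the function name `preds_from_succs` and the `Vec<Vec<usize>>` input shape are assumptions made for the example. It counts incoming edges per block, turns the counts into range boundaries with a prefix sum, and then scatters each predecessor into its slot, so the per-block ranges fall out as a byproduct.

```rust
/// Build a flat predecessor list from per-block successor lists with a
/// counting sort. Hypothetical helper for illustration only.
///
/// Returns `(preds, bounds)` where block `b`'s predecessors are
/// `preds[bounds[b]..bounds[b + 1]]`.
fn preds_from_succs(succs: &[Vec<usize>]) -> (Vec<usize>, Vec<usize>) {
    let n = succs.len();

    // Pass 1: count incoming edges for each block, shifted by one slot so
    // the prefix sum below yields start offsets directly.
    let mut bounds = vec![0usize; n + 1];
    for list in succs {
        for &succ in list {
            bounds[succ + 1] += 1;
        }
    }

    // Prefix sum: bounds[b]..bounds[b + 1] is now block b's range.
    for b in 0..n {
        bounds[b + 1] += bounds[b];
    }

    // Pass 2: scatter each edge's source block into the target's range.
    let mut cursor = bounds.clone();
    let mut preds = vec![0usize; bounds[n]];
    for (pred, list) in succs.iter().enumerate() {
        for &succ in list {
            preds[cursor[succ]] = pred;
            cursor[succ] += 1;
        }
    }

    (preds, bounds)
}
```

Both passes touch each edge once, so the work is linear in the number of blocks plus edges.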

…dule (#8501) * Expose `wasmtime-runtime` as `crate::runtime::vm` internally for the `wasmtime` crate * Rewrite uses of `wasmtime_runtime` to `crate::runtime::vm` * Remove dep on `wasmtime-runtime` from `wasmtime-cli` * Move the `wasmtime-runtime` crate into the `wasmtime::runtime::vm` module * Update labeler for merged crates * Fix `publish verify` prtest:full

* cranelift: add icmp-of-icmp rules for comparisons with 1 * cranelift: add tests for icmp-of-icmp rule

* wasmtime: Use ConstExpr for element segment offsets This shouldn't change any behavior currently, but prepares us for supporting extended constant expressions. * Fix clippy::cast_sign_loss lint

* cranelift/x64: Fix XmmRmREvex pretty-printing

  The operand collector had these operands in src1/src2/dst order, but the pretty-printer fetched the allocations in dst/src1/src2 order instead. Although our pretty-printer looked like it was printing src1/src2/dst, because it consumed operands in the wrong order, what it actually printed was src2/dst/src1.

  Meanwhile, Capstone actually uses src2/src1/dst order in AT&T mode. (GNU objdump agrees.)

  In the only filetest covering the vpsraq instruction, our output agreed with Capstone because register allocation picked the same register for both src1 and dst, so the two orders were indistinguishable. I've extended the filetest to force register allocation to pick different registers.

  This format is also used for vpmullq, but we didn't have any compile filetests covering that instruction, so I've added one with the same register allocation pattern.

  Now our pretty-printer agrees with Capstone on both instructions.

* Fix emit-tests and vpermi2b

  This test for vpmullq had what we have now determined is the wrong order for src1 and src2.

  There were no emit-tests for vpsraq, so I added one.

  The vpermi2b tests used the wrong form of the Inst enum, judging by the assertions that are in x64_get_operands (which is not exercised by emit tests) and the fact that we never use that form for that instruction anywhere else.

  Pretty-printing vpermi2b disagreed with Capstone in the same way as vpsraq and vpmullq. I've fixed that form to agree with Capstone as well, aside from the duplicated src1/dst operand which are required to be different before register allocation and equal afterward.

These lists of ranges always cover contiguous ranges of an index space, meaning the start of one range is the same as the end of the previous range, so we can cut storage in half by only storing one endpoint of each range. This in turn means we don't have to keep track of the other endpoint while building these lists, reducing the state we need to keep while building vcode and simplifying the various build steps.
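A minimal sketch of the one-endpoint representation, assuming a hypothetical `Ranges` type (the name and methods are illustrative and are not taken from the source): only the end of each range is stored, and the start is recovered from the previous entry.

```rust
use std::ops::Range;

/// Contiguous ranges over an index space, stored as end offsets only.
/// Hypothetical type for illustration.
#[derive(Default)]
struct Ranges {
    ends: Vec<u32>,
}

impl Ranges {
    /// Append a range that starts where the previous one ended.
    fn push_end(&mut self, end: u32) {
        // Ranges are contiguous, so ends must never decrease.
        debug_assert!(self.ends.last().map_or(true, |&last| last <= end));
        self.ends.push(end);
    }

    /// Number of ranges stored.
    fn len(&self) -> usize {
        self.ends.len()
    }

    /// Recover the full range `i`; its start is the previous range's end.
    fn get(&self, i: usize) -> Range<u32> {
        let start = if i == 0 { 0 } else { self.ends[i - 1] };
        start..self.ends[i]
    }
}
```

With this layout a builder only has to record where each range ends, which is the reduction in bookkeeping state the commit message describes.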

* Wasmtime: add one-entry call-indirect caching.

  In WebAssembly, an indirect call is somewhat slow, because of the indirection required by CFI (control-flow integrity) sandboxing. In particular, a "function pointer" in most source languages compiled to Wasm is represented by an index into a table of funcrefs. The `call_indirect` instruction then has to do the following steps to invoke a function pointer:

  - Load the funcref table's base and length values from the vmctx.
  - Bounds-check the invoked index against the actual table size; trap if out-of-bounds.
  - Spectre mitigation (cmove) on that bounds-check.
  - Load the `vmfuncref` from the table given base and index.
    - For lazy table init, check if this is a non-initialized funcref pointer, and initialize the entry.
  - Load the signature from the funcref struct and compare it against the `call_indirect`'s expected signature; trap if wrong.
  - Load the actual code pointer for the callee's Wasm-ABI entry point.
  - Load the callee vmctx (which may be different for a cross-module call).
  - Put that vmctx in arg 0, our vmctx in arg 1, and invoke the loaded code pointer with an indirect call instruction.

  Compare and contrast to the process involved in invoking a native function pointer:

  - Invoke the code pointer with an indirect call instruction.

  This overhead buys us something -- it is part of the SFI sandbox boundary -- but it is very repetitive and unnecessary work in *most* cases when indirect function calls are performed repeatedly (such as within an inner loop).

  This PR introduces the idea of *caching*: if we know that the result of all the above checks won't change, then if we use the same index as "the last time" (for some definition), we can skip straight to the "invoke the code pointer" step, with a cached code pointer from that last time.

  Concretely, it introduces a two-word struct inlined into the vmctx for each `call_indirect` instruction in the module (up to a limit):

  - The last invoked index;
  - The code pointer that index corresponded to.

  When compiling the module, we check whether the table could possibly be mutable at a given index once read: any instructions like `table.set`, or the whole table exported thus writable from the outside. We also check whether index 0 is a non-null funcref. If neither of these things are true, then we know we can cache an index-to-code-pointer mapping, and we know we can use index 0 as a sentinel for "no cached value".

  We then make use of the struct for each indirect call site and generate code to check if the index matches; if so, call cached pointer; if not, load the vmfuncref, check the signature, check that the callee vmctx is the same as caller (intra-module case), and stash the code pointer and index away (fill the cache), then make the call.

  On an in-development branch of SpiderMonkey-in-Wasm with ICs (using indirect calls), this is about a 20% speedup; I haven't yet measured on other benchmarks. It is expected that this might be an instantiation-time slowdown due to a larger vmctx (but we could use madvise to zero if needed).

  This feature is off by default right now.

* Addressed review feedback.
* Added some more comments.
* Allow unused VMCallIndirectCache struct (defined for parity with other bits but not needed in actual runtime).
* Add a limit to the number of call-indirect cache slots.
* Fix merge conflict: handle ConstOp element offset.
* Review feedback.
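The fast-path/slow-path split described above can be modeled in plain Rust. This is a simplified sketch, not Wasmtime's generated code: `FuncRef`, `CacheSlot`, `CallSite`, `Trap`, and the `slow_paths` counter are all hypothetical names, the table is an immutable slice (standing in for "the table cannot be mutated"), and the callee-vmctx check and Spectre mitigation are omitted.

```rust
/// What the slow path reads from the table: a signature id and a code pointer.
#[derive(Clone, Copy)]
struct FuncRef {
    sig: u32,
    code: fn(i32) -> i32,
}

/// Per-call-site cache: last invoked index and its code pointer.
/// Index 0 doubles as the "nothing cached" sentinel.
struct CacheSlot {
    index: u32,
    code: Option<fn(i32) -> i32>,
}

#[derive(Debug, PartialEq)]
enum Trap {
    OutOfBounds,
    NullFuncRef,
    BadSignature,
}

/// One `call_indirect` site with a fixed expected signature.
struct CallSite {
    expected_sig: u32,
    slot: CacheSlot,
    /// Counts slow-path executions so the effect of the cache is observable.
    slow_paths: u32,
}

impl CallSite {
    fn new(expected_sig: u32) -> Self {
        CallSite { expected_sig, slot: CacheSlot { index: 0, code: None }, slow_paths: 0 }
    }

    fn call_indirect(&mut self, table: &[Option<FuncRef>], index: u32, arg: i32) -> Result<i32, Trap> {
        // Fast path: same index as last time, so every check already passed.
        if index != 0 && index == self.slot.index {
            if let Some(code) = self.slot.code {
                return Ok(code(arg));
            }
        }

        // Slow path: bounds check, null check, signature check, then fill the cache.
        self.slow_paths += 1;
        let entry = *table.get(index as usize).ok_or(Trap::OutOfBounds)?;
        let func = entry.ok_or(Trap::NullFuncRef)?;
        if func.sig != self.expected_sig {
            return Err(Trap::BadSignature);
        }
        self.slot = CacheSlot { index, code: Some(func.code) };
        Ok((func.code)(arg))
    }
}

// Two sample callees used to populate a table.
fn double(x: i32) -> i32 { x * 2 }
fn inc(x: i32) -> i32 { x + 1 }
```

Because index 0 is known to hold a null funcref, a call through index 0 can never match the cache and always reaches the slow path, where it traps as usual.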

* Migrate the wasmtime-types crate to no_std

  This commit is where no_std for Wasmtime starts to get a bit interesting. Specifically the `wasmtime-types` crate is the first crate that depends on some nontrivial crates that also need to be migrated to `no_std`. This PR disables the default feature of `wasmparser` by default and additionally does the same for `serde`. This enables them to compile in `no_std` contexts by default and default features will be enabled elsewhere in this repository as necessary.

  This also opts to drop the `thiserror` dependency entirely in favor of a manual `Display` implementation with a cfg'd implementation of `Error`.

  As before CI checks are added for `wasmtime-types` with a `no_std` target itself to ensure the crate and all dependencies all avoid `std`.

* Fix adapter build
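The "manual `Display` plus cfg'd `Error`" pattern might look roughly like this. The `TypeError` enum, its variants, and the `std` feature name are assumptions for illustration; they are not the actual types in `wasmtime-types`.

```rust
use core::fmt;

/// Hypothetical error type standing in for one that previously used
/// `#[derive(thiserror::Error)]`.
#[derive(Debug)]
enum TypeError {
    Unsupported(u32),
    TooManyParams { max: usize },
}

// `Display` lives in `core`, so this impl is available without `std`.
impl fmt::Display for TypeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TypeError::Unsupported(code) => write!(f, "unsupported type code {code}"),
            TypeError::TooManyParams { max } => write!(f, "too many parameters (max {max})"),
        }
    }
}

// The `Error` impl is only compiled when an (assumed) `std` feature is enabled.
#[cfg(feature = "std")]
impl std::error::Error for TypeError {}
```

The error messages stay the same for callers; only the `Error` trait impl depends on whether `std` is available.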

This commit skips the safety checks of `AutoAssertNoGc` for the duration of host calls where the types involved are statically known to not perform any GCs (e.g. integers and floats). This helps recover some performance loss from indirect calls made on entry/exit of an `AutoAssertNoGc` scope when the `gc` feature is enabled in Wasmtime.

* Const-propagate some offsets in `VMOffsets`

  Before this commit all offsets to all fields in `VMOffsets` were stored as fields within `VMOffsets` itself. All of the fields at the start of `VMOffsets`, however, are statically known given the pointer size. Notably this means that the use of `HostPtr` in the runtime still was forcing a dynamic lookup of these static offsets.

  This commit refactors this to reflect all static offsets based solely on the pointer size in the `PtrSize` trait, removing all the fields from `VMOffsets`. All the dynamically sized fields, however, remain in `VMOffsets`.

* Fix expected error message

* Update the spec test suite submodule
  * Ignore the `memory64` tests since Wasmtime doesn't implement the table64 extension yet.
  * Ignore `gc/i31.wast` as it's got new tests which Wasmtime doesn't currently pass.
* Only ignore table64-related memory64 tests
* Fix ignoring some tests on Winch

These were originally a SpiderMonkey-ism and have been unused ever since. It was introduced for GC integration, where the runtime could do something to make Cranelift code hit a trap and pause for a GC and then resume execution once GC completed. But it is unclear that, as implemented, this is actually a useful mechanism for doing that (compared to, say, loading from some Well Known page and the GC protecting that page and catching signals to interrupt the mutator, or simply branching and doing a libcall). And if someone has that particular use case in the future (Wasmtime and its GC integration doesn't need exactly this) then we can design something for what is actually needed at that time, instead of carrying this cruft forward forever.

After #8809, the mutator cannot resume from a trap so we don't need to consider them safepoints, as no GC-managed references are live after the trap. The one exception being the `debugtrap` CLIF instruction, which is technically still a resumable trap, but which exists only for emitting the equivalent of an `int3` breakpoint instruction for pausing in a debugger to inspect state, and should never be used for mutator-collector interactions.

* Disable `call-hook` crate feature by default

  This commit disables the `call-hook` feature for the Wasmtime crate added in #8795 by default. The rationale is that this has a slight cost to all embeddings even if the feature isn't used and it's not expected to be that widely used of a feature, so off-by-default seems like a more appropriate default.

* Enable all features in doc build
* More doc fixes

…ion emission (#8812) Forgot to address this in #8810

* Disable memory protection keys by default at compile time

  This commit gates memory protection keys behind a new Cargo feature which is disabled by default. Memory protection keys are already disabled by default on all platforms and are only configured to possibly work with Linux x64. When enabled, however, it unconditionally adds a small amount of overhead to WebAssembly entries/exits even if the feature is disabled at runtime for the same reason that the `call-hook` feature adds overhead. With `call-hook` being disabled by default in #8808 it seemed reasonable to additionally gate memory protection keys to avoid needing to disable features in Wasmtime to get the best performance for wasm<->host calls.

* Enable Wasmtime feature for fuzzing

* Force some more permission checks with 0-length writes

  When a 0-length write is performed try to send the write all the way to the underlying file descriptor to at least check that it's valid to write.

  Closes #8818

* Update crates/test-programs/src/bin/preview1_file_write.rs

  Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

* Allow a second error for Windows as well

---------

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

Looks like wasi-libc is testing for the READDIR right in addition to the READ right in the reported flags. Update write-only files to remove both the READ and READDIR rights accordingly. Closes #8816

* Inherit Linux semantics for `fd_pwrite` with `O_APPEND`

  This commit updates the implementation of `fd_pwrite` in WASI to match Linux semantics for an under-specified corner of WASI. Specifically if `fd_pwrite` is used the offset specified is ignored if the file is opened in append mode and the bytes are instead appended.

  This commit additionally refactors `fd_write` and `fd_pwrite` to have basically the same code with only a minor branch internally when the final write is being performed to help deduplicate more logic.

  Closes #8817

* Ignore new tests on macos

  prtest:full

* Update ignore to all non-linux

* wasi-nn: remove some unnecessary panics from test programs
* Make `libtest-mimic` a workspace dependency
* wasi-nn: use `libtest-mimic` for testing

  wasi-nn's testing story is complicated by different levels of support on different platforms (some backends work on certain architectures, others only work on certain OSes, etc.). This change migrates the `testing` module, which was included in `src`, to exist solely under `tests`. It also dynamically checks whether each test is runnable and then chooses whether to ignore it with a `libtest-mimic` flag. This ensures we can see all the tests all the time and whether they are running or not, which is helpful during development.

* Refactor for more subtle `ignore` behavior

  On any development machine, with no prior setup, we should be able to compile and move past the ignored tests without issue:

  ```console
  $ cargo test -- --quiet
  running 4 tests
  iiii
  ```

  With the proper setup and enabling the right features, tests that are able to run should do so (eliding a bunch of test output):

  ```console
  $ cargo test --all-features -- --quiet
  running 4 tests
  iii.
  ```

  On CI, tests that _should_ pass will fail if they can't run:

  ```console
  $ CI=1 cargo test --all-features -- --quiet
  iFF.
  ```

  prtest:full

* Add missing `use`
* fix: share download lock between checks
* fix: typo, winml usedx preloaded model
* fix: revert to previous winml behavior

  This test was reusing the ONNX test for some reason.

* fix: fully qualify bail!

* Un-nest exports in a component

  This commit flattens the representation of exports in a component to make them more easily indexable without forcing traversal through the hierarchy of instance imports/exports to get there.

* Guarantee type information on component exports

  Don't have it optional in some cases and present in others, instead ensure there's type information for all component exports immediately available.

* Refactor how component instance exports are loaded

  This commit is a change to Wasmtime's public API for `wasmtime::component::Instance` that reorganizes how component exports are loaded. Previously there was a system where `Instance::exports()` was called and the result was sort of "iterated over" in a builder-style pattern to acquire the actual export desired. This required lifetime trickery for nested instances and some unfortunate API bloat. The major downside of this approach is that it requires unconditional string lookups at runtime for exports and additionally does not serve as a great place to implement the semver-compatible logic of #8395. The goal of this refactoring is to pave the way to improving this.

  The new APIs for loading exports now look a bit more similar to what's available for core modules. Notably there's a new `Component::export_index` method which enables performing a string lookup and returning an index. This index can in turn be passed to `Instance::get_*` to skip the string lookup when exports are loaded.

  The `Instance::exports` API is then entirely removed and dismantled. The only piece remaining is the ability to load nested exports which is done through an `Option` parameter to `Component::export_index`. The way to load a nested instance is now to first look up the instance with `None` as this parameter and then pass the instance itself as `Some` to look up an export of that instance. This removes the need for a recursive-style lifetime-juggling API from wasmtime and in theory helps simplify the usage of loading exports.

* Update `bindgen!` generated structures for exports

  This commit updates the output of `bindgen!` to have a different setup for exports of worlds to handle the changes from the previous commit. This introduces new `*Pre` structures which are generated alongside the existing `Guest` structures for example. The `*Pre` versions contain `ComponentExportIndex` from the previous commit and serve as a path to accelerating instantiation because all name lookups are skipped.

* Update test expectations for `bindgen!`-generated output
* Review comments
* Fix doc link

* Update the wasi_testsuite submodule

  This commit updates the wasi_testsuite submodule which we haven't updated in a little over a year and applies a few small fixes but mostly ignores new tests.

* Add another ignore

…es (#8792)

* wasi-adapter: Implement provider crate that embeds the adapter binaries
* Upgrade wasi adapters to the latest version
* Update adapter docs
* Recompile wasi adapters with 1.78
* Recompile wasi adapters with 1.79
* Add some debugging to adapter build script
* Fix script debugging
* Compute wasi adapter version based on latest adapter commit hash
* Try to bless wasi adapters again
* Try to work around CI auto-merges
* Revert to just using workspace version
* Add the wasi adapter provider to the crate publication list
* Use wasi adapter provider in artifacts test + explicit MSRV in CI
* Explicit adapter crate version
* Small fix
* Remove version info from adapter metadata
* Check but don't install rust toolchain in build script
* Bless after rebase

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>

…8827)

… Cranelift and Wasmtime (#8826)

* Introduce the `cranelift-bitset` crate

  The eventual goal is to deduplicate bitset types between Cranelift and Wasmtime, especially their use in stack maps.

* Use the `cranelift-bitset` crate inside both Cranelift and Wasmtime

  Mostly for stack maps, also for a variety of other random things where `cranelift_codegen::bitset::BitSet` was previously used.

* Fix stack maps unit test in cranelift-codegen
* Uncomment `no_std` declaration
* Fix `CompoundBitSet::reserve` method
* Fix `CompoundBitSet::insert` method
* Keep track of the max in a `CompoundBitSet`

  Makes a bunch of other stuff easier, and will be needed for replacing `cranelift_entity::EntitySet`'s bitset with this thing anyways.

* Add missing parens
* Fix a bug around insert and reserve
* Implement `with_capacity` in terms of `new` and `reserve`
* Rename `reserve` to `ensure_capacity`
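As a rough picture of the shape these commits describe (a growable bitset that tracks its maximum element, with `with_capacity` built from `new` plus `ensure_capacity`), here is a small sketch. `GrowableBitSet` is a hypothetical type written for this example; it is not the real `cranelift_bitset::CompoundBitSet` and its method signatures are assumptions.

```rust
/// Growable bitset sketch. Hypothetical; not the `cranelift-bitset` API.
#[derive(Default)]
struct GrowableBitSet {
    words: Vec<u64>,
    /// Largest element ever inserted, if any.
    max: Option<usize>,
}

impl GrowableBitSet {
    fn new() -> Self {
        Self::default()
    }

    /// Built from `new` and `ensure_capacity`, mirroring the commit list above.
    fn with_capacity(capacity: usize) -> Self {
        let mut set = Self::new();
        set.ensure_capacity(capacity);
        set
    }

    /// Make sure indices `0..capacity` fit without further growth.
    fn ensure_capacity(&mut self, capacity: usize) {
        let words_needed = (capacity + 63) / 64;
        if words_needed > self.words.len() {
            self.words.resize(words_needed, 0);
        }
    }

    /// Insert `i`, growing if needed. Returns true if `i` was newly added.
    fn insert(&mut self, i: usize) -> bool {
        self.ensure_capacity(i + 1);
        let mask = 1u64 << (i % 64);
        let is_new = self.words[i / 64] & mask == 0;
        self.words[i / 64] |= mask;
        self.max = Some(self.max.map_or(i, |m| m.max(i)));
        is_new
    }

    fn contains(&self, i: usize) -> bool {
        self.words
            .get(i / 64)
            .map_or(false, |word| *word & (1u64 << (i % 64)) != 0)
    }

    fn max(&self) -> Option<usize> {
        self.max
    }
}
```

Tracking the maximum on insert keeps `max` a constant-time query. This sketch has no removal operation, which would need extra work to keep `max` accurate.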

This commit improves the experience around using the `trappable_error_type` configuration by fixing two issues:

* When an error can't be resolved it doesn't result in an `unwrap()`, instead a first-class error is returned to get reported.
* The name lookup procedure is now consistent with the name lookup that the `with` key does, notably allowing the version to be optional but still supporting the version.

This fixes an issue that came up recently where a path with a version was specified but the old lookup logic ended up requiring that the version wasn't specified because there was only one package with that version. This behavior resulted in a panic with a very long backtrace-based error message which was confusing to parse. By returning an error the error is much more succinct and by supporting more names the original intuition will work.

* Implement semver compatibility for exports

  This commit is an implementation of component model semver compatibility for export lookups. Previously in #7994 component imports were made semver-aware to ensure that bumping version numbers would not be a breaking change. This commit implements the same feature for component exports. This required some refactoring to move the definition of semver compat around and the previous refactoring in #8786 enables frontloading this work to happen before instantiation.

  Closes #8395

* Review comments
* Fix tests

* wasmtime: Add profile markers around host-calls

  The output of the guest profiler can be misleading around hostcalls. Whatever happened to be the last sample before the hostcall appears to run for the entire time of the hostcall. This change ensures that we can see the actual call stack at the time of the hostcall, and get a visual indication of which periods are not spent executing guest code.

* wasmtime-cli needs wasmtime/call-hook, but wasmtime itself doesn't

  In general, embedders that wish to use the new functionality likely will need to enable the wasmtime/call-hook feature in order to get Wasmtime to notify them of when to call into the profiler. However embedders could consider other alternatives, such as calling the profiler from selected hostcall implementations.

…et::CompoundBitSet` (#8834) * cranelift-entity: Implement `EntitySet` in terms of `cranelift_bitset::CompoundBitSet` * Shrink the size of `CompoundBitSet` so we don't perturb vmctx size test expectations * Update vmctx size test expectations anyways because we shrunk "too much" * Move `cranelift-bitset` to the front of `CRATES_TO_PUBLISH`

)

* Add tests for patterns I'm about to optimize
* x64: Optimize vector compare-and-branch

  This commit implements lowering optimizations for the `vall_true` and `vany_true` CLIF instructions when combined with `brif`. This is in the same manner as `icmp` and `fcmp` combined with `brif` where the result of the comparison is never materialized into a general purpose register which helps lower register pressure and remove some instructions.

* x64: Optimize `vconst` with an all-ones pattern

  This has a single-instruction lowering which doesn't load from memory so it's probably cheaper than loading all-ones from memory.

This commit removes the `simm32` extractor from lowerings as it's not as useful as it was when it was first introduced. Nowadays an `Imm64` needs to be interpreted with the type known as well to understand whether the bits being masked off are significant or not. The old `simm32` extractor only took `Imm64` meaning that it was unable to do this and wouldn't match negative numbers. This is because the high 32 bits of `Imm64` were always zero and `simm32` would take the `i64` value from `Imm64` and try to convert it to an `i32`.

This commit replaces `simm32`, and uses of it, with a new extractor `i32_from_iconst`. This matches the preexisting `i64_from_iconst` and is able to take the type of the value into account and produce a correctly sign-extended value.

cc #8706
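The difference between a type-unaware and a type-aware conversion can be shown with plain integers. The two functions below are hypothetical stand-ins written for this example (they are not the ISLE extractors themselves): one converts the raw 64-bit payload directly, the other first sign-extends from the value's bit width.

```rust
/// Type-unaware conversion: treat the raw 64-bit payload as an `i64` and
/// try to narrow it. Hypothetical stand-in for the old behavior.
fn naive_simm32(raw: u64) -> Option<i32> {
    i32::try_from(raw as i64).ok()
}

/// Type-aware conversion: sign-extend the low `bits` bits first, then
/// narrow. Hypothetical stand-in for the idea behind `i32_from_iconst`.
fn typed_i32_from_const(raw: u64, bits: u32) -> Option<i32> {
    assert!((1..=64).contains(&bits));
    let shift = 64 - bits;
    // Shift the value's sign bit up to bit 63, then arithmetic-shift back.
    let value = ((raw << shift) as i64) >> shift;
    i32::try_from(value).ok()
}
```

For a 32-bit constant `-1` stored with its upper 32 bits zeroed (`0xFFFF_FFFF`), the naive version sees 4294967295 and rejects it, while the typed version recovers `-1`. The same payload as a 64-bit constant really is 4294967295 and is rejected by both.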

…et (#8843)

I noticed that the wasm_memory64 flag was left out of Config's debug impl, so rather than add it, I decided to use the `bitflags::Flags::FLAGS` const to iterate the complete set of flags.

The downside of this change is that it will print flags which do not have a setter in Config, e.g. `wasm_component_model_nested_names`.

An alternative to this change is, rather than expanding out the single `features: WasmFeatures` member into many different debug_struct fields, the debug impl of WasmFeatures is used.

Here is a sample debug of Config with this change:

    Config { debug_info: None, wasm_mutable_global: true, wasm_saturating_float_to_int: true, wasm_sign_extension: true, wasm_reference_types: true, wasm_multi_value: true, wasm_bulk_memory: true, wasm_simd: true, wasm_relaxed_simd: false, wasm_threads: false, wasm_shared_everything_threads: false, wasm_tail_call: false, wasm_floats: true, wasm_multi_memory: false, wasm_exceptions: false, wasm_memory64: false, wasm_extended_const: false, wasm_component_model: false, wasm_function_references: false, wasm_memory_control: false, wasm_gc: false, wasm_custom_page_sizes: false, wasm_component_model_values: false, wasm_component_model_nested_names: false, parallel_compilation: true, compiler_config: CompilerConfig { strategy: Some(Cranelift), target: None, settings: {"opt_level": "speed", "enable_verifier": "true"}, flags: {}, cache_store: None, clif_dir: None, wmemcheck: false }, parse_wasm_debuginfo: false }

This commit raises the default setting of `max_memory_size` in the pooling allocator from 10M to 4G. This won't actually impact the virtual memory reserved in the pooling allocator because we already reserved 6G of virtual memory for each linear memory; this instead allows access to all of it by default. This matches the default behavior of Wasmtime for the non-pooling allocator which is to not artificially limit memory by default.

The main impact of this setting is that the memory-protection-keys feature, which is disabled by default, will have no effect by default unless `max_memory_size` is also configured to something smaller than 4G. The documentation has been updated to this effect.

Closes #8846

This was accidentally broken in #8692. It turns out bitcasts from i128 to i128 are legal, and that PR accidentally reverted that use case. This is now added to a runtest to ensure it works on all platforms.

The epoch interruption implementation caches the current deadline in a register, and avoids reloading that cache until the cached deadline has passed. However, the first epoch check happens immediately after the cache has been populated on function entry, so there's never a reason to reload the cache at that point. It only needs to be reloaded in loops. So this commit eliminates the double-check on function entry. When Cranelift optimizations are enabled, the alias analysis pass correctly detected that this load was redundant, and the egraph pass optimized away the `icmp` as well. However, since we don't do conditional constant propagation, the branch couldn't be optimized away. On x86 this lowered to a redundant `cmp`/`jae` pair of instructions in every function entry, which this commit eliminates. To keep track of what code we're generating for epoch interruptions, I've also added disas tests with a trivial infinite loop.

The identifier for the `cold` calling convention overlaps with the `cold` keyword for basic blocks so handle another kind of token when parsing signatures.

pull bot added the ⤵️ pull label Oct 10, 2023

jameysharp and others added 29 commits April 26, 2024 11:39

Encapsulate StableMap, StableSet and DisjointSets (#8491)

657e5dc

* separate out stable map and set to prevent access to internals * separate out disjointset to prevent access to internals * fix tests

cranelift: Pass source location through directly (#8471)

3e87883

Stashing the source location in a side location is unnecessary: we already have it where we need it.

Add the Ref::null constructor and use it in a few places (#8492)

6232904

Just a small follow up to #8481

typos in comments (#8495)

1f39f63

* typos in comments * typos in comments and one fuzzing error message * revert typo under the deps directory

Fxhash to rustchash (#8498)

132ef1e

* move fx hash to workspace level dep * change internal fxhash to use fxhash crate * remove unneeded HashSet import * change fxhash crate to rustc hash * undo migration to rustc hash * manually implement hash function from fxhash * change to rustc hash

cranelift: Add znver4 arch preset (#8502)

9ce1a5d

cranelift: add icmp-of-icmp rules for comparisons with 1 (#8510)

662c35c

* cranelift: add icmp-of-icmp rules for comparisons with 1 * cranelift: add tests for icmp-of-icmp rule

Fix a typo in fuzz config (#8517)

d1d084f

fix docs typo in component::Component (#8512)

66266fa

wasmtime: Use ConstExpr for element segment offsets (#8514)

d4e968c

* wasmtime: Use ConstExpr for element segment offsets This shouldn't change any behavior currently, but prepares us for supporting extended constant expressions. * Fix clippy::cast_sign_loss lint

reassociate mixture of add and sub to wide trees (#8518)

b7ba20f

alexcrichton and others added 30 commits June 14, 2024 18:15

Cranelift: Don't attempt to take stack maps for traps during instruct…

b7fd095

…ion emission (#8812) Forgot to address this in #8810

Fix fcntl(fd, F_GETFL) from wasi-libc with write-only files (#8824)

e97b74d

Looks like wasi-libc is testing for the READDIR right in addition to the READ right in the reported flags. Update write-only files to remove both the READ and READDIR rights accordingly. Closes #8816

Update the wasi_testsuite submodule (#8829)

864fdb6

* Update the wasi_testsuite submodule

  This commit updates the wasi_testsuite submodule which we haven't updated in a little over a year and applies a few small fixes but mostly ignores new tests.

* Add another ignore

cranelift-entity: Remove the unused EntitySet::cardinality method (#…

4705d15

…8827)

Remove leftover debugging dbg! in CLIF parser (#8836)

b5422e5

riscv64: Add support for bitcast.i128 (#8851)

b24b323

This was accidentally broken in #8692. It turns out bitcasts from i128 to i128 are legal, and that PR accidentally reverted that use case. This is now added to a runtest to ensure it works on all platforms.

riscv64: Update Inst::worst_case_size (#8850)

67afe4d

clif: Fix parsing the cold calling convention (#8854)

6b89213

The identifier for the `cold` calling convention overlaps with the `cold` keyword for basic blocks so handle another kind of token when parsing signatures.
