[pull] main from bytecodealliance:main #24

Open
wants to merge 1,067 commits into
base: main

Conversation

pull[bot]

@pull pull bot commented Oct 10, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Oct 10, 2023
jameysharp and others added 29 commits April 26, 2024 11:39
This removes about a million lines from our estimated audit backlog
according to `cargo vet suggest`.

If I understand the Criterion documentation correctly, I believe this
means that generating HTML reports from Criterion benchmarks now
requires having gnuplot installed, because it can't fall back to using
the pure-Rust "plotters" crate.
Currently the CMakeLists.txt is designed to only be consumed by a parent
project, which limits its flexibility. Adding a `project()` function
allows it to also be built as its own project.

Also set `USES_TERMINAL_BUILD` to `TRUE` in `ExternalProject_Add()`,
which allows getting the installation progress output from `cargo
build`.
* winch(amd64): more fpu

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>

* Add tests

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>

---------

Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
* Gate type-builder types from `wasmtime-environ` on `compile`

This commit gates the `*Builder` types related to building sets of types
in the `wasmtime-environ` crate on the `compile` feature. This helps
bring in less code when the feature is disabled and helps exclude some
dependencies for the upcoming `no_std` migration as well.

This commit doesn't change anything, it's just moving code around.

* Remove no-longer-needed import

prtest:full
* Add an example for wasi-nn WinML backend.

This example is similar to classification-component-onnx, but it is not a wasm
component, and it targets the WinML backend. It also includes step-by-step
instructions for running the example.

* Touch up the documentation for this example

This change removes some duplicated information and tweaks some of the
wording.

---------

Co-authored-by: Andrew Brown <andrew.brown@intel.com>
* separate out stable map and set to prevent access to internals

* separate out disjointset to prevent access to internals

* fix tests
Now all registers passed to the operand collector are mutably borrowed
directly out of their original locations in the Inst, so it is possible
to update them in place.

As an initial demonstration of the utility of this change, the results
of the VReg renamer are applied directly to the instructions during
operand collection, and then all VReg aliases are cleared after operand
collection.

Most of this commit consists of deleting noise from the many
`get_operands` implementations in all the backends: most ampersands and
asterisks, and all uses of the `ref` keyword.
This commit adds support for defining array types from Wasm or the host, and
managing them inside the engine's types registry. It does not introduce support
for allocating or manipulating array values. That functionality will come in
future pull requests.
We know the type of each VReg at the moment when we allocate it, so we
never need to set the type again. That also means we don't need to
deduplicate reftyped VRegs as we go, although it's still possible to
have duplicates after aliases are resolved.
Stashing the source location in a side location is unnecessary: we
already have it where we need it.
The operand collector and the instruction emitter for Inst::Mov both
placed the `rm` register before `rd`, so the emitted code was correct,
but the pretty-printer used the opposite order and so printed the
operands backwards. Note that the VCode disassembly disagreed with
Capstone's disassembly of the emitted machine code.
* typos in comments

* typos in comments and one fuzzing error message

* revert typo under the deps directory
This paves the way for more implementations of this OperandVisitor trait
which can do different things with the operands.

Of particular note, this commit introduces a second implementation which
is used only in the s390x backend and only to implement a debug
assertion. Previously, s390x used an OperandCollector::no_reuse_def
method to implement this assertion, but I didn't want to require that
all implementors of the new trait have to provide that method, so this
captures the same check but keeps it local to where it's needed.
Instead of

> Performing build step for
'wasmtime-crate''WASMTIME_CARGO_BINARY-NOTFOUND' is not recognized as an
internal or external command, operable program or batch file.

this will now instead output

> "cargo" was not found. Ensure "cargo" is in PATH. Aborting...
* move fx hash to workspace level dep

* change internal fxhash to use fxhash crate

* remove unneeded HashSet import

* change fxhash crate to rustc hash

* undo migration to rustc hash

* manually implement hash function from fxhash

* change to rustc hash
Now that #8486 landed, allowing us to resolve aliases in machine
instructions, we have ensured that all VReg aliases are resolved by the
time we're done building the VCode. Therefore we only need to keep track
of the aliases map before that.

The VReg allocator is also dropped when we finish building the VCode,
and it makes sense to track aliases there. This lets us maintain an
invariant, that PCC facts are only stored on VRegs which are not
aliased, while only reasoning locally within VRegAllocator.

I've changed the trace-log output to print the VCode immediately before
it's finalized, along with key details in the VRegAllocator. This allows
seeing the instructions before aliases are rewritten, although they're
in reverse order at that point. There's another trace-log message
somewhere else which logs the finalized VCode, so you can see both.

Previously, the initial capacity of the vreg_aliases map was set to ten
times the number of basic blocks in the function. However we can make a
better estimate based on the number of SSA values in the function, and
use that to preallocate storage for other things in VRegAllocator too.

Keeping the aliases outside the VCode fixes previous borrow-checker
challenges, which is a nice bonus.
This establishes the property that the VCode's various lists of ranges
each fully cover the index range of another list. Previously, the
block_succ_range list covered the first half of block_succs_preds, and
the block_pred_range list covered the second half.

While I was in the area, I replaced the O(n log n) sort in
compute_preds_from_succs with a linear-time counting sort, which uses
less temporary storage and directly computes the ranges we want as a
byproduct.
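
The counting-sort idea can be sketched roughly as follows. This is a
stand-alone illustration, not the actual `compute_preds_from_succs` code: the
function name `preds_from_succs`, the `(from, to)` edge list, and the `u32`
indices are all assumptions made for the example.

```rust
/// Given the number of blocks and a list of `(from, to)` edges, returns
/// `(pred_ends, preds)`. The predecessors of block `b` are
/// `preds[start..pred_ends[b]]`, where `start` is `pred_ends[b - 1]`
/// (or 0 for the first block).
fn preds_from_succs(num_blocks: usize, edges: &[(u32, u32)]) -> (Vec<u32>, Vec<u32>) {
    // Pass 1: count incoming edges per block.
    let mut cursors = vec![0u32; num_blocks];
    for &(_, to) in edges {
        cursors[to as usize] += 1;
    }
    // Pass 2: an exclusive prefix sum turns counts into start offsets.
    let mut total = 0u32;
    for c in cursors.iter_mut() {
        let count = *c;
        *c = total;
        total += count;
    }
    // Pass 3: write each predecessor at its block's cursor, then advance it.
    let mut preds = vec![0u32; edges.len()];
    for &(from, to) in edges {
        let cursor = &mut cursors[to as usize];
        preds[*cursor as usize] = from;
        *cursor += 1;
    }
    // Every cursor has now advanced to the end of its block's range, so the
    // cursor array doubles as the list of range endpoints.
    (cursors, preds)
}
```

Each pass is linear in the number of blocks or edges, and the only temporary
storage is the cursor array, which is reused as the output ranges.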
…dule (#8501)

* Expose `wasmtime-runtime` as `crate::runtime::vm` internally for the `wasmtime` crate

* Rewrite uses of `wasmtime_runtime` to `crate::runtime::vm`

* Remove dep on `wasmtime-runtime` from `wasmtime-cli`

* Move the `wasmtime-runtime` crate into the `wasmtime::runtime::vm` module

* Update labeler for merged crates

* Fix `publish verify`

prtest:full
* cranelift: add icmp-of-icmp rules for comparisons with 1

* cranelift: add tests for icmp-of-icmp rule
* wasmtime: Use ConstExpr for element segment offsets

This shouldn't change any behavior currently, but prepares us for
supporting extended constant expressions.

* Fix clippy::cast_sign_loss lint
* cranelift/x64: Fix XmmRmREvex pretty-printing

The operand collector had these operands in src1/src2/dst order, but the
pretty-printer fetched the allocations in dst/src1/src2 order instead.

Although our pretty-printer looked like it was printing src1/src2/dst,
because it consumed operands in the wrong order, what it actually
printed was src2/dst/src1.

Meanwhile, Capstone actually uses src2/src1/dst order in AT&T mode. (GNU
objdump agrees.)

In the only filetest covering the vpsraq instruction, our output agreed
with Capstone because register allocation picked the same register for
both src1 and dst, so the two orders were indistinguishable. I've
extended the filetest to force register allocation to pick different
registers.

This format is also used for vpmullq, but we didn't have any compile
filetests covering that instruction, so I've added one with the same
register allocation pattern.

Now our pretty-printer agrees with Capstone on both instructions.

* Fix emit-tests and vpermi2b

This test for vpmullq had what we have now determined is the wrong order
for src1 and src2.

There were no emit-tests for vpsraq, so I added one.

The vpermi2b tests used the wrong form of the Inst enum, judging by the
assertions that are in x64_get_operands (which is not exercised by emit
tests) and the fact that we never use that form for that instruction
anywhere else.

Pretty-printing vpermi2b disagreed with Capstone in the same way as
vpsraq and vpmullq. I've fixed that form to agree with Capstone as well,
aside from the duplicated src1/dst operands, which are required to be
different before register allocation and equal afterward.
These lists of ranges always cover contiguous ranges of an index space,
meaning the start of one range is the same as the end of the previous
range, so we can cut storage in half by only storing one endpoint of
each range.

This in turn means we don't have to keep track of the other endpoint
while building these lists, reducing the state we need to keep while
building vcode and simplifying the various build steps.
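
A minimal sketch of the single-endpoint representation, assuming a
hypothetical `Ranges` type (the name and methods here are illustrative and not
taken from the Cranelift source):

```rust
/// Contiguous ranges over an index space, stored as one end offset per range.
/// Range `i` starts where range `i - 1` ends; the first range starts at 0.
#[derive(Default)]
struct Ranges {
    ends: Vec<u32>,
}

impl Ranges {
    /// Appends a range that begins at the previous range's end.
    fn push_end(&mut self, end: u32) {
        debug_assert!(self.ends.last().map_or(true, |&prev| prev <= end));
        self.ends.push(end);
    }

    /// Number of ranges stored.
    fn len(&self) -> usize {
        self.ends.len()
    }

    /// Reconstructs range `i` as `start..end` from the neighbouring endpoint.
    fn get(&self, i: usize) -> std::ops::Range<usize> {
        let start = if i == 0 { 0 } else { self.ends[i - 1] as usize };
        start..self.ends[i] as usize
    }
}
```

Because the builder only ever appends the current end offset, it does not need
to remember where the current range started.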
* Wasmtime: add one-entry call-indirect caching.

In WebAssembly, an indirect call is somewhat slow, because of the
indirection required by CFI (control-flow integrity) sandboxing. In
particular, a "function pointer" in most source languages compiled to
Wasm is represented by an index into a table of funcrefs. The
`call_indirect` instruction then has to do the following steps to invoke
a function pointer:

- Load the funcref table's base and length values from the vmctx.
- Bounds-check the invoked index against the actual table size; trap if
  out-of-bounds.
- Spectre mitigation (cmove) on that bounds-check.
- Load the `vmfuncref` from the table given base and index.
  - For lazy table init, check if this is a non-initialized funcref
    pointer, and initialize the entry.
- Load the signature from the funcref struct and compare it against the
  `call_indirect`'s expected signature; trap if wrong.
- Load the actual code pointer for the callee's Wasm-ABI entry point.
- Load the callee vmctx (which may be different for a cross-module
  call).
- Put that vmctx in arg 0, our vmctx in arg 1, and invoke the loaded
  code pointer with an indirect call instruction.

Compare and contrast to the process involved in invoking a native
function pointer:

- Invoke the code pointer with an indirect call instruction.

This overhead buys us something -- it is part of the SFI sandbox
boundary -- but it is very repetitive and unnecessary work in *most*
cases when indirect function calls are performed repeatedly (such as
within an inner loop).

This PR introduces the idea of *caching*: if we know that the result of
all the above checks won't change, then if we use the same index as "the
last time" (for some definition), we can skip straight to the "invoke
the code pointer" step, with a cached code pointer from that last time.

Concretely, it introduces a two-word struct inlined into the vmctx for
each `call_indirect` instruction in the module (up to a limit):

- The last invoked index;
- The code pointer that index corresponded to.

When compiling the module, we check whether the table could possibly be
mutable at a given index once read: any instructions like `table.set`,
or the whole table exported thus writable from the outside. We also
check whether index 0 is a non-null funcref. If neither of these things
are true, then we know we can cache an index-to-code-pointer mapping,
and we know we can use index 0 as a sentinel for "no cached value".

We then make use of the struct for each indirect call site and generate
code to check if the index matches; if so, call cached pointer; if not,
load the vmfuncref, check the signature, check that the callee vmctx is
the same as caller (intra-module case), and stash the code pointer and
index away (fill the cache), then make the call.
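
The check-then-fill logic can be modelled in plain Rust as below. This is only
a simulation of the idea under simplifying assumptions: the real mechanism is
generated machine code operating on the vmctx, and the `FuncRef`, `CallSite`,
and `Instance` types, their fields, and the `slow_paths` counter are all
hypothetical names invented for this sketch.

```rust
type Code = fn(u32) -> u32;

#[derive(Clone, Copy)]
struct FuncRef {
    sig: u32,
    vmctx: u32,
    code: Code,
}

/// Per-call-site state: a statically known signature plus the two-word cache.
/// `cached_index == 0` means "empty", which is only sound because table
/// slot 0 is assumed to hold a null funcref.
struct CallSite {
    expected_sig: u32,
    cached_index: u32,
    cached_code: Option<Code>,
}

struct Instance {
    vmctx: u32,
    table: Vec<Option<FuncRef>>, // assumed immutable after instantiation
    slow_paths: u32,             // counts full lookups, for illustration only
}

impl Instance {
    fn call_indirect(&mut self, site: &mut CallSite, index: u32, arg: u32) -> Result<u32, &'static str> {
        // Fast path: same index as last time, so reuse the cached code pointer.
        if index != 0 && index == site.cached_index {
            if let Some(code) = site.cached_code {
                return Ok(code(arg));
            }
        }
        // Slow path: bounds check, null check, signature check.
        self.slow_paths += 1;
        let entry = *self.table.get(index as usize).ok_or("table index out of bounds")?;
        let func = entry.ok_or("null funcref")?;
        if func.sig != site.expected_sig {
            return Err("signature mismatch");
        }
        // Only intra-module callees (same vmctx) are cached.
        if func.vmctx == self.vmctx {
            site.cached_index = index;
            site.cached_code = Some(func.code);
        }
        Ok((func.code)(arg))
    }
}

/// Example callee used to exercise the sketch.
fn double(x: u32) -> u32 {
    x * 2
}
```

Calling the same site twice with the same index takes the slow path once and
the cached path afterwards; index 0 always falls through to the slow path and
traps on the null entry.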

On an in-development branch of SpiderMonkey-in-Wasm with ICs (using
indirect calls), this is about a 20% speedup; I haven't yet measured on
other benchmarks. It is expected that this might be an
instantiation-time slowdown due to a larger vmctx (but we could use
madvise to zero if needed).

This feature is off by default right now.

* Addressed review feedback.

* Added some more comments.

* Allow unused VMCallIndirectCache struct (defined for parity with other bits but not needed in actual runtime).

* Add a limit to the number of call-indirect cache slots.

* Fix merge conflict: handle ConstOp element offset.

* Review feedback.
* Migrate the wasmtime-types crate to no_std

This commit is where no_std for Wasmtime starts to get a bit
interesting. Specifically the `wasmtime-types` crate is the first crate
that depends on some nontrivial crates that also need to be migrated to
`no_std`. This PR disables the default feature of `wasmparser` by
default and additionally does the same for `serde`. This enables them to
compile in `no_std` contexts by default and default features will be
enabled elsewhere in this repository as necessary.

This also opts to drop the `thiserror` dependency entirely in favor of a
manual `Display` implementation with a cfg'd implementation of `Error`.
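
The general pattern looks roughly like this. The `WasmError` variants shown
and the `std` feature name are assumptions for the sake of the example, not
copied from the crate:

```rust
use core::fmt;

/// Hypothetical error type standing in for the one in `wasmtime-types`.
#[derive(Debug)]
pub enum WasmError {
    Unsupported(&'static str),
    ImplLimitExceeded,
}

// `Display` lives in `core`, so this impl works with or without `std`.
impl fmt::Display for WasmError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WasmError::Unsupported(what) => write!(f, "unsupported feature: {what}"),
            WasmError::ImplLimitExceeded => write!(f, "implementation limit exceeded"),
        }
    }
}

// Only implement `std::error::Error` when `std` is available
// (the feature name here is an assumption).
#[cfg(feature = "std")]
impl std::error::Error for WasmError {}
```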

As before CI checks are added for `wasmtime-types` with a `no_std`
target itself to ensure the crate and all dependencies all avoid `std`.

* Fix adapter build
alexcrichton and others added 30 commits June 14, 2024 18:15
This commit skips the safety checks of `AutoAssertNoGc` for the duration
of host calls where the types involved are statically known to not
perform any GCs (e.g. integers and floats). This helps recover some
performance loss from indirect calls made on entry/exit of an
`AutoAssertNoGc` scope when the `gc` feature is enabled in Wasmtime.
* Const-propagate some offsets in `VMOffsets`

Before this commit all offsets to all fields in `VMOffsets` were stored
as fields within `VMOffsets` itself. All of the fields at the start of
`VMOffsets`, however, are statically known given the pointer size.
Notably this means that the use of `HostPtr` in the runtime still was
forcing a dynamic lookup of these static offsets.

This commit refactors this to reflect all static offsets based solely on
the pointer size in the `PtrSize` trait, removing all the fields from
`VMOffsets`. All the dynamically sized fields, however, remain in
`VMOffsets`.
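
As a rough sketch of the split, static offsets can be default methods on the
trait while dynamic ones stay as stored fields. The field layout, the method
names other than `size`, and the simplified `VMOffsets` below are hypothetical
and chosen only to illustrate the shape of the refactoring:

```rust
/// Offsets that depend only on the pointer width are default methods, so a
/// zero-sized `HostPtr` lets the compiler fold them to constants.
trait PtrSize {
    fn size(&self) -> u8;

    // Hypothetical static layout: three pointer-sized fields up front.
    fn vmctx_magic(&self) -> u8 {
        0
    }
    fn vmctx_limits(&self) -> u8 {
        self.size()
    }
    fn vmctx_callee(&self) -> u8 {
        2 * self.size()
    }
    fn vmctx_dynamic_begin(&self) -> u8 {
        3 * self.size()
    }
}

/// Pointer size of the host, known at compile time.
struct HostPtr;

impl PtrSize for HostPtr {
    fn size(&self) -> u8 {
        std::mem::size_of::<usize>() as u8
    }
}

/// Pointer size chosen at runtime, e.g. when cross-compiling.
impl PtrSize for u8 {
    fn size(&self) -> u8 {
        *self
    }
}

/// Only offsets that depend on the module's shape remain as stored fields.
struct VMOffsets<P> {
    ptr: P,
    num_tables: u32,
    tables_begin: u32,
}

impl<P: PtrSize> VMOffsets<P> {
    fn new(ptr: P, num_tables: u32) -> Self {
        let tables_begin = u32::from(ptr.vmctx_dynamic_begin());
        Self { ptr, num_tables, tables_begin }
    }

    /// Offset of table `index`, assuming each table takes two pointers.
    fn table(&self, index: u32) -> u32 {
        assert!(index < self.num_tables);
        self.tables_begin + index * 2 * u32::from(self.ptr.size())
    }
}
```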

* Fix expected error message
* Update the spec test suite submodule

* Ignore the `memory64` tests since Wasmtime doesn't implement the
  table64 extension yet.
* Ignore `gc/i31.wast` as it's got new tests which Wasmtime doesn't
  currently pass.

* Only ignore table64-related memory64 tests

* Fix ignoring some tests on Winch
These were originally a SpiderMonkey-ism and have been unused ever
since. They were introduced for GC integration, where the runtime could do
something to make Cranelift code hit a trap and pause for a GC and then resume
execution once GC completed. But it is unclear that, as implemented, this is
actually a useful mechanism for doing that (compared to, say, loading from some
Well Known page and the GC protecting that page and catching signals to
interrupt the mutator, or simply branching and doing a libcall). And if someone
has that particular use case in the future (Wasmtime and its GC integration
doesn't need exactly this) then we can design something for what is actually
needed at that time, instead of carrying this cruft forward forever.
After #8809, the mutator cannot
resume from a trap so we don't need to consider them safepoints, as no
GC-managed references are live after the trap. The one exception is the
`debugtrap` CLIF instruction, which is technically still a resumable trap, but
which exists only for emitting the equivalent of an `int3` breakpoint
instruction for pausing in a debugger to inspect state, and should never be used
for mutator-collector interactions.
* Disable `call-hook` crate feature by default

This commit disables the `call-hook` feature for the Wasmtime crate
added in #8795 by default. The rationale is that this has a slight cost
to all embeddings even if the feature isn't used and it's not expected
to be that widely used of a feature, so off-by-default seems like a more
appropriate default.

* Enable all features in doc build

* More doc fixes
* Disable memory protection keys by default at compile time

This commit gates memory protection keys behind a new Cargo feature
which is disabled by default. Memory protection keys are already
disabled by default on all platforms and are only configured to possibly
work with Linux x64. When enabled, however, it unconditionally adds a
small amount of overhead to WebAssembly entries/exits even if the
feature is disabled at runtime for the same reason that the `call-hook`
feature adds overhead. With `call-hook` being disabled by default
in #8808 it seemed reasonable to additionally gate memory protection
keys to avoid needing to disable features in Wasmtime to get the best
performance for wasm<->host calls.

* Enable Wasmtime feature for fuzzing
* Force some more permission checks with 0-length writes

When a 0-length write is performed try to send the write all the way to
the underlying file descriptor to at least check that it's valid to
write.

Closes #8818

* Update crates/test-programs/src/bin/preview1_file_write.rs

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

* Allow a second error for Windows as well

---------

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
Looks like wasi-libc is testing for the READDIR right in addition to
the READ right in the reported flags. Update write-only files to remove
both the READ and READDIR rights accordingly.

Closes #8816
* Inherit Linux semantics for `fd_pwrite` with `O_APPEND`

This commit updates the implementation of `fd_pwrite` in WASI to match
Linux semantics for an under-specified corner of WASI. Specifically, if
`fd_pwrite` is used, the offset specified is ignored if the file is
opened in append mode, and the bytes are instead appended.

This commit additionally refactors `fd_write` and `fd_pwrite` to have
basically the same code with only a minor branch internally when the
final write is being performed to help deduplicate more logic.
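
The shared-path structure and the append rule can be modelled with an
in-memory file. This is a sketch only: `MemFile` and `write_at` are
hypothetical names, and the real implementation goes through the host
filesystem rather than a `Vec<u8>`.

```rust
/// In-memory stand-in for an open file descriptor.
struct MemFile {
    data: Vec<u8>,
    append: bool,
    cursor: usize,
}

impl MemFile {
    /// Shared write path; `offset` is `Some` for `fd_pwrite`, `None` for `fd_write`.
    fn write_at(&mut self, buf: &[u8], offset: Option<usize>) -> usize {
        // The only branch: in append mode any caller-supplied offset is ignored.
        let pos = if self.append {
            self.data.len()
        } else {
            offset.unwrap_or(self.cursor)
        };
        let end = pos + buf.len();
        if self.data.len() < end {
            self.data.resize(end, 0);
        }
        self.data[pos..end].copy_from_slice(buf);
        // Positional writes leave the cursor untouched.
        if offset.is_none() {
            self.cursor = end;
        }
        buf.len()
    }

    fn fd_write(&mut self, buf: &[u8]) -> usize {
        self.write_at(buf, None)
    }

    fn fd_pwrite(&mut self, buf: &[u8], offset: usize) -> usize {
        self.write_at(buf, Some(offset))
    }
}
```

With `append` set, `fd_pwrite(b"XY", 0)` on a file containing `abc` yields
`abcXY`; without it, the same call overwrites the first two bytes.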

Closes #8817

* Ignore new tests on macos

prtest:full

* Update ignore to all non-linux
* wasi-nn: remove some unnecessary panics from test programs

* Make `libtest-mimic` a workspace dependency

* wasi-nn: use `libtest-mimic` for testing

wasi-nn's testing story is complicated by different levels of support on
different platforms (some backends work on certain architectures, others
only work on certain OSes, etc.). This change migrates the `testing`
module, which was included in `src`, to exist solely under `tests`. It
also dynamically checks whether each test is runnable and then chooses
whether to ignore it with a `libtest-mimic` flag. This ensures we can
see all the tests all the time and whether they are running or not,
which is helpful during development.

* Refactor for more subtle `ignore` behavior

On any development machine, with no prior setup, we should be able to
compile and move past the ignored tests without issue:

```console
$ cargo test -- --quiet
running 4 tests
iiii
```

With the proper setup and enabling the right features, tests that are
able to run should do so (eliding a bunch of test output):

```console
$ cargo test --all-features -- --quiet
running 4 tests
iii.
```

On CI, tests that _should_ pass will fail if they can't run:

```console
$ CI=1 cargo test --all-features -- --quiet
iFF.
```

prtest:full

* Add missing `use`

* fix: share download lock between checks

* fix: typo, winml usedx preloaded model

* fix: revert to previous winml behavior

This test was reusing the ONNX test for some reason.

* fix: fully qualify bail!
* Un-nest exports in a component

This commit flattens the representation of exports in a component to
make them more easily indexable without forcing traversal through the
hierarchy of instance imports/exports to get there.

* Guarantee type information on component exports

Don't have it optional in some cases and present in others, instead
ensure there's type information for all component exports immediately
available.

* Refactor how component instance exports are loaded

This commit is a change to Wasmtime's public API for
`wasmtime::component::Instance` that reorganizes how component exports
are loaded. Previously there was a system where `Instance::exports()`
was called and the result was sort of "iterated over" in a builder-style
pattern to acquire the actual export desired. This required lifetime
trickery for nested instances and some unfortunate API bloat. The major
downside of this approach is that it requires unconditional string
lookups at runtime for exports and additionally does not serve as a
great place to implement the semver-compatible logic of #8395. The goal
of this refactoring is to pave the way to improving this.

The new APIs for loading exports now look a bit more similar to what's
available for core modules. Notably there's a new
`Component::export_index` method which enables performing a string
lookup and returning an index. This index can in turn be passed to
`Instance::get_*` to skip the string lookup when exports are loaded. The
`Instance::exports` API is then entirely removed and dismantled.

The only piece remaining is the ability to load nested exports which is
done through an `Option` parameter to `Component::export_index`. The
way to load a nested instance is now to first look up the instance with
`None` as this parameter and then pass the instance itself as `Some` to look
up an export of that instance. This removes the need for a
recursive-style lifetime-juggling API from wasmtime and in theory helps
simplify the usage of loading exports.

* Update `bindgen!` generated structures for exports

This commit updates the output of `bindgen!` to have a different setup
for exports of worlds to handle the changes from the previous commit.
This introduces new `*Pre` structures which are generated alongside the
existing `Guest` structures for example. The `*Pre` versions contain
`ComponentExportIndex` from the previous commit and serve as a path to
accelerating instantiation because all name lookups are skipped.

* Update test expectations for `bindgen!`-generated output

* Review comments

* Fix doc link
* Update the wasi_testsuite submodule

This commit updates the wasi_testsuite submodule which we haven't
updated in a little over a year and applies a few small fixes but mostly
ignores new tests.

* Add another ignore
…es (#8792)

* wasi-adapter: Implement provider crate that embeds the adapter binaries

* Upgrade wasi adapters to the latest version

* Update adapter docs

* Recompile wasi adapters with 1.78

* Recompile wasi adapters with 1.79

* Add some debugging to adapter build script

* Fix script debugging

* Compute wasi adapter version based on latest adapter commit hash

* Try to bless wasi adapters again

* Try to work around CI auto-merges

* Revert to just using workspace version

* Add the wasi adapter provider to the crate publication list

* Use wasi adapter provider in artifacts test + explicit MSRV in CI

* Explicit adapter crate version

* Small fix

* Remove version info from adapter metadata

* Check but don't install rust toolchain in build script

* Bless after rebase

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
… Cranelift and Wasmtime (#8826)

* Introduce the `cranelift-bitset` crate

The eventual goal is to deduplicate bitset types between Cranelift and Wasmtime,
especially their use in stack maps.

* Use the `cranelift-bitset` crate inside both Cranelift and Wasmtime

Mostly for stack maps, also for a variety of other random things where
`cranelift_codegen::bitset::BitSet` was previously used.

* Fix stack maps unit test in cranelift-codegen

* Uncomment `no_std` declaration

* Fix `CompoundBitSet::reserve` method

* Fix `CompoundBitSet::insert` method

* Keep track of the max in a `CompoundBitSet`

Makes a bunch of other stuff easier, and will be needed for replacing
`cranelift_entity::EntitySet`'s bitset with this thing anyways.

* Add missing parens

* Fix a bug around insert and reserve

* Implement `with_capacity` in terms of `new` and `reserve`

* Rename `reserve` to `ensure_capacity`
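
For orientation, a growable bitset along these lines might look like the
sketch below. It is a simplified illustration, not the `cranelift-bitset`
implementation: the `u64` word size and the field names are assumptions.

```rust
/// Growable bitset backed by 64-bit words, tracking the largest inserted index.
#[derive(Default)]
struct CompoundBitSet {
    words: Vec<u64>,
    max: Option<usize>,
}

impl CompoundBitSet {
    fn new() -> Self {
        Self::default()
    }

    /// Implemented in terms of `new` and `ensure_capacity`.
    fn with_capacity(capacity: usize) -> Self {
        let mut set = Self::new();
        set.ensure_capacity(capacity);
        set
    }

    /// Ensures indices `0..n` can be stored without further growth.
    fn ensure_capacity(&mut self, n: usize) {
        let words = (n + 63) / 64;
        if words > self.words.len() {
            self.words.resize(words, 0);
        }
    }

    /// Inserts `i`, returning `true` if it was not already present.
    fn insert(&mut self, i: usize) -> bool {
        self.ensure_capacity(i + 1);
        let (word, bit) = (i / 64, i % 64);
        let newly_inserted = (self.words[word] & (1u64 << bit)) == 0;
        self.words[word] |= 1u64 << bit;
        self.max = Some(self.max.map_or(i, |m| m.max(i)));
        newly_inserted
    }

    fn contains(&self, i: usize) -> bool {
        self.words
            .get(i / 64)
            .map_or(false, |w| (*w & (1u64 << (i % 64))) != 0)
    }

    /// Largest index ever inserted, if any.
    fn max(&self) -> Option<usize> {
        self.max
    }
}
```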
This commit improves the experience around using the
`trappable_error_type` configuration by fixing two issues:

* When an error can't be resolved it no longer results in an
  `unwrap()` panic; instead a first-class error is returned to get reported.

* The name lookup procedure is now consistent with the name lookup that
  the `with` key does, notably allowing the version to be optional but
  still supporting the version.

This fixes an issue that came up recently where a path with a version
was specified but the old lookup logic ended up requiring that the
version wasn't specified because there was only one package with that
version. This behavior resulted in a panic with a very long
backtrace-based error message which was confusing to parse. By returning
an error the error is much more succinct and by supporting more names
the original intuition will work.
* Implement semver compatibility for exports

This commit is an implementation of component model semver compatibility
for export lookups. Previously in #7994 component imports were made
semver-aware to ensure that bumping version numbers would not be a
breaking change. This commit implements the same feature for component
exports. This required some refactoring to move the definition of semver
compat around and the previous refactoring in #8786 enables frontloading
this work to happen before instantiation.
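
One way to picture semver-aware name matching is to reduce each version to its
compatibility "track". The sketch below follows the usual semver convention
(major version for `1.x.y` and above, `0.minor` for `0.x.y`, exact match
otherwise); the helper names are hypothetical, and treating pre-release and
build-metadata versions as exact-match-only is an assumption. It also ignores
version ordering, which a real implementation may need to consider.

```rust
/// Returns the compatibility track of a version string, or `None` when only
/// an exact match should be considered compatible.
fn compat_track(version: &str) -> Option<String> {
    // Assumption: pre-release and build-metadata versions match exactly only.
    if version.contains('-') || version.contains('+') {
        return None;
    }
    let mut parts = version.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let _patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    if major > 0 {
        Some(major.to_string())
    } else if minor > 0 {
        Some(format!("0.{minor}"))
    } else {
        None
    }
}

/// Compares two `name@version` strings, allowing semver-compatible versions.
fn export_matches(requested: &str, exported: &str) -> bool {
    let (Some((req_name, req_ver)), Some((exp_name, exp_ver))) =
        (requested.split_once('@'), exported.split_once('@'))
    else {
        // Unversioned names must match exactly.
        return requested == exported;
    };
    if req_name != exp_name {
        return false;
    }
    match (compat_track(req_ver), compat_track(exp_ver)) {
        (Some(a), Some(b)) => a == b,
        _ => req_ver == exp_ver,
    }
}
```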

Closes #8395

* Review comments

* Fix tests
* wasmtime: Add profile markers around host-calls

The output of the guest profiler can be misleading around hostcalls.
Whatever happened to be the last sample before the hostcall appears to
run for the entire time of the hostcall. This change ensures that we can
see the actual call stack at the time of the hostcall, and get a visual
indication of which periods are not spent executing guest code.

* wasmtime-cli needs wasmtime/call-hook, but wasmtime itself doesn't

In general, embedders that wish to use the new functionality likely will
need to enable the wasmtime/call-hook feature in order to get Wasmtime
to notify them of when to call into the profiler. However embedders
could consider other alternatives, such as calling the profiler from
selected hostcall implementations.
…et::CompoundBitSet` (#8834)

* cranelift-entity: Implement `EntitySet` in terms of `cranelift_bitset::CompoundBitSet`

* Shrink the size of `CompoundBitSet` so we don't perturb vmctx size test expectations

* Update vmctx size test expectations anyways because we shrunk "too much"

* Move `cranelift-bitset` to the front of `CRATES_TO_PUBLISH`
)

* Add tests for patterns I'm about to optimize

* x64: Optimize vector compare-and-branch

This commit implements lowering optimizations for the `vall_true` and
`vany_true` CLIF instructions when combined with `brif`. This is in the
same manner as `icmp` and `fcmp` combined with `brif` where the result
of the comparison is never materialized into a general purpose register
which helps lower register pressure and remove some instructions.

* x64: Optimize `vconst` with an all-ones pattern

This has a single-instruction lowering which doesn't load from memory so
it's probably cheaper than loading all-ones from memory.
This commit removes the `simm32` extractor from lowerings as it's not as
useful as it was when it was first introduced. Nowadays an `Imm64` needs
to be interpreted with the type known as well to understand whether the bits
being masked off are significant or not. The old `simm32` extractor only
took `Imm64`, meaning that it was unable to do this and wouldn't match
negative numbers. This is because the high 32 bits of `Imm64` were
always zero and `simm32` would take the `i64` value from `Imm64` and try
to convert it to an `i32`.

This commit replaces `simm32`, and uses of it, with a new extractor
`i32_from_iconst`. This matches the preexisting `i64_from_iconst` and is
able to take the type of the value into account and produce a correctly
sign-extended value.
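
The difference can be reproduced with plain integer arithmetic. The two
functions below are models of the behaviour described above, not the ISLE
extractors themselves; representing the immediate as raw `u64` bits plus a
type width in bits is an assumption of the sketch.

```rust
/// Old behaviour (model): reinterpret the raw 64-bit immediate as `i64` and
/// narrow it, without knowing the value's type.
fn simm32_old(imm_bits: u64) -> Option<i32> {
    i32::try_from(imm_bits as i64).ok()
}

/// New behaviour (model): sign-extend from the value's type width first,
/// then narrow.
fn i32_from_iconst(imm_bits: u64, ty_bits: u32) -> Option<i32> {
    debug_assert!((1..=64).contains(&ty_bits));
    let shift = 64 - ty_bits;
    // Shift the type's sign bit up to bit 63, then arithmetic-shift back down.
    let value = ((imm_bits << shift) as i64) >> shift;
    i32::try_from(value).ok()
}
```

An `i32` constant of -1 stored with zeroed high bits (`0xFFFF_FFFF`) looks
like 4294967295 to the old approach and fails to narrow, whereas sign-extending
from 32 bits first recovers -1.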

cc #8706
…et (#8843)

I noticed that the wasm_memory64 flag was left out of Config's debug impl,
so rather than add it, I decided to use the `bitflags::Flags::FLAGS`
const to iterate the complete set of flags.

The downside of this change is that it will print flags which do not
have a setter in Config, e.g. `wasm_component_model_nested_names`.

An alternative to this change is, rather than expanding out the single
`features: WasmFeatures` member into many different debug_struct fields,
the debug impl of WasmFeatures is used.

Here is a sample debug of Config with this change:

Config { debug_info: None, wasm_mutable_global: true, wasm_saturating_float_to_int: true, wasm_sign_extension: true, wasm_reference_types: true, wasm_multi_value: true, wasm_bulk_memory: true, wasm_simd: true, wasm_relaxed_simd: false, wasm_threads: false, wasm_shared_everything_threads: false, wasm_tail_call: false, wasm_floats: true, wasm_multi_memory: false, wasm_exceptions: false, wasm_memory64: false, wasm_extended_const: false, wasm_component_model: false, wasm_function_references: false, wasm_memory_control: false, wasm_gc: false, wasm_custom_page_sizes: false, wasm_component_model_values: false, wasm_component_model_nested_names: false, parallel_compilation: true, compiler_config: CompilerConfig { strategy: Some(Cranelift), target: None, settings: {"opt_level": "speed", "enable_verifier": "true"}, flags: {}, cache_store: None, clif_dir: None, wmemcheck: false }, parse_wasm_debuginfo: false }
This commit raises the default setting of `max_memory_size` in the
pooling allocator from 10M to 4G. This won't actually impact the virtual
memory reserved in the pooling allocator because we already reserved 6G
of virtual memory for each linear memory; instead, this allows access to
all of it by default. This matches the default behavior of Wasmtime for
the non-pooling allocator which is to not artificially limit memory by
default.

The main impact of this setting is that the memory-protection-keys
feature, which is disabled by default, will have no effect by default
unless `max_memory_size` is also configured to something smaller than
4G. The documentation has been updated to this effect.

Closes #8846
This was accidentally broken in #8692. It turns out bitcasts from i128 to i128 are legal, and that PR accidentally broke that use case.

This is now added to a runtest to ensure it works on all platforms.
The epoch interruption implementation caches the current deadline in a
register, and avoids reloading that cache until the cached deadline has
passed.

However, the first epoch check happens immediately after the cache has
been populated on function entry, so there's never a reason to reload
the cache at that point. It only needs to be reloaded in loops. So this
commit eliminates the double-check on function entry.

When Cranelift optimizations are enabled, the alias analysis pass
correctly detected that this load was redundant, and the egraph pass
optimized away the `icmp` as well. However, since we don't do
conditional constant propagation, the branch couldn't be optimized away.
On x86 this lowered to a redundant `cmp`/`jae` pair of instructions in
every function entry, which this commit eliminates.

To keep track of what code we're generating for epoch interruptions,
I've also added disas tests with a trivial infinite loop.
The identifier for the `cold` calling convention overlaps with the
`cold` keyword for basic blocks, so handle another kind of token when
parsing signatures.