Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fuzz] Add a meta-differential fuzz target #4515

Merged
merged 12 commits into from
Aug 19, 2022

Conversation

abrown
Copy link
Collaborator

@abrown abrown commented Jul 23, 2022

This fuzz target combines past efforts to fuzz Wasmtime against various engines and against various kinds of modules. Once complete, it is intended to replace the other differential targets. Some notable changes in this PR include the addition of interfaces for each Wasm engine we want to fuzz against and the ability to evaluate a function multiple times with different randomly-generated inputs. See the commit messages for more details.

@abrown abrown requested a review from fitzgen July 23, 2022 00:16
@jameysharp
Copy link
Contributor

Ooh, I am eager to look at this on Monday!

@jameysharp jameysharp self-requested a review July 23, 2022 00:18
@github-actions github-actions bot added the fuzzing Issues related to our fuzzing infrastructure label Jul 23, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! I think there's cleanup to do internally and such but otherwise the overall architecture/design looks pretty reasonable. I'd like to ideally simplify the Wasmtime bits a bit since there seems to be good deal of juggling various bits of configuration.

I'm happy to help implement the v8 bits as well.


const KNOWN_I32_VALUES: &[i32] = &[i32::MIN, -1, 0, 1, i32::MAX];
const KNOWN_I64_VALUES: &[i64] = &[i64::MIN, -1, 0, 1, i64::MAX];
const KNOWN_U128_VALUES: &[u128] = &[u128::MIN, 1, u128::MAX];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that might be good to add here is "known values" for each lane width like (1u64, 1u64), (-1i64, -1i64), (1u32, 2u32, 3u32, 4u32), etc.

crates/fuzzing/src/generators/value.rs Outdated Show resolved Hide resolved
Comment on lines 93 to 100
/// Document the Wasm features necessary for a module to be evaluated.
#[derive(Debug, Default)]
#[allow(missing_docs)]
pub struct ModuleFeatures {
pub simd: bool,
pub multi_value: bool,
pub reference_types: bool,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a separate structure for this couldd all of ModuleConfig be passed around? Otherwise this seems a bit odd since it's missing a number of other proposals which could perhaps be relevant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially designed it like that but then moved to this when I realized that SwarmConfig (the wasm-smith struct what ModuleConfig wraps) is intended to "generate a module this way" and what I mean is "does a module use this feature?" So a lot of SwarmConfig, e.g., the minimum/maximums of various kinds, don't really apply. They don't matter to the engines for the most part (an engine can handle 10 instructions just as well as 100 instructions) and they don't make sense for the single-instruction modules (they always only have 1 instruction). That's when I switched from adapting a single-instance module to a ModuleConfig/SwarmConfig and instead started adapting a ModuleConfig/SwarmConfig to this new ModuleFeatures. Would you be more in favor of this struct if I added the remaining Wasm features to it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ideally want to centralize on feature management instead of having multiple avenues and tests in different places for features. Fuzzing all the various features gets nontrivial in management complexity pretty quickly and I always get lost a few lines in whenever I try to bring myself back up to speed on the fuzzers.

Is it a problem to take &ModuleConfig instead of &ModuleFeatures? I presume that something motivated you to add this but I don't know what that is. I'd ideally like to tackle this problem if we can.

.expect("failed to extract exported function signatures")
{
let mut invocations = 0;
loop {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop may not play nicely with fuel right now since if the module has wasm-smith-injected fuel the fuel is never reset and otherwise with Wasmtime-injected fuel I'm not sure that's ever reset either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should reset fuel for each call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, with only single-instruction modules, this shouldn't be too much of an issue but once we add in the wasm-smith modules this will be necessary; what is the right way to reset this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently there's no way to reset the fuel, we'd have to add an exported function in the wasm-smith generated module (or export the global) for it to get modified by the embedder.

fuzz/fuzz_targets/differential-new.rs Outdated Show resolved Hide resolved
/// following exported items in the instance: globals, memory.
///
/// TODO allow more types of hashers.
fn hash(&self, state: &mut DefaultHasher) -> anyhow::Result<()>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading over the implementations of this could this trait perhaps get changed to have something different like "get the hashable export named foo" to avoid requiring hardcoding the hash algorithm between all engines? That would also allow more direct comparisons and better errors along the lines of "the state of this global/memory is different" rather than "some state is different"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this probably does need some changes but I'm not yet entirely clear what they are; I think looking at what is necessary for the OCaml spec interpreter might make things more clear (as mentioned here).

crates/fuzzing/src/oracles/diff_wasmtime.rs Outdated Show resolved Hide resolved
crates/fuzzing/src/oracles/diff_wasmtime.rs Outdated Show resolved Hide resolved
Comment on lines +180 to +137
Val::FuncRef(_) => unimplemented!(),
Val::ExternRef(_) => unimplemented!(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is worth implementing but all that we can really say about these is "null" or "non-null" so I think we could have DiffValue::ExternRef { is_null: bool } perhaps

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The single-instruction modules that this PR uses don't exercise reference types but I agree that in the future, e.g., with wasm-smith-generated modules, this is probably what we want to do.

let mut rhs_instance = rhs
.instantiate(&module.to_bytes())
.expect("failed to instantiate `rhs` module");
for (name, signature) in get_exported_function_signatures(&wasm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that may be helpful here is to use the concrete Wasmtime rhs to use Wasmtime's embedding API to determine this rather than using the wasmparser crate.

I also think it's reasonable for the differential function itself to just force the rhs to the Wasmtime concrete type because I don't think there's much value in having two 100% generic engines since we always want one to be Wasmtime anyway.

Copy link
Collaborator Author

@abrown abrown Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this would change things much: we still want a separate function to gather up the signatures of the functions to execute. wasmparser is already a dependency and it doesn't seem to me that switching to Wasmtime tod do so would reduce much code (?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the Wasmtime embedding API is much more natural in my opinion, and the Wasmtime API is much more stable than the wasmparser API. I don't really want to keep up with random wasmparser changes way over here in the fuzzing code when one of the engines we're fuzzing against should always be Wasmtime anyway.

(I also don't think it's useful to be so abstract here that we could fuzz v8 against wasmi here, I would rather have "here's the wasmtime config" and "here's the other engine config" and the fuzzing is entirely driven by the Wasmtime embedding API plus requests to the engine config through a trait)

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super excited for this! Thanks for taking this on, Andrew!

Comment on lines +69 to +106
impl From<&DiffValue> for Value {
fn from(v: &DiffValue) -> Self {
match *v {
DiffValue::I32(n) => Value::I32(n),
DiffValue::I64(n) => Value::I64(n),
DiffValue::F32(n) => Value::F32(n as i32),
DiffValue::F64(n) => Value::F64(n as i64),
DiffValue::V128(n) => Value::V128(n.to_le_bytes().to_vec()),
}
}
}

impl Into<DiffValue> for Value {
fn into(self) -> DiffValue {
match self {
Value::I32(n) => DiffValue::I32(n),
Value::I64(n) => DiffValue::I64(n),
Value::F32(n) => DiffValue::F32(n as u32),
Value::F64(n) => DiffValue::F64(n as u64),
Value::V128(n) => {
assert_eq!(n.len(), 16);
DiffValue::V128(u128::from_le_bytes(n.as_slice().try_into().unwrap()))
}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stuff should probably be in generators/value.rs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was dividing the code up by engine, so I chose to put all these engine-specific conversions together with the engine that needs them.

Comment on lines +127 to +159
impl From<&DiffValue> for wasmi::RuntimeValue {
fn from(v: &DiffValue) -> Self {
use wasmi::RuntimeValue::*;
match *v {
DiffValue::I32(n) => I32(n),
DiffValue::I64(n) => I64(n),
DiffValue::F32(n) => F32(wasmi::nan_preserving_float::F32::from_bits(n)),
DiffValue::F64(n) => F64(wasmi::nan_preserving_float::F64::from_bits(n)),
DiffValue::V128(_) => unimplemented!(),
}
}
}

impl Into<DiffValue> for wasmi::RuntimeValue {
fn into(self) -> DiffValue {
use wasmi::RuntimeValue::*;
match self {
I32(n) => DiffValue::I32(n),
I64(n) => DiffValue::I64(n),
F32(n) => DiffValue::F32(n.to_bits()),
F64(n) => DiffValue::F64(n.to_bits()),
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines 59 to 60
let arguments: Result<Vec<_>> = signature
.params
.iter()
.map(|&t| DiffValue::arbitrary_of_type(&mut u, t.into()))
.collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running the exported functions multiple times will -- in general, between single-instruction modules and fuel limits -- be a lot cheaper than compiling and instantiating the module itself. Therefore, we should make sure we take as much advantage of the work we already did to get here by at least using each of our known values for each argument, and then whatever the fuzzer gives us that is left over can also be used to generate new inputs, but at least this way we know we are exercising every module with a bunch of different interesting argument values.

.expect("failed to extract exported function signatures")
{
let mut invocations = 0;
loop {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should reset fuel for each call.

@jameysharp
Copy link
Contributor

It's been suggested (by a couple different people, I think, but I've lost track of who) that when we have a wasm program we want to try running, we should run it with every engine we have, not just a pair of engines.

I don't think that needs to happen in this PR. I just wanted to note that if we do want that, I think the changes I see here already cover most of the work that would be needed. So that's great!

One of my teammates also suggested that we should check that different engines agree on whether a random sequence of bytes is a valid wasm binary. That seems like another relatively small extension of this work, that somebody could build after this lands.

@alexcrichton
Copy link
Member

One worry I have about running against all engines is that we run the risk of really hurting executions per second during fuzzing. When something fails I think it might be good to run in other engines to figure out which answer is probably correct, or otherwise being able to run in multiple engines easily locally would be good. While fuzzing though I think it's best to stick to just 2 engines with one guaranteed as Wasmtime

Cargo.lock Outdated Show resolved Hide resolved
@abrown abrown force-pushed the meta-diff branch 3 times, most recently from e9fb0d1 to 2902c24 Compare August 11, 2022 19:54
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 11, 2022
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 11, 2022
@abrown
Copy link
Collaborator Author

abrown commented Aug 11, 2022

All right, I'm marking this ready for review with some caveats that I think should be addressed in subsequent PRs. At this point, the PR provides a new interface for evaluating various engines against Wasmtime but for now is only is using single-instruction modules to do so. After resolving a bunch of issues that this identified (all related to fuzz infrastructure issues, IIRC), I was able to run the target for ~3 million test cases without crashing:

#2963550        REDUCE cov: 30985 ft: 87395 corp: 1762/219Kb lim: 4096 exec/s: 394 rss: 676Mb L: 140/521 MS: 1 EraseBytes-
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: Unknown opcode 195
ignoring error: this error should be ignored by fuzzing: Unknown opcode 195
ignoring error: this error should be ignored by fuzzing: Unknown opcode 195
ignoring error: this error should be ignored by fuzzing: Unknown opcode 195
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime
#2964099        REDUCE cov: 30985 ft: 87395 corp: 1762/219Kb lim: 4096 exec/s: 394 rss: 676Mb L: 131/521 MS: 4
PersAutoDict-ShuffleBytes-ShuffleBytes-EraseBytes- DE: "\x17\x00\x00\x00\x00\x00\x00\x00"-
ignoring error: this error should be ignored by fuzzing: unable to compile module in wasmtime

Note how ignored errors are printed but don't cause a crash: wasmi doesn't understand all of the opcodes (which probably means we need to make the single-inst modules smarter to report when they use float-to-int conversions, e.g.) and Wasmtime frequently crashes when the pooling allocator picks sizes that are too small.

When run with RUST_LOG=wasmtime_fuzzing=debug cargo +nightly fuzz run differential_meta, the log output shows something like:

[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles] wrote wasm file to `testcase14.wasm`
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles] Evaluating: test([F32(198)])
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles]  -> results on wasmi: Ok([F32(0)])
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles]  -> results on wasmtime: Ok([F32(0)])
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles] Hashing instances:
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles]  -> hash of wasmi: 15130871412783076140
[2022-08-11T20:36:44Z DEBUG wasmtime_fuzzing::oracles]  -> hash of wasmtime: 15130871412783076140

I expect that once the following big items are finished we could remove all other differential targets and just use this one. For now, though, I would like to get this work merged so I can stop rebasing. This is work that I think could go in different PRs (but I am certainly open for discussion on that):

  • adapt V8 to the DiffEngine/DiffInstance interface (@alexcrichton volunteered to look into this; the lifetime issues made this a bridge too far for this PR)
  • start fuzzing with the wasm-smith-generated modules (BigModule) instead of just single-instruction modules; this is complicated by the fact that wasm-smith will eat up all remaining unstructured data, leaving none for choosing engines and function arguments
  • figure out whether to fully commit to ModuleFeatures and report what Wasm feature a single-instruction uses (see comments above about this)
  • refactor how we ignore errors: I currently do this by wrapping the error message in DiffIgnoreError and downcasting the anyhow error to see if it is one of these; @alexcrichton mentioned that he preferred Result<Option<...>> to indicate the various states but this would involve some refactoring
  • add more instructions to the single-instruction module generator; e.g., my next feature to add would be SIMD instructions, which would likely find some bugs and require some additional infrastructure (e.g., V128 relaxed NaN comparison)

cc: @jameysharp, @alexcrichton, @fitzgen

@abrown abrown marked this pull request as ready for review August 11, 2022 20:45
@abrown
Copy link
Collaborator Author

abrown commented Aug 11, 2022

Oh, one other point: I have gone back and forth a few times about how to design the interface and today, at least, I'm leaning towards merging the DiffEngine and DiffInstance traits. Initially I separated them because logically that's how we're used to thinking about these and it is nice to be able filter out engines that won't work with a certain module. I feel like there is a way to just use the DiffInstance::instantiate function to do all of this. This refactoring change doesn't change the essence of the PR above, though. On a related note, the hashing interface could also change: I've been thinking through how hashing could work in the OCaml spec interpreter and, once I figure out the OCaml-Rust bindings issues, it may be that the interface ends up looking like what @alexcrichton suggested above.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok setting some things aside for future PRs, but at the same time I want to make sure that what's landed here is a solid base to build these future features on to. For example I don't think that ModuleFeatures is going to play well over time as keeping that in sync with ModuleConfig and the myriad of other possible configuration options. Instead I think it's best to simplify the configuration of the module generation as much as possible.

I think a better idea would be to structure the fuzzer as:

  • Generate a module configuration
  • Call set_differential_config to ensure that the module exhibits runtime behavior independent of the engine being run within (e.g. maximum memory limits, nan canonicalization, etc). I think that feature-related options in set_differential_config should all get removed, however.
  • Select a second engine to fuzz against with this module config. The config would act as a form of filter where if reference types are enabled then wasmi would be disabled since it doesn't implement reference types. Using wasmtime as a second engine is always a possibility though.
  • Afterwards the module is generated and instantiated in both engines.
  • Using Wasmtime as fixed-to-one-side the embedding API is used to explore the API surface area of the module. For N iterations exported functiosn are selected at random and then given arbitrary values, with the results being tested aftewards.

I think this structure would fix some of my concerns related to relying on wasmparser and/or otherwise trying to avoid the Wasmtime embedding API, avoiding multiple layers of configurations, filters, etc to get code that can run within each engine, and also additionally retaining maximum flexibility to run a differential config between different engines with as many shapes of modules as possible.

Comment on lines +345 to +360
// Both sides failed--this is an acceptable result (e.g., both sides
// trap at a divide by zero). We could compare the error strings perhaps
// (since the `lhs` and `rhs` could be failing for different reasons)
// but this seems good enough for now.
(Err(_), Err(_)) => {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has found bugs in the past, so I think that it would be good to get a jump-start on this. There's some code in the v8 differential bits comparing traps, and I think that could broadly be used to get adapted here. We can't expect precisely the same errors all the time but if neither stack overflows for example then I think it's fair to ensure that both divided by zero or something like that.

abrown added a commit to abrown/wasmtime that referenced this pull request Aug 18, 2022
This change is joint work with @alexcrichton to adapt the structure of
the fuzz target to his comments
[here](bytecodealliance#4515 (review)).

This change:
- removes `ModuleFeatures` and the `Module` enum (for big and small
  modules)
- upgrades `SingleInstModule` to filter out cases that are not valid for
  a given `ModuleConfig`
- adds `DiffEngine::name()`
- constructs each `DiffEngine` using a `ModuleConfig`, eliminating
  `DiffIgnoreError` completely
- prints an execution rate to the `differential_meta` target

Still TODO:
- `get_exported_function_signatures` could be re-written in terms of the
  Wasmtime API instead `wasmparser`
- the fuzzer crashes eventually, we think due to the signal handler
  interference between OCaml and Wasmtime
- the spec interpreter has several cases that we skip for now but could
  be fuzzed with further work

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 18, 2022
This change is joint work with @alexcrichton to adapt the structure of
the fuzz target to his comments
[here](bytecodealliance#4515 (review)).

This change:
- removes `ModuleFeatures` and the `Module` enum (for big and small
  modules)
- upgrades `SingleInstModule` to filter out cases that are not valid for
  a given `ModuleConfig`
- adds `DiffEngine::name()`
- constructs each `DiffEngine` using a `ModuleConfig`, eliminating
  `DiffIgnoreError` completely
- prints an execution rate to the `differential_meta` target

Still TODO:
- `get_exported_function_signatures` could be re-written in terms of the
  Wasmtime API instead `wasmparser`
- the fuzzer crashes eventually, we think due to the signal handler
  interference between OCaml and Wasmtime
- the spec interpreter has several cases that we skip for now but could
  be fuzzed with further work

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
abrown and others added 9 commits August 18, 2022 15:22
In order to randomly choose an engine to fuzz against, we expect all of
the engines to meet a common interface. The traits in this commit allow
us to instantiate a module from its binary form, evaluate exported
functions, and (possibly) hash the exported items of the instance.

This change has some missing pieces, though:
 - the `wasm-spec-interpreter` needs some work to be able to create
   instances, evaluate a function by name, and expose exported items
 - the `v8` engine is not implemented yet due to the complexity of its
   Rust lifetimes
When attempting to use both wasm-smith and single-instruction modules,
there is a mismatch in how we communicate what an engine must be able to
support. In the first case, we could use the `ModuleConfig`, a wrapper
for wasm-smith's `SwarmConfig`, but single-instruction modules do not
have a `SwarmConfig`--the many options simply don't apply. Here, we
instead add `ModuleFeatures` and adapt a `ModuleConfig` to that.
`ModuleFeatures` then becomes the way to communicate what features an
engine must support to evaluate functions in a module.
This change adds the `differential_meta` target to the list of fuzz
targets. I expect that sometime soon this could replace the other
`differential*` targets, as it almost checks all the things those check.
The major missing piece is that currently it only chooses
single-instruction modules instead of also generating arbitrary modules
using `wasm-smith`.

Also, this change adds the concept of an ignorable error: some
differential engines will choke with certain inputs (e.g., `wasmi` might
have an old opcode mapping) which we do not want to flag as fuzz bugs.
Here we wrap those errors in `DiffIgnoreError` and then use a new helper
trait, `DiffIgnorable`, to downcast and inspect the `anyhow` error to
only panic on non-ignorable errors; the ignorable errors are converted
to one of the `arbitrary::Error` variants, which we already ignore.
Because arithmetic NaNs can contain arbitrary payload bits, checking
that two differential executions should produce the same result should
relax the comparison of the `F32` and `F64` types (and eventually `V128`
as well... TODO). This change adds several considerations, however, so
that in the future we make the comparison a bit stricter, e.g., re:
canonical NaNs. This change, however, just matches the current logic
used by other fuzz targets.
@alexcrichton requested that the interface be adapted to accommodate
Wasmtime's API, in which even reading from an instance could trigger
mutation of the store.
This change is joint work with @alexcrichton to adapt the structure of
the fuzz target to his comments
[here](bytecodealliance#4515 (review)).

This change:
- removes `ModuleFeatures` and the `Module` enum (for big and small
  modules)
- upgrades `SingleInstModule` to filter out cases that are not valid for
  a given `ModuleConfig`
- adds `DiffEngine::name()`
- constructs each `DiffEngine` using a `ModuleConfig`, eliminating
  `DiffIgnoreError` completely
- prints an execution rate to the `differential_meta` target

Still TODO:
- `get_exported_function_signatures` could be re-written in terms of the
  Wasmtime API instead `wasmparser`
- the fuzzer crashes eventually, we think due to the signal handler
  interference between OCaml and Wasmtime
- the spec interpreter has several cases that we skip for now but could
  be fuzzed with further work

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
@abrown
Copy link
Collaborator Author

abrown commented Aug 18, 2022

Ok, I've rebased this PR and included some work @alexcrichton and I did today to adapt the fuzz target to his comments above (thanks @alexcrichton!). There are still some TODOs, noted in the commit message, but this now runs without error for a decent amount of time:

$ cargo +nightly fuzz run differential_meta -s none
...
=== Execution rate (464537 successes / 880000 attempted modules): 52.788295454545455% (total invocations: 3525516) ===
#880057 REDUCE cov: 28361 ft: 150473 corp: 9884/4009Kb lim: 627 exec/s: 531 rss: 174Mb L: 546/626 MS: 1 EraseBytes-
#880293 REDUCE cov: 28361 ft: 150473 corp: 9884/4009Kb lim: 627 exec/s: 530 rss: 174Mb L: 332/626 MS: 1 EraseBytes-
#880511 NEW    cov: 28361 ft: 150475 corp: 9885/4010Kb lim: 627 exec/s: 531 rss: 174Mb L: 622/626 MS: 3 CMP-CrossOver-ChangeByte- DE: "\x08\xa2"-

@alexcrichton
Copy link
Member

Thanks again for being patient with me here @abrown! I'm happy with where this has ended up and while I think there's more to do I think it's safe to do that all as a follow-up. I'm gonna merge this and see if we can get this into the next oss-fuzz build to get our first (potentially empty?) crop of fuzz bugs.

@alexcrichton alexcrichton merged commit 5ec92d5 into bytecodealliance:main Aug 19, 2022
@abrown abrown deleted the meta-diff branch August 19, 2022 01:58
@abrown
Copy link
Collaborator Author

abrown commented Aug 19, 2022

@alexcrichton, thanks for all the help reviewing this and reworking some of the bits!

abrown added a commit to abrown/wasmtime that referenced this pull request Aug 19, 2022
The changes in bytecodealliance#4515 do everything the `differential_spec` and
`differential_wasmi` fuzz target already do. These fuzz targets are now
redundant and this PR removes them. It also updates the fuzz
documentation slightly.
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 19, 2022
The changes in bytecodealliance#4515 do everything the `differential_spec` and
`differential_wasmi` fuzz target already do. These fuzz targets are now
redundant and this PR removes them. It also updates the fuzz
documentation slightly.
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 19, 2022
The changes in bytecodealliance#4515 do everything the `differential_spec` and
`differential_wasmi` fuzz target already do. These fuzz targets are now
redundant and this PR removes them. It also updates the fuzz
documentation slightly.
abrown added a commit that referenced this pull request Aug 19, 2022
* [fuzz] Remove some differential fuzz targets

The changes in #4515 do everything the `differential_spec` and
`differential_wasmi` fuzz target already do. These fuzz targets are now
redundant and this PR removes them. It also updates the fuzz
documentation slightly.
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Aug 19, 2022
This commit aims to improve the support for the new "meta" differential
fuzzer added in bytecodealliance#4515 by ensuring that all existing differential fuzzing
is migrated to this new fuzzer. This PR includes features such as:

* The V8 differential execution is migrated to the new framework.
* `Config::set_differential_config` no longer force-disables wasm
  features, instead allowing them to be enabled as per the fuzz input.
* `DiffInstance::{hash, hash}` was replaced with
  `DiffInstance::get_{memory,global}` to allow more fine-grained
  assertions.
* Support for `FuncRef` and `ExternRef` have been added to `DiffValue`
  and `DiffValueType`. For now though generating an arbitrary
  `ExternRef` and `FuncRef` simply generates a null value.
* Arbitrary `DiffValue::{F32,F64}` values are guaranteed to use
  canonical NaN representations to fix an issue with v8 where with the
  v8 engine we can't communicate non-canonical NaN values through JS.
* `DiffEngine::evaluate` allows "successful failure" for cases where
  engines can't support that particular invocation, for example v8 can't
  support `v128` arguments or return values.
* Smoke tests were added for each engine to ensure that a simple wasm
  module works at PR-time.
* Statistics printed from the main fuzzer now include percentage-rates
  for chosen engines as well as percentage rates for styles-of-module.

There's also a few small refactorings here and there but mostly just
things I saw along the way.
alexcrichton added a commit that referenced this pull request Aug 19, 2022
* Port v8 fuzzer to the new framework

This commit aims to improve the support for the new "meta" differential
fuzzer added in #4515 by ensuring that all existing differential fuzzing
is migrated to this new fuzzer. This PR includes features such as:

* The V8 differential execution is migrated to the new framework.
* `Config::set_differential_config` no longer force-disables wasm
  features, instead allowing them to be enabled as per the fuzz input.
* `DiffInstance::{hash, hash}` was replaced with
  `DiffInstance::get_{memory,global}` to allow more fine-grained
  assertions.
* Support for `FuncRef` and `ExternRef` have been added to `DiffValue`
  and `DiffValueType`. For now though generating an arbitrary
  `ExternRef` and `FuncRef` simply generates a null value.
* Arbitrary `DiffValue::{F32,F64}` values are guaranteed to use
  canonical NaN representations to fix an issue with v8 where with the
  v8 engine we can't communicate non-canonical NaN values through JS.
* `DiffEngine::evaluate` allows "successful failure" for cases where
  engines can't support that particular invocation, for example v8 can't
  support `v128` arguments or return values.
* Smoke tests were added for each engine to ensure that a simple wasm
  module works at PR-time.
* Statistics printed from the main fuzzer now include percentage-rates
  for chosen engines as well as percentage rates for styles-of-module.

There's also a few small refactorings here and there but mostly just
things I saw along the way.

* Update the fuzzing README
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 24, 2022
This change is a follow-on from bytecodealliance#4515 to add the ability to configure
the `differential` fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from bytecodealliance#4739.

Engines are configured using the `ALLOWED_ENGINES` environment variable.
We can either subtract from the set of allowed engines (e.g.,
`ALLOWED_ENGINES=-v8`) or build up a set of allowed engines (e.g.,
`ALLOWED_ENGINES=wasmi,spec`), but not both at the same time.
`ALLOWED_ENGINES` only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, `ALLOWED_ENGINES`
defaults to [`wasmtime`, `wasmi`, `spec`, `v8`].

Modules are configured using `ALLOWED_MODULES`. This environment
variables works the same as above but the available options are:
[`wasm-smith`, `single-inst`].
abrown added a commit to abrown/wasmtime that referenced this pull request Aug 24, 2022
This change is a follow-on from bytecodealliance#4515 to add the ability to configure
the `differential` fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from bytecodealliance#4739.

Engines are configured using the `ALLOWED_ENGINES` environment variable.
We can either subtract from the set of allowed engines (e.g.,
`ALLOWED_ENGINES=-v8`) or build up a set of allowed engines (e.g.,
`ALLOWED_ENGINES=wasmi,spec`), but not both at the same time.
`ALLOWED_ENGINES` only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, `ALLOWED_ENGINES`
defaults to [`wasmtime`, `wasmi`, `spec`, `v8`].

The generated WebAssembly modules are configured using
`ALLOWED_MODULES`. This environment variables works the same as above
but the available options are: [`wasm-smith`, `single-inst`].
abrown added a commit that referenced this pull request Aug 24, 2022
This change is a follow-on from #4515 to add the ability to configure
the `differential` fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from #4739.

Engines are configured using the `ALLOWED_ENGINES` environment variable.
We can either subtract from the set of allowed engines (e.g.,
`ALLOWED_ENGINES=-v8`) or build up a set of allowed engines (e.g.,
`ALLOWED_ENGINES=wasmi,spec`), but not both at the same time.
`ALLOWED_ENGINES` only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, `ALLOWED_ENGINES`
defaults to [`wasmtime`, `wasmi`, `spec`, `v8`].

The generated WebAssembly modules are configured using
`ALLOWED_MODULES`. This environment variables works the same as above
but the available options are: [`wasm-smith`, `single-inst`].
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fuzzing Issues related to our fuzzing infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants