-
Notifications
You must be signed in to change notification settings - Fork 829
wasm-split: Add fuzzer support #7014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
11 commits
Select commit
Hold shift + click to select a range
fa8a95f
work
kripken 0e74570
bettr
kripken f8cb484
clean
kripken 3d71854
work
kripken b5df612
format
kripken 4f07dae
nans
kripken 8406b24
oops^2
kripken 6104c04
flake
kripken d578928
compile regex
kripken fc15b5a
simplify proxy stubs
kripken ed9c1ac
feedback
kripken File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -743,8 +743,8 @@ def run_d8_js(js, args=[], liftoff=True): | |
| FUZZ_SHELL_JS = in_binaryen('scripts', 'fuzz_shell.js') | ||
|
|
||
|
|
||
| def run_d8_wasm(wasm, liftoff=True): | ||
| return run_d8_js(FUZZ_SHELL_JS, [wasm], liftoff=liftoff) | ||
| def run_d8_wasm(wasm, liftoff=True, args=[]): | ||
| return run_d8_js(FUZZ_SHELL_JS, [wasm] + args, liftoff=liftoff) | ||
|
|
||
|
|
||
| def all_disallowed(features): | ||
|
|
@@ -1391,6 +1391,111 @@ def handle(self, wasm): | |
| compare_between_vms(output, merged_output, 'Merge') | ||
|
|
||
|
|
||
| FUNC_NAMES_REGEX = re.compile(r'\n [(]func [$](\S+)') | ||
|
|
||
|
|
||
| # Tests wasm-split | ||
| class Split(TestCaseHandler): | ||
| frequency = 1 # TODO: adjust lower when we actually enable this | ||
|
|
||
| def handle(self, wasm): | ||
| # get the list of function names, some of which we will decide to split | ||
| # out | ||
| wat = run([in_bin('wasm-dis'), wasm] + FEATURE_OPTS) | ||
| all_funcs = re.findall(FUNC_NAMES_REGEX, wat) | ||
|
|
||
| # get the original output before splitting | ||
| output = run_d8_wasm(wasm) | ||
| output = fix_output(output) | ||
|
|
||
| # find the names of the exports. we need this because when we split the | ||
| # module then new exports appear to connect the two halves of the | ||
| # original module. we do not want to call all the exports on the new | ||
| # primary module, but only the original ones. | ||
| exports = [] | ||
| for line in output.splitlines(): | ||
| if FUZZ_EXEC_CALL_PREFIX in line: | ||
| exports.append(get_export_from_call_line(line)) | ||
|
|
||
| # pick which to split out, with a random rate of picking (biased towards | ||
| # 0.5). | ||
| rate = (random.random() + random.random()) / 2 | ||
| split_funcs = [] | ||
| for func in all_funcs: | ||
| if random.random() < rate: | ||
| split_funcs.append(func) | ||
|
|
||
| if not split_funcs: | ||
| # nothing to split out | ||
| return | ||
|
|
||
| # split the wasm into two | ||
| primary = wasm + '.primary.wasm' | ||
| secondary = wasm + '.secondary.wasm' | ||
|
|
||
| # we require reference types, because that allows us to create our own | ||
| # table. without that we use the existing table, and that may interact | ||
| # with user code in odd ways (it really only works with the particular | ||
| # form of table+segments that LLVM emits, and not with random fuzzer | ||
| # content). | ||
| split_feature_opts = FEATURE_OPTS + ['--enable-reference-types'] | ||
|
|
||
| run([in_bin('wasm-split'), wasm, '--split', | ||
| '--split-funcs', ','.join(split_funcs), | ||
| '--primary-output', primary, | ||
| '--secondary-output', secondary] + split_feature_opts) | ||
|
|
||
| # sometimes also optimize the split modules | ||
| optimized = False | ||
|
|
||
| def optimize(name): | ||
| # do not optimize if it would change the ABI | ||
| if CLOSED_WORLD: | ||
| return name | ||
| # TODO: use other optimizations here, but we'd need to be careful of | ||
| # anything that can alter the ABI, and also current | ||
| # limitations of open-world optimizations (see discussion in | ||
| # https://github.com/WebAssembly/binaryen/pull/6660) | ||
| opts = ['-O3'] | ||
| new_name = name + '.opt.wasm' | ||
| run([in_bin('wasm-opt'), name, '-o', new_name, '-all'] + opts + split_feature_opts) | ||
| nonlocal optimized | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Neat, I've never seen |
||
| optimized = True | ||
| return new_name | ||
|
|
||
| if random.random() < 0.5: | ||
| primary = optimize(primary) | ||
| if random.random() < 0.5: | ||
| secondary = optimize(secondary) | ||
|
|
||
| # prepare the list of exports to call. the format is | ||
| # | ||
| # exports:A,B,C | ||
| # | ||
| exports_to_call = 'exports:' + ','.join(exports) | ||
|
|
||
| # get the output from the split modules, linking them using JS | ||
| # TODO run liftoff/turboshaft/etc. | ||
| linked_output = run_d8_wasm(primary, args=[secondary, exports_to_call]) | ||
| linked_output = fix_output(linked_output) | ||
|
|
||
| # see D8.can_compare_to_self: we cannot compare optimized outputs if | ||
| # NaNs are allowed, as the optimizer can modify NaNs differently than | ||
| # the JS engine. | ||
| if not (NANS and optimized): | ||
| compare_between_vms(output, linked_output, 'Split') | ||
|
|
||
| def can_run_on_feature_opts(self, feature_opts): | ||
| # to run the split wasm we use JS, that is, JS links the exports of one | ||
| # to the imports of the other, etc. since we run in JS, the wasm must be | ||
| # valid for JS. | ||
| if not LEGALIZE: | ||
| return False | ||
|
|
||
| # see D8.can_run | ||
| return all_disallowed(['shared-everything']) | ||
|
|
||
|
|
||
| # Check that the text format round-trips without error. | ||
| class RoundtripText(TestCaseHandler): | ||
| frequency = 0.05 | ||
|
|
@@ -1413,6 +1518,8 @@ def handle(self, wasm): | |
| TrapsNeverHappen(), | ||
| CtorEval(), | ||
| Merge(), | ||
| # TODO: enable when stable enough, and adjust |frequency| (see above) | ||
| # Split(), | ||
| RoundtripText() | ||
| ] | ||
|
|
||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be simpler to use the
--export-prefixoption to add a unique prefix to the new exports that we can filter out directly in the JS wrapper.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, then we'd need to hardcode some special prefix in the JS wrapper, but I'm not sure there is a fixed prefix we can use: we don't want to overlap with existing export names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW Emscripten uses
%as the prefix. Maybe we can just choose an arbitrary one like that and it would be good enough?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the difference is that Emscripten has full control over the export names. In the fuzzer we do want to be able to fuzz initial content from anywhere. I suppose we could sanitize that content before fuzzing it, but that seems more complicated to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, the currently solution LGTM, then.