Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 56 additions & 1 deletion src/ir/intrinsics.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,8 @@
namespace wasm {

static Name BinaryenIntrinsicsModule("binaryen-intrinsics"),
CallWithoutEffects("call.without.effects");
CallWithoutEffects("call.without.effects"),
JSPrototypesModule("wasm:js-prototypes"), ConfigureAll("configureAll");

bool Intrinsics::isCallWithoutEffects(Function* func) {
if (func->module != BinaryenIntrinsicsModule) {
Expand All @@ -45,4 +46,58 @@ Call* Intrinsics::isCallWithoutEffects(Expression* curr) {
return nullptr;
}

bool Intrinsics::isConfigureAll(Function* func) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check the type as well so all users can be guaranteed that the parameters have the correct types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do that for the intrinsic above. Do you think it is worth the effort? I'm not sure what it would catch in practice that would be useful. (If the user has this wrong, the VM will error anyhow?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe it's not worth it. configureAll has a single type whereas call.without.effects is overloaded for an infinite set of types, so at least we could check its type rather than letting the user find out about the problem at runtime.

return func->module == JSPrototypesModule && func->base == ConfigureAll;
}

Call* Intrinsics::isConfigureAll(Expression* curr) {
if (auto* call = curr->dynCast<Call>()) {
if (auto* func = module.getFunctionOrNull(call->target)) {
if (isConfigureAll(func)) {
return call;
}
}
}
return nullptr;
}

std::vector<Name> Intrinsics::getConfigureAllFunctions(Call* call) {
assert(isConfigureAll(call));

auto error = [&](const char* msg) {
Fatal() << "Invalid configureAll( " << msg << "): " << *call;
};
Comment on lines +67 to +69
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely avoid erroring out on unrecognized configureAll patterns, since they would still be perfectly valid. Maybe we can print a warning and return an empty list if we see some other pattern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to error for now? A warning might be missed, and make things harder to debug. (We can always reduce an error to a warning later if it feels annoying in practice.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, if a warning is missed, and configureAll is not noticed, we'd end up optimizing away code in ways that might be confusing, like we've seen already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fair enough.


// The second operand is an array of signature-called function refs.
auto& operands = call->operands;
if (operands.size() <= 2) {
error("insufficient operands");
}
auto* arrayNew = operands[1]->dynCast<ArrayNewElem>();
if (!arrayNew) {
error("not array.new_elem");
}
auto start = arrayNew->offset->dynCast<Const>();
if (!start || start->value.geti32() != 0) {
error("start != 0");
}
auto size = arrayNew->size->dynCast<Const>();
if (!size) {
error("size not const");
}
auto* seg = module.getElementSegment(arrayNew->segment);
if (seg->data.size() != size->value.getUnsigned()) {
error("wrong seg size");
}
std::vector<Name> ret;
for (auto* curr : seg->data) {
if (auto* refFunc = curr->dynCast<RefFunc>()) {
ret.push_back(refFunc->func);
} else {
error("non-function ref");
}
}
return ret;
}

} // namespace wasm
29 changes: 25 additions & 4 deletions src/ir/intrinsics.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,8 @@
#include "wasm-traversal.h"

//
// See the README.md for background on intrinsic functions.
// Intrinsics include Binaryen intrinsics, and Wasm spec builtins. See the
// README.md for more.
//
// Intrinsics can be recognized by Intrinsics::isFoo() methods, that check if a
// function is a particular intrinsic, or if a call to a function is so. The
Expand All @@ -37,8 +38,7 @@ class Intrinsics {
public:
Intrinsics(Module& module) : module(module) {}

//
// Check if an instruction is the call.without.effects intrinsic.
// Check if an instruction is the Binaryen call.without.effects intrinsic.
//
// (import "binaryen-intrinsics" "call.without.effects"
// (func (..params..) (param $target funcref) (..results..)))
Expand Down Expand Up @@ -85,9 +85,30 @@ class Intrinsics {
//
// Later passes will then turn that into a direct call and further optimize
// things.
//
bool isCallWithoutEffects(Function* func);
Call* isCallWithoutEffects(Expression* curr);

// Check if an instruction is the wasm/JS interop configureAll builtin. Such
// calls have a second parameter which is an array of function references,
// which must be assumed to be "signature-called": Called from outside the
// module, using that signature. "Signature-called" is less powerful than the
// reference to the function generally escaping, as we do not care about the
// heap type of the function (no call_refs or casts will happen), all we know
// is that a JS-style call of the methods can occur (and only the signature
// matters then).
bool isConfigureAll(Function* func);
Call* isConfigureAll(Expression* curr);
// Given a configureAll, return all the functions it refers to. We error if it
// is not in the canonical form
//
// (call $configureAll
// ..arg0..
// (array.new_elem $seg (0) (N)
// ..
// )
//
// where the segment $seg is of size N.
std::vector<Name> getConfigureAllFunctions(Call* call);
};

} // namespace wasm
Expand Down
21 changes: 21 additions & 0 deletions src/ir/module-utils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
*/

#include "module-utils.h"
#include "ir/find_all.h"
#include "ir/intrinsics.h"
#include "ir/manipulation.h"
#include "ir/metadata.h"
Expand Down Expand Up @@ -692,6 +693,26 @@ std::vector<HeapType> getPublicHeapTypes(Module& wasm) {
WASM_UNREACHABLE("unexpected export kind");
}

// ConfigureAll in a start function makes its functions callable. They are
// only signature-called, so the heap type does not need to be public - nor
// types referred to - but for now we mark them as public to avoid breakage in
// several passes.
// TODO Specific fixes in those passes could replace this, and allow better
// optimization.
if (wasm.start) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine that eventually we'll need to handle this in places other than the start function, but starting with this is probably sufficient.

auto* start = wasm.getFunction(wasm.start);
if (!start->imported()) {
Intrinsics intrinsics(wasm);
for (auto* call : FindAll<Call>(start->body).list) {
if (intrinsics.isConfigureAll(call)) {
for (auto func : intrinsics.getConfigureAllFunctions(call)) {
notePublic(wasm.getFunction(func)->type.getHeapType());
}
}
}
}
}

// Ignorable public types are public.
for (auto type : getIgnorablePublicTypes()) {
notePublic(type);
Expand Down
9 changes: 8 additions & 1 deletion src/passes/RemoveUnusedModuleElements.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,8 @@ struct Noter : public PostWalker<Noter, UnifiedExpressionVisitor<Noter>> {
void visitCall(Call* curr) {
use({ModuleElementKind::Function, curr->target});

if (Intrinsics(*getModule()).isCallWithoutEffects(curr)) {
Intrinsics intrinsics(*getModule());
if (intrinsics.isCallWithoutEffects(curr)) {
// A call-without-effects receives a function reference and calls it, the
// same as a CallRef. When we have a flag for non-closed-world, we should
// handle this automatically by the reference flowing out to an import,
Expand All @@ -157,6 +158,12 @@ struct Noter : public PostWalker<Noter, UnifiedExpressionVisitor<Noter>> {
callRef.target = target;
visitCallRef(&callRef);
}
} else if (intrinsics.isConfigureAll(curr)) {
// Every function that configureAll refers to is signature-called. Mark
// them all as called, as JS can call them.
for (auto func : intrinsics.getConfigureAllFunctions(curr)) {
use({ModuleElementKind::Function, func});
}
}
}

Expand Down
137 changes: 137 additions & 0 deletions test/lit/passes/remove-unused-module-elements-configureAll.wast
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
;; NOTE: Assertions have been generated by update_lit_checks.py --all-items and should not be edited.

;; RUN: foreach %s %t wasm-opt --remove-unused-module-elements --closed-world -all -S -o - | filecheck %s
;; RUN: foreach %s %t wasm-opt --remove-unused-module-elements -all -S -o - | filecheck %s --check-prefix OPEN_WORLD

;; Test that configureAll is respected: referred functions are not optimized
;; away or emptied out. This is so even in closed world.

(module
;; CHECK: (type $externs (array (mut externref)))
;; OPEN_WORLD: (type $externs (array (mut externref)))
(type $externs (array (mut externref)))

;; CHECK: (type $funcs (array (mut funcref)))
;; OPEN_WORLD: (type $funcs (array (mut funcref)))
(type $funcs (array (mut funcref)))

;; CHECK: (type $bytes (array (mut i8)))
;; OPEN_WORLD: (type $bytes (array (mut i8)))
(type $bytes (array (mut i8)))

;; CHECK: (type $configureAll (func (param (ref null $externs) (ref null $funcs) (ref null $bytes) externref)))
;; OPEN_WORLD: (type $configureAll (func (param (ref null $externs) (ref null $funcs) (ref null $bytes) externref)))
(type $configureAll (func (param (ref null $externs)) (param (ref null $funcs)) (param (ref null $bytes)) (param externref)))

;; CHECK: (type $4 (func))

;; CHECK: (type $5 (func (result i32)))

;; CHECK: (type $6 (func (param i32) (result i32)))

;; CHECK: (import "wasm:js-prototypes" "configureAll" (func $configureAll (type $configureAll) (param (ref null $externs) (ref null $funcs) (ref null $bytes) externref)))
;; OPEN_WORLD: (type $4 (func))

;; OPEN_WORLD: (type $5 (func (result i32)))

;; OPEN_WORLD: (type $6 (func (param i32) (result i32)))

;; OPEN_WORLD: (import "wasm:js-prototypes" "configureAll" (func $configureAll (type $configureAll) (param (ref null $externs) (ref null $funcs) (ref null $bytes) externref)))
(import "wasm:js-prototypes" "configureAll" (func $configureAll (type $configureAll)))

;; CHECK: (data $bytes "12345678")

;; CHECK: (elem $externs externref (item (ref.null noextern)))
;; OPEN_WORLD: (data $bytes "12345678")

;; OPEN_WORLD: (elem $externs externref (item (ref.null noextern)))
(elem $externs externref
(ref.null extern)
)

;; CHECK: (elem $funcs func $foo $bar)
;; OPEN_WORLD: (elem $funcs func $foo $bar)
(elem $funcs funcref
(ref.func $foo)
(ref.func $bar)
)

(data $bytes "12345678")

;; CHECK: (start $start)
;; OPEN_WORLD: (start $start)
(start $start)

;; CHECK: (func $start (type $4)
;; CHECK-NEXT: (call $configureAll
;; CHECK-NEXT: (array.new_elem $externs $externs
;; CHECK-NEXT: (i32.const 0)
;; CHECK-NEXT: (i32.const 1)
;; CHECK-NEXT: )
;; CHECK-NEXT: (array.new_elem $funcs $funcs
;; CHECK-NEXT: (i32.const 0)
;; CHECK-NEXT: (i32.const 2)
;; CHECK-NEXT: )
;; CHECK-NEXT: (array.new_data $bytes $bytes
;; CHECK-NEXT: (i32.const 0)
;; CHECK-NEXT: (i32.const 8)
;; CHECK-NEXT: )
;; CHECK-NEXT: (ref.null noextern)
;; CHECK-NEXT: )
;; CHECK-NEXT: )
;; OPEN_WORLD: (func $start (type $4)
;; OPEN_WORLD-NEXT: (call $configureAll
;; OPEN_WORLD-NEXT: (array.new_elem $externs $externs
;; OPEN_WORLD-NEXT: (i32.const 0)
;; OPEN_WORLD-NEXT: (i32.const 1)
;; OPEN_WORLD-NEXT: )
;; OPEN_WORLD-NEXT: (array.new_elem $funcs $funcs
;; OPEN_WORLD-NEXT: (i32.const 0)
;; OPEN_WORLD-NEXT: (i32.const 2)
;; OPEN_WORLD-NEXT: )
;; OPEN_WORLD-NEXT: (array.new_data $bytes $bytes
;; OPEN_WORLD-NEXT: (i32.const 0)
;; OPEN_WORLD-NEXT: (i32.const 8)
;; OPEN_WORLD-NEXT: )
;; OPEN_WORLD-NEXT: (ref.null noextern)
;; OPEN_WORLD-NEXT: )
;; OPEN_WORLD-NEXT: )
(func $start
(call $configureAll
(array.new_elem $externs $externs
(i32.const 0) (i32.const 1))
(array.new_elem $funcs $funcs
(i32.const 0) (i32.const 2))
(array.new_data $bytes $bytes
(i32.const 0) (i32.const 8))
(ref.null extern)
)
)

;; CHECK: (func $foo (type $5) (result i32)
;; CHECK-NEXT: (i32.const 42)
;; CHECK-NEXT: )
;; OPEN_WORLD: (func $foo (type $5) (result i32)
;; OPEN_WORLD-NEXT: (i32.const 42)
;; OPEN_WORLD-NEXT: )
(func $foo (result i32)
;; configureAll keeps this from being modified.
(i32.const 42)
)

;; CHECK: (func $bar (type $6) (param $x i32) (result i32)
;; CHECK-NEXT: (local.get $x)
;; CHECK-NEXT: )
;; OPEN_WORLD: (func $bar (type $6) (param $x i32) (result i32)
;; OPEN_WORLD-NEXT: (local.get $x)
;; OPEN_WORLD-NEXT: )
(func $bar (param $x i32) (result i32)
;; This too.
(local.get $x)
)

(func $unconfigured (result i32)
;; This is not referred to by configureAll, and can be removed.
(i32.const 1337)
)
)
Loading
Loading