Grand Unified Flow Analysis (GUFA) #4598

Merged
merged 888 commits into from
Jul 22, 2022
Changes from 250 commits
4f1a6b1
Merge remote-tracking branch 'origin/main' into fgprop
kripken May 18, 2022
52535db
clean
kripken May 19, 2022
3a4a894
polish
kripken May 19, 2022
c3e3e30
polish
kripken May 19, 2022
8dfedaa
polish
kripken May 19, 2022
5457df6
polish
kripken May 19, 2022
9a2e636
better
kripken May 19, 2022
d18b933
rename
kripken May 19, 2022
13adac9
format
kripken May 19, 2022
adf9227
work
kripken May 19, 2022
aecfd18
fix
kripken May 19, 2022
eee2e30
fix
kripken May 19, 2022
0444b41
format
kripken May 19, 2022
bcaea49
test
kripken May 19, 2022
2270174
text
kripken May 19, 2022
9d01a3b
text
kripken May 19, 2022
0e496a9
work
kripken May 19, 2022
f620b76
text
kripken May 19, 2022
b74f427
text
kripken May 19, 2022
e0d0592
text
kripken May 19, 2022
4627648
text
kripken May 19, 2022
265c6f2
text
kripken May 19, 2022
152541f
text
kripken May 19, 2022
6faaa19
text
kripken May 19, 2022
34380e0
text
kripken May 19, 2022
7f07af0
text
kripken May 19, 2022
732d881
text
kripken May 19, 2022
7c73a1f
fix
kripken May 19, 2022
243a960
test
kripken May 19, 2022
e304cdf
test
kripken May 19, 2022
6e29761
test
kripken May 19, 2022
d6ccf0d
comment
kripken May 19, 2022
9c64e28
text
kripken May 19, 2022
1dc3d6d
better
kripken May 19, 2022
55c473b
better
kripken May 19, 2022
de91e5c
better
kripken May 19, 2022
3aeb649
test fix
kripken May 19, 2022
77ada29
move
kripken May 19, 2022
71d7589
format
kripken May 19, 2022
5618fee
work
kripken May 19, 2022
55b5e44
work
kripken May 19, 2022
2dcaccb
comment
kripken May 19, 2022
d57ecd3
test
kripken May 20, 2022
0f5fee1
comment
kripken May 20, 2022
fa0736c
test
kripken May 20, 2022
2a0ac9a
format
kripken May 20, 2022
9d2b4c1
comments
kripken May 20, 2022
7af8861
simpler
kripken May 20, 2022
8f9f1e4
names
kripken May 20, 2022
32f7766
add optimizing mode
kripken May 20, 2022
2607dcb
format
kripken May 20, 2022
141e6e2
comment
kripken May 20, 2022
e2632b6
test
kripken May 20, 2022
b8af191
work
kripken May 20, 2022
0559dd9
test
kripken May 20, 2022
ceb2dcd
fuzz
kripken May 20, 2022
e835f56
typo
kripken May 20, 2022
25bb0ca
typo
kripken May 20, 2022
9061659
typo
kripken May 20, 2022
8759f6c
comment
kripken May 20, 2022
64bb162
comment
kripken May 20, 2022
1e3cf3e
comment
kripken May 20, 2022
208c220
comment
kripken May 20, 2022
972211b
comment
kripken May 20, 2022
2734f78
comment
kripken May 20, 2022
6870508
comment
kripken May 20, 2022
5110432
comment
kripken May 20, 2022
f3b538e
comment
kripken May 20, 2022
d1a25ab
comments
kripken May 20, 2022
0402cab
text
kripken May 20, 2022
bab9e92
comment
kripken May 20, 2022
26fca84
comment
kripken May 20, 2022
2810bcd
test
kripken May 20, 2022
4a12da0
Revert "test"
kripken May 20, 2022
9135f6b
comment
kripken May 20, 2022
caccaaf
test
kripken May 20, 2022
7b5ed43
fix
kripken May 20, 2022
e312c97
Merge remote-tracking branch 'origin/main' into gufa
kripken May 21, 2022
2c17fd3
Merge branch 'fgprop' into gufa
kripken May 21, 2022
7da19ab
comment
kripken May 23, 2022
4f338ad
comment
kripken May 23, 2022
8f47f02
comment
kripken May 23, 2022
12a5b23
comment
kripken May 23, 2022
e52d735
Merge remote-tracking branch 'origin/main' into fgprop
kripken May 23, 2022
4a9c365
Merge remote-tracking branch 'origin/main' into fgprop
kripken May 23, 2022
1854e72
fix
kripken May 23, 2022
7179d2e
fix
kripken May 23, 2022
7d61a73
fix
kripken May 23, 2022
225f36d
comment
kripken May 23, 2022
13464df
Merge branch 'fgprop' into gufa
kripken May 23, 2022
d11181f
fix
kripken May 23, 2022
ba3a913
fix
kripken May 23, 2022
0ab23d4
debug CI-only failure on alpine
kripken May 24, 2022
e66dfb0
Revert "debug CI-only failure on alpine"
kripken May 24, 2022
51d0356
Revert "Revert "debug CI-only failure on alpine""
kripken May 24, 2022
68b8dd2
undo
kripken May 24, 2022
5f213cc
fix
kripken May 24, 2022
8272223
format
kripken May 24, 2022
218a82e
undo
kripken May 24, 2022
439061a
pre-gufa
kripken May 24, 2022
35de732
files
kripken May 24, 2022
5e14116
Merge branch 'pre-gufa' into gufa
kripken May 24, 2022
efa0d21
feedback
kripken May 25, 2022
cd74ec1
feedback
kripken May 25, 2022
2eb813c
feedback
kripken May 25, 2022
0a9631a
feedback
kripken May 25, 2022
84e7541
feedback
kripken May 25, 2022
ed46901
format
kripken May 25, 2022
55760eb
clarify
kripken May 25, 2022
f2cad07
Merge branch 'pre-gufa' into gufa
kripken May 25, 2022
4055e61
example => gtest
kripken May 25, 2022
d2550b2
test
kripken May 25, 2022
10fccbe
format
kripken May 25, 2022
167f8ce
cleanup
kripken May 25, 2022
73b0809
Merge branch 'pre-gufa' into gufa
kripken May 25, 2022
e7aae12
move
kripken May 25, 2022
e17b5bc
use fixtures
kripken May 25, 2022
1d15978
separate
kripken May 25, 2022
cb4ed59
fix
kripken May 25, 2022
e69625d
more
kripken May 26, 2022
bee4bb6
cleanup
kripken May 26, 2022
48b8d23
format
kripken May 26, 2022
7c3296f
isTypeExact => hasExactType
kripken May 26, 2022
1e8d548
Update src/ir/possible-contents.h
kripken May 26, 2022
1a42ee2
clarify
kripken May 26, 2022
eb3c7d1
Merge remote-tracking branch 'origin/pre-gufa' into pre-gufa
kripken May 26, 2022
0df4a40
typo
kripken May 26, 2022
036876f
comment
kripken May 26, 2022
206f139
test rename: func => nonNullFunc
kripken May 26, 2022
bd1214e
format
kripken May 26, 2022
8a770ff
Merge branch 'pre-gufa' into gufa
kripken May 26, 2022
0cc8e91
Update src/ir/possible-contents.cpp
kripken May 27, 2022
947b7da
CollectedInfo => CollectedFuncInfo
kripken May 27, 2022
19b52cb
fix
kripken May 27, 2022
3ad6228
comment
kripken May 27, 2022
36cc3c3
BranchLocation => BreakTargetLocation
kripken May 31, 2022
b0add64
add missing isRelevant
kripken May 31, 2022
e310cdb
Merge commit '36cc3c3acbf502c2620d484db78369b3946d4e04' into gufa
kripken May 31, 2022
52a0b30
Merge branch 'pre-gufa' into gufa
kripken May 31, 2022
5da22cd
feedback
kripken May 31, 2022
0d5d412
comment
kripken May 31, 2022
032ed51
comment
kripken May 31, 2022
5df4495
refactor+fix
kripken May 31, 2022
66f3017
fix
kripken May 31, 2022
cdd0391
fix
kripken May 31, 2022
da56b25
fix
kripken May 31, 2022
dcfbfc2
Merge branch 'pre-gufa' into gufa
kripken May 31, 2022
e06e7bc
comment
kripken May 31, 2022
f4fd087
comment
kripken May 31, 2022
54c0531
targetExpr => child
kripken May 31, 2022
14f4e6c
Merge branch 'pre-gufa' into gufa
kripken May 31, 2022
b1c51a1
typo
kripken May 31, 2022
1c41a12
comment
kripken May 31, 2022
5aa3e15
comment
kripken May 31, 2022
7cf666a
clarify aliasing
kripken May 31, 2022
224f3df
comment
kripken May 31, 2022
45556c3
comment
kripken May 31, 2022
c47f529
Merge remote-tracking branch 'origin/refehpop' into pre-gufa
kripken May 31, 2022
da8d09c
fix
kripken May 31, 2022
b58cf1e
Merge remote-tracking branch 'origin/refehpop' into pre-gufa
kripken May 31, 2022
a3ac146
fix
kripken May 31, 2022
b128bd8
document and support 0 pops
kripken May 31, 2022
4a752dc
Merge remote-tracking branch 'origin/refehpop' into pre-gufa
kripken May 31, 2022
b7851c8
Merge remote-tracking branch 'origin/main' into pre-gufa
kripken May 31, 2022
2de2cac
Merge remote-tracking branch 'origin/main' into pre-gufa
kripken Jun 1, 2022
727f7f0
Merge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
eb9c718
comment
kripken Jun 1, 2022
aa9643f
test improvements
kripken Jun 1, 2022
143a004
wip
kripken Jun 1, 2022
8598306
Merge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
728caf0
cleaner
kripken Jun 1, 2022
14714f8
Merge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
6015ecb
format
kripken Jun 1, 2022
67a5967
fix
kripken Jun 1, 2022
137eac4
fix
kripken Jun 1, 2022
3f64ccf
feedback
kripken Jun 1, 2022
67e9ea2
nMerge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
7f5c287
comment
kripken Jun 1, 2022
7da0a3f
fix
kripken Jun 1, 2022
1069efe
fix
kripken Jun 1, 2022
facff67
Merge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
1c5aa5c
rename
kripken Jun 1, 2022
03f8896
addSpecialChildParentLink => addChildParentLink
kripken Jun 1, 2022
678b9b1
simpler
kripken Jun 1, 2022
2f20952
Revert "simpler"
kripken Jun 1, 2022
55ef34d
format
kripken Jun 1, 2022
40671be
Merge remote-tracking branch 'origin/main' into pre-gufa
kripken Jun 1, 2022
cc75d3d
Merge branch 'pre-gufa' into gufa
kripken Jun 1, 2022
6c5affd
try to emit Exact{i32}
kripken Jun 6, 2022
8f4ff06
Merge branch 'pre-gufa' into gufa
kripken Jun 6, 2022
ec3de02
Optimize the case of Exact{i32} in the flow
kripken Jun 6, 2022
2c3fb78
comments
kripken Jun 6, 2022
8196125
fix
kripken Jun 6, 2022
815e40f
Merge branch 'pre-gufa' into gufa
kripken Jun 6, 2022
53a7d60
SpecialLocation=
kripken Jun 6, 2022
b733ada
Merge branch 'pre-gufa' into gufa
kripken Jun 6, 2022
3ddd0a8
isorecursive too
kripken Jun 6, 2022
e67e76d
comment
kripken Jun 7, 2022
a710d52
improve comment
kripken Jun 8, 2022
e3d371e
Merge remote-tracking branch 'origin/main' into pre-gufa
kripken Jun 8, 2022
97a9ae3
comment about flow
kripken Jun 8, 2022
372d2bd
comment
kripken Jun 8, 2022
ac23076
comment
kripken Jun 8, 2022
5bac7e2
Merge branch 'pre-gufa' into gufa
kripken Jun 8, 2022
ee1eb57
simplify
kripken Jun 8, 2022
7d3288e
Merge branch 'pre-gufa' into gufa
kripken Jun 8, 2022
b73b5cb
try to make clang-tidy happy
kripken Jun 8, 2022
c07d878
Merge branch 'pre-gufa' into gufa
kripken Jun 8, 2022
93e3404
feedback
kripken Jun 9, 2022
c46c375
Update src/ir/possible-contents.cpp
kripken Jun 9, 2022
0ff332f
Merge remote-tracking branch 'origin/pre-gufa' into gufa
kripken Jun 21, 2022
556e59f
Merge remote-tracking branch 'origin/main' into gufa
kripken Jun 21, 2022
9825e67
merge
kripken Jun 24, 2022
789293f
format
kripken Jun 24, 2022
8832b52
fix
kripken Jun 24, 2022
4ec50b6
format
kripken Jun 24, 2022
1973c77
Merge remote-tracking branch 'origin/main' into gufa
kripken Jun 28, 2022
91ac663
isorecursive too
kripken Jun 28, 2022
d2aea40
rename
kripken Jun 28, 2022
e1e9138
todo
kripken Jun 28, 2022
1725861
feedback
kripken Jun 30, 2022
41ae2f9
fix determinism
kripken Jun 30, 2022
295bc67
feedback
kripken Jul 1, 2022
4790669
todo
kripken Jul 1, 2022
5400a37
Rename and simplify test
kripken Jul 6, 2022
3d1051f
Update test/lit/passes/gufa-refs.wast
kripken Jul 6, 2022
3c65662
compute shallow effects when we'll keep children anyhow
kripken Jul 6, 2022
ce6a45d
Merge remote-tracking branch 'origin/main' into gufa
kripken Jul 6, 2022
0d53d84
Merge remote-tracking branch 'origin/gufa' into gufa
kripken Jul 6, 2022
e4987cb
fix
kripken Jul 6, 2022
adef90c
comment
kripken Jul 8, 2022
95b8ea5
[GUFA] Simplify routines for dropping children (NFC)
aheejin Jul 9, 2022
3704951
Restore comments
aheejin Jul 9, 2022
0da2d8c
clang-format
aheejin Jul 9, 2022
68fcea1
feedback
kripken Jul 11, 2022
483087c
4598
aheejin Jul 12, 2022
7988682
Merge branch '4598' into improve_remove
aheejin Jul 12, 2022
c00db9e
Revert "Fix binaryen.js to include allocate() explicitly (#4793)"
aheejin Jul 12, 2022
c6c0769
Revert "[Parser] Start to parse instructions (#4789)"
aheejin Jul 12, 2022
8f9e2a6
Revert "Revert "[Parser] Start to parse instructions (#4789)""
aheejin Jul 12, 2022
8955856
Revert "Revert "Fix binaryen.js to include allocate() explicitly (#47…
aheejin Jul 12, 2022
1eed7e0
Revert "Merge branch '4598' into improve_remove"
aheejin Jul 12, 2022
d556760
feedback
kripken Jul 11, 2022
14d83ef
Revert "feedback"
aheejin Jul 12, 2022
4a1ca02
Add `optimize = true`s
aheejin Jul 12, 2022
8925759
Test changes
aheejin Jul 12, 2022
5e3d67c
Merge remote-tracking branch 'origin/main' into gufa
kripken Jul 12, 2022
5198ccc
Simplify routines for dropping children
aheejin Jul 18, 2022
cd8aa1e
Revert "Simplify routines for dropping children"
aheejin Jul 18, 2022
4c3842b
Simplify routines for dropping children (NFC)
aheejin Jul 18, 2022
2 changes: 2 additions & 0 deletions scripts/fuzz_opt.py
@@ -1158,6 +1158,8 @@ def write_commands(commands, filename):
     ["--global-refining"],
     ["--gsi"],
     ["--gto"],
+    ["--gufa"],
+    ["--gufa-optimizing"],
     ["--local-cse"],
     ["--heap2local"],
     ["--remove-unused-names", "--heap2local"],
8 changes: 4 additions & 4 deletions src/ir/drop.h
@@ -32,10 +32,10 @@ namespace wasm {
 //
 // The caller must also pass in a last item to append to the output (which is
 // typically what the original expression is replaced with).
-Expression* getDroppedChildrenAndAppend(Expression* curr,
-                                        Module& wasm,
-                                        const PassOptions& options,
-                                        Expression* last) {
+inline Expression* getDroppedChildrenAndAppend(Expression* curr,
+                                               Module& wasm,
+                                               const PassOptions& options,
+                                               Expression* last) {
   Builder builder(wasm);
   std::vector<Expression*> contents;
   for (auto* child : ChildIterator(curr)) {
1 change: 1 addition & 0 deletions src/passes/CMakeLists.txt
@@ -40,6 +40,7 @@ set(passes_SOURCES
   GlobalRefining.cpp
   GlobalStructInference.cpp
   GlobalTypeOptimization.cpp
+  GUFA.cpp
   Heap2Local.cpp
   I64ToI32Lowering.cpp
   Inlining.cpp
308 changes: 308 additions & 0 deletions src/passes/GUFA.cpp
@@ -0,0 +1,308 @@
/*
* Copyright 2022 WebAssembly Community Group participants
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

//
// Grand Unified Flow Analysis (GUFA)
//
// Optimize based on information about what content can appear in each location
// in the program. This does a whole-program analysis to find that out and
// hopefully learn more than the type system does - for example, a type might be
// $A, which means $A or any subtype can appear there, but perhaps the analysis
// can find that only $A', a particular subtype, can appear there in practice,
// and not $A or any subtypes of $A', etc. Or, we may find that no type is
// actually possible at a particular location, say if we can prove that the
// casts on the way to that location allow nothing through. We can also find
// that only a particular value is possible of that type.
//
// GUFA will infer constants and unreachability, and add those to the code. This
// can increase code size if further optimizations are not run later like dead
// code elimination and vacuum. The "optimizing" variant of this pass will run
// such followup opts automatically in functions where we make changes, and so
// it is useful if GUFA is run near the end of the optimization pipeline.
//
// TODO: GUFA + polymorphic devirtualization + traps-never-happen. If we see
// that the possible call targets are {A, B, C}, and GUFA info lets us
// prove that A, C will trap if called - say, if they cast the first
// parameter to something GUFA proved it cannot be - then we can ignore
// them, and devirtualize to a call to B.
//

#include "ir/drop.h"
#include "ir/eh-utils.h"
#include "ir/possible-contents.h"
#include "ir/properties.h"
#include "ir/utils.h"
#include "pass.h"
#include "wasm.h"

namespace wasm {

namespace {

struct GUFAOptimizer
: public WalkerPass<
PostWalker<GUFAOptimizer, UnifiedExpressionVisitor<GUFAOptimizer>>> {
bool isFunctionParallel() override { return true; }

ContentOracle& oracle;
bool optimizing;

GUFAOptimizer(ContentOracle& oracle, bool optimizing)
: oracle(oracle), optimizing(optimizing) {}

GUFAOptimizer* create() override {
return new GUFAOptimizer(oracle, optimizing);
}

bool optimized = false;

// Check if removing something (but not its children - just the node itself)
// would be ok structurally - whether the IR would still validate.
bool canRemoveStructurally(Expression* curr) {
// We can remove almost anything, but not a branch target, as we still need
// the target for the branches to it to validate.
if (BranchUtils::getDefinedName(curr).is()) {
return false;
}

// Pops are structurally necessary in catch bodies, and removing a try could
// leave a pop without a proper parent.
return !curr->is<Pop>() && !curr->is<Try>();
}

// Whether we can remove something (but not its children) without changing
// observable behavior or breaking validation.
bool canRemove(Expression* curr) {
if (!canRemoveStructurally(curr)) {
return false;
}
return !EffectAnalyzer(getPassOptions(), *getModule(), curr)
.hasUnremovableSideEffects();
}

// Whether we can replace something (but not its children, we can keep them
// with drops) with an unreachable without changing observable behavior or
// breaking validation.
bool canReplaceWithUnreachable(Expression* curr) {
if (!canRemoveStructurally(curr)) {
return false;
}
EffectAnalyzer effects(getPassOptions(), *getModule(), curr);
// Ignore a trap, as the unreachable replacement would trap too.
effects.trap = false;
return !effects.hasUnremovableSideEffects();
}

void visitExpression(Expression* curr) {
// Skip things we can't improve in any way.
auto type = curr->type;
if (type == Type::unreachable || type == Type::none ||
Properties::isConstantExpression(curr)) {
return;
}

if (type.isTuple()) {
// TODO: tuple types.
return;
}

if (type.isRef() && (getTypeSystem() != TypeSystem::Nominal &&
getTypeSystem() != TypeSystem::Isorecursive)) {
// Without type info we can't analyze subtypes, so we cannot infer
// anything about refs.
return;
}

// Ok, this is an interesting location that we might optimize. See what the
// oracle says is possible there.
auto contents = oracle.getContents(ExpressionLocation{curr, 0});

auto& options = getPassOptions();
auto& wasm = *getModule();
Builder builder(wasm);

auto replaceWithUnreachable = [&]() {
if (canReplaceWithUnreachable(curr)) {
replaceCurrent(getDroppedChildrenAndAppend(
curr, wasm, options, builder.makeUnreachable()));
} else {
// We can't remove this, but we can at least put an unreachable
// right after it.
replaceCurrent(builder.makeSequence(builder.makeDrop(curr),
builder.makeUnreachable()));
}
optimized = true;
};

if (contents.getType() == Type::unreachable) {
// This cannot contain any possible value at all. It must be unreachable
// code.
replaceWithUnreachable();
return;
}

// This is reachable. Check if we can emit something optimized for it.
// TODO: can we handle more general things here too?
if (!contents.canMakeExpression()) {
return;
}

if (contents.isNull() && curr->type.isNullable()) {
// Null values are all identical, so just fix up the type here (the null's
// type might not fit in this expression).
contents =
PossibleContents::literal(Literal::makeNull(curr->type.getHeapType()));

// Note that if curr's type is *not* nullable, then the code will trap at
// runtime (the null must arrive through a cast that will trap). We handle
// that below, so we don't need to think about it here.

// TODO: would emitting a more specific null be useful when valid?
}

auto* c = contents.makeExpression(wasm);

// We can only place the constant value here if it has the right type. For
// example, a block may return (ref any), that is, not allow a null, but in
// practice only a null may flow there if it goes through casts that will
// trap at runtime.
// TODO: GUFA should eventually do this, but it will require it properly
// filtering content not just on ref.cast as it does now, but also
// ref.as etc. Once it does those we could assert on the type being
// valid here.
if (Type::isSubType(c->type, curr->type)) {
if (canRemove(curr)) {
replaceCurrent(getDroppedChildrenAndAppend(curr, wasm, options, c));
Member

Why can't we remove its children too? This is within canRemove, which checks EffectAnalyzer to see if it has any side effects. So that we are here means curr can be removed safely, no? The same question for this function's use in replaceWithUnreachable.

Member Author

Very good point! Yes, this could be a lot better. I fixed it to use ShallowEffectAnalyzer which is all we need since we'll keep the children if we actually need them. This optimizes more code + it's faster.

Member

I think using ShallowEffectAnalyzer is better, but what I was asking about was, why do we even need to keep children if EffectAnalyzer.hasUnremovableSideEffects() returns false?

Some expression as a whole, including its children, may not have a side effect even when some of its children have one. For example,

(block $label0
  (br $label0)
)

(br $label0) has a side effect, but the whole (block ... ) doesn't. I guess we don't encounter this case in this pass because we don't remove named blocks anyway, but actually in this case this block may be removable, because this block as a whole does not have any side effects.

So, if we are to preserve children anyway, using ShallowEffectAnalyzer is better for sure. But in the previous code, when we were using EffectAnalyzer, if that doesn't have any side effects, can't we remove the expression, including children, altogether? Do we even need to preserve children in that case as well?

Member Author

Yes, you're right that the earlier code was silly, since if EffectAnalyzer.hasUnremovableSideEffects() returns false then we don't need the children.

The new code seems optimal: we check for shallow effects which ignores the children, which allows us to remove more things, and then we do keep the children (if we need them).

Member

Isn't it more optimal if we don't keep children, because we can remove more code that way? I'm not sure why we need to keep children when EffectAnalyzer.hasUnremovableSideEffects returns false, which means the expression as a whole, including all children, doesn't have any side effects.

Member Author

I'm not sure why we need to keep children when EffectAnalyzer.hasUnremovableSideEffects returns false, which means the expression as a whole, including all children, doesn't have any side effects.

You're right, we don't need to keep children in that case. Sorry if I wasn't clear before. Yes, that was an issue with the old code.

Does the new code make sense though? The key thing is that if the children are not actually needed then they'll get removed somehow - either right now, or later in Vacuum/DCE. So if we use EffectAnalyzer.hasUnremovableSideEffects and remove the children then we're just removing them eagerly here, but the final result is the same. But it's better to check for shallow effects here, since then we might find we can remove an instruction without removing its children - so the new code handles more cases, without losing anything (anything but the eager removal).

Member

so the new code handles more cases, without losing anything (anything but the eager removal).

Is that always the case? There are cases when a child has a side effect but the parent, including all children, may not. Like the case I wrote above:

(block $label0
  (br $label0)
)

Also try-delegate is a similar case; a single try-delegate may have a side effect (because delegate can target an outer try) but the outer try as a whole may not have any side effects.

But yeah, these may not matter because we exclude blocks and trys in canRemoveStructurally or somewhere. But I wasn't sure it's always the case that "if a child is not needed, it will be removed by Vacuum eventually". Maybe it's true except in the cases of block and try, which we handle separately.

I think using ShallowEffectAnalyzer is better in terms of speed, because it's not O(n^2).

Member Author

Oh, right! Sorry for not understanding you before. I forgot about the possibility of the combination having fewer effects.

Yes, I think you're right. In principle this is an issue. However, canRemoveStructurally rules those cases out as you said. I think that whenever the combined expression has no effects but its pieces would have effects separately, we are in one of the cases where we can't remove things anyway.

I added a comment here.
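
To make the shallow-versus-deep distinction in this thread concrete, here is a minimal toy sketch in C++. It is not Binaryen's EffectAnalyzer or ShallowEffectAnalyzer: `Node`, `ownEffect`, `hasDeepEffects`, `hasShallowEffects`, `makeNode`, and `makeAddOfCall` are hypothetical names made up for illustration. The model assumes effects only accumulate upward, so it deliberately does not capture the block/br case discussed above, where a parent has fewer effects than its child.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy IR node (hypothetical, not Binaryen's Expression). `ownEffect` models an
// effect of the node itself, ignoring whatever its children do.
struct Node {
  bool ownEffect = false;
  std::vector<std::unique_ptr<Node>> children;
};

// "Deep" check: true if the node or anything below it has an effect.
bool hasDeepEffects(const Node& node) {
  if (node.ownEffect) {
    return true;
  }
  for (const auto& child : node.children) {
    if (hasDeepEffects(*child)) {
      return true;
    }
  }
  return false;
}

// "Shallow" check: looks only at the node itself.
bool hasShallowEffects(const Node& node) { return node.ownEffect; }

// Helper that allocates a childless node.
std::unique_ptr<Node> makeNode(bool ownEffect) {
  auto node = std::make_unique<Node>();
  node->ownEffect = ownEffect;
  return node;
}

// A pure parent (think of an add) whose first child has an effect (think of a
// call) and whose second child is pure (think of a constant).
Node makeAddOfCall() {
  Node add;
  add.children.push_back(makeNode(true));
  add.children.push_back(makeNode(false));
  return add;
}
```

In this model the parent built by `makeAddOfCall()` reports deep effects but no shallow ones, so a shallow check would allow replacing the parent while its effectful child is kept in a drop, which is the extra case the shallow approach is said to handle.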

Member

@aheejin aheejin Jul 9, 2022

OK now I think I understand better. Then what's the difference between the ones we exclude in canRemoveStructurally and the ones in getDroppedUnconditionalChildrenAndAppend?

In canRemoveStructurally, we exclude named blocks, trys, and pops. In getDroppedUnconditionalChildrenAndAppend, we exclude ifs and trys. But I think the thing they check should be the same, namely, whether the expression itself can be removed, regardless of whether its children can be removed, since those will be dropped separately.

So my question is, 1. Are they really the same? 2. If so, do we need two places to check this? 3. And, doesn't getDroppedUnconditionalChildrenAndAppend need to exclude pop and named blocks as well, in case it is used in places other than GUFA? So what I mean is, can we check all these within getDroppedUnconditionalChildrenAndAppend and remove canRemoveStructurally altogether? We are currently excluding expressions in three places (canRemove, canRemoveStructurally, and getDroppedUnconditionalChildrenAndAppend), which is confusing. If we can remove one of them that will be an improvement.

Member

Hmm, I think this can actually be simplified a lot more, but it may be better to do that as a follow-up, given that this PR was held up for too many weeks due to my laziness 😅 Sorry for that. I tried to do the follow-up in #4787.

} else {
// We can't remove this, but we can at least drop it and put the
// optimized value right after it.
replaceCurrent(builder.makeSequence(builder.makeDrop(curr), c));
Member

@aheejin aheejin Jul 1, 2022

Why is this beneficial? Doesn't this add expressions and increase the code size? I understand this pass is supposed to increase the code size that can be cleaned up by other passes such as DCE and Vacuum, but that we are in this else clause means the dropped expression has some side effects that cannot be removed. Then can DCE or Vacuum remove it? For example, if the dropped part contains calls or returns, DCE/Vacuum wouldn't be able to remove it and we still need to run them.

Member Author

There is some risk here, that is true. Perhaps this should only be done when not optimizing for size? However, propagating the constant to here will potentially open up opportunities to remove other code, so I'm not sure. Will add a comment now but let's keep thinking about it.

Member

Maybe we disable this part of code and do a quick check on perf and size, if that's easily doable?

Member Author

Testing now, code size is 1.6% smaller with this optimization. It's harder to test speed (we don't have great wasm GC benchmarks for this kind of thing), but emitting more constants is almost always better. So at least on this one example (j2wasm application) it looks worth doing.

Member

Hmm does that mean DCE or Vacuum was eventually able to remove those side-effect-having expressions? I was like, even if we emit an additional constant, if we cannot remove the original expression, that original expression still needs to be run..

Member Author

Yeah, there's no guarantee we can remove it. But sometimes dropping an expression lets us remove parts of it:

(select
  (call $get-A) ;; returns ref.func $foo
  (call $get-B) ;; also returns ref.func $foo
  (..condition..)
)
=>
(drop
  (select
    (call $get-A) ;; returns ref.func $foo
    (call $get-B) ;; also returns ref.func $foo
    (..condition..)
  )
)
(ref.func $foo)

Now that the select is dropped we don't need it, and vacuum will remove it and just leave the calls (and maybe the condition).

(Not sure if that's the main factor here though. Could also be that we don't remove any dropped code, but the constant lets us remove other code...)
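
The select example can be mimicked with a small toy model in C++. This is only a sketch under stated assumptions, not the actual vacuum pass: `Expr`, `keepEffects`, `vacuumDropped`, and `makeSelectOfCalls` are hypothetical names, the condition is assumed to be a pure `local.get`, and a node with its own effect is simply kept whole.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy expression (hypothetical, not Binaryen IR). `ownEffect` is an effect of
// the node itself, such as a call, ignoring its children.
struct Expr {
  std::string name;
  bool ownEffect;
  std::vector<Expr> children;
};

// When the value of `expr` is unused, a node with its own effect must stay,
// while a pure node can vanish and we only look into its children.
void keepEffects(const Expr& expr, std::vector<std::string>& kept) {
  if (expr.ownEffect) {
    kept.push_back(expr.name);
    return;
  }
  for (const auto& child : expr.children) {
    keepEffects(child, kept);
  }
}

// Names of the nodes that survive once the whole expression is dropped.
std::vector<std::string> vacuumDropped(const Expr& dropped) {
  std::vector<std::string> kept;
  keepEffects(dropped, kept);
  return kept;
}

// The select from the example: two calls plus an (assumed) pure condition.
Expr makeSelectOfCalls() {
  return Expr{"select",
              false,
              {Expr{"call $get-A", true, {}},
               Expr{"call $get-B", true, {}},
               Expr{"local.get $cond", false, {}}}};
}
```

In this model, dropping the select leaves only the two calls, matching the description above: the select itself and the pure condition disappear, and the `ref.func $foo` constant placed after the drop provides the value.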

Contributor

I wonder: what if we moved all the meaningful reduction code from the vacuum pass into a common util file and reused those routines for both vacuum and this pass? At least the functions that don't require a global analysis of the program, only of subexpressions. This would let you avoid growing the code and then removing it again through a separate pass. It would also let you evaluate the benefit of constant propagation ahead of time (at least in local cases).

Member Author

@MaxGraey The vacuum code is already reusable right now, isn't it? We reuse it from here already just by doing runner.add("vacuum");. If you mean we just need a subset of vacuum, I'm not sure that's true, we need basically all of it.

Contributor

runner.add("vacuum") runs at the per-function level, right? But it can't provide feedback. I mean something like tryVacuum(transform(expr)): if tryVacuum returns false (which means the code can't be cleaned up), just don't replace the current expression with the transformed one, to avoid code growth.

Member Author

@MaxGraey I see. I think that might make sense to do when optimizing for size perhaps. But the numbers from a few comments up show that even if we increase size slightly in a temporary way, the constants that we propagate end up helping more. So I think for now this is good enough, but maybe we can do better later.
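
For reference, the tryVacuum(transform(expr)) idea discussed above could be sketched roughly as below. This is only an illustration on a toy tree: `Node`, `countNodes`, `sizeAfterDropAndVacuum` and `worthReplacing` are hypothetical names, not existing Binaryen functions. It assumes a purely local size check (node counts) and ignores the downstream benefit of the propagated constant, which is the part the measurements in this thread suggest matters more.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression node (hypothetical; not Binaryen's Expression).
struct Node {
  bool hasSideEffects;
  std::vector<Node> children;
};

// Total number of nodes in the tree.
std::size_t countNodes(const Node& n) {
  std::size_t total = 1;
  for (const auto& child : n.children) {
    total += countNodes(child);
  }
  return total;
}

// Size of the code left over after the expression is dropped and cleaned
// up: a subtree with side effects is kept whole under its own drop (+1),
// while a node without side effects vanishes and only its children are
// considered.
std::size_t sizeAfterDropAndVacuum(const Node& n) {
  if (n.hasSideEffects) {
    return countNodes(n) + 1;
  }
  std::size_t total = 0;
  for (const auto& child : n.children) {
    total += sizeAfterDropAndVacuum(child);
  }
  return total;
}

// Local feedback: only replace when the leftovers plus the new constant
// (+1) are no larger than the original expression.
bool worthReplacing(const Node& original) {
  return sizeAfterDropAndVacuum(original) + 1 <= countNodes(original);
}
```

Under these assumptions a select of two side-effecting calls would be rejected (4 nodes become 5), while a block holding only a constant would be accepted. That matches the observation that the transform can grow code slightly on its own, so a check like this would mainly be interesting when optimizing for size.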

      }
      optimized = true;
    } else {
      // The type is not compatible: we cannot place |c| in this location, even
      // though we have proven it is the only value possible here.
      if (Properties::isConstantExpression(c)) {
        // The type is not compatible and this is a simple constant expression
        // like a ref.func. That means this code must be unreachable. (See below
        // for the case of a non-constant.)
        replaceWithUnreachable();
      } else {
        // This is not a constant expression, but we are certain it is the right
        // value. Atm the only such case we handle is a global.get of an
        // immutable global. We don't know what the value will be, nor its
        // specific type, but we do know that a global.get will get that value
        // properly. However, in this case it does not have the right type for
        // this location. That can happen since the global.get does not have
        // exactly the proper type for the contents: the global.get might be
        // nullable, for example, even though the contents are not actually a
        // null. For example, consider what happens here:
        //
        //  (global $foo (ref null any) (struct.new $Foo))
        //  ..
        //  (ref.as_non_null
        //   (global.get $foo))
        //
        // We create a $Foo in the global $foo, so its value is not a null. But
        // the global's type is nullable, so the global.get's type will be as
        // well. When we get to the ref.as_non_null, we then want to replace it
        // with a global.get - in fact that's what its child already is, showing
        // it is the right content for it - but that global.get would not have a
        // non-nullable type like a ref.as_non_null must have, so we cannot
        // simply replace it.
        //
        // For now, do nothing here, but in some cases we could probably
        // optimize (e.g. by adding a ref.as_non_null in the example) TODO
        assert(c->is<GlobalGet>());
      }
    }
  }

  // TODO: If an instruction would trap on null, like struct.get, we could
  //       remove it here if it has no possible contents. That information
  //       is present in OptimizeInstructions where it removes redundant
  //       ref.as_non_null, so maybe there is a way to share that.

  void visitFunction(Function* func) {
    if (!optimized) {
      return;
    }

    // Optimization may introduce more unreachables, which we need to
    // propagate.
    ReFinalize().walkFunctionInModule(func, getModule());

    // We may add blocks around pops, which we must fix up.
    EHUtils::handleBlockNestedPops(func, *getModule());

    // If we are in "optimizing" mode, we'll also run some more passes on this
    // function that we just optimized. If not, leave now.
    if (!optimizing) {
      return;
    }

    PassRunner runner(getModule(), getPassOptions());
    runner.setIsNested(true);
    // New unreachables we added have created dead code we can remove. If we do
    // not do this, then running GUFA repeatedly can actually increase code size
    // (by adding multiple unneeded unreachables).
    runner.add("dce");
    // New drops we added allow us to remove more unused code and values. As
    // with unreachables, without a vacuum we may increase code size as in
    // nested expressions we may apply the same value multiple times:
    //
    //  (block $out
    //   (block $in
    //    (i32.const 10)))
    //
    // In each of the blocks we'll infer the value must be 10, so we'll end up
    // with this repeating code:
    //
    //  (block ;; a new block just to drop the old outer block
    //   (drop
    //    (block $out
    //     (drop
    //      (block $in
    //       (i32.const 10)
    //      )
    //     )
    //     (i32.const 10)
    //    )
    //   )
    //   (i32.const 10)
    //  )
    runner.add("vacuum");
    runner.runOnFunction(func);
  }
};

struct GUFAPass : public Pass {
  bool optimizing;

  GUFAPass(bool optimizing) : optimizing(optimizing) {}

  void run(PassRunner* runner, Module* module) override {
    ContentOracle oracle(*module);
    GUFAOptimizer(oracle, optimizing).run(runner, module);
  }
};

} // anonymous namespace

Pass* createGUFAPass() { return new GUFAPass(false); }
Pass* createGUFAOptimizingPass() { return new GUFAPass(true); }

} // namespace wasm
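
The three outcomes described in the comments of the replacement logic above (compatible type, incompatible constant, incompatible non-constant such as a global.get) can be summarized in a small standalone sketch. The `Action` enum and `decide` function are hypothetical and exist only for illustration; the real pass works on Binaryen expressions and types, and the two booleans here stand in for the type-compatibility check and `Properties::isConstantExpression(c)`.

```cpp
#include <cassert>

// Hypothetical summary of what the optimizer does at a location whose only
// possible content |c| has been inferred.
enum class Action {
  ReplaceWithValue,       // |c| fits the location's type: use it
  ReplaceWithUnreachable, // a constant that cannot fit: code is unreachable
  LeaveAsIs               // e.g. a nullable global.get: do nothing for now
};

// typeIsCompatible: whether |c| can be placed in this location.
// isConstantExpression: whether |c| is a simple constant like a ref.func.
Action decide(bool typeIsCompatible, bool isConstantExpression) {
  if (typeIsCompatible) {
    return Action::ReplaceWithValue;
  }
  if (isConstantExpression) {
    // We proved this is the only possible value, yet it cannot appear
    // here, so this location can never be reached.
    return Action::ReplaceWithUnreachable;
  }
  // Not a constant (the asserted case is a global.get of an immutable
  // global whose declared type is less precise than its contents).
  return Action::LeaveAsIs;
}
```

The last branch is the conservative one: as the TODO in the source notes, it could probably be improved in some cases, for example by adding a ref.as_non_null.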
12 changes: 10 additions & 2 deletions src/passes/pass.cpp
@@ -168,10 +168,18 @@ void PassRegistry::registerPasses() {
     "generate-stack-ir", "generate Stack IR", createGenerateStackIRPass);
   registerPass(
     "global-refining", "refine the types of globals", createGlobalRefiningPass);
-  registerPass(
-    "gto", "globally optimize GC types", createGlobalTypeOptimizationPass);
   registerPass(
     "gsi", "globally optimize struct values", createGlobalStructInferencePass);
+  registerPass(
+    "gto", "globally optimize GC types", createGlobalTypeOptimizationPass);
+  registerPass("gufa",
+               "Grand Unified Flow Analysis: optimize the entire program using "
+               "information about what content can actually appear in each "
+               "location",
+               createGUFAPass);
+  registerPass("gufa-optimizing",
+               "GUFA plus local optimizations in functions we modified",
+               createGUFAOptimizingPass);
   registerPass("type-refining",
                "apply more specific subtypes to type fields where possible",
                createTypeRefiningPass);
2 changes: 2 additions & 0 deletions src/passes/passes.h
@@ -54,6 +54,8 @@ Pass* createGenerateStackIRPass();
 Pass* createGlobalRefiningPass();
 Pass* createGlobalStructInferencePass();
 Pass* createGlobalTypeOptimizationPass();
+Pass* createGUFAPass();
+Pass* createGUFAOptimizingPass();
 Pass* createHeap2LocalPass();
 Pass* createI64ToI32LoweringPass();
 Pass* createInlineMainPass();
9 changes: 9 additions & 0 deletions test/lit/help/wasm-opt.test
@@ -173,6 +173,15 @@
 ;; CHECK-NEXT:
 ;; CHECK-NEXT: --gto globally optimize GC types
 ;; CHECK-NEXT:
+;; CHECK-NEXT: --gufa Grand Unified Flow Analysis:
+;; CHECK-NEXT: optimize the entire program
+;; CHECK-NEXT: using information about what
+;; CHECK-NEXT: content can actually appear in
+;; CHECK-NEXT: each location
+;; CHECK-NEXT:
+;; CHECK-NEXT: --gufa-optimizing GUFA plus local optimizations in
+;; CHECK-NEXT: functions we modified
+;; CHECK-NEXT:
 ;; CHECK-NEXT: --heap2local replace GC allocations with
 ;; CHECK-NEXT: locals
 ;; CHECK-NEXT: