After testing the fuzzer-generated input foo.clif, we identified three rule groups introduced by #12926 (commit 1578325) as candidates:
Rule A: ((x + y) - (x + z)) -> (y - z)
Rule B: ((x - z) - (y - z)) -> (x - y)
Rule C: ((x - y) - (x - z)) -> (z - y)
The regression disappears as soon as both Rule B and Rule C are removed from 1578325.
benchmark details
Baseline commit: 1578325
Measurement command: CRANELIFT_FILETESTS_THREADS=1 /usr/bin/time -p <clif-util> test <foo.clif>
Runs per scenario: 5
| scenario | removed groups | median real (s) | mean real (s) | min | max | stdev | stalled | slow | vs baseline |
|---|---|---|---|---|---|---|---|---|---|
| minus_group_ab | A + B | 31.11 | 31.196 | 30.96 | 31.52 | 0.228 | 5 | 5 | 1.03x |
| minus_group_ac | A + C | 31.21 | 31.230 | 31.02 | 31.63 | 0.218 | 5 | 5 | 1.03x |
| minus_group_bc | B + C | 0.17 | 0.164 | 0.14 | 0.18 | 0.016 | 0 | 0 | 0.01x |
| minus_group_abc | A + B + C | 0.15 | 0.156 | 0.14 | 0.18 | 0.016 | 0 | 0 | 0.00x |
Leaving either Rule B or Rule C active still blows up, so each of them is independently sufficient to trigger the regression. Group A has no measurable impact for this testcase.
Applying Rule B or Rule C to v_{k+1} = isub(v_k, v_k) yields the earlier expression, v_k. This means that rules can return the earlier value as an equivalent value while optimizing v_{k+1}. This equivalence is represented by a Union value, unless the rewrite subsumes the original or hits the eclass-size limit.
So later values can see both locally-created forms and forms already associated with earlier values such as v247.
Without Rule B/C, Rule D can still fire. But later repeated isub(v, v) values do not get rewritten back to earlier repeated isub values. Rule D is also bounded by the rewrite-depth limit.
With Rule B/C, later values can also be equivalent to earlier values, so Rule D gets many more valid bindings.
In short:
Rule B/C can rewrite a later isub(v, v) to a form already represented by an earlier value.
Rule D then has more equivalent forms to match against.
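The way Rule B/C fold a later isub(v, v) back onto an earlier value can be checked with a small term-rewriting sketch. This is not Cranelift or ISLE code: expressions are modeled as plain tuples, and the helper names (isub, iadd, rule_a, rule_b, rule_c) are hypothetical stand-ins for the three rule shapes listed above.

```python
# Toy model: a value is either a name (str) or a tuple (op, lhs, rhs).
# These helpers are illustrative only; they are not Cranelift APIs.

def isub(a, b):
    return ("isub", a, b)

def iadd(a, b):
    return ("iadd", a, b)

def is_op(e, op):
    return isinstance(e, tuple) and len(e) == 3 and e[0] == op

def rule_a(e):
    # ((x + y) - (x + z)) -> (y - z)
    if is_op(e, "isub") and is_op(e[1], "iadd") and is_op(e[2], "iadd"):
        (_, x1, y), (_, x2, z) = e[1], e[2]
        if x1 == x2:
            return isub(y, z)
    return None

def rule_b(e):
    # ((x - z) - (y - z)) -> (x - y)
    if is_op(e, "isub") and is_op(e[1], "isub") and is_op(e[2], "isub"):
        (_, x, z1), (_, y, z2) = e[1], e[2]
        if z1 == z2:
            return isub(x, y)
    return None

def rule_c(e):
    # ((x - y) - (x - z)) -> (z - y)
    if is_op(e, "isub") and is_op(e[1], "isub") and is_op(e[2], "isub"):
        (_, x1, y), (_, x2, z) = e[1], e[2]
        if x1 == x2:
            return isub(z, y)
    return None

v_prev = "v_prev"              # stands for v_{k-1}
v_k = isub(v_prev, v_prev)     # v_k     = isub(v_{k-1}, v_{k-1})
v_next = isub(v_k, v_k)        # v_{k+1} = isub(v_k, v_k)

print(rule_a(v_next))          # None: no iadd operands, Rule A does not match
print(rule_b(v_next) == v_k)   # True: Rule B returns the earlier expression
print(rule_c(v_next) == v_k)   # True: Rule C returns the earlier expression
```

With x = y = z = v_{k-1}, both Rule B and Rule C map v_{k+1} to isub(v_{k-1}, v_{k-1}), which is exactly v_k, while Rule A never matches a pure isub chain.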
Limits for egraph rewriting
The existing limits are local:
REWRITE_LIMIT bounds recursive optimization.
ECLASS_ENODE_LIMIT bounds the size of one tree.
MATCHES_LIMIT truncates results from one ISLE call.
Each of the 48 isub(v, v) values is optimized separately. Rule B can rewrite a later one back to the previous isub(v, v) value, so each local rewrite can match forms recorded for earlier values.
Additional off-by-one: REWRITE_LIMIT = 5 is checked before incrementing rewrite_depth (which starts from 0), so the maximum observed depth is 6.
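The off-by-one can be reproduced with a minimal sketch of a check-then-increment guard. This is a model, not the Cranelift source: it assumes the guard is a strict `rewrite_depth > REWRITE_LIMIT` comparison, which is the reading consistent with the observed depth of 6, and the function name max_rewrite_depth is hypothetical.

```python
REWRITE_LIMIT = 5

def max_rewrite_depth(limit, inclusive_check):
    """Return the deepest rewrite_depth reached by an unbounded chain of nested rewrites."""
    deepest = 0

    def optimize(depth):
        nonlocal deepest
        # The guard runs before the increment, as described above.
        # Assumption: the current guard is strict (depth > limit).
        stop = depth >= limit if inclusive_check else depth > limit
        if stop:
            return
        depth += 1
        deepest = max(deepest, depth)
        optimize(depth)  # model a rewrite that triggers another nested rewrite

    optimize(0)  # rewrite_depth starts from 0
    return deepest

print(max_rewrite_depth(REWRITE_LIMIT, inclusive_check=False))  # 6
print(max_rewrite_depth(REWRITE_LIMIT, inclusive_check=True))   # 5
```

Under this model, switching the comparison to `>=` (or checking after the increment) would cap the observed depth at REWRITE_LIMIT itself.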
Evidence
Per-top-level Rule D fire counts (first few repeated isub(v, v) values):
| top-level | fires |
|---|---|
| v247 | 6 |
| v248 | 6 |
| v249 | 44 |
| v251 | 138 |
This growth pattern accumulates to the observed 3.3M total fires.
Trace evidence that an earlier optimized value is returned again for later values (trace log):
1123: optimizing inst inst132 orig result v247 gave v830
1269: -> returned from ISLE: v248 -> [v830, v852]
1904: -> returned from ISLE: v249 -> [v830, v852, v932, v939, v944]
v830 appearing in later return lists shows that an earlier optimized value is returned again as an equivalent value for later inputs.
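The same check can be done mechanically over a trace. The sketch below is a hypothetical helper, not part of clif-util: its regular expressions assume only the two line shapes visible in the excerpt above, and it reports every optimized value that later shows up in the return list of a different input.

```python
import re

# The three trace lines quoted above, including their log line numbers.
TRACE = """\
1123: optimizing inst inst132 orig result v247 gave v830
1269: -> returned from ISLE: v248 -> [v830, v852]
1904: -> returned from ISLE: v249 -> [v830, v852, v932, v939, v944]
"""

# Assumed line shapes, taken from the excerpt only.
GAVE = re.compile(r"orig result (v\d+) gave (v\d+)")
RETURNED = re.compile(r"returned from ISLE: (v\d+) -> \[([^\]]*)\]")

def reused_values(trace):
    produced = {}  # optimized value -> original value it was produced for
    reuse = {}     # optimized value -> later inputs whose return list contains it
    for line in trace.splitlines():
        m = GAVE.search(line)
        if m:
            produced[m.group(2)] = m.group(1)
            continue
        m = RETURNED.search(line)
        if m:
            later = m.group(1)
            values = [v.strip() for v in m.group(2).split(",")]
            for v in values:
                if v in produced and produced[v] != later:
                    reuse.setdefault(v, []).append(later)
    return reuse

print(reused_values(TRACE))  # {'v830': ['v248', 'v249']}
```

On the excerpt, v830 (produced for v247) is reported as reused for both v248 and v249.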
Open question
Our interpretation is that this is not just a problem with one rewrite rule, but a compile-time risk in how e-graphs (and Cranelift's aegraph) expose accumulated equivalent alternatives during matching.
Does this interpretation sound reasonable, or would you think about this failure mode differently?
The original problem was shown here: #13068.
Observation
foo.clif summary
The fuzzer input has two main structures:
ishl values sharing a common shift amount (v214),
isub.i64x2(v, v) (48 times).
Additional observations:
The regression was resolved after vector support in "Support vector types in iconst_{u,s}" #13063 (f2807a1): isub.i64x2 v v collapses to 0 at the root.
Under 1578325 (Rule B/C active, pre-#13063 behavior), a single rule fires ~3,000,000 times. Let us call it Rule D.
Causality
From v_{k+1} = isub(v_k, v_k), where v_k = isub(v_{k-1}, v_{k-1}), one level of unfolding gives isub(isub(v_{k-1}, v_{k-1}), isub(v_{k-1}, v_{k-1})), the form that Rule B/C match.
Rule B/C can make a later repeated isub(v, v) value equivalent to an earlier one.
Rule D matches when both operands can be viewed as ishl expressions with the same shift amount.
B/C add the extra effect that a later value can become equivalent to an earlier repeated isub value.