ANFEncoder: iterate to fixpoint to eliminate nested duplicates#1135
Conversation
The ANF encoder extracts duplicated subexpressions into fresh variables,
but the existing single-pass implementation can leave large duplicated
sub-subexpressions behind.
Root cause: `removeSubsumed` drops candidate duplicates that are
subexpressions of other (larger) candidate duplicates, to avoid
creating redundant variable declarations. But this means that if
only the outer expression appears at the top level, the inner dupes
are hidden inside the lifted var declaration and never extracted.
Example (from PyAnalyzeLaurel benchmark check_storage_costs):
Original: assert Any_to_bool(Any_get(response, "Datapoints")) ...
After partial evaluation, Any_to_bool inlines its 7-branch body,
each branch referencing the argument. With Any_get also inlined
as an `ite (is-DictStrAny response) (DictStrAny_get ...)
(List_get ...)`, the Any_get expression ends up duplicated 62
times inside a single assert.
Old ANF output:
var $__anf.0 : bool := <9KB body with 62 duplicates of Any_get>;
assert $__anf.0
New ANF output (after iteration):
var $__anf.3 : Any := Any_get(response, "Datapoints");
var $__anf.0 : bool := <body using $__anf.3>;
assert $__anf.0
Effect: VC file size for VCs on one benchmark drops from 32KB to
11KB (~65% reduction), and another benchmark now completes
verification at 36s where it previously hit a 60s timeout.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR improves the Core ANF encoder so it repeatedly extracts duplicated subexpressions until reaching a fixpoint, reducing duplication that can remain hidden inside newly introduced var declarations after a single extraction pass.
Changes:
- Update
anfEncodeBodyto iterate until no further ANF encoder targets are found. - Adjust documentation to explain why fixpoint iteration is needed for nested-duplicate elimination.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖 Clean, well-motivated change. The fixpoint iteration is a natural solution to the nested-duplicate problem, and the early-exit on targets.isEmpty guarantees termination in practice (each pass extracts at least one duplicate, strictly reducing the pool of non-leaf subexpressions).
One suggestion below about test coverage for the new iteration behavior.
Three changes responding to the review threads: 1. Make the replacements map collision-safe (Copilot). The map's value type is now List (Expr × Expr) keyed by UInt64. Insertions append to the bucket list and lookups walk the list with structural `==`, so two distinct duplicates that share a hash do not displace each other — at most a few extra equality checks per lookup, and the bottom-up O(n) hash computation in replaceExprs is preserved. 2. Convert anfEncodeBody from `partial def` to a real `def` with a structurally decreasing fuel parameter (joscoh). The fuel value is the total expression size of the initial body, which is a sound upper bound on the iteration count. The docstring now states the termination argument explicitly: each pass replaces every duplicate occurrence by a fresh fvar (a leaf, filtered from future S(body)) and adds at most one var-decl init per duplicate (already in S(body)), so S strictly shrinks in non-trivial passes and is finite. 3. Add a multi-pass regression test (MikaelMayer). `nestedDupProg` has `(x + 1) * 2` (duplicate) and `x + 1` (subsumed by the larger duplicate in pass 1). Pass 1 lifts `(x + 1) * 2`; pass 2 then lifts `x + 1` after it becomes visible in the new var-decl init and the third assert. The single-pass version of the encoder would leave the inner duplicate in place. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 LGTM — all comments addressed.
Drop the "(cf. PR #1135 review)" parenthetical so the docstring is authoritative on its own (per MikaelMayer). Replace it with a sentence that makes explicit what the parenthetical was hand-waving toward: the collision-safe lookup is what lets anfEncodeBody's termination argument claim that every duplicate found by findANFEncoderTargets is actually rewritten on the same pass, so no unreplaced duplicate can survive into the next iteration. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All previous comments appear addressed — reviewer sign-off still needed.
The ANF encoder extracts duplicated subexpressions into fresh variables, but the existing single-pass implementation can leave large duplicated sub-subexpressions behind.
Root cause:
removeSubsumeddrops candidate duplicates that are subexpressions of other (larger) candidate duplicates, to avoid creating redundant variable declarations. But this means that if only the outer expression appears at the top level, the inner dupes are hidden inside the lifted var declaration and never extracted.Example (from PyAnalyzeLaurel benchmark check_storage_costs):
Original: assert Any_to_bool(Any_get(response, "Datapoints")) ...
After partial evaluation, Any_to_bool inlines its 7-branch body,
each branch referencing the argument. With Any_get also inlined
as an
ite (is-DictStrAny response) (DictStrAny_get ...) (List_get ...), the Any_get expression ends up duplicated 62times inside a single assert.
Old ANF output:
var $__anf.0 : bool := <9KB body with 62 duplicates of Any_get>;
assert $__anf.0
New ANF output (after iteration):
var $__anf.3 : Any := Any_get(response, "Datapoints");
var $__anf.0 : bool := <body using $__anf.3>;
assert $__anf.0
Effect: VC file size for VCs on one benchmark drops from 32KB to 11KB (~65% reduction), and another benchmark now completes verification at 36s where it previously hit a 60s timeout.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.