The new Yul optimizer toolkit by xermicus · Pull Request #519 · paritytech/revive

xermicus · 2026-05-13T10:25:10Z

No description provided.

Signed-off-by: xermicus <bigcyrill@hotmail.com>

Implement type-inference-driven value narrowing in newyork codegen: 1. Let-binding narrowing: i256 values with inferred width ≤I64 and unsigned are truncated to i64 at Let binding sites, enabling native RISC-V arithmetic downstream. 2. Arithmetic dispatch: Add/Mul check inferred result width. If result fits in i64, use ensure_min_width for native ops. Sub always uses i256 to handle negative results correctly. 3. Pointer-site narrowing: Memory offsets/lengths narrowed from i256 to i64 at use sites (mstore, mload, codecopy, etc.). 4. Fix condition_stmts forward inference: The forward type inference pass was not visiting Let statements inside for-loop condition blocks, causing large constants (e.g., type(int256).min) to retain default I1 width and be incorrectly truncated. 5. Fix inliner for large contracts: Single-call functions larger than 40 IR nodes are now deferred to LLVM's inliner instead of being inlined at IR level. This prevents creating monolithic dispatcher functions that cause regressions on OpenZeppelin contracts. Results: OZ ERC20 -3.5%, OZ ERC721 -3.9%, small benchmarks -1.5% to -3.1%. All 62 integration tests, 5851 retester tests, and make test pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add a truncate-and-extend pattern after loading the free memory pointer (mload(64)) to prove to LLVM that the value fits in 64 bits. This enables LLVM to eliminate downstream overflow checks and simplify surrounding arithmetic, yielding significant code size reductions. Also adds revert(0,K) block deduplication using shared basic blocks keyed by constant revert length. Results vs standard pipeline: - Integration ERC20: -10.8% (was -5.1%) - Integration SHA1: -7.9% (was -1.4%) - OZ ERC20: -11.7% (74,508 vs 84,364 bytes) - OZ ERC721: -9.4% (84,950 vs 93,730 bytes) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Key changes: - FMP range proof changed from 64-bit to 32-bit to match XLEN, eliminating 60 more overflow trap sites in OZ ERC20 - Added apply_range_proof() helper for reusable range proofs - Added range proofs for timestamp (64), number (64), chainid (64), basefee (128) - Documented which builtins already have implicit range proofs - Memory region annotation in simplify pass using constant tracking - Updated IR_PLAN.md with optimization analysis and failed experiments Results: ERC20 -12.6%, Flipper -10.8%, Computation -10.5% vs standard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…cture Conditional sbrk NoInline: mark __sbrk_internal as NoInline only for contracts with >20 heap operations. ERC20 saves 359 bytes while small contracts remain unaffected. Interprocedural parameter narrowing: infrastructure to narrow function parameter types from I256 based on body-internal usage analysis. Currently no parameters meet the strict criteria but the code path is ready for broader narrowing rules. ERC20: 14287 bytes (-14.7% vs standard 16757) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The try_narrow_let_binding was overriding the 32-bit FMP range proof (trunc i256 → i32; zext i32 → i256) with a weaker 64-bit proof (trunc i256 → i64). Now detects zext source width and preserves tighter existing proofs. Result: 17 fewer overflow checks in LLVM IR, 288 fewer IR lines, though PVM blob size unchanged (PolkaVM already handles dead overflow blocks efficiently). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Major newyork IR optimizations: 1. Keccak256Pair IR node: fuses mstore(0,w0)+mstore(32,w1)+keccak256(0,64) into a single __revive_keccak256_two_words call. 13 sites in ERC20, saving ~529 bytes (-3.7%). 2. FMP constant propagation: replaces mload(0x40) with known constant when the free memory pointer hasn't been modified. Handles Statement::Block wrapping, Statement::Expr void calls, and builds transitive FMP-writer closure via call graph analysis. 3. Interprocedural parameter narrowing: propagates caller argument widths to callee parameters in the type inference forward pass. Narrows function signatures (e.g., assert_helper(i32), finalize_allocation(i64, i256)). 4. Conditional sbrk NoInline: only outline __sbrk_internal when the contract has >20 heap operations (saves ~359 bytes on ERC20 without regressing small contracts). 5. Load-after-store forwarding in mem_opt: when mload matches a tracked mstore to the same offset, forward the stored value directly. 6. Memory region annotation in simplify pass: tag MLoad/MStore with FreePointerSlot/Scratch/Dynamic regions from constant offset analysis. Code size improvements (newyork vs previous): - ERC20: 14287 → 13051 (-8.7%) - SHA1: 6748 → 5930 (-12.1%) - Computation: 1654 → 1577 (-4.7%) - DivisionArithmetics: 13501 → 13243 (-1.9%) All 62 integration tests pass with RESOLC_USE_NEWYORK=1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two new code size optimizations: 1. PanicRevert IR node: detects the Solidity panic pattern (mstore(0, 0x4e487b71...) + mstore(4, code) + revert(0, 0x24)) and replaces it with a PanicRevert { code } statement that lowers to a shared helper function. Each panic code (0x11 overflow, 0x22 encoding, 0x41 memory, etc.) gets its own shared block, deduplicating the 3-statement pattern across many call sites. 2. Shared return blocks: when multiple return(offset, length) statements have identical constant arguments, they branch to a single shared LLVM basic block instead of each generating their own exit sequence (sbrk + ptrtoint + seal_return). Code size improvements: - ERC20: 13051 → 12681 (-2.8%) - DivisionArithmetics: 13243 → 12950 (-2.2%) All 62 integration tests pass with RESOLC_USE_NEWYORK=1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1. Common subexpression elimination for pure environment reads (calldatasize, callvalue, caller, origin, address). After inlining, multiple switch cases redundantly call these builtins. CSE replaces 2nd+ occurrences with references to the first binding, enabling LLVM to eliminate redundant loads. 2. Storage load/store by-value functions: pass key and value as i256 directly instead of through alloca+pointer. Eliminates 2x alloca + 2x store overhead per storage operation. 3. Outlined caller_word function: moves the syscall + bswap + zext sequence into a shared function to avoid code duplication. 4. Fix clippy: remove useless .into() on BasicValueEnum arguments. Code size improvements: - ERC20: 12681 → 12386 (-2.3%) - Flipper: 1507 → 1493 (-0.9%) All 62 integration tests pass with RESOLC_USE_NEWYORK=1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Earlier commits 549488e ("narrow finalize_allocation params via structural detection") and 06c39be ("extend allocator-shape detection to single-param size validators") narrowed both `p0` and `p1` of `finalize_allocation`, plus *any* I256 param whose entry guard was `if gt(p, UINT64_MAX) { panic }`. The panic guard is the part that breaks soundness: `new uint[](2 << 100)` is a valid Solidity expression that *intentionally* triggers panic 0x41 via that very check. Plain-truncating the i256 size argument at the call site silently drops the high bits before the guard observes them, so the wide value never trips the comparison and the panic is lost. The differential tests `complex/try_catch/call/panic/test.json` and `complex/try_catch/create/panic/test.json` (cases 7 and 16) cover this exact path; both regressed with the previous narrowing. Restrict Pattern 1 to narrowing only `p0` — that's the FMP-typed addend sourced from `mload(0x40)` in OZ contracts, bounded by heap_size, so plain truncation is sound. Drop Pattern 2 entirely; the body-internal AND-mask narrowing from the iter25/26 guard_narrow is still inserted, so the function body itself benefits from narrowed downstream arithmetic — just not via a narrowed signature. Verified: all 564 try_catch retester cases pass; clippy and fmt clean.

Add a structural rewrite that, for every `finalize_allocation`-shape function (`(i256, i256) -> ()` whose body has the `mstore(0x40, sum)` + `add(p0, _)` + `if or(gt(sum, UINT64_MAX), lt(sum, p0)) panic` triple), prepends an explicit `if gt(p1, UINT64_MAX) panic_0x41()` plus a downstream `p1 = and(p1, UINT64_MAX)` rewrite. All subsequent uses of `p1` in the body are redirected to the AND-masked value. Soundness: the inserted gt/panic pair is *redundant* with the body's existing overflow check — any caller that would have failed the early check would also have failed the existing late one, so semantics are preserved. The point is to expose the `p1 ≤ UINT64_MAX` invariant at the *top* of the function so type-inference / LLVM IPSCCP can fold the body's i256 arithmetic to i128 / i64. The function signature remains `(i64, i256)`, so caller IR is unchanged — sidestepping the 12-test panic-0x41 regression that the iter34 signature-narrowing attempt produced. After this rewrite, LLVM lowers the body to: %gt = icmp samesign ugt i256 %1, UINT64_MAX br i1 %gt, label %panic_0x41, label %if_join %2 = trunc nuw nsw i256 %1 to i128 ; rest of body is i128 arithmetic vs. the previous full-i256 body. Code-size impact across the OZ contracts (vs iter33 baseline 364,534): erc1155 -9 erc20 -8 erc721 -7 oz_gov -4 oz_rwa -24 oz_simple -33 oz_stable -122 proxy 0 Total -207 (-0.06%) Verified: 70/70 integration tests, 49/49 newyork unit tests, 564/564 try_catch/{call,create}/panic retester cases (the iter34 regression suite), and 674/674 wider sample (try_catch + create + defi + array_one_element). Format and clippy clean.

Previously the post-MLoad range proof on `mload(0x40)` only fired when `can_use_native(0x40) || fmp_native_safe()` — i.e. when the contract was clean enough to use little-endian native storage. Most OZ contracts have dynamic memory accesses (array allocations, ABI copies) and so fail that gate, ending up with a 4-limb i256 FMP load + bswap + or chain at every `mload(0x40)` site. The native-safe check is about byte ORDER (InlineNative writes LE, ByteSwap reads BE — they must agree). The range bound is independent: sbrk enforces FMP < heap_size on every store regardless of mode, so the value loaded from 0x40 is always small. Removing the `can_use_native` gate from the range-proof condition gives LLVM a tight `<heap_size_bits>` range on every FMP load, which lets it DCE three of the four limb loads + bswaps + ors and the trailing `safe_truncate_int_to_xlen` overflow check. In oz_gov this drops the qword load count from 151 to 49. ### Code-size impact (vs iter35 baseline 364,327) | Contract | iter35 | now | Δ | |------------|---------|---------|----------| | erc1155 | 36,952 | 34,002 | -2,950 | | erc20 | 50,709 | 46,691 | -4,018 | | erc721 | 57,668 | 53,943 | -3,725 | | oz_gov | 97,989 | 86,973 | -11,016 | | oz_rwa | 48,144 | 43,386 | -4,758 | | oz_simple | 17,970 | 17,757 | -213 | | oz_stable | 50,605 | 46,276 | -4,329 | | proxy | 4,290 | 4,028 | -262 | | **Total** | 364,327 | 332,956 | **-31,371 (-8.6%)** | Cumulative since 80b14de (iter25 baseline 369,926): **-36,970 bytes (-9.99%)** across iter25-36. ### Tested - 70/70 integration tests, 49/49 newyork unit tests. - 564/564 try_catch retester cases (panic-0x41 path intact). - 1149/1149 broad sample (entire `complex/` directory). - Format and clippy clean.

The FMP slot at heap[0x40] holds a heap offset in `[0, heap_size)` — its maximum value is `heap_size - 1`, not `heap_size`. The previous range proof used `bits(heap_size)`, which for a power-of-two heap (e.g. 131072 = 2^17) is one bit looser than necessary. For heap_size = 131072: previously 18 bits (mask 0x3FFFF = 262143), now 17 bits (mask 0x1FFFF = 131071) — a single i32 `and` with a smaller immediate, and a tighter range hint propagating downstream. ### Code-size impact (vs iter36 baseline 332,956) | Contract | iter36 | now | Δ | |-----------|---------|---------|-----| | erc1155 | 34,002 | 34,002 | 0 | | erc20 | 46,691 | 46,684 | -7 | | erc721 | 53,943 | 53,943 | 0 | | oz_gov | 86,973 | 86,973 | 0 | | oz_rwa | 43,386 | 43,372 | -14 | | oz_simple | 17,757 | 17,751 | -6 | | oz_stable | 46,276 | 46,262 | -14 | | proxy | 4,028 | 4,028 | 0 | | **Total** | 332,956 | 332,915 | **-41** | Cumulative since iter25 baseline (369,926): **-37,011 bytes (-10.00%)** — crosses the 10% mark. ### Tested - 70/70 integration, 49/49 newyork unit - 564/564 try_catch panic retester - 674/674 wider differential sample - 16613/16613 simple/ retester (entire `simple/` corpus, 0 failed)

…raps The "Free memory pointer range proof" section was 2 lines. Expand it to capture (a) the trunc-N → zext-256 encoding pattern, (b) the multiplicative IPSCCP propagation effect that explains why one codegen site dominates the optimizer's overall code-size reduction, and (c) the byte-order vs value-bound separation that motivated dropping the `fmp_native_safe()` gate from the load-side range proof. Add a "Soundness traps for FMP optimizations" subsection listing the three previously-found bugs and their regression coverage: - mload_at_fmp_slot (1fd6063) — byte-order mismatch via dynamic mload - mload_huge_offset_traps (ccca38d) — memory-offset narrowing bypass - FMP i32 shortcut removal (dbcfc92) — short stores breaking inline asm Each tells future contributors what to verify when touching FMP code, and what the three independent invariants are (encoding, value bound, stored width).

Signed-off-by: xermicus <bigcyrill@hotmail.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: xermicus <bigcyrill@hotmail.com>

elle-j · 2026-05-13T12:13:20Z

+/// Environment variable: when set, dumps the newyork IR for every translated object to
+/// `/tmp/newyork_ir_<object>.txt` after optimization passes have run.
+pub const NEWYORK_DUMP_IR_ENV: &str = "NEWYORK_DUMP_IR";


When debugging LLVM, I find it useful having the option to control after which specific pass to dump (e.g -print-after=<pass> instead of -print-after-all). We could consider adding something like that as well. It'd likely help debugging the optimizer.

I also think we should consider CLI flags/arguments for this.

elle-j · 2026-05-13T12:14:54Z

+pub struct SsaBuilder {
+    /// Next value ID to allocate.
+    next_value_id: u32,


Shouldn't SsaBuilder::next_value_id be of type ValueId rather than u32 directly? (Similarly in other places, such as InlineRemapper::next_value_id, some Simplifier fields, etc.)

elle-j · 2026-05-13T12:16:06Z

+    /// Defines a variable with the given name and value.
+    pub fn define(&mut self, name: &str, value: Value) {
+        self.current_scope.insert(name.to_string(), value);
+    }


define() currently handles both declarations and assignments. We could consider splitting the responsibility, e.g. into declare and assign, in order to catch some bugs earlier by adding appropriate assertions (e.g. asserting that the key does not already exist in the current scope, or that it does exist, etc.) (this is in addition to the current logic in validate.rs).

elle-j · 2026-05-13T12:17:59Z

+    /// Returns the scope that was exited (for computing modified variables).
+    pub fn exit_scope(&mut self) -> BTreeMap<String, Value> {
+        let exited = std::mem::take(&mut self.current_scope);
+        self.current_scope = self.scope_stack.pop().unwrap_or_default();


Instead of unwrap_or_default(), shouldn't we use only unwrap() (or expect()), since a correct implementation should not call pop() on an empty stack.

elle-j · 2026-05-13T12:22:42Z

+    }
+
+    /// Allocates a fresh value ID.
+    pub fn fresh_value(&mut self) -> ValueId {


Nit: The equivalent method on Simplifier is called fresh_id. The latter is clearer to me.

elle-j · 2026-05-13T12:27:20Z

+        for error in &errors {
+            log::warn!("IR validation error in {}: {}", ir_object.name, error);
+        }


Why are validation failures logged at warn and not error?

elle-j · 2026-05-13T12:29:38Z

+            let leave_overhead = leave_count * leave_count * LEAVE_OVERHEAD_PER_SITE * call_count;
+            let cost = (call_count - 1) * size * CODE_SIZE_COST_MULTIPLIER + leave_overhead;
+            let mut benefit = 0;
+
+            if size <= 15 && leave_count == 0 {
+                benefit += SMALL_FUNCTION_BONUS;
+            }
+
+            if call_count <= 3 && leave_count <= 1 {
+                benefit += 10;
+            }


Could we instead use named constants for the magic numbers in this cost-benefit calculation?

elle-j · 2026-05-13T12:35:20Z

+enum EnvRead {
+    CallDataSize,
+    CallValue,
+    Caller,
+    Origin,
+    Address,
+}
+
+/// Returns the `EnvRead` kind for an expression if it is a pure environment read.
+fn env_read_kind(expression: &Expression) -> Option<EnvRead> {
+    match expression {
+        Expression::CallDataSize => Some(EnvRead::CallDataSize),
+        Expression::CallValue => Some(EnvRead::CallValue),
+        Expression::Caller => Some(EnvRead::Caller),
+        Expression::Origin => Some(EnvRead::Origin),
+        Expression::Address => Some(EnvRead::Address),
+        _ => None,
+    }
+}


Looking at the Expressions, there seem to be more possible values that could become an EnvRead. Don't e.g. the *Hash variants or ChainId or CodeSize also stay constant for the invocation?

elle-j · 2026-05-13T12:38:16Z

+    let mut redirects: BTreeMap<FunctionId, FunctionId> = BTreeMap::new();
+
+    for function in object.functions.values() {
+        if function.size_estimate <= 3 {


Could we replace 3 with a constant instead. E.g. deduplicate_functions_fuzzy() uses if function.size_estimate < MIN_FUZZY_DEDUP_SIZE.

Late inline pass on top of the existing optimizer + cost-model retune + per-helper memory(...) attribute precision so GVN can fold across heap mstores. −8.95% on the OZ corpus vs the baseline. Co-authored-by: xermicus <bigcyrill@hotmail.com>

elle-j · 2026-05-15T15:21:46Z

+    let removed_count = redirects.len();
+    redirect_calls_in_block(&mut object.code, &redirects);
+    for function in object.functions.values_mut() {
+        redirect_calls_in_block(&mut function.body, &redirects);
+    }
+
+    for dup_id in redirects.keys() {
+        object.functions.remove(dup_id);
+    }


When redirecting the calls and removing the duplicate functinons (similarly in deduplicate_functions_fuzzy()), I don't see any updates to the canonicalized function's call_count.

I assume this would currently not set the NoInline attribute in some scenarios after deduplication even if the actual call count is >= 3, based on LlvmCodegen::set_inline_attribtutes()'s InlineDecision::CostBenefit match arm:

revive/crates/newyork/src/to_llvm.rs

Lines 3477 to 3485 in 6ea5672

Some(crate::InlineDecision::CostBenefit) => {

let force_noinline = function.call_count >= 3

|| function.size_estimate >= LARGE_FUNCTION_NOINLINE_THRESHOLD;

if force_noinline {

Some(revive_llvm_context::PolkaVMAttribute::NoInline)

} else {

None

}

}

elle-j · 2026-05-15T15:26:10Z

+
+/// Results of memory optimization.
+#[derive(Clone, Debug, Default)]
+pub struct MemOptResults {


MemOptResults doesn't actually seem to be used. I noticed that FmpPropagation::loads_eliminated only seemed to be incremented but not read, and when taking a closer look the same seemed to be true for the fields of MemOptResults.

elle-j · 2026-05-15T15:27:13Z

+/// the constant 0x80 through them.
+pub struct FmpPropagation {
+    /// Number of FMP loads eliminated.
+    pub loads_eliminated: usize,


loads_eliminated doesn't seem to be used (see previous comment).

elle-j · 2026-05-15T15:28:52Z

+                    let is_fmp_store =
+                        region == MemoryRegion::FreePointerSlot || resolved_offset == Some(0x40);


Why isn't just a resolved_offset == Some(0x40) check sufficient on its own? If the additional MemoryRegion::FreePointerSlot check is necessary, perhaps extracting the whole region == MemoryRegion::FreePointerSlot || resolved_offset == Some(0x40) into e.g. an is_fmp_slot() would be beneficial for maintenance and make it clearer as to why this check includes the || (the check is used both for is_fmp_store and is_fmp_load).

elle-j · 2026-05-15T15:34:14Z

+                if *region == MemoryRegion::Unknown && !pattern.is_aligned() {
+                    if let Some(addr) = self.extract_static_offset(offset) {
+                        if addr % 32 != 0 {
+                            self.tainted_regions.insert(addr / 32 * 32);


The addr / 32 * 32 is used very frequently in this file (and a few times in mem_opt.rs as well), should we introduce a word address helper?

Fix various soundness issues at the cost of ~1% code size growth. Signed-off-by: xermicus <bigcyrill@hotmail.com>

Co-authored-by: xermicus <bigcyrill@hotmail.com>

Resolves conflicts after LLVM 21 → 22 upgrade and inkwell 0.8 → 0.9 landed on main: * Cargo.toml: keep bumped versions from main; preserve revive-newyork. * docs/: regenerated with mdbook against cl/newyork sources. * function/mod.rs: keep ours' NoFree+NoUnwind PVM target attributes; keep main's is_middle_end_enabled gate around NoInline/OptimizeNone. * context/mod.rs: drop narrow_divrem_instructions and strip_minsize_for_divrem — both were LLVM 21 backend workarounds removed by main in #501 once the upstream patch landed; LLVM 22 bundles the fix. * heap.rs: port inkwell 0.8 → 0.9 API breaks (custom_width_int_type now takes NonZeroU32 and returns Result). * codesize.json: kept ours (LLVM 22 bumps integration sizes ~5–60%; update intentionally in a follow-up). Verified: RESOLC_USE_NEWYORK=1 make test green (workspace + book). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a per-operation reference for the newyork IR to the compiler book for easily being able to view printed syntax, operand and result types, examples, etc. for any operation. Using easily readable syntax notation. The new page is partly inspired by [MLIR's docs](https://mlir.llvm.org/docs/Dialects/SCFDialect/). <details> <summary>Example entry</summary> ### `and` (`Expression::Binary` with `BinaryOperation::And`) #### Description Bitwise AND. The common idiom for type narrowing: a constant mask on the right lets forward analysis pick up a tight result width. #### Syntax ```text and($lhs[: <type>], $rhs[: <type>]) ``` #### Example ```text let v2 := and(v0, v1) let v3 := and(v0, 0xff) // type inference narrows result to i8 ``` #### Operands | Name | Type | Notes | |---|---|---| | `lhs` | `i256` | — | | `rhs` | `i256` | — | #### Result and purity | Result | Purity | |---|---| | `min(width(lhs), width(rhs))` — AND can only clear bits, so the result fits in the narrower operand | Pure | #### Annotations None. </details>

Signed-off-by: xermicus <bigcyrill@hotmail.com>

* Drop the -O0 `optnone`/`noinline` gate from `set_default_attributes` (main #502). It pinned both attributes on every function when the middle end is off; with newyork-emitted IR (many small helpers expecting to be inlined) the backend's fast-isel produced PVM basic blocks over the 1000-instruction `BasicBlockTooLarge` limit. #502 itself reported the change as NFC on main, so the safeguard was inert there but actively harmful here. * Restore `narrow_divrem_instructions` after the LLVM pipeline. `strip_minsize_for_divrem` stays dropped — LLVM 22 carries the patch that made it unnecessary — but the narrowing safety net still helps i256 div/rem expand into compact backend code. * In `LlvmCodegen::set_inline_attributes`, early-return when the middle end is disabled and route the attribute write through `PolkaVMFunction::set_attributes(force = true)`. The raw `add_attribute` call stacked `alwaysinline` on top of the forced `noinline`/`optnone`, which the LLVM IR verifier rejected during library post-link compilation. `RESOLC_USE_NEWYORK=1 retester.sh`: 34888 passed, 6 failed, 144 ignored. The six failures are `blockhash(0xFFFF…)` at S- across all Y M{0,1,2,3,s,z} — a pre-existing newyork edge case masked at S+ by solc's optimizer folding the call. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pre-existing newyork edge case: `blockhash(0xFFFF…)` reverts instead of returning 0. solc's optimizer constant-folds the call to 0 at S+, so the failure is only visible across Y M{0,1,2,3,s,z} at S-. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Signed-off-by: xermicus <bigcyrill@hotmail.com>

elle-j · 2026-05-22T14:14:38Z

+    if std::env::var(NEWYORK_DUMP_IR_ENV).is_ok() {
+        use std::io::Write;
+        let dump_path = format!("/tmp/newyork_ir_{}.txt", ir_object.name.replace('/', "_"));


IR dumps for NEWYORK_DUMP_IR seem to be hardcoded to "/tmp/newyork_ir_{}.txt". Could we use e.g. our current --debug-output-dir instead in order to make this configurable? (Similarly for RESOLC_DEBUG_*.)

elle-j · 2026-05-22T14:18:07Z

Currently, the IR dumps don't show the inferred widths at e.g. let bindings, operands, if-statement conditions, etc. even after the narrow pass has run (RESOLC_DEBUG_IR=1 dumps post-narrow). Some examples from compiling Flipper.sol:

let v19 := 0xffffffffffffffff let v31 := 0x20 let v34 := slt(v7, v31) if v34 { ... } let v39 := iszero(v38) let v49 := 0xff let v50 := and(v39, v49)

I think it'd be more useful (and less ambiguous) if it showed the types in all cases, so it's easier to reason about if a type has been narrowed from looking at the IR. Something like:

let v19: i64 := 0xffffffffffffffff let v31: i8 := 0x20 let v34: i1 := slt(v7: i64, v31: i8) if v34: i1 { ... } let v39: i1 := iszero(v38: i256) let v49: i8 := 0xff let v50: i64 := and(v39: i1, v49: i8) /* v50 may be i64 instead of i1 due to using a demanded width */

elle-j · 2026-05-22T14:19:43Z

Function calls in subobjects are printed as func_<id> rather than <func_name>. From compiling ERC20.sol:

let v141 := func_0(v140, v681) // should be abi_encode_string let v154 := func_1(v689: i8) // should be abi_decode_address let v248 := func_3(v247, v221) // should be checked_sub_uint256 let v270 := func_4(v269, v221) // should be checked_add_uint256

But sometimes there is a name printed, e.g. extract_byte_array_length() (in the dump after setting NEWYORK_DUMP_IR) which is the name of a function in the parent object, so there seems to be a FunctionId looked up from the wrong map (or populated incorrectly). Worth also looking into whether this bug goes beyond the printer.

elle-j · 2026-05-22T14:20:48Z

+//! valid Yul but is designed to be human-readable and clearly show the SSA
+//! structure with explicit value IDs.
+//!
+//! # Example Output


The doc comment's example output doesn't correspond to the actual current output (e.g. it uses = instead of :=).

elle-j · 2026-05-22T14:22:13Z

+            Statement::For {
+                initial_values,
+                loop_variables,
+                condition_statements,
+                condition,
+                body,
+                post,
+                outputs,
+                ..


The post_input_variables is dropped (..) here. (Also, the for-loop indentation is a little broken.)

elle-j · 2026-05-22T14:23:27Z

+            } => {
+                self.write_indent();
+                let arg_strs: Vec<String> =
+                    arguments.iter().map(|a| format!("v{}", a.id.0)).collect();


This should probably use the existing write_value()/write_value_id().

xermicus and others added 30 commits February 2, 2026 21:28

first draft by claude, untestet

fcaeef2

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it did something

ae2f532

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

f7ae880

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

d42ba52

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

874aebb

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it struggles hard

da717f1

Signed-off-by: xermicus <bigcyrill@hotmail.com>

its so confused not even /clear helps anymore

663baed

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it contemplates its miserable existence

3bfe43a

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it did something again

6ae1d79

Signed-off-by: xermicus <bigcyrill@hotmail.com>

the plan lol

c2f3f39

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it finally achieved something

890c243

Signed-off-by: xermicus <bigcyrill@hotmail.com>

update

2161c76

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it ensured tests pass

1b9db52

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it did something

7fe1fac

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

b455de5

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it implemented some more opts

596aa54

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

73b7dcc

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed something

7655d9e

Signed-off-by: xermicus <bigcyrill@hotmail.com>

update plan

c5bffda

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it thought 2 hourse for this lol

80b2826

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it did stuff again

255892f

Signed-off-by: xermicus <bigcyrill@hotmail.com>

it fixed everything

7433d56

Signed-off-by: xermicus <bigcyrill@hotmail.com>

xermicus and others added 18 commits May 7, 2026 19:14

update book

367f7a2

Signed-off-by: xermicus <bigcyrill@hotmail.com>

code review

7e03477

Signed-off-by: xermicus <bigcyrill@hotmail.com>

bugfix

cd557b9

Signed-off-by: xermicus <bigcyrill@hotmail.com>

review

b69aec0

Signed-off-by: xermicus <bigcyrill@hotmail.com>

code review

ae8c0f4

Signed-off-by: xermicus <bigcyrill@hotmail.com>

bugfix

1e5bae1

Signed-off-by: xermicus <bigcyrill@hotmail.com>

review

18e39b5

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

review

82bd1a3

Signed-off-by: xermicus <bigcyrill@hotmail.com>

remove more garbage AI slop

1961b70

Signed-off-by: xermicus <bigcyrill@hotmail.com>

remove more AI slop garbage

bf8a590

Signed-off-by: xermicus <bigcyrill@hotmail.com>

fix more AI slop

d94a84c

Signed-off-by: xermicus <bigcyrill@hotmail.com>

remove a lot of AI garbage

3cd9810

Signed-off-by: xermicus <bigcyrill@hotmail.com>

add env vars to book

2d71a4e

Signed-off-by: xermicus <bigcyrill@hotmail.com>

elle-j reviewed May 13, 2026

View reviewed changes

Improve the inliner (#3)

6ea5672

Late inline pass on top of the existing optimizer + cost-model retune + per-helper memory(...) attribute precision so GVN can fold across heap mstores. −8.95% on the OZ corpus vs the baseline. Co-authored-by: xermicus <bigcyrill@hotmail.com>

elle-j reviewed May 15, 2026

View reviewed changes

neo and others added 8 commits May 20, 2026 20:57

add failing test cases (#4)

72513bd

Fix various soundness issues at the cost of ~1% code size growth. Signed-off-by: xermicus <bigcyrill@hotmail.com>

cl/soundness (#5)

f6e64c3

Co-authored-by: xermicus <bigcyrill@hotmail.com>

remove deprecated target vm feature flag

7e49770

Signed-off-by: xermicus <bigcyrill@hotmail.com>

clippy

c7e54b8

Signed-off-by: xermicus <bigcyrill@hotmail.com>

elle-j reviewed May 22, 2026

View reviewed changes

	Some(crate::InlineDecision::CostBenefit) => {
	let force_noinline = function.call_count >= 3
	\|\| function.size_estimate >= LARGE_FUNCTION_NOINLINE_THRESHOLD;
	if force_noinline {
	Some(revive_llvm_context::PolkaVMAttribute::NoInline)
	} else {
	None
	}
	}

		let is_fmp_store =
		region == MemoryRegion::FreePointerSlot \|\| resolved_offset == Some(0x40);

Conversation

xermicus commented May 13, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elle-j May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elle-j May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

elle-j May 15, 2026 •

edited

Loading

elle-j May 15, 2026 •

edited

Loading