perf(rv64): split base_alu into add_sub and bitwiselogic chips #2883
Merged
Code review: No issues found. Checked for bugs and CLAUDE.md compliance.
316c914 to
0b8d705
Re-do of PR #2777 (base_alu part only) on top of the u16 memory-bus limbs change.

The 64-bit BaseAlu chip is split into:
- add_sub: ADD/SUB with carry constraints, send_xor(a,a,0) result range checks, and the paired send_range(b,c) read-byte bounds required now that the memory bus only checks packed u16 values
- xor_or_and: XOR/OR/AND via the bitwise lookup, which already bounds the read bytes

The BaseAlu core (cols/AIR/executor/filler) is kept since the bigint INT256 extension still uses it; its rv64-specific execution/cuda/tests move to the new chips. base_alu_w is left untouched for now.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
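To illustrate why the paired read-byte bounds matter once the bus only checks packed u16 values, here is a minimal sketch. It is not code from this PR: `pack`, `bus_check`, and `chip_check` are hypothetical names, and plain `u32` values stand in for field elements. It assumes two byte cells are packed as `lo + 256 * hi`.

```rust
/// Packs two cells that are supposed to be bytes into one 16-bit limb.
/// (Assumed packing: lo + 256 * hi.)
fn pack(lo: u32, hi: u32) -> u32 {
    lo + 256 * hi
}

/// True if the value fits in 16 bits.
fn is_u16(x: u32) -> bool {
    x < (1 << 16)
}

/// True if the value fits in 8 bits.
fn is_byte(x: u32) -> bool {
    x < 256
}

/// Hypothetical stand-in for a bus that only bounds the packed limb.
fn bus_check(lo: u32, hi: u32) -> bool {
    is_u16(pack(lo, hi))
}

/// Hypothetical stand-in for a chip that additionally range checks
/// both cells as bytes, in the spirit of a paired range check.
fn chip_check(lo: u32, hi: u32) -> bool {
    bus_check(lo, hi) && is_byte(lo) && is_byte(hi)
}
```

In this sketch the pair `(300, 0)` packs to the same limb as `(44, 1)`, so a packed-only check accepts it even though 300 is not a byte; the extra per-cell check rejects it.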
Builds on `perf/split-base-alu-u16`, which already split the RV64 `base_alu` chip into separate `add_sub` (ADD, SUB) and `bitwise_logic` (XOR, OR, AND) chips with u16 columns. This PR extends that split to the remaining consumers of the old combined ALU core, namely the `base_alu_w` (ADDW/SUBW) chip and the 256-bit bigint (INT256) extension, and then removes the now-orphaned `base_alu` core.

- **`base_alu_w` (ADDW/SUBW) reuses the `add_sub` core** (`561463700`)
  - Adds a u16 ALU-W adapter (`adapters/alu_w_u16.rs` + `alu_w_u16.cuh`).
  - `base_alu_w` now reuses `add_sub/core.rs` instead of the full ALU core; adds `base_alu_w/preflight.rs` and updates its execution / cuda / tests.
- **bigint (INT256): split `base_alu` → `add_sub` + `bitwise_logic`** (`41ab11af6`)
  - `base_alu.rs` → `bitwise_logic.rs`, plus a new `add_sub.rs`.
  - The extension now registers `Rv64AddSub256` and `Rv64BitwiseLogic256` separately (`AddSubCoreAir` / `BitwiseLogicCoreAir`), with matching CUDA kernels, ABI, and tests.
- **Delete the dead `base_alu` core** (`2f0137386`), now that both `base_alu_w` and bigint have migrated off it.

Nothing depends on the old combined `base_alu` core anymore: ADD/SUB and bitwise ops are proven by separate, narrower chips across RV64, RV64-W, and INT256, and the orphaned core is gone.

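As a rough illustration of how one add core can serve RV64, RV64-W, and INT256, the sketch below makes the limb count a const generic. It is not the code in `add_sub/core.rs`: `add_limbs`, `addw`, and `add64` are hypothetical names, and the limb counts (2, 4, and 16 little-endian u16 limbs) are assumptions about the layout. The ADDW behaviour (add the low 32 bits, then sign-extend) follows the RISC-V definition.

```rust
/// Wrapping addition over N little-endian u16 limbs with carry propagation.
fn add_limbs<const N: usize>(b: [u16; N], c: [u16; N]) -> [u16; N] {
    let mut a = [0u16; N];
    let mut carry = 0u32;
    for i in 0..N {
        let sum = b[i] as u32 + c[i] as u32 + carry;
        a[i] = sum as u16; // keep the low 16 bits
        carry = sum >> 16; // 0 or 1
    }
    a
}

/// ADDW sketch: add the low 32 bits (two u16 limbs), then sign-extend to 64 bits.
fn addw(rs1: u64, rs2: u64) -> u64 {
    let low = |x: u64| [x as u16, (x >> 16) as u16];
    let a = add_limbs::<2>(low(rs1), low(rs2));
    let word = (a[0] as u32) | ((a[1] as u32) << 16);
    word as i32 as i64 as u64
}

/// 64-bit ADD sketch: the same helper with four u16 limbs.
fn add64(rs1: u64, rs2: u64) -> u64 {
    let limbs = |x: u64| -> [u16; 4] { core::array::from_fn(|i| (x >> (16 * i)) as u16) };
    let a = add_limbs::<4>(limbs(rs1), limbs(rs2));
    // Reassemble from the most significant limb down.
    a.iter().rev().fold(0u64, |acc, &l| (acc << 16) | l as u64)
}
```

Under the same assumption, a 256-bit addition would call `add_limbs::<16>`; only the limb count changes, which is the kind of reuse the commits above describe.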
Closes INT-8379
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
917f93a to
c056190
1ad8909 to
3fe8036
shuklaayush approved these changes on Jun 19, 2026
Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights. Commit: ad9a0f8
shuklaayush added a commit that referenced this pull request on Jun 19, 2026

Re-do of PR #2777 (base_alu part only), now on top of the u16 memory-bus limbs change.

Summary of the changes:
- Split the base_alu chip into add_sub and xor_or_and chips.
- The new xor_or_and chip is the old base_alu minus ADD/SUB.
- The new add_sub chip handles the ADD and SUB opcodes and stores 2 bytes per field element in its columns.
- This removes the interactions that the previous base_alu chip needed to range check that each individual field element is a byte.
- Core width of the add_sub chip drops to 14 columns, compared to the 29 columns of the base_alu chip.
- Rewrite tests.rs of the add_sub chip for the new u16 column layout.

Improves perf by 6% on the reth benchmark: https://github.com/axiom-crypto/openvm-eth/actions/runs/27436476879

Closes INT-8102

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Ayush Shukla <ayush@axiom.xyz>
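The carry relation on u16 columns can be sketched as follows. This is an illustration under stated assumptions rather than the chip's AIR: `add_u16_limbs`, `carries_are_valid`, `to_limbs`, and `from_limbs` are hypothetical names, a 64-bit value is assumed to occupy four little-endian u16 limbs, and integers stand in for field elements. The relation checked per limb is `b[i] + c[i] + carry[i-1] - a[i] == carry[i] * 2^16` with `carry[i]` in `{0, 1}`.

```rust
/// Assumed number of u16 limbs per 64-bit value.
const LIMBS: usize = 4;

/// Splits a u64 into little-endian u16 limbs.
fn to_limbs(x: u64) -> [u16; LIMBS] {
    core::array::from_fn(|i| (x >> (16 * i)) as u16)
}

/// Reassembles a u64 from little-endian u16 limbs.
fn from_limbs(l: [u16; LIMBS]) -> u64 {
    l.iter().rev().fold(0u64, |acc, &v| (acc << 16) | v as u64)
}

/// Computes the wrapped sum limbs and the carry out of each limb.
fn add_u16_limbs(b: [u16; LIMBS], c: [u16; LIMBS]) -> ([u16; LIMBS], [u32; LIMBS]) {
    let mut a = [0u16; LIMBS];
    let mut carry = [0u32; LIMBS];
    let mut carry_in = 0u32;
    for i in 0..LIMBS {
        let sum = b[i] as u32 + c[i] as u32 + carry_in;
        a[i] = sum as u16;
        carry[i] = sum >> 16;
        carry_in = carry[i];
    }
    (a, carry)
}

/// Checks the per-limb carry relation against a claimed result and carries.
fn carries_are_valid(
    a: [u16; LIMBS],
    b: [u16; LIMBS],
    c: [u16; LIMBS],
    carry: [u32; LIMBS],
) -> bool {
    let mut carry_in = 0i64;
    for i in 0..LIMBS {
        let lhs = b[i] as i64 + c[i] as i64 + carry_in - a[i] as i64;
        // Each carry must be boolean and must account for exactly 2^16.
        if carry[i] > 1 || lhs != (carry[i] as i64) << 16 {
            return false;
        }
        carry_in = carry[i] as i64;
    }
    true
}
```

Mathematically, `a = b - c` (mod 2^64) holds exactly when `b = a + c` (mod 2^64), so the same check could cover SUB with the roles of `a` and `b` swapped; whether the chip does it this way is not stated in this PR.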