bugc: Add calldata, returndata, code, and transient storage to EVM codegen#181
bugc: Add calldata, returndata, code, and transient storage to EVM codegen#181
Conversation
…codegen The read/write instruction handlers in storage.ts previously only supported storage and memory locations. This adds full support for: - calldata reads via CALLDATALOAD (with shift+mask for partial reads) - returndata reads via RETURNDATACOPY to scratch memory + MLOAD - code reads via CODECOPY to scratch memory + MLOAD - transient storage reads via TLOAD - transient storage writes via TSTORE Includes tests covering all new read/write paths.
Self-reviewChanges:
Design decisions:
Test coverage: 6 new tests verify mnemonic sequences for each path. |
|
gnidan
left a comment
There was a problem hiding this comment.
Reviewed from the pointers/format perspective. The overall refactoring is clean — extracting generateStorageRead, generateCalldataRead, generateCopyBasedRead, and generateStorageWrite into separate functions improves readability a lot.
Two issues I noticed:
1. Returndata/code partial reads are missing shift+mask (potential bug)
generateCopyBasedRead copies length bytes to scratch at 0x60, then does MLOAD(0x60). When length < 32, the copied bytes are left-aligned in the 32-byte word (bytes at 0x60..0x60+length, rest is zero). But there's no shift+mask step to right-align the result.
Compare with calldata, which correctly shifts right by (32 - length) * 8 for partial reads, and storage, which shifts by (32 - offset - length) * 8. Returndata/code would return a left-aligned value while everything else returns right-aligned.
If the compiler never generates partial reads for these locations today, it's not a live bug, but it's a latent one. At minimum worth a comment; ideally add the same shift+mask treatment.
2. Transient storage reads don't handle offset/length
The IR Read interface allows offset and length for any location, but generateRead for transient just does TLOAD without partial-slot extraction. Storage reads handle this correctly via generateStorageRead with the shift+mask path. A partial transient read would silently return the full slot value.
Same suggestion: at minimum document the limitation, or reuse the storage partial-read logic (TLOAD is semantically identical to SLOAD for extraction purposes).
Everything else looks good. The scratch memory approach at 0x60 is sound, zeroing before copy handles partial-length padding correctly, and the CALLDATALOAD shift math is right. Tests cover the happy paths well.
Copy-based reads (returndata, code) returned left-aligned values from MLOAD without right-aligning for partial reads (length < 32). Add shift+mask path matching calldata's partial read handling. Transient storage reads used TLOAD without offset/length handling. Add the same partial-slot extraction as regular storage reads.
gnidan
left a comment
There was a problem hiding this comment.
Both issues addressed correctly:
-
generateCopyBasedReadnow has separate full/partial paths — partial reads shift right by(32 - length) * 8after MLOAD, matching calldata's right-alignment convention. -
generateTransientReadmirrorsgenerateStorageReadwith(32 - offset - length) * 8shift for partial slot extraction.
Shift math checks out, test coverage looks good. LGTM.
Summary
generateReadinstorage.tsto handle calldata, returndata, code, and transient storage locations (previously only storage and memory were supported)generateWritesupport for transient storage via TSTORE