New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: refactor interpreter internals (take 2) #582
Conversation
b92f2c3
to
967ac6c
Compare
967ac6c
to
4f6452a
Compare
1e8227d
to
c49fe6d
Compare
eddc2af
to
b3440b1
Compare
b3440b1
to
162a57b
Compare
Marking as draft until the other PRs are merged, so I can then rebase on top of main. The changes in PR will only be in |
f3f046d
to
924e959
Compare
The compiler generates much more favorable assembly if all the functions are casted to a `fn(_, _)` pointer before calling them. See <bluealloy#310 (comment)> for more information.
924e959
to
2510ae9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added descriptions of changes. Other minor (non functional) changes include:
- pub -> pub(super) to denote we are not exporting these instructions individually
- remove last argument of
as_usize_or_fail!
macro calls when it'sInvalidOperandOOG
, since it's the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- changed stack methods return type from
Option<IR>
toResult<(), IR>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- use U256 and
get_unchecked
in calldataload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- dedup code in initializer functions
- changed return range to (return length, return offset). Makes parsing the code easier. Before we were MAX..MAX to denote a length of zero, when we can just set the length and offset directly
- added
return_value_slice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Display
should not be gated tostd
- avoid using manual implementations (like swap and pop) where possible by deferring to the standard library. This produces the same instructions while being safer
- changed stack methods return type from
Option<IR>
toResult<(), IR>
. This is clearer sinceSome
would be the error - dedup some code by reusing the other functions (like
pop_top
->pop + top
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- changed runtime panics to be
unreachable_unchecked
in release mode since they should never happen. This helps performance quite a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- factored out the set up in common for return and revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- cmp opcodes are now branchless
byte
indexes into the slice instead of performing shiftssar
usesU256::MAX
instead of computing it at runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- avoid reassigning to 0 when already 0 in div and rem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to eval
to opcode.rs
, and the return instructions to control
}; | ||
use core::fmt; | ||
|
||
macro_rules! opcodes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This macro declares all of:
const NAME: u8 = $value;
OPCODE_JUMPMAP[$value] = Some(stringify!($name));
eval
dispatch
Left out gas, which is improved below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was wondering if gas
should be part of the opcode, but maybe not? it'd be nice if they were all declared in one place, but ig fine as-is?
// When adding new opcodes: | ||
// 1. add the opcode to the list below; make sure it's sorted by opcode value | ||
// 2. add its gas info in the `opcode_gas_info` function below | ||
// 3. implement the opcode in the corresponding module; | ||
// the function signature must be the exact same as the others |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added instructions on how to add new opcodes with the new macro.
write!(f, "UNKNOWN(0x{n:02X})") | ||
} | ||
} | ||
} | ||
|
||
impl OpCode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deprecated:
try_from_u8
->new
u8
->get
This is the same naming scheme as NonZero*
std numbers.
/* 0xff SELFDESTRUCT */ OpInfo::gas_block_end(0), | ||
]; | ||
}; | ||
const fn opcode_gas_info(opcode: u8, spec: SpecId) -> OpInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved gas declarations from macros to const fns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something you'd want to move on the main OPcode macro? i.e make it point to a statement that returns OpInfo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered doing this but each match arm is pretty substantial for each opcode and would clutter the main invocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love the explanations on each change. very helpful.
generally supportive - this seems hard to break down further (except the visibility changes I guess). i like the new opcode declaration method. seems like const fns are paying off.
the changes outside of opcode.rs seem non-controversial/pure improvements, but defer to Dragan for all that
/* 0xff SELFDESTRUCT */ OpInfo::gas_block_end(0), | ||
]; | ||
}; | ||
const fn opcode_gas_info(opcode: u8, spec: SpecId) -> OpInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something you'd want to move on the main OPcode macro? i.e make it point to a statement that returns OpInfo
?
}; | ||
use core::fmt; | ||
|
||
macro_rules! opcodes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was wondering if gas
should be part of the opcode, but maybe not? it'd be nice if they were all declared in one place, but ig fine as-is?
// for previous efforts on optimizing this function. | ||
let f: Instruction = match opcode { | ||
$($name => $f as Instruction,)* | ||
_ => control::not_found as Instruction, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot! casting to Instruction
allows us to be sure that every fn
has the same signature. @DaniPopes do you think this is the reason for the performance boost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it's a bit more than half of the total perf improvement of this pr
@DaniPopes added mostly nits, moved all instructions to be public as that can be useful, and added some small changes. very nice PR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test is failing for an unrelated reason.
Split off from (and supersedes) #558, see that PR for benches and list of changes.
Best reviewed on a commit-by-commit basis.
Results
Cachegrind