Symbolic execution for HEVM; `hevm symbolic` and `hevm equivalence` #353
@MrChico What's
I'd be interested in trying something like that

From what I understand, when using It seems appropriate for memory and storage, where we care mainly about the values at specific keys, rather than the structure as a whole.
Incidentally also solves #273
I now consider this effort feature complete for a first release. I'll be adding some more tests and docs (check out the readme and please let me know what's unclear!) as well as writing a blog post over the next couple of days, but that shouldn't block review.

I'd be happy for experiments all over, but in particular the equivalence checking and the symbolic execution against remote state (using rpc) would benefit from some extra hands trying things out.

Concrete execution and all old cli endpoints should retain their current functionality (modulo some minor improvements), so please lmk if you find any regressions here. That said, many library functions have changed type signature, so this is a very breaking release.

Finally, I suggest we do not squash these commits when merging, as that would make the git history fairly opaque.
Squash me forcelit in filtering
Co-authored-by: David Terry <xwvvvvwx@users.noreply.github.com>
assert violations; better pretty printing counterexamples
This PR extends `hevm` with symbolic execution capabilities, allowing it to be used as a proving engine for formal verification, as well as admitting more intelligent fuzzing strategies. It started out as a WIP draft, shared to get some early feedback and discuss some of my findings so far. It is now ready for review!
On a high level, here's what's going on:
Most of the magic here comes from the SMT-based verification library sbv, which provides symbolic variants of common base types like booleans, words and integers. These variants are strict generalizations of their concrete counterparts, and functions over them will just evaluate normally when given concrete arguments. Using these we can generalize various parts of the EVM:
- `stack` is changed from a list of words, `[Word 256]`, to a list of symbolic words, `[SWord 256]`, and stack operations are adapted accordingly.
- `memory`, `returndata` and `calldata` are changed from `ByteString` to `[SWord 8]`. This is a (concrete) list of symbolic values, not a symbolic list, which means that while reads and writes can refer to symbolic values, they must always happen at concrete locations. This restriction means that we are unable to deal with dynamic data types for the moment. More on this later.

The main part of symbolic execution is handled by the function `symExec`, which executes the EVM with symbolic values while keeping track of an accumulated list of constraints (path conditions), until reaching a JUMPI opcode. At this point, it invokes z3 to check which branches are possible, and recurses with the path condition updated accordingly until reaching a list of possible final VM states. The resulting VM states can be checked against a post condition.

EDIT (04/07/2020):
Symbolic execution is now done with the "operational monad pattern" as seen in `TTY`, `VMTests`, `Dev`, etc. Similarly to how instances of `SLOAD` or `CALL` cause the execution to end with a `Query` error, `JUMPI` introduces a new type of query (`PleaseAskSMT`), which will need to be performed in an `IO` context. The actual SMT query is performed in a new fetcher, which is used in `TTY.interpret` (which, as before, only keeps track of one VM at a time) and the new `SymExec.interpret`, which is essentially a function `VM -> [VM]`, but crucially does the necessary lifting in case we encounter any blocking queries.

Some example proofs are given in test.hs.
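The branching scheme described above (collect a path condition, ask a solver at each conditional jump, recurse into the feasible branches) can be pictured with a toy model. This is only a sketch under stated assumptions: `Cond`, `Prog`, `Final`, `satisfiable` and `explore` are hypothetical names that do not appear in hevm, the "program" has a single unknown 8-bit input, and a brute-force search stands in for the z3 call.

```haskell
import Data.Word (Word8)

-- A symbolic condition over one unknown input x (toy stand-in for an SMT term).
data Cond = Lt Word8 | Not Cond   -- `Lt n` means "x < n"
  deriving (Eq, Show)

-- A toy program: either halt with a result or branch on a condition (like JUMPI).
data Prog = Halt String | JumpI Cond Prog Prog

-- A final state: the path condition collected on the way, plus the result.
data Final = Final { pathCond :: [Cond], result :: String }
  deriving (Eq, Show)

evalCond :: Word8 -> Cond -> Bool
evalCond x (Lt n)  = x < n
evalCond x (Not c) = not (evalCond x c)

-- Hypothetical stand-in for the solver query: try all 256 possible inputs.
satisfiable :: [Cond] -> Bool
satisfiable cs = any (\x -> all (evalCond x) cs) [minBound .. maxBound]

-- Explore every feasible branch, extending the path condition at each JumpI
-- and pruning branches whose path condition has no satisfying input.
explore :: [Cond] -> Prog -> [Final]
explore pc (Halt r) = [Final (reverse pc) r]
explore pc (JumpI c t f) =
  concat [ explore pc' p
         | (pc', p) <- [(c : pc, t), (Not c : pc, f)]
         , satisfiable pc' ]

-- The inner `Lt 3` branch contradicts `Not (Lt 10)` and is never explored.
example :: Prog
example =
  JumpI (Lt 10)
    (JumpI (Lt 5) (Halt "small") (Halt "medium"))
    (JumpI (Lt 3) (Halt "unreachable") (Halt "large"))
```

Evaluating `map result (explore [] example)` yields the three reachable results and leaves out `"unreachable"`. A post condition check then amounts to filtering the returned `Final` states with a predicate.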
While defining arbitrary post conditions is probably better handled with something like act, this PR provides an easy way to check for asserts (executions ending with an `INVALID` opcode (`0xfe`)) using the CLI endpoint `hevm symbolic`.

Example usage:
Obvious TODOs:

- Allow `storage` to be symbolic as well.
- Ensure that the conversion did not mess up any of the general state tests. The semantics for concrete execution should remain unchanged after this.
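The expectation that concrete semantics stay unchanged rests on the symbolic types generalizing the concrete ones. As a rough illustration, and not sbv's or hevm's actual representation, the toy type below (`SW`, `addSW` and `stepAdd` are hypothetical names) folds to a plain literal whenever both operands are concrete, and keeps the stack as a concrete list whose elements may be symbolic.

```haskell
-- Toy symbolic word: a known literal, a named unknown, or an unevaluated sum.
data SW = Lit Integer | Var String | Add SW SW
  deriving (Eq, Show)

-- EVM words are 256 bits wide, so concrete results wrap at 2^256.
wordModulus :: Integer
wordModulus = 2 ^ (256 :: Int)

-- Addition evaluates normally on concrete arguments and stays symbolic otherwise.
addSW :: SW -> SW -> SW
addSW (Lit a) (Lit b) = Lit ((a + b) `mod` wordModulus)
addSW a b             = Add a b

-- An ADD step on a stack that is a concrete list of (possibly symbolic) words.
stepAdd :: [SW] -> Maybe [SW]
stepAdd (a : b : rest) = Just (addSW a b : rest)
stepAdd _              = Nothing  -- stack underflow
```

For example, `stepAdd [Lit 2, Lit 3]` gives `Just [Lit 5]`, while `stepAdd [Var "x", Lit 3]` gives `Just [Add (Var "x") (Lit 3)]`.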
More interesting TODOs:
- `z3` to be good at dealing with them. As a result, it hangs on some simple proofs like this tautological SafeAdd proof. This has also been noted as a problem for @leonardoalt, who recommended using unbounded integers instead. This would require adapting all arithmetic opcodes to operate modulo 2^256 instead. But there also might be other solvers than `z3` that are better at dealing with large bitvectors.
- `SArray`, which translates directly to an smt `array`, and `SFunArray`, which is "implemented internally, without translating to SMT-Lib functions". I'm guessing the latter option is preferable.
- `Show` instance for symbolic values is quite boring as it just displays `<symbolic>` for symbolic variables. For a more insightful `hevm --debug` experience, we might want to accompany symbolic words with a pretty-print string, at least when function ABIs are available. Opcodes would need to be adapted to update the pretty-print string as well as updating the values.
- `VM -> VM` functions that can only be applied under certain conditions. These "trusted specs" can themselves be proven by symbolic execution and coinduction.

This type of symbolic execution would probably be able to generate a lot more interesting test cases for tools like Echidna. Do you guys (@japesinator, @incertia) have any specific requests for functions that would allow you to incorporate this into your workflow?
What are the effects on performance for concrete execution? Hopefully it does not change drastically, but I think a slight performance decrease would still be acceptable.
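To make the unbounded-integer suggestion from the TODO list concrete: instead of 256-bit bitvectors, each arithmetic opcode would work on integers and reduce the result modulo 2^256. The sketch below is an assumption about what that could look like, written with plain Haskell `Integer` and no solver. `evmAdd`, `safeAdd` and `safeAddCorrect` are hypothetical names, and this `safeAdd` is a generic overflow-checked addition, not necessarily the contract from the linked proof.

```haskell
-- Word size of the EVM expressed as a modulus on unbounded integers.
modulus :: Integer
modulus = 2 ^ (256 :: Int)

maxWord :: Integer
maxWord = modulus - 1

-- ADD as it would look on unbounded integers: add, then reduce modulo 2^256.
evmAdd :: Integer -> Integer -> Integer
evmAdd a b = (a + b) `mod` modulus

-- Overflow-checked addition: Nothing models a revert when the sum wrapped around.
safeAdd :: Integer -> Integer -> Maybe Integer
safeAdd a b =
  let c = evmAdd a b
  in if c >= a then Just c else Nothing

-- The property a prover would be asked to establish for all a, b in [0, 2^256):
-- success returns the true sum, and failure happens exactly on overflow.
safeAddCorrect :: Integer -> Integer -> Bool
safeAddCorrect a b = case safeAdd a b of
  Just c  -> c == a + b
  Nothing -> a + b >= modulus
```

Here the property is only spot-checked on chosen inputs such as `safeAddCorrect maxWord 1`; proving it for all inputs is the job of the solver.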