Compile-time overflow checks #2274
For developers, it is much more helpful to be notified about potential arithmetic overflows at compile time rather than only at runtime. Furthermore, such checks might even be easier to implement than runtime checks, and they will not be a breaking change. It remains to be seen how many false positives we will have.
A note on this: the warning issued for potential overflow takes a very conservative approach. Anything that could overflow should issue a warning. This is to encourage contract developers to use
I gave some thought to how conditions could be implemented. One way would be to use an SMT solver, but most of them are based on bitvectors, so I expect performance to be quite poor. There are "smallish" libraries like this one: https://github.com/stp/stp
Another approach would be to do this "manually", and it could work in the following way:
As we also currently have it, there would be one "Value Range" object for each expression in the AST and also for each variable at a certain context (i.e. it is updated while we walk the AST). A more general term for "Value Range" could also just be a vector of "Constraints", where a constraint for
The algorithm to check for overflows in a given function
2.1 Compute constraints for integer-typed expressions by starting from the atoms (variables and literals) and combining the constraints. Example: If we have
2.2 For statements of the form
If in 2.1 we computed
This of course is applied recursively until we can improve the constraints on a variable. If in this process, we e.g. arrive at an expression of the form
Since we know that
So we can add the constraint
Here, we assume that the addition did not overflow in the first place. Because of that, we should probably use an overflow warning of the form "This arithmetic operation might overflow. Further analysis of the code will assume that it cannot overflow."
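To make the "warn, then assume no overflow" rule concrete, here is a minimal sketch in Python. It is not the compiler's implementation: the names `ValueRange`, `add_ranges` and `UINT256_MAX` are hypothetical, and it assumes unsigned 256-bit operands whose ranges are plain inclusive intervals.

```python
from dataclasses import dataclass

UINT256_MAX = 2**256 - 1  # assumption: operands are unsigned 256-bit integers


@dataclass(frozen=True)
class ValueRange:
    """Inclusive bounds on the mathematical value of an expression."""
    lo: int
    hi: int


def add_ranges(a, b, warnings, type_max=UINT256_MAX):
    """Combine the ranges of the two operands of an addition."""
    lo, hi = a.lo + b.lo, a.hi + b.hi
    if hi > type_max:
        warnings.append(
            "This arithmetic operation might overflow. "
            "Further analysis of the code will assume that it cannot overflow."
        )
        # From here on, assume the addition did not overflow:
        # clamp the result to what the type can represent.
        hi = type_max
        lo = min(lo, type_max)
    return ValueRange(lo, hi)


warnings = []
x = ValueRange(0, UINT256_MAX)  # unconstrained variable
one = ValueRange(1, 1)          # the literal 1
result = add_ranges(x, one, warnings)
```

After this call, `result` covers 1 to `UINT256_MAX` and `warnings` holds one message, so later checks work with the clamped range instead of reporting the same overflow again.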
I gave SMT solvers some more thought. Apart from producing fewer false positives, they have one striking feature that the approach outlined above lacks: they can provide counterexamples ("If you call this function with inputs 1 and 7, the addition here will overflow"), which might be extremely valuable.
It seems that one of the best SMT solvers, z3, can be easily integrated as an external dependency via CMake and has an MIT license. It also supports SMT-Lib, a general SMT solver input format.
Because of that, I would like to keep the conditions and requirements as general as possible. We can then evaluate both the footprint of an external dependency such as this and its performance.
If the performance is too poor, but the footprint acceptable, we can first only run our manual and crude condition checker and put something like "If you want to run additional checks, use
If both the footprint and the performance are bad, we can output the SMT-Lib code and keep z3 as a fully external component.
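As a rough illustration of the "output the SMT-Lib code" option, the sketch below builds an SMT-LIB 2 query for a single unsigned addition. The function name `overflow_query` and the operand names `a` and `b` are made up for the example. The query relies on the fact that an unsigned bitvector addition has wrapped around exactly when the sum is smaller than one of the operands.

```python
def overflow_query(bits=256, lhs="a", rhs="b"):
    """Return an SMT-LIB 2 script that is satisfiable iff lhs + rhs can wrap."""
    return "\n".join([
        "(set-logic QF_BV)",
        f"(declare-const {lhs} (_ BitVec {bits}))",
        f"(declare-const {rhs} (_ BitVec {bits}))",
        "; unsigned addition wrapped around iff the sum is smaller than an operand",
        f"(assert (bvult (bvadd {lhs} {rhs}) {lhs}))",
        "(check-sat)",
        "(get-model)",
    ])


if __name__ == "__main__":
    print(overflow_query(bits=8))
```

Saved to a file, the output can be handed to any SMT-LIB 2 solver with bitvector support. If the solver answers `sat`, the model it prints is a concrete pair of inputs that overflows, i.e. the counterexample mentioned above. Known constraints on the operands would be added as further `assert` lines.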
The good news is that this is the approach I'm taking currently, with the exception that I'm storing ranges in a hashmap associated with the AST but not in AST nodes themselves. Otherwise, my WIP is an exemplar of everything up to 2.1, though a number of operators need to be implemented still.
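For readers unfamiliar with the side-table idea, this is a small sketch of ranges kept in a hashmap keyed by AST node rather than stored on the nodes. It is not the WIP code itself: the node classes `Literal`, `Variable` and `Add` and the function `compute_ranges` are hypothetical stand-ins, and only addition is handled.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes can be dict keys
class Literal:
    value: int


@dataclass(eq=False)
class Variable:
    name: str


@dataclass(eq=False)
class Add:
    left: object
    right: object


def compute_ranges(node, var_ranges, ranges):
    """Fill `ranges` (node -> (lo, hi)) bottom-up, starting from the atoms."""
    if isinstance(node, Literal):
        ranges[node] = (node.value, node.value)
    elif isinstance(node, Variable):
        ranges[node] = var_ranges[node.name]
    elif isinstance(node, Add):
        l_lo, l_hi = compute_ranges(node.left, var_ranges, ranges)
        r_lo, r_hi = compute_ranges(node.right, var_ranges, ranges)
        ranges[node] = (l_lo + r_lo, l_hi + r_hi)
    else:
        raise NotImplementedError(type(node).__name__)
    return ranges[node]


expr = Add(Variable("x"), Literal(1))
ranges = {}
compute_ranges(expr, {"x": (0, 255)}, ranges)
```

Here `ranges[expr]` ends up as `(1, 256)`, and the AST nodes themselves stay untouched, so the table can be discarded or rebuilt per context without mutating the tree.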
The entire approach you've outlined is how I'm proceeding, but it's worth noting that (3) is where it gets crazy in a hurry. Consider this situation:
Let's say that the known ranges are sufficiently wide that any of these clauses may or may not resolve to true. When we evaluate the