-
Notifications
You must be signed in to change notification settings - Fork 278
Incremental SMT2 support for enum values #7755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incremental SMT2 support for enum values #7755
Conversation
esteffin
commented
Jun 6, 2023
- Each commit message has a non-empty body, explaining why the change was made.
- Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
- The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
- Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
- My commit message includes data points confirming performance improvements (if claimed).
- My PR is restricted to a single feature or bugfix.
- White-space or formatting changes outside the feature-related changed lines are in commits of their own.
78fed5c
to
930993e
Compare
#include <solvers/smt2_incremental/type_size_mapping.h> | ||
#include <solvers/stack_decision_procedure.h> | ||
|
||
#include <memory> | ||
#include <unordered_map> | ||
|
||
#include "solvers/smt2_incremental/encoding/struct_encoding.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs to be #include <solvers/smt2_incremental/encoding/struct_encoding.h>
to fix your build issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix for this has been squashed into the wrong commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few initial comments.
#include <util/expr.h> | ||
#include <util/namespace.h> | ||
|
||
/// Function to lower c_enum_tag_typet to its enum underlying type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ The description doesn't quite make sense as it describes a change to types, but the function takes an expression. It would be nice to have an explanation here to cover a couple of points. Firstly that the function will lower enum expressions which are sub-expressions of expr
, rather than just expr
itself. Secondly, what the ns
parameter is used for. I remember right now that it is used for looking up the underlying types of enum types, but the reminder would be good.
5106d40
to
09b8bd3
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #7755 +/- ##
===========================================
+ Coverage 78.24% 78.56% +0.32%
===========================================
Files 1691 1693 +2
Lines 193183 193278 +95
===========================================
+ Hits 151154 151858 +704
+ Misses 42029 41420 -609
☔ View full report in Codecov by Sentry. |
09b8bd3
to
a52c93a
Compare
#include <solvers/smt2_incremental/type_size_mapping.h> | ||
#include <solvers/stack_decision_procedure.h> | ||
|
||
#include <memory> | ||
#include <unordered_map> | ||
|
||
#include "solvers/smt2_incremental/encoding/struct_encoding.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix for this has been squashed into the wrong commit.
c_enum_typet::c_enum_membert member; | ||
member.set_identifier("V" + std::to_string(i)); | ||
member.set_base_name("V" + std::to_string(i)); | ||
// Note: The value will be correctly set to a bv type when we know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this note actually true or is it the result of cut-and-paste? set_value
takes an irep_idt
/ string, therefore it has formatting rather than a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, changed.
member.set_base_name("V" + std::to_string(i)); | ||
// Note: The value will be correctly set to a bv type when we know | ||
// the width of the bitvector | ||
member.set_value(integer2string(i)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💥 integer2string
produces a string wise representation of the given integer in base 10 by default. The value should be set using integer2bvrep
see line 1383 of src/ansi-c/c_typecheck_type.cpp
. The difference between these functions is currently that one uses base 10 and the other uses base 16, so there is less of a difference if your numbers are single digit only, which currently happens to be the case. Note that integer2bvrep
should be used consistently, in case we ever change the base. If memory serves me correctly, we formerly used base 2.
⛏️ It might be good to have some enum constants which have value greater than 16 in order to confirm the handling of base in all the relevant places is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
Changed
@@ -31,10 +31,40 @@ static mp_integer max_int(const std::size_t bits) | |||
return power2(bits) - 1; | |||
} | |||
|
|||
static symbolt make_c_enum_symbol(int underlying_size) | |||
{ | |||
const signedbv_typet underlying_type{42}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be using underlying_size
rather than hard coded to 42? You could of course remove the parameter and make all the enums in this example the same width...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point.
Changed
const namespacet ns{symbol_table}; | ||
const symbolt enum_symbol = make_c_enum_symbol(42); | ||
const symbolt enum_tag_symbol = make_c_enum_tag_symbol(enum_symbol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ I think there may be better names to use here than enum_symbol
and enum_tag_symbol
. To me the key thing which distinguishes these is that one of them defines the type and the other one defines a value which has enum type. So maybe enum_type_symbol
and enum_value_symbol
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
Changed to enum_type_symbol
and enum_tag_value_symbol
int value_count; | ||
using rowt = std::pair<int, int>; | ||
std::tie(size, value_count) = | ||
GENERATE(rowt{42, 5}, rowt{2, 5}, rowt{0, 5}, rowt{42, 1}, rowt{42, 0}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I like the use of GENERATE
here as it makes it easier to see what has and has not been tested. 🤔 This allows me to easily spot and point out that the most commonly found architectures are likely to have enums which are 64 or 32 bits wide, to match the size of int on that platform. Also, I would expect 0 bits wide to cause an error in the subsequent expression to term conversion. So it would actually be preferable if there was an INVARIANT in the lowering on the number of bits wide being greater than 0, in order to aid with fault finding by failing earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 32
and 64
as sizes.
I will squash eventually
I removed the 0
-sized case as it should never happen before and the invariant should be checked later in the decision procedure code.
std::tie(size, value_count) = | ||
GENERATE(rowt{42, 5}, rowt{2, 5}, rowt{0, 5}, rowt{42, 1}, rowt{42, 0}); | ||
SECTION( | ||
"Checking enum of underlying size " + std::to_string(size) + " with " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ℹ️ There is an INFO
macro in catch which can be used to add this kind of information without increasing the indent of all the following test code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
Changed.
const array_typet lowered_array_type{ | ||
enum_underlying_type, from_integer(2, unsignedbv_typet{32})}; | ||
const array_exprt original_array_expr{ | ||
{from_integer(3, unsignedbv_typet{32}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Shouldn't the elements in this input array have enum type rather than a bit vector type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Fixed
{ | ||
const auto handle_find_result = expression_handle_identifiers.find(expr); | ||
const exprt lowered_expr = lower_enum(expr, ns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not going to request any further changes around this issue in this PR as it seems to work as-is and I suspect its a wider issue than the enum lowering alone, but it is somewhat inconsistent that we are lowering the enums here, but not doing the other flavours of lowering. I think there are other inconsistencies around whether the expressions which are placed in the maps have been lowered or not. This is something I intend to investigate further myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but the aim of make it consistent will be done in a subsequent PR.
@@ -28,50 +29,61 @@ static c_enum_typet make_c_enum_type(const typet &underlying_type) | |||
return enum_type; | |||
} | |||
|
|||
TEST_CASE("enum encoding of expressions", "[core][smt2_incremental]") | |||
TEST_CASE("enum corner cases", "[core][smt2_incremental]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I preferred the previous test description, as maybe we shouldn't label the entirety of the enum tests as corner cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, restored
My remaining comments are mostly qualms about the unit tests, but the code they are testing + the regression tests look good to me. The "Improved tests for enum in incremental smt2 solver" looks like an improvement and I am happy for it to be squashed into the appropriate unit test commit before merging. I'd like to get that include switch around in the first couple of commits fixed as well, so it doesn't introduce a potential issue and fix it within the same PR. |
As support for non-native C types (e.g. enums) is added in the incremental smt2 solver, other new encoding functions will be added. For this reason, to avoid putting all the encoding functions in the incremental smt2 solver root folder a new `encoding` folder has been created and the struct encoding functions have been moved there.
In order to support expressions with type `c_enum_tag_typet` in the incremental SMT2 solver, a new `lower_enum` has been added. Such function lowers the type of each expression node in the AST with type `c_enum_tag_type` to the underlying type of the `c_enum_typet` that the tag points to.
This commit enables the generation of traces containing expression of type `c_enum_tag_typet`.
a52c93a
to
b0087bd
Compare