Skip to content

Incremental SMT2 support for enum values #7755

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 14, 2023

Conversation

esteffin
Copy link
Contributor

@esteffin esteffin commented Jun 6, 2023

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

#include <solvers/smt2_incremental/type_size_mapping.h>
#include <solvers/stack_decision_procedure.h>

#include <memory>
#include <unordered_map>

#include "solvers/smt2_incremental/encoding/struct_encoding.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs to be #include <solvers/smt2_incremental/encoding/struct_encoding.h> to fix your build issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix for this has been squashed into the wrong commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, changed

Copy link
Contributor

@thomasspriggs thomasspriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few initial comments.

#include <util/expr.h>
#include <util/namespace.h>

/// Function to lower c_enum_tag_typet to its enum underlying type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ The description doesn't quite make sense as it describes a change to types, but the function takes an expression. It would be nice to have an explanation here to cover a couple of points. Firstly that the function will lower enum expressions which are sub-expressions of expr, rather than just expr itself. Secondly, what the ns parameter is used for. I remember right now that it is used for looking up the underlying types of enum types, but the reminder would be good.

@esteffin esteffin force-pushed the esteffin/incr-smt2-c-enum-tag branch 10 times, most recently from 5106d40 to 09b8bd3 Compare June 12, 2023 14:58
@codecov
Copy link

codecov bot commented Jun 12, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.32 🎉

Comparison is base (95f5db1) 78.24% compared to head (a52c93a) 78.56%.

❗ Current head a52c93a differs from pull request most recent head b0087bd. Consider uploading reports for the commit b0087bd to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7755      +/-   ##
===========================================
+ Coverage    78.24%   78.56%   +0.32%     
===========================================
  Files         1691     1693       +2     
  Lines       193183   193278      +95     
===========================================
+ Hits        151154   151858     +704     
+ Misses       42029    41420     -609     
Impacted Files Coverage Δ
...vers/smt2_incremental/encoding/struct_encoding.cpp 89.18% <ø> (ø)
...olvers/smt2_incremental/encoding/struct_encoding.h 100.00% <ø> (ø)
..._incremental/smt2_incremental_decision_procedure.h 75.00% <ø> (ø)
...vers/smt2_incremental/encoding/struct_encoding.cpp 100.00% <ø> (ø)
...smt2_incremental/construct_value_expr_from_smt.cpp 100.00% <100.00%> (ø)
...olvers/smt2_incremental/encoding/enum_encoding.cpp 100.00% <100.00%> (ø)
...ncremental/smt2_incremental_decision_procedure.cpp 96.46% <100.00%> (+0.04%) ⬆️
...smt2_incremental/construct_value_expr_from_smt.cpp 100.00% <100.00%> (ø)
...olvers/smt2_incremental/encoding/enum_encoding.cpp 100.00% <100.00%> (ø)

... and 40 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@esteffin esteffin force-pushed the esteffin/incr-smt2-c-enum-tag branch from 09b8bd3 to a52c93a Compare June 12, 2023 16:26
#include <solvers/smt2_incremental/type_size_mapping.h>
#include <solvers/stack_decision_procedure.h>

#include <memory>
#include <unordered_map>

#include "solvers/smt2_incremental/encoding/struct_encoding.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix for this has been squashed into the wrong commit.

c_enum_typet::c_enum_membert member;
member.set_identifier("V" + std::to_string(i));
member.set_base_name("V" + std::to_string(i));
// Note: The value will be correctly set to a bv type when we know
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this note actually true or is it the result of cut-and-paste? set_value takes an irep_idt / string, therefore it has formatting rather than a type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, changed.

member.set_base_name("V" + std::to_string(i));
// Note: The value will be correctly set to a bv type when we know
// the width of the bitvector
member.set_value(integer2string(i));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💥 integer2string produces a string wise representation of the given integer in base 10 by default. The value should be set using integer2bvrep see line 1383 of src/ansi-c/c_typecheck_type.cpp. The difference between these functions is currently that one uses base 10 and the other uses base 16, so there is less of a difference if your numbers are single digit only, which currently happens to be the case. Note that integer2bvrep should be used consistently, in case we ever change the base. If memory serves me correctly, we formerly used base 2.
⛏️ It might be good to have some enum constants which have value greater than 16 in order to confirm the handling of base in all the relevant places is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.
Changed

@@ -31,10 +31,40 @@ static mp_integer max_int(const std::size_t bits)
return power2(bits) - 1;
}

static symbolt make_c_enum_symbol(int underlying_size)
{
const signedbv_typet underlying_type{42};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be using underlying_size rather than hard coded to 42? You could of course remove the parameter and make all the enums in this example the same width...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point.
Changed

const namespacet ns{symbol_table};
const symbolt enum_symbol = make_c_enum_symbol(42);
const symbolt enum_tag_symbol = make_c_enum_tag_symbol(enum_symbol);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I think there may be better names to use here than enum_symbol and enum_tag_symbol. To me the key thing which distinguishes these is that one of them defines the type and the other one defines a value which has enum type. So maybe enum_type_symbol and enum_value_symbol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.
Changed to enum_type_symbol and enum_tag_value_symbol

int value_count;
using rowt = std::pair<int, int>;
std::tie(size, value_count) =
GENERATE(rowt{42, 5}, rowt{2, 5}, rowt{0, 5}, rowt{42, 1}, rowt{42, 0});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like the use of GENERATE here as it makes it easier to see what has and has not been tested. 🤔 This allows me to easily spot and point out that the most commonly found architectures are likely to have enums which are 64 or 32 bits wide, to match the size of int on that platform. Also, I would expect 0 bits wide to cause an error in the subsequent expression to term conversion. So it would actually be preferable if there was an INVARIANT in the lowering on the number of bits wide being greater than 0, in order to aid with fault finding by failing earlier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 32 and 64 as sizes.
I will squash eventually
I removed the 0-sized case as it should never happen before and the invariant should be checked later in the decision procedure code.

std::tie(size, value_count) =
GENERATE(rowt{42, 5}, rowt{2, 5}, rowt{0, 5}, rowt{42, 1}, rowt{42, 0});
SECTION(
"Checking enum of underlying size " + std::to_string(size) + " with " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ℹ️ There is an INFO macro in catch which can be used to add this kind of information without increasing the indent of all the following test code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.
Changed.

const array_typet lowered_array_type{
enum_underlying_type, from_integer(2, unsignedbv_typet{32})};
const array_exprt original_array_expr{
{from_integer(3, unsignedbv_typet{32}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Shouldn't the elements in this input array have enum type rather than a bit vector type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
Fixed

{
const auto handle_find_result = expression_handle_identifiers.find(expr);
const exprt lowered_expr = lower_enum(expr, ns);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not going to request any further changes around this issue in this PR as it seems to work as-is and I suspect its a wider issue than the enum lowering alone, but it is somewhat inconsistent that we are lowering the enums here, but not doing the other flavours of lowering. I think there are other inconsistencies around whether the expressions which are placed in the maps have been lowered or not. This is something I intend to investigate further myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the aim of make it consistent will be done in a subsequent PR.

@@ -28,50 +29,61 @@ static c_enum_typet make_c_enum_type(const typet &underlying_type)
return enum_type;
}

TEST_CASE("enum encoding of expressions", "[core][smt2_incremental]")
TEST_CASE("enum corner cases", "[core][smt2_incremental]")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I preferred the previous test description, as maybe we shouldn't label the entirety of the enum tests as corner cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, restored

@thomasspriggs
Copy link
Contributor

My remaining comments are mostly qualms about the unit tests, but the code they are testing + the regression tests look good to me. The "Improved tests for enum in incremental smt2 solver" looks like an improvement and I am happy for it to be squashed into the appropriate unit test commit before merging. I'd like to get that include switch around in the first couple of commits fixed as well, so it doesn't introduce a potential issue and fix it within the same PR.

Enrico Steffinlongo added 6 commits June 14, 2023 10:34
As support for non-native C types (e.g. enums) is added in the
incremental smt2 solver, other new encoding functions will be added. For
this reason, to avoid putting all the encoding functions in the
incremental smt2 solver root folder a new `encoding` folder has been
created and the struct encoding functions have been moved there.
In order to support expressions with type `c_enum_tag_typet` in the
incremental SMT2 solver, a new `lower_enum` has been added.
Such function lowers the type of each expression node in the AST with
type `c_enum_tag_type` to the underlying type of the `c_enum_typet` that
the tag points to.
This commit enables the generation of traces containing expression of
type `c_enum_tag_typet`.
@esteffin esteffin force-pushed the esteffin/incr-smt2-c-enum-tag branch from a52c93a to b0087bd Compare June 14, 2023 17:33
@esteffin esteffin merged commit 5d80835 into diffblue:develop Jun 14, 2023
@esteffin esteffin deleted the esteffin/incr-smt2-c-enum-tag branch June 14, 2023 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants