Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Help needed] [clang-cpp] Support ArrayInitLoopExpr #1793

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

intrigus-lgtm
Copy link
Contributor

@intrigus-lgtm intrigus-lgtm commented Apr 23, 2024

This PR tries to fix #1792.

Unfortunately I could not get this to really work.
My proof of concept (basically: 6bad2dd#diff-784347884400427356b283749ef36cc6e458d4d4df4f8e3a2a4007244a637b1eR1057-R1176) which I implemented in clang_cpp_convert.cpp actually successfully verified the test case, however I would assume that this would be the wrong place for the code.

I was thinking that clang_cpp_adjust_expr.cpp would be a better place, so I tried to move my code there.
However it fails with migrate expr failed.

I would really appreciate if someone could take a look at this or guide me because I feel like this should be pretty simple to implement for someone that is more familiar with the code.

The high level goal of the PR is to support code like this:

struct a
{
  int b[3];
};
int main()
{
  a d{};
  a e{d}; // copy constructor
}

When copying d to e, the array b also has to be copied.
Clang represents that using an ArrayInitLoopExpr.
So basically this should compile to this code:

Symbol......: c:@S@a@F@a#&1$@S@a#
Module......: cpptest.pre
Base name...: a
Mode........: C++
Type........: constructor  (struct a *, const struct a &)
Value.......: 
{

array_init_loop_expr_tmp1 = &a::ref->b[0];
this->b[0] = array_init_loop_expr_tmp1[0];
this->b[1] = array_init_loop_expr_tmp1[1];
this->b[2] = array_init_loop_expr_tmp1[2];

}

Currently it looks like this (maybe that is the problem?):

Symbol......: c:@S@a@F@a#&1$@S@a#
Module......: cpptest.pre
Base name...: a
Mode........: C++
Type........: constructor  (struct a *, const struct a &)
Value.......: 
{

{
array_init_loop_expr_tmp1 = &a::ref->b[0];

this->b[0] = array_init_loop_expr_tmp1[0];

this->b[1] = array_init_loop_expr_tmp1[1];

this->b[2] = array_init_loop_expr_tmp1[2];

};

}

(The dag_match_replace_helper stuff can be moved elsewhere if needed)

This is used in copy/move ctors for arrays,
lambdas that capture arrays by value and
decomposition expressions.
@intrigus-lgtm intrigus-lgtm changed the title [clang-cpp] Support ArrayInitLoopExpr [Help needed] [clang-cpp] Support ArrayInitLoopExpr Apr 23, 2024
@fbrausse
Copy link
Member

Looks to me like this is generating a code_blockt inside an code_expressiont, which is unexpected. If you look at goto_convertt::convert_expression() you'll notice it never calls the top-level convert() again.

I also noticed you keep an empty id on the array_init_index_expr. That makes the .dump() output (and probably the current one from your symbol table) hard to read.

Comment on lines +347 to +351
symbolt source_array_symbol = tmp_symbol.new_symbol(
context, array_element_ptr_type, "array_init_loop_expr_tmp");
context.add(source_array_symbol);
source_array_symbol.lvalue = true;
source_array_symbol.value = source_array_expr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .new_symbol() call already registers the symbol with the context (hence it returns a reference), no need to add it again. In fact, it should be accessed by reference, otherwise you're modifying a copy outside of the context with the last 2 statements.

Copy link
Member

@mikhailramalho mikhailramalho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @intrigus-lgtm I think we need to take a step back here and check if this is the correct approach. TBH, the adjust code seems a bit over-engineered.

Whenever we work on the frontend I always suggest we first check if the feature works in the old frontend, then on cbmc. If it does in either of them, half of the work is done.

For this one, cbmc does support the feature, so it's very likely that if we create an irep similar to the one there, the rest of esbmc will process it properly, in this case, the irep is:

  0: code
      * type: empty
      * statement: decl_block
      0: code
          * type: empty
          * #source_location: 
            * file: main2.cpp
            * line: 8
            * function: main
          * statement: decl
          0: symbol
              * type: struct_tag
                  * identifier: tag-a
              * identifier: main::1::d
              * #lvalue: 1
          1: struct
              * type: struct_tag
                  * identifier: tag-a
              * #source_location: 
                * file: main2.cpp
                * line: 8
                * function: main
              0: array
                  * type: array
                      * #member_name: tag-a
                      * size: constant
                          * type: signedbv
                              * width: 64
                              * #c_type: signed_long_int
                          * value: 3
                      0: signedbv
                          * width: 32
                          * #c_type: signed_int
                  * #source_location: 
                    * file: main2.cpp
                    * line: 8
                    * function: main
                  0: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 8
                        * function: main
                      * value: 0
                  1: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 8
                        * function: main
                      * value: 0
                  2: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 8
                        * function: main
                      * value: 0

Can you try to create a similar irep on the convert file?

@mikhailramalho
Copy link
Member

Bear in mind that esbmc has some different names for stuff (e.g, location instead of source_location), so you'll probably need to take a look around to see what's used in esbmc.

@XLiZHI
Copy link
Collaborator

XLiZHI commented Apr 24, 2024

For reference, this is the symbol table of CBMC.

Symbol......: main
Pretty name.: main()
Module......: main8
Base name...: main
Mode........: C
Type........: signed int (void)
Value.......: {
  struct a d={ .a::b={ 0, 0, 0 } };

  struct a e;
  e = d;

}
Flags.......:
Location....: file main8.cpp line 5

Symbol......: main::1::d
Pretty name.: main::1::d
Module......: main8
Base name...: d
Mode........: cpp
Type........: struct a
Value.......: { .b={ 0, 0, 0 } }
Flags.......: lvalue thread_local file_local
Location....: file main8.cpp line 7 function main

Symbol......: main::1::e
Pretty name.: main::1::e
Module......: main8
Base name...: e
Mode........: cpp
Type........: struct a
Value.......: 
Flags.......: lvalue thread_local file_local
Location....: file main8.cpp line 8 function main

@intrigus-lgtm
Copy link
Contributor Author

hey @intrigus-lgtm I think we need to take a step back here and check if this is the correct approach. TBH, the adjust code seems a bit over-engineered.

Whenever we work on the frontend I always suggest we first check if the feature works in the old frontend, then on cbmc. If it does in either of them, half of the work is done.

For this one, cbmc does support the feature, so it's very likely that if we create an irep similar to the one there, the rest of esbmc will process it properly, in this case, the irep is:

SNIP

Can you try to create a similar irep on the convert file?

@mikhailramalho
Can you please tell me how you tested this code in cbmc?
As far as I am aware, there is no C equivalent for the problem I'm trying to solve.
The problem is not with the "THIS" line, but with the "Hypothetical syntax" line that the ArrayInitLoopExpr basically represents.

  struct a { int b[3]; };
  struct a d={};

  struct a e;
  e = d; // THIS
  e.b = d.b // Hypothetical syntax

I could also not get my example to work with the old c++ frontend; it seems that it doesn't like aggregate inits.

Wouldn't it be better to try to represent the clang ast as faithfully as possible instead of assuming that it will always have a certain shape?

@mikhailramalho
Copy link
Member

hey, so cbmc also verifies cpp files, so just create a foo.cpp file and paste your code there. To print the symbol table, run cbmc with --show-symbol-table. I also use this patch to print the ireps:

diff --git a/src/goto-programs/show_symbol_table.cpp b/src/goto-programs/show_symbol_table.cpp
index 4a5b870ab6..936cb9487c 100644
--- a/src/goto-programs/show_symbol_table.cpp
+++ b/src/goto-programs/show_symbol_table.cpp
@@ -83,11 +83,16 @@ void show_symbol_table_plain(
 
     std::string type_str, value_str;
 
-    if(symbol.type.is_not_nil())
+    if(symbol.type.is_not_nil()){
+      out << symbol.type.pretty() << '\n';
       ptr->from_type(symbol.type, type_str, ns);
+    }
 
-    if(symbol.value.is_not_nil())
+
+    if(symbol.value.is_not_nil()) {
+      out << symbol.value.pretty() << '\n';
       ptr->from_expr(symbol.value, value_str, ns);
+    }
 
     out << "Symbol......: " << symbol.name << '\n' << std::flush;
     out << "Pretty name.: " << symbol.pretty_name << '\n';

cbmc doesn't handle the cassert header either, so you'll need to use __CPROVER_assert instead.

The symbol table and irep for:

int main()
{
  struct a { int b[3]; };
  struct a d={};

  struct a e;
  e = d; // THIS
//  e.b = d.b // Hypothetical syntax
}

is

code
  * return_type: signedbv
      * width: 32
      * #c_type: signed_int
  * parameters: 
code
  * type: code
      * return_type: signedbv
          * width: 32
          * #c_type: signed_int
      * parameters: 
  * #source_location: 
    * file: main2.cpp
    * line: 4
    * function: main
  * statement: block
  0: code
      * type: empty
      * statement: decl_block
  1: code
      * type: empty
      * statement: decl_block
      0: code
          * type: empty
          * #source_location: 
            * file: main2.cpp
            * line: 6
            * function: main
          * statement: decl
          0: symbol
              * type: struct_tag
                  * identifier: main::1::tag-a
              * identifier: main::1::d
              * #lvalue: 1
          1: struct
              * type: struct_tag
                  * identifier: main::1::tag-a
              * #source_location: 
                * file: main2.cpp
                * line: 6
                * function: main
              0: array
                  * type: array
                      * #member_name: main::1::tag-a
                      * size: constant
                          * type: signedbv
                              * width: 64
                              * #c_type: signed_long_int
                          * value: 3
                      0: signedbv
                          * width: 32
                          * #c_type: signed_int
                  * #source_location: 
                    * file: main2.cpp
                    * line: 6
                    * function: main
                  0: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 6
                        * function: main
                      * value: 0
                  1: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 6
                        * function: main
                      * value: 0
                  2: constant
                      * type: signedbv
                          * width: 32
                          * #c_type: signed_int
                      * #source_location: 
                        * file: main2.cpp
                        * line: 6
                        * function: main
                      * value: 0
  2: code
      * type: empty
      * statement: decl_block
      0: code
          * type: empty
          * #source_location: 
            * file: main2.cpp
            * line: 8
            * function: main
          * statement: decl
          0: symbol
              * type: struct_tag
                  * identifier: main::1::tag-a
              * identifier: main::1::e
              * #lvalue: 1
  3: code
      * type: empty
      * #source_location: 
        * file: main2.cpp
        * line: 9
        * function: main
      * statement: expression
      0: side_effect
          * type: struct_tag
              * identifier: main::1::tag-a
          * #source_location: 
            * file: main2.cpp
            * line: 9
            * function: main
          * statement: assign
          * #lvalue: 1
          0: symbol
              * type: struct_tag
                  * identifier: main::1::tag-a
              * #source_location: 
                * file: main2.cpp
                * line: 9
                * function: main
              * identifier: main::1::e
              * #lvalue: 1
          1: symbol
              * type: struct_tag
                  * identifier: main::1::tag-a
              * #source_location: 
                * file: main2.cpp
                * line: 9
                * function: main
              * identifier: main::1::d
Symbol......: main
Pretty name.: main()
Module......: main2
Base name...: main
Mode........: C
Type........: signed int (void)
Value.......: {

  struct main::1::a d={ .main::1::a::b={ 0, 0, 0 } };

  struct main::1::a e;

  e = d;
}
Flags.......:
Location....: file main2.cpp line 3

@intrigus-lgtm
Copy link
Contributor Author

hey, so cbmc also verifies cpp files, so just create a foo.cpp file and paste your code there.

Wow, I didn't know that. Thank you very much!
The documentation does a pretty good job at hiding that fact^^
(The GitHub page does say that it also supports C++, but I missed that)

I'll try to look into this a bit more later this week/next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support ArrayInitLoopExprClass.
5 participants