Skip to content

Conversation

qinheping
Copy link
Collaborator

@qinheping qinheping commented May 13, 2022

This PR is an alternative approach to address the problem introduced in #6847. Comparing to #6847, in this patch,

  • the original expressions, instead of the names of symbol_exprt, are associate to tmp_post variables. The description of the violated property becomes
    dereference failure: pointer outside object bounds in *ptr++
  • from_expr_using_mode will print the original expressions on tmp_post. So the assertion for the violated property becomes
    POINTER_OFFSET(ptr++) >= 0l && OBJECT_SIZE(ptr++) >= (unsigned long int)POINTER_OFFSET(ptr++) + 4ul
    Note that the original expression with side-effect is in the above predicate.

As am example, below is the result and the trace for the test tmp_post_with_origin1 added in this PR.

** Results:
<builtin-library-malloc> function malloc
[malloc.assertion.1] line 26 max allocation size exceeded: SUCCESS
[malloc.assertion.2] line 31 max allocation may fail: SUCCESS

main.c function main
[main.pointer_dereference.1] line 6 dereference failure: pointer NULL in *ptr++: SUCCESS
[main.pointer_dereference.2] line 6 dereference failure: pointer invalid in *ptr++: SUCCESS
[main.pointer_dereference.3] line 6 dereference failure: deallocated dynamic object in *ptr++: SUCCESS
[main.pointer_dereference.4] line 6 dereference failure: dead object in *ptr++: SUCCESS
[main.pointer_dereference.5] line 6 dereference failure: pointer outside object bounds in *ptr++: FAILURE
[main.pointer_dereference.6] line 6 dereference failure: invalid integer address in *ptr++: SUCCESS

Trace for main.pointer_dereference.5:

Assumption:
  argc' >= 0

State 20 function __CPROVER__start thread 0
----------------------------------------------------
  INPUT argc: 1073741888 (01000000 00000000 00000000 01000000)

State 24 file main.c function __CPROVER__start line 1 thread 0
----------------------------------------------------
  argc=1073741888 (01000000 00000000 00000000 01000000)

State 25 file main.c function __CPROVER__start line 1 thread 0
----------------------------------------------------
  argv=argv' (00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 26 file main.c function main line 3 thread 0
----------------------------------------------------
  ptr=((signed int *)NULL) (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 30 file main.c function main line 4 thread 0
----------------------------------------------------
  malloc_size=2ul (00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010)

State 53 file main.c function main line 4 thread 0
----------------------------------------------------
  ptr=(signed int *)&dynamic_object1 (00000011 00000000 00000000 00000000 00000000 00000000 00000000 00000000)

State 54 file main.c function main line 5 thread 0
----------------------------------------------------
  ptr=(signed int *)((char *)&dynamic_object1 + 8l) (00000011 00000000 00000000 00000000 00000000 00000000 00000000 00001000)

State 58 file main.c function main line 6 thread 0
----------------------------------------------------
  ptr=(signed int *)((char *)&dynamic_object1 + 12l) (00000011 00000000 00000000 00000000 00000000 00000000 00000000 00001100)

Violated property:
  file main.c function main line 6 thread 0
  dereference failure: pointer outside object bounds in *ptr++
  POINTER_OFFSET(ptr++) >= 0l && OBJECT_SIZE(ptr++) >= (unsigned long int)POINTER_OFFSET(ptr++) + 4ul



** 1 of 8 failed (2 iterations)
VERIFICATION FAILED
  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@kroening
Copy link
Collaborator

This is not unproblematic. Remnants of these annotations might be very confusing, and it might be very confusing to read debug output.

@@ -315,6 +317,7 @@ void goto_convertt::remove_post(
{
exprt tmp = op;
make_temp_symbol(tmp, "post", dest, mode);
tmp.set(ID_tmp_post_origin, exprt(expr));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the idea of setting pretty_name of the symbol introduced by make_temp_symbol not work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The pretty_name of the symbol introduced by make_temp_symbol is not printed in the report and the trace. They instead print from_expr(e) where e is the tmp_post variable with type symbol_exprt.
I think although we can add a special case into expr2c to print pretty_name, it suffers the same issue as Daniel pointed out. Printing tmp_post as *ptr++ is very confusing that we cannot distinguish if it is the 'real' *ptr++, or it is tmp_post's pretty name.

@codecov
Copy link

codecov bot commented May 13, 2022

Codecov Report

Merging #6855 (b709f03) into develop (a188777) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #6855   +/-   ##
========================================
  Coverage    77.79%   77.79%           
========================================
  Files         1567     1567           
  Lines       179707   179706    -1     
========================================
+ Hits        139797   139806    +9     
+ Misses       39910    39900   -10     
Impacted Files Coverage Δ
src/ansi-c/expr2c.cpp 68.23% <100.00%> (+0.63%) ⬆️
src/goto-programs/goto_convert_side_effect.cpp 95.50% <100.00%> (+0.01%) ⬆️
src/ansi-c/cprover_library.cpp 87.50% <0.00%> (-10.12%) ⬇️
src/goto-instrument/accelerate/scratch_program.cpp 50.51% <0.00%> (-1.00%) ⬇️
src/pointer-analysis/goto_program_dereference.cpp 56.80% <0.00%> (-0.69%) ⬇️
src/goto-instrument/nondet_volatile.cpp 86.91% <0.00%> (-0.14%) ⬇️
src/goto-analyzer/taint_analysis.cpp 78.74% <0.00%> (-0.11%) ⬇️
src/goto-programs/goto_trace.cpp 81.41% <0.00%> (-0.10%) ⬇️
src/goto-programs/string_abstraction.cpp 92.43% <0.00%> (-0.02%) ⬇️
src/analyses/goto_rw.cpp 64.36% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58d64d8...b709f03. Read the comment docs.

@qinheping
Copy link
Collaborator Author

This is not unproblematic. Remnants of these annotations might be very confusing, and it might be very confusing to read debug output.

Yes, printing tmp_post as *ptr++ is very confusing that we cannot distinguish if it is the 'real' *ptr++, or it is just the pretty name of tmp_post. I am not sure about the following options, and want to hear what you think

  1. For tmp_post variable, we print (using from_expr_using_mode) them as tmp_post(=*ptr++) or something like that to distinguish them from the 'real' *ptr++
  2. Rename tmp_post as tmp_post_ptr where ptr is the name of the variable that is actually dereferenced. But this workaround only work for post-increment of variable such as *ptr++ but not *(*ptr)++ or *(ptr[0]++)
  3. Print tmp_post as *ptr++ only in the violated property description. Print tmp_post as tmp_post everywhere else.
  4. Print a hint to tell users they can find the original expression of tmp_post in the goto program when a tmp_post appears in the report or trace.

I endorse 1 or 4.

@tautschnig
Copy link
Collaborator

1. For `tmp_post` variable, we print (using `from_expr_using_mode`) them as `tmp_post(=*ptr++)` or something like that to distinguish them from the 'real' `*ptr++`

This breaks goto-instrument --dump-c, which expects from_expr to produce actual C code.

2. Rename `tmp_post` as `tmp_post_ptr` where `ptr` is the name of the variable that is actually dereferenced. But this workaround only work for post-increment of variable such as `*ptr++` but not `*(*ptr)++` or `*(ptr[0]++)`

Merged via #6847, which I still think is a useful first step.

3. Print `tmp_post` as `*ptr++` only in the violated property description. Print `tmp_post` as `tmp_post` everywhere else.

That seems feasible to me for it would not affect other code.

4. Print a hint to tell users they can find the original expression of `tmp_post` in the goto program when a `tmp_post` appears in the report or trace.

I'm worried that we are introducing an extra hoop.

@qinheping qinheping closed this Aug 24, 2022
@qinheping qinheping deleted the using_original_expression_instead_of_tmp_post branch January 18, 2023 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants