-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to the shape abstract domain and to memory safety analysis #120
Conversation
This is useful when analysing pointer and memory safety, respectively.
If there is and incompatibility between the type of the template row expression and the address of pointed object, use typecast. This is currently used when dealing with CPROVER-specific pointer variables, which are of (void *) type. Hence, we allow incompatible type returned by the solver for these variables only.
When adding paths from other row, remove the other row as a destination if it has been added to another path. When clearing pointed rows, do not clear itself and remove poining row from the pointed_by set.
This information is used to prevent creation of PHI nodes for dynamic objects for 'if' branches in case where the object allocation occurs in one incoming branch only.
… condition This prevents from getting non-deterministic values of dynamic object fields incoming from other branches (where the dynamic object was not allocated) during analysis.
…ignment This is needed for correct RHS concretisation.
This test demonstrates the power of the combination of pointer and interval domains.
…object The concrete object has "$co" suffix and there is always only one allocated. It is created non-deterministically, but with the condition that no other pointer in the loop points to this object (i.e. it has not been allocated before).
…n template These guards are used in the post-condition of template rows of the heap domain that contain members of dynamic objects in the same loop where the object was allocated. This is required to make sure that the object is analysed only when it is created.
Modify assertions for dereferencing deallocated objects to perform checks on concrete objects only. Since a concrete object allocated within a loop has a non-deterministic position within the abstract object, this approach is sound.
…ression A pair is stored as an "and" expression. This is used to track dependency between a pair of pointer variables.
… pointers This is useful to prove that a deallocated object is never dereferenced. Since the assertions checks for this property are done on conrete objects only, we need to prove that once a concrete object was deallocated, it cannot be reached in the next iteration.
It shows that tracking the points-to relation is enough since we use abstract objects and a path is expressed by field of an abstract object pointing to the same object.
Tests check pointer safety for linked lists.
This version of the test did not previously work because of a bug related to symbolic dereference objects occuring on both sides of an assignment.
…ides The bug was related to finding the symbol at LHS (since there may be multiple chained members and we need to get to the root object).
…s reachable Loop guards collected in the SSA form are pairs (loop-select and loop head guards). When checking the solver for symbolic path, both guards must be true to set the appropriate loop-select guard in the sympath.
GOTO instructions having "false" as guard are removed since some of them were backwards jumps treated as loops.
Previously, we set free::record to true, which in case of multiple frees in a loop caused just the last free to be recorded into the invariant. Now, we let free::record to be undefined, but instead update template rows for __CPROVER_deallocated only if one of free::record in the loop is true.
Currently, this is a part of heap-interval domain. This should be refactored to heap-polyhedra domain in future.
Disabling 'packet_filter' since it takes more than 500 seconds. Moving failing version of running example from 'heap' to 'heap-data'.
Now, the format is: malloc_value = cond1 ? (void *)&dyn_obj1 : (cond2 ? (void *)&dyn_obj2 ....) Also use read_rhs when collecting allocation guards to obtain correct versions of variables occuring in guards (this is needed when dealing with #co objects since their guards contain other pointers).
Multiple tests end inconclusive due to dynamic object splitting.
For more detail see 6f8efca. This strenghtens the condition for not creating a PHI node for a dynamic object field when merging branches of a conditional in case the object was allocated in one branch only.
We cannot iterate container while changing it.
We cannot iterate container while changing it.
The renaming commit probably forgot to delete the fmcad18 directory.
@@ -1,4 +1,4 @@ | |||
CORE | |||
KNOWNBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do these fail now? Please create an issue for them so that the bug can be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fail because we split abstract dynamic objects into multiple and such abstraction is not strong enough to reason about these programs. Should I create an issue for a stronger shape domain?
} | ||
|
||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spurious newline (several occurrences)
src/2ls/2ls_parse_options.cpp
Outdated
@@ -1745,6 +1804,8 @@ void twols_parse_optionst::help() | |||
" --zones use zone domain\n" | |||
" --octagons use octagon domain\n" | |||
" --heap-interval use heap domain with interval domain for values\n" // NOLINT(*) | |||
" --heap-zones use heap domain with zones domain for values\n" // NOLINT(*) | |||
" --heap-zones use heap domain with dynamically incrementing stregth of value domain\n" // NOLINT(*) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--heap-values-incremental?
stregth -> strength
src/2ls/2ls_parse_options.h
Outdated
@@ -37,7 +38,8 @@ class optionst; | |||
"(version)" \ | |||
"(i386-linux)(i386-macos)(i386-win32)(win32)(winx64)(gcc)" \ | |||
"(ppc-macos)(unsigned-char)" \ | |||
"(havoc)(intervals)(zones)(octagons)(equalities)(heap)(heap-interval)"\ | |||
"(havoc)(intervals)(zones)(octagons)(equalities)"\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put each option on a separate line to simplify merging in future
src/domains/strategy_solver_heap.cpp
Outdated
bool strategy_solver_heapt::is_cprover_symbol(const exprt &expr) | ||
{ | ||
return expr.id()==ID_symbol && | ||
id2string(to_symbol_expr(expr).get_identifier()).find("__CPROVER_")==0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use has_prefix()
and CPROVER_PREFIX
src/ssa/dynobj_instance_analysis.cpp
Outdated
|
||
// Do not include CPROVER symbols | ||
if(lhs.id()==ID_symbol && | ||
id2string(to_symbol_expr(lhs).get_identifier()).find("__CPROVER")!= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
src/ssa/dynobj_instance_analysis.cpp
Outdated
@@ -0,0 +1,318 @@ | |||
/*******************************************************************\ | |||
|
|||
Module: Analysis of a number of instances of abstract dynamic objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the number
?
src/ssa/dynobj_instance_analysis.h
Outdated
@@ -0,0 +1,192 @@ | |||
/*******************************************************************\ | |||
|
|||
Module: Analysis of a number of instances of abstract dynamic objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
regression/Makefile
Outdated
@@ -1,5 +1,6 @@ | |||
DIRS = heap nontermination termination kiki \ | |||
preconditions interprocedural invariants | |||
DIRS = nontermination termination kiki \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put every directory on a separate line
@@ -0,0 +1,6 @@ | |||
CORE | |||
main.c | |||
--heap-values-incremental --sympath --inline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure -incremental
is a good name, maybe better -refine
? (or add a --refine-domains
option -- see below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the name to -refine. I suggest we leave implementation of a more generic solution with --refine-domains switch to Matej as a part of his work on generic domain combinations. This is closely related to it and I think it should be designed all together.
src/2ls/2ls_parse_options.cpp
Outdated
} | ||
|
||
if(cmdline.isset("horn-encoding")) | ||
while(!finished) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must not be implemented here, but inside summary_checker_ait
. I'd generalise this right away by adding a --refine-domains
option.
src/2ls/2ls_parse_options.cpp
Outdated
if(options.get_bool_option("pointer-check")) | ||
{ | ||
allow_record_malloc(goto_model); | ||
// allow_record_free(goto_model); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this commented out? Use #if 0
plus a comment to justify the deactivation or remove this line.
@@ -1243,6 +1286,21 @@ bool twols_parse_optionst::process_goto_program( | |||
inline_main(goto_model); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird indent in the lines above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any bad indentation
src/ssa/dynobj_instance_analysis.cpp
Outdated
bool found=false; | ||
for(const auto &i : must_alias_relations) | ||
{ | ||
unsigned long n; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size_t? (further instances below)
src/ssa/dynobj_instance_analysis.h
Outdated
\*******************************************************************/ | ||
class must_alias_setst:public union_find<exprt> | ||
{ | ||
protected: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the public section to come first.
src/ssa/dynobj_instance_analysis.h
Outdated
friend class dynobj_instance_domaint; | ||
}; | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
superfluous blank line
src/ssa/malloc_ssa.cpp
Outdated
|
||
void set_var_always_to_true( | ||
goto_modelt &goto_model, | ||
bool (*name_cond)(std::string &)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::function
preferred
src/ssa/ssa_value_set.h
Outdated
@@ -10,6 +10,7 @@ Author: Daniel Kroening, kroening@kroening.com | |||
#define CPROVER_2LS_SSA_SSA_VALUE_SET_H | |||
|
|||
#include <analyses/ai.h> | |||
#include <iostream> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems unused, please remove
… domains. Rename --heap-values-incremental to --heap-values-refine. Move refinement into summary_checker_ai.
1. Move long and utility functions into dedicated modules 2. Add braces for multi-line statements and multi-line conditionals 3. Remove commented code 4. Use better types (size_t instead of unsigned) 5. Replace function pointer by std::function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few nitpicks
src/2ls/2ls_parse_options.cpp
Outdated
" --heap-interval use heap domain with interval domain for values\n" // NOLINT(*) | ||
" --sympath compute invariant for each symbolic path (only usable with --heap-interval switch)" // NOLINT(*) | ||
" --heap-interval use heap domain with interval domain for" | ||
"values\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd align values
with use
in the output
src/2ls/2ls_parse_options.cpp
Outdated
@@ -1744,8 +1773,14 @@ void twols_parse_optionst::help() | |||
" --heap use heap domain\n" | |||
" --zones use zone domain\n" | |||
" --octagons use octagon domain\n" | |||
" --heap-interval use heap domain with interval domain for values\n" // NOLINT(*) | |||
" --sympath compute invariant for each symbolic path (only usable with --heap-interval switch)" // NOLINT(*) | |||
" --heap-interval use heap domain with interval domain for" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\n
at the end of the line
src/2ls/2ls_parse_options.cpp
Outdated
" --sympath compute invariant for each symbolic path (only usable with --heap-interval switch)" // NOLINT(*) | ||
" --heap-interval use heap domain with interval domain for" | ||
"values\n" | ||
" --heap-zones use heap domain with zones domain for values" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\n
at the end of the line
src/2ls/2ls_parse_options.cpp
Outdated
"values\n" | ||
" --heap-zones use heap domain with zones domain for values" | ||
"\n" | ||
" --heap-values-refine use heap domain with a dynamically" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\n
at the end of the line
src/2ls/2ls_parse_options.cpp
Outdated
" --heap-zones use heap domain with zones domain for values" | ||
"\n" | ||
" --heap-values-refine use heap domain with a dynamically" | ||
"refinement of the strength of the value domain\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align output
src/2ls/2ls_parse_options.cpp
Outdated
"\n" | ||
" --heap-values-refine use heap domain with a dynamically" | ||
"refinement of the strength of the value domain\n" | ||
" --sympath compute invariant for each symbolic path" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\n
at the end of the line
src/2ls/2ls_parse_options.cpp
Outdated
" --heap-values-refine use heap domain with a dynamically" | ||
"refinement of the strength of the value domain\n" | ||
" --sympath compute invariant for each symbolic path" | ||
"(only usable with --heap-interval switch)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align output
Multiple improvements and bug fixes mainly related to dynamic memory analysis. Changes are mostly related to the paper submitted to FMCAD'18.
Main contributions are: