Improve model of heap memory management #354
Comments
To see if I'm understanding this correctly, I'm going to write an example. Please correct me if I'm wrong anywhere. Say we have two programs that are identical and look like this:
I will refer to the first
Now, we reach
Now, we visit
Other questions I have are:
I think the easiest way to explain this is to say we're going to make up a new register, So, for example, you said:
This is right, and the way we "keep track of our new allocator state" is to do what you did for RAX: replace all locations of So then when we reach You asked about how we relate the two programs. The idea here is we have So, when we look at If that doesn't make sense, let's meet up and work through it in more detail - it will help to work some examples out looking at what precondition we're generating with the current model. To answer your other questions:
I think it would work fine for single-program analysis (though I'm not sure it has many benefits over the current model in that case). Probably we can use it to keep things uniform.
My instinct is to add it as an option, but I could be talked out of that if it's going to be a real pain to check whether the option is turned on in many places in the code.
We should treat |
The current default model for `malloc` simply assumes that it generates a fresh result every time it is called.

This is inconvenient for comparative analysis, because if the two programs have an "identical" call to `malloc`, we currently assume each call gives a different result. This causes false positives where we think the programs differ because of this different result value or, more subtly, because the memories differ after writes through those pointers.
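To make the false positive concrete, here is a minimal Python sketch of the "always fresh" behaviour. It is not CBAT code: `FreshMalloc` and `program` are hypothetical names, and distinct symbolic names stand in for values the solver cannot prove equal.

```python
import itertools


class FreshMalloc:
    """Hypothetical stand-in for the 'always fresh' model of malloc.

    Every call, from either program, yields a brand-new symbolic name.
    """

    def __init__(self):
        self._counter = itertools.count()

    def malloc(self, size):
        # The size is ignored: the result is fresh regardless of arguments.
        return f"fresh_{next(self._counter)}"


def program(alloc):
    # Both the original and the modified program make this identical call.
    return alloc.malloc(16)


shared = FreshMalloc()
orig_ptr = program(shared)  # pointer seen by the original program
mod_ptr = program(shared)   # pointer seen by the modified program

# Nothing relates the two results, so the comparison cannot succeed.
pointers_match = orig_ptr == mod_ptr
print(orig_ptr, mod_ptr, pointers_match)
```

Even though the two programs are identical, the returned pointers are unrelated, and any later comparison of results or of memory written through them inherits that mismatch.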
What to do? We certainly shouldn't assume `malloc`'s result is determined by its arguments, as it is stateful. In the real world, `malloc`'s result is determined by its arguments and a bunch of allocator state stored in memory. So it would be technically accurate to say `malloc`'s result is determined by its arguments and the current state of memory, but this doesn't really work for CBAT because we aren't modeling all that allocator state in memory. Indeed, if the model of `malloc` doesn't also change memory, this model would result in two consecutive calls to `malloc` with the same arguments returning the same results.

In principle we could build an accurate model of libc's state in memory and dutifully track the ways `malloc` changes it. But that's very time consuming (and would need to be redone for each libc implementation) and also probably extremely inefficient for analysis.

So, how about this: Model `malloc` with a pair of uninterpreted functions: `malloc_result_model`, which is like `malloc` but with an extra integer parameter representing the current allocator state, and `malloc_state_update`, which takes the same arguments and models the change to the heap state when `malloc` is called.

The high level idea is for `malloc_result_model`'s extra allocator state parameter to be different at each call to `malloc` within one program, but the same between the two programs in comparative analysis. Of course, we don't actually have a way to "match up" calls to `malloc` in the two programs, but perhaps it's good enough to just assume that the heap state is the same at the start of the two functions being analyzed. The function `malloc_state_update` returns the new allocator state value for the next call to `malloc`.

I think this would be an improvement to our current model. We still keep the idea that each consecutive call to `malloc` returns a different result (because the state parameter changes), but in a way that is deterministic enough so that we can get the same results in comparative analysis. I can, however, see two potential issues, which maybe we can ignore for now:

- This model pretends that the allocator state is totally separate from memory. But of course this allocator state parameter is a model for a whole bunch of data that `malloc` relies on, which is stored at various different places in memory. Writes to memory don't change the allocator state in our model, but might in the real world. This is a potential source of false NEGATIVES.
- This model allows the allocator state to change after each call to `malloc`, but doesn't require the allocator state to be a fresh unique result every time. This is a potential source of false positives, where `malloc` in this model can return the same value more often than is really the case. On the other hand, if we somehow added the assumption that each change to the allocator state results in a truly unique result, that would also be wrong, and in a way that can cause false negatives.
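The proposed pair of functions can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not how CBAT would encode it: the helper `_uninterpreted` uses a hash as a stand-in for an uninterpreted function (arbitrary, but equal inputs give equal outputs), and `modeled_malloc`, `run`, and `INITIAL_STATE` are hypothetical names.

```python
import hashlib


def _uninterpreted(name, *args):
    """Stand-in for an uninterpreted function (an assumption of this sketch).

    The output is arbitrary, but equal inputs always give equal outputs.
    """
    digest = hashlib.sha256(repr((name, args)).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def malloc_result_model(size, state):
    # Like malloc, with an extra integer for the current allocator state.
    return _uninterpreted("malloc_result_model", size, state)


def malloc_state_update(size, state):
    # Same arguments; returns the allocator state for the next call.
    return _uninterpreted("malloc_state_update", size, state)


def modeled_malloc(size, state):
    """Return (pointer, next_state) for one modeled call to malloc."""
    return malloc_result_model(size, state), malloc_state_update(size, state)


def run(sizes, initial_state):
    """Thread the allocator state through a sequence of malloc calls."""
    state = initial_state
    pointers = []
    for size in sizes:
        ptr, state = modeled_malloc(size, state)
        pointers.append(ptr)
    return pointers


INITIAL_STATE = 0  # assumed equal at the start of both functions under analysis

orig_ptrs = run([16, 16], INITIAL_STATE)  # original program
mod_ptrs = run([16, 16], INITIAL_STATE)   # modified program, same calls
print(orig_ptrs == mod_ptrs)
```

Because both runs start from the same state and thread it through the same calls, the two programs see matching pointers call by call. The hash stand-in hides the second issue above: a solver is free to choose an interpretation in which `malloc_state_update` returns its input state, so consecutive calls are not forced to produce distinct results.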