Skip to content
This repository has been archived by the owner on Oct 3, 2021. It is now read-only.

Witness format and validators for the proposed data-race category #1126

Open
MartinSpiessl opened this issue Sep 23, 2020 · 3 comments
Open
Labels
data-race Issues about the data-race category

Comments

@MartinSpiessl
Copy link
Contributor

For the proposed category, it is still unclear how the witnesses would/should look like and whether we will have any validators for this.

@MartinSpiessl MartinSpiessl added the data-race Issues about the data-race category label Sep 23, 2020
@MartinSpiessl
Copy link
Contributor Author

I already invested some thoughts in this after a discussion with @hernanponcedeleon on the CPAchecker mailing list.

Let me just quote the relevant statement from there:

If we want a category where tools have to detect data races, we need to describe a violating state in the witness automaton and a way to reach it.
At first it looks like those properties can neither be expressed as state properties nor as properties over a single path, i.e., we cannot use LTL to describe the problem and it cannot be expressed by
the current witness automata. But then we could just search for states where two threads have in their very next transition two conflicting actions. As such the detection of race conditions seems to
boil down to a reachability problem.

In essence, our (violation witnesses) are just guides through the state space, they limit the set of executions of the program such that the validator has it easier to rediscover the violation.

A data race can be seen as a set of two traces where we have two conflicting actions (neither happening before the other). But those traces will share most of their states.
As such we could just mark any state on the common prefix of the two violating traces as error state and this would probably suffice. It makes sense to take the last state, i.e. specify the longest common prefix, but this is not strictly necessary.

So I actually do not think that witnesses are a big problem here, but maybe someone has a different view on the problem.

@vesalvojdani
Copy link
Contributor

This is workable, but it does somewhat bias the race detection category to multithreaded reachability checking... I recognize the value of witnesses, but it is fairly difficult for us to produce this kind of scheduling information. We do not assume sequential consistency; instead, we use flow-insensitive invariants (rely-guarantee style reasoning) in order to soundly over-approximate thread interactions. Since most architectures are not sequentially consistent, I believe this is the right approach for a race detection tool.

Practically, I think it should be possible for us to back-trace invariant violations to find a thread interleaving that gets to the racing state, but perhaps you could adopt a scoring scheme similar to correctness witnesses in the main contest, so here one would award 1 point for correct answers (both positive and negative), and 2 points if the above counter-example trace is verified. This would open the contest to race detection tools that are not so heavily based on inter-thread reachability, but still reward witness generation.

(Also, it would be really great to have correctness witnesses. We can produce rely-guarantee invariants fairly directly from our computed results, but counter examples requires more work.)

@hernanponcedeleon
Copy link
Contributor

I think @vesalvojdani is right and we should have a proper discussion about the minimal requirements for witness. This is true not only for the no-data-race property, but in general. AFAIK, currently the minimal requirements are syntactic: you can generate a witness that provides none semantic information and this would still be valid (but of not much use since the validator will have to do all the work).

I think that for data races, the minimal we should require (I cannot think any technique that won't be able to provide such information) is to explicitly state what are the two instructions that are in the race.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
data-race Issues about the data-race category
Development

No branches or pull requests

3 participants