TSan: possible false positive with atomics #1009
Comments
This part is not true. |
I tend to disagree...
But not necessarily the other way round. In my example there is no happens-before relation between the
Let's extend the example:
#include <thread>
#include <atomic>

struct node { std::atomic<int> value; };
std::atomic<node*> _node{nullptr};
int y;  // plain (non-atomic) data written by both threads

void f1() {
    auto n = new node();
    y = 42;
    n->value.store(1, std::memory_order_release);
    _node.store(n, std::memory_order_relaxed);  // publish the node (relaxed)
}

void f2() {
    // Spin until f1 has published the node.
    auto n = _node.load(std::memory_order_relaxed);
    while (n == nullptr)
        n = _node.load(std::memory_order_relaxed);
    n->value.fetch_add(1, std::memory_order_acquire);
    y = 43;
}

int main() {
    std::thread t1(f1);
    std::thread t2(f2);
    t1.join();
    t2.join();
    return 0;
}
So we have the following relations (sb = sequenced-before, rf = reads-from):
If I make the Without knowing any details of TSan's inner workings, I would assume that this is because the putative race happens on the same object that is used to establish the synchronization. Interestingly, TSan always reports a race on |
The race happens in the executions that don't have rf between fetch_add and store. There are such executions because by the time fetch_add starts executing there are no ordering relations established between the threads. |
Try to use this tool: |
Interesting... I was under the impression that in case of an acquire-fetch-add the read-part of the read-modify-write operation establishes the synchronize-with relation. Can you point me to some sources where I can find more information about this synchronization of satellite data? I did not find anything in the C++ standard... True, there is no enforced order between the store and the fetch-add, so there can be executions where the fetch-add does not read the value from the store. In my specific case I know that it does, so I expected it to establish the happens-before relation at that point, but you are probably right to treat it as a data race. Thanks for the link, I will take a look at it! |
This is true. But it only affects subsequent memory operations, not the RMW itself. A memory operation can't synchronize itself. Memory ordering is only about dependent/satellite data.
This is an informal term.
If here you mean that |
I have studied the C++ standard and its definitions quite extensively, but unfortunately even though it tries to provide formal definitions it still leaves some room for interpretation. Due to the lack of more specific information I was assuming that if the read-part of an RMW establishes the synchronize-with relation, then the write-part is already synchronized. I take your word for it that this is not the case. 🙂
I am well aware. I just wanted to point out that in this specific case I knew that the fetch-add would read the value written by the store-release, and I was therefore expecting that the write-part would not cause a data race. Obviously the example code does not provide any guarantee that the fetch-add would always see that value, so it can race. But since TSan only reports races that actually occur I did not expect a race report in this specific case (due to my assumption of an implicit synchronization in the RMW operation). |
True. But what you are assuming is that the write part then also synchronizes the read part in hindsight. That is, the read synchronizes the write, which then synchronizes the read and justifies its own synchronization. In other words, an infinite loop that justifies itself. This is the part that is not true.
How? Why? Nobody provides any guarantees for this. Well, if you printed the read value later and concluded that read in fact read the stored value, then you may think of this in 3 ways:
|
I can't follow. Why would the write-part synchronize the read-part?
The C++ code does not provide any such guarantee, but the underlying x86 memory model that my code was running on does. I know that this cannot be generalized and I never claimed that. But assuming that this is the case (again, only in this very specific scenario), I expected TSan to recognize the synchronize-with relation. If I replace the fetch-add with separate load and store operations I have the same potential race, but TSan does not report anything. I expected the same for the fetch-add. But I am already convinced that it is correct to report a race in this scenario! 🙂 |
It does not. But that's what I figured from your argumentation like: 'If the fetch_add reads the value written by the store, it synchronizes-with that store'.
The underlying architecture does not matter here in any way, shape or form. It only matters if you program in assembly. If you program in C/C++, C/C++ rules are in play.
Looks like something to improve in tsan, I filed #1014 for this. |
Just to clarify to future readers, the proper way to express this in C++ is (acquire/release should be on _node and value does not need to be atomic at all):
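A minimal sketch of the variant described in the sentence above, assuming the same `node`, `_node` and `y` names as the example earlier in this thread. `run_example` is a hypothetical helper, added only so the snippet can check itself. The release store on `_node` pairs with the acquire load in `f2`, so the plain writes to `value` and `y` in `f1` happen-before the accesses in `f2`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// `value` is a plain int here: the release/acquire pair on `_node` orders it.
struct node { int value; };

std::atomic<node*> _node{nullptr};
int y = 0;

void f1() {
    auto n = new node();
    y = 42;
    n->value = 1;
    // Release: everything sequenced before this store becomes visible to a
    // thread whose acquire load reads the pointer stored here.
    _node.store(n, std::memory_order_release);
}

void f2() {
    node* n = nullptr;
    // Acquire: once a non-null pointer is read, this load synchronizes-with
    // the release store in f1.
    while ((n = _node.load(std::memory_order_acquire)) == nullptr) {}
    assert(n->value == 1);
    n->value += 1;  // plain increment, ordered after f1's write to value
    y = 43;         // ordered after f1's write to y
}

// Hypothetical helper (not part of the thread): runs both threads once and
// returns the final y, or -1 if `value` did not end up as 2.
int run_example() {
    std::thread t1(f1);
    std::thread t2(f2);
    t1.join();
    t2.join();
    node* n = _node.exchange(nullptr);
    int result = (n->value == 2) ? y : -1;
    delete n;
    return result;
}
```

Calling `run_example()` from `main` should return 43. Because every access in `f2` is ordered after the corresponding write in `f1`, one would not expect a race report from TSan for this variant.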
|
Absolutely agree! I was just expecting that TSan would not be able to detect this race simply because the underlying memory model provides this guarantee - similar to the case when I use separate load/store operations instead of the fetch-add. But as I said, I am already convinced that TSan is right to report a race, and it is a good idea to improve TSan to also recognize a race when using a load operation. But for what it's worth, I think the reported race in the original program should be adapted. It reports a race between the two write operations, but it should probably report the read of the RMW as the conflicting operation. Would you agree? |
Agree. |
The following program:
produces the following report:
Essentially TSan reports a data race between the fetch_add in f2 and the implicit initialization of value by the operator new in f1.

Obviously this could be fixed by using acquire/release order for the operations on _node, but IMHO this should not be necessary. f1 performs a store-release on value before it "publishes" the node. The fetch_add uses memory_order_acquire. In the modification order of _node the fetch_add is ordered after the store, so it reads that value. Therefore, the fetch_add synchronizes-with the store, and the store is sequenced after the initialization in operator new. So IMHO this is no data race as there is a proper happens-before relation between the initialization and the fetch_add.

What do you think?