Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/race: unexpected race when Conn.Close used for synchronization #22132

Open
rogpeppe opened this issue Oct 4, 2017 · 9 comments

Comments

@rogpeppe
Copy link
Contributor

commented Oct 4, 2017

go version devel +a714470 Wed Sep 27 16:29:18 2017 +0000 linux/amd64

The following program reports a race even though the read of x
cannot happen in parallel with the assignment to x because
we wait for the connection to be closed before reading it.

https://play.golang.org/p/ufvoScCbSm

==================
WARNING: DATA RACE
Read at 0x00c42008a220 by main goroutine:
  main.main()
      /home/rog/src/tst.go:24 +0x288

Previous write at 0x00c42008a220 by goroutine 7:
  main.main.func1()
      /home/rog/src/tst.go:16 +0x7a

Goroutine 7 (finished) created at:
  main.main()
      /home/rog/src/tst.go:14 +0x16d
==================
2017/10/04 16:48:53 1
Found 1 data race(s)
exit status 66
@fraenkel

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2017

If the Dial and Read were to return errors for whatever reason, you would have a read before the write.

@rogpeppe

This comment has been minimized.

Copy link
Contributor Author

commented Oct 4, 2017

That's not true for Dial - we exit if Dial returns an error, but I added a couple more error checks to make the code clearer. I fixed the link above to point to the new code.

@ianlancetaylor ianlancetaylor changed the title race detector: unexpected race when Conn.Close used for synchronization runtime/race: unexpected race when Conn.Close used for synchronization Oct 4, 2017

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2017

I think you may be asking too much of the race detector. There is no obvious reason that the two different network connections are related. I mean, sure, they are, but only if you know that the Dial that created one matches the Accept that created the other. Only when you know that can you see that the Close of one affects the Read of the other.

CC @dvyukov

@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Oct 4, 2017

@rogpeppe

This comment has been minimized.

Copy link
Contributor Author

commented Oct 4, 2017

@ianlancetaylor Does that apply to any causal connection through a network connection (for example using an httptest.Server and relying on being able to read variables after an HTTP request has replied) ?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2017

To the best of my knowledge, yes.

@rogpeppe

This comment has been minimized.

Copy link
Contributor Author

commented Oct 4, 2017

Interesting; it's not something I'd previously considered. I suspect we have hundreds of potentially race-flagged tests with this issue.

@fraenkel

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2017

I still see two different issues with the way it is written.

  1. Given that there is no synchronization, the x++ could be moved below the c.Close() by a compiler and be considered equivalent.
  2. You have no guarantee over the value read on line 30. The increment done on line 20 may not be published to the other go routine.

Unless my understanding of how things could work is inaccurate.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2017

Since c.Close() will call into the kernel, I don't think any plausible Go compiler could move the x++ across the call. Similarly, I think the x++ must be published. I think that any Go implementation must assume that the kernel may cause a happens-before relationship to be created. It's true that this is not in the memory spec, but in general Go is not trying to trick the programmer.

@dvyukov

This comment has been minimized.

Copy link
Member

commented Oct 10, 2017

Ultimately any system call can synchronize with any other system call (e.g. create a file, another process notices that the dir becomes non-empty and send us a signal, so now creation of the file is somehow synchronized with arrival of the signal). But synchronizing everything with everything has a substantial negative effect in that it masks real data races.

In C++ we do more elaborate logic and e.g. actually track that connection between the pair of fd's returned by pipe and then how that propagates via dup's:
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/tsan/rtl/tsan_fd.cc?revision=308929&view=markup
But we still don't track relation for sockets, theoretically should be possible to track connections within a process by looking at local/peer adds, but this still won't solve the problem when connection goes through a separate process/machine.

To summarize: I don't see anything simple that we could do for Go to solve the problem. If we are talking about tests, I would suggest to add a separate chan that notifies about completion of handling of request/connection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.