fmt: apparent race on fmt.pp buf #44672
The fmt package has been tested and race-tested innumerable times. It could indeed be a bug in the library, but I doubt it. It's much likelier to be a problem in your code or environment that reveals itself as this race. That's not helpful, I know, but we would need a reproducible case to do any serious debugging of such a well-known and well-tested package.
CC @dvyukov I wonder if this could be due to the …
Thanks, @robpike. I used the word "apparent" in the title to try and convey the fact that we're not definitely looking at a race, but a situation the race detector thinks is a race (I didn't do that in the body of the description - now fixed). My analysis and assertion that the problem is not in our code might be wrong of course.
The first stack seems to be on the fmt internal buffer (which is either newly allocated memory, or comes from sync.Pool, right?). Is this race reliably reproducible? Does it always look the same way with the same stacks?
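For context, here is a small self-contained sketch (not the actual fmt source; the names printer, printerPool, newPrinter and format are made up for illustration, and the real internals differ across Go versions) of the pattern being described: a printer value with a growable buffer that is either freshly allocated or recycled through a sync.Pool.

package main

import (
	"fmt"
	"sync"
)

// printer plays the role of fmt's internal printer state: a reusable
// value holding a growable output buffer.
type printer struct {
	buf []byte
}

// printerPool plays the role of fmt's free list: finished printers are
// put back so later calls can reuse the buffer allocation.
var printerPool = sync.Pool{
	New: func() interface{} { return new(printer) },
}

func newPrinter() *printer {
	p := printerPool.Get().(*printer)
	p.buf = p.buf[:0] // keep capacity, reset length
	return p
}

func (p *printer) free() {
	printerPool.Put(p)
}

func format(s string) string {
	p := newPrinter()
	p.buf = append(p.buf, s...)
	out := string(p.buf) // copy the bytes out before recycling the buffer
	p.free()
	return out
}

func main() {
	fmt.Println(format("hello"))
}

In this pattern the buffer is only touched between Get and Put, which is why the questions here focus on whether sync.Pool could be handing the same value to two goroutines at once.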
It's the first time we've seen it, unfortunately. We will obviously add more details here if we find any.
I'm not sure I can definitively say it is a user slice.
Slightly stabbing in the dark (very much out of my depth), but is it possible that a compiler optimisation determined it was safe for an internal …? On a related topic, how does the race detector handle/work with compiler optimisations?
I also encountered a similar error; I don't know if it's the same problem:

WARNING: DATA RACE
Read at 0x00c0000814c8 by goroutine 27:
internal/race.ReadRange()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/internal/race/race.go:45 +0xb0
syscall.Write()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/syscall/syscall_unix.go:215 +0x94
internal/poll.(*FD).Write.func1()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/internal/poll/fd_unix.go:267 +0xf9
internal/poll.ignoringEINTR()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/internal/poll/fd_unix.go:567 +0x38
internal/poll.(*FD).Write()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/internal/poll/fd_unix.go:267 +0x20b
net.(*netFD).Write()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/net/fd_posix.go:73 +0x65
net.(*conn).Write()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/net/net.go:194 +0xeb
net.(*TCPConn).Write()
<autogenerated>:1 +0x69
.......... (more omitted)
Previous write at 0x00c0000814c8 by goroutine 79:
time.NewTicker()
/home/travis/.gimme/versions/go1.15.8.linux.amd64/src/time/tick.go:35 +0x12a
github.com/go-redis/redis/v8/internal/pool.(*ConnPool).reaper()
/home/travis/gopath/src/github.com/go-redis/redis/internal/pool/pool.go:449 +0x55
-----no more
//pool.go
//448 func (p *ConnPool) reaper(frequency time.Duration) {
//449 ticker := time.NewTicker(frequency)
//450 defer ticker.Stop()
.....
//}

why do …
@monkey92t if you provide the full report and the git revisions used, it will be more useful. Currently it's unclear what calls TCPConn.Write.
@dvyukov travis job log: https://travis-ci.org/github/go-redis/redis/jobs/761788172
At first glance it does not look related. So my first guess would be that it's just a bug in that code.
I have investigated many times and found no relationship between the two pieces of data. I don't know much about how the race detector works. Why does it report an error on two unrelated pieces of data?
@monkey92t - I'd agree with @dvyukov, your situation does not look related to this apparent race condition. If you think there is an underlying Go issue that's causing your race condition, I'd suggest opening another issue.
@monkey92t The first stack is reading cmd.Args here:
I could only track the creation of cmd.Args up to here:
The second stack is writing to w.lenBuf here:
w.lenBuf is allocated in this package and used in a very restricted way:
Is my reading correct?
Subsequent races look even more insane, with some timer internals.
I was skeptical of lenBuf at first, but then I discovered that they are different new(Struct) allocations.
Yes, it would be useful. If we could rule out sync.Pool, that would already be something.
We've just seen another instance of this (…)
It seems unlikely, but is the race detector getting confused about the fact that the …?
Yes, I agree it seems unlikely. We have unit tests for channels, and I think we would see far more races, and much more reproducibly, if channel synchronization were broken.
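For reference, a minimal self-contained sketch (not taken from the tests mentioned above) of the channel synchronization in question: closing a channel happens before the corresponding receive completes, so the race detector treats the read below as ordered after the write and reports nothing.

package main

import "fmt"

func main() {
	data := make([]byte, 4)
	done := make(chan struct{})

	go func() {
		data[0] = 1 // write in the background goroutine
		close(done) // signal completion; establishes happens-before
	}()

	<-done               // receive completes after the close
	fmt.Println(data[0]) // read is ordered after the write, so no race
}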
@dvyukov I once again encountered a very puzzling data race. It seems that they write to the same memory address (I don't understand it very well); they should never produce a data race.
I don't know if it helps you: redis/go-redis#1657 (comment)
Running

func Test44672(t *testing.T) {
	x := []byte("test")
	go func() {
		for {
			Sprintf("%s", x)
		}
	}()
	for {
		x[0] = 0
	}
}

I get a very similar stack trace, so it looks to me like the formatted value is being modified in another goroutine.
@AlexanderYastrebov That program has a race condition. One goroutine reads x while the other writes it.
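For illustration, a standalone adaptation of that reproducer (the package main wrapper, bounded loops, and the mutex are additions for this sketch, not part of the original test) showing one way to remove the race: every access to x is guarded by the same mutex, so running it under the race detector no longer reports anything.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var mu sync.Mutex
	x := []byte("test")
	done := make(chan struct{})

	go func() {
		defer close(done)
		for i := 0; i < 1000; i++ {
			mu.Lock()
			_ = fmt.Sprintf("%s", x) // read of x is now ordered by the mutex
			mu.Unlock()
		}
	}()

	for i := 0; i < 1000; i++ {
		mu.Lock()
		x[0] = 0 // write of x held under the same mutex
		mu.Unlock()
	}
	<-done
}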
given the lack of recent updates... optimistically closing? |
What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

Yes, although I can't reproduce this.

What operating system and processor architecture are you using (go env)?

go env Output

What did you do?
This problem surfaced in a -race build for govim. The raw log is now stored here: https://gist.github.com/myitcv/483887344dd4e48596ed6bb364d77651
The relevant part of the test failure is:
At the time of this data race, govim was running tests against the then tip of x/tools and gopls. The module state at the time of the test is captured in this branch: https://github.com/govim/govim/tree/race_fail
What did you expect to see?
Passing tests.
What did you see instead?
Above.
From the quick analysis I've done thus far, I don't think the problem can be in our code, because the apparent race is happening on the buf field of a fmt.pp.