internal/poll, runtime: got not pollable err when reading fifo or socket
#59545
Comments
It's not clear to me that the bug is in the Go standard library. Can you reproduce it without (I would suggest starting by raising an issue with the maintainers of that package.) |
Thanks for your reply. This library is a tiny repo that wraps the fifo operations and reads from a fifo file, so I think it is unrelated to this issue. I can also reproduce the issue using github.com/containerd/ttrpc (a tiny RPC lib), which performs simple reads and writes on a socket. I will also try to reproduce it without these libs later :) |
Likely a dup of #38618. |
@bcmills @ianlancetaylor Update: reproduced with only the Go standard library. |
This issue explains why you got the It seems to read the epollerr event from tun fd when reading fifo fd |
I think this is a go runtime bug. I can reproduce this problem with @zhuangqh 's demo.
fd34 (an fd for a tun device) makes epoll_pwait return EPOLLERR, but this error is reported to fd10, |
This line of code forcibly sets the event error, which was emitted from the tun fd, on whatever pollDesc the Data address points to: https://github.com/golang/go/blob/master/src/runtime/netpoll_epoll.go#L162
if mode != 0 {
	pd := *(**pollDesc)(unsafe.Pointer(&ev.Data))
	pd.setEventErr(ev.Events == syscall.EPOLLERR)
	netpollready(&toRun, pd, mode)
} |
In most cases, the residual EPOLLIN/EPOLLOUT will only trigger an invalid wakeup and have no effect on the upper code. |
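To make the hazard discussed above concrete, here is a minimal user-space sketch. It is not runtime code: desc, event, reuse, deliverUnchecked and deliverChecked are hypothetical names invented for this illustration, and the sequence-number check is just one generic way to detect a recycled descriptor, not a claim about how any CL in this thread works.

```go
package main

import "fmt"

// desc stands in for a runtime pollDesc. All names here are hypothetical.
type desc struct {
	fd       int
	seq      uint32 // bumped each time the descriptor is recycled for a new fd
	eventErr bool
}

// event stands in for an epoll event: a pointer to the descriptor plus the
// sequence number that was current when the fd was registered.
type event struct {
	d   *desc
	seq uint32
	err bool
}

// deliverUnchecked trusts the pointer, like the pattern quoted above.
func deliverUnchecked(ev event) {
	ev.d.eventErr = ev.err
}

// deliverChecked drops events whose sequence number is stale and reports
// whether the event was applied.
func deliverChecked(ev event) bool {
	if ev.seq != ev.d.seq {
		return false
	}
	ev.d.eventErr = ev.err
	return true
}

// reuse simulates closing the old fd and recycling the descriptor.
func reuse(d *desc, fd int) {
	d.fd = fd
	d.seq++
	d.eventErr = false
}

func main() {
	d := &desc{fd: 34}
	// An error event is generated for fd 34 but not yet processed.
	stale := event{d: d, seq: d.seq, err: true}
	// fd 34 is closed and the same descriptor now serves fd 10.
	reuse(d, 10)

	deliverUnchecked(stale)
	fmt.Println("unchecked: fd", d.fd, "eventErr =", d.eventErr)

	d.eventErr = false
	applied := deliverChecked(stale)
	fmt.Println("checked: applied =", applied, "eventErr =", d.eventErr)
}
```

With the unchecked path, the error raised for fd 34 ends up on the descriptor that now belongs to fd 10, which matches the symptom reported in this issue; the checked path discards the stale event instead.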
Thanks for the analysis. I think I have a fix. |
Change https://go.dev/cl/484275 mentions this issue: |
@ianlancetaylor Hi, I've pulled the change and tested it locally. It seems that CL 484275 does NOT fix the issue; it only mitigates it. |
There may be multiple threads calling netpoll at the same time, so atomic counting does not guarantee that no other thread calls netpoll during pollcache.promote(). |
@chenhengqi Thanks. I've updated https://go.dev/cl/484275. Can you see if it fixes the problem for you? (Even the original version appeared to fix the problem for me.) If anybody has suggestions for a better fix, I would be happy to hear them. I'm not very happy with this fix--I'm worried that it will lead to some contention on pollcache.lock. |
…c reuse fix the problem that the netpoll event is associated with the wrong fd because pollDesc object is reused Fixes: golang#59545 Signed-off-by: ls-ggg <335814617@qq.com>
Fix the problem that the netpoll event is associated with the wrong fd because pollDesc object is reused Fixes: golang#59545 Signed-off-by: ls-ggg <coocnngooo@gmail.com>
Change https://go.dev/cl/484695 mentions this issue: |
@ianlancetaylor Hi, I think waiting for the atomic count to reach 0 does not release the pollDesc object promptly enough, especially for programs with a high I/O load. I've submitted a PR that might fix this problem. The demo above runs fine in my environment. |
I suggest that we ensure that only one |
Thanks. I've written a different approach in the pair of CLs https://go.dev/cl/484836 and https://go.dev/cl/484837. |
Change https://go.dev/cl/484837 mentions this issue: |
Change https://go.dev/cl/484836 mentions this issue: |
@ianlancetaylor Here is the code sample you asked for. I used this to repro the issue on CL 484837.
func busyLoop() {
	for i := 0; i < 32; i++ {
		go func() {
			for {
				http.Get("https://baidu.com")
			}
		}()
	}
}
func main() {
	go busyLoop()
	fmt.Printf("%v start test\n", time.Now().Format(time.TimeOnly))
	go manyTun()
	for {
		f()
	}
} |
Thanks. Updated the CL. |
This is a refactoring with no change in behavior, in preparation for future netpoll work. For #59545 Change-Id: I493c5fd0f49f31b75787f7b5b89c544bed73f64f Reviewed-on: https://go-review.googlesource.com/c/go/+/484836 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Orlando Labao <orlando.labao43@gmail.com>
Change https://go.dev/cl/486255 mentions this issue: |
An apparent typo in CL 484837 caused the test to check for ErrExist instead of ErrNotExist when opening /dev/net/tun for read. That causes the test to fail on platforms where /dev/net/tun does not exist, such as on the darwin-amd64-longtest builder. Updates #59545. Change-Id: I9402ce0dba11ab459674e8358ae9a8b97eabc8d2 Reviewed-on: https://go-review.googlesource.com/c/go/+/486255 Run-TryBot: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> Commit-Queue: Bryan Mills <bcmills@google.com> Reviewed-by: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
Change https://go.dev/cl/558795 mentions this issue: |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
One-sentence description: I got a not pollable error when reading from a fifo or socket.
Reproduce code: https://github.com/zhuangqh/not-pollable-issue
step:
/dev/net/tun frequently
not pollable
outputs:
I learned from issue #30426 that opening the device /dev/net/tun using os.OpenFile and then reading it returns the not pollable error. But I got this error from other fds when opening and closing the device frequently. It seems that epoll events pollute other files under high concurrency?
tips for maintainer:
What did you expect to see?
no error from other fd
What did you see instead?
instead the read failed with error
fifo read: read /proc/self/fd/3: not pollable