Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/sys/unix: TestPoll failures with "wrong number of events" beginning 2021-11-04 #49380

Open
bcmills opened this issue Nov 5, 2021 · 9 comments
Open

Comments

@bcmills
Copy link
Member

@bcmills bcmills commented Nov 5, 2021

greplogs --dashboard -md -l -e '(?m)FAIL: TestPoll .*\n\s+.*wrong number of events'

2021-11-05T07:00:05-7861aae-6fefb7f/linux-amd64
2021-11-05T00:52:09-7861aae-bd580a0/linux-amd64-race
2021-11-04T21:41:49-7861aae-156abe5/linux-amd64-race

These all occurred after CL 361255 in x/sys, but I don't see how it could possibly be related. I suspect a regression on the Go side — @mknyszek, could this be another test that is overly sensitive to being interrupted by a GC cycle?

@gopherbot gopherbot added this to the Unreleased milestone Nov 5, 2021
@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 5, 2021

(Marking as release-blocker for Go 1.18 because this appears to be a regression on linux-amd64, which is a first-class port.)

Loading

@bcmills bcmills removed this from the Unreleased milestone Nov 5, 2021
@bcmills bcmills added this to the Go1.18 milestone Nov 5, 2021
@mknyszek mknyszek self-assigned this Nov 12, 2021
@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Nov 12, 2021

I'll look into it. I think you're probably right.

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Nov 12, 2021

My attempts to reproduce this have been unsuccessful thus far. Is this particularly rare?

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Nov 12, 2021

This failure mode is also so strange... It creates a FIFO that it then blocks on, and waits to time out. How could something be sent on that FIFO? What else could possibly get that FD?

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Nov 12, 2021

Would it make sense to augment the test to print which events it thinks fired? Even if I can't reproduce locally, it should in theory pop up on the builders eventually.

Loading

@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 12, 2021

My attempts to reproduce this have been unsuccessful thus far. Is this particularly rare?

It does seem to only occur in a small fraction of runs: we have a ton of linux-amd64 builders and only a small number of failures per day (none at all since Nov. 6).

(But we've also had ∞% more failures than we did between whenever my fetchlogs data begins — I think at least sometime in 2019 at the moment? — and Nov. 4 of this year. 😅)

If we think it may have been fixed since Nov. 6, maybe we can park it in WaitingForInfo and see if it occurs again by the time the Gopherbot expires it?


greplogs --dashboard -md -l -e '(?m)FAIL: TestPoll .*\n\s+.*wrong number of events' --since=2021-11-05T07:01:00

2021-11-06T13:10:06-c75c477-4f083c7/linux-amd64-race
2021-11-05T22:03:24-c75c477-035963c/linux-amd64-race

Loading

@bcmills
Copy link
Member Author

@bcmills bcmills commented Nov 12, 2021

Would it make sense to augment the test to print which events it thinks fired? Even if I can't reproduce locally, it should in theory pop up on the builders eventually.

I'm always in favor of more-descriptive test logs. 👍

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Nov 12, 2021

Oops. I forgot to mark it as golang/go#49380 in the CL. Anyway, golang.org/cl/363714 improves the logs. I'm going to put this in WaitingForInfo as you suggested Bryan, since we haven't seen it since Nov. 6th.

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Nov 12, 2021

Change https://golang.org/cl/363660 mentions this issue: unix: avoid depending on consistent Revents type in TestPoll

Loading

gopherbot pushed a commit to golang/sys that referenced this issue Nov 13, 2021
For golang/go#49380

Change-Id: Ie1d370681962d9f69ef54b33ddf38e4c74a2e298
Reviewed-on: https://go-review.googlesource.com/c/sys/+/363660
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants