Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestSpuriousWakeupsNeverHangSemasleep failure with file already closed #52725

Closed
bcmills opened this issue May 5, 2022 · 6 comments
Closed
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented May 5, 2022

greplogs -l -e 'FAIL: TestSpuriousWakeupsNeverHangSemasleep .*(?:\n .*)* file already closed' --since=2022-01-01
2022-05-04T20:06:32-fd6ef06/solaris-amd64-oraclerel

It isn't clear to me whether this is a bug in the Go runtime or standard library, a bug in the test, or a bug in the Solaris platform. Leaving on the backlog to collect more information.

(CC @golang/runtime @golang/solaris @rorth)

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 5, 2022
@bcmills bcmills added this to the Backlog milestone May 5, 2022
@bcmills
Copy link
Contributor Author

bcmills commented May 5, 2022

This builder previously had a high failure rate for this test with a different failure mode; see #49727.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@aclements
Copy link
Member

I just got this failure on a linux-amd64-nonunified trybot: https://storage.googleapis.com/go-build-log/00af0613/linux-amd64-nounified_f492a55d.log

Re-running greplogs, it looks like it also happened on FreeBSD: 2022-06-02T17:17:19-aae0bef/freebsd-arm-paulzhol

@aclements aclements changed the title runtime: TestSpuriousWakeupsNeverHangSemasleep failure with file already closed on solaris-amd64-oraclerel runtime: TestSpuriousWakeupsNeverHangSemasleep failure with file already closed Jul 14, 2022
@aclements aclements modified the milestones: Backlog, Go1.19 Jul 14, 2022
@prattmic
Copy link
Member

prattmic commented Aug 1, 2022

@prattmic
Copy link
Member

prattmic commented Aug 1, 2022

This is a bug in the test. Arguably it could also be considered a weakness in either io.ReadAll or os/exec.Cmd.StdoutPipe.

The problem is that the test calls cmd.Wait in a goroutine with no synchronization with io.ReadAll. This sets up a race condition, with two possible outcomes.

Good case:

  1. Child closes write end of pipe.
  2. io.ReadAll gets EOF from read() due to closed write end, returns no error.
  3. cmd.Wait closes read end.

Bad case:

  1. Child closes write end of pipe.
  2. Child exits.
  3. cmd.Wait closes read end.
  4. io.ReadAll gets fs.ErrClosed from read() because the read end is closed, returns fs.ErrClosed.

FWIW, note that the child sleeps for 1s between (1) and (2), so (4) must be significantly delayed to lose the race.

This is rather annoying to work around in the context of the test. The best approach may just be to ignore fs.ErrClosed from io.ReadAll, or use a manually-managed os.Pipe instead of cmd.StdoutPipe.

@prattmic prattmic self-assigned this Aug 1, 2022
@prattmic prattmic added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 1, 2022
@prattmic prattmic modified the milestones: Go1.19, Go1.20 Aug 1, 2022
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/420597 mentions this issue: runtime: don't race cmd.Wait and cmd.StdoutPipe read

@bcmills
Copy link
Contributor Author

bcmills commented Aug 22, 2022

greplogs -l -e 'FAIL: TestSpuriousWakeupsNeverHangSemasleep .*(?:\n .*)* file already closed' --since=2022-08-01
2022-08-11T20:16:32-b648591/linux-amd64-race
2022-08-08T17:29:13-9903ab5/linux-amd64-nounified

Just noticed that the fix CL was waiting on a review from me, so +2'd. 😅

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants