Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: TestCloseWrite failures with "i/o timeout" on darwin/arm64 #49352

Open
bcmills opened this issue Nov 4, 2021 · 14 comments
Open

net: TestCloseWrite failures with "i/o timeout" on darwin/arm64 #49352

bcmills opened this issue Nov 4, 2021 · 14 comments
Labels
arch-arm64 NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Milestone

Comments

@bcmills
Copy link
Member

bcmills commented Nov 4, 2021

#!watchflakes
post <- goos ~ `darwin|ios` && goarch == "arm64" && pkg == "net" && test == "TestCloseWrite"

greplogs --dashboard -md -l -e '(?m)FAIL: TestCloseWrite/.*\n(\s+.*\n)*.*i/o timeout'

2021-11-03T21:45:33-d3f5dd5/darwin-arm64-11_0-toothrot
2020-10-20T17:52:41-90c924f/ios-arm64-corellium
2020-09-28T21:30:01-af18bce/linux-mips64le-mengzhuo
2020-09-07T02:46:24-1a119ed/darwin-arm64-corellium
2020-08-26T16:15:32-6382809/darwin-arm64-corellium
2020-08-18T15:50:52-b58d297/darwin-arm64-corellium
2020-05-01T19:55:48-30f8074/darwin-arm64-corellium
2020-04-13T21:56:15-1b15c7f/darwin-arm64-corellium

See previously #34837.

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 4, 2021
@bcmills bcmills added this to the Backlog milestone Nov 4, 2021
@bcmills
Copy link
Member Author

bcmills commented Jun 15, 2022

Still ongoing (but rare) on darwin/arm64.

greplogs -l -e '(?m)FAIL: TestCloseWrite/.*\n(\s+.*\n)*.*i/o timeout' --since=2021-11-04
2022-06-14T21:55:52-cb9bf93/darwin-arm64-11
2021-11-05T00:52:09-bd580a0/darwin-arm64-11_0-toothrot

(attn @neild, @ianlancetaylor)

@neild
Copy link
Contributor

neild commented Jun 15, 2022

I wonder if this is a variation on #37795?

@neild
Copy link
Contributor

neild commented Jun 15, 2022

The mips64le failure above has "too many open files" in the log, and is likely unrelated. I'll bet this is the same macOS bug as #37795. If it is, a similar workaround might work--add a small delay so that we don't close (or CloseWrite) a socket while reading from it.

@bcmills
Copy link
Member Author

bcmills commented Jun 15, 2022

Seems worth a shot, at least! (We can always try it, and reopen the issue if it turns out not to be the right workaround.)

@gopherbot
Copy link

Change https://go.dev/cl/414534 mentions this issue: net: avoid darwin/arm64 platform bug in TestCloseWrite

@bcmills
Copy link
Member Author

bcmills commented Jun 27, 2022

Sent https://go.dev/cl/414534 with a port of the workaround from #37795, prompted by another failure on the dashboard.

greplogs -l -e '(?m)FAIL: TestCloseWrite/.*\n(\s+.*\n)*.*i/o timeout' --since=2022-06-15
2022-06-26T00:21:33-666d736/darwin-arm64-12

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 27, 2022
@dmitshur dmitshur modified the milestones: Backlog, Go1.19 Jun 27, 2022
@heschi
Copy link
Contributor

heschi commented Jul 6, 2022

Unless I'm misunderstanding something, this recurred on 6/29: 2022-06-29T22:28:30-b2cc0fe/darwin-arm64-11

@heschi heschi reopened this Jul 6, 2022
@gopherbot gopherbot modified the milestones: Go1.19, Go1.20 Aug 2, 2022
jproberts pushed a commit to jproberts/go that referenced this issue Aug 10, 2022
On darwin_arm64, reading from a socket at the same time as the other
end is closing it will occasionally hang for 60 seconds before
returning ECONNRESET. (This is a macOS issue, not a Go issue.)

Work around this condition by adding a brief sleep before the read.

Fixes golang#49352 (we hope).
Updates golang#37795.

Change-Id: I4052aec21d311d7370550aea9dd7941f39141133
Reviewed-on: https://go-review.googlesource.com/c/go/+/414534
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 6, 2023
@gopherbot gopherbot removed the NeedsFix The path to resolution is known, but the work has not been done. label Jan 6, 2023
@bcmills bcmills changed the title net: TestCloseWrite failures with "i/o timeout" net: TestCloseWrite failures with "i/o timeout" on darwin/arm64 Jan 6, 2023
@bcmills
Copy link
Member Author

bcmills commented Jan 6, 2023

watchflakes found another in #57589. Added a pattern for this issue.

@gopherbot gopherbot modified the milestones: Go1.20, Go1.21 Feb 1, 2023
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- goos ~ `darwin|ios` && goarch == "arm64" && pkg == "net" && test == "TestCloseWrite"
2023-02-10 17:30 darwin-arm64-12 go@f69dbb6d net.TestCloseWrite (log)
--- FAIL: TestCloseWrite (0.00s)
    --- FAIL: TestCloseWrite/unix (159.80s)
        net_test.go:176: got (0, read unix ->/tmp/buildlet/tmp/4113395035/sock: i/o timeout); want (0, io.EOF)
        net_test.go:119: got (0, read unix /tmp/buildlet/tmp/4113395035/sock->: i/o timeout); want (0, io.EOF)

watchflakes

@bcmills
Copy link
Member Author

bcmills commented Feb 24, 2023

(Note that the above failure was on release-branch.go1.19.)

@neild
Copy link
Contributor

neild commented Feb 24, 2023

If this is indeed #37795, then I believe we should stop seeing this on more recent Darwin versions. (macOS 13 and higher? Not sure which version has the fix.)

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- goos ~ `darwin|ios` && goarch == "arm64" && pkg == "net" && test == "TestCloseWrite"
2023-05-16 19:46 darwin-arm64-12 go@b3d1cce3 net.TestCloseWrite (log)
--- FAIL: TestCloseWrite (0.00s)
    --- FAIL: TestCloseWrite/tcp (159.74s)
        net_test.go:119: got (0, read tcp4 127.0.0.1:55818->127.0.0.1:55819: i/o timeout); want (0, io.EOF)
        net_test.go:176: got (0, read tcp 127.0.0.1:55819->127.0.0.1:55818: i/o timeout); want (0, io.EOF)

watchflakes

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- goos ~ `darwin|ios` && goarch == "arm64" && pkg == "net" && test == "TestCloseWrite"
2023-05-19 22:45 darwin-arm64-12 go@fe786638 net.TestCloseWrite (log)
--- FAIL: TestCloseWrite (0.00s)
    --- FAIL: TestCloseWrite/unix (159.70s)
        net_test.go:176: got (0, read unix ->/tmp/buildlet/tmp/330333358/sock: i/o timeout); want (0, io.EOF)
        net_test.go:119: got (0, read unix /tmp/buildlet/tmp/330333358/sock->: i/o timeout); want (0, io.EOF)

watchflakes

@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- goos ~ `darwin|ios` && goarch == "arm64" && pkg == "net" && test == "TestCloseWrite"
2023-10-13 17:10 darwin-arm64-12 go@6dd7462a net.TestCloseWrite (log)
--- FAIL: TestCloseWrite (0.00s)
    --- FAIL: TestCloseWrite/tcp (159.96s)
        net_test.go:174: got (0, read tcp 127.0.0.1:54752->127.0.0.1:54750: i/o timeout); want (0, io.EOF)
        net_test.go:117: got (0, read tcp4 127.0.0.1:54750->127.0.0.1:54752: i/o timeout); want (0, io.EOF)

watchflakes

@gopherbot gopherbot modified the milestones: Go1.22, Go1.23 Feb 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm64 NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Projects
Status: No status
Development

No branches or pull requests

5 participants