Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: TestTransportGCRequest failures #56809

Open
gopherbot opened this issue Nov 17, 2022 · 6 comments
Open

net/http: TestTransportGCRequest failures #56809

gopherbot opened this issue Nov 17, 2022 · 6 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@gopherbot
Copy link

gopherbot commented Nov 17, 2022

#!watchflakes
post <- pkg == "net/http" && test == "TestTransportGCRequest"

Issue created automatically to collect these failures.

Example (log):

1 second passes in backend, proxygone= false
2022/11/17 00:14:51 http2: server: error reading preface from client 127.0.0.1:38531: bogus greeting "CONNECT golang.fake.tld:"
2022/11/17 00:14:51 http2: server: error reading preface from client 127.0.0.1:38538: bogus greeting "CONNECT golang.fake.tld:"
2022/11/17 00:14:53 http: TLS handshake error from 127.0.0.1:38595: EOF
2022/11/17 00:14:54 http: TLS handshake error from 127.0.0.1:38601: EOF
2022/11/17 00:15:01 http: TLS handshake error from 127.0.0.1:38682: write tcp 127.0.0.1:38673->127.0.0.1:38682: use of closed network connection
2022/11/17 00:15:01 http: TLS handshake error from 127.0.0.1:38681: write tcp 127.0.0.1:38673->127.0.0.1:38681: use of closed network connection
2022/11/17 00:15:02 http: TLS handshake error from 127.0.0.1:38691: read tcp 127.0.0.1:38690->127.0.0.1:38691: use of closed network connection
--- FAIL: TestTransportGCRequest (0.00s)
    --- FAIL: TestTransportGCRequest/h2 (7.10s)
        --- FAIL: TestTransportGCRequest/h2/Body (5.32s)
            clientserver_test.go:1171: never saw GC of request
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39016: read tcp 127.0.0.1:39015->127.0.0.1:39016: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39017: read tcp 127.0.0.1:39015->127.0.0.1:39017: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39024: read tcp 127.0.0.1:39023->127.0.0.1:39024: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39026: read tcp 127.0.0.1:39023->127.0.0.1:39026: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39025: read tcp 127.0.0.1:39023->127.0.0.1:39025: read: connection reset by peer
2022/11/17 00:15:21 http: TLS handshake error from 127.0.0.1:39180: write tcp 127.0.0.1:39179->127.0.0.1:39180: i/o timeout
2022/11/17 00:15:21 http: TLS handshake error from 127.0.0.1:39178: write tcp 127.0.0.1:39177->127.0.0.1:39178: i/o timeout

watchflakes

@gopherbot gopherbot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 17, 2022
@gopherbot
Copy link
Author

gopherbot commented Nov 17, 2022

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "net/http" && test == "TestTransportGCRequest"
2022-11-16 15:10 freebsd-riscv64-unmatched go@70f585f0 net/http.TestTransportGCRequest (log)
1 second passes in backend, proxygone= false
2022/11/17 00:14:51 http2: server: error reading preface from client 127.0.0.1:38531: bogus greeting "CONNECT golang.fake.tld:"
2022/11/17 00:14:51 http2: server: error reading preface from client 127.0.0.1:38538: bogus greeting "CONNECT golang.fake.tld:"
2022/11/17 00:14:53 http: TLS handshake error from 127.0.0.1:38595: EOF
2022/11/17 00:14:54 http: TLS handshake error from 127.0.0.1:38601: EOF
2022/11/17 00:15:01 http: TLS handshake error from 127.0.0.1:38682: write tcp 127.0.0.1:38673->127.0.0.1:38682: use of closed network connection
2022/11/17 00:15:01 http: TLS handshake error from 127.0.0.1:38681: write tcp 127.0.0.1:38673->127.0.0.1:38681: use of closed network connection
2022/11/17 00:15:02 http: TLS handshake error from 127.0.0.1:38691: read tcp 127.0.0.1:38690->127.0.0.1:38691: use of closed network connection
--- FAIL: TestTransportGCRequest (0.00s)
    --- FAIL: TestTransportGCRequest/h2 (7.10s)
        --- FAIL: TestTransportGCRequest/h2/Body (5.32s)
            clientserver_test.go:1171: never saw GC of request
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39016: read tcp 127.0.0.1:39015->127.0.0.1:39016: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39017: read tcp 127.0.0.1:39015->127.0.0.1:39017: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39024: read tcp 127.0.0.1:39023->127.0.0.1:39024: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39026: read tcp 127.0.0.1:39023->127.0.0.1:39026: read: connection reset by peer
2022/11/17 00:15:18 http: TLS handshake error from 127.0.0.1:39025: read tcp 127.0.0.1:39023->127.0.0.1:39025: read: connection reset by peer
2022/11/17 00:15:21 http: TLS handshake error from 127.0.0.1:39180: write tcp 127.0.0.1:39179->127.0.0.1:39180: i/o timeout
2022/11/17 00:15:21 http: TLS handshake error from 127.0.0.1:39178: write tcp 127.0.0.1:39177->127.0.0.1:39178: i/o timeout

watchflakes

@bcmills
Copy link
Member

bcmills commented Nov 17, 2022

It appears this test was last updated in https://go.dev/cl/438137 (attn @neild).

@bcmills bcmills added this to the Backlog milestone Nov 17, 2022
@neild
Copy link
Contributor

neild commented Nov 17, 2022

This doesn't seem likely to be caused by https://go.dev/cl/438137, which refactored the way the test is invoked and its name but not the content of the test.

The test sends an HTTP request, waits 100ms, calls runtime.GC, waits 4900ms, and fails if the request has not been finalized. Possibly a sufficiently slow builder could result in the request not being GCd within the time limit?

I'm currently unable to create a freebsd-riscv64-unmatched gomote to test. Going to leave this alone for now to collect more data.

@bcmills
Copy link
Member

bcmills commented Nov 17, 2022

“sufficiently slow builder” seems plausible.

Would it make sense to remove the arbitrary 5s timeout? If the request is truly pinned by a reachable Transport, letting the test run to its deadline should detect that bug equally well.

@neild
Copy link
Contributor

neild commented Nov 17, 2022

Dropping the timeout might be reasonable. Although I'm a bit confused why it would be necessary, since runtime.GC should block until the GC is complete, by which point finalizers should have run, no?

Let's see if any other flakes show up, and on what builders.

@bcmills
Copy link
Member

bcmills commented Nov 17, 2022

runtime.GC should block until the GC is complete, by which point finalizers should have run, no?

Finalizers aren't guaranteed to run at any given point relative to the GC cycle; IIRC they only start running at the end of the cycle in which the object becomes unreachable. And sometimes they can take arbitrarily many cycles to clear — for example, a chain of N objects with a finalizer set at each link of the chain will have N GC cycles of latency before the last finalizer can run.

So in general a test that uses finalizers to check for leaks has to be prepared for arbitrary latency.

Making matters even worse, if the allocation is small enough to fit into the tiny allocator, its finalizer might not ever run if some unrelated tiny allocation pins the block. (But wouldn't expect an http.Request to fit into the tiny allocator, so that's probably not the problem here. 😅)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: No status
Development

No branches or pull requests

3 participants