[Flake] goroutine leak detection flaking for kubelet #122051

Closed
pacoxu opened this issue Nov 27, 2023 · 4 comments
Labels
  • kind/flake: Categorizes issue or PR as related to a flaky test.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • sig/testing: Categorizes an issue or PR as relevant to SIG Testing.
Milestone

Comments

@pacoxu
Member

pacoxu commented Nov 27, 2023

Which jobs are flaking?

ci-kubernetes-integration-1-29

  • only once

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-integration-1-29/1728580243783946240

Which tests are flaking?

{Failed;  I1126 01:41:46.566270  104868 etcd.go:71] etcd already running at http://127.0.0.1:2379
PASS
E1126 01:42:56.066382  104868 etcd.go:221] "EtcdMain goroutine check" err=<
	found unexpected goroutines:
	[Goroutine 28180 in state select, with k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).serve on top of the stack:
	goroutine 28180 [select]:
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).serve(0xc0090feea0)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:940 +0x88f
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*Server).ServeConn(0xc003c0aa00, {0x549b610?, 0xc0094fbc00}, 0xc00f1b7b18)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:531 +0xbcc
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.ConfigureServer.func1(0xc003b72870, 0x549b610?, {0x5459520, 0xc009e463c0})
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:321 +0xff
	net/http.(*conn).serve(0xc00957b3b0, {0x548d760, 0xc006420630})
		/usr/local/go/src/net/http/server.go:1917 +0x1213
	created by net/http.(*Server).Serve in goroutine 2406
		/usr/local/go/src/net/http/server.go:3086 +0x5cb
	
	 Goroutine 28182 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
	goroutine 28182 [IO wait]:
	internal/poll.runtime_pollWait(0x7f2a8867a890, 0x72)
		/usr/local/go/src/runtime/netpoll.go:343 +0x85
	internal/poll.(*pollDesc).wait(0xc005714b00?, 0xc009380a80?, 0x0)
		/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27
	internal/poll.(*pollDesc).waitRead(...)
		/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
	internal/poll.(*FD).Read(0xc005714b00, {0xc009380a80, 0xa80, 0xa80})
		/usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a
	net.(*netFD).Read(0xc005714b00, {0xc009380a80?, 0xc009380a85?, 0x155?})
		/usr/local/go/src/net/fd_posix.go:55 +0x25
	net.(*conn).Read(0xc009db8a30, {0xc009380a80?, 0x0?, 0xc008991db8?})
		/usr/local/go/src/net/net.go:179 +0x45
	crypto/tls.(*atLeastReader).Read(0xc0056d2600, {0xc009380a80?, 0xc0056d2600?, 0x0?})
		/usr/local/go/src/crypto/tls/conn.go:805 +0x3b
	bytes.(*Buffer).ReadFrom(0xc008991ea8, {0x545b2a0, 0xc0056d2600})
		/usr/local/go/src/bytes/buffer.go:211 +0x98
	crypto/tls.(*Conn).readFromUntil(0xc008991c00, {0x5458ea0?, 0xc009db8a30}, 0xa80?)
		/usr/local/go/src/crypto/tls/conn.go:827 +0xde
	crypto/tls.(*Conn).readRecordOrCCS(0xc008991c00, 0x0)
		/usr/local/go/src/crypto/tls/conn.go:625 +0x250
	crypto/tls.(*Conn).readRecord(...)
		/usr/local/go/src/crypto/tls/conn.go:587
	crypto/tls.(*Conn).Read(0xc008991c00, {0xc005c3e000, 0x1000, 0x800010601?})
		/usr/local/go/src/crypto/tls/conn.go:1369 +0x158
	bufio.(*Reader).Read(0xc004c5e780, {0xc007789000, 0x9, 0x7f2ad1ea7108?})
		/usr/local/go/src/bufio/bufio.go:244 +0x197
	io.ReadAtLeast({0x5458140, 0xc004c5e780}, {0xc007789000, 0x9, 0x9}, 0x9)
		/usr/local/go/src/io/io.go:335 +0x90
	io.ReadFull(...)
		/usr/local/go/src/io/io.go:354
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.readFrameHeader({0xc007789000, 0x9, 0x9ae725?}, {0x5458140?, 0xc004c5e780?})
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/frame.go:237 +0x65
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc007788fc0)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/frame.go:498 +0x85
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*clientConnReadLoop).run(0xc007be7f98)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/transport.go:2275 +0x11f
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*ClientConn).readLoop(0xc007bff800)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/transport.go:2170 +0x65
	created by k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*Transport).newClientConn in goroutine 28181
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/transport.go:821 +0xcbe
	
	 Goroutine 28212 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
	goroutine 28212 [IO wait]:
	internal/poll.runtime_pollWait(0x7f2a8b1bae58, 0x72)
		/usr/local/go/src/runtime/netpoll.go:343 +0x85
	internal/poll.(*pollDesc).wait(0xc00aca5b80?, 0xc009036480?, 0x0)
		/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27
	internal/poll.(*pollDesc).waitRead(...)
		/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
	internal/poll.(*FD).Read(0xc00aca5b80, {0xc009036480, 0x240, 0x240})
		/usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a
	net.(*netFD).Read(0xc00aca5b80, {0xc009036480?, 0xc009036485?, 0x22?})
		/usr/local/go/src/net/fd_posix.go:55 +0x25
	net.(*conn).Read(0xc004cae6e0, {0xc009036480?, 0x2c?, 0xc0094fbdb8?})
		/usr/local/go/src/net/net.go:179 +0x45
	crypto/tls.(*atLeastReader).Read(0xc0056d25b8, {0xc009036480?, 0xc0056d25b8?, 0x0?})
		/usr/local/go/src/crypto/tls/conn.go:805 +0x3b
	bytes.(*Buffer).ReadFrom(0xc0094fbea8, {0x545b2a0, 0xc0056d25b8})
		/usr/local/go/src/bytes/buffer.go:211 +0x98
	crypto/tls.(*Conn).readFromUntil(0xc0094fbc00, {0x5458ea0?, 0xc004cae6e0}, 0x240?)
		/usr/local/go/src/crypto/tls/conn.go:827 +0xde
	crypto/tls.(*Conn).readRecordOrCCS(0xc0094fbc00, 0x0)
		/usr/local/go/src/crypto/tls/conn.go:625 +0x250
	crypto/tls.(*Conn).readRecord(...)
		/usr/local/go/src/crypto/tls/conn.go:587
	crypto/tls.(*Conn).Read(0xc0094fbc00, {0xc007b1a820, 0x9, 0x451126?})
		/usr/local/go/src/crypto/tls/conn.go:1369 +0x158
	io.ReadAtLeast({0x7f2a880f8628, 0xc0094fbc00}, {0xc007b1a820, 0x9, 0x9}, 0x9)
		/usr/local/go/src/io/io.go:335 +0x90
	io.ReadFull(...)
		/usr/local/go/src/io/io.go:354
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.readFrameHeader({0xc007b1a820, 0x9, 0x0?}, {0x7f2a880f8628?, 0xc0094fbc00?})
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/frame.go:237 +0x65
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc007b1a7e0)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/frame.go:498 +0x85
	k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).readFrames(0xc0090feea0)
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:820 +0x87
	created by k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).serve in goroutine 28180
		/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:932 +0x56a
	]
 >
FAIL	k8s.io/kubernetes/test/integration/kubelet	69.607s

Since when has it been flaking?

NA

Testgrid link

https://testgrid.k8s.io/sig-release-1.29-blocking#integration-1.29

Reason for failure (if possible)

No response

Anything else we need to know?

A similar issue, #116196, has already been fixed.

/cc @pohly

Relevant SIG(s)

/sig node testing

@pacoxu pacoxu added the kind/flake Categorizes issue or PR as related to a flaky test. label Nov 27, 2023
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 27, 2023
@pohly
Contributor

pohly commented Nov 30, 2023

The reason for the flake seems to be different from last time: previously, some goroutine was stuck in a timed wait. Here the apiserver seems to be in the process of handling a request when the client stops sending data.

@pacoxu
Member Author

pacoxu commented Dec 22, 2023

The run started on 11/26/2023 failed, and the test failed again on 12/09: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-integration-1-29/1733305856373559296

/milestone v1.29

It has not flaked in https://testgrid.k8s.io/sig-release-master-blocking#integration-master yet, although that job should be running the same code.

@pacoxu
Member Author

pacoxu commented Feb 2, 2024

/close
Closing per #123086; the failure rate is not high.
