
TestLimitRanger_GetLimitRangesFixed22422 flakes #113377

Closed
pacoxu opened this issue Oct 27, 2022 · 7 comments · Fixed by #113736
Labels
kind/flake: Categorizes issue or PR as related to a flaky test.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/scalability: Categorizes an issue or PR as relevant to SIG Scalability.

Comments

pacoxu (Member) commented Oct 27, 2022

Which jobs are flaking?

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/111095/pull-kubernetes-unit/1585051220769247232

panic: test timed out after 3m0s

goroutine 112 [running]:
testing.(*M).startAlarm.func1()
	/usr/local/go/src/testing/testing.go:2036 +0xbb
created by time.goFunc
	/usr/local/go/src/time/sleep.go:176 +0x48

Which tests are flaking?

pull-kubernetes-unit

Since when has it been flaking?

#112696

Testgrid link

https://storage.googleapis.com/k8s-triage/index.html?test=TestLimitRanger_GetLimitRangesFixed22422#658be1b0e17b463179af

Reason for failure (if possible)

goroutine 89 [chan receive, 2 minutes]:
k8s.io/kubernetes/plugin/pkg/admission/limitranger.TestLimitRanger_GetLimitRangesFixed22422.func1({0x2bf1100, 0xc0006ea240})
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission_test.go:885 +0x19e
k8s.io/kubernetes/vendor/k8s.io/client-go/testing.(*SimpleReactor).React(0xc0006b1d10, {0x2bf1100, 0xc0006ea240})
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/testing/fixture.go:530 +0x5d
k8s.io/kubernetes/vendor/k8s.io/client-go/testing.(*Fake).Invokes(0xc0006ca000, {0x2bf1100, 0xc00053c480}, {0x2bdaa98, 0xc000781618})
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/testing/fake.go:145 +0x389
k8s.io/kubernetes/vendor/k8s.io/client-go/kubernetes/typed/core/v1/fake.(*FakeLimitRanges).List(0xc0000122d0, {0xc00070c0f0?, 0x0?}, {{{0x0, 0x0}, {0x0, 0x0}}, {0x0, 0x0}, {0x0, ...}, ...})
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/kubernetes/typed/core/v1/fake/fake_limitrange.go:60 +0x265
k8s.io/kubernetes/plugin/pkg/admission/limitranger.(*LimitRanger).GetLimitRanges.func1()
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:169 +0x14a
k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight.(*Group).doCall.func2(0xc000781b67, 0xc00044c480, 0xc0004a2d98)
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight/singleflight.go:193 +0xb5
k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight.(*Group).doCall(0xc000298040, 0xc00044c480, {0x2695e14, 0x4}, 0x238d420?)
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight/singleflight.go:195 +0x112
k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight.(*Group).Do(0xc000298040, {0x2695e14, 0x4}, 0x0?)
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/golang.org/x/sync/singleflight/singleflight.go:108 +0x211
k8s.io/kubernetes/plugin/pkg/admission/limitranger.(*LimitRanger).GetLimitRanges(0xc000298000, {0x2bf6cf0, 0xc000304d80})
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission.go:168 +0x2cf
k8s.io/kubernetes/plugin/pkg/admission/limitranger.TestLimitRanger_GetLimitRangesFixed22422.func2()
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission_test.go:904 +0xd0
created by k8s.io/kubernetes/plugin/pkg/admission/limitranger.TestLimitRanger_GetLimitRangesFixed22422
	/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/admission/limitranger/admission_test.go:902 +0x16a9
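
For context, a minimal Go sketch of the pattern this trace shows (illustrative names only, not the Kubernetes code): the fake client's reactor parks on a channel receive, concurrent GetLimitRanges callers are collapsed into a single call by singleflight.Group, and one missing channel send leaves the goroutine stuck in "chan receive" until the 3m test alarm panics.

// Minimal sketch of the blocked-reactor pattern (names are illustrative).
package main

import (
    "fmt"
    "time"

    "golang.org/x/sync/singleflight"
)

func main() {
    var g singleflight.Group
    unblock := make(chan struct{}) // the reactor parks here, as at admission_test.go:885

    listFn := func() (interface{}, error) {
        <-unblock // simulates the SimpleReactor blocking on a chan receive
        return "limitranges", nil
    }

    done := make(chan struct{})
    go func() {
        v, _, _ := g.Do("test", listFn) // concurrent callers share one flight per key
        fmt.Println("got:", v)
        close(done)
    }()

    unblock <- struct{}{} // remove this send and the goroutine never returns

    select {
    case <-done:
    case <-time.After(time.Second):
        fmt.Println("deadlock: reactor was never released")
    }
}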

Anything else we need to know?

/cc @aimuz @aojea

Relevant SIG(s)

/sig scalability

pacoxu added the kind/flake label Oct 27, 2022
k8s-ci-robot added the sig/scalability and needs-triage labels Oct 27, 2022
k8s-ci-robot (Contributor) commented

@pacoxu: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

aimuz (Contributor) commented Oct 27, 2022

In local tests it always passes, so I'll check whether the problem is caused by the environment.

go test ./plugin/pkg/admission/limitranger -count=1 -v

=== RUN   TestDefaultContainerResourceRequirements
--- PASS: TestDefaultContainerResourceRequirements (0.00s)
=== RUN   TestMergePodResourceRequirements
--- PASS: TestMergePodResourceRequirements (0.00s)
=== RUN   TestPodLimitFunc
--- PASS: TestPodLimitFunc (0.00s)
=== RUN   TestPodLimitFuncApplyDefault
--- PASS: TestPodLimitFuncApplyDefault (0.00s)
=== RUN   TestLimitRangerIgnoresSubresource
--- PASS: TestLimitRangerIgnoresSubresource (0.00s)
=== RUN   TestLimitRangerAdmitPod
E1027 10:06:05.816275   27659 reflector.go:140] k8s.io/client-go/informers/factory.go:149: Failed to watch *v1.LimitRange: unhandled watch: testing.WatchActionImpl{ActionImpl:testing.ActionImpl{Namespace:"", Verb:"watch", Resource:schema.GroupVersionResource{Group:"", Version:"v1", Resource:"limitranges"}, Subresource:""}, WatchRestrictions:testing.WatchRestrictions{Labels:labels.internalSelector(nil), Fields:fields.andTerm{}, ResourceVersion:"1"}}
--- PASS: TestLimitRangerAdmitPod (0.00s)
=== RUN   TestPersistentVolumeClaimLimitFunc
E1027 10:06:05.816688   27659 reflector.go:140] k8s.io/client-go/informers/factory.go:149: Failed to watch *v1.LimitRange: unhandled watch: testing.WatchActionImpl{ActionImpl:testing.ActionImpl{Namespace:"", Verb:"watch", Resource:schema.GroupVersionResource{Group:"", Version:"v1", Resource:"limitranges"}, Subresource:""}, WatchRestrictions:testing.WatchRestrictions{Labels:labels.internalSelector(nil), Fields:fields.andTerm{}, ResourceVersion:"1"}}
--- PASS: TestPersistentVolumeClaimLimitFunc (0.00s)
=== RUN   TestLimitRanger_GetLimitRangesFixed22422
--- PASS: TestLimitRanger_GetLimitRangesFixed22422 (0.00s)
PASS
ok      k8s.io/kubernetes/plugin/pkg/admission/limitranger      0.410s

aimuz (Contributor) commented Oct 27, 2022

If a cached entry is older than 30 seconds, the LRU cache invalidates it, which leads to the unit test failure; the test runs normally again on a retry.
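
To make the 30-second behavior concrete, here is a sketch assuming the plugin caches list results in entries stamped with an expiry; the type and field names are illustrative, not the plugin's exact code:

// Sketch of a TTL-stamped cache entry: a test that takes longer than
// the 30s TTL finds its entry expired and must fall back to a live list.
package main

import (
    "fmt"
    "time"
)

type liveLookupEntry struct {
    expiry time.Time
    items  []string // stands in for []*corev1.LimitRange
}

const liveTTL = 30 * time.Second

func lookup(cache map[string]liveLookupEntry, ns string) ([]string, bool) {
    e, ok := cache[ns]
    if !ok || time.Now().After(e.expiry) {
        return nil, false // expired: caller must re-list from the API
    }
    return e.items, true
}

func main() {
    cache := map[string]liveLookupEntry{
        "test": {expiry: time.Now().Add(liveTTL), items: []string{"lr-1"}},
    }
    if items, ok := lookup(cache, "test"); ok {
        fmt.Println("fresh:", items)
    }
}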

aimuz (Contributor) commented Oct 27, 2022

After I manually remove one unhold <- struct{}{} send, the same problem occurs, so it does seem the problem is caused by the cache entry being invalidated after 30 seconds. If the test doesn't complete within 30 seconds, this is probably a machine performance issue.
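
A sketch of the handshake being described (the unhold name comes from this comment, hold and everything else is illustrative): each reactor invocation consumes exactly one unhold send, so dropping a single send parks that goroutine in a chan receive forever.

// Sketch of a hold/unhold handshake between a test and its reactors.
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    hold := make(chan struct{})
    unhold := make(chan struct{})

    reactor := func(id int) {
        defer wg.Done()
        hold <- struct{}{} // tell the test we are inside the reactor
        <-unhold           // block until the test releases us
        fmt.Println("reactor", id, "released")
    }

    for i := 0; i < 2; i++ {
        wg.Add(1)
        go reactor(i)
    }
    for i := 0; i < 2; i++ {
        <-hold               // observe a reactor entering
        unhold <- struct{}{} // exactly one release per entry; drop one send
        // and the matching goroutine parks in "chan receive" until the
        // test binary's timeout alarm fires
    }
    wg.Wait()
}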

pacoxu (Member, Author) commented Oct 27, 2022

go test -c -race ./plugin/pkg/admission/limitranger  -run ^TestLimitRanger_GetLimitRangesFixed22422$
stress -p 4 ./limitranger.test

stress will hang after about a minute in my local test.
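
For anyone reproducing this: stress here is presumably golang.org/x/tools/cmd/stress (-p sets the number of parallel processes), installable with:

go install golang.org/x/tools/cmd/stress@latest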

aojea (Member) commented Oct 28, 2022

/cc

aimuz (Contributor) commented Oct 31, 2022

A fix is in progress: #113378
