panic: sync: negative WaitGroup counter in k8s.io/kubernetes/pkg/controller/node #16404

Closed
soltysh opened this Issue Sep 18, 2017 · 5 comments

Member

soltysh commented Sep 18, 2017

Seen today here: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/16390/test_pull_request_origin_unit/2793/build-log.txt

=== RUN   TestCancel
panic: sync: negative WaitGroup counter
goroutine 400 [running]:
sync.(*WaitGroup).Add(0xc420392540, 0xffffffffffffffff)
	/usr/lib/golang/src/sync/waitgroup.go:75 +0x255
sync.(*WaitGroup).Done(0xc420392540)
	/usr/lib/golang/src/sync/waitgroup.go:100 +0x42
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.TestCancel.func1(0xc420318d80, 0x1, 0x1c)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/timed_workers_test.go:88 +0x61
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.(*TimedWorkerQueue).getWrappedWorkerFunc.func1(0xc420318d80, 0x0, 0x0)
	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/_test/_obj_test/timed_workers.go:81 +0xb6
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.CreateWorker.func1()
	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/_test/_obj_test/timed_workers.go:44 +0x72
created by time.goFunc
	/usr/lib/golang/src/time/sleep.go:170 +0x52
FAIL	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node	37.948s
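
For readers unfamiliar with this panic: `sync.WaitGroup.Done` is `Add(-1)` (the `0xffffffffffffffff` argument in the trace above is -1), and `Add` panics as soon as the counter would drop below zero. The following is a minimal sketch of that mechanism only. The function name `extraDone` is hypothetical and the code is not taken from `timed_workers_test.go`.

```go
package main

import (
	"fmt"
	"sync"
)

// extraDone calls Done once more than Add and returns the recovered
// panic message, or "" if nothing panicked.
func extraDone() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()

	var wg sync.WaitGroup
	wg.Add(1)
	wg.Done() // counter goes 1 -> 0
	wg.Done() // counter would go 0 -> -1, so Add(-1) panics
	return ""
}

func main() {
	fmt.Println(extraDone())
}
```

In a test, this situation typically arises when more callbacks call `Done` than the test accounted for with `Add`.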

Member

enj commented Sep 18, 2017

We just need to disable these tests in origin:

#16077
kubernetes/kubernetes#51704
kubernetes/kubernetes#51705

Member

soltysh commented Sep 18, 2017

I don't think that's a good approach. Disabling tests is our last resort, and it should be followed immediately by a proper fix (or a temporary one, if the proper fix is big). Why not increase those timeouts even more?

Member

enj commented Sep 18, 2017

I suppose we can carry a patch for a 10 second sleep. The whole thing is a hack anyway :/

Member

soltysh commented Sep 18, 2017

SGTM, will you open a PR or shall I?

Member

mfojtik commented Sep 20, 2017

Seen this 4x today, raising to P0.

openshift-merge-robot added a commit that referenced this issue Sep 20, 2017

Merge pull request #16411 from soltysh/issue16404
Automatic merge from submit-queue

UPSTREAM: <carry>: increase timeout in TestCancelAndReadd even more

Fixes #16404.

Apparently #16077 didn't fix the problem. #16404 is showing that we're hitting this more and more.

@mfojtik @kargakis @enj ptal