unit flake: k8s.io/kubernetes/pkg/storage (specifically storage.TestList) #19254

Closed
gmarek opened this issue Jan 4, 2016 · 37 comments
Assignees
Labels
kind/flake Categorizes issue or PR as related to a flaky test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@gmarek
Contributor

gmarek commented Jan 4, 2016

WARNING: DATA RACE
Write by goroutine 55:
  sync.raceWrite()
      /tmp/workdir/go/src/sync/race.go:41 +0x2e
  sync.(*WaitGroup).Wait()
      /tmp/workdir/go/src/sync/waitgroup.go:124 +0xf9
  net/http/httptest.(*Server).Close()
      /tmp/workdir/go/src/net/http/httptest/server.go:168 +0x80
  k8s.io/kubernetes/pkg/storage/etcd/testing.(*EtcdTestServer).Terminate()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/storage/etcd/testing/utils.go:151 +0x138
  k8s.io/kubernetes/pkg/storage_test.TestList()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/storage/cacher_test.go:150 +0x1c05
  testing.tRunner()
      /tmp/workdir/go/src/testing/testing.go:456 +0xdc

Previous read by goroutine 74:
  sync.raceRead()
      /tmp/workdir/go/src/sync/race.go:37 +0x2e
  sync.(*WaitGroup).Add()
      /tmp/workdir/go/src/sync/waitgroup.go:66 +0xfa
  net/http/httptest.(*waitGroupHandler).ServeHTTP()
      /tmp/workdir/go/src/net/http/httptest/server.go:198 +0x5c
  net/http.serverHandler.ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1862 +0x206
  net/http.(*conn).serve()
      /tmp/workdir/go/src/net/http/server.go:1361 +0x117c

Goroutine 55 (running) created at:
  testing.RunTests()
      /tmp/workdir/go/src/testing/testing.go:561 +0xaa3
  testing.(*M).Run()
      /tmp/workdir/go/src/testing/testing.go:494 +0xe4
  main.main()
      k8s.io/kubernetes/pkg/storage/_test/_testmain.go:72 +0x20f

Goroutine 74 (running) created at:
  net/http.(*Server).Serve()
      /tmp/workdir/go/src/net/http/server.go:1910 +0x464

Looks like a dupe of #18928

cc @kubernetes/goog-control-plane

@gmarek gmarek added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. team/control-plane kind/flake Categorizes issue or PR as related to a flaky test. labels Jan 4, 2016
@krousey
Contributor

krousey commented Jan 6, 2016

Just ran this 10k times. Could not reproduce. Ran like:

> godep go test -c -race
> stress -p 20 ./storage.test -test.run TestList

@gmarek
Contributor Author

gmarek commented Jan 7, 2016

It's possible that #19187 fixed this, but sadly running tests locally doesn't seem to be a good way to reproduce failures. Other tests also passed 100% of the time when run locally, which was the sad reality there as well. Sometimes adding a sleep at the beginning of a test made it fail. We believe the problem does not appear unless the system is heavily loaded (as our Jenkins is).

@mikedanese
Member

This looks like golang/go#12262. Fixed in go1.6
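
For context, the trace at the top of this issue shows `WaitGroup.Add` (called from the httptest request handler) overlapping with `WaitGroup.Wait` (called from `Server.Close`). Below is a minimal sketch of one way to keep the two from overlapping: guard `Add` with a mutex and a closed flag so that no `Add` can start once shutdown has begun. This is an illustration only, not the patch that went into Go; the `tracker` type, its methods, and `runRequests` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a hypothetical helper, not part of net/http/httptest.
// It counts in-flight requests and refuses new ones once shutdown has
// begun, so Add never runs concurrently with Wait.
type tracker struct {
	mu     sync.Mutex
	closed bool
	wg     sync.WaitGroup
}

// begin registers a request. It returns false if shutdown has started.
func (t *tracker) begin() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.closed {
		return false
	}
	t.wg.Add(1)
	return true
}

// end marks a request that was accepted by begin as finished.
func (t *tracker) end() { t.wg.Done() }

// shutdown stops accepting requests, then waits for in-flight ones.
func (t *tracker) shutdown() {
	t.mu.Lock()
	t.closed = true
	t.mu.Unlock()
	t.wg.Wait()
}

// runRequests starts n simulated requests that race with shutdown and
// reports how many were accepted and how many were rejected.
func runRequests(n int) (accepted, rejected int) {
	var t tracker
	var mu sync.Mutex
	var workers sync.WaitGroup
	workers.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer workers.Done()
			ok := t.begin()
			if ok {
				t.end()
			}
			mu.Lock()
			if ok {
				accepted++
			} else {
				rejected++
			}
			mu.Unlock()
		}()
	}
	t.shutdown()
	workers.Wait()
	return accepted, rejected
}

func main() {
	a, r := runRequests(50)
	fmt.Println("accepted:", a, "rejected:", r)
}
```

The split between accepted and rejected requests varies from run to run, but every request is accounted for and `go run -race` should stay quiet, because each `Add` is ordered before `Wait` by the mutex.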

@ghost

ghost commented Jan 7, 2016

@mikedanese nice sleuthing! It sounds like we should just add a temporary work-around to the test (eliminate timeouts) until we upgrade to go 1.6 (which might take quite some time).

Q

@davidopp
Member

davidopp commented Jan 8, 2016

Or should we downgrade to 1.4 and then upgrade directly from 1.4 to 1.6?

@krousey
Contributor

krousey commented Jan 8, 2016

It's a test-only bug. I'd still want 1.5 coverage, as it exposed many real issues.

Perhaps we build our own toolchain with a patch for our Jenkins instance?

@krousey
Contributor

krousey commented Jan 8, 2016

@ixdy would it be possible to run our jenkins jobs with a patched go 1.5 toolchain?

@ghost

ghost commented Jan 8, 2016

@krousey I don't think that it's a good idea to build and run a custom go platform. Other developers will be using a standard go distro, and we want their tests to pass too :-)
@davidopp go 1.6 is likely many moons away. I think we should adopt 1.5, and resolve any issues that we run into, as @krousey suggests.

@ghost

ghost commented Jan 8, 2016

My apologies, I'm wrong. Go 1.6 is scheduled for release on Feb 1 2016.

https://groups.google.com/forum/#!topic/golang-dev/vNboccLL95c

But even so, I think that getting things working right on go 1.5 makes sense. The required test workaround seems pretty trivial.

@krousey
Contributor

krousey commented Jan 8, 2016

@quinton-hoole as it would only affect a test package, and therefore only our tests, I don't see the harm in it. We could even provide the patch if they wanted less flaky tests too. But if there's a simple workaround (I didn't see one), that would definitely be preferred.

@mikedanese
Member

The workaround would be to comment out all the calls to ts.Close() and leave a note to uncomment them when we stop supporting 1.5.
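
As a rough illustration of that workaround, a test body might look like the sketch below: the server is created as usual, but the `Close()` call is left commented out with a note for later. The helper name `fetchFromTestServer` and the wording of the TODO are made up for this example and are not taken from the Kubernetes tree; the sketch also uses `io.ReadAll`, which needs a much newer Go than 1.5.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchFromTestServer is a hypothetical stand-in for a test body.
// It starts an httptest server, issues one GET, and returns the body.
func fetchFromTestServer() (string, error) {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	// TODO: Uncomment when we stop supporting Go 1.5 (see #19254).
	// defer ts.Close()

	resp, err := http.Get(ts.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := fetchFromTestServer()
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

The trade-off is that the server and its listener stay alive until the test process exits, which is tolerable in short-lived test binaries but leaks sockets for the duration of the run.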

The patch would not require rebuilding the toolchain; it would only require applying it against $GOROOT. It's a decently sized patch though.

https://go-review.googlesource.com/#/c/15151/15/src/net/http/httptest/server.go

Workaround sounds easier

@davidopp davidopp self-assigned this Jan 10, 2016
@davidopp
Member

I can do the workaround. Here are all the files where we call httptest.NewServer and thus might have a Close() that we need to comment out.

./cmd/integration/integration.go
./contrib/mesos/pkg/executor/executor_test.go
./contrib/mesos/pkg/scheduler/integration/integration_test.go
./pkg/apiserver/apiserver_test.go
./pkg/apiserver/handlers_test.go
./pkg/apiserver/proxy_test.go
./pkg/apiserver/watch_test.go
./pkg/client/cache/listwatch_test.go
./pkg/client/chaosclient/chaosclient_test.go
./pkg/client/unversioned/client_test.go
./pkg/client/unversioned/containerinfo_test.go
./pkg/client/unversioned/helper_test.go
./pkg/client/unversioned/portforward/portforward_test.go
./pkg/client/unversioned/remotecommand/remotecommand_test.go
./pkg/client/unversioned/request_test.go
./pkg/client/unversioned/restclient_test.go
./pkg/client/unversioned/testclient/simple/simple_testclient.go
./pkg/cloudprovider/providers/mesos/client_test.go
./pkg/controller/controller_utils_test.go
./pkg/controller/endpoint/endpoints_controller_test.go
./pkg/controller/replication/replication_controller_test.go
./pkg/credentialprovider/gcp/jwt_test.go
./pkg/credentialprovider/gcp/metadata_test.go
./pkg/genericapiserver/genericapiserver_test.go
./pkg/kubectl/cmd/util/helpers_test.go
./pkg/kubectl/proxy_server_test.go
./pkg/kubectl/resource/builder_test.go
./pkg/kubelet/client/kubelet_client_test.go
./pkg/kubelet/config/http_test.go
./pkg/kubelet/server/server_test.go
./pkg/master/master_test.go
./pkg/probe/http/http_test.go
./pkg/probe/tcp/tcp_test.go
./pkg/proxy/userspace/proxier_test.go
./pkg/registry/generic/rest/proxy_test.go
./pkg/registry/generic/rest/streamer_test.go
./pkg/storage/etcd/util/etcd_util_test.go
./pkg/util/fake_handler_test.go
./pkg/util/httpstream/spdy/roundtripper_test.go
./pkg/util/httpstream/spdy/upgrade_test.go
./pkg/util/proxy/transport_test.go
./pkg/util/wsstream/conn_test.go
./plugin/pkg/auth/authenticator/token/oidc/oidc_test.go
./plugin/pkg/scheduler/factory/factory_test.go
./test/component/scheduler/perf/util.go
./test/integration/auth_test.go
./test/integration/extender_test.go
./test/integration/framework/master_utils.go
./test/integration/scheduler_test.go
./test/integration/secret_test.go
./test/integration/service_account_test.go

@davidopp davidopp added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jan 10, 2016
@mikedanese
Member

@davidopp. I'd suggest using go oracle

davidopp pushed a commit to davidopp/kubernetes that referenced this issue Jan 12, 2016
golang/go#12262 . See kubernetes#19254 for
more details. This change should be reverted when we upgrade
to Go 1.6.
@yujuhong
Contributor

Not sure if it's the same cause, but my PR just ran into a unit test failure where TestList timed out.

=== RUN   TestList
panic: test timed out after 5m0s

The complete log: https://gist.github.com/yujuhong/28a12bbaf5a1e27c757f

@ixdy
Member

ixdy commented Jan 21, 2016

Most recent kubernetes-test-go job failed in the same way.

@ixdy
Member

ixdy commented Jan 21, 2016

We're still being hit by this. Most recently, k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration:

Found 2 data race(s)
FAIL    k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration   9.988s
==================
WARNING: DATA RACE
Write by goroutine 32:
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource.podResources()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource/resource.go:53 +0x1e4
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource.LimitPodCPU()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource/resource.go:110 +0x12a
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm.(*schedulerAlgorithm).limitPod()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm/algorithm.go:149 +0x7e
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm.(*schedulerAlgorithm).Schedule()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm/algorithm.go:110 +0xd6a
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).scheduleOne()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:86 +0x75f
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).(k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.scheduleOne)-fm()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:68 +0x2d
  k8s.io/kubernetes/contrib/mesos/pkg/runtime.Until.func1()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/runtime/util.go:115 +0x58
  k8s.io/kubernetes/contrib/mesos/pkg/runtime.Until()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/runtime/util.go:116 +0x99

Previous read by goroutine 46:
  k8s.io/kubernetes/pkg/api.deepCopy_api_PodSpec()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1922 +0x3f8
  k8s.io/kubernetes/pkg/api.deepCopy_api_Pod()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1732 +0x26e
  k8s.io/kubernetes/pkg/api.deepCopy_api_PodList()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1801 +0x4d2
  runtime.call128()
      /tmp/workdir/go/src/runtime/asm_amd64.s:439 +0x50
  reflect.Value.Call()
      /tmp/workdir/go/src/reflect/value.go:300 +0xcd
  k8s.io/kubernetes/pkg/conversion.(*Cloner).customDeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:150 +0x2be
  k8s.io/kubernetes/pkg/conversion.(*Cloner).deepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:142 +0x309
  k8s.io/kubernetes/pkg/conversion.(*Cloner).defaultDeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:196 +0xc52
  k8s.io/kubernetes/pkg/conversion.(*Cloner).deepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:144 +0x397
  k8s.io/kubernetes/pkg/conversion.(*Cloner).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:128 +0xef
  k8s.io/kubernetes/pkg/conversion.(*Scheme).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/scheme.go:295 +0x7e
  k8s.io/kubernetes/pkg/runtime.(*Scheme).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/runtime/scheme.go:399 +0x78
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.(*MockPodsListWatch).Pods()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:191 +0xe7
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.NewTestServer.func1()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:91 +0x86
  net/http.HandlerFunc.ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1422 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1699 +0x212
  net/http/httptest.(*waitGroupHandler).ServeHTTP()
      /tmp/workdir/go/src/net/http/httptest/server.go:200 +0xfe
  net/http.serverHandler.ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1862 +0x206
  net/http.(*conn).serve()
      /tmp/workdir/go/src/net/http/server.go:1361 +0x117c

Goroutine 32 (running) created at:
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).Run()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:68 +0xf8
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components.(*sched).Run()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/scheduler.go:128 +0x64
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.lifecycleTest.Start()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:535 +0x1b5
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.TestScheduler_LifeCycle()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:622 +0x23a
  testing.tRunner()
      /tmp/workdir/go/src/testing/testing.go:456 +0xdc

Goroutine 46 (running) created at:
  net/http.(*Server).Serve()
      /tmp/workdir/go/src/net/http/server.go:1910 +0x464
==================
==================
WARNING: DATA RACE
Write by goroutine 32:
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource.podResources()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource/resource.go:56 +0x28a
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource.LimitPodCPU()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/resource/resource.go:110 +0x12a
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm.(*schedulerAlgorithm).limitPod()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm/algorithm.go:149 +0x7e
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm.(*schedulerAlgorithm).Schedule()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/algorithm/algorithm.go:110 +0xd6a
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).scheduleOne()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:86 +0x75f
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).(k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.scheduleOne)-fm()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:68 +0x2d
  k8s.io/kubernetes/contrib/mesos/pkg/runtime.Until.func1()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/runtime/util.go:115 +0x58
  k8s.io/kubernetes/contrib/mesos/pkg/runtime.Until()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/runtime/util.go:116 +0x99

Previous read by goroutine 46:
  k8s.io/kubernetes/pkg/api.deepCopy_api_PodSpec()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1922 +0x3f8
  k8s.io/kubernetes/pkg/api.deepCopy_api_Pod()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1732 +0x26e
  k8s.io/kubernetes/pkg/api.deepCopy_api_PodList()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/api/deep_copy_generated.go:1801 +0x4d2
  runtime.call128()
      /tmp/workdir/go/src/runtime/asm_amd64.s:439 +0x50
  reflect.Value.Call()
      /tmp/workdir/go/src/reflect/value.go:300 +0xcd
  k8s.io/kubernetes/pkg/conversion.(*Cloner).customDeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:150 +0x2be
  k8s.io/kubernetes/pkg/conversion.(*Cloner).deepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:142 +0x309
  k8s.io/kubernetes/pkg/conversion.(*Cloner).defaultDeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:196 +0xc52
  k8s.io/kubernetes/pkg/conversion.(*Cloner).deepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:144 +0x397
  k8s.io/kubernetes/pkg/conversion.(*Cloner).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/cloner.go:128 +0xef
  k8s.io/kubernetes/pkg/conversion.(*Scheme).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/conversion/scheme.go:295 +0x7e
  k8s.io/kubernetes/pkg/runtime.(*Scheme).DeepCopy()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/runtime/scheme.go:399 +0x78
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.(*MockPodsListWatch).Pods()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:191 +0xe7
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.NewTestServer.func1()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:91 +0x86
  net/http.HandlerFunc.ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1422 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1699 +0x212
  net/http/httptest.(*waitGroupHandler).ServeHTTP()
      /tmp/workdir/go/src/net/http/httptest/server.go:200 +0xfe
  net/http.serverHandler.ServeHTTP()
      /tmp/workdir/go/src/net/http/server.go:1862 +0x206
  net/http.(*conn).serve()
      /tmp/workdir/go/src/net/http/server.go:1361 +0x117c

Goroutine 32 (running) created at:
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller.(*controller).Run()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/controller/controller.go:68 +0xf8
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components.(*sched).Run()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/components/scheduler.go:128 +0x64
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.lifecycleTest.Start()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:535 +0x1b5
  k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration.TestScheduler_LifeCycle()
      /workspace/kubernetes/_output/local/go/src/k8s.io/kubernetes/contrib/mesos/pkg/scheduler/integration/integration_test.go:622 +0x23a
  testing.tRunner()
      /tmp/workdir/go/src/testing/testing.go:456 +0xdc

Goroutine 46 (running) created at:
  net/http.(*Server).Serve()
      /tmp/workdir/go/src/net/http/server.go:1910 +0x464
==================

@krousey
Contributor

krousey commented Jan 21, 2016

@ixdy those don't look like the race in httptest. Those look like other issues.

@nikhiljindal nikhiljindal added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jan 26, 2016
@lavalamp
Member

@nikhiljindal any more thoughts considering that whatever this was, #19458 didn't fix it?

@nikhiljindal
Contributor

No. Maybe @wojtek-t has a guess on any recent change that could have affected this.

@lavalamp
Member

@nikhiljindal it looks like it is stuck here: https://github.com/kubernetes/kubernetes/blob/master/pkg/storage/watch_cache.go#L199

That doesn't look like the httptest.Close race at all to me. It may not be the case that anything has changed here recently. I'm worried that we might have a rare deadlock; that locking is tricky.

Have you tried running with the 'stress' program @krousey told us about? Maybe the entire test suite needs to be run, maybe another test in it messes something up? It would be good to figure this out.

@nikhiljindal
Contributor

I ran

godep go test -c -race ./pkg/storage/etcd/
stress ./etcd.test --test.run=TestList

and it hasn't failed in the last 1k runs. Will run it a bit longer and try running the whole suite as well (to see if some other test is affecting this).

@lavalamp
Member

Thanks for trying!

@nikhiljindal
Contributor

I ran it more over the weekend

Just the TestList test:

stress ./etcd.test --test.run=TestList
523421 runs so far, 0 failures

The whole suite:

stress ./etcd.test
99082 runs so far, 0 failures

So the test seems to have been fixed for sure.
We just don't know how it got fixed :)

@lavalamp
Member

lavalamp commented Feb 1, 2016

OK. I think your desktop is not going to be able to reproduce the failures, but I'm not convinced another system won't... ;) Let's close this, and if we see another instance we will just have to put our detective hats on and study the log/stack trace.

@lavalamp lavalamp closed this as completed Feb 1, 2016
soltysh added a commit to soltysh/kubernetes that referenced this issue Feb 11, 2016
golang/go#12262 . See kubernetes#19254 for
more details. This change should be reverted when we upgrade
to Go 1.6.
zhouhaibing089 added a commit to zhouhaibing089/kubernetes that referenced this issue Apr 25, 2016
k8s-github-robot pushed a commit that referenced this issue Apr 28, 2016
Automatic merge from submit-queue

Uncomment the code that caused by #19254

Fix #24546.

@lavalamp
chrislovecnm pushed a commit to chrislovecnm/kubernetes that referenced this issue Apr 28, 2016
k8s-github-robot pushed a commit that referenced this issue Dec 15, 2017
Automatic merge from submit-queue (batch tested with PRs 56308, 54304, 56364, 56388, 55853). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

httptest server should be close since Close issue has been fixed

**What this PR does / why we need it**:
Per #19254, the issue seems to have been fixed for a long time and `server.Close` is no longer an issue in the Go versions currently in use, so it's time to uncomment the server.Close().

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
None
**Special notes for your reviewer**:

**Release note**:

```release-note
None
```

9 participants