
e2e flake: "Kubectl client Simple pod should support exec through an HTTP proxy" occasionally hangs #22671

Closed
ixdy opened this issue Mar 7, 2016 · 7 comments
Labels: area/test, kind/flake, priority/important-soon, sig/api-machinery

Comments

ixdy (Member) commented Mar 7, 2016

Background: we've been seeing mysterious e2e timeouts in our parallel Jenkins builds for a while (#20778), but it's been hard to diagnose due to some Ginkgo issues with Jenkins timeouts and logging (onsi/ginkgo#206). We recently reworked how we handle timeouts in Jenkins (#22374), which seems to have helped get Ginkgo to print its log.

It appears that the remote exec call is occasionally hanging, as can be seen from this log snippet:

STEP: Running kubectl in netexec via an HTTP proxy using HTTPS_PROXY
Mar  7 12:47:26.040: INFO: Running '/jenkins-master-data/jobs/kubernetes-pull-build-test-e2e-gce/workspace@2/kubernetes/platforms/linux/amd64/kubectl --server=https://107.178.223.6 --kubeconfig=/var/lib/jenkins/jobs/kubernetes-pull-build-test-e2e-gce/workspace@2/.kube/config create -f /jenkins-master-data/jobs/kubernetes-pull-build-test-e2e-gce/workspace@2/kubernetes/test/images/goproxy/pod.yaml --namespace=e2e-tests-kubectl-7y8ri'
Mar  7 12:47:26.356: INFO: stderr: ""
Mar  7 12:47:26.357: INFO: stdout: "pod \"goproxy\" created"
Mar  7 12:47:26.357: INFO: Waiting up to 5m0s for 1 pods to be running and ready: [goproxy]
Mar  7 12:47:26.357: INFO: Waiting up to 5m0s for pod goproxy status to be running and ready
Mar  7 12:47:26.362: INFO: Waiting for pod goproxy in namespace 'e2e-tests-kubectl-7y8ri' status to be 'running and ready'(found phase: "Pending", readiness: false) (5.557485ms elapsed)
Mar  7 12:47:28.366: INFO: Waiting for pod goproxy in namespace 'e2e-tests-kubectl-7y8ri' status to be 'running and ready'(found phase: "Pending", readiness: false) (2.009344986s elapsed)
Mar  7 12:47:30.370: INFO: Wanted all 1 pods to be running and ready. Result: true. Pods: [goproxy]
Mar  7 12:47:30.373: INFO: About to remote exec: HTTPS_PROXY=http://10.245.3.5:8080 ./uploads/upload370892077 --kubeconfig=/uploads/upload282296323 --server=https://107.178.223.6:443 --namespace=e2e-tests-kubectl-7y8ri exec nginx echo running in container

---------------------------------------------------------
Received interrupt.  Running AfterSuite...
^C again to terminate immediately
Mar  7 13:13:17.961: INFO: Waiting up to 1m0s for all nodes to be ready
STEP: Destroying namespace "e2e-tests-kubectl-7y8ri" for this suite.
Mar  7 13:13:18.175: FAIL: Unable to execute kubectl binary on the remote exec server due to error: an error on the server has prevented the request from succeeding (post pods netexec)
STEP: using delete to clean up resources
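
For context on the mechanism under test: kubectl's HTTP transport resolves proxies through Go's standard environment lookup, so with HTTPS_PROXY set the exec connection to the apiserver is tunneled through the goproxy pod via CONNECT. A minimal stdlib sketch of that resolution (addresses taken from the log above; this is illustration, not the test code):

package main

import (
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Proxy and apiserver addresses from the test log above.
	os.Setenv("HTTPS_PROXY", "http://10.245.3.5:8080")

	req, err := http.NewRequest("POST", "https://107.178.223.6:443/", nil)
	if err != nil {
		panic(err)
	}

	// http.ProxyFromEnvironment is what Go's default transport consults;
	// for an https:// request it returns the HTTPS_PROXY URL, which the
	// transport then reaches through a CONNECT tunnel.
	proxyURL, err := http.ProxyFromEnvironment(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(proxyURL) // prints http://10.245.3.5:8080
}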

The eventual failure message from Ginkgo shows the test had been running for nearly 28 minutes:

• Failure [1661.024 seconds]
Kubectl client
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:1089
  Simple pod
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:509
    should support exec through an HTTP proxy [It]
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:439

    Mar  7 13:13:18.175: Unable to execute kubectl binary on the remote exec server due to error: an error on the server has prevented the request from succeeding (post pods netexec)

    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:410
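
(The indentation in that failure output mirrors the nested Describe/It structure Ginkgo uses for this test in test/e2e/kubectl.go; schematically, not the exact source:)

package e2e

import . "github.com/onsi/ginkgo"

var _ = Describe("Kubectl client", func() {
	Describe("Simple pod", func() {
		It("should support exec through an HTTP proxy", func() {
			// body elided; runs kubectl inside the netexec pod with
			// HTTPS_PROXY pointing at the goproxy pod
		})
	})
})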

Full build log here: http://pr-test.k8s.io/22665/kubernetes-pull-build-test-e2e-gce/32042/build-log.txt

Might be the same as #19997, but I wanted to file a separate issue in case it's not.

cc @ncdc @lavalamp @jlowdermilk @kubernetes/sig-testing

ncdc (Member) commented Mar 7, 2016

More details from the apiserver log:

POST /api/v1/namespaces/e2e-tests-kubectl-7y8ri/pods/netexec/proxy/shell?shellCommand=HTTPS_PROXY%3Dhttp%3A%2F%2F10.245.3.5%3A8080+.%2Fuploads%2Fupload370892077+--kubeconfig%3D%2Fuploads%2Fupload282296323+--server%3Dhttps%3A%2F%2F107.178.223.6%3A443+--namespace%3De2e-tests-kubectl-7y8ri+exec+nginx+echo+running+in+container: (25m47.800231226s) 503
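
The shellCommand query parameter URL-decodes back to exactly the invocation shown in the test log; a quick stdlib sketch of the decoding (illustrative, not part of the test):

package main

import (
	"fmt"
	"net/url"
)

func main() {
	// shellCommand value copied from the apiserver access log above.
	raw := "HTTPS_PROXY%3Dhttp%3A%2F%2F10.245.3.5%3A8080+.%2Fuploads%2Fupload370892077+--kubeconfig%3D%2Fuploads%2Fupload282296323+--server%3Dhttps%3A%2F%2F107.178.223.6%3A443+--namespace%3De2e-tests-kubectl-7y8ri+exec+nginx+echo+running+in+container"

	// QueryUnescape decodes %XX escapes and turns '+' back into spaces,
	// recovering the command the test ran.
	cmd, err := url.QueryUnescape(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cmd)
}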

I have to run now but I'll try to track down anything useful/interesting in the kubelet logs.

wojtek-t (Member) commented Mar 8, 2016

I'm almost sure it's related to #22165
@lavalamp - FYI

ixdy (Member, Author) commented Mar 9, 2016

Yeah, looks like it. Should we apply a workaround like in #22510?

wojtek-t (Member) commented Mar 9, 2016

@ixdy - yeah, we probably should (I think it will change the test from hanging forever to failing outright, but that's strictly better).
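
A minimal sketch of that kind of workaround (not the actual #22510 change; runRemoteExec is a hypothetical stand-in for the e2e helper that issues the exec request):

package main

import (
	"errors"
	"fmt"
	"time"
)

// runRemoteExec stands in for the e2e helper that posts the exec request
// to the netexec pod. Hypothetical, for illustration: it blocks forever
// to simulate the hang seen in this issue.
func runRemoteExec() error {
	select {}
}

// execWithTimeout races the exec call against a deadline, so a server-side
// hang surfaces as a test failure instead of stalling the whole suite.
func execWithTimeout(d time.Duration) error {
	done := make(chan error, 1)
	go func() { done <- runRemoteExec() }()
	select {
	case err := <-done:
		return err
	case <-time.After(d):
		return errors.New("remote exec did not return within deadline")
	}
}

func main() {
	// With a 2s deadline the simulated hang fails fast instead of forever.
	fmt.Println(execWithTimeout(2 * time.Second))
}

The buffered channel lets the exec goroutine finish and be collected even if the deadline fires first.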

lavalamp (Member) commented:

Maybe we have to fix #22165 for real :/

lavalamp added the priority/important-soon and sig/api-machinery labels on Mar 11, 2016
lavalamp (Member) commented:

Maybe @krousey will have time to look at this next week.

piosz (Member) commented Apr 25, 2016

This is still failing. Closing in favor of #24620. Assigning @krousey there.
