
WIP: ServiceType & NodePort work #8707

Merged: 14 commits into kubernetes:master on May 22, 2015

Conversation

justinsb
Member

Integrating all the PRs into a clean, mergeable history. Mostly creating this PR for testing purposes.

A service with a NodePort set will listen on that port, on every node.

This is both handy for some load balancers (AWS ELB) and for people
that want to expose a service without using a load balancer.
This will replace publicIPs.
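
Since every node listens on the NodePort, a quick way to sanity-check the behaviour from outside the cluster is to hit any node's address on that port, which is essentially what the e2e tests below do. A minimal Go sketch (the node address and port here are placeholders, not values from this PR):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Placeholder node address and NodePort; substitute values from your cluster.
	nodeAddr := "203.0.113.10"
	nodePort := 30080

	url := fmt.Sprintf("http://%s:%d", nodeAddr, nodePort)
	client := &http.Client{Timeout: 5 * time.Second}

	// Any node should answer on the NodePort, regardless of where the
	// backing pods are actually scheduled.
	resp, err := client.Get(url)
	if err != nil {
		fmt.Printf("NodePort not reachable via %s: %v\n", url, err)
		return
	}
	defer resp.Body.Close()
	fmt.Printf("NodePort reachable via %s (status %s)\n", url, resp.Status)
}
```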
@thockin
Member

thockin commented May 22, 2015

LGTM so far

@thockin
Member

thockin commented May 22, 2015

LGTM

@thockin added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on May 22, 2015
dchen1107 added a commit that referenced this pull request May 22, 2015
WIP: ServiceType & NodePort work
@dchen1107 merged commit 8d6d03b into kubernetes:master on May 22, 2015
@dchen1107
Member

I am monitoring Jenkins and will update the e2e status.

@thockin
Member

thockin commented May 22, 2015

Thanks Dawn and Justin! This is a huge improvement.

@dchen1107
Member

@justinsb and @thockin My Jenkins run with this PR failed. The run actually isn't finished yet, but I have already observed several service test failures:

• Failure [422.282 seconds]
Services
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:824
  should be able to create a functioning external load balancer [It]
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:313

  Expected error:
      <*url.Error | 0xc20859e0f0>: {
          Op: "Get",
          URL: "http://146.148.61.219:31852",
          Err: {
              Op: "dial",
              Net: "tcp",
              Addr: {
                  IP: "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\x92\x94=\xdb",
                  Port: 31852,
                  Zone: "",
              },
              Err: 0x6e,
          },
      }
      Get http://146.148.61.219:31852: dial tcp 146.148.61.219:31852: connection timed out
  not to have occurred
Services 
  should release the load balancer when Type goes from LoadBalancer -> NodePort
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:601
[BeforeEach] Services
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:59
>>> testContext.KubeConfig: /var/lib/jenkins/jobs/kubernetes-e2e-gce/workspace/.kube/config
STEP: Building a namespace api objects
[It] should release the load balancer when Type goes from LoadBalancer -> NodePort
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:601
STEP: creating service service-release-lb with type LoadBalancer
STEP: creating pod to be part of service service-release-lb
INFO: Waiting up to 5m0s for pod service-release-lb-1 status to be running
INFO: Waiting for pod 'service-release-lb-1' in namespace 'e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e' status to be '"running"' (found phase: '"Pending"', readiness: false) (12.499163ms)
INFO: Waiting for pod 'service-release-lb-1' in namespace 'e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e' status to be '"running"' (found phase: '"Pending"', readiness: false) (5.019461744s)
STEP: waiting up to 4m0s for service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e to have a LoadBalancer ingress point
INFO: Waiting for service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e to have a LoadBalancer ingress point (3.13453ms)
INFO: Waiting for service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e to have a LoadBalancer ingress point (5.007810069s)
INFO: Waiting for service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e to have a LoadBalancer ingress point (10.012672288s)
INFO: Waiting for service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e to have a LoadBalancer ingress point (15.017706187s)
STEP: hitting the pod through the service's NodePort
STEP: Checking reachability of http://146.148.61.219:30531
STEP: Got error waiting for reachability of http://146.148.61.219:30531: Get http://146.148.61.219:30531: dial tcp 146.148.61.219:30531: connection timed out
STEP: Got error waiting for reachability of http://146.148.61.219:30531: Get http://146.148.61.219:30531: dial tcp 146.148.61.219:30531: connection timed out
STEP: Got error waiting for reachability of http://146.148.61.219:30531: Get http://146.148.61.219:30531: dial tcp 146.148.61.219:30531: connection timed out
STEP: deleting pod service-release-lb-1 in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e
STEP: deleting service service-release-lb in namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e
[AfterEach] Services
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:68
STEP: Destroying namespace e2e-tests-service-0-0c5601f0-ea63-47c3-a6e1-970c809f049e
STEP: Destroying namespace e2e-tests-service-1-5305926b-2861-4060-a7bd-69744938293a

• Failure [427.308 seconds]
Services
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:824
  should release the load balancer when Type goes from LoadBalancer -> NodePort [It]
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:601

  Expected error:
      <*url.Error | 0xc20859e0f0>: {
          Op: "Get",
          URL: "http://146.148.61.219:30531",
          Err: {
              Op: "dial",
              Net: "tcp",
              Addr: {
                  IP: "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\x92\x94=\xdb",
                  Port: 30531,
                  Zone: "",
              },
              Err: 0x6e,
          },
      }
      Get http://146.148.61.219:30531: dial tcp 146.148.61.219:30531: connection timed out
  not to have occurred

  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:1048

@justinsb
Member Author

@dchen1107 this is on GCE right? Looking right now...

@justinsb
Member Author

e2e issues should be fixed by #8728; I'm just running it through a -up / -test / -down cycle.

@dchen1107
Member

@justinsb Thanks for the fix. I, @lavalamp, or @quinton-hoole will update you with the Jenkins result tomorrow morning. Sometimes I have trouble accessing our Jenkins through the VPN too.

@lavalamp
Member

@justinsb @thockin Still have one consistent failure:

Shell tests that services.sh passes

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/shell.go:39
Error running /jenkins-master-data/jobs/kubernetes-e2e-gce/workspace/kubernetes/hack/e2e-suite/services.sh: exit status 1
Command output:
Project: kubernetes-jenkins
Zone: us-central1-f
MINION_NAMES=e2e-test-jenkins-minion-xkzc e2e-test-jenkins-minion-xq2i
Found e2e-test-jenkins-minion-xkzc at 199.223.233.169
Found e2e-test-jenkins-minion-xq2i at 130.211.132.159
Starting service 'service-14074' on port 80 with 3 replicas
replicationcontrollers/service-14074
services/service-14074
Starting service 'service-342' on port 80 with 3 replicas
replicationcontrollers/service-342
services/service-342
Querying pods in service-14074
service-14074-1oi4d
service-14074-qnhqv
service-14074-yfmck
Waiting for 3 pods to become 'running'
Waiting for 3 pods to become 'running'
Querying pods in service-342
service-342-20lyi
service-342-gmed5
service-342-lps30
Waiting for 3 pods to become 'running'
Waiting for 2 pods to become 'running'
Test 1: Prove that the service portal is alive.
Verifying the portals from the host
Checking if service-14074-1oi4d service-14074-qnhqv service-14074-yfmck  == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Checking if   == service-14074-1oi4d service-14074-qnhqv service-14074-yfmck 
Waiting for endpoints to propagate
Endpoints did not propagate in time
Stopping service 'service-14074'
replicationcontrollers/service-14074
services/service-14074
Stopping service 'service-342'
replicationcontrollers/service-342
services/service-342

@lavalamp
Member

These two tests have occasional flakes:

  • Services should release the load balancer when Type goes from LoadBalancer -> NodePort
  • Services should correctly serve identically named services in different namespaces on different external IP addresses

@lavalamp
Member

I think we can live with this, if @justinsb will fix the services.sh test. :)

@justinsb
Member Author

Will definitely fix - sorry!

If you can provide me with an output from one of the flakes, I can have a look at those as well.

@lavalamp
Member

Hm there are actually a number of flakes in this most recent run:

Services should release the load balancer when Type goes from LoadBalancer -> NodePort

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:602
Expected error:
    <*errors.errorString | 0xc208c90750>: {
        s: "service service-release-lb in namespace e2e-tests-service-0-af873fc2-22d6-45fd-9e2c-493b24167e65 doesn't have a LoadBalancer ingress point after 240.00 seconds",
    }
    service service-release-lb in namespace e2e-tests-service-0-af873fc2-22d6-45fd-9e2c-493b24167e65 doesn't have a LoadBalancer ingress point after 240.00 seconds
not to have occurred


Services should correctly serve identically named services in different namespaces on different external IP addresses

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:824
Expected error:
    <*errors.errorString | 0xc209116060>: {
        s: "service s0 in namespace e2e-tests-service-0-8397355a-319a-4d3a-bff9-04bc786a7fbe doesn't have a LoadBalancer ingress point after 240.00 seconds",
    }
    service s0 in namespace e2e-tests-service-0-8397355a-319a-4d3a-bff9-04bc786a7fbe doesn't have a LoadBalancer ingress point after 240.00 seconds
not to have occurred


Services should be able to create a functioning external load balancer

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:313
Expected error:
    <*errors.errorString | 0xc2091160a0>: {
        s: "service external-lb-test in namespace e2e-tests-service-0-16ce706c-478a-4768-9bf7-e2ea5a169d08 doesn't have a LoadBalancer ingress point after 240.00 seconds",
    }
    service external-lb-test in namespace e2e-tests-service-0-16ce706c-478a-4768-9bf7-e2ea5a169d08 doesn't have a LoadBalancer ingress point after 240.00 seconds
not to have occurred


Services should be able to change the type and nodeport settings of a service

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:521
Expected error:
    <*errors.errorString | 0xc208e47ef0>: {
        s: "service mutability-service-test in namespace e2e-tests-service-0-7e7ff24f-4523-4099-aac8-603e874aa107 doesn't have a LoadBalancer ingress point after 240.00 seconds",
    }
    service mutability-service-test in namespace e2e-tests-service-0-7e7ff24f-4523-4099-aac8-603e874aa107 doesn't have a LoadBalancer ingress point after 240.00 seconds
not to have occurred

@justinsb
Member Author

WIP patch (implemented, but running through the test cycle) to continue to support deprecatedPublicIPs is here: #8739

@lavalamp I don't suppose we have hit the quota limit on the number of load balancers allocated? If so, I'll look for a leak here.

@justinsb
Member Author

Status update: I now get a different error from services.sh; it looks like it isn't doing strict round-robin between the nodes. It takes a while for endpoints to propagate, and then "verifying the portals from a container" fails because it sees one of the pods twice:

  service-32506 portal failed from container, expected:

        service-32506-4wije
        service-32506-7ejip
        service-32506-m9efj

  got:

        service-32506-7ejip
        service-32506-7ejip
        service-32506-m9efj

I also have an idea on a potential cause for LB leakage - I made a late change to zero out the LoadBalancerStatus immediately, so I need to double-check the logic around service release here.

Investigating both issues!

@lavalamp
Member

Thanks for working on this, Justin!

@lavalamp
Member

It is possible there is a leak.

@lavalamp
Member

A new flake, possibly entirely because we've leaked load balancers, and I don't know which ones are safe to delete because they're all named with long hex values:

Services should release the load balancer when Type goes from LoadBalancer -> NodePort

/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/service.go:602
Expected error:
    <*errors.errorString | 0xc208fa1f50>: {
        s: "service service-release-lb in namespace e2e-tests-service-0-a3ef5b51-026a-48e1-986c-ba04095e5d41 doesn't have a LoadBalancer ingress point after 240.00 seconds",
    }
    service service-release-lb in namespace e2e-tests-service-0-a3ef5b51-026a-48e1-986c-ba04095e5d41 doesn't have a LoadBalancer ingress point after 240.00 seconds
not to have occurred

@lavalamp
Member

I just deleted all the forwarding rules... Let's see if this makes it green for a bit.

@justinsb
Member Author

The default e2e setup does not configure any forwarding rules. So if this project is only running e2e, it should be safe to delete all the forwarding-rules (but I don't really know your setup). You could also check that they have no attached instances.

I've also yet to see a leak when running it myself (at the end of the test, gcloud compute forwarding-rules list is empty).

The hex string is the UUID of the k8s service, with a prefix of "a", I believe.
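
For anyone trying to map leaked GCE resources back to services, here is a hypothetical Go helper reflecting the naming Justin describes above. The "a" prefix comes from his comment; stripping the dashes from the UID is an assumption of this sketch, not something stated in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// lbNameForService illustrates the naming scheme described above: the
// service's UID prefixed with "a". Whether the dashes are also stripped
// is an assumption made for this sketch.
func lbNameForService(serviceUID string) string {
	return "a" + strings.Replace(serviceUID, "-", "", -1)
}

func main() {
	// Example UID borrowed from the namespace names in the logs above,
	// purely for illustration.
	fmt.Println(lbNameForService("0c5601f0-ea63-47c3-a6e1-970c809f049e"))
}
```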

@lavalamp
Member

So deleting all that stuff got us a totally green run. :)

@lavalamp
Member

I think it's actually target pools that are leaking. The forwarding rules seem to get cleaned up.

@lavalamp
Member

So to follow up: it's target pools leaking, from GKE's e2e test run that's using an old version. They share test projects & quotas with our GCE e2e testing.

@thockin
Member

thockin commented May 24, 2015

When I originally wrote the verification code, I seem to recall reading 2x as many tries and then running that through sort and uniq for this reason.
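
A rough sketch of that approach, written here in Go rather than the shell that services.sh actually uses: query the portal more times than there are pods, then compare the sorted, de-duplicated set of responders against the expected pod names.

```go
package main

import (
	"fmt"
	"sort"
)

// uniqueSorted mimics `sort | uniq`: repeated responders collapse to one
// entry, so the check tolerates a proxy that isn't strictly round-robin,
// as long as every pod answers at least once.
func uniqueSorted(responses []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, r := range responses {
		if !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	expected := []string{"service-32506-4wije", "service-32506-7ejip", "service-32506-m9efj"}

	// Pretend we queried the portal twice as many times as there are pods;
	// duplicate answers are expected and harmless.
	responses := []string{
		"service-32506-7ejip", "service-32506-m9efj", "service-32506-4wije",
		"service-32506-7ejip", "service-32506-4wije", "service-32506-m9efj",
	}

	got := uniqueSorted(responses)
	sort.Strings(expected)
	fmt.Println("all pods seen:", fmt.Sprint(got) == fmt.Sprint(expected))
}
```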


@thockin
Member

thockin commented May 24, 2015

@a-robinson in case nobody did


@a-robinson
Contributor

The resource leak from the GKE tests is discussed in #7753 (comment). Tomorrow I'll look into adding the sleep to Jenkins that I mention in the last comment there.

@eparis
Contributor

eparis commented Jun 11, 2015

I just saw this. So publicIPs were deprecated; what were they replaced with here?
