Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent stranding of partial load balancers resources #7852

Merged
3 commits merged into from May 22, 2015

Conversation

a-robinson
Copy link
Contributor

Will likely fix #7753, although the proof is in the pudding. Explanation of what this is doing in the comment there.

@quinton-hoole

@mbforbes
Copy link
Contributor

mbforbes commented May 6, 2015

+cc @smarterclayton (openshift)

@mbforbes mbforbes assigned ghost May 6, 2015
@a-robinson
Copy link
Contributor Author

openstack != openshift, Max :)

cc @anguslees for openstack change

@a-robinson
Copy link
Contributor Author

Passes LB e2es consistently

@mbforbes
Copy link
Contributor

mbforbes commented May 6, 2015

How embarrassing... that's what I get for attempting to triage during a meeting :-)

@smarterclayton
Copy link
Contributor

:) I'd like to pretend like some marketing guy probably didn't anticipate that when the name was chosen, but I know better ;)

----- Original Message -----

How embarrassing... that's what I get for attempting to triage during a
meeting :-)


Reply to this email directly or view it on GitHub:
#7852 (comment)

@@ -66,7 +66,8 @@ type TCPLoadBalancer interface {
CreateTCPLoadBalancer(name, region string, externalIP net.IP, ports []int, hosts []string, affinityType api.AffinityType) (string, error)
// UpdateTCPLoadBalancer updates hosts under the specified load balancer.
UpdateTCPLoadBalancer(name, region string, hosts []string) error
// DeleteTCPLoadBalancer deletes a specified load balancer.
// DeleteTCPLoadBalancer deletes a specified load balancer. It should return
// nil if the load balancer specified already didn't exist.
DeleteTCPLoadBalancer(name, region string) error
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest renaming to something like EnsureTCPLoadBalancerDeleted() to better advertise it's idempotence.

@ghost
Copy link

ghost commented May 6, 2015

The main feedback is that you need to retry pool deletion. The other comments are cosmetic, and won't block merging.

@a-robinson
Copy link
Contributor Author

Rebased and fixed all your comments except the pool error one.

return err
// It's ok if the pool doesn't exist, as we may still need to delete the vip
// (although I don't believe the system should ever be in that state).
pool, poolErr := pools.Get(lb.network, vip.PoolID).Extract()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the above returned ErrNotFound, then vip.PoolID will be undefined here.

@ghost
Copy link

ghost commented May 7, 2015

@a-robinson let me know when you're ready for a re-review. Assigning to you until then. It looks like you still need to address some of @anguslees openstack feedback.

Also, it would be good to get someone to test this on openstack before moreging. Perhaps @anguslees can do that for us?

@ghost ghost assigned a-robinson and unassigned ghost May 7, 2015
@a-robinson
Copy link
Contributor Author

Sorry about the delay, I had some other things float to the top of my priority queue this week. I think we should be fine with getting pools by name rather than ID, but have separated that out into issue #8352.

In the meantime, I've modified this PR to effectively maintain current behavior, just returning nil rather than an error if we attempt to delete a VIP that doesn't exist, paralleling what would happen if we first attempted to get load balancer before deleting it.

@a-robinson
Copy link
Contributor Author

@quinton-hoole, are you alright with merging this?

@ghost
Copy link

ghost commented May 18, 2015

Apologies for the delay. I'll review it now. Has this been tested on openstack yet, as per above comment?

@a-robinson
Copy link
Contributor Author

No, I don't believe it has been tested on openstack, but @anguslees seems to be happy with it?

} else if vipErr != nil {
return vipErr
}
vipExists := (vipErr == nil)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that it's possible to get here without vipErr being nil, in which case vipExists is superfluous, right?

@ghost
Copy link

ghost commented May 19, 2015

Just two minor things left to check/fix, as far as I can see.

to load balancers having already been deleted.
controller to rely on that, so that we won't strand partial resources
from them anymore (target pools in GCE, pools in OpenStack, etc.).
@a-robinson
Copy link
Contributor Author

Re-pushed the code with the vipExists logic torn out and TODOs clarified with the openstack issue number.

@ghost ghost added team/cluster priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 19, 2015
@ghost ghost added this to the v1.0-candidate milestone May 19, 2015
@ghost
Copy link

ghost commented May 19, 2015

Have you performed a few runs of the services e2e tests against this code to confirm that:

  1. They pass?
  2. They don't leak GCE load balancer resources?

@roberthbailey roberthbailey removed this from the v1.0-candidate milestone May 19, 2015
@ghost
Copy link

ghost commented May 21, 2015

Can anyone explain why the v1.0 candidate milestone tag was removed from this one? I don't think that we can launch without this, as all clusters that employ external load balancers on GCE will stop working properly within a few days, like our current soak test clusters.

@ghost ghost added this to the v1.0-candidate milestone May 21, 2015
@ghost
Copy link

ghost commented May 21, 2015

Tentatively re-adding v1.0-candidate milestone until I understand why it's not v1.0 worthy.

ghost pushed a commit that referenced this pull request May 22, 2015
Prevent stranding of partial load balancers resources
@ghost ghost merged commit 6d8eaa9 into kubernetes:master May 22, 2015
@a-robinson a-robinson unassigned ghost Aug 12, 2015
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Target pool leak in GCE project running our jenkins e2e tests
6 participants