Cluster reconciliation should reconcile exited ELB #53

ashish-amarnath · 2019-06-30T06:53:05Z

get only running ELB containers
remove exited ELB containers with name conflict

Signed-off-by: Ashish Amarnath ashish.amarnath@gmail.com

What this PR does / why we need it:
Cluster reconciliation should reconcile deleted ELBs

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #52

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

NONE

chuckha

@ashish-amarnath What do you think about cleaning up the load balancer on cluster delete so that we don't have to clean up exited containers on create?

I'm curious how you're ending up with exited containers in the first place -- what sort of workflow are you using?

ashish-amarnath · 2019-07-01T19:58:13Z

@chuckha I like the cleaning up on cluster delete. But I think we need to keep the container prune logic in SetupLoadbalancer
About how I am ending up w/ exited containers... I am being nasty and killing them.
Opened an issue #52 w/ repro

amy

This is out of scope for this PR but do we have any tests that can verify this change?

ashish-amarnath · 2019-07-01T20:05:06Z

@amy Yes, we should have a good testing story for capd as a whole. This scenario, in particular, is kinda hard to test via automated tests. The repro steps for this is pretty much me doing bad things on purpose.

chuckha · 2019-07-01T20:05:31Z

interesting...

i'd really like to find a balance here between readability and robustness.

CAPD doesn't have to be robust as it will never be a production system and should explicitly be designed to not be a robust system and more to be a general example of cluster-api assumptions.

So I'm thinking about what is the equivalent use case in cluster-api-provider-aws -- if we stop a node I don't think even CAPA is smart enough to clean up the old stopped node.

Because of that I think this logic should move into to the cluster delete and the SetupLoadBalancer should assume the environment is clean. Does that seem reasonable to you?

ashish-amarnath · 2019-07-01T20:09:08Z

@chuckha drawing comparison w/ capa.. this is equivalent to deleting the ELB fronting the API server. At which point, IIRC, capa controller will reconcile the deleted ELB. No?

chuckha · 2019-07-01T20:20:55Z

That's a good comparison, certainly more accurate than mine. There seems to be no equivalent state between aws and docker with regards to killing the container. A stopped container == what in AWS world? I would suggest that instead of cleaning up before creating that we instead add --rm to the load balancer set up. EDIT: actually after my second paragraph i don't even agree with this, i'd suggest using docker rm -f instead of docker kill to simulate a failure.

The only use case this particular code is covering is adding support for the docker kill command when really the only delete operation we support is docker rm (-f). Do you see a use case for supporting docker kill?

ashish-amarnath · 2019-07-01T20:30:21Z

A stopped container == what in AWS world?

Depends on which container you stop. Stopping the nginx (loadbalancer) container == killing the ELB. Stopping the kind node container == deleting the EC2 instance. There is also a scenario where you stop the EC2 instance. In that case, capa controller will recreate it. At least based on https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/master/pkg/cloud/aws/services/ec2/instances.go#L107-L111
that is what I would expect.

chuckha · 2019-07-01T20:51:04Z

ok, I think I see the use case now. The outlined use case for this is simulating a stopped load balancer. This situation doesn't exist in AWS since you cannot stop and ELB, but it does exist in docker. Do I have it right?

ashish-amarnath · 2019-07-01T21:07:45Z

The outlined use case for this is simulating a stopped load balancer.

This is correct.

This situation doesn't exist in AWS since you cannot stop and ELB

In AWS you can delete the ELB from the AWS console and the capa cluster actuator will recreate it.

chuckha

/approve

Thank you for clarifying the use case regarding this situation, much appreciated

k8s-ci-robot · 2019-07-01T21:15:39Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashish-amarnath, chuckha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [chuckha]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

chuckha · 2019-07-02T17:12:16Z

/lgtm

on amy's behalf

detiber · 2019-07-02T18:09:18Z

/hold

detiber

Overall, lgtm, but we should be using logging now that #27 has merged.

detiber · 2019-07-02T18:09:53Z

actuators/actuators.go

@@ -40,9 +44,11 @@ func getRole(machine *clusterv1.Machine) string {
 }

 func getExternalLoadBalancerNode(clusterName string) (*nodes.Node, error) {
+	fmt.Printf("Getting external load balancer node for cluster %q\n", clusterName)


These fmt.Printfs should be replaced with log statements now that #27 has merged.

#27 added loggers into the clusterActuator and machineActuators. This function is not in either of those scopes. So, we'll need to use klog or something like that.
Looks like the build also failed because of undefined: fmt

the interim step is to pass a log through to all of these functions and rely on the logger interface. we should not use klog anywhere.

we should not use klog anywhere.

+1
always pass a logger interface and not the logging implementation directly, whether it's fmt or klog.

* get only running ELB containers * remove exited ELB containers with name conflict Signed-off-by: Ashish Amarnath <ashish.amarnath@gmail.com>

chuckha · 2019-07-03T14:36:32Z

/lgtm

@detiber you ok removing the hold?

detiber · 2019-07-03T14:39:24Z

/hold cancel

k8s-ci-robot requested review from detiber and vincepri June 30, 2019 06:53

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 30, 2019

ashish-amarnath force-pushed the reconcile-deleted-elb branch from 9fb70dd to 486939c Compare June 30, 2019 06:59

chuckha reviewed Jul 1, 2019

View reviewed changes

amy approved these changes Jul 1, 2019

View reviewed changes

chuckha approved these changes Jul 1, 2019

View reviewed changes

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 1, 2019

k8s-ci-robot assigned chuckha Jul 2, 2019

k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 2, 2019

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 2, 2019

detiber reviewed Jul 2, 2019

View reviewed changes

Cluster reconciliation should reconcile exited ELB

3aa185e

* get only running ELB containers * remove exited ELB containers with name conflict Signed-off-by: Ashish Amarnath <ashish.amarnath@gmail.com>

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 3, 2019

ashish-amarnath force-pushed the reconcile-deleted-elb branch from e43c100 to 3aa185e Compare July 3, 2019 06:30

k8s-ci-robot removed the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 3, 2019

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 3, 2019

use logr instead of fmt printfs

603fffb

k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 3, 2019

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 3, 2019

k8s-ci-robot merged commit 1064568 into kubernetes-retired:master Jul 3, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cluster reconciliation should reconcile exited ELB #53

Cluster reconciliation should reconcile exited ELB #53

ashish-amarnath commented Jun 30, 2019

chuckha left a comment

ashish-amarnath commented Jul 1, 2019

amy left a comment

ashish-amarnath commented Jul 1, 2019

chuckha commented Jul 1, 2019

ashish-amarnath commented Jul 1, 2019

chuckha commented Jul 1, 2019 •

edited

ashish-amarnath commented Jul 1, 2019 •

edited

chuckha commented Jul 1, 2019

ashish-amarnath commented Jul 1, 2019

chuckha left a comment

k8s-ci-robot commented Jul 1, 2019

chuckha commented Jul 2, 2019

detiber commented Jul 2, 2019

detiber left a comment

detiber Jul 2, 2019

ashish-amarnath Jul 2, 2019

chuckha Jul 2, 2019

neolit123 Jul 2, 2019 •

edited

ashish-amarnath Jul 3, 2019

chuckha commented Jul 3, 2019

detiber commented Jul 3, 2019

Cluster reconciliation should reconcile exited ELB #53

Cluster reconciliation should reconcile exited ELB #53

Conversation

ashish-amarnath commented Jun 30, 2019

chuckha left a comment

Choose a reason for hiding this comment

ashish-amarnath commented Jul 1, 2019

amy left a comment

Choose a reason for hiding this comment

ashish-amarnath commented Jul 1, 2019

chuckha commented Jul 1, 2019

ashish-amarnath commented Jul 1, 2019

chuckha commented Jul 1, 2019 • edited

ashish-amarnath commented Jul 1, 2019 • edited

chuckha commented Jul 1, 2019

ashish-amarnath commented Jul 1, 2019

chuckha left a comment

Choose a reason for hiding this comment

k8s-ci-robot commented Jul 1, 2019

chuckha commented Jul 2, 2019

detiber commented Jul 2, 2019

detiber left a comment

Choose a reason for hiding this comment

detiber Jul 2, 2019

Choose a reason for hiding this comment

ashish-amarnath Jul 2, 2019

Choose a reason for hiding this comment

chuckha Jul 2, 2019

Choose a reason for hiding this comment

neolit123 Jul 2, 2019 • edited

Choose a reason for hiding this comment

ashish-amarnath Jul 3, 2019

Choose a reason for hiding this comment

chuckha commented Jul 3, 2019

detiber commented Jul 3, 2019

chuckha commented Jul 1, 2019 •

edited

ashish-amarnath commented Jul 1, 2019 •

edited

neolit123 Jul 2, 2019 •

edited