Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCE: Delete cluster will also delete the DNS entries created by kubernetes #8250

Conversation

mccare
Copy link
Contributor

@mccare mccare commented Jan 2, 2020

Kops delete cluster will now also delete the GCE DNS entries.

GCE DNS deletion is done by creating a Change object with resource record deletions/additions for one zone. So the implementation is using the GroupDeleter and the GroupKey.

For deletion the zone.name was needed and was not present in the dns.ResourceRecordSet. So I passed it along as the name of the ResourceRecord to be deleted.

Any hints or pointers on how to write tests for this change very much appreciated. Works in my use case (example cluster on GCE).

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 2, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @mccare. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 2, 2020
@mccare
Copy link
Contributor Author

mccare commented Jan 2, 2020

/assign @geojaz

@geojaz
Copy link
Member

geojaz commented Jan 3, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 3, 2020
geojaz
geojaz previously requested changes Jan 3, 2020
Copy link
Member

@geojaz geojaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left comments- this looks like we're moving in a useful direction, but I think it brings up a bunch of questions that I'm not yet sure how to answer...


zoneResponse, err := d.gceCloud.NameService().ManagedZones.List(d.gceCloud.Project()).Do()
if err != nil {
return nil, fmt.Errorf("error getting GCE DNS zones %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels copypasta'd from AWS. I would prefer to s/getting GCE DNS zones/listing Cloud DNS zones (GCP Cloud DNS zones?) or similar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the aws implementation, and I was wrong about copypasta, but my comment stands :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure what you mean, the naming of NameService() to CloudDNS()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was a nit. I would update the Error to error listing Cloud DNS zones.

pkg/resources/gce/gce.go Show resolved Hide resolved
continue
}

if !strings.HasSuffix(record.Name, "."+d.clusterName+".") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing some interpolation here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless, the logic feels a bit arbitrary to me. If I'm reading this correctly, you're basically trying to figure out if Cloud DNS records are records that we care about (and thus would need to delete). Would it be useful to pull out the logic to figure out if a given record is one that we should be cleaning up after into a helper?

One major thing that I think we're not yet considering, but was eluded to in the comments of the faux-original implementation (commented out) is the concept of "shared" ownership; ie: assuming there is a record in cloud dns that we think we/kops should own, do we really own it? In this case, i'm not trying to poke holes in your implementation, i'm trying to figure out how to clean up after ourselves... I don't have an answer or suggestion right now.

Copy link
Contributor Author

@mccare mccare Jan 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically the implementation was taken from the AWS implementation.

I've refactored some:

  • pulled out the method if a DNS name is kops managed into the clusterDiscoveryGCE (Best would be if this would be a general method for all providers )
  • Only api, api.internal and bastion. will be deleted.
    • Master-public-name which can be set at the command line will not be deleted. I would need access to the Cloudspec which is not accessible from here
    • etcd seems obsolete since the implementation of etcd-manager (see buildPrecreateDNSHostnames in dns.go)

I've readded the check for gossip host name, that was in the previous comments. As far as I understood the previous comments, they were mentioning shared zones. The records we delete are created by kubernetes. So IMHO they can't be really shared (only one A record per name allowed) and can be deleted.

This is not to say that the code may not delete a record e.g. in a private only zone with the same DNS name as the public zone. But from my tests I cannot create a cluster where there is a private a public zone with the same DNS name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not to say that the code may not delete a record e.g. in a private only zone with the same DNS name as the public zone. But from my tests I cannot create a cluster where there is a private a public zone with the same DNS name.

I think we can probably not worry about shared ownership of Cloud DNS records, but I think it's possible to end up in a place where you have a shared Zone. Whether or not kops supports that right now for gce is an open question.

For zones, would it be possible to add a label like we have in many AWS resources? I was thinking that if a zone was specified that already existed, we could add label: "kubernetes.io/cluster/" + clusterName = "shared". Then we could skip deletion of that zone.

This seems like a great place for some tests to make sure that we're cleaning up when we need to and not deleting shared resources, but I'm not sure which route to follow without doing some more thinking. @rifelpet Do you have any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using labels sounds optimal. Indeed for AWS resources we dont want kops to manage, kops supports detecting a shared label/tag and if it is present, it wont delete the resource. In AWS it is mostly around networking (VPC, subnets, etc) but I think it makes sense to use it for DNS zones as well.

@mccare mccare force-pushed the gce-delete-dns-entries-for-cluster-delete branch 5 times, most recently from 2f9a26b to 932be2b Compare January 3, 2020 12:26
@mccare mccare requested a review from geojaz January 3, 2020 12:43
@mccare mccare force-pushed the gce-delete-dns-entries-for-cluster-delete branch from 932be2b to 8f9e455 Compare January 3, 2020 15:56
@mccare mccare force-pushed the gce-delete-dns-entries-for-cluster-delete branch from 8f9e455 to 956d4b5 Compare January 14, 2020 18:52
@geojaz geojaz dismissed their stale review March 10, 2020 01:25

changes addressed


for _, record := range r {
r := record.Obj.(*clouddns.ResourceRecordSet)
zoneName = record.Name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the groupKey would be great here


if d.isKopsManagedDNSName(record.Name) {
resource := resources.Resource{
Name: zone.Name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is shown to the user, so it might be nice to keep this as record.Name

@justinsb
Copy link
Member

justinsb commented Apr 7, 2020

Thanks @mccare this looks great. I had one idea about using the group key for the zone name, but I'm going to experiment in a follow-on PR (and will cc you if it works!).

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 7, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, mccare

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 7, 2020
@justinsb justinsb force-pushed the gce-delete-dns-entries-for-cluster-delete branch from 956d4b5 to 765fa36 Compare April 7, 2020 14:37
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 7, 2020
@rifelpet
Copy link
Member

rifelpet commented Apr 7, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 7, 2020
@k8s-ci-robot k8s-ci-robot merged commit f14ce65 into kubernetes:master Apr 7, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Apr 7, 2020
justinsb added a commit to justinsb/kops that referenced this pull request Apr 8, 2020
Use the GroupKey to pass the zone name, meaning the name can be the
user-facing value.

Follow-on to kubernetes#8250
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants