Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change ApplyChanges in RFC2136 to batch update #1164

Merged
merged 2 commits into from Oct 8, 2019
Merged

Change ApplyChanges in RFC2136 to batch update #1164

merged 2 commits into from Oct 8, 2019

Conversation

h3ndrk
Copy link
Contributor

@h3ndrk h3ndrk commented Aug 24, 2019

This changes the ApplyChanges in the RFC2136 provider to issue a batch update to the DNS server instead of issueing an update for each updated record. An update is not sent to the server if there are no changes to send.

Also the RemoveRecord function only issues a RR removal instead of an RRset removal so external-dns is now able to remove distinct records.

Some tests for the RFC2136 provider are fixed and added.

This changes the ApplyChanges in the RFC2136 provider to issue a batch update
to the DNS server instead of issueing an update for each updated record. An
update is not sent to the server if there are no changes to send.

Also the RemoveRecord function only issues a RR removal instead of an RRset
removal so external-dns is now able to remove distinct records.
This fixes and adds some tests for the RFC2136 provider. The SendMessage stub
function is improved with better DNS message parsing. Also a new test is
introduced which simulates a single record with multiple targets. Also the test
for ApplyChanges is fixed for the new batch updating.
@k8s-ci-robot
Copy link
Contributor

Welcome @NIPE-SYSTEMS!

It looks like this is your first PR to kubernetes-incubator/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-incubator/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 24, 2019
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 24, 2019
Copy link
Member

@njuettner njuettner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 8, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: njuettner

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 8, 2019
@k8s-ci-robot k8s-ci-robot merged commit 85274ac into kubernetes-sigs:master Oct 8, 2019
@lkusnadi
Copy link

Is there an amendment for a use case where I need to control DNS outside external DNS but I want to use that DNS for my deployment? Example:
www.domain.com is created by AWS cloudformation. www.domain.com is an RRSet with CNAME to cs1.prod.domain.com and cs2.prod.domain.com. I put weight of either 0 or 100 on either CNAME. Why I want to do this? cs1.prod.domain.com is cluster 1 and cs2.prod is cluster 2. I need to be able to migrate the entire kube cluster without downtime by re-creating the cluster 2, deploy everything, then use www.domain.com as floating dns to direct the traffic when cluster 2 is ready.
As a result, I need to define www.domain.com in my ingress so that traffic with HOST header www.domain.com can be recognized when it's going into the cluster. Surely, external DNS will complain because www.domain.com is owned by cloudformation. In Batch Update, this hogs everything, including the record set that is not conflicting. Thanks.

@h3ndrk
Copy link
Contributor Author

h3ndrk commented Jan 20, 2020

I'm sorry, but I don't get your point entirely, where the problem is.

When I remember correctly (correct me if I'm wrong), external-dns manages its own state (so it stores all RRs itself). When it detects that its state differs from the plugin's state, it calls AppleChanges() and passes the changes. The changes consist of individual RRs, I think.

Before this PR, the RFC2136 plugin did non-batched updates, that means, it sends updates whenever it gets individual RRs. This is problematic because it is not atomic. A partial change may lead to an inconsistent update. This is what I experienced in my cluster (some updates had gone through, some had been dropped) and therefore created this PR.

When looking closer at the code, the removal of RRs seems not correct. As I said, if I'm correct, external-dns passes individual RRs but the prior ApplyChanges() did not remove individual RRs but the whole RRSet. This means if external-dns wants to remove only some RRs the RFC2136 plugin just removed the whole RRSet which is not correct and was therefore changed with this PR.

Now with my reasoning, can you point out problems with this?

@lkusnadi
Copy link

Thanks h3ndrk, I get the point. What you're doing make sense. However, I think this is an implementation that I use, which is trying to figure out how to rollover the entire cluster. I am using floating DNS to control the traffic into which cluster during the rollover.
Unfortunately when using this patch, I get:
level=error msg="InvalidChangeBatch: [RRSet of type A with DNS name messaging.prod.domain.io. is not permitted because a conflicting RRSet of type CNAME with the same DNS name already exists in zone staging.domain.io., ... 27 more exception(s).]\n\tstatus code: 400, request id: fd75a01d-b70d-47ba-95bd-ec34abed4103"
There are lots of update that is also batched into the same request (I guess) and it cannot proceed because of the conflict.
As pointed out before (in my specific case):
messaging.prod.domain.io is cloudformation and has no corresponding TXT, but messaging.prod.domain.io and messaging-cs1.prod.domain.io is defined in ingress host in cluster 1. Also in cluster 2, ingress host has messaging.prod.domain.io and messaging-cs2.prod.domain.io. I use zalando aws ingress controller with skipper. I need messaging.prod.domain.io in the ingress host so that it can be discovered by skipper and passed to the correct backend.

Without this patch, I don't get the conflict. I think it's due to external dns does not attempt to try updating the record set because it doesn't have the corresponding TXT.

Let me know if it's not clear :)

@h3ndrk
Copy link
Contributor Author

h3ndrk commented Jan 20, 2020

Okay, thanks for the clarifications. For me this is out of my scope since I've neither used Zalando AWS ingress nor Skipper, so can't help you in your particular setup.

You could wait for an answer of the maintainers or post a fresh issue with a reference to your explanations here.

@richstokes
Copy link

richstokes commented Apr 13, 2020

This breaks external-dns for anyone who manages specific CNAMES outside of their cluster, but still needs to be able to route traffic for a hostname should it arrive.

For example when you had a CNAME configured outside of the cluster that may route traffic into it for failover or latancy-routing reasons. See #1517

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants