Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCLOUD: Fix issue on creating/deleting/updating TXT records #3011

Merged
merged 3 commits into from
Jun 17, 2024

Conversation

ingwarsw
Copy link
Contributor

@ingwarsw ingwarsw commented Jun 14, 2024

Fixes the issue with updating/deleting TXT (maybe others too) record types from GCLOUD.

#1: + CREATE domain.com TXT "some_value" ttl=3600
FAILURE! runChange error: googleapi: Error 412: Precondition not met for 'entity.change.deletions[domain.com.][TXT]', conditionNotMet

Under the hood DNS had TXT resources that was not sorted and diff2.ByRecordSet returns changes that have sorted values so we tried to send request with proper values but sorted differently, and seems like google is picky about ordering.
Maybe it was possible to pass compFunc but I didnt get how to do that..
Taking the original records was way easier, and should always return proper order.

Easiest reproduction case.
Add any TXT record with entries not sorted alphabetically (manually), and do any change/delete via dnscontrol.

@ingwarsw ingwarsw changed the title Fix issue on deleting TXT records on GCLOUD GCLOUD: Fix issue on deleting/updating TXT records Jun 14, 2024
@ingwarsw ingwarsw changed the title GCLOUD: Fix issue on deleting/updating TXT records GCLOUD: Fix issue on creating/deleting/updating TXT records Jun 14, 2024
@cafferata
Copy link
Collaborator

Ping @riyadhalnur, the maintainer of the Google Cloud DNS provider.

@riyadhalnur
Copy link
Contributor

@ingwarsw thanks for the PR. I'll review this soon

@tlimoncelli
Copy link
Contributor

Interesting bug! Do you think you could construct an integration test that detects this situation?

Hey, @blackshadev! You're my ordering guru. Any thoughts?

@ingwarsw
Copy link
Contributor Author

Interesting bug! Do you think you could construct an integration test that detects this situation?

I dont think its possible..
GCLOUD provider always sets txt records in proper order, so whatever order I will provide it will create them under the hood sorted..
Only possible way would be to add step direct_tc or whatever that would just set it with other means exactly as we want it..

@blackshadev
Copy link
Contributor

I don't fully understand the problem here. I tried deduce the problem from the code and description. But it really doesn't come through to me. Could you maybe provide a record set diff which fails and one which succeed to examplify the issue?

Regarding the ordering let me explain what is possible. DNSControl can order your change sets based on dependencies. But than DNSControl need to know the dependent record. You can take a look at record.go:458, TXT records generally do not have such dependencies. So I am not sure how this would help you.

Second thing to note is the fact that DNSControl sorts change sets only. So the diff2 package ByRecordSet clusters the changes between record set and the ordering happen between the RecordSet clusters, not within the RecordSet. If you need that, you would need another diff2 method.

I hope this information helps you, and you can possible give me an example, maybe we can solve this issue within the DNSControl ordering itself.

@ingwarsw
Copy link
Contributor Author

So to rephrase..
GCLOUD internally have TXT records as just 1 record.
So it have
TXT name ["record1", "record2", "record3"] TTL

And to delete record you need to send EXACTLY same content of the record (including order of rrds).

And DNSControl sorts the rrds here.

So if you have in GCLOUD record thats
TXT some.name. ["a_record", "z_record", "g_record"] 3600
and DNSControl wants to delete it it will send
TXT some.name. ["a_record", "g_record", "z_record"] 3600 from here and GCOULD will throw error, because order of RRDs will be wrong.

But if we will take original order it will pass properly.

@ingwarsw
Copy link
Contributor Author

ingwarsw commented Jun 16, 2024

BTW.. Im not sure why --debug was being deprecated but it would help tremendously here..

Maybe my poor man --debug output, should help you grasp the issue.

from GCLOUD &{dns#resourceRecordSet xxx.eu. <nil> ["m05gb..." "adobe-..." "dz95h..."] [] 3600 TXT {0 map[]} [] []}

changed [{CHANGE {xxx.eu TXT} ["adobe-..." "dz95h..." "m05gb..."] -> ["_uc75... (new record)" "adobe-..." "dz95h..." "m05gb..."] [+ CREATE xxx.eu TXT "_uc75... (new record)" ttl=3600] + CREATE xxx.eu TXT "_uc75... (new record)" ttl=3600 map[] false false}
old ["adobe-..." "dz95h..." "m05gb..."]
orig ["m05gb..." "adobe-..." "dz95h..."]

adds &{dns#resourceRecordSet xxx.eu. <nil> ["_uc75... (new record)" "adobe-..." "dz95h..." "m05gb..."] [] 3600 TXT {0 map[]} [] []}
dels &{dns#resourceRecordSet xxx.eu. <nil> ["adobe-..." "dz95h..." "m05gb..."] [] 3600 TXT {0 map[]} [] []}

And now del throws errors because as you clearly see what we send as del was different as what we got from GCLOUD..
If we use orig as thing to delete then order is the same and it works.

@blackshadev
Copy link
Contributor

Ok, now I understand. GCloud expects the recordset referenced to be exactly the same as how it is reported by itself. In that case, your approach of using the original records seems very logical and valid. We could alter the way DNSControl reorders records within a record set, but let's be honest, if we fix how google expects it, some other provider will break.

One thing that would intrest me: There is a flag to disable sorting: --disableordering , I assume this issue to persist even with ordering disabled, because, as I said earlier, the dnssort part doesn't alter the order within the recordset itself. Could you give it a try. This is just out of curiosity.

@ingwarsw
Copy link
Contributor Author

ingwarsw commented Jun 17, 2024

I dont know what happened but today google returns TXT RRDs always sorted..
Even if I will send them unsorted they will be returned sorted.

So Im not sure if my PR is still needed (I still think its safe to always send what GCLOUD returned)..
Strange things happen around..

@tlimoncelli
Copy link
Contributor

I dont know what happened but today google returns TXT RRDs always sorted.. Even if I will send them unsorted they will be returned sorted.

That's good news! I'm guessing that people complained and they took action.

So Im not sure if my PR is still needed (I still think its safe to always send what GCLOUD returned).. Strange things happen around..

I'm ambivalent.

However, if we do adopt this PR, I recommend a different implementation. The PR currently uses the "existing" record to determine what the order should be. That could have problems if "existing" gets mutated.

Instead, it should record the result of the API call in RecordConfig.Original, then refer to that. It's probably the same data, but .Original is guaranteed to be correct because (1) it is never modified, (2) it is a pointer to the data that we know is correct.

That change would look something like:

  • nativeToRecord() would store a pointer to the RRSet in RecordConfig.Original.
  • When diff2.DELETE needs to know the order of the original, refer to change.Old[0].Original
  • Similar for diff2.CHANGE

A good example to copy is ROUTE53:

@tlimoncelli
Copy link
Contributor

excellent! (and much more simple)

@tlimoncelli tlimoncelli merged commit d55474c into StackExchange:main Jun 17, 2024
1 of 2 checks passed
@tlimoncelli
Copy link
Contributor

P.S. Excellent detective work on this one! Such an obscure bug! I don't know how you figured it out! Great job!

@ingwarsw
Copy link
Contributor Author

Our pipeline had been failing constantly over last few months.. and when everyone was mad enough it happened we just leaved it unapplied and I spend few h trying to debug issue..

@ingwarsw ingwarsw deleted the ing-fix-gcloud branch October 2, 2024 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants