GCP error 412: Precondition not met for 'entity.change.deletions[0]', conditionNotMet #467
I thought I should provide more background on my use-case. I have a managed zone in GCP (`training.weave.works`). I spin up a few clusters, where each cluster is called `training-user-<N>`, and the service in each cluster is annotated with:

```yaml
external-dns.alpha.kubernetes.io/hostname: "training-user-<N>.training.weave.works"
external-dns.alpha.kubernetes.io/ttl: "5"
```

So for each cluster I have a DNS record that points at the service inside that cluster. I have configured ExternalDNS like this:

```yaml
- name: external-dns
  image: registry.opensource.zalan.do/teapot/external-dns:v0.4.8
  args:
    - --source=service
    - --source=ingress
    - --policy=upsert-only
    - --provider=google
    - --registry=txt
    - --txt-owner-id=dx-training-external-dns
    - --domain-filter=training.weave.works
    - --google-project=dx-training
```

I wonder whether I should try tweaking any of these settings. Besides, it'd be good to understand why that error happens in the first place, because it didn't occur in my earlier tests.
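For context, a minimal sketch of the kind of Service those annotations would be attached to (the name, selector, and ports are illustrative, not from the thread):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: training-frontend        # illustrative name
  annotations:
    external-dns.alpha.kubernetes.io/hostname: "training-user-1.training.weave.works"
    external-dns.alpha.kubernetes.io/ttl: "5"
spec:
  type: LoadBalancer             # ExternalDNS reads the provisioned LB address
  selector:
    app: training-frontend
  ports:
    - port: 80
      targetPort: 8080
```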
Currently a single ExternalDNS instance is designed to manage a single Kubernetes cluster, similar to e.g. an autoscaler, ingress-controller etc. Therefore, for each of your training clusters you'll want to deploy a dedicated ExternalDNS instance. In a simple world each cluster would have its own dedicated subdomain and you'd use `--domain-filter` so that every attempt to declare a DNS name outside of this domain is ignored. The whole subdomain would be managed by ExternalDNS, hence there'd be no conflicts.

If multiple clusters share the same DNS namespace, the different ExternalDNS instances need to coordinate themselves a bit. This is achieved in two ways:

- `--domain-filter`, which instructs ExternalDNS to ignore desired DNS names that don't end in a particular suffix
- `--txt-owner-id`, which is a view on a DNS domain that hides any existing DNS records that don't belong to this particular instance of ExternalDNS. The goal is that multiple ExternalDNS instances can happily sync their records in the very same DNS domain without removing each other's records. (A multi-tenant DNS zone where ExternalDNS is the tenant, if you will.)

What I would suggest (see the sketch at the end of this comment):

- For each cluster, deploy a dedicated ExternalDNS instance in that cluster
- For each instance, use a different value for `--txt-owner-id`, such as `training-user-<N>`
- For each instance, use `--domain-filter=training.weave.works` (like you did) so that ExternalDNS ignores any annotations stating something else, e.g. `bad.prod.weave.works`

With that setup, users of cluster `training-user-<N>` could still create services with annotations that instruct its ExternalDNS instance to create, e.g., `training-user-<N+1>.training.weave.works`. However, the `--txt-owner-id` at least ensures that either cluster `<N>` or `<N+1>` would manage that record, but never both.

If you want to ensure that even those cases are not possible, you could use a different `--domain-filter` for each ExternalDNS instance. The domain filter is a simple suffix match, so you could use `--domain-filter=-<N>.training.weave.works` for cluster `<N>` and so on. Since this looks a little odd, you may also consider giving each cluster a full subdomain so your domain filter looks more like `--domain-filter=".cluster-<N>.training.weave.works"`.

Finally, it looks to me like your clusters are short-lived, which raises the question of cleanup. If you just terminate your cluster, your DNS records will survive and they will be owned by this particular cluster's ExternalDNS instance, so you will never be able to reuse them in another cluster (they are claimed, and you just terminated the only instance that can unclaim them, besides your manual hands of course). Either delete all Services and Ingresses from your cluster and wait for ExternalDNS to do another synchronization before you terminate it, or manually delete all records that belong to this particular `--txt-owner-id` after you terminated the cluster, to unclaim them.

Regarding the error: afaik, this precondition error occurs when you try to delete a DNS record that doesn't exist. I believe that multiple concurrent ExternalDNS instances make conflicting changes because they share the same `--txt-owner-id` but, since they manage different clusters, see different Services:

- ExternalDNS instance `<N>` constantly creates `training-user-<N>` and drops `training-user-<N+1>`
- ExternalDNS instance `<N+1>` constantly creates `training-user-<N+1>` and drops `training-user-<N>`

Using different values for `--txt-owner-id` solves that issue.

On a side note, you can also have DNS names generated automatically, without having to add annotations, by using the `--fqdn-template` feature.

@errordeveloper What you are trying looks interesting. Please let us know about your progress. 😃
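A sketch of what cluster `<N>`'s container args might look like under the suggestion above (using `N = 1` purely for illustration; this is the reporter's original spec with only the owner ID and domain filter changed):

```yaml
- name: external-dns
  image: registry.opensource.zalan.do/teapot/external-dns:v0.4.8
  args:
    - --source=service
    - --source=ingress
    - --policy=upsert-only
    - --provider=google
    - --registry=txt
    - --txt-owner-id=training-user-1            # unique per cluster
    - --domain-filter=-1.training.weave.works   # optional: per-cluster suffix match
    - --google-project=dx-training
```

For reference, the TXT registry records that ExternalDNS writes encode ownership roughly as `"heritage=external-dns,external-dns/owner=training-user-1"`, which is how an instance decides which records it may touch.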
Martin,

Thanks for your input! I already have a working setup. I will try adding a unique TXT owner ID; I didn't do that for some reason. If you are interested in this training setup of ours, most of the bits are under https://github.com/errordeveloper/k9c/blob/master/README.md, but there is also an internal repo with more glue scripts that I am not ready to share publicly yet (however, I'm happy to walk through it in private).
Unfortunately, I am currently running into the same issue. I am using the current master as of today. external-dns had already created some A records from Ingress resources. After that, I added an annotation specifying the TTL of that resource, but external-dns is not able to update the records in Google Cloud DNS. I get the following log messages:

My first thought was that it tried to specify the wrong (new) TTL when trying to delete the records, but as the info log shows, the TTL is correctly the old one.
I just realised that for me this error didn't appear until I tweaked the TTL down to the lowest possible value (5s, IIRC). Perhaps this is a more general issue to do with low TTLs? I also noticed that the TTL doesn't apply to TXT records, which could be related, but I don't know.
I think the change in TTL is the problem, not the length of the TTL. The records were probably first created without the TTL annotation, and then you probably added the annotation later to modify the TTL. At least that is what I was doing.
Just encountered this: the issue is that when deleting a record after a TTL change via annotations, external-dns tries to delete a record with the newly specified TTL, so GCP can't find it and throws an error (since the existing record has the previous TTL).
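To make the failure mode concrete: Cloud DNS applies each change as an atomic set of deletions and additions, and every entry in `deletions` must exactly match a record set currently stored in the zone, TTL included. A hedged sketch of the `changes.create` request body (record name, IP, and TTLs are illustrative), shown here in YAML form:

```yaml
# The zone currently holds an A record created with the default TTL:
#   name: app.training.weave.works.  type: A  ttl: 300  rrdatas: [35.190.27.1]
#
# A correct update to ttl=5 must delete the record exactly as stored:
deletions:
  - name: app.training.weave.works.
    type: A
    ttl: 300                     # must match the stored TTL exactly
    rrdatas: ["35.190.27.1"]
additions:
  - name: app.training.weave.works.
    type: A
    ttl: 5
    rrdatas: ["35.190.27.1"]
# The bug described above sends ttl: 5 in the deletion instead; no stored
# record matches, so the API rejects the whole change with HTTP 412
# "Precondition not met for 'entity.change.deletions[0]', conditionNotMet".
```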
I'm seeing this issue even after completely cleaning out all A and TXT records and having external-dns recreate them. As soon as it's finished creating the new ones with my annotated 60s TTL, it fails again with the "Precondition not met" error and refuses to do anything more. I've had to remove the TTL annotations to move forward.
Just to confirm the above, I've experienced a similar issue with records not being cleaned up when the TTL has been specified via an annotation:

After removing the service, the record it then tries to delete has a TTL of 300:

Version: 0.5.1 Args:
I can confirm that when I remove the TTL annotation, the problem goes away. When I change the TTL manually, this does not cause an issue until an update needs to happen.
Same issue here. From what I can tell, even though the record is updated in the first place, external-dns then tries to delete a record with the default TTL (300), which doesn't exist:

That shouldn't happen, since a record with the correct TTL is already there.
I am seeing this in the logs:

I am trying to use a dirty zone (one with pre-existing records), and it worked well in some initial tests I did, but somehow I eventually started seeing this. I am going to work around the problem by clearing my zone, but it would be good to understand what this error really means and why it works okay sometimes and sometimes it doesn't.