
coredns 1.0.6 #1504

Closed
miekg opened this issue Feb 8, 2018 · 27 comments
@miekg
Member

miekg commented Feb 8, 2018

There are some important fixes (esp. in proxy); kick off a new release.

@miekg miekg self-assigned this Feb 8, 2018
@johnbelamaric
Member

johnbelamaric commented Feb 8, 2018

I think we need decisions on a couple of things, and on whether or not we hold the release for them. These are:

I would like to see these finalized for the release, and then we can use 1.0.6 in the k8s 1.10 release for the beta of CoreDNS as default. That means we need to act pretty quickly on this.

Reload
I think it's important to have this capability to simplify changes to the k8s config map - which means making a decision and changing it if necessary. So, options are:

  1. Define the concept of global plugins and make it one of those.
  2. Leave it as is, but error on syntax if it's defined more than once.
  3. Make it a CLI flag.

Opinions? I am OK with any of these. It's nice to have the plugin structure, but I did originally also think about a CLI flag. But are we just kicking the can down the road by not dealing with global plugins?
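To make the ambiguity concrete, here is a rough Corefile sketch (the plugin names and intervals are purely illustrative): if reload stays an ordinary per-server-block plugin, nothing in the syntax stops two blocks from asking for different intervals, which is exactly the case option 2 would turn into an error:

.:53 {
    reload 30s
    proxy . /etc/resolv.conf
}

example.org:53 {
    reload 10s
    whoami
}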

Upstream Behavior
Here, I think we need something like #1484; otherwise I think behavior will change for CNAME lookups compared to kube-dns (I think that's true?). It's also a more general issue, because right now CNAME lookups only work if the zone of the CNAME target is handled by the same plugin or by some external server.
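As a rough sketch of the idea (the address is a placeholder, and this is not the exact change proposed in #1484): giving the kubernetes plugin an upstream resolver would let it chase a CNAME whose target lies outside the zones it serves, instead of depending on the same plugin or an external server to know the answer:

cluster.local {
    kubernetes cluster.local {
        upstream 172.16.0.53:53
    }
}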

@miekg
Member Author

miekg commented Feb 8, 2018

upstream and the proxy fix are the most critical; the rest is less important IMO

@chrisohaver
Member

... behavior will change for the CNAME lookups compared to kube-dns (I think that's true?)

That's my understanding. I will verify.

@chrisohaver
Member

That's my understanding. I will verify.

I was mistaken about kube-dns behavior. kube-dns does not honor stub-domains when looking up CNAMEs.

I also found an open issue, kubernetes/dns#131, that describes the same behavior.

IMO, this is a bug, but since it's a pre-existing behavior, there is no urgency to fix it.

@miekg
Member Author

miekg commented Feb 8, 2018 via email

@miekg
Member Author

miekg commented Feb 8, 2018

Also, seeing all the bug reports involving reload and other plugins, we should disable the reload plugin.

@johnbelamaric
Member

That's fine, but all those issues still apply with signal-based reloads. But having the plugin encourages its use a lot more.
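(For context, a signal-based reload here is just the usual Caddy-style USR1 signal - assuming that behavior carries over to coredns unchanged - which re-reads the Corefile in place without restarting the process:

kill -USR1 $(pidof coredns)
)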

@miekg
Member Author

miekg commented Feb 15, 2018

Think all the things that should be in are in; should we be good to go for a 1.0.6?
I could easily kick one off today or tomorrow.

@chrisohaver
Member

I think we need to sort out #1532 first...

@johnbelamaric
Member

Yes, very soon. @fturib just filed one more issue he found; it looks like someone made a backwards-incompatible change in the K8s API, so we are going to work around it in #1532 - @chrisohaver should have it done today or tomorrow. It's a pain to test because the issue only appears when running against a k8s built from k8s master, not against any released version.

@miekg
Member Author

miekg commented Feb 15, 2018

yeah, saw it. Wish we could easily release and update the CoreDNS in k8s, separate from their release cadence.

@caffeinejolt

Not sure if this will be fixed in proxy or not - but for what it's worth... I am running coredns to proxy requests arriving on an IPv6 address/interface to an older tinydns authoritative server, to which I have not bothered to apply this patch since it does not apply cleanly and I was 1) lazy and 2) wanted to play with coredns. I am running coredns in a chroot as non-root and allow it to bind to port 53 via setcap cap_net_bind_service=+ep coredns. Here is the config:

. {
    bind [IPv6 address]
    proxy . [IPv4 address]
    log
}

With that background out of the way, here's the problem I am seeing: when I fire it up it works perfectly - for about 5 minutes or so - and then it gets nothing but SERVFAIL responses. The "fix" is to restart coredns; then it works fine again for another 5 minutes or so, and then it's back to all SERVFAIL.

@johnbelamaric
Member

Can you add the errors plugin to the config and see if anything shows up in the log?

@caffeinejolt

caffeinejolt commented Feb 16, 2018

Sure - I now have this:

. {
    bind [IPv6 address]
    proxy . [IPv4 address]
    errors
    log
}

Will have something shortly. Not sure if it matters or not, but again, for full background: coredns gets somewhere between 20 and 200 queries per second on this setup.

@caffeinejolt

Results are in: unreachable backend: no upstream host

However, I can dig the upstream host (which runs on the same machine) while coredns is issuing that error. Restarting coredns fixes it. I will just add a cron job to kill coredns once a minute for now.

@johnbelamaric
Member

Is this 1.0.5 or master? This may already be fixed in master; there are known upstream health check problems in 1.0.5.

@caffeinejolt

1.0.5 - I guess I will just update when 1.0.6 gets released and see what happens.

@caffeinejolt

Is there any way to disable upstream health checks?

@chrisohaver
Member

max_fails 0, I think...

max_fails is the number of failures within fail_timeout that are needed before considering a backend to be down. If 0, the backend will never be marked as down. Default is 1.
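Applied to the config above, that would look something like this (a sketch only; the bracketed addresses are still placeholders):

. {
    bind [IPv6 address]
    proxy . [IPv4 address] {
        max_fails 0
    }
    errors
    log
}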

@caffeinejolt

OK - I had a chance to play with this a bit more and figured out what is going on. tinydns by default...

tinydns is quite possibly the simplest possible full-function authoritative nameserver. It serves authoritative data only; any query that cannot be answered from its disk database is simply not answered. The disk database has provisions for telling tinydns that it has comprehensive knowledge of a specific domain; only if that is set (indicated by the presence of a "." record in the data file) does tinydns return NXDOMAIN, the official statement that the requested domain does not exist.

So without the "." in the data file, tinydns never replies to requests for which it has no data - which coredns treats as a timeout. When I add the ".", tinydns replies immediately with NXDOMAIN, so coredns does not get a timeout.
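For reference, the "." record is just one kind of line in tinydns's data file; roughly like this (names and addresses are made up):

# declare authority for example.com: creates the SOA, an NS record for
# a.ns.example.com, and an A record pointing that nameserver at 192.0.2.53
.example.com:192.0.2.53:a
# an ordinary A record served from the same data file
+www.example.com:192.0.2.80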

My guess is that coredns marks the upstream as down and stops trying once it gets too many timeouts? However, most of the queries sent upstream do not time out, even when I do not have the "." in the data file. Therefore, I concur there are upstream health check problems in 1.0.5.

Now I guess the question is whether this problem is present in master - I will download and build master, try it out, and let you know.

@caffeinejolt

caffeinejolt commented Feb 16, 2018

Scratch that plan...

    go build github.com/coredns/coredns
    go/src/github.com/coredns/coredns/core/dnsserver/register.go:33:13: cannot use newContext (type func() caddy.Context) as type func(*caddy.Instance) caddy.Context in field value

@miekg
Member Author

miekg commented Feb 19, 2018 via email

@caffeinejolt

Thanks for the offer on the binary. I was able to get things working fine by setting max_fails to 0. I also ended up leaving tinydns responding to unknown hosts with NXDOMAIN, since I think this is a better default in any case - it doesn't leave resolvers hanging. Overall I really like coredns - a great tool!

@fturib
Contributor

fturib commented Feb 21, 2018

@miekg: do you have an idea when 1.0.6 will be released? The code freeze for K8s is this Friday (2/26). We would need this new version with the fixes for integration in v1.10 as beta.
Let me know. Thanks!

@miekg
Member Author

miekg commented Feb 21, 2018

yeah, we can release. Want to get #1543 in as well, but that should be OK to do today and then kick off a release.

@miekg
Member Author

miekg commented Feb 22, 2018

and released

@miekg miekg closed this as completed Feb 22, 2018
@fturib
Contributor

fturib commented Feb 22, 2018

Thanks !!!
