DNS improvements #16314

Open
alyssawilk opened this issue May 4, 2021 · 4 comments
Labels
cronvoy-mvp (feature list for cronvoy stable), enhancement (Feature requests. Not bugs or questions.), no stalebot (Disables stalebot from closing an issue)

Comments

@alyssawilk
Contributor

HTTPS resource record (will improve QUIC usage)
Serve stale DNS results (improves performance and reliability)
DoT/DoH (DNS over TLS / DNS over HTTPS)

alyssawilk added the enhancement, no stalebot, and cronvoy-mvp labels May 4, 2021
@alyssawilk
Contributor Author

alyssawilk commented Sep 13, 2021

Also, right now DNS has a hard-coded refresh rate, and resolved entries are valid until that expires (ignoring TTL).

Once we have support for stale DNS, I think we can have DNS results be respected for min(ttl, refresh_rate)* and use the stale DNS support to make sure we don't stall requests when an entry is only slightly expired.

*possibly actually max(30s, min(ttl, refresh_rate)), to make sure we don't DNS thrash
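
For illustration, the proposed validity window could be sketched as below. This is only a sketch in Python for brevity (Envoy itself is C++); the function name, parameter names, and the `floor` default are hypothetical and simply mirror the formula in the comment above.

```python
def effective_validity_seconds(ttl: float, refresh_rate: float,
                               floor: float = 30.0) -> float:
    """How long a resolved entry is treated as fresh, in seconds.

    min(ttl, refresh_rate) respects the record's TTL without exceeding the
    configured refresh interval. The floor keeps very small TTLs from
    triggering back-to-back re-resolution (DNS thrash).
    """
    return max(floor, min(ttl, refresh_rate))


# A 5 s TTL is raised to the 30 s floor.
short_ttl = effective_validity_seconds(ttl=5, refresh_rate=60)
# A 45 s TTL is below the 60 s refresh rate, so the TTL wins.
medium_ttl = effective_validity_seconds(ttl=45, refresh_rate=60)
# A 300 s TTL is capped by the 60 s refresh rate.
long_ttl = effective_validity_seconds(ttl=300, refresh_rate=60)
```

The floor only matters when either the TTL or the refresh rate is below 30 seconds; otherwise the smaller of the two values is used unchanged.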

@andrewjjenkins
Contributor

> resolved entries are valid until that expires (ignoring TTL)

This current behavior is a problem for some of my users. They were expecting that once the DNS result TTL expired, stale endpoints would not receive any new connections.

My team saw some other tickets like this one: #2691 and we're not sure what the desired behavior is.

Does the below describe the goal here?

If DNS resolution fails:

  1. The stale endpoints would be used until the TTL expires.
  2. Once the TTL expires, the stale endpoints are removed (and Envoy would start delivering "no upstream" errors).
  3. DNS resolution continues periodically (but, as you say, without DNS thrashing); as soon as it succeeds, the newly discovered endpoints are used.
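
The three steps above could be sketched roughly as follows. This is a Python illustration of the behavior being asked about, not Envoy code; `HostEntry`, `refresh`, and the `resolve` callback are hypothetical names, and `resolve` stands in for a DNS query that returns `(endpoints, ttl)` on success or `None` on failure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class HostEntry:
    """Hypothetical cache entry for one hostname."""
    endpoints: List[str] = field(default_factory=list)
    expires_at: float = 0.0  # absolute time at which the TTL runs out


def refresh(entry: HostEntry, now: float,
            resolve: Callable[[], Optional[Tuple[List[str], float]]]) -> HostEntry:
    """Run one periodic re-resolution attempt and return the updated entry."""
    result = resolve()
    if result is not None:
        # Step 3: a successful resolution replaces the endpoints immediately.
        endpoints, ttl = result
        return HostEntry(list(endpoints), now + ttl)
    if now < entry.expires_at:
        # Step 1: resolution failed but the TTL has not expired; keep serving.
        return entry
    # Step 2: resolution failed and the TTL has expired; drop the endpoints,
    # so callers would see "no upstream" until a later refresh succeeds.
    return HostEntry([], entry.expires_at)


# Usage: resolve once, fail twice (inside and then past the TTL), then recover.
entry = refresh(HostEntry(), now=0, resolve=lambda: (["10.0.0.1"], 30))
within_ttl = refresh(entry, now=10, resolve=lambda: None)
after_ttl = refresh(within_ttl, now=40, resolve=lambda: None)
recovered = refresh(after_ttl, now=50, resolve=lambda: (["10.0.0.2"], 30))
```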

@alyssawilk
Contributor Author

Oh hey I totally failed to update this issue.

As of #18408, Envoy is closer to respecting TTL. DNS re-resolution will be more aggressive based on the DNS TTL. The main DNS cache will remove stale entries, but I don't think it propagates the non-null to null address resolution to worker threads to really ensure no additional traffic. I think it'd be a fairly simple change, which I think should be config guarded as I think it's pretty common to use stale entries if re-resolve fails.
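
As a rough sketch of what a config-guarded change might look like: the main cache drops the expired entry either way, and only tells workers about the non-null to null transition when a flag is set. This is Python for illustration only and does not reflect Envoy's actual classes or configuration; `MainDnsCache`, `propagate_removal`, and the worker callback shape are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical worker callback: receives the host and its new address, or
# None when the host no longer resolves.
WorkerCallback = Callable[[str, Optional[str]], None]


class MainDnsCache:
    def __init__(self, workers: List[WorkerCallback],
                 propagate_removal: bool = False) -> None:
        self._workers = workers
        # Hypothetical flag, off by default so workers keep using the
        # stale address when re-resolution fails.
        self._propagate_removal = propagate_removal
        self._addresses: Dict[str, str] = {}

    def on_resolve_success(self, host: str, address: str) -> None:
        self._addresses[host] = address
        self._notify(host, address)

    def on_expired_without_reresolve(self, host: str) -> None:
        """Called when an entry's TTL ran out and re-resolution failed."""
        # The main cache always drops the stale entry.
        had_address = self._addresses.pop(host, None) is not None
        if had_address and self._propagate_removal:
            # Propagate the non-null -> null transition so workers stop
            # sending new traffic to the old address.
            self._notify(host, None)

    def _notify(self, host: str, address: Optional[str]) -> None:
        for worker in self._workers:
            worker(host, address)


# Usage: with the flag on, the worker sees the address and then its removal.
seen = []
cache = MainDnsCache([lambda h, a: seen.append((h, a))], propagate_removal=True)
cache.on_resolve_success("example.com", "192.0.2.1")
cache.on_expired_without_reresolve("example.com")
```

Defaulting the flag to off preserves the common behavior of continuing to use stale entries, while deployments that want strict TTL enforcement could opt in.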

@andrewjjenkins
Contributor

Thank you! Didn't notice that issue, sorry.

> I think should be config guarded as I think it's pretty common to use stale entries if re-resolve fails.

Understood. That's the opposite of what this user wants in this particular case, but I understand the general desire to allow the use of stale entries.


2 participants