
perf(runloop) warm up DNS records for Services on updates #4656

Closed
kikito wants to merge 1 commit from the perf/service-dns-prewarm-on-change branch

Conversation

@kikito (Member) commented May 27, 2019

No description provided.

@p0pr0ck5 (Contributor) left a comment

Might be worth noting (or documenting internally) that the cache provided by lua-resty-dns-client is a per-worker LRU, so this action will have limited impact in multi-core environments.

Cool improvement nonetheless :)

@Tieske (Member) left a comment

I don't know the exact event order, nor whether the event handler in this case is called in a single worker (through post_local) or in all workers (through post), so I cannot confirm @p0pr0ck5's remark.

The usefulness of this change is mostly for scaling a Kong cluster, imo.

return
end

kong.dns.toip(host)
Member

This call will fail if the configured host is not an actual hostname but an upstream name. toip does not resolve upstreams; it only does the DNS part. Resolving upstream names is done in balancer.lua.

Member

Btw: it could be that the error just gets ignored, in which case all is well, but better to test it first so we do not get unexpected error messages in the logs.
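
Something like this sketch would keep a failed lookup out of the error log (purely illustrative; only utils.hostname_type and kong.dns.toip come from this PR, the rest is hypothetical):

local utils = require "kong.tools.utils"

local function warmup_host(host)
  if utils.hostname_type(host) ~= "name" then
    return -- IP addresses need no DNS warmup
  end

  -- an upstream name still looks like a "name" here, so it falls through
  -- to toip and fails, which is exactly the case discussed above
  local ip, port_or_err = kong.dns.toip(host)
  if not ip then
    ngx.log(ngx.DEBUG, "DNS warmup for '", host, "' did not resolve: ",
            port_or_err)
  end
end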

@bungle (Member) commented May 27, 2019

I tried this and yes (the same problem was in my original pr), @p0pr0ck5 is right, it currently only warms up a single worker.

@jeremyjpj0916 (Contributor) commented May 27, 2019

Might be worth noting (or documenting internally) that the cache provided by lua-resty-dns-client is a per-worker LRU, so this action will have limited impact in multi-core environments.

Any reason not to rewrite the cache to use thibaultcha's awesome mlcache as a dependency, so the DNS cache could become node-global instead of a per-worker LRU? For resources that are used or updated often, such a rewrite seems to make sense. Is there a downside I am not seeing (besides a little extra memory needed)?

@Tieske (Member) commented May 28, 2019

@jeremyjpj0916 if there is a large number of entries (say 20), the DNS server might respond with a random subset (say only 4). Suppose we have a 12-core machine, and hence also 12 workers.

Kong would get 4 random results in each of the 12 workers, so 48 results in total. It would hence probably see all 20 entries, and basic statistics ensure we still proxy to all the backends.

Now alter Kong so that it shares the DNS data: we do a single query, get 4 backends, and all 12 Kong workers start proxying to only those 4 backends, leaving 16 backends idle. That is most likely not what you want.

@jeremyjpj0916 (Contributor) commented May 28, 2019

@Tieske interesting. I'm not super familiar with DNS outside of the company I work for, where a hostname generally resolves to essentially one IP per data center on our internal network; with only 2-3 DCs (if the app is HA), that keeps the entries small. Maybe the situation you are describing is more common in public cloud. It is one of those engineering decisions where you want to make the shoe fit all cases as best you can, so it makes a bit more sense now why it works the way it does. Thanks for the insight.

I suppose the alternative, if you still wanted a global cache, would be to track which workers have already queried a nameserver for a given entry: a key (the hostname) -> value (an object mapping worker process PIDs to the IPs they resolved) pairing; if the value does not yet contain entries for the current worker process, do a lookup, so the statistics can still play out and all the entries get collected. A little more complicated than the current design though, for sure 😄.

@hishamhm (Contributor)
@kikito A valid perf test for this (even if manual) would be to start Kong, create a service to example.com and a route pointing to it, wait a second, issue one first request, and check the latency headers. Do this with and without the patch (both times on a freshly started, empty database Kong to avoid cache warmup on each run) and verify the latency difference. Might want to bash-script this to do it a few times. Please report back if the difference is observed as expected!

kong/runloop/handler.lua (outdated review comment, resolved)
@hishamhm hishamhm changed the title perf(runloop) service dns prewarm on change perf(runloop) warm up DNS records for Services on updates May 28, 2019
@thibaultcha (Member)
@kikito A valid perf test for this (even if manual) would be to start Kong, create a service to example.com and a route pointing to it, wait a second, issue one first request, and check the latency headers.

A more robust way to avoid slow/flaky tests may be to point dns_resolver to a mock UDP server which answers DNS queries. By never making a proxy request but still receiving a DNS query on the mock server, we can deduce that the DNS cache warmup ran.
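
Roughly, a bare-bones version of the detection part would look like this (not Kong's actual test helpers; the port and setup are illustrative, it assumes luasocket is available, and a real mock should also answer the query):

local socket = require "socket"

local server = assert(socket.udp())
assert(server:setsockname("127.0.0.1", 53053))
server:settimeout(5)

-- ...start Kong with dns_resolver = 127.0.0.1:53053, create a Service via
-- the Admin API, and never send a proxy request...

local query = server:receivefrom()
assert(query, "expected a DNS query triggered by the Service warmup")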

(@jeremyjpj0916 Just as an aside here, I do have plans for an mlcache-based DNS resolver library, but really unsure if they will ever see the light of day, due to time constraints. I often mention such a DNS library as a typical usage example for mlcache in talks and such, since many mlcache features would benefit such a library and make it fairly simple to write. lua-resty-dns-client does a great job for us today already, and provides a nice load balancing abstraction :) )

@Tieske (Member) commented May 29, 2019

@jeremyjpj0916

I suppose the alternative, if you still wanted a global cache, would be to track which workers have already queried a nameserver for a given entry: a key (the hostname) -> value (an object mapping worker process PIDs to the IPs they resolved) pairing; if the value does not yet contain entries for the current worker process, do a lookup, so the statistics can still play out and all the entries get collected. A little more complicated than the current design though, for sure 😄.

I'm not getting it? You mean each worker does its own query, but shares the results. And then each worker uses the entries in the shared results?

The technical solution to that would be to use a TCP query. This is tracked here Kong/lua-resty-dns-client#63

@kikito kikito force-pushed the perf/service-dns-prewarm-on-change branch from 5e51e8f to 8879abb Compare May 29, 2019 11:40
@jeremyjpj0916 (Contributor) commented May 29, 2019

@Tieske

I'm not getting it? You mean each worker does its own query, but shares the results. And then each worker uses the entries in the shared results?

Precisely.

The technical solution to that would be to use a TCP query. This is tracked here Kong/lua-resty-dns-client#63

Interesting, so DNS via UDP can generally return a subset whereas TCP is always going to return the full list? Then yeah, the TCP approach is cleaner than my idea of using something like mlcache. If this were ever done, it might make sense to give people the option of UDP + per-worker LRU vs TCP + mlcache down the road, once the TCP approach gets hardened (it seems like it just needs retries added, based on the issue you raised).

Or, if you were eager to implement it, TCP first with a fallback to UDP on call failure (so no TCP retransmit) is doable now, if this is an itch you have been wanting to scratch, hah. On TCP success the result goes into mlcache and is used by all workers; if that call fails it ends up as UDP + a local worker LRU. It would likely be easier to just fork the DNS dependency and add retransmit at that point 😄. Anyway, I'll leave it at that; I don't want to derail this PR too much, but it's an interesting discussion.

@p0pr0ck5 (Contributor)
TCP DNS queries do not imply that the server will return a “full list” of records for a name, and UDP queries do not imply that a “partial” list is returned. The underlying transport is unrelated to the number of records returned. If the authoritative server (or an intermediate resolver) is only returning a subset of the record set, that’s a problem with that resolver/auth server, not the transport or client. Forcing TCP queries to try to get a full list of records is incorrect behavior. For example, the behavior of Consul returning a subset of records is unrelated to the transport mechanism, as documented by Consul:

Note that DNS is limited in size per request, even when performing DNS TCP queries.

https://www.consul.io/docs/agent/dns.html

Consul's resolver also has some questionable behavior, like truncating responses without setting the truncate flag 😳, and this behavior is configurable anyhow:

https://www.consul.io/docs/agent/options.html#enable_truncate

If anything, the approach of caching records at all is antithetical to supporting the edge case of “round robining” multiple different answers from an authoritative DNS server. If Kong wants to support response record sets that would be truncated over UDP, it should follow the spec and fetch the record set over TCP (and Consul should be configured appropriately).

@kikito (Member, Author) commented May 29, 2019

I tried the test suggested by @hishamhm. Here's the bash script I used:

for i in {1..20}
do
  kong start &> test.log
  http POST :8001/services name=serv host=example.net path=/ &> test.log
  http POST :8001/services/serv/routes hosts:='["example.net"]' name=rout &> test.log
  sleep 3
  http :8000 Host:example.net -h 2>&1 | grep X-Kong-Proxy-Latency
  http DELETE :8001/routes/rout &> test.log
  http DELETE :8001/services/serv &> test.log
  kong stop &> test.log
done

(Twenty times: start Kong, create a service and a route, wait 3 seconds, route traffic towards the service, output the X-Kong-Proxy-Latency value, then delete everything and stop Kong. Notice that I increased the wait time to make sure the events were propagated.)

With the next branch, the latency is stable at around 16:

X-Kong-Proxy-Latency: 122
X-Kong-Proxy-Latency: 18
X-Kong-Proxy-Latency: 18
X-Kong-Proxy-Latency: 17
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 17
X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 18
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 14
X-Kong-Proxy-Latency: 17
X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 14
X-Kong-Proxy-Latency: 15

With this PR the latency fluctuates more; sometimes it gets down to 0 or 1:

X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 19
X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 20
X-Kong-Proxy-Latency: 18
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 15
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 14
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 16
X-Kong-Proxy-Latency: 0

So in the default Kong case, the first request seems to benefit from this change, on average.

@p0pr0ck5 (Contributor)
(Just to note, the default behavior for Kong is not to run one worker process, but to run a number of worker processes equal to the number of cores: https://github.com/Kong/kong/blob/master/kong/templates/kong_defaults.lua#L16)

@kikito (Member, Author) commented May 29, 2019

if there is a large number of entries (say 20), the DNS server might respond with a random subset (say only 4)

@Tieske A bit off-topic, but would "keep asking for entries for a while" help solve the problem? Something like:

Get the initial 4 entries. Schedule a retry 1 second later.
On retry:
  If no new entries found, stop.
  If new entries found, add them to the list, and schedule another retry. 
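
In rough Lua terms (illustrative only; it assumes a resolve-style call that returns the full answer set, as lua-resty-dns-client's resolve does):

-- hypothetical "keep asking for a while" loop: re-query on a timer until a
-- round brings no new addresses
local function keep_warming(premature, host, seen)
  if premature then return end
  seen = seen or {}

  local answers = kong.dns.resolve(host) or {}
  local found_new = false
  for _, answer in ipairs(answers) do
    if answer.address and not seen[answer.address] then
      seen[answer.address] = true
      found_new = true
    end
  end

  if found_new then
    ngx.timer.at(1, keep_warming, host, seen) -- retry 1 second later
  end
end

-- kick off with: ngx.timer.at(0, keep_warming, "example.net")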

@kikito (Member, Author) commented May 29, 2019

@p0pr0ck5 Thanks, fixed the last phrase in my comment.

@thibaultcha (Member)
@kikito The suggested test case isn't robust enough to assert this behavior. Can we please follow the approach I suggested above, or something similar?

@hishamhm (Contributor)
@kikito are you sure Kong was stopping and starting correctly when you ran the test? At first I tried running it here and got results similar to yours (except for the 15 values, which I assume are macOS-specific, but it ran "high-then-low" as well), but then realized that on my machine I need extra arguments for kong start/stop. When I fixed the script for my environment I got:

next:

X-Kong-Proxy-Latency: 47
X-Kong-Proxy-Latency: 51
X-Kong-Proxy-Latency: 48
X-Kong-Proxy-Latency: 53
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 49
X-Kong-Proxy-Latency: 51
X-Kong-Proxy-Latency: 51
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 53
X-Kong-Proxy-Latency: 60
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 60
X-Kong-Proxy-Latency: 51
X-Kong-Proxy-Latency: 50
X-Kong-Proxy-Latency: 51
X-Kong-Proxy-Latency: 61

PR:

X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 0
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 1
X-Kong-Proxy-Latency: 0

which is what I would expect from this change.

@hishamhm (Contributor)
@thibaultcha your suggestion is good for a test case in the suite to check that the code is called; mine was meant as a manual check of the perf impact.

@kikito kikito force-pushed the perf/service-dns-prewarm-on-change branch from 9b34165 to 66d44a7 Compare May 30, 2019 13:58
This change prewarms the DNS cache when a new Service is added or
updated.
@kikito kikito force-pushed the perf/service-dns-prewarm-on-change branch from 66d44a7 to 4a23ba3 Compare May 30, 2019 16:27
@kikito (Member, Author) commented May 30, 2019

@thibaultcha I added a test which creates a service, waits a bit, and then checks that the DNS request was done, without involving any traffic.

if utils.hostname_type(data.entity.host) == "name" then
timer_at(0, warmup_hostname_dns_timer, data.entity.host)
end
end
Member

It is worth noting that this event will only trigger on the node handling the Service creation/update; those worker events are not propagated to other nodes in the cluster. The benefits of this early DNS query are very limited in production deployments imho; it mostly optimizes a development/testing use case.


if data.operation == "create" or
data.operation == "update" then
if utils.hostname_type(data.entity.host) == "name" then
Member

style: a single branch instead of a nested if would be simpler:

if (data.operation == "create" or data.operation == "update")
   and utils.hostname_type(data.entity.host) == "name"
then

end

if data.operation == "create" or
data.operation == "update" then
if utils.hostname_type(data.entity.host) == "name" then
timer_at(0, warmup_hostname_dns_timer, data.entity.host)
Member

  1. Timers are a scarce resource. We need error handling when allocating one here, at least so administrators can realize when they have maxed out their number of pending timers.
  2. It'd be nice to have the ability to force an asyncQuery from lua-resty-dns-client, so we don't have to allocate this timer in the first place and can directly rely on lua-resty-dns-client's asyncQuery timer. Is that possible with today's lua-resty-dns-client @Tieske?

As it stands, this can easily overwhelm timer resources when performing a large number of consecutive Service creations in a short period of time (e.g. from a script or from declarative config), to the point that, given the below comment (on this only warming up the local node), I'm not sure this approach is worth our time.

If we could group the DNS lookups in a single timer, we would be much better off already.
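
For illustration (a hypothetical sketch, not a concrete proposal): accumulate hostnames and flush them from at most one pending timer, checking ngx.timer.at's return value so a failed allocation at least gets logged:

local pending = {}          -- hostnames awaiting warmup
local timer_scheduled = false

local function flush_pending(premature)
  if premature then return end
  timer_scheduled = false

  local batch = pending
  pending = {}
  for host in pairs(batch) do
    local ip, port_or_err = kong.dns.toip(host)
    if not ip then
      ngx.log(ngx.DEBUG, "DNS warmup of '", host, "' failed: ", port_or_err)
    end
  end
end

local function schedule_warmup(host)
  pending[host] = true
  if timer_scheduled then
    return
  end

  local ok, err = ngx.timer.at(0, flush_pending)
  if not ok then
    ngx.log(ngx.ERR, "could not create DNS warmup timer: ", err)
    return
  end
  timer_scheduled = true
end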

Member

Adding to the above:

  1. This also only warms up a single worker, which furthermore emphasizes that this is optimizing local/testing environments but doesn't do much to help a production deployment, as far as I can tell?
  2. We may also be evicting (oldest) entries from the worker's DNS lrucache in favor of warmup DNS lookups, without any guarantee that the warmed-up records will be queried before the evicted ones, which somewhat defeats the purpose of our LRU cache.

Member

It'd be nice to have the ability to force an asyncQuery from lua-resty-dns-client, so we don't have to allocate this timer in the first place and can directly rely on lua-resty-dns-client's asyncQuery timer. Is that possible with today's lua-resty-dns-client @Tieske?

No. asyncQuery is related to the stale_ttl setting, where we return stale data but fire a refresh query in the background.

Member

That was my understanding as well. Alas.

Member

This also only warms up a single worker, which furthermore emphasizes that this is optimizing local/testing environments but doesn't do much to help a production deployment, as far as I can tell?

The benefits would be (if the warmup is done in each worker, not just 1) for scaling a Kong cluster: the newly spun up Kong node would be faster in dealing with first requests. This might actually make a difference with "wall-of-traffic" events.

But as said: only if it is done for all the workers.

@thibaultcha (Member) May 30, 2019

newly spun up Kong node would be faster in dealing with first requests

This is not what this PR does (at least in DB mode). Making a fresh node's requests faster is already handled by 8884973.

Member

Thx @thibaultcha I missed that one. That will deal with it indeed. Considering that, imo, this PR is not worth the additional complexity.

Contributor

Agreed. @kikito, I think this was a worthwhile experiment, but how about we close this and move on?

@bungle (Member) commented May 31, 2019

One comment here:
what if we do the warmup on router rebuild (e.g. collect Service hostnames and then resolve them on a timer started after the router rebuild)? This way we would prewarm the DNS cache in all workers. Though it would mean a lot of DNS queries around the same time, which may be problematic, and may also work against the LRU.
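
Roughly something like this, as a sketch only (services stands in for whatever list of Service entities the rebuild already has on hand; kong.dns.toip and utils.hostname_type are the calls used in this PR, everything else is hypothetical):

local utils = require "kong.tools.utils"

local function warmup_router_hosts(premature, hosts)
  if premature then return end
  for _, host in ipairs(hosts) do
    kong.dns.toip(host)   -- best effort: errors intentionally ignored
  end
end

local function schedule_router_warmup(services)
  local hosts, seen = {}, {}
  for _, service in ipairs(services) do
    local host = service.host
    if host and not seen[host] and utils.hostname_type(host) == "name" then
      seen[host] = true
      hosts[#hosts + 1] = host
    end
  end

  if #hosts > 0 then
    local ok, err = ngx.timer.at(0, warmup_router_hosts, hosts)
    if not ok then
      ngx.log(ngx.ERR, "could not schedule router DNS warmup: ", err)
    end
  end
end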

@kikito (Member, Author) commented May 31, 2019

Ok, I have decided to close this PR without merging; I agree that the complexity is not worth the benefits.

@kikito kikito closed this May 31, 2019
@kikito kikito deleted the perf/service-dns-prewarm-on-change branch May 31, 2019 10:14
bungle added a commit that referenced this pull request May 31, 2019
This is an alternative approach to what was originally proposed in #4656.

Also warms up all the workers on init (not just worker 0).
bungle added a commit that referenced this pull request Jun 5, 2019
This is an alternative approach to what was originally proposed in #4656.

Also warms up all the workers on init (not just worker 0).
bungle added a commit that referenced this pull request Jun 6, 2019
This is an alternative approach to what was originally proposed in #4656.

Also warms up all the workers on init (not just worker 0).