Negative responses get cached even when setting cache-max-negative-ttl: 1 #533

Closed
codeswhite opened this issue Aug 23, 2021 · 5 comments

@codeswhite

Describe the bug
Negative responses get cached even when setting cache-max-negative-ttl: 1

To reproduce
Steps to reproduce the behavior:

  1. Verify that unbound is the resolver used by the system (check /etc/resolv.conf)
  2. Set cache-max-negative-ttl: 1
  3. Restart unbound
  4. Query a nonexistent domain (e.g. dig www.john.doe)
  5. See that the query took some time (more than 10 ms) and returned NXDOMAIN
...
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 44925
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.john.doe.			IN	A

;; AUTHORITY SECTION:
.			30	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2021082300 1800 900 604800 86400

;; Query time: 463 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Aug 23 12:36:58 IDT 2021
...
  6. Wait 5 seconds for the negative cache TTL to expire
  7. Query again and look at "Query time" to determine whether the cache was used (this can also be identified by watching the logs for the "using cached" message from unbound when running with log-queries and log-replies)
...
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 44159
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.john.doe.			IN	A

;; AUTHORITY SECTION:
.			24	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2021082300 1800 900 604800 86400

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Aug 23 12:37:04 IDT 2021
...
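
One way to read the two outputs above: the SOA TTL in the authority section dropped from 30 to 24 while the WHEN timestamps are 6 seconds apart, which is what a countdown from a cached entry looks like. Below is a minimal Python sketch of that check. It assumes the TTLs and timestamps are copied by hand from the dig output; the function name `looks_cached` and the 1-second tolerance are illustrative and not part of dig or unbound.

```python
from datetime import datetime

def looks_cached(first_ttl, second_ttl, first_when, second_when, tolerance=1):
    """Return True if the TTL drop matches the wall-clock time that elapsed.

    The timestamps are "HH:MM:SS" strings from dig's WHEN line. They have
    one-second resolution, hence the tolerance (an assumed value).
    """
    fmt = "%H:%M:%S"
    elapsed = (datetime.strptime(second_when, fmt)
               - datetime.strptime(first_when, fmt)).total_seconds()
    return abs((first_ttl - second_ttl) - elapsed) <= tolerance

# Values taken from the two dig outputs in this report.
print(looks_cached(30, 24, "12:36:58", "12:37:04"))  # True
```

A freshly resolved answer would normally come back with its full TTL rather than one reduced by the time that has passed.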

Expected behavior
As stated in unbound.conf(5):

cache-max-negative-ttl: <seconds>
    Time to live maximum for negative responses, 
    these have a SOA in the authority section that is limited in time.
    Default is 3600.  This applies to nxdomain and nodata answers.

I expect the cached negative response to expire after <seconds>.

System:

  • Unbound version: 1.13.1
  • OS: Gentoo (happens to me also on Arch)
  • unbound -V output:
Version 1.13.1

Configure line: --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --docdir=/usr/share/doc/unbound-1.13.1-r2 --htmldir=/usr/share/doc/unbound-1.13.1-r2/html --with-sysroot=/ --libdir=/usr/lib64 --disable-debug --disable-gost --enable-dnscrypt --disable-dnstap --enable-ecdsa --disable-subnet --disable-cachedb --disable-static --enable-systemd --without-pythonmodule --without-pyunbound --without-pthreads --with-libnghttp2 --disable-flto --disable-rpath --enable-event-api --enable-ipsecmod --enable-tfo-client --enable-tfo-server --with-libevent=/usr --without-libhiredis --with-pidfile=/run/unbound.pid --with-rootkey-file=/etc/dnssec/root-anchors.txt --with-ssl=/usr --with-libexpat=/usr
Linked libs: libevent 2.1.11-stable (it uses epoll), OpenSSL 1.1.1k  25 Mar 2021
Linked modules: dns64 ipsecmod respip validator iterator
DNSCrypt feature available
TCP Fastopen feature available

Additional information
This has been happening to me for a while now (including with previous versions), and I finally decided to open an issue about it.

@wcawijngaards
Member

If I perform these actions, everything works. The first printout shows a 1 second TTL (at step 5), for me.
Is unbound somehow not configured with the option even though you think you set it? E.g., you edited /etc/unbound.conf but unbound is using /usr/local/etc/unbound.conf, or a copy of the file in a chroot somewhere under /var?

@codeswhite
Author

Thanks for the quick reply!

> Is unbound somehow not configured with the option even though you think you set it? E.g., you edited /etc/unbound.conf but unbound is using /usr/local/etc/unbound.conf, or a copy of the file in a chroot somewhere under /var?

I've checked, no other config than /etc/unbound/unbound.conf exists.

> If I perform these actions, everything works. The first printout shows a 1 second TTL (at step 5), for me.

Might it be a misunderstanding on my side?
As far as I understand, the first dig shows:

;; Query time: 463 msec

Because the user sends a request to unbound, and the unbound service doesn't have the domain cached, so it forwards the request to the global DNS, which creates a 463 ms delay.

Afterwards on the 2nd dig we have:

;; Query time: 0 msec

Because the request was served from unbound's cache (on localhost).

What I wanted to achieve is for unbound not to cache the NXDOMAIN response after the 1st dig, so that the 2nd time I request the domain, unbound forwards the query to the global DNS once again.
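
The "Query time" comparison described above can be automated. Here is a minimal Python sketch, assuming the dig output has been captured as a string. The helper names and the 10 ms threshold (borrowed from step 5 of the reproduction) are illustrative, and a low query time is only a heuristic for a cache hit.

```python
import re

def query_time_ms(dig_output):
    """Extract the ';; Query time: N msec' value from dig output, or None."""
    match = re.search(r"^;; Query time: (\d+) msec", dig_output, re.MULTILINE)
    return int(match.group(1)) if match else None

def likely_cache_hit(dig_output, threshold_ms=10):
    """Heuristic: treat a reply faster than threshold_ms as served from cache."""
    elapsed = query_time_ms(dig_output)
    return elapsed is not None and elapsed < threshold_ms

first = ";; Query time: 463 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n"
second = ";; Query time: 0 msec\n;; SERVER: 127.0.0.1#53(127.0.0.1)\n"
print(likely_cache_hit(first))   # False
print(likely_cache_hit(second))  # True
```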

@wcawijngaards
Member

Yes, it should then not cache it. That does not seem to happen for you, but if I try the option on its own, it works for me. Something else must be wrong on your side. Perhaps you are not querying unbound but another resolver such as systemd, or, as I guessed, you could be editing a different config file, because that has happened to others.

@codeswhite
Author

@wcawijngaards I have narrowed down the root cause: it happens because cache-min-ttl takes precedence over cache-max-negative-ttl, which is not the behaviour I expected.

My config initially had:

...
cache-min-ttl: 300
cache-max-negative-ttl: 1
...

The current behaviour can create a "denial of service": if the forward-zone server is down for a moment, a negative result may be cached for at least cache-min-ttl.

Better explained here: How long does negative DNS caching typically last?

So, in conclusion, I think cache-max-negative-ttl should take precedence.
Or maybe cache-min-ttl should apply only to positive responses?
What do you think?

@wcawijngaards
Member

Yes, you are correct. The negative TTL modification should take precedence over the global min and max TTL. I changed the code to do that.
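
To illustrate why the order matters, here is a simplified Python model of the two clamping orders. It is not unbound's actual source code: the function names are hypothetical, the upstream negative TTL of 86400 and the global maximum of 86400 are assumed values for illustration, and only cache-min-ttl: 300 and cache-max-negative-ttl: 1 come from the config in this issue.

```python
def clamp_global_last(ttl, min_ttl, max_ttl, max_neg_ttl):
    """Negative cap first, then the global min/max: the global minimum wins."""
    ttl = min(ttl, max_neg_ttl)
    return min(max(ttl, min_ttl), max_ttl)

def clamp_negative_last(ttl, min_ttl, max_ttl, max_neg_ttl):
    """Global min/max first, then the negative cap: the negative cap wins."""
    ttl = min(max(ttl, min_ttl), max_ttl)
    return min(ttl, max_neg_ttl)

# Assumed upstream negative TTL and global max; min and negative max
# are the values from the reporter's config.
args = dict(ttl=86400, min_ttl=300, max_ttl=86400, max_neg_ttl=1)
print(clamp_global_last(**args))    # 300: the NXDOMAIN stays cached for 5 minutes
print(clamp_negative_last(**args))  # 1: the NXDOMAIN expires after 1 second
```

In this model, applying the negative cap last means cache-min-ttl can no longer raise a negative answer above cache-max-negative-ttl, which matches the behaviour requested in this issue.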

jedisct1 added a commit to jedisct1/unbound that referenced this issue Oct 5, 2021
* nlnet/master: (118 commits)
  - Fix to add example.conf note for outbound-msg-retry.
  - Implement RFC8375: Special-Use Domain 'home.arpa.'.
  - Fix crosscompile script for the shared build flags.
  - Fix crosscompile windows to use libssp when it exists. - For the windows compile script disable gost. - Fix that on windows, use BIO_set_callback_ex instead of deprecated
  - Fix crosscompile shell syntax.
  - For crosscompile on windows, detect 64bit stackprotector library.
  - Fix crosscompile on windows to work with openssl 3.0.0 the   link with ws2_32 needs -l:libssp.a for __strcpy_chk.   Also copy results from lib64 directory if needed.
  - Fix more initialisation errors reported by gcc sanitizer.
  - Fix lock debug code for gcc sanitizer reports.
  - Fix initialisation errors reported by gcc sanitizer.
  - Fix root_anchor test to check with new icannbundle date.
  - Fix for NLnetLabs#41: change outbound retry to int to fix signed comparison   warnings.
  - Small fixes for NLnetLabs#41: changelog, conflicts resolved,   processQueryResponse takes an iterator env argument like other   functions in the iterator, no colon in string for set_option,   and some whitespace style, to make it similar to the rest.
  Changelog entry for NLnetLabs#538 - Fix NLnetLabs#538: Fix subnetcache statistics.
  Fix subnetcache statistics
  - Fix tcp fastopen failure when disabled, try normal connect instead.
  - Fix NLnetLabs#533: Negative responses get cached even when setting   cache-max-negative-ttl: 1
  - Fix asynclook unit test for setup of lockchecks before log.
  - Fix compile warning in libunbound for listen desetup routine.
  - Fix RPZ locks. Do not unlock zones lock if requested and rpz find   zone does not find the zone. Readlock the clientip that is found   for ipbased triggers. Unlock the nsdname zone lock when done.   Unlock zone and ip in rpz nsip and nsdname callback. Unlock   authzone and localzone if clientip found in rpz worker call.
  ...