Skip to content

Stream reuse does not work on Windows (rbtree issue?) #516

@yan12125

Description

@yan12125

Describe the bug

I use forward-addr: 1.0.0.1@853#cloudflare-dns.com as the DoT upstream. On Windows, connections are apparently not reused as there are many TCP connections, and the latency for the first query after restarting unbound is much higher than Linux and Android devices in the same local network.

To reproduce

  1. Run unbound.exe -ddpvvvvv -c unbound.conf 2>&1 > unbound.log from PowerShell. The content of unbound.conf is:
server:
    verbosity: 1
    tls-win-cert: yes
    auto-trust-anchor-file: "C:\Program Files\Unbound\root.key"

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.0.0.1@853#cloudflare-dns.com
  1. Run nslookup example.com

Expected behavior

netstat -ano reports only one connection to 1.0.0.1:853, and the nslookup command returns the result in a short time (e.g., <200ms)

System:

  • Unbound version: tested both 1.13.1 from the official installer and 1.13.2 (commit b6abcb1) I built locally
  • OS: Windows 10 20H2
  • unbound -V output:
Version 1.13.2

Configure line: --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --build= --prefix=/usr/x86_64-w64-mingw32 --libdir=/usr/x86_64-w64-mingw32/lib --includedir=/usr/x86_64-w64-mingw32/include --enable-shared --enable-static --enable-static-exe
Linked libs: event winsock (it uses WSAWaitForMultipleEvents), OpenSSL 1.1.1k  25 Mar 2021
Linked modules: dns64 respip validator iterator

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues
Version 1.13.1

Configure line: --host=x86_64-w64-mingw32 --build=x86_64-redhat-linux-gnu --target=x86_64-w64-mingw32 --prefix=/usr/x86_64-w64-mingw32/sys-root/mingw --exec-prefix=/usr/x86_64-w64-mingw32/sys-root/mingw --bindir=/usr/x86_64-w64-mingw32/sys-root/mingw/bin --sbindir=/usr/x86_64-w64-mingw32/sys-root/mingw/sbin --sysconfdir=/usr/x86_64-w64-mingw32/sys-root/mingw/etc --datadir=/usr/x86_64-w64-mingw32/sys-root/mingw/share --includedir=/usr/x86_64-w64-mingw32/sys-root/mingw/include --libdir=/usr/x86_64-w64-mingw32/sys-root/mingw/lib --libexecdir=/usr/x86_64-w64-mingw32/sys-root/mingw/libexec --localstatedir=/usr/x86_64-w64-mingw32/sys-root/mingw/var --sharedstatedir=/usr/x86_64-w64-mingw32/sys-root/mingw/com --mandir=/usr/x86_64-w64-mingw32/sys-root/mingw/share/man --infodir=/usr/x86_64-w64-mingw32/sys-root/mingw/share/info --enable-debug --enable-static-exe --disable-flto --with-ssl=/home/wouter/src/unbound/unbound-dist-AQW9ts/sslinstall --with-libexpat=/home/wouter/src/unbound/unbound-dist-AQW9ts/wxpinstall
Linked libs: event winsock (it uses WSAWaitForMultipleEvents), OpenSSL 1.1.1h  22 Sep 2020
Linked modules: dns64 respip validator iterator

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

Additional information

I built git-master for Windows and added some debugging codes

diff --git a/services/outside_network.c b/services/outside_network.c
index 73cb4ff3..16aa38ef 100644
--- a/services/outside_network.c
+++ b/services/outside_network.c
@@ -161,13 +161,20 @@ int
 reuse_cmp(const void* key1, const void* key2)
 {
 	int r;
+	verbose(VERB_CLIENT, "key1=%p, key2=%p", key1, key2);
 	r = reuse_cmp_addrportssl(key1, key2);
 	if(r != 0)
 		return r;
 
 	/* compare ptr value */
-	if(key1 < key2) return -1;
-	if(key1 > key2) return 1;
+	if(key1 < key2) {
+		verbose(VERB_CLIENT, "key mismatch");
+		return -1;
+	}
+	if(key1 > key2) {
+		verbose(VERB_CLIENT, "key mismatch");
+		return 1;
+	}
 	return 0;
 }
 
@@ -534,8 +541,10 @@ reuse_tcp_find(struct outside_network* outnet, struct sockaddr_storage* addr,
 		log_assert(&key_p != ((struct reuse_tcp*)result)->pending);
 	}
 	/* not found, return null */
-	if(!result || result == RBTREE_NULL)
+	if(!result || result == RBTREE_NULL) {
+		verbose(VERB_CLIENT, "not found in rbtree");
 		return NULL;
+	}
 	verbose(VERB_CLIENT, "reuse_tcp_find check inexact match");
 	/* inexact match, find one of possibly several connections to the
 	 * same destination address, with the correct port, ssl, and

From logs, not found in rbtree is reported since the second connection. For example,

[1627796331] C:\Users\yen\Documents\local\Computer\unbound\build\unbound.exe[3200:0] debug: reuse_tcp_find: num reuse streams 1
[1627796331] C:\Users\yen\Documents\local\Computer\unbound\build\unbound.exe[3200:0] debug: key1=000000bbaf7fe5b0, key2=0000029d73620170
[1627796331] C:\Users\yen\Documents\local\Computer\unbound\build\unbound.exe[3200:0] debug: key mismatch
[1627796331] C:\Users\yen\Documents\local\Computer\unbound\build\unbound.exe[3200:0] debug: not found in rbtree

Complete log: unbound.log

rbtree_find_less_equal returns a temporary match when the cmp function returns a positive value. Here key1 < key2, so reuse_cmp returns -1. Apparently stack addresses are not always larger than heap addresses on Windows. If I change reuse_cmp to return a positive value for all mismatched keys,

diff --git a/services/outside_network.c b/services/outside_network.c
index 73cb4ff3..dff31c48 100644
--- a/services/outside_network.c
+++ b/services/outside_network.c
@@ -166,8 +166,9 @@ reuse_cmp(const void* key1, const void* key2)
 		return r;
 
 	/* compare ptr value */
-	if(key1 < key2) return -1;
-	if(key1 > key2) return 1;
+	if(key1 != key2) {
+		return 1;
+	}
 	return 0;
 }
 

Connections appear reused.

I'm not familiar with rbtree and thus I'm not sure if this is a correct fix, so I just share my observations instead of creating a pull request.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions