
Unbound-1.13.1 crashed by SIGABRT #469

Closed
iruzanov opened this issue Apr 21, 2021 · 56 comments

@iruzanov

Hello, Wouter!

I am actively using unbound-1.13.1 (with our DNSTAP patches, issue #367), and sometimes my unbound crashes under high load, i.e. massive recursive TCP requests. All abnormal terminations are caused by code in services/outside_network.c. Here is one of the core dumps:
(gdb) bt
#0 0x0000000800955c2a in thr_kill () from /lib/libc.so.7
#1 0x0000000800954084 in raise () from /lib/libc.so.7
#2 0x00000008008ca279 in abort () from /lib/libc.so.7
#3 0x0000000800464641 in ?? () from /usr/local/lib/libevent-2.1.so.7
#4 0x0000000800464939 in event_errx () from /usr/local/lib/libevent-2.1.so.7
#5 0x000000080045ec54 in evmap_io_del_ () from /usr/local/lib/libevent-2.1.so.7
#6 0x0000000800457e8f in event_del_nolock_ () from /usr/local/lib/libevent-2.1.so.7
#7 0x000000080045ada8 in event_del () from /usr/local/lib/libevent-2.1.so.7
#8 0x000000000030e25b in ub_event_del (ev=<optimized out>) at ./util/ub_event.c:395
#9 comm_point_close (c=0xdc97b7c00) at ./util/netevent.c:3860
#10 0x0000000000315bab in decommission_pending_tcp (outnet=<optimized out>, pend=0xdc9494980)
at ./services/outside_network.c:945
#11 0x00000000003147d6 in reuse_cb_and_decommission (outnet=0x18e75, pend=0x6, error=-2)
at ./services/outside_network.c:986
#12 0x0000000000317491 in outnet_tcptimer (arg=0xee67c2300) at ./services/outside_network.c:2033
#13 0x000000080045e0ed in ?? () from /usr/local/lib/libevent-2.1.so.7
#14 0x000000080045a09c in event_base_loop () from /usr/local/lib/libevent-2.1.so.7
#15 0x000000000024dc54 in thread_start (arg=0x8014c0800) at ./util/ub_event.c:280
#16 0x0000000800780fac in ?? () from /lib/libthr.so.3
#17 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf7fa000
(gdb)

If we enter frame 12 (outnet_tcptimer) and print the pend structure, we see the following:
(gdb) print pend
$15 = (struct pending_tcp *) 0x6
(gdb) print *pend
Cannot access memory at address 0x6
(gdb)
This corrupt pend structure is passed to the reuse_cb_and_decommission() function (frame 11) and further up the stack trace shown above.

In the outnet_tcptimer() function we can see the following code (in services/outside_network.c):
/* it was in use */
struct pending_tcp* pend=(struct pending_tcp*)w->next_waiting;

But the w->next_waiting field is declared as struct waiting_tcp*:
(gdb) print w->next_waiting
$18 = (struct waiting_tcp *) 0xdc9494980
(gdb)

So my question is: is that type cast in the outnet_tcptimer() function correct? And does this corrupt pend structure cause the event_errx() abort in libevent?
In case it helps, I found a structure of type pending_tcp reachable from the w structure:
(gdb) print w->outnet->tcp_free
$23 = (struct pending_tcp *) 0xdc9494980
(gdb)
(gdb) print *w->outnet->tcp_free
$24 = {next_free = 0xdc9493e40, pi = 0xd7da2c000, c = 0xdc97b7c00, query = 0x0, reuse = {node = {parent = 0xdc94953a0,
left = 0x3287d0 <rbtree_null_node>, right = 0x3287d0 <rbtree_null_node>, key = 0x0, color = 1 '\001'}, addr = {
ss_len = 0 '\000', ss_family = 2 '\002', __ss_pad1 = "\000\065X\320\017\067", __ss_align = 0,
__ss_pad2 = "\000\000\000\000\000\000\000\016", '\000' <repeats 103 times>}, addrlen = 16, is_ssl = 0,
lru_next = 0xdc9494ae0, lru_prev = 0x0, item_on_lru_list = 0, pending = 0xdc9494980, cp_more_read_again = 0,
cp_more_write_again = 0, tree_by_id = {root = 0x3287d0 <rbtree_null_node>, count = 0,
cmp = 0x3133e0 <reuse_id_cmp>}, write_wait_first = 0x0, write_wait_last = 0x0, outnet = 0xd7d805000}}
(gdb)

Big thank you in advance!

PS: I did not attach the core file itself because it is 31GB in size.

@gthess
Member

gthess commented Apr 21, 2021

Hello,

Thanks for the backtrace! This seems to be the same as #411 and #439 and the backtrace will hopefully help narrow down the cause of the corruption.
In this case w->next_waiting is supposed to be cast to struct pending_tcp* because of the w->on_tcp_waiting_list check above.
So the error is not there per se; rather, at an earlier point the w struct was corrupted.
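To make the pointer reuse concrete, here is a minimal sketch of that code path in outnet_tcptimer(), simplified from the gdb listing of lines 2053-2061 that appears later in this thread; the if-condition is assumed to be the on_tcp_waiting_list check mentioned above and the comments are added for illustration:

	if(w->on_tcp_waiting_list) {
		/* still on the waiting list: next_waiting is the list link
		 * (a struct waiting_tcp*), so only the waiting entry is torn down */
		waiting_tcp_callback(w, NULL, NETEVENT_TIMEOUT, NULL);
		waiting_tcp_delete(w);
	} else {
		/* it was in use: the same pointer field now holds the owning
		 * struct pending_tcp*, hence the cast in question */
		struct pending_tcp* pend = (struct pending_tcp*)w->next_waiting;
		reuse_cb_and_decommission(outnet, pend, NETEVENT_TIMEOUT);
	}
	use_free_buffer(outnet);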

@gthess gthess self-assigned this Apr 21, 2021
@gthess
Member

gthess commented Apr 21, 2021

I also see the cause of the crash for the other issues, namely:

reuse = {node = {parent = 0xdc94953a0,
left = 0x3287d0 <rbtree_null_node>, right = 0x3287d0 <rbtree_null_node>, key = 0x0, color = 1 '\001'}

there is a node in the reuse tree with key=NULL. That is a case that shouldn't happen in the first place.

No, scratch that: this is just a node, and key=NULL simply means that this node is not part of the tree.
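In other words (a hedged illustration of the convention just described, not a verbatim quote of outside_network.c; reuse_tree stands in for the actual rbtree inside struct outside_network):

	/* Illustration only: a NULL node key means "this reuse entry is not
	 * linked into the reuse tree", so tree removal must be skipped;
	 * a non-NULL key means the node is (or should be) in the tree. */
	if(pend->reuse.node.key) {
		rbtree_delete(reuse_tree, pend->reuse.node.key); /* unlink from the tree */
		pend->reuse.node.key = NULL;                     /* mark as "not in tree" */
	}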

@gthess
Member

gthess commented Apr 21, 2021

Btw, can you somehow reproduce the issue reliably-ish?

@iruzanov
Author

I can see that the key=NULL in my case too:
(gdb) print w->outnet->tcp_free.reuse.node
$27 = {parent = 0xdc94953a0, left = 0x3287d0 <rbtree_null_node>, right = 0x3287d0 <rbtree_null_node>, key = 0x0,
color = 1 '\001'}
(gdb)

@iruzanov
Author

By the way, I have another core dump where the key is NULL:
Core was generated by `/usr/local/sbin/unbound -c /srvs/unbound/etc/unbound/unbound.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000003133ea in reuse_id_cmp (key1=0xf194e0500, key2=0x0) at ./services/outside_network.c:174
174 if(w1->id < w2->id)
[Current thread is 1 (LWP 100527)]
(gdb) bt
#0 0x00000000003133ea in reuse_id_cmp (key1=0xf194e0500, key2=0x0) at ./services/outside_network.c:174
#1 0x00000000002c2f1a in rbtree_insert (rbtree=0xd810e9ab8, data=0xf194e0528) at ./util/rbtree.c:241
#2 0x0000000000314b08 in reuse_tree_by_id_insert (reuse=0xd810e99e0, w=0xf194e0500)
at ./services/outside_network.c:400
#3 use_free_buffer (outnet=0xc72751000) at ./services/outside_network.c:743
#4 0x000000000031431f in outnet_tcp_cb (c=<optimized out>, arg=<optimized out>, error=<optimized out>,
reply_info=<optimized out>) at ./services/outside_network.c:1125
#5 0x000000080045fd7d in ?? () from /usr/local/lib/libevent-2.1.so.7
#6 0x000000080045bd2c in event_base_loop () from /usr/local/lib/libevent-2.1.so.7
#7 0x000000000024dc54 in thread_start (arg=0x8014c8000) at ./util/ub_event.c:280
#8 0x0000000800783fac in ?? () from /lib/libthr.so.3
#9 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf1f7000
(gdb)
(gdb) fr 1
#1 0x00000000002c2f1a in rbtree_insert (rbtree=0xd810e9ab8, data=0xf194e0528) at ./util/rbtree.c:241
241 if ((r = rbtree->cmp(data->key, node->key)) == 0) {
(gdb) print *node
$14 = {parent = 0xf0686c928, left = 0x3287d0 <rbtree_null_node>, right = 0x3287d0 <rbtree_null_node>, key = 0x0,
color = 0 '\000'}
(gdb)

This node appears in the while(node != RBTREE_NULL) loop (in the rbtree_insert() function in util/rbtree.c), and the program terminates abnormally when rbtree->cmp(data->key, node->key) is called in frame 0.
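For context, a minimal sketch of that descent loop in rbtree_insert(), paraphrased from Unbound's NSD-derived util/rbtree.c (details may differ slightly from the exact source; the comments are added for illustration):

	int r;
	rbnode_type* node = rbtree->root;
	rbnode_type* parent = RBTREE_NULL;
	while (node != RBTREE_NULL) {
		/* the comparator runs on every node on the way down, so a stale
		 * node that is still linked into the tree but whose key was reset
		 * to NULL makes reuse_id_cmp() dereference NULL, as in frame 0 */
		if ((r = rbtree->cmp(data->key, node->key)) == 0) { /* line 241 in the trace */
			return NULL; /* duplicate key: not inserted */
		}
		parent = node;
		node = (r < 0) ? node->left : node->right;
	}
	/* ...the new node is then linked under parent and the tree rebalanced */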

@iruzanov
Author

Btw, can you somehow reproduce the issue reliably-ish?

It would be hard. Unbound crashes under massive TCP requests, when the server generates a huge amount of iterative requests to upstreams. So I will need to study this traffic first to reproduce the issue in my sandbox.

gthess added a commit that referenced this issue May 19, 2021:
  - Fix for #411, #439, #469: Reset the DNS message ID when moving queries between TCP streams.
  - Refactor for uniform way to produce random DNS message IDs.
@gthess
Member

gthess commented May 19, 2021

There is a possible fix on master branch (ff6b527) for this.
It would be great if you could test and provide feedback!

@gthess
Member

gthess commented May 28, 2021

Hi @iruzanov, just checking if you were able to test with the aforementioned fix.

@iruzanov
Author

Hello colleagues!
Please excuse me for the long delay! I'm just very "loaded" with other tasks right now ;)

I will definitely test your patch on my loaded Unbound resolvers this week, and then I will report the results to you! Thank you very much for the help!

jedisct1 added a commit to jedisct1/unbound that referenced this issue May 31, 2021
* nlnet/master:
  - zonemd-check: yesno option, default no, enables the processing of ZONEMD records for that zone.
  - Merge NLnetLabs#496 from banburybill: Use build system endianness if available, otherwise try to work it out.
  Use build system endianness if available, otherwise try to work it out.
  - For NLnetLabs#492: Fix font highlighting for the man page on emacs.
  - Fix NLnetLabs#492: module-config respip missing in unbound.conf.5.in man page. Merges NLnetLabs#494 from he32. Remove comment line (?) from man page.
  Transplant parts of the contributed RPZ documentation.
  - Move the NSEC3 max iterations count in line with the 150 value used by BIND, Knot and PowerDNS. This sets the default value for it in the configuration to 150 for all key sizes.
  - Test code has -q option for quiet output.
  - Fix for NLnetLabs#411, NLnetLabs#439, NLnetLabs#469: Reset the DNS message ID when moving queries between TCP streams. - Refactor for uniform way to produce random DNS message IDs.
  Fix date in changelog.
  - Fix NLnetLabs#489: Compile using MSYS2 MinGW 64-bit.
  - Fix that auth-zone zonefiles use last TTL if no TTL is specified.
  Changelog note for NLnetLabs#487 - Merge PR NLnetLabs#487: ifdef RLIMIT_AS in recently added check.
  ifdef RLIMIT_AS in recently added check
@iruzanov
Author

iruzanov commented Jun 3, 2021

Today I patched one of my loaded Unbound resolvers. My plan is:

  • to monitor the service for two working days
  • to report the results to you
  • to apply your patch to more loaded resolvers next week
  • to report the results to you again

Thank you very much for the big help!

@iruzanov
Author

iruzanov commented Jun 7, 2021

The patched Unbound on one of the most loaded servers is still running without any core dumps. Looks good. In the coming days I'm going to patch even more loaded servers.

@gthess
Member

gthess commented Jun 7, 2021

Thanks for letting us know, very good news!
Would you normally have expected a core dump by now?

@iruzanov
Author

Hi!
The server has been working fine for 4 days without core dumps. Super! ;)
I'm going to apply your patch to a list of other servers. Next week I will let you know about the results! (success, I hope ;)

@iruzanov
Author

I have patched three more Unbound resolvers that are running under heavy load. None of the processes has crashed with a core dump. I would like to ask for some more time to observe the patched software, and then I will let you know my final conclusion.

@iruzanov
Author

iruzanov commented Jun 22, 2021

Hello colleagues!

Unfortunately I have detected core dumps from two Unbound processes.

  1. The first core dump looks like:
[New LWP 100571]
[New LWP 148594]
[New LWP 100646]
[New LWP 101330]
[New LWP 101361]
[New LWP 101577]
[New LWP 101603]
[New LWP 101611]
[New LWP 101613]
[New LWP 101619]
[New LWP 101620]
[New LWP 101631]
[New LWP 101649]
[New LWP 101653]
[New LWP 101672]
[New LWP 101674]
[New LWP 101689]
[New LWP 101693]
[New LWP 101702]
[New LWP 101719]
[New LWP 101721]
Core was generated by `/usr/local/sbin/unbound -c /srvs/unbound/etc/unbound/unbound.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000031344a in reuse_id_cmp (key1=0x7fffdfffda68, key2=0x0) at ./services/outside_network.c:178
178             if(w1->id < w2->id)
[Current thread is 1 (LWP 100571)]
(gdb) bt
#0  0x000000000031344a in reuse_id_cmp (key1=0x7fffdfffda68, key2=0x0) at ./services/outside_network.c:178
#1  0x00000000003156e9 in rbtree_find_less_equal (rbtree=0xdc7262a78, key=0x7fffdfffda68, result=<optimized out>) at ./util/rbtree.c:527
#2  rbtree_search (rbtree=0xdc7262a78, key=0x7fffdfffda68) at ./util/rbtree.c:285
#3  reuse_tcp_by_id_find (reuse=0xdc72629a0, id=6060) at ./services/outside_network.c:425
#4  reuse_tcp_select_id (reuse=0xdc72629a0, outnet=<optimized out>) at ./services/outside_network.c:2116
#5  0x0000000000314a38 in tcp_select_id (outnet=0xd40aa3000, reuse=0xdc72629a0) at ./services/outside_network.c:2092
#6  use_free_buffer (outnet=0xd40aa3000) at ./services/outside_network.c:745
#7  0x000000000031437f in outnet_tcp_cb (c=<optimized out>, arg=<optimized out>, error=<optimized out>, reply_info=<optimized out>)
    at ./services/outside_network.c:1150
#8  0x000000080045e2b1 in ?? () from /usr/local/lib/libevent-2.1.so.7
#9  0x000000080045a09c in event_base_loop () from /usr/local/lib/libevent-2.1.so.7
#10 0x000000000024dc84 in thread_start (arg=0x8014ba800) at ./util/ub_event.c:280
#11 0x0000000800780fac in ?? () from /lib/libthr.so.3
#12 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
(gdb) list
173
174     int reuse_id_cmp(const void* key1, const void* key2)
175     {
176             struct waiting_tcp* w1 = (struct waiting_tcp*)key1;
177             struct waiting_tcp* w2 = (struct waiting_tcp*)key2;
178             if(w1->id < w2->id)
179                     return -1;
180             if(w1->id > w2->id)
181                     return 1;
182             return 0;

I.e. we are dealing with the same issue related to key2=NULL:

(gdb) print *w2
Cannot access memory at address 0x0
  2. The core dump of the second Unbound (on another server):
[New LWP 101107]
[New LWP 104932]
[New LWP 100498]
[New LWP 100601]
[New LWP 100606]
[New LWP 100622]
[New LWP 100719]
[New LWP 100915]
[New LWP 100936]
[New LWP 100997]
[New LWP 101099]
[New LWP 101113]
[New LWP 101115]
[New LWP 101128]
[New LWP 101132]
[New LWP 101138]
[New LWP 101146]
[New LWP 101161]
[New LWP 101171]
[New LWP 101178]
[New LWP 101196]
Core was generated by `/usr/local/sbin/unbound -c /srvs/unbound/etc/unbound/unbound.conf'.
Program terminated with signal SIGABRT, Aborted.
#0  0x0000000800955c2a in thr_kill () from /lib/libc.so.7
[Current thread is 1 (LWP 101107)]
(gdb) bt
#0  0x0000000800955c2a in thr_kill () from /lib/libc.so.7
#1  0x0000000800954084 in raise () from /lib/libc.so.7
#2  0x00000008008ca279 in abort () from /lib/libc.so.7
#3  0x0000000800464641 in ?? () from /usr/local/lib/libevent-2.1.so.7
#4  0x0000000800464939 in event_errx () from /usr/local/lib/libevent-2.1.so.7
#5  0x000000080045ec54 in evmap_io_del_ () from /usr/local/lib/libevent-2.1.so.7
#6  0x0000000800457e8f in event_del_nolock_ () from /usr/local/lib/libevent-2.1.so.7
#7  0x000000080045ada8 in event_del () from /usr/local/lib/libevent-2.1.so.7
#8  0x000000000030e2bb in ub_event_del (ev=<optimized out>) at ./util/ub_event.c:395
#9  comm_point_close (c=0xdd6d3cc00) at ./util/netevent.c:3860
#10 0x000000000031601b in decommission_pending_tcp (outnet=<optimized out>, pend=0xdd6721d80) at ./services/outside_network.c:967
#11 0x0000000000314836 in reuse_cb_and_decommission (outnet=0x18af3, pend=0x6, error=-2) at ./services/outside_network.c:1011
#12 0x0000000000317901 in outnet_tcptimer (arg=0xe9a748500) at ./services/outside_network.c:2058
#13 0x000000080045e0ed in ?? () from /usr/local/lib/libevent-2.1.so.7
#14 0x000000080045a09c in event_base_loop () from /usr/local/lib/libevent-2.1.so.7
#15 0x000000000024dc84 in thread_start (arg=0x801543000) at ./util/ub_event.c:280
#16 0x0000000800780fac in ?? () from /lib/libthr.so.3
#17 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdedf5000

#12 0x0000000000317901 in outnet_tcptimer (arg=0xe9a748500) at ./services/outside_network.c:2058
2058                    reuse_cb_and_decommission(outnet, pend, NETEVENT_TIMEOUT);
(gdb) list
2053                    waiting_tcp_callback(w, NULL, NETEVENT_TIMEOUT, NULL);
2054                    waiting_tcp_delete(w);
2055            } else {
2056                    /* it was in use */
2057                    struct pending_tcp* pend=(struct pending_tcp*)w->next_waiting;
2058                    reuse_cb_and_decommission(outnet, pend, NETEVENT_TIMEOUT);
2059            }
2060            use_free_buffer(outnet);
2061    }
2062
(gdb) print *(struct pending_tcp*) pend
Cannot access memory at address 0x6

(gdb) print *(struct pending_tcp*) w->next_waiting
$11 = {next_free = 0xdd67214c0, pi = 0xd0184b0c0, c = 0xdd6d3cc00, query = 0x0, reuse = {node = {parent = 0xdd6721b20, left = 0x328890 <rbtree_null_node>,
      right = 0xdd67213a0, key = 0x0, color = 0 '\000'}, addr = {ss_len = 0 '\000', ss_family = 28 '\034', __ss_pad1 = "\000\065\000\000\000",
      __ss_align = 79164837527840, __ss_pad2 = "\000\000\000\000\000\000\000\001", '\000' <repeats 103 times>}, addrlen = 28, is_ssl = 0,
    lru_next = 0xdd6721ee0, lru_prev = 0x0, item_on_lru_list = 0, pending = 0xdd6721d80, cp_more_read_again = 0, cp_more_write_again = 0, tree_by_id = {
      root = 0x328890 <rbtree_null_node>, count = 0, cmp = 0x313440 <reuse_id_cmp>}, write_wait_first = 0x0, write_wait_last = 0x0, outnet = 0xd017f4000}}
(gdb)

@iruzanov
Author

I have applied the patch from this commit:
ff6b527

@gthess
Member

gthess commented Jun 22, 2021

Oh, bad news but thanks for the information as always!
I'll start looking into this again right away.
Any chance that you can make out any similarities, traffic-wise I guess, between the two crashes?

@iruzanov
Author

I will dig through the Unbound logs. Maybe I will find some common points that describe under what traffic the program crashed. Also I will recompile libevent and Unbound itself with the -O0 flag, to get the maximum debug info.

@gthess
Member

gthess commented Jun 22, 2021

Would you be able to share your config file?
Also do you happen to issue unbound-control commands to unbound?

@iruzanov
Author

  1. My config file:

    server:
        verbosity: 1
        extended-statistics: yes

        num-threads: 20

        interface: 127.0.0.1
        interface: ::1
        interface: 193.x.x.x
        interface: 2001:x:x:x:x:x:x:x
        interface: 62.x.x.x
        interface: 62.x.x.x
        interface: 195.x.x.x
        interface: 195.x.x.x
        interface: 2001:x:x::x
        interface: 2001:x:x::x
        interface: 2a0c:x:x::x
        interface: 2a0c:x:x::x

        outgoing-interface: 193.x.x.x
        outgoing-interface: 2001:x:x:x:x:x:x:x
        outgoing-range: 32768
        outgoing-num-tcp: 2048
        incoming-num-tcp: 1024

        so-reuseport: yes  # we are running under FreeBSD 12.2 with SO_REUSEPORT_LB

        msg-cache-size: 1024m
        msg-cache-slabs: 16

        num-queries-per-thread: 2048

        rrset-cache-size: 2048m
        rrset-cache-slabs: 16

        infra-cache-numhosts: 100000

        do-ip4: yes
        do-ip6: yes
        do-udp: yes
        do-tcp: yes
        do-daemonize: yes

        access-control: 0.0.0.0/0 allow
        access-control: 127.0.0.0/8 allow
        access-control: ::0/0 allow
        access-control: ::1 allow

        chroot: "/srvs/unbound"
        username: "bind"
        directory: "/srvs/unbound/etc/unbound"
        use-syslog: yes

        pidfile: "/srvs/unbound/var/run/unbound/unbound.pid"
        hide-identity: no
        hide-version: yes

        identity: "res1"

        aggressive-nsec: yes

        do-not-query-address: ::1
        do-not-query-localhost: no

        ratelimit: 1000
        minimal-responses: no
        module-config: "respip validator iterator"

        auto-trust-anchor-file: "/srvs/unbound/etc/unbound/autokey/auto.key"

        neg-cache-size: 100m
        deny-any: yes

        serve-expired: yes
        serve-expired-ttl: 14400
        prefetch: yes
        prefetch-key: yes

    rpz:
        name: "blocklist." # name of the policy zone
        master: 194.x.x.x # address of the name server to transfer from

    rpz:
        name: "operator."
        master: 194.x.x.x

    python:

    dnstap:
        dnstap-enable: yes
        dnstap-socket-path: "/tmp/dnstap.sock"
        dnstap-send-identity: yes
        dnstap-send-version: yes
        dnstap-identity: "res1"
        dnstap-log-client-query-messages: yes
        dnstap-log-client-response-messages: yes
        dnstap-log-forwarder-query-messages: no
        dnstap-log-forwarder-response-messages: no
        dnstap-log-resolver-query-messages: no
        dnstap-log-resolver-response-messages: no

    remote-control:
        control-enable: yes
        control-interface: ::1
        control-use-cert: "yes"
        server-key-file: "/srvs/unbound/etc/unbound/unbound_server.key"
        server-cert-file: "/srvs/unbound/etc/unbound/unbound_server.pem"
        control-key-file: "/srvs/unbound/etc/unbound/unbound_control.key"
        control-cert-file: "/srvs/unbound/etc/unbound/unbound_control.pem"

    stub-zone:
        name: "."
        stub-prime: yes
        stub-addr: 127.12.12.12
        stub-addr: 193.x.x.x

  2. unbound-control:
    Yes, we are running a special program named unbound_exporter that, just like unbound-control, connects to Unbound via the control TCP socket and sends the stats_noreset command to get statistics. This command is issued every minute.

@iruzanov
Author

Regarding traffic similarity: I have grepped the Unbound logs on the two servers. The log records (tons of them) right before Unbound crashed with a core dump look like:

  1. first server:
    [...]
    Jun 18 18:50:07 res1 unbound[28935]: [28935:11] error: read (in tcp s): Connection refused for 2a01:5b40:ac3::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:1] error: read (in tcp s): Connection refused for 2a01:5b40:ac2::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:1] error: read (in tcp s): Connection refused for 151.249.125.2 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:11] error: read (in tcp s): Connection refused for 2a01:5b40:ac1::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:10] error: read (in tcp s): Connection refused for 151.249.125.2 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:c] error: read (in tcp s): Connection refused for 151.249.124.1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:9] error: read (in tcp s): Connection refused for 2a01:5b40:ac3::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:1] error: read (in tcp s): Connection refused for 2a01:5b40:ac3::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:a] error: read (in tcp s): Connection refused for 151.249.125.2 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:5] error: read (in tcp s): Connection refused for 2a01:5b40:ac2::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:1] error: read (in tcp s): Connection refused for 151.249.126.3 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:11] error: read (in tcp s): Connection refused for 2a01:5b40:ac2::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:1] error: read (in tcp s): Connection refused for 151.249.125.2 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:10] error: read (in tcp s): Connection refused for 2a01:5b40:ac3::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:11] error: read (in tcp s): Connection refused for 2a01:5b40:ac1::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:6] error: read (in tcp s): Connection refused for 2a01:5b40:ac3::1 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:5] error: read (in tcp s): Connection refused for 151.249.126.3 port 53
    Jun 18 18:50:07 res1 unbound[28935]: [28935:a] error: read (in tcp s): Connection refused for 151.249.125.2 port 53
    [...]

  2. second server:
    [...]
    Jun 20 15:52:42 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:44 res2 unbound[14692]: [14692:11] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:44 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:44 res2 unbound[14692]: [14692:6] error: read (in tcp s): Connection refused for 199.249.127.1 port 53
    Jun 20 15:52:44 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:46 res2 unbound[14692]: [14692:c] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:46 res2 syslogd: last message repeated 2 times
    Jun 20 15:52:47 res2 unbound[14692]: [14692:a] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:47 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:48 res2 unbound[14692]: [14692:10] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:48 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:49 res2 unbound[14692]: [14692:0] error: read (in tcp s): Connection refused for 199.249.120.1 port 53
    Jun 20 15:52:49 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:50 res2 unbound[14692]: [14692:e] error: read (in tcp s): Connection refused for 199.249.120.1 port 53
    Jun 20 15:52:51 res2 syslogd: last message repeated 2 times
    Jun 20 15:52:53 res2 unbound[14692]: [14692:a] error: read (in tcp r): Connection reset by peer for 199.81.212.13 port 53
    Jun 20 15:52:53 res2 unbound[14692]: [14692:10] notice: ratelimit exceeded corp.ds.fedex.com. 1000 query pwn90150.corp.ds.fedex.com. IN A
    Jun 20 15:52:55 res2 unbound[14692]: [14692:1] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:56 res2 syslogd: last message repeated 1 times
    Jun 20 15:52:56 res2 unbound[14692]: [14692:5] error: read (in tcp r): Connection reset by peer for 199.81.212.13 port 53
    Jun 20 15:52:56 res2 unbound[14692]: [14692:11] error: read (in tcp r): Connection reset by peer for 199.81.212.13 port 53
    Jun 20 15:52:57 res2 unbound[14692]: [14692:2] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:52:57 res2 unbound[14692]: [14692:13] error: read (in tcp r): Connection reset by peer for 204.135.12.100 port 53
    Jun 20 15:52:59 res2 unbound[14692]: [14692:5] error: read (in tcp r): Connection reset by peer for 199.81.212.13 port 53
    Jun 20 15:53:03 res2 unbound[14692]: [14692:9] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:53:03 res2 syslogd: last message repeated 3 times
    Jun 20 15:53:05 res2 unbound[14692]: [14692:a] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:53:06 res2 syslogd: last message repeated 1 times
    Jun 20 15:53:09 res2 unbound[14692]: [14692:7] error: read (in tcp s): Connection refused for 199.249.112.1 port 53
    Jun 20 15:53:10 res2 unbound[14692]: [14692:7] error: read (in tcp s): Connection refused for 199.249.120.1 port 53
    [...]

I.e. these are tons of outbound TCP requests to upstreams (probably fake upstreams obtained from crafted DNS packets sent by spoofed clients or intruders). The code responsible for this functionality is in the outside_network.c source file.

@iruzanov
Author

Hello, Colleagues!

Last weekend I detected that Unbound had crashed, and there is an interesting core dump:
Core was generated by `/usr/local/sbin/unbound -c /srvs/unbound/etc/unbound/unbound.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004d116c in reuse_id_cmp (key1=0xec64aab00, key2=0x0) at ./services/outside_network.c:179
179 if(w1->id < w2->id)
[Current thread is 1 (LWP 101705)]
(gdb) bt
#0 0x00000000004d116c in reuse_id_cmp (key1=0xec64aab00, key2=0x0) at ./services/outside_network.c:179
#1 0x0000000000479045 in rbtree_find_less_equal (rbtree=0xd8b32bc38, key=0xec64aab00, result=0x7fffdeff5b08) at ./util/rbtree.c:530
#2 0x00000000004787ed in rbtree_search (rbtree=0xd8b32bc38, key=0xec64aab00) at ./util/rbtree.c:285
#3 0x0000000000478930 in rbtree_delete (rbtree=0xd8b32bc38, key=0xec64aab00) at ./util/rbtree.c:333
#4 0x00000000004d2958 in reuse_tree_by_id_delete (reuse=0xd8b32bb60, w=0xec64aab00) at ./services/outside_network.c:836
#5 0x00000000004d2c9a in reuse_move_writewait_away (outnet=0xcc6ec0000, pend=0xd8b32bb40) at ./services/outside_network.c:890
#6 0x00000000004d33f0 in outnet_tcp_cb (c=0xd8b665200, arg=0xd8b32bb40, error=-1, reply_info=0x0) at ./services/outside_network.c:1096
#7 0x00000000004c8220 in comm_point_tcp_handle_callback (fd=7258, event=6, arg=0xd8b665200) at ./util/netevent.c:2092
#8 0x000000080086091a in event_persist_closure (base=0x801dfa000, ev=0xd8b7a6100) at event.c:1623
#9 0x0000000800860b22 in event_process_active_single_queue (base=0x801dfa000, activeq=0x802c91000, max_to_process=2147483647, endtime=0x0) at event.c:1682
#10 0x00000008008610b1 in event_process_active (base=0x801dfa000) at event.c:1783
#11 0x00000008008617bf in event_base_loop (base=0x801dfa000, flags=0) at event.c:2006
#12 0x0000000800861139 in event_base_dispatch (event_base=0x801dfa000) at event.c:1817
#13 0x00000000004da603 in ub_event_base_dispatch (base=0x801dfa000) at ./util/ub_event.c:280
#14 0x00000000004c4643 in comm_base_dispatch (b=0x801dc6000) at ./util/netevent.c:256
#15 0x000000000041f070 in worker_work (worker=0x801c1f800) at ./daemon/worker.c:1951
#16 0x000000000040b0c9 in thread_start (arg=0x801c1f800) at ./daemon/daemon.c:540
#17 0x0000000800d9dfac in ?? () from /lib/libthr.so.3
#18 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdeff6000

(gdb) fr 1
#1 0x0000000000479045 in rbtree_find_less_equal (rbtree=0xd8b32bc38, key=0xec64aab00, result=0x7fffdeff5b08) at ./util/rbtree.c:530
530 r = rbtree->cmp(key, node->key);
(gdb) print *node
$1 = {parent = 0xe8626cea8, left = 0xec64aab28, right = 0xee7b31428, key = 0x0, color = 1 '\001'}
(gdb) print *node->parent
$2 = {parent = 0xeca76ada8, left = 0xf27fdcd28, right = 0xede50e5a8, key = 0xe8626ce80, color = 0 '\000'}
(gdb) print *node->left
$6 = {parent = 0xede50e5a8, left = 0x73abc0 <rbtree_null_node>, right = 0x73abc0 <rbtree_null_node>, key = 0xec64aab00, color = 0 '\000'}
(gdb) print *node->right
$7 = {parent = 0xede50e5a8, left = 0x73abc0 <rbtree_null_node>, right = 0x73abc0 <rbtree_null_node>, key = 0xee7b31400, color = 0 '\000'}

As we can see, there are nodes colored BLACK with non-NULL keys and one node colored RED with a NULL key.

@iruzanov
Author

And one more thing: our traffic includes tons of queries to which Unbound replies with SERVFAIL status.

@gthess
Member

gthess commented Jul 12, 2021

Do you mean that you see an increase of SERVFAIL responses with the affected versions or that your usual traffic includes a lot of SERVFAIL responses?

@iruzanov
Author

The second - usual traffic includes a lot of SERVFAIL responses.

gthess added a commit that referenced this issue Jul 26, 2021
gthess added a commit that referenced this issue Jul 26, 2021
@iruzanov
Author

iruzanov commented Aug 2, 2021

Hello Colleagues!

Last week I patched 6 servers. These are working FINE for now! ;) So my plan is to upgrade Unbound on all the other servers and to monitor my service during this week. Next Monday I will report the results to you.

@gthess
Member

gthess commented Aug 2, 2021

Hi Igor, this is some good news for now!
We are waiting for your results, and then we plan to move on with a release.
In the meantime, can you verify that the patched servers have considerable TCP traffic to upstreams, so that we know the different stream reuse codepaths are exercised?
Thanks!

@iruzanov
Author

iruzanov commented Aug 2, 2021

Yes, I will!
But I would still like to ask you about the following logs on all patched servers:
Aug 2 17:07:06 unbound[3701]: [3701:6] error: internal error: reuse_tcp_insert: in lru list without key
Aug 2 17:07:06 unbound[3701]: [3701:0] error: internal error: reuse_tcp_insert: in lru list without key
Aug 2 17:07:06 unbound[3701]: [3701:d] error: internal error: reuse_tcp_insert: in lru list without key
Aug 2 17:07:06 unbound[3701]: [3701:c] error: internal error: reuse_tcp_insert: in lru list without key
[...]

Is this a normal situation?

@iruzanov
Author

iruzanov commented Aug 9, 2021

Hello, colleagues!
This past Monday and Tuesday I patched all of my Unbound servers, and they work without any crash dumps! Great patches! ;)
Regarding considerable TCP traffic to upstreams: I can see that traffic on all of my patched servers too.

@gthess
Member

gthess commented Aug 11, 2021

Hi Igor, great news!
These patches will be part of the upcoming 1.13.2 version coming later this week.

The error message is a bit worrying since it reveals a situation that shouldn't happen normally. But the non-crashing part seems to indicate that this happens during a callback and causes no extra harm at that point (i.e. at the call sites '(void)reuse_tcp_insert(outnet, pend);' and 'if(!reuse_tcp_insert(sq->outnet, pend_tcp)) {').

Unfortunately I haven't come across this in any of my tests to provide more useful information.

Before debugging further I would wait for you to test the full 1.13.2 release because, as I understand it, you are currently testing 1.13.1 with patches from the related commits.
If these error messages keep appearing you can continue to debug further.

If you want to debug further you can configure unbound with --enable-debug. This will enable the various log_assert() checks we have in place during development, and they may shed more light on your case. It is usually not recommended to run like that in production since log_assert will just terminate unbound with a log message when a condition is not met.

This WILL NOT produce a core dump.
If you prefer that, you can change the log_assert macro in unbound/util/log.h (lines 222 to 226 in 8e538dc):

# define log_assert(x) \
	do { if(!(x)) \
		fatal_exit("%s:%d: %s: assertion %s failed", \
			__FILE__, __LINE__, __func__, #x); \
	} while(0);

to just a normal assert like a few lines above.
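For illustration, that change might look roughly like the following sketch (assuming <assert.h> is available there; this is not the exact upstream diff):

	#include <assert.h>

	/* Sketch: let a failed condition raise SIGABRT via the standard assert(),
	 * so the kernel writes a core dump instead of unbound only logging a
	 * fatal message and exiting. */
	#  define log_assert(x) assert(x)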

@iruzanov
Author

Thank you very much!
I will update Unbound to the 1.13.2 release first, and then I will debug the program with your proposed asserts. I think this way is more accurate. And I might even use abort() to produce core dumps at the exact program point where we are asserting ;)
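Something like the following sketch, for example (my own idea, not code from the Unbound tree; it keeps the descriptive log line via Unbound's log_err() and then calls abort(), which needs <stdlib.h>):

	/* Sketch only: log the failed condition, then abort() so a core dump
	 * is produced at the exact point where the assertion fails. */
	#  define log_assert(x) \
		do { if(!(x)) { \
			log_err("%s:%d: %s: assertion %s failed", \
				__FILE__, __LINE__, __func__, #x); \
			abort(); \
		} } while(0)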

Well, tomorrow I am going on vacation until August 29. On Monday, August 30, I will return and start with step 1: upgrading Unbound to 1.13.2. I hope this release will be available by then ;)

@gthess
Member

gthess commented Aug 12, 2021

Actually it was released not more than 30 minutes ago :)

Thank you for your help thus far and enjoy your vacation!

@iruzanov
Author

Hello colleagues!

It's my pleasure to use Unbound 1.13.2. The program is working fine, but I detected one more bug a month ago: Unbound is crashing via SIGSEGV. So I have recompiled the program with the following flags: "-g -O0 -fsanitize=address -fno-omit-frame-pointer". Below I put the full stack trace of the problem thread caught by libasan:

(gdb) thr 3
[Switching to thread 3 (Thread 0x7f1ae971e700 (LWP 3271756))]
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 return ret;
(gdb) bt full
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {18446744067266838239, 139753671451101, 106790115319360, 72197347709311613, 139753671629156, 3271753, 206158430216, 139753562427008,
139753562426816, 7012368136350825216, 139753676056768, 139753390149632, 139753562430192, 139753562427046, 139753678510688, 106858908091888}}
pid =
tid =
ret =
#1 0x00007f1aeec6bdb5 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x7f1aefef47e1 <__interceptor_free(void*)+209>,
sa_sigaction = 0x7f1aefef47e1 <__interceptor_free(void*)+209>}, sa_mask = {__val = {4838403, 5068870, 5070683, 5101591, 5103337, 5063904,
4431218, 6578138, 139753562428096, 106858908091888, 139753562426760, 8, 139753562428096, 139753562428096, 139753670522756, 139753562427504}},
sa_flags = -1183483136, sa_restorer = 0x6130074205b0}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f1aeff14c02 in __sanitizer::Abort () at ../../../../libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cc:145
No locals.
#3 0x00007f1aeff1d52c in __sanitizer::Die () at ../../../../libsanitizer/sanitizer_common/sanitizer_termination.cc:57
No locals.
#4 0x00007f1aefeffc55 in __asan::ScopedInErrorReport::~ScopedInErrorReport (this=, __in_chrg=)
at ../../../../libsanitizer/asan/asan_report.cc:181
buffer_copy =
#5 asan::ReportGenericError (pc=, bp=bp@entry=139753562430240, sp=sp@entry=139753562430224, addr=106858908091888,
is_write=is_write@entry=false, access_size=access_size@entry=8, exp=0, fatal=true) at ../../../../libsanitizer/asan/asan_report.cc:397
in_report = {error_report_lock
= {}, static current_error
= {kind = __asan::kErrorKindGeneric, {Base = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, DeadlySignal = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, signal = {siginfo = 0x2, context = 0x6130074205f0, addr = 3, pc = 3,
sp = 5440349067550916617, bp = 106858908091888, is_memory_access = 48, write_flag = __sanitizer::SignalContext::UNKNOWN}},
DoubleFree = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, second_free_stack = 0x2, addr_description = {addr = 106858908091888,
alloc_tid = 3, free_tid = 3, alloc_stack_id = 427819017, free_stack_id = 1266679975, chunk_access = {bad_addr = 106858908091888,
offset = 48, chunk_begin = 106858908091840, chunk_size = 355, access_type = 2, alloc_type = 1}}},
NewDeleteSizeMismatch = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, free_stack = 0x2, addr_description = {addr = 106858908091888, alloc_tid = 3,
free_tid = 3, alloc_stack_id = 427819017, free_stack_id = 1266679975, chunk_access = {bad_addr = 106858908091888, offset = 48,
chunk_begin = 106858908091840, chunk_size = 355, access_type = 2, alloc_type = 1}}, delete_size = 7012368136350825216},
FreeNotMalloced = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, free_stack = 0x2, addr_description = {data = {kind = 121767408, {shadow = {
addr = 3, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {addr = 3, alloc_tid = 3,
free_tid = 5440349067550916617, alloc_stack_id = 121767408, free_stack_id = 24880, chunk_access = {bad_addr = 48,
offset = 106858908091840, chunk_begin = 355, chunk_size = 5432374, access_type = 0, alloc_type = 0}}, stack = {addr = 3, tid = 3,
--Type for more, q to quit, c to continue without paging--
offset = 5440349067550916617, frame_pc = 106858908091888, access_size = 48, frame_descr = 0x6130074205c0 "\247"}, global = {
addr = 3, static kMaxGlobals = 4, globals = {{beg = 3, size = 5440349067550916617, size_with_redzone = 106858908091888,
name = 0x30 <error: Cannot access memory at address 0x30>, module_name = 0x6130074205c0 "\247", has_dynamic_init = 355,
location = 0x52e436 <fatal_exit+276>, odr_indicator = 7012368136350825216}, {beg = 206158430224, size = 139753562428464,
size_with_redzone = 17469195303558, name = 0x7f1ae971d490 "h9\306\360\032\177",
module_name = 0x7f1ae971d430 "te-read-heap-use-after-free", has_dynamic_init = 106378361083840, location = 0x2,
odr_indicator = 9223372036854776046}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0, module_name = 0x0,
has_dynamic_init = 0, location = 0x41b58ab3, odr_indicator = 139753562428400}, {beg = 106858908091840, size = 106790081373240,
size_with_redzone = 206158430224, name = 0x942920 <rbtree_null_node> " )\224",
module_name = 0x7f1ae971d450 "\210\317q\351\032\177", has_dynamic_init = 5471136, location = 0x6130074205c0,
odr_indicator = 106790081373240}}, reg_sites = {1102416563, 0, 7101984, 0}, access_size = 5470962, size = 26 '\032'},
addr = 3}}}}, AllocTypeMismatch = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, dealloc_stack = 0x2, addr_description = {addr = 106858908091888,
alloc_tid = 3, free_tid = 3, alloc_stack_id = 427819017, free_stack_id = 1266679975, chunk_access = {bad_addr = 106858908091888,
offset = 48, chunk_begin = 106858908091840, chunk_size = 355, access_type = 2, alloc_type = 1}}, alloc_type = 3111484160,
dealloc_type = 1632694186}, MallocUsableSizeNotOwned = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, stack = 0x2, addr_description = {data = {kind = 121767408, {shadow = {
addr = 3, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {addr = 3, alloc_tid = 3,
free_tid = 5440349067550916617, alloc_stack_id = 121767408, free_stack_id = 24880, chunk_access = {bad_addr = 48,
offset = 106858908091840, chunk_begin = 355, chunk_size = 5432374, access_type = 0, alloc_type = 0}}, stack = {addr = 3, tid = 3,
offset = 5440349067550916617, frame_pc = 106858908091888, access_size = 48, frame_descr = 0x6130074205c0 "\247"}, global = {
addr = 3, static kMaxGlobals = 4, globals = {{beg = 3, size = 5440349067550916617, size_with_redzone = 106858908091888,
name = 0x30 <error: Cannot access memory at address 0x30>, module_name = 0x6130074205c0 "\247", has_dynamic_init = 355,
location = 0x52e436 <fatal_exit+276>, odr_indicator = 7012368136350825216}, {beg = 206158430224, size = 139753562428464,
size_with_redzone = 17469195303558, name = 0x7f1ae971d490 "h9\306\360\032\177",
module_name = 0x7f1ae971d430 "te-read-heap-use-after-free", has_dynamic_init = 106378361083840, location = 0x2,
odr_indicator = 9223372036854776046}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0, module_name = 0x0,
has_dynamic_init = 0, location = 0x41b58ab3, odr_indicator = 139753562428400}, {beg = 106858908091840, size = 106790081373240,
size_with_redzone = 206158430224, name = 0x942920 <rbtree_null_node> " )\224",
module_name = 0x7f1ae971d450 "\210\317q\351\032\177", has_dynamic_init = 5471136, location = 0x6130074205c0,
odr_indicator = 106790081373240}}, reg_sites = {1102416563, 0, 7101984, 0}, access_size = 5470962, size = 26 '\032'},
addr = 3}}}}, SanitizerGetAllocatedSizeNotOwned = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, stack = 0x2, addr_description = {data = {kind = 121767408, {shadow = {
addr = 3, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {addr = 3, alloc_tid = 3,
free_tid = 5440349067550916617, alloc_stack_id = 121767408, free_stack_id = 24880, chunk_access = {bad_addr = 48,
offset = 106858908091840, chunk_begin = 355, chunk_size = 5432374, access_type = 0, alloc_type = 0}}, stack = {addr = 3, tid = 3,
offset = 5440349067550916617, frame_pc = 106858908091888, access_size = 48, frame_descr = 0x6130074205c0 "\247"}, global = {
addr = 3, static kMaxGlobals = 4, globals = {{beg = 3, size = 5440349067550916617, size_with_redzone = 106858908091888,
name = 0x30 <error: Cannot access memory at address 0x30>, module_name = 0x6130074205c0 "\247", has_dynamic_init = 355,
location = 0x52e436 <fatal_exit+276>, odr_indicator = 7012368136350825216}, {beg = 206158430224, size = 139753562428464,
size_with_redzone = 17469195303558, name = 0x7f1ae971d490 "h9\306\360\032\177",
module_name = 0x7f1ae971d430 "te-read-heap-use-after-free", has_dynamic_init = 106378361083840, location = 0x2,
odr_indicator = 9223372036854776046}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0, module_name = 0x0,
has_dynamic_init = 0, location = 0x41b58ab3, odr_indicator = 139753562428400}, {beg = 106858908091840, size = 106790081373240,
size_with_redzone = 206158430224, name = 0x942920 <rbtree_null_node> " )\224",
module_name = 0x7f1ae971d450 "\210\317q\351\032\177", has_dynamic_init = 5471136, location = 0x6130074205c0,
odr_indicator = 106790081373240}}, reg_sites = {1102416563, 0, 7101984, 0}, access_size = 5470962, size = 26 '\032'},
--Type for more, q to quit, c to continue without paging--
addr = 3}}}}, StringFunctionMemoryRangesOverlap = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, stack = 0x2, length1 = 106858908091888, length2 = 3, addr1_description = {
data = {kind = __asan::kAddressKindStack, {shadow = {addr = 5440349067550916617, kind = (unknown: -16), shadow_byte = 5 '\005'}, heap = {
addr = 5440349067550916617, alloc_tid = 106858908091888, free_tid = 48, alloc_stack_id = 121767360, free_stack_id = 24880,
chunk_access = {bad_addr = 355, offset = 5432374, chunk_begin = 7012368136350825216, chunk_size = 206158430224, access_type = 0,
alloc_type = 0}}, stack = {addr = 5440349067550916617, tid = 106858908091888, offset = 48, frame_pc = 106858908091840,
access_size = 355, frame_descr = 0x52e436 <fatal_exit+276> ""}, global = {addr = 5440349067550916617, static kMaxGlobals = 4,
globals = {{beg = 106858908091888, size = 48, size_with_redzone = 106858908091840,
name = 0x163 <error: Cannot access memory at address 0x163>, module_name = 0x52e436 <fatal_exit+276> "",
has_dynamic_init = 7012368136350825216, location = 0x3000000010, odr_indicator = 139753562428464}, {beg = 17469195303558,
size = 139753562428560, size_with_redzone = 139753562428464, name = 0x60c0246c97c0 "",
module_name = 0x2 <error: Cannot access memory at address 0x2>, has_dynamic_init = 9223372036854776046, location = 0x0,
odr_indicator = 0}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0,
module_name = 0x41b58ab3 <error: Cannot access memory at address 0x41b58ab3>, has_dynamic_init = 139753562428400,
location = 0x6130074205c0, odr_indicator = 106790081373240}, {beg = 206158430224, size = 9709856,
size_with_redzone = 139753562428496,
name = 0x537ba0 <rbtree_search+174> "\205\300t'I\215E\300H\211\302H\301\352\003H\201", <incomplete sequence \302>,
module_name = 0x6130074205c0 "\247", has_dynamic_init = 106790081373240, location = 0x41b58ab3, odr_indicator = 7101984}},
reg_sites = {5470962, 0, 5477914, 0}, access_size = 139753562430240, size = 16 '\020'}, addr = 5440349067550916617}}},
addr2_description = {data = {kind = 8, {shadow = {addr = 139753671585219, kind = _asan::kShadowKindLow, shadow_byte = 253 '\375'}, heap = {
addr = 139753671585219, alloc_tid = 139753562438912, free_tid = 106583908513840, alloc_stack_id = 279502912, free_stack_id = 24864,
chunk_access = {bad_addr = 106858968657856, offset = 106790081372992, chunk_begin = 1102416563, chunk_size = 7264512,
access_type = 2, alloc_type = 2}}, stack = {addr = 139753671585219, tid = 139753562438912, offset = 106583908513840,
frame_pc = 106790346350656, access_size = 106858968657856, frame_descr = 0x612000dda340 "\300\236", <incomplete sequence \335>},
global = {addr = 139753671585219, static kMaxGlobals = 4, globals = {{beg = 139753562438912, size = 106583908513840,
size_with_redzone = 106790346350656, name = 0x61300ade2fc0 "@\243", <incomplete sequence \335>,
module_name = 0x612000dda340 "\300\236", <incomplete sequence \335>, has_dynamic_init = 1102416563, location = 0x6ed900,
odr_indicator = 6578362}, {beg = 139753562430272, size = 139753562429952, size_with_redzone = 106378361083840,
name = 0x7f1ae971db60 "\200\333q\351\376\377\377\377\320\333q\351\032\177",
module_name = 0x63872f <reuse_tcp_remove_tree_list+690> "H\213\205\220\376\377\377H\005\300",
has_dynamic_init = 106790081373024, location = 0x613000030080, odr_indicator = 7242016}, {beg = 17187126944, size = 0,
size_with_redzone = 18033364716583124994, name = 0x0,
module_name = 0x5e971da60 <error: Cannot access memory at address 0x5e971da60>, has_dynamic_init = 206158430224,
location = 0x7f1ae971db50, odr_indicator = 139753562430064}, {beg = 139753663233543, size = 21474836496,
size_with_redzone = 8770, name = 0x7f1ae971da90 "\300", module_name = 0x60c003526480 "", has_dynamic_init = 105690610497296,
location = 0x0, odr_indicator = 8770}}, reg_sites = {55277312, 24608, 192, 0}, access_size = 16, size = 64 '@'},
addr = 139753671585219}}}, function = 0x7f1aef739640 <evmap_io_del
+224> "\272\001"},
StringFunctionSizeOverflow = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, stack = 0x2, addr_description = {data = {kind = 121767408, {shadow = {
addr = 3, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {addr = 3, alloc_tid = 3,
free_tid = 5440349067550916617, alloc_stack_id = 121767408, free_stack_id = 24880, chunk_access = {bad_addr = 48,
offset = 106858908091840, chunk_begin = 355, chunk_size = 5432374, access_type = 0, alloc_type = 0}}, stack = {addr = 3, tid = 3,
offset = 5440349067550916617, frame_pc = 106858908091888, access_size = 48, frame_descr = 0x6130074205c0 "\247"}, global = {
addr = 3, static kMaxGlobals = 4, globals = {{beg = 3, size = 5440349067550916617, size_with_redzone = 106858908091888,
name = 0x30 <error: Cannot access memory at address 0x30>, module_name = 0x6130074205c0 "\247", has_dynamic_init = 355,
location = 0x52e436 <fatal_exit+276>, odr_indicator = 7012368136350825216}, {beg = 206158430224, size = 139753562428464,
size_with_redzone = 17469195303558, name = 0x7f1ae971d490 "h9\306\360\032\177",
module_name = 0x7f1ae971d430 "te-read-heap-use-after-free", has_dynamic_init = 106378361083840, location = 0x2,
odr_indicator = 9223372036854776046}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0, module_name = 0x0,
has_dynamic_init = 0, location = 0x41b58ab3, odr_indicator = 139753562428400}, {beg = 106858908091840, size = 106790081373240,
size_with_redzone = 206158430224, name = 0x942920 <rbtree_null_node> " )\224",
--Type for more, q to quit, c to continue without paging--
module_name = 0x7f1ae971d450 "\210\317q\351\032\177", has_dynamic_init = 5471136, location = 0x6130074205c0,
odr_indicator = 106790081373240}}, reg_sites = {1102416563, 0, 7101984, 0}, access_size = 5470962, size = 26 '\032'},
addr = 3}}}, size = 139753562430240}, BadParamsToAnnotateContiguousContainer = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, stack = 0x2, beg = 106858908091888, end = 3, old_mid = 3,
new_mid = 5440349067550916617}, ODRViolation = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, global1 = {beg = 2, size = 106858908091888, size_with_redzone = 3,
name = 0x3 <error: Cannot access memory at address 0x3>,
module_name = 0x4b8000a719800009 <error: Cannot access memory at address 0x4b8000a719800009>, has_dynamic_init = 106858908091888,
location = 0x30, odr_indicator = 106858908091840}, global2 = {beg = 355, size = 5432374, size_with_redzone = 7012368136350825216,
name = 0x3000000010 <error: Cannot access memory at address 0x3000000010>, module_name = 0x7f1ae971d430 "te-read-heap-use-after-free",
has_dynamic_init = 17469195303558, location = 0x7f1ae971d490, odr_indicator = 139753562428464}, stack_id1 = 611096512,
stack_id2 = 24768}, InvalidPointerPair = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, pc = 2, bp = 106858908091888, sp = 3, addr1_description = {data = {
kind = __asan::kAddressKindStack, {shadow = {addr = 5440349067550916617, kind = (unknown: -16), shadow_byte = 5 '\005'}, heap = {
addr = 5440349067550916617, alloc_tid = 106858908091888, free_tid = 48, alloc_stack_id = 121767360, free_stack_id = 24880,
chunk_access = {bad_addr = 355, offset = 5432374, chunk_begin = 7012368136350825216, chunk_size = 206158430224, access_type = 0,
alloc_type = 0}}, stack = {addr = 5440349067550916617, tid = 106858908091888, offset = 48, frame_pc = 106858908091840,
access_size = 355, frame_descr = 0x52e436 <fatal_exit+276> ""}, global = {addr = 5440349067550916617, static kMaxGlobals = 4,
globals = {{beg = 106858908091888, size = 48, size_with_redzone = 106858908091840,
name = 0x163 <error: Cannot access memory at address 0x163>, module_name = 0x52e436 <fatal_exit+276> "",
has_dynamic_init = 7012368136350825216, location = 0x3000000010, odr_indicator = 139753562428464}, {beg = 17469195303558,
size = 139753562428560, size_with_redzone = 139753562428464, name = 0x60c0246c97c0 "",
module_name = 0x2 <error: Cannot access memory at address 0x2>, has_dynamic_init = 9223372036854776046, location = 0x0,
odr_indicator = 0}, {beg = 0, size = 0, size_with_redzone = 0, name = 0x0,
module_name = 0x41b58ab3 <error: Cannot access memory at address 0x41b58ab3>, has_dynamic_init = 139753562428400,
location = 0x6130074205c0, odr_indicator = 106790081373240}, {beg = 206158430224, size = 9709856,
size_with_redzone = 139753562428496,
name = 0x537ba0 <rbtree_search+174> "\205\300t'I\215E\300H\211\302H\301\352\003H\201", <incomplete sequence \302>,
module_name = 0x6130074205c0 "\247", has_dynamic_init = 106790081373240, location = 0x41b58ab3, odr_indicator = 7101984}},
reg_sites = {5470962, 0, 5477914, 0}, access_size = 139753562430240, size = 16 '\020'}, addr = 5440349067550916617}}},
addr2_description = {data = {kind = 8, {shadow = {addr = 139753671585219, kind = __asan::kShadowKindLow, shadow_byte = 253 '\375'}, heap = {
addr = 139753671585219, alloc_tid = 139753562438912, free_tid = 106583908513840, alloc_stack_id = 279502912, free_stack_id = 24864,
chunk_access = {bad_addr = 106858968657856, offset = 106790081372992, chunk_begin = 1102416563, chunk_size = 7264512,
access_type = 2, alloc_type = 2}}, stack = {addr = 139753671585219, tid = 139753562438912, offset = 106583908513840,
frame_pc = 106790346350656, access_size = 106858968657856, frame_descr = 0x612000dda340 "\300\236", <incomplete sequence \335>},
global = {addr = 139753671585219, static kMaxGlobals = 4, globals = {{beg = 139753562438912, size = 106583908513840,
size_with_redzone = 106790346350656, name = 0x61300ade2fc0 "@\243", <incomplete sequence \335>,
module_name = 0x612000dda340 "\300\236", <incomplete sequence \335>, has_dynamic_init = 1102416563, location = 0x6ed900,
odr_indicator = 6578362}, {beg = 139753562430272, size = 139753562429952, size_with_redzone = 106378361083840,
name = 0x7f1ae971db60 "\200\333q\351\376\377\377\377\320\333q\351\032\177",
module_name = 0x63872f <reuse_tcp_remove_tree_list+690> "H\213\205\220\376\377\377H\005\300",
has_dynamic_init = 106790081373024, location = 0x613000030080, odr_indicator = 7242016}, {beg = 17187126944, size = 0,
size_with_redzone = 18033364716583124994, name = 0x0,
module_name = 0x5e971da60 <error: Cannot access memory at address 0x5e971da60>, has_dynamic_init = 206158430224,
location = 0x7f1ae971db50, odr_indicator = 139753562430064}, {beg = 139753663233543, size = 21474836496,
size_with_redzone = 8770, name = 0x7f1ae971da90 "\300", module_name = 0x60c003526480 "", has_dynamic_init = 105690610497296,
location = 0x0, odr_indicator = 8770}}, reg_sites = {55277312, 24608, 192, 0}, access_size = 16, size = 64 '@'},
addr = 139753671585219}}}}, Generic = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, addr_description = {data = {kind = __asan::kAddressKindHeap, {shadow = {
addr = 106858908091888, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {
addr = 106858908091888, alloc_tid = 3, free_tid = 3, alloc_stack_id = 427819017, free_stack_id = 1266679975, chunk_access = {
bad_addr = 106858908091888, offset = 48, chunk_begin = 106858908091840, chunk_size = 355, access_type = 2, alloc_type = 1}},
stack = {addr = 106858908091888, tid = 3, offset = 3, frame_pc = 5440349067550916617, access_size = 106858908091888,
frame_descr = 0x30 <error: Cannot access memory at address 0x30>}, global = {addr = 106858908091888, static kMaxGlobals = 4,
globals = {{beg = 3, size = 3, size_with_redzone = 5440349067550916617, name = 0x6130074205f0 " )\224",
module_name = 0x30 <error: Cannot access memory at address 0x30>, has_dynamic_init = 106858908091840, location = 0x163,
odr_indicator = 5432374}, {beg = 7012368136350825216, size = 206158430224, size_with_redzone = 139753562428464,
name = 0xfe35d2e3a86 <error: Cannot access memory at address 0xfe35d2e3a86>, module_name = 0x7f1ae971d490 "h9\306\360\032\177",
has_dynamic_init = 139753562428464, location = 0x60c0246c97c0, odr_indicator = 2}, {beg = 9223372036854776046, size = 0,
size_with_redzone = 0, name = 0x0, module_name = 0x0, has_dynamic_init = 0, location = 0x0, odr_indicator = 1102416563}, {
beg = 139753562428400, size = 106858908091840, size_with_redzone = 106790081373240,
name = 0x3000000010 <error: Cannot access memory at address 0x3000000010>, module_name = 0x942920 <rbtree_null_node> " )\224",
has_dynamic_init = 139753562428496, location = 0x537ba0 <rbtree_search+174>, odr_indicator = 106858908091840}}, reg_sites = {
14525496, 24864, 1102416563, 0}, access_size = 7101984, size = 242 '\362'}, addr = 106858908091888}}}, pc = 5477914,
bp = 139753562430240, sp = 139753562430224, access_size = 8, bug_descr = 0x7f1aeff36dc3 "heap-use-after-free", is_write = false,
shadow_val = 253 '\375'}}}, halt_on_error = true}
error = {<__asan::ErrorBase> = {scariness = {score = 51,
descr = "8-byte-read-heap-use-after-free\000\032\177\000\000\210\317q\351\032\177\000\000\001\000\000\000\000\000\000\000\016\325\335\360\032\177\000\000\230\002\000\000\000\000\000\000P.\341\357\032\177\000\000\320\321\377\360\032\177\000\000\210\317q\351\032\177\000\000\204\317q\351\032\177\000\000h9\306\360\032\177\000\000 \320q\351\032\177\000\000\000\000\000\000\000\000\000\000\360\205\340\357\032\177\000\000\063!@\000\000\000\000\000\070\333\347\303\000\000\000\000H\030@\000\000\000\000\000P.\341\357\032\177\000\000l\237\017\003\000\000\000\000P\320q\351\032\177\000\000@\320q\351\032\177\000\000\200H\377\360\032\177\000\000\017\000\000\000\000\000\000\000"...}, tid = 3}, addr_description = {data = {kind = __asan::kAddressKindHeap, {shadow = {
addr = 106858908091888, kind = (__asan::kShadowKindGap | __asan::kShadowKindHigh), shadow_byte = 0 '\000'}, heap = {
addr = 106858908091888, alloc_tid = 3, free_tid = 3, alloc_stack_id = 427819017, free_stack_id = 1266679975, chunk_access = {
bad_addr = 106858908091888, offset = 48, chunk_begin = 106858908091840, chunk_size = 355, access_type = 2, alloc_type = 1}}, stack = {
addr = 106858908091888, tid = 3, offset = 3, frame_pc = 5440349067550916617, access_size = 106858908091888,
frame_descr = 0x30 <error: Cannot access memory at address 0x30>}, global = {addr = 106858908091888, static kMaxGlobals = 4, globals = {{
beg = 3, size = 3, size_with_redzone = 5440349067550916617, name = 0x6130074205f0 " )\224",
module_name = 0x30 <error: Cannot access memory at address 0x30>, has_dynamic_init = 106858908091840, location = 0x163,
odr_indicator = 5432374}, {beg = 7012368136350825216, size = 206158430224, size_with_redzone = 139753562428464,
name = 0xfe35d2e3a86 <error: Cannot access memory at address 0xfe35d2e3a86>, module_name = 0x7f1ae971d490 "h9\306\360\032\177",
has_dynamic_init = 139753562428464, location = 0x60c0246c97c0, odr_indicator = 2}, {beg = 9223372036854776046, size = 0,
size_with_redzone = 0, name = 0x0, module_name = 0x0, has_dynamic_init = 0, location = 0x0, odr_indicator = 1102416563}, {
beg = 139753562428400, size = 106858908091840, size_with_redzone = 106790081373240,
name = 0x3000000010 <error: Cannot access memory at address 0x3000000010>, module_name = 0x942920 <rbtree_null_node> " )\224",
has_dynamic_init = 139753562428496, location = 0x537ba0 <rbtree_search+174>, odr_indicator = 106858908091840}}, reg_sites = {
14525496, 24864, 1102416563, 0}, access_size = 7101984, size = 242 '\362'}, addr = 106858908091888}}}, pc = 5477914,
bp = 139753562430240, sp = 139753562430224, access_size = 8, bug_descr = 0x7f1aeff36dc3 "heap-use-after-free", is_write = false,
shadow_val = 253 '\375'}
#6 0x00007f1aeff0075b in __asan::__asan_report_load8 (addr=) at ../../../../libsanitizer/asan/asan_rtl.cc:112
bp = 139753562430240
pc =
local_stack = 139753562430384
sp = 139753562430224
#7 0x000000000053961a in rbtree_next (node=0x6130074205e8) at util/rbtree.c:577
parent = 0x61300ade2fc0
#8 0x00000000006390ac in reuse_cb_readwait_for_failure (tree_by_id=0x7f1ae971dbd0, err=-2) at services/outside_network.c:1135
w = 0x61300ade2fc0
node = 0x61300ade2fe8
#9 0x0000000000639355 in reuse_cb_and_decommission (outnet=0x613000030080, pend=0x612000dda340, error=-2) at services/outside_network.c:1154
store = {root = 0x61300ade2fe8, count = 2, cmp = 0x632871 <reuse_id_cmp>}
#10 0x0000000000640e47 in outnet_tcptimer (arg=0x61300ade2fc0) at services/outside_network.c:2197
pend = 0x612000dda340
w = 0x61300ade2fc0
outnet = 0x613000030080
#11 0x0000000000625f3e in comm_timer_callback (fd=-1, event=1, arg=0x6040089ac910) at util/netevent.c:4280
tm = 0x6040089ac910
func = "comm_timer_callback"
#12 0x00007f1aef736be2 in event_process_active_single_queue (base=base@entry=0x61700002ff00, activeq=0x602000011010,
max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0) at event.c:1691
evcb_callback =
res = 1
ev = 0x60c0246c97c0
evcb =
count = 2
func = "event_process_active_single_queue"
#13 0x00007f1aef7373e7 in event_process_active (base=0x61700002ff00) at event.c:1783
activeq =
i = 0
c = 0
tv = {tv_sec = 8936595, tv_usec = 347428}
maxcb = 2147483647
endtime = 0x0
limit_after_prio = 2147483647
activeq =
i =
c =
endtime =
tv =
maxcb =
limit_after_prio =
#14 event_base_loop (base=0x61700002ff00, flags=) at event.c:2006
n =
evsel = 0x7f1aef965b20
tv = {tv_sec = 0, tv_usec = 880}
tv_p =
res =
done = 0
retval = 0
func = "event_base_loop"
#15 0x000000000064ce5a in ub_event_base_dispatch (base=0x61700002ff00) at util/ub_event.c:280
No locals.
#16 0x000000000060dc9f in comm_base_dispatch (b=0x60300004a830) at util/netevent.c:256
retval = 32767
#17 0x000000000044a185 in worker_work (worker=0x62300000e100) at daemon/worker.c:1940
No locals.
#18 0x000000000041283d in thread_start (arg=0x62300000e100) at daemon/daemon.c:541
worker = 0x62300000e100
port_num = 3
#19 0x00007f1aef01714a in start_thread (arg=) at pthread_create.c:479
ret =
pd =
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139753562433280, 5629422553894277874, 140735375915374, 140735375915375, 0, 139753562431296,
-5752667792546409742, -5752671231268433166}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0,
canceltype = 0}}}
not_first_call =
#20 0x00007f1aeed46dc3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
(gdb)

Now, if we dig into the thread where Unbound was terminated by the sanitizer:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f1aeec6bdb5 in __GI_abort () at abort.c:79
#2 0x00007f1aeff14c02 in __sanitizer::Abort () at ../../../../libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cc:145
#3 0x00007f1aeff1d52c in __sanitizer::Die () at ../../../../libsanitizer/sanitizer_common/sanitizer_termination.cc:57
#4 0x00007f1aefeffc55 in __asan::ScopedInErrorReport::~ScopedInErrorReport (this=, __in_chrg=)
at ../../../../libsanitizer/asan/asan_report.cc:181
#5 __asan::ReportGenericError (pc=, bp=bp@entry=139753562430240, sp=sp@entry=139753562430224, addr=106858908091888,
is_write=is_write@entry=false, access_size=access_size@entry=8, exp=0, fatal=true) at ../../../../libsanitizer/asan/asan_report.cc:397
#6 0x00007f1aeff0075b in __asan::__asan_report_load8 (addr=) at ../../../../libsanitizer/asan/asan_rtl.cc:112
#7 0x000000000053961a in rbtree_next (node=0x6130074205e8) at util/rbtree.c:577
#8 0x00000000006390ac in reuse_cb_readwait_for_failure (tree_by_id=0x7f1ae971dbd0, err=-2) at services/outside_network.c:1135
#9 0x0000000000639355 in reuse_cb_and_decommission (outnet=0x613000030080, pend=0x612000dda340, error=-2) at services/outside_network.c:1154
#10 0x0000000000640e47 in outnet_tcptimer (arg=0x61300ade2fc0) at services/outside_network.c:2197
#11 0x0000000000625f3e in comm_timer_callback (fd=-1, event=1, arg=0x6040089ac910) at util/netevent.c:4280
#12 0x00007f1aef736be2 in event_process_active_single_queue (base=base@entry=0x61700002ff00, activeq=0x602000011010,
max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0) at event.c:1691
#13 0x00007f1aef7373e7 in event_process_active (base=0x61700002ff00) at event.c:1783
#14 event_base_loop (base=0x61700002ff00, flags=) at event.c:2006
#15 0x000000000064ce5a in ub_event_base_dispatch (base=0x61700002ff00) at util/ub_event.c:280
#16 0x000000000060dc9f in comm_base_dispatch (b=0x60300004a830) at util/netevent.c:256
#17 0x000000000044a185 in worker_work (worker=0x62300000e100) at daemon/worker.c:1940
#18 0x000000000041283d in thread_start (arg=0x62300000e100) at daemon/daemon.c:541
#19 0x00007f1aef01714a in start_thread (arg=) at pthread_create.c:479
#20 0x00007f1aeed46dc3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) fr 7
#7 0x000000000053961a in rbtree_next (node=0x6130074205e8) at util/rbtree.c:577
577 for (node = node->right; node->left != RBTREE_NULL; node = node->left);
(gdb) list
572 {
573 rbnode_type *parent;
574
575 if (node->right != RBTREE_NULL) {
576 /* One right, then keep on going left... */
577 for (node = node->right; node->left != RBTREE_NULL; node = node->left);
578 } else {
579 parent = node->parent;
580 while (parent != RBTREE_NULL && node == parent->right) {
581 node = parent;

So we can see that the thread crashes in the for() loop at line 577 of util/rbtree.c. As far as I understand, free() was called somewhere before this loop, so the node = node->right dereference provokes the crash. Could you please look at the stack trace above? Maybe it will point to the place in the code with the root cause.
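
To illustrate the class of bug this looks like, here is a minimal, self-contained sketch in C (purely hypothetical names such as item, cb_that_frees, iterate_unsafe and iterate_safe; this is not Unbound code): if a callback invoked during a traversal frees the element the iterator still points at, the subsequent read of its next/right pointer is exactly the kind of 8-byte heap-use-after-free that ASAN reports above. The usual workaround is to save the successor before running the callback:

#include <stdio.h>
#include <stdlib.h>

struct item {
    struct item* next;
    int id;
};

/* hypothetical callback that ends up freeing the element it is given */
static void cb_that_frees(struct item* it)
{
    printf("callback for item %d\n", it->id);
    free(it);
}

/* BROKEN: reads it->next after cb_that_frees(it) has freed it */
static void iterate_unsafe(struct item* head)
{
    struct item* it;
    for(it = head; it; it = it->next) /* use-after-free on the increment */
        cb_that_frees(it);
}

/* SAFE: fetch the successor before the callback can free the element */
static void iterate_safe(struct item* head)
{
    struct item* it = head;
    while(it) {
        struct item* next = it->next; /* saved before the callback */
        cb_that_frees(it);
        it = next;
    }
}

int main(int argc, char** argv)
{
    /* build a short list 0 -> 1 -> 2 */
    struct item* head = NULL;
    int i;
    for(i = 2; i >= 0; i--) {
        struct item* it = malloc(sizeof(*it));
        if(!it) return 1;
        it->id = i;
        it->next = head;
        head = it;
    }
    if(argc > 1)
        iterate_unsafe(head); /* run with any argument to let ASAN flag the bug */
    else
        iterate_safe(head);
    return 0;
}

Compiling this sketch with -fsanitize=address and running it with an argument produces the same "heap-use-after-free ... READ of size 8" report pattern, which is why the backtrace above points at rbtree_next() rather than at the place that did the free().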

Big thanks in advance!

@gthess
Copy link
Member

gthess commented Dec 14, 2021

Hi @iruzanov!
I am currently working on some race condition bugs (related to the TCP code) and this one looks like one of them. I was planning to update you here once they were solved, but it's good that you already reported it.
Can I count you in for testing? It would also be interesting to see how often you get these, so that we know what to expect for reproducibility in production.

@iruzanov
Copy link
Author

Sure, I am totally at your service! ;)
As for the rate of crashes: on less loaded sites Unbound works well, maybe one core dump per month. On more loaded sites, which also carry a lot of TCP traffic, I see one or two crashes per day.

@iruzanov
Copy link
Author

Hello, @gthess!

A couple of days ago I received an email with the subject "Unbound DoS vulnerability (only with debugging enabled!)" from your colleagues. In short, it is a warning to use "--disable-debug --disable-checking" when compiling Unbound, to avoid the resolver terminating when it receives a crafted packet from an upstream. It concerns versions 1.13.x and 1.14.0 of the program. So my question is: does it relate to the race conditions in the TCP code that you mentioned above? And are there any ready fixes that I could test myself on the Unbound (1.13.2) running on my servers? ;)

@gthess
Copy link
Member

gthess commented Jan 14, 2022

Using '--disable-debug --disable-checking' explicitly turns off debugging (useful if you are not sure what is happening with the configure line in your environment). If you don't specify them at all, debugging is off by default.
It does not relate to the race condition; it just happened to be found during testing for the race condition bug.
Apart from that, I don't have complete fixes for the race condition yet.
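
For anyone following along, here is a tiny illustrative sketch of the general idea; the ENABLE_CHECKING define and check_assert macro are hypothetical and not Unbound's actual code. A build compiled with internal checking enabled turns an unexpected condition into an abort, while a normal build simply takes the error path:

#include <stdio.h>
#include <stdlib.h>

#ifdef ENABLE_CHECKING /* imagine a debug/checking-style configure flag defines this */
#define check_assert(x) \
    do { if(!(x)) { fprintf(stderr, "assertion failed: %s\n", #x); abort(); } } while(0)
#else
#define check_assert(x) ((void)0) /* compiled out in production builds */
#endif

/* pretend this parses an untrusted field from the wire */
static int parse_len(int len_from_wire)
{
    check_assert(len_from_wire >= 0); /* fatal only in checking builds */
    if(len_from_wire < 0)
        return -1; /* production builds reach the normal error path */
    return len_from_wire;
}

int main(void)
{
    printf("parse result: %d\n", parse_len(-1));
    return 0;
}

Built normally this just prints -1; built with -DENABLE_CHECKING it aborts, which is the kind of behavior the advisory warns about for debug-enabled builds.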

@iruzanov
Copy link
Author

Thanks for the detailed clarification!
My Unbound resolvers are now running without deep debugging, but I am ready to recompile the program whenever you ask.

@gthess
Copy link
Member

gthess commented Jan 25, 2022

Hi @iruzanov,
The above PR addresses the issues I was seeing with the race condition. It will be merged into master sometime this week, but if you would like to test already you can build from that branch in the meantime.

@iruzanov
Copy link
Author

Hello @gthess!

I will gladly test your fix pack mentioned above, BIG thank you! Next week I will report my first observations of Unbound on a couple of my loaded servers.

@gthess
Copy link
Member

gthess commented Jan 25, 2022

Just a heads up that the branch is now merged into master, which also includes other (relevant) fixes.

@iruzanov
Copy link
Author

Ohh, ok! So I need to recompile my first patched Unbound again, because I fetched the master branch with the fixes from #612 yesterday :)

@gthess
Copy link
Member

gthess commented Jan 26, 2022

I would just go for the master branch as-is, since it contains other fixes. For this case, 8e76eb9 addresses a dnstap issue that could trigger something similar.

@iruzanov
Copy link
Author

Thank you @gthess!
I compiled Unbound from the master branch once again a few hours ago.
My plan is to observe the server with the patched program over the next week and then upgrade some more of my loaded servers. I will report my first results next week!

@iruzanov
Copy link
Author

iruzanov commented Feb 1, 2022

Hello @gthess!

My first resolver with the new code from the master branch has worked fine over the last week. Today I upgraded the next two of my loaded resolvers to the latest code.
I will monitor these servers for 5-7 days and then report back on the status of the upgrade.

@iruzanov
Copy link
Author

iruzanov commented Feb 10, 2022

Hi, @gthess!
I am glad to report that my three patched, heavily loaded resolvers have been working stably for the last 7 days. So I have decided to patch the next three on Monday and monitor them until Friday.

@gthess
Copy link
Member

gthess commented Feb 11, 2022

That's always good to hear, thanks for always reporting back!

@iruzanov
Copy link
Author

Hi, @gthess!

I'm sorry for the long silence. I was waiting for a core dump on one of the loaded resolvers that I had not yet patched, and today such a resolver crashed ;) So a few hours ago I patched the next three loaded resolvers. The resolvers patched the previous week work fine.
Please give me some time to monitor all of the patched resolvers alongside the non-patched ones, to see whether a core dump happens on any of them.

@iruzanov
Copy link
Author

iruzanov commented Mar 9, 2022

Hello, @gthess!
I plan to upgrade the next set of my resolvers from the master branch (which I saved one month ago, version 1.14.1). But I saw that the new version 1.15.0 of Unbound was released on 10 February. So can I upgrade to 1.15.0, or is 1.14.1 more suitable for my testing?

@gthess
Copy link
Member

gthess commented Mar 14, 2022

Hi!
1.14.1 was supposed to be the version after 1.14.0, but since the new code included changes to the ratelimit logic and the default value of aggressive nsec, we had to increase the major version. Thus, 1.14.1 turned into 1.15.0 upon release.
So to answer your question, 1.15.0 is better to use than a specific point during development (and it includes all the relevant fixes). It also went through the testing we do before releasing.

Btw, are you subscribed to the unbound-users mailing list? Early announcements about releases (also for release candidates) are announced there.

@iruzanov
Copy link
Author

Thank you for your answer!
I will upgrade all of my resolvers to version 1.15.0 then.

No, I am not subscribed yet. Usually I just visit your website to check the latest news about NSD and Unbound ;) Ok, I will subscribe.

@gthess
Copy link
Member

gthess commented Mar 14, 2022

In that case nsd-users can also be useful to you.

@gthess
Copy link
Member

gthess commented Mar 17, 2023

Closing this as resolved by now.

@gthess gthess closed this as completed Mar 17, 2023