[2.4.0] SIGSEGV in zdb_zone_load_ex #9

Closed
kolbma opened this issue Nov 6, 2020 · 9 comments

kolbma commented Nov 6, 2020

I always get a SIGSEGV with version 2.4.0 when the zone file is loaded.
It looks like a race condition, though: if I set a breakpoint at the function zdb_zone_load_ex and continue afterwards, it works without a SIGSEGV.

Starting program: /usr/sbin/yadifad --log --nodaemon -u yadifa -g yadifa -c /etc/yadifa/yadifad.conf
warning: Error disabling address space randomization: Operation not permitted
[New LWP 2170]
[LWP 2170 exited]
[Detaching after fork from child process 2171]
[New LWP 2172]
[New LWP 2175]
2020-11-06 10:30:56.961131 |  2169 | main     | server   | I | starting YADIFA 2.4.0-9809
2020-11-06 10:30:56.961134 |  2169 | main     | server   | I | built with --enable-dynamic-provisioning --enable-ctrl --enable-static=no --enable-shared --enable-non-aa-axfr-support --enable-rrl
2020-11-06 10:30:56.961136 |  2169 | main     | server   | I | release build
2020-11-06 10:30:56.961138 |  2169 | main     | server   | I | ------------------------------------------------
2020-11-06 10:30:56.961139 |  2169 | main     | server   | I | YADIFA is maintained by EURid
2020-11-06 10:30:56.961141 |  2169 | main     | server   | I | Source code is available at http://www.yadifa.eu
2020-11-06 10:30:56.961143 |  2169 | main     | server   | I | ------------------------------------------------
2020-11-06 10:30:56.961147 |  2169 | main     | server   | I | got 2 CPUs
2020-11-06 10:30:56.961148 |  2169 | main     | server   | I | using 1 UDP listeners per interface
2020-11-06 10:30:56.961150 |  2169 | main     | server   | I | accepting up to 100 TCP queries
[New LWP 2176]
[New LWP 2177]
[New LWP 2178]
[New LWP 2179]
[New LWP 2180]
[New LWP 2181]
[New LWP 2182]
[New LWP 2183]
[New LWP 2184]
[New LWP 2185]
[New LWP 2186]
[New LWP 2187]
[New LWP 2188]
[New LWP 2189]
[New LWP 2190]
[New LWP 2191]
[New LWP 2192]
[New LWP 2193]
[New LWP 2194]
[New LWP 2195]
[New LWP 2196]
2020-11-06 10:30:56.970923 |  2169 | main     | server   | I | zone: localhost.: 00007FF67CF92000: config: registered
2020-11-06 10:30:56.972386 |  2169 | main     | server   | I | database: reconfigure done
2020-11-06 10:30:56.972605 |  2169 | signal   | system   | I | signal: thread started
2020-11-06 10:30:56.977164 |  2169 | main     | system   | I | changing identity to 100:101 (current: 0:0)
2020-11-06 10:30:56.991069 |  2169 | main     | server   | I | loading zones
[New LWP 2197]
2020-11-06 10:30:56.991715 |  2169 | main     | server   | I | starting notify service
2020-11-06 10:30:56.991892 |  2169 | DBsrvice | server   | I | database: service starting
2020-11-06 10:30:56.991950 |  2169 | DBsrvice | server   | I | database: service started[New LWP 2198]

2020-11-06 10:30:56.992417 |  2169 | dbload   | server   | I | zone load: loading '/var/yadifa/zones/masters/localhost.zone'
2020-11-06 10:30:56.992582 |  2169 | yadifad0 | server   | I | notify: notification service started
[New LWP 2199]
2020-11-06 10:30:56.995173 |  2169 | yadifad1 | server   | I | notify: notification service IPv4 receiver started (socket 3)
[New LWP 2200]
--Type <RET> for more, q to quit, c to continue without paging--

Thread 6 "dbload" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 2177]
zdb_zone_load_ex (parms=0x7ff67e270990) at src/zdb_zone_load.c:297
297	    resource_record_init(&entry);
(gdb) bt
#0  zdb_zone_load_ex (parms=0x7ff67e270990) at src/zdb_zone_load.c:297
Backtrace stopped: Cannot access memory at address 0x7ff67e25e6b8

I think the source line gdb shows for line 297 might not be correct.

This is the simple zone file...

# cat /var/yadifa/zones/masters/localhost.zone
$TTL    86400                     ; 24 hours
$ORIGIN localhost.

localhost.  86400   IN  SOA localhost. root.localhost. (
                         20120201 ; serial
                         3H       ; refresh
                         15       ; retry
                         1w       ; expire
                         3h       ; minimum
                        )

            86400   IN  NS      localhost.
            86400   IN  A       127.0.0.1

kolbma commented Nov 6, 2020

Here is another gdb run with debug log enabled...

Starting program: /usr/sbin/yadifad --log --nodaemon -u yadifa -g yadifa -c /etc/yadifa/yadifad.conf
warning: Error disabling address space randomization: Operation not permitted
[New LWP 3200]
[LWP 3200 exited]
[Detaching after fork from child process 3201]
[New LWP 3202]
[New LWP 3205]
2020-11-06 11:02:57.187435 |  3199 | main     | server   | I | starting YADIFA 2.4.0-9809[New LWP 3206]

2020-11-06 11:02:57.187441 |  3199 | main     | server   | I | built with --enable-dynamic-provisioning --enable-ctrl --enable-static=no --enable-shared --enable-non-aa-axfr-support --enable-rrl[New LWP 3207]

2020-11-06 11:02:57.187443 |  3199 | main     | server   | I | release build
[New LWP 3208]
2020-11-06 11:02:57.187445 |  3199 | main     | server   | I | ------------------------------------------------
2020-11-06 11:02:57.187447 |  3199 | main     | server   | I | YADIFA is maintained by EURid[New LWP 3209]

2020-11-06 11:02:57.187448 |  3199 | main     | server   | I | Source code is available at http://www.yadifa.eu[New LWP 3210]

[New LWP 3211]
2020-11-06 11:02:57.187450 |  3199 | main     | server   | I | ------------------------------------------------
[New LWP 3212]
2020-11-06 11:02:57.187454 |  3199 | main     | server   | I | got 2 CPUs
[New LWP 3213]
2020-11-06 11:02:57.187455 |  3199 | main     | server   | I | using 1 UDP listeners per interface
[New LWP 3214]
2020-11-06 11:02:57.187457 |  3199 | main     | server   | I | accepting up to 100 TCP queries[New LWP 3215]

2020-11-06 11:02:57.187461 |  3199 | main     | system   | D | service: yadifad init 1 workers[New LWP 3216]

[New LWP 3217]
2020-11-06 11:02:57.187469 |  3199 | main     | system   | D | thread-pool: 'keypub' init
[New LWP 3218]
2020-11-06 11:02:57.187645 |  3199 | main     | system   | D | thread-pool: 'keypub' ready[New LWP 3219]

[New LWP 3220]
2020-11-06 11:02:57.187648 |  3199 | main     | system   | D | thread-pool: 'dbload' init
[New LWP 3221]
2020-11-06 11:02:57.187853 |  3199 | main     | system   | D | thread-pool: 'dbload' ready
[New LWP 3222]
2020-11-06 11:02:57.187856 |  3199 | main     | system   | D | thread-pool: 'dbstore' init
[New LWP 3223]
[New LWP 3224]
2020-11-06 11:02:57.188057 |  3199 | main     | system   | D | thread-pool: 'dbstore' ready[New LWP 3225]

[New LWP 3226]
2020-11-06 11:02:57.188059 |  3199 | main     | system   | D | thread-pool: 'dbunload' init
2020-11-06 11:02:57.188254 |  3199 | main     | system   | D | thread-pool: 'dbunload' ready
2020-11-06 11:02:57.188260 |  3199 | main     | system   | D | thread-pool: 'dbdownld' init
2020-11-06 11:02:57.188940 |  3199 | main     | system   | D | thread-pool: 'dbdownld' ready
2020-11-06 11:02:57.188943 |  3199 | main     | system   | D | thread-pool: 'callback' init
2020-11-06 11:02:57.189142 |  3199 | main     | system   | D | thread-pool: 'callback' ready
2020-11-06 11:02:57.189144 |  3199 | main     | system   | D | thread-pool: 'dbresign' init
2020-11-06 11:02:57.189319 |  3199 | main     | system   | D | thread-pool: 'dbresign' ready
2020-11-06 11:02:57.189323 |  3199 | main     | system   | D | service: DBsrvice init 1 workers
2020-11-06 11:02:57.189348 |  3199 | main     | system   | D | thread-pool: 'notify-tp' init
2020-11-06 11:02:57.191110 |  3199 | main     | system   | D | thread-pool: 'notify-tp' ready
2020-11-06 11:02:57.191113 |  3199 | main     | system   | D | service: yadifad-notify init 3 workers
2020-11-06 11:02:57.191118 |  3199 | main     | system   | D | signal_handler_init()
2020-11-06 11:02:57.191313 |  3199 | main     | system   | D | signal_handler_init() done
2020-11-06 11:02:57.191729 |  3199 | main     | system   | 1 | config: 'control' setting 'enabled' to 'true'
2020-11-06 11:02:57.191733 |  3199 | main     | system   | 1 | config: 'enabled' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.191949 |  3199 | main     | system   | 1 | config: 'enabled' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.192504 |  3199 | main     | server   | 6 | new: ?@00007FC2FC89E000
2020-11-06 11:02:57.192508 |  3199 | main     | system   | 1 | config: 'zone' setting 'type' to 'master'
2020-11-06 11:02:57.192518 |  3199 | main     | system   | 1 | config: 'zone' setting 'domain' to 'localhost'
2020-11-06 11:02:57.192526 |  3199 | main     | system   | 1 | config: 'zone' setting 'file_name' to 'masters/localhost.zone'
2020-11-06 11:02:57.192529 |  3199 | main     | system   | 1 | config: 'zone' setting 'notify_auto' to '1'
2020-11-06 11:02:57.192546 |  3199 | main     | system   | 1 | config: 'zone' setting 'drop_before_load' to '0'
2020-11-06 11:02:57.192550 |  3199 | main     | system   | 1 | config: 'zone' setting 'no_master_updates' to '0'
2020-11-06 11:02:57.192552 |  3199 | main     | system   | 1 | config: 'zone' setting 'true_multimaster' to '0'
2020-11-06 11:02:57.192555 |  3199 | main     | system   | 1 | config: 'zone' setting 'maintain_dnssec' to '1'
2020-11-06 11:02:57.192559 |  3199 | main     | system   | 1 | config: 'zone' setting 'maintain_zone_before_mount' to '1'
2020-11-06 11:02:57.192567 |  3199 | main     | system   | 1 | config: 'zone' setting 'notify.retry_count' to '5'
2020-11-06 11:02:57.192572 |  3199 | main     | system   | 1 | config: 'zone' setting 'notify.retry_period' to '1'
2020-11-06 11:02:57.192576 |  3199 | main     | system   | 1 | config: 'zone' setting 'notify.retry_period_increase' to '0'
2020-11-06 11:02:57.192579 |  3199 | main     | system   | 1 | config: 'zone' setting 'multimaster_retries' to '0'
2020-11-06 11:02:57.192583 |  3199 | main     | system   | 1 | config: 'zone' setting 'rrsig_nsupdate_allowed' to '0'
2020-11-06 11:02:57.192587 |  3199 | main     | system   | 1 | config: 'zone' setting 'dnssec_mode' to 'off'
2020-11-06 11:02:57.192592 |  3199 | main     | system   | 1 | config: 'zone' setting 'journal_size_kb' to '0'
2020-11-06 11:02:57.192596 |  3199 | main     | system   | 1 | config: 'zone' setting 'dynamic_provisioning' to 'AAA='
2020-11-06 11:02:57.192605 |  3199 | main     | server   | D | config: localhost.: zone section parsed
2020-11-06 11:02:57.192607 |  3199 | main     | server   | D | config: localhost.: sending zone to service
2020-11-06 11:02:57.192609 |  3199 | main     | server   | D | database: localhost.: loading settings
2020-11-06 11:02:57.192611 |  3199 | main     | server   | D | database: localhost.: loading setting with offline database
2020-11-06 11:02:57.192614 |  3199 | main     | server   | 1 | database_load_zone_desc(localhost.@00007FC2FC89E000=1)
2020-11-06 11:02:57.192617 |  3199 | main     | server   | D | zone: localhost. is a new zone
2020-11-06 11:02:57.192621 |  3199 | main     | server   | I | zone: localhost.: 00007FC2FC89E000: config: registered
2020-11-06 11:02:57.192624 |  3199 | main     | server   | D | database: localhost.: enqueue operation DATABASE_SERVICE_ZONE_PROCESSED
2020-11-06 11:02:57.192628 |  3199 | main     | system   | 7 | pool 'async message': alloc 00007FC2FEBC3000
2020-11-06 11:02:57.192633 |  3199 | main     | server   | 1 | database_load_zone_desc(00007FC2FC89E000) done
2020-11-06 11:02:57.193095 |  3199 | main     | system   | 1 | config: 'rrl' setting 'enabled' to 'true'
2020-11-06 11:02:57.193099 |  3199 | main     | system   | 1 | config: 'rrl' setting 'log_only' to 'false'
2020-11-06 11:02:57.193102 |  3199 | main     | system   | 1 | config: 'rrl' setting 'responses_per_second' to '5'
2020-11-06 11:02:57.193105 |  3199 | main     | system   | 1 | config: 'rrl' setting 'errors_per_second' to '5'
2020-11-06 11:02:57.193108 |  3199 | main     | system   | 1 | config: 'rrl' setting 'window' to '15'
2020-11-06 11:02:57.193110 |  3199 | main     | system   | 1 | config: 'rrl' setting 'slip' to '2'
2020-11-06 11:02:57.193113 |  3199 | main     | system   | 1 | config: 'rrl' setting 'min_table_size' to '1024'
2020-11-06 11:02:57.193116 |  3199 | main     | system   | 1 | config: 'rrl' setting 'max_table_size' to '16384'
2020-11-06 11:02:57.193119 |  3199 | main     | system   | 1 | config: 'rrl' setting 'ipv4_prefix_length' to '24'
2020-11-06 11:02:57.193121 |  3199 | main     | system   | 1 | config: 'rrl' setting 'ipv6_prefix_length' to '56'
2020-11-06 11:02:57.193124 |  3199 | main     | system   | 1 | config: 'rrl' setting 'exempted' to 'none'
2020-11-06 11:02:57.193133 |  3199 | main     | system   | 1 | config: 'responses_per_second' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193140 |  3199 | main     | system   | 1 | config: 'errors_per_second' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193143 |  3199 | main     | system   | 1 | config: 'window' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193145 |  3199 | main     | system   | 1 | config: 'slip' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193148 |  3199 | main     | system   | 1 | config: 'max_table_size' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193151 |  3199 | main     | system   | 1 | config: 'min_table_size' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193153 |  3199 | main     | system   | 1 | config: 'ipv4_prefix_length' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193156 |  3199 | main     | system   | 1 | config: 'ipv6_prefix_length' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193158 |  3199 | main     | system   | 1 | config: 'log_only' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193160 |  3199 | main     | system   | 1 | config: 'enabled' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193163 |  3199 | main     | system   | 1 | config: 'exempted' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193361 |  3199 | main     | system   | 1 | config: 'responses_per_second' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193363 |  3199 | main     | system   | 1 | config: 'errors_per_second' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193365 |  3199 | main     | system   | 1 | config: 'window' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193368 |  3199 | main     | system   | 1 | config: 'slip' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193370 |  3199 | main     | system   | 1 | config: 'max_table_size' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193372 |  3199 | main     | system   | 1 | config: 'min_table_size' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193374 |  3199 | main     | system   | 1 | config: 'ipv4_prefix_length' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193377 |  3199 | main     | system   | 1 | config: 'ipv6_prefix_length' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193379 |  3199 | main     | system   | 1 | config: 'log_only' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193382 |  3199 | main     | system   | 1 | config: 'enabled' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193384 |  3199 | main     | system   | 1 | config: 'exempted' has already been set by source 128 (current is 1)
2020-11-06 11:02:57.193649 |  3199 | main     | system   | 1 | config: 'nsid' setting 'ascii' to '16ca3c979440'
2020-11-06 11:02:57.193842 |  3199 | main     | server   | I | database: reconfigure done
2020-11-06 11:02:57.196171 |  3199 | signal   | system   | I | signal: thread started
2020-11-06 11:02:57.200357 |  3199 | main     | system   | I | changing identity to 100:101 (current: 0:0)
2020-11-06 11:02:57.212664 |  3199 | main     | server   | I | loading zones
2020-11-06 11:02:57.212966 |  3199 | main     | database | D | journal: initialising with an MRU of 0 slots
2020-11-06 11:02:57.213003 |  3199 | main     | system   | D | alarm_open(d@tabase.) opened alarm with handle 0
2020-11-06 11:02:57.213236 |  3199 | main     | database | 7 | zdb_zone_create localhost.@00007FC2FB890000
2020-11-06 11:02:57.213268 |  3199 | main     | system   | D | alarm_open(localhost.) opened alarm with handle 1
2020-11-06 11:02:57.213446 |  3199 | main     | system   | D | service: DBsrvice start
[New LWP 3227]
2020-11-06 11:02:57.213758 |  3199 | main     | system   | D | service_start: worker 0 created with id 00007FC2F988FB20
2020-11-06 11:02:57.213772 |  3199 | main     | server   | D | database: localhost.: enqueue operation DATABASE_SERVICE_ZONE_LOAD
2020-11-06 11:02:57.213778 |  3199 | main     | system   | 7 | pool 'async message': alloc 00007FC2FEBC30A0
2020-11-06 11:02:57.213785 |  3199 | main     | server   | I | starting notify service
2020-11-06 11:02:57.213788 |  3199 | main     | system   | D | service: yadifad-notify start
2020-11-06 11:02:57.214954 |  3199 | DBsrvice | system   | D | service: DBsrvice tagged 'DBsrvice' (pid=3199, thread=00007FC2F988FB20)
2020-11-06 11:02:57.214961 |  3199 | DBsrvice | system   | D | service: DBsrvice starting
2020-11-06 11:02:57.214963 |  3199 | DBsrvice | server   | I | database: service starting
2020-11-06 11:02:57.214966 |  3199 | DBsrvice | server   | I | database: service started
2020-11-06 11:02:57.215000 |  3199 | DBsrvice | server   | D | database: localhost.: processing done
2020-11-06 11:02:57.215006 |  3199 | DBsrvice | system   | 7 | pool 'async message': release 00007FC2FEBC3000
2020-11-06 11:02:57.215011 |  3199 | DBsrvice | server   | D | database: localhost.: load, @00007FC2FC89E000
2020-11-06 11:02:57.215204 |  3199 | DBsrvice | server   | D | database: localhost.: processing zone @00007FC2FC89E000 (ZONE-LOAD)
[New LWP 3228]
2020-11-06 11:02:57.215223 |  3199 | DBsrvice | server   | 1 | database_service_zone_load(localhost.@00007FC2FC89E000=2)
2020-11-06 11:02:57.215226 |  3199 | DBsrvice | server   | 1 | database_service_zone_load: locking zone 'localhost.' for loading
2020-11-06 11:02:57.215229 |  3199 | DBsrvice | server   | D | database: localhost.: enqueue operation DATABASE_SERVICE_ZONE_PROCESSED
2020-11-06 11:02:57.215233 |  3199 | DBsrvice | system   | 7 | pool 'async message': alloc 00007FC2FEBC3000
2020-11-06 11:02:57.215289 |  3199 | DBsrvice | server   | 1 | database_service_zone_load: unlocking zone 'localhost.' for loading
2020-11-06 11:02:57.215293 |  3199 | DBsrvice | system   | 7 | pool 'async message': release 00007FC2FEBC30A0
2020-11-06 11:02:57.215297 |  3199 | DBsrvice | server   | D | database: localhost.: processing done
2020-11-06 11:02:57.215299 |  3199 | DBsrvice | system   | 7 | pool 'async message': release 00007FC2FEBC3000
2020-11-06 11:02:57.215323 |  3199 | dbload   | server   | 1 | zone load: 'localhost' zone@00007FC2FB890000 in the database is a placeholder
2020-11-06 11:02:57.215543 |  3199 | dbload   | server   | I | zone load: loading '/var/yadifa/zones/masters/localhost.zone'
2020-11-06 11:02:57.215945 |  3199 | main     | system   | D | service_start: worker 0 created with id 00007FC2F986CB20
2020-11-06 11:02:57.216526 |  3199 | yadifad0 | system   | D | service: yadifad-notify tagged 'yadifad0' (pid=3199, thread=00007FC2F986CB20) (1/3)
2020-11-06 11:02:57.216534 |  3199 | yadifad0 | system   | D | service: yadifad-notify starting (1/3)
2020-11-06 11:02:57.216546 |  3199 | yadifad0 | server   | I | notify: notification service started
2020-11-06 11:02:57.216559 |  3199 | yadifad0 | server   | D | notify: notification service main loop reached
[New LWP 3229]
--Type <RET> for more, q to quit, c to continue without paging--

Thread 6 "dbload" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 3207]
zdb_zone_load_ex (parms=0x7fc2fdb7c990) at src/zdb_zone_load.c:297
297	    resource_record_init(&entry);


edfeu commented Nov 6, 2020 via email


kolbma commented Nov 6, 2020

System is Alpine Linux with musl running in a Docker container...

# uname -a
Linux 16ca3c979440 5.4.72-0-lts #1-Alpine SMP Mon, 19 Oct 2020 06:22:29 UTC x86_64 Linux

The warning comes from gdb, which is not allowed to disable address space randomization because it is running inside the container.

I've patched yadifa with the following to build and link against musl (this seems to be OK with the 2.3 versions):

--- a/lib/dnscore/src/debug.c
+++ b/lib/dnscore/src/debug.c
@@ -56,7 +56,7 @@
 #include "dnscore/dnscore-config.h"
 #include "dnscore/timems.h"
 
-#if defined(__linux__) || defined(__APPLE__)
+#if defined(__GLIBC__) || defined(__APPLE__)
     #include <execinfo.h>
     #if HAS_BFD_DEBUG_SUPPORT
         #include <bfd.h>
@@ -84,7 +84,7 @@
 #undef debug_stat
 #undef debug_mallocated
 
-#if defined(__linux__) || defined(__APPLE__)
+#if defined(__GLIBC__) || defined(__APPLE__)
 #define DNSCORE_DEBUG_STACKTRACE 1
 #else /* __FreeBSD__ or unknown */
 #define DNSCORE_DEBUG_STACKTRACE 0
@@ -603,7 +603,7 @@
 stacktrace
 debug_stacktrace_get()
 {
-#ifdef __linux__
+#ifdef __GLIBC__
     void* buffer_[1024];
 
     int n = backtrace(buffer_, sizeof(buffer_) / sizeof(void*));
@@ -705,7 +705,7 @@
 void
 debug_stacktrace_log(logger_handle* handle, u32 level, stacktrace trace)
 {
-#ifdef __linux__
+#ifdef __GLIBC__
     int n = 0;
 
     if(trace != NULL)
@@ -825,7 +825,7 @@
 void
 debug_stacktrace_try_log(logger_handle* handle, u32 level, stacktrace trace)
 {
-#ifdef __linux__
+#ifdef __GLIBC__
     int n = 0;
 
     if(trace != NULL)
@@ -891,7 +891,7 @@
         return;
     }
 
-#ifdef __linux__
+#ifdef __GLIBC__
     int n = 0;
 
     while(trace[n] != 0)
@@ -983,7 +983,7 @@
 
 /****************************************************************************/
 
-#if defined(__linux__)
+#if defined(__GLIBC__)
 
 bool
 debug_log_stacktrace(logger_handle *handle, u32 level, const char *prefix)
@@ -993,7 +993,7 @@
     char binary[PATH_MAX];
 #endif
 
-#if defined(__linux__)
+#if defined(__GLIBC__)
     
     int n = backtrace(addresses, sizeof(addresses) / sizeof(void*));
     
--- a/lib/dnscore/src/signals.c
+++ b/lib/dnscore/src/signals.c
@@ -55,7 +55,7 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 
-#if defined(__linux__) || defined(__gnu_hurd__)
+#if defined(__GLIBC__) || defined(__gnu_hurd__)
 #define _GNU_SOURCE 1
 #include <execinfo.h>
 #include <sys/mman.h>
@@ -694,7 +694,7 @@
                         log_err(filepath);
                     }
 
-#if defined(__linux__) || defined(__gnu_hurd__)
+#if defined(__GLIBC__) || defined(__gnu_hurd__)
                     void* buffer[MAXTRACE];
                     char** strings;
                     int n = backtrace(buffer, MAXTRACE);
@@ -724,7 +724,7 @@
                         log_err(filepath);
                     }
 
-#if __linux__
+#if __GLIBC__
                     ucontext_t* ucontext = (ucontext_t*)context;
 
                     /*
@@ -902,7 +902,7 @@
                         log_err(filepath);
                     }
                     
-#if __linux__ && (defined(__x86_64__) || defined(__i386__)) && (_BSD_SOURCE || _SVID_SOURCE || _DEFAULT_SOURCE)
+#if __GLIBC__ && (defined(__x86_64__) || defined(__i386__)) && (_BSD_SOURCE || _SVID_SOURCE || _DEFAULT_SOURCE)
                     // dump more information about the memory address of the error
 #define PAGESIZE 4096
 #define LINESIZE 32
--- a/sbin/yadifad/signals.c
+++ b/sbin/yadifad/signals.c
@@ -53,7 +53,7 @@
 #include <sys/stat.h>
 #include <fcntl.h>
 
-#if defined(__linux__) || defined(__gnu_hurd__)
+#if defined(__GLIBC__) || defined(__gnu_hurd__)
 #define _GNU_SOURCE 1
 #include <execinfo.h>
 #include <sys/mman.h>
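
The idea behind the patch can be shown in isolation with a minimal sketch: `backtrace()` from `<execinfo.h>` is provided by glibc (and macOS) but not by musl, so the guard tests for the C library rather than for the kernel. The names `SKETCH_HAS_BACKTRACE` and `sketch_capture_stack` are made up for this illustration and are not part of yadifa. Note that `__GLIBC__` is only defined once a glibc header has been included, which is why the sketch includes `<stdlib.h>` before the test.

```c
#include <stdlib.h>  /* a libc header must come first: glibc defines __GLIBC__ through it */

/* Test for the C library, not the kernel: musl on Linux has no <execinfo.h>. */
#if defined(__GLIBC__) || defined(__APPLE__)
#include <execinfo.h>
#define SKETCH_HAS_BACKTRACE 1
#else
#define SKETCH_HAS_BACKTRACE 0
#endif

/*
 * Hypothetical helper: stores up to max_frames return addresses in frames
 * and returns how many were captured. Without backtrace() support it
 * captures nothing and returns 0.
 */
int sketch_capture_stack(void **frames, int max_frames)
{
#if SKETCH_HAS_BACKTRACE
    return backtrace(frames, max_frames);
#else
    (void)frames;
    (void)max_frames;
    return 0;
#endif
}
```

On musl the helper compiles to a stub, so callers do not need their own `#ifdef` blocks.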


kolbma commented Nov 6, 2020

Here are the config files attached...
It is the complete /etc/yadifa directory, so not all files are used for yadifad. yadifad.conf is the entrypoint for yadifad.
yadifa_conf.tar.gz


kolbma commented Nov 6, 2020

What is the meaning of the different network models (0,1,2)?


edfeu commented Nov 6, 2020 via email


edfeu commented Nov 6, 2020

> What is the meaning of the different network models (0,1,2)?

yadifad has 3 ways to handle messages.

  • Model 0: each thread reads one query, generates its answer and sends it back.
  • Model 1: work is divided between one receiver and one sender, linked by a fast/slow queue. It can be useful on slower systems with lots of memory and, under heavy load, can still give delayed answers within the timeout window. Multiple such receiver/sender thread pairs run at the same time.
  • Model 2: each thread reads multiple messages, generates answers and sends them back in batches.

The reason I'm suggesting network model 0 or 1 is that we have identified an issue in 2.4.0 where a misconfiguration in the build could make yadifad crash when model 2 is used.
The version we are about to commit fixes it.
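
To make the model 0 description concrete, here is a rough sketch of one iteration of such a loop: read one query, generate its answer, send it back. It is not yadifad's code. The names `sketch_make_answer` and `sketch_model0_step` are invented for the example, the "answer" is just the query with the DNS QR (response) bit set, and a connected datagram socket with `recv`/`send` is assumed where a real UDP server would use `recvfrom`/`sendto`.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define SKETCH_MAX_MESSAGE 512

/*
 * Hypothetical stand-in for real answer generation: copies the query and
 * sets the QR bit (top bit of the third header byte) to mark it as a response.
 */
static size_t sketch_make_answer(const unsigned char *query, size_t len,
                                 unsigned char *answer)
{
    memcpy(answer, query, len);
    if (len >= 3)
    {
        answer[2] |= 0x80;
    }
    return len;
}

/*
 * One model-0-style step on a connected datagram socket:
 * read one query, build its answer, send it back.
 * Returns 0 on success, -1 on error.
 */
int sketch_model0_step(int fd)
{
    unsigned char query[SKETCH_MAX_MESSAGE];
    unsigned char answer[SKETCH_MAX_MESSAGE];

    ssize_t n = recv(fd, query, sizeof(query), 0);
    if (n < 0)
    {
        return -1;
    }

    size_t out = sketch_make_answer(query, (size_t)n, answer);

    return (send(fd, answer, out, 0) == (ssize_t)out) ? 0 : -1;
}
```

In this picture each worker thread would call such a step in a loop on its own. Models 1 and 2 differ in that the receive and send halves are split across threads or operate on several messages at a time.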


edfeu commented Nov 6, 2020

> Here are the config files attached...
> It is the complete /etc/yadifa directory, so not all files are used for yadifad. yadifad.conf is the entrypoint for yadifad.
> yadifa_conf.tar.gz

Thank you for the files.
We will install a similar setup and try to reproduce the issue on it.
Some of us are off next week, so the next update on this will come on or after 2020/11/16.


edfeu commented Nov 27, 2020

As suspected, the issue was caused by the very small stack size given to yadifad by musl.
The autotools build now sets it to 8MB for gcc and clang Linux builds (it should be a lazy allocation).
We have also patched the source following the patch you provided.
If you submit it as a pull request, it will be added before we update the source in the GitHub repository.
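
For readers hitting the same problem in their own code, the effect can be sketched with plain pthread attributes. This is only an illustration and not necessarily how the yadifa build applies the 8MB setting. It assumes a worker that keeps a large buffer on its stack, and the names `sketch_worker`, `sketch_default_stack_size` and `sketch_run_worker` are made up. musl's default thread stack is far smaller than the 8MB commonly seen with glibc, so a function with big local variables can fault on entry, before any of its statements run.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_STACK_SIZE (8u * 1024u * 1024u)  /* 8 MiB, the value mentioned above */
#define SKETCH_BIG_LOCAL  (512u * 1024u)        /* assumed size of a large local buffer */

/* Worker with a large local buffer; it needs a stack bigger than the buffer. */
static void *sketch_worker(void *arg)
{
    char buffer[SKETCH_BIG_LOCAL];

    memset(buffer, 0x5a, sizeof(buffer));
    *(int *)arg = (buffer[0] == 0x5a && buffer[sizeof(buffer) - 1] == 0x5a);
    return NULL;
}

/* Returns the default stack size new threads would get, or 0 on error. */
size_t sketch_default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
    {
        return 0;
    }
    if (pthread_attr_getstacksize(&attr, &size) != 0)
    {
        size = 0;
    }
    pthread_attr_destroy(&attr);
    return size;
}

/* Runs the worker on a thread with an explicit stack size; 0 on success. */
int sketch_run_worker(size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t tid;
    int ok = 0;

    if (pthread_attr_init(&attr) != 0)
    {
        return -1;
    }
    if (pthread_attr_setstacksize(&attr, stack_size) != 0 ||
        pthread_create(&tid, &attr, sketch_worker, &ok) != 0)
    {
        pthread_attr_destroy(&attr);
        return -1;
    }
    pthread_attr_destroy(&attr);
    pthread_join(tid, NULL);
    return ok ? 0 : -1;
}
```

Calling `sketch_run_worker(SKETCH_STACK_SIZE)` should succeed regardless of the C library, while `sketch_default_stack_size()` shows what a thread gets when nothing is set explicitly. The stack is normally backed by memory only as pages are touched, which matches the "lazy allocation" remark above.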

yadifa added a commit that referenced this issue Dec 3, 2020
…comment))

adds stack size fix for musl support (the default size is way too small)
adds error reporting in socket_server_opensocket_init
fixes CNAME recursion not returning the same answer as named in NXDOMAIN cases (reported by https://github.com/SivaKesava1, see #11)
modified the keyroll key hash so output would group by flags then algorithm then tag
adds a new yadifa module : zonesign
    zone (re-)signature tool that can replace dnssec-signzone
    designed to work through some limit cases (yakeyrolld)
fixes an issue where a zone signature could incorrectly be detected as already ongoing
fixes an issue that could occur parsing configuration files with optional content
fixes CNAME answers not following the aliases chain (side effect of a previous fix, regression added)
fixes a possible race-condition when initialising the keyroll context error codes
added an internal tool to verify what decided a configuration value (default, command line, ...)
added a NSEC3 record view so they can directly be signed
added stdatomic.h for older compilers (CentOS 7)
zdb_zone_write_text no longer closes the output stream, the responsibility is left to the caller
keyroll context destruction now releases all the memory (needed now that a keyroll can be fully restarted during a run)
yadifa closed this as completed Dec 14, 2020