FTL Crash issues? Read this thread first! #705
Just had my crash at Working to get GDB but nothing is cooperating right now |
GNU gdb (Raspbian 8.2.1-2) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 666
[New LWP 667]
[New LWP 668]
[New LWP 669]
[New LWP 670]
[New LWP 671]
[New LWP 672]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
__GI___poll (timeout=-1, nfds=6, fds=0xb1fab0) at ../sysdeps/unix/sysv/linux/poll.c:29
29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) handle SIGHUP nostop SIGPIPE nostop
Signal Stop Print Pass to program Description
SIGHUP No Yes Yes Hangup
SIGPIPE No Yes Yes Broken pipe
(gdb) continue
Continuing.
Thread 1 "pihole-FTL" received signal SIGSEGV, Segmentation fault.
0x004ab798 in forward_query (udpfd=4, udpaddr=0x2, udpaddr@entry=0xbedf99d8, dst_addr=0xbedf99d8, dst_addr@entry=0xbedf9a08, dst_iface=3202324856, dst_iface@entry=2, header=<optimized out>, header@entry=0xb1be48, plen=47,
plen@entry=30, now=<optimized out>, now@entry=4096, forward=0x1b854d0, ad_reqd=ad_reqd@entry=0, do_bit=0, do_bit@entry=11648584) at dnsmasq/forward.c:313
313 dnsmasq/forward.c: No such file or directory.
(gdb) backtrace
#0 0x004ab798 in forward_query (udpfd=4, udpaddr=0x2, udpaddr@entry=0xbedf99d8, dst_addr=0xbedf99d8, dst_addr@entry=0xbedf9a08, dst_iface=3202324856, dst_iface@entry=2, header=<optimized out>, header@entry=0xb1be48, plen=47,
plen@entry=30, now=<optimized out>, now@entry=4096, forward=0x1b854d0, ad_reqd=ad_reqd@entry=0, do_bit=0, do_bit@entry=11648584) at dnsmasq/forward.c:313
#1 0x004ac4ce in receive_query (listen=listen@entry=0xb532c0, now=4096, now@entry=1583279886) at dnsmasq/forward.c:1641
#2 0x004b9ed6 in check_dns_listeners (now=now@entry=1583279886) at dnsmasq/dnsmasq.c:1657
#3 0x004bb13c in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at dnsmasq/dnsmasq.c:1108
#4 0x00491e18 in main (argc=<optimized out>, argv=<optimized out>) at main.c:71
(gdb) continue
Continuing.
Thread 1 "pihole-FTL" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) backtrace
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0xb6daf230 in __GI_abort () at abort.c:79
#2 0x004932ba in SIGSEGV_handler (sig=<optimized out>, si=<optimized out>, unused=<optimized out>) at signals.c:66
#3 <signal handler called>
#4 0x004ab798 in forward_query (udpfd=4, udpaddr=0x2, udpaddr@entry=0xbedf99d8, dst_addr=0xbedf99d8, dst_addr@entry=0xbedf9a08, dst_iface=3202324856, dst_iface@entry=2, header=<optimized out>, header@entry=0xb1be48, plen=47,
plen@entry=30, now=<optimized out>, now@entry=4096, forward=0x1b854d0, ad_reqd=ad_reqd@entry=0, do_bit=0, do_bit@entry=11648584) at dnsmasq/forward.c:313
#5 0x004ac4ce in receive_query (listen=listen@entry=0xb532c0, now=4096, now@entry=1583279886) at dnsmasq/forward.c:1641
#6 0x004b9ed6 in check_dns_listeners (now=now@entry=1583279886) at dnsmasq/dnsmasq.c:1657
#7 0x004bb13c in main_dnsmasq (argc=<optimized out>, argv=<optimized out>) at dnsmasq/dnsmasq.c:1108
#8 0x00491e18 in main (argc=<optimized out>, argv=<optimized out>) at main.c:71 |
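When several people post backtraces like the one above, it helps to reduce each one to its function, file and line so they can be compared quickly. The following is a minimal sketch, not part of Pi-hole or GDB: the helper name `parse_frames` and the regular expression are assumptions made for illustration, and they only cover the frame shapes seen in this log.

```python
import re

# Matches gdb frame lines such as
#   "#1 0x004ac4ce in receive_query (...) at dnsmasq/forward.c:1641"
#   "#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50"
# Frames without a source location (e.g. "<signal handler called>") do not match.
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+"
    r"(?:0x[0-9a-fA-F]+\s+in\s+)?"      # optional program counter
    r"(?P<func>[A-Za-z_]\w*)\s*\("      # function name and opening parenthesis
    r".*\)\s+at\s+(?P<file>\S+):(?P<line>\d+)"
)


def parse_frames(backtrace):
    """Return (index, function, file, line) tuples from gdb `backtrace` text."""
    # gdb wraps long argument lists, so glue continuation lines
    # (those not starting with '#') back onto the frame they belong to.
    joined = []
    for raw in backtrace.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#") or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line

    frames = []
    for line in joined:
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (int(match["index"]), match["func"], match["file"], int(match["line"]))
            )
    return frames
```

Feeding it the second backtrace from this log would list `forward_query` at `dnsmasq/forward.c:313` as the first frame below the signal handler, which is the frame everyone in the thread should be comparing.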
That's the date the binary was compiled. |
I also have the same issue.
|
via https://docs.pi-hole.net/ftldns/debugging/
|
I was facing this as well this morning, and disabling DNSSEC in the options appears to get FTL back to stable running (for the last 30 minutes at least). |
FTL also started crashing here an hour or two ago, after doing no changes to the setup in months. |
Yep, I can confirm it must be something with DNSSEC. |
Disabling DNSSEC also fixed the constant crashing |
Is anyone using CloudFlare as the upstream service? And is DNSSEC enabled or disabled? |
Ah ha! Yes, I am using Cloudflare as up stream w/ DNSSEC |
Yep, using Cloudflare through stubby because of DoT, with DNSSEC enabled. |
OP on reddit mentions Cloudflare: |
I'm using stubby to forward queries to Cloudflare using DNS over TLS w/ DNSSEC |
Exact same issue for me - using cloudflared. Disabling DNSSEC has me back up for now. |
I will try changing the server in stubby and report back whether it runs with another provider :D |
CloudFlare + DNSSEC via |
With the Google DNS servers it runs fine. |
Created a CloudFlare Support ticket: 1842464 (not sure if that's publicly accessible, but feel free to reference it) |
CloudFlare Community thread created. |
maybe they're trying to exploit some vulnerabilities? 😄 |
Meanwhile https://docs.pi-hole.net/guides/unbound/ is something to consider. Unbound as a local upstream and no longer depending on a service for upstreams. |
This issue has been mentioned on Pi-hole Userspace. There might be relevant details there: https://discourse.pi-hole.net/t/lost-connection-to-api/29062/2 |
Which other DNS services besides CloudFlare support DNSCrypt and DNSSEC?
I'm having the issue using dnscrypt-proxy, which I believe fulfills the same local upstream DNS functionality as unbound. |
Op Here from https://www.reddit.com/r/pihole/comments/fd44dc/ftl_crashing/ |
@jdrch No. dnscrypt-proxy is a forwarding DNS resolver: it just forwards the request to 1.1.1.1. Meanwhile, unbound is a recursive DNS server: it provides the same service as 1.1.1.1 itself, meaning that it contacts the root DNS nameservers for the TLD nameserver, which it asks for the domain nameserver, and so on until it gets the actual IP. DNS over TLS and DNS over HTTPS are not used between authoritative and recursive DNS server communication, only DNSSEC is used (if available), so you get the same result as contacting 1.1.1.1 encryption-wise. In fact, the RFC for DoT only mentions authoritative twice. |
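To make the forwarding versus recursive distinction above concrete, here is a toy sketch. It uses no real DNS traffic: the server names, the delegation table and the address `192.0.2.10` (a documentation-only address) are all made up for illustration, and the function names are hypothetical.

```python
# Toy delegation data: each "server" either knows the answer or refers the
# client to a server responsible for a more specific zone. All names and
# addresses here are invented for this sketch.
SERVERS = {
    "root-server": {
        "delegations": {"net.": "net-tld-server"},
        "records": {},
    },
    "net-tld-server": {
        "delegations": {"pi-hole.net.": "pi-hole-auth-server"},
        "records": {},
    },
    "pi-hole-auth-server": {
        "delegations": {},
        "records": {"docs.pi-hole.net.": "192.0.2.10"},
    },
}


def resolve_iteratively(name, start="root-server", max_hops=10):
    """Walk root -> TLD -> authoritative server, as a recursive resolver does."""
    server = start
    path = [server]
    for _ in range(max_hops):
        data = SERVERS[server]
        if name in data["records"]:
            return data["records"][name], path
        # Follow the most specific delegation that covers the queried name.
        zones = [
            zone for zone in data["delegations"]
            if name == zone or name.endswith("." + zone)
        ]
        if not zones:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = data["delegations"][max(zones, key=len)]
        path.append(server)
    raise LookupError(f"gave up on {name} after {max_hops} referrals")


def forward_to_upstream(name):
    """A forwarding resolver asks one upstream and relays whatever comes back."""
    address, _path = resolve_iteratively(name)  # the upstream does the walking
    return address
```

The forwarder makes a single request and depends entirely on its upstream, while the recursive resolver contacts each level of the hierarchy itself. That is also why the recursive setup can be slower on a cold cache.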
@sylveon TIL, thanks. The performance hit does seem to be a significant drawback, though, which is bad for gaming ping times, among other things. |
@madpsy thanks, that's super helpful! I'll see if I can reproduce and find a workaround until the fixed dnsmasq package is released. |
@madpsy any chance you could try now and see if you still experience crashes? |
@vavrusa Looking good! |
@DL6ER I just reviewed this issue after getting tagged by you, and checked out my setup again. Shortly after I closed #645, I actually went back to my entire original config that was causing the crashes - DNSSEC, dnscrypt-proxy, using Cloudflare alternating with ventricle.us and even doh-crypto-sx (the originally problematic server) on occasion, and have had absolutely no problems over the last couple of days. The one thing that's different is that I switched to a new ISP, which doesn't have IPv6 - so I removed
So perhaps the error is now being triggered by Cloudflare in some different way (IPv6?), that isn't affecting me. Anyway, hope the work you did way back when is paying off for people now. |
The common thing between |
Also, not only do they support padding, but they also return padded responses for queries over DoH even if the query wasn't padded (some people may qualify that behavior as "not right", but it can only improve security). |
@jedisct1 I assumed it was padding as well, as that's the only difference, but couldn't reproduce it. It looks like dnsmasq crashes upon receiving a REFUSED response (I've managed to reproduce that). I did some digging, and some portion of the frequent DNSKEY queries could have been throttled as abuse traffic in some PoPs for the last few days, particularly if coming from shared prefixes. I've added an exception, so this shouldn't be happening anymore; it'd be great if more people could confirm. |
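For readers unfamiliar with REFUSED: it is response code 5 in the low four bits of the flags word of the 12-byte DNS header. The sketch below reads that code from a raw reply and chooses an action before any records are parsed. It is not dnsmasq's code and makes no claim about where dnsmasq went wrong; `classify_reply`, `next_action` and the retry policy are assumptions for illustration only.

```python
import struct

# Response codes from the DNS header (low 4 bits of the flags word).
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}

QR_BIT = 0x8000      # set in responses, clear in queries
RCODE_MASK = 0x000F


def classify_reply(packet):
    """Return the response code name of a raw DNS message, or a failure label."""
    if len(packet) < 12:
        return "MALFORMED"  # too short to contain a full header
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    _msg_id, flags, _qd, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if not flags & QR_BIT:
        return "NOT_A_RESPONSE"
    rcode = flags & RCODE_MASK
    return RCODE_NAMES.get(rcode, f"RCODE{rcode}")


def next_action(packet):
    """Hypothetical policy: never touch record data of a failed reply."""
    status = classify_reply(packet)
    if status in ("REFUSED", "SERVFAIL"):
        return "try-another-upstream"
    if status in ("MALFORMED", "NOT_A_RESPONSE"):
        return "drop"
    return "process"


# A header-only reply with QR set and RCODE 5, as a throttling upstream might send.
refused = struct.pack("!6H", 0x1234, 0x8005, 1, 0, 0, 0)
print(classify_reply(refused), next_action(refused))
```

The point of the design is only that a reply's status gets inspected before anything assumes it carries the DNSKEY or DS records that were asked for.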
@vavrusa Sorry for the late reply, my box that was running 4 needed to be rebuilt. It's up and looks stable so far. |
@vavrusa Has been stable for me since you made the change at cloudflare's side too. Thanks. |
I can confirm it also. I re-enabled DNSSEC about 4 hours ago and I didn't have any issues since. |
FTL v4.3.1 + @vavrusa I re-enabled DNSSEC in Pi-hole this morning and it ran just fine for ~2h before I left for work. I'll consider the bug fixed if it lasts 24h without a crash (though I don't see why it shouldn't.) Thanks so much! |
Re-enabled DNSSEC about 2 hours ago. My 21 clients are happy again. Thank you all. |
Seems to be working again. Thanks! |
Everything's been working now for over 24 hours, so I'd say this issue is resolved. |
Is this back on the main 5.0 branch or the feature branch noted above? |
See my comment before the one you're replying to :)
|
Thanks for all your input. This bug seems to have been fixed in two independent ways:
We'll absorb our fix in the main code as soon as |
This issue has been mentioned on Pi-hole Userspace. There might be relevant details there: |
Still occurring. Just had 2 pi-holes crash within 15 mins of each other. v5. Where exactly is the fix? |
@biship Still working on v5 over here with dnscrypt-proxy and CloudFlare upstream. What's your upstream? FWIW the fix was made on CloudFlare's side over 2 months ago. |
1.1.1.2 |
I'm not using dnscrypt-proxy, whatever that is. |
OK so we both use CloudFlare. Still up and running here ... |
it's happened twice in the last week. |
I just experienced this issue when enabling DNSSEC on an otherwise working setup. After some brief debugging, it seems that FTL does not gracefully transition between certain upstream DNS configurations. Leaving DNSSEC enabled and restarting my Docker container seems to have resolved it. I also tested switching between some other combinations of DNS configurations, and it really struggles. It seems like FTL just shuts down and doesn't try to come back up... [2022-03-23 00:23:33.042 871M] Shutting down... And nothing thereafter, but saving again (even with the same configuration with which it just failed to start) will start FTL again. |
In raising this issue, I confirm the following (please check boxes, eg [X]) Failure to fill the template will close your issue:
How familiar are you with the codebase?:
1
[BUG | ISSUE] Expected Behaviour:
[BUG | ISSUE] Actual Behaviour:
Pi-hole on my 2 completely different systems crashed at nearly the same time.
[BUG | ISSUE] Steps to reproduce:
Log file output [if available]
RPi4 Log:
PC Log:
Device specifics
Hardware Type: RPi4 4GB and a PC
OS: newest Raspbian on RPi4 and Ubuntu Server on PC
This template was created based on the work of udemy-dl.