thread 'trust-dns-server-runtime' has overflowed its stack #980

Closed
Darkspirit opened this issue Jan 4, 2020 · 14 comments · Fixed by #982

Comments

@Darkspirit
Contributor

Darkspirit commented Jan 4, 2020

Describe the bug
Sometimes the server crashes after a few seconds or after a few hours:

thread 'trust-dns-server-runtime' has overflowed its stack
fatal runtime error: stack overflow

To Reproduce
Steps to reproduce the behavior. Sorry, it doesn't seem to be reliably reproducible at the moment. I'll add more info as soon as I have more.

cargo install trust-dns --features dnssec-ring,sqlite -f
named --config /home/trustdns/config.toml --zonedir /home/trustdns/zones

Expected behavior
No stack overflow.

System:

  • OS: Debian
  • Architecture: x86_64
  • Version: Testing
  • rustc version: 1.40.0 (73528e339 2019-12-16)

Version:
Crate: trust-dns
Version: 0.18.0

Additional context

1578131419.246871:INFO:named:350:Trust-DNS 0.18.0 starting
1578131419.247107:INFO:named:355:loading configuration from: "/home/trustdns/config.toml"
[...all zones load correctly...]
1578131419.506362:INFO:named:412:binding UDP to V4(217.197.83.185:53)
1578131419.506495:INFO:named:417:listening for UDP on V4(217.197.83.185:53)
1578131419.506616:INFO:named:412:binding UDP to V6([::1]:53)
1578131419.506706:INFO:named:417:listening for UDP on V6([::1]:53)
1578131419.506803:INFO:named:412:binding UDP to V6([2001:67c:1400:2190::1]:53)
1578131419.506886:INFO:named:417:listening for UDP on V6([2001:67c:1400:2190::1]:53)
1578131419.506977:INFO:named:412:binding UDP to V6([2001:67c:1400:2190::2]:53)
1578131419.507058:INFO:named:417:listening for UDP on V6([2001:67c:1400:2190::2]:53)
1578131419.507147:INFO:named:428:binding TCP to V4(217.197.83.185:53)
1578131419.507238:INFO:named:433:listening for TCP on V4(217.197.83.185:53)
1578131419.507328:INFO:named:428:binding TCP to V6([::1]:53)
1578131419.507408:INFO:named:433:listening for TCP on V6([::1]:53)
1578131419.507496:INFO:named:428:binding TCP to V6([2001:67c:1400:2190::1]:53)
1578131419.507573:INFO:named:433:listening for TCP on V6([2001:67c:1400:2190::1]:53)
1578131419.507684:INFO:named:428:binding TCP to V6([2001:67c:1400:2190::2]:53)
1578131419.507761:INFO:named:433:listening for TCP on V6([2001:67c:1400:2190::2]:53)
1578131419.507862:INFO:named:615:
1578131419.507923:INFO:named:616:    o                      o            o             
1578131419.507984:INFO:named:617:    |                      |            |             
1578131419.508052:INFO:named:618:  --O--  o-o  o  o  o-o  --O--  o-o   o-O  o-o   o-o  
1578131419.508113:INFO:named:619:    |    |    |  |   \     |         |  |  |  |   \   
1578131419.508174:INFO:named:620:    o    o    o--o  o-o    o          o-o  o  o  o-o  
1578131419.508235:INFO:named:621:
1578131419.508295:INFO:named:476:awaiting connections...
1578131419.508362:INFO:named:481:Server starting up

1578131426.342032:INFO:trust_dns_server::server::server_future:583:request: 65314 type: Query op_code: Query dnssec: true name: ns2.darkspirit.eu. type: A class: IN
1578131426.342103:INFO:trust_dns_server::authority::catalog:515:request: 65314 found authority: darkspirit.eu.
1578131426.342121:INFO:trust_dns_server::authority::catalog:544:request: 65314 supported_algs: 
1578131426.342206:INFO:trust_dns_server::server::response_handler:48:response: 65314 response_code: 0
1578131426.342326:INFO:trust_dns_server::server::server_future:583:request: 36776 type: Query op_code: Query dnssec: true name: ns2.darkspirit.eu. type: A class: IN
1578131426.342344:INFO:trust_dns_server::authority::catalog:515:request: 36776 found authority: darkspirit.eu.
1578131426.342356:INFO:trust_dns_server::authority::catalog:544:request: 36776 supported_algs: 
1578131426.342393:INFO:trust_dns_server::server::response_handler:48:response: 36776 response_code: 0
1578131426.370136:INFO:trust_dns_server::server::server_future:583:request: 34702 type: Query op_code: Query dnssec: true name: ns2.darkspirit.eu. type: AAAA class: IN
1578131426.370192:INFO:trust_dns_server::authority::catalog:515:request: 34702 found authority: darkspirit.eu.
1578131426.370209:INFO:trust_dns_server::authority::catalog:544:request: 34702 supported_algs: 
1578131426.370285:INFO:trust_dns_server::server::response_handler:48:response: 34702 response_code: 0
1578131426.400105:INFO:trust_dns_server::server::server_future:583:request: 599 type: Query op_code: Query dnssec: true name: ns1.darkspirit.eu. type: A class: IN
1578131426.400162:INFO:trust_dns_server::authority::catalog:515:request: 599 found authority: darkspirit.eu.
1578131426.400176:INFO:trust_dns_server::authority::catalog:544:request: 599 supported_algs: 
1578131426.400254:INFO:trust_dns_server::server::response_handler:48:response: 599 response_code: 0
1578131426.419572:INFO:trust_dns_server::server::server_future:583:request: 11325 type: Query op_code: Query dnssec: true name: ns1.darkspirit.eu. type: A class: IN
1578131426.419624:INFO:trust_dns_server::authority::catalog:515:request: 11325 found authority: darkspirit.eu.
1578131426.419640:INFO:trust_dns_server::authority::catalog:544:request: 11325 supported_algs: 
1578131426.419695:INFO:trust_dns_server::server::response_handler:48:response: 11325 response_code: 0
1578131426.438055:INFO:trust_dns_server::server::server_future:583:request: 22464 type: Query op_code: Query dnssec: true name: ns1.darkspirit.eu. type: AAAA class: IN
1578131426.438189:INFO:trust_dns_server::authority::catalog:515:request: 22464 found authority: darkspirit.eu.
1578131426.438219:INFO:trust_dns_server::authority::catalog:544:request: 22464 supported_algs: 
1578131426.438285:INFO:trust_dns_server::server::response_handler:48:response: 22464 response_code: 0
1578131439.194109:INFO:trust_dns_server::server::server_future:583:request: 7434 type: Query op_code: Query dnssec: false name: ns1.terrax.net. type: AAAA class: IN
1578131439.194177:INFO:trust_dns_server::authority::catalog:515:request: 7434 found authority: terrax.net.
1578131439.194249:INFO:trust_dns_server::server::response_handler:48:response: 7434 response_code: 0
1578131439.325017:INFO:trust_dns_server::server::server_future:583:request: 19266 type: Query op_code: Query dnssec: false name: ns1.terrax.net. type: MX class: IN
1578131439.325085:INFO:trust_dns_server::authority::catalog:515:request: 19266 found authority: terrax.net.

thread 'trust-dns-server-runtime' has overflowed its stack
fatal runtime error: stack overflow

@bluejekyll
Member

Do you think you could increase the log level and perhaps set the RUST_BACKTRACE=full environment variable?

That might help us see where the program is dying.

@Darkspirit
Contributor Author

Darkspirit commented Jan 4, 2020

With

cargo install trust-dns --features dnssec-ring,sqlite -f --debug
RUST_BACKTRACE=full RUST_LOG=debug named --config /home/trustdns/config.toml --zonedir /home/trustdns/zones --debug

I don't get any backtrace or further error log. There was just a request, then it died.
But it seems I have found steps to reproduce and could narrow it down to the regressing commit; it will take some time.

Even 0.16 dies with:

1578162425.750836:DEBUG:trust_dns_server::store::in_memory::authority:958:searching InMemoryAuthority for: name: rustls.com. type: MX class: IN

thread 'tokio-runtime-worker-0' has overflowed its stack
fatal runtime error: stack overflow

@Darkspirit
Contributor Author

Darkspirit commented Jan 4, 2020

searching InMemoryAuthority is the last action before the crash.

1578161576.025669:DEBUG:trust_dns_server::authority::catalog:132:query received: 58954
1578161576.025741:DEBUG:trust_dns_server::authority::catalog:442:searching authorities for: rustls.com.
1578161576.025903:INFO:trust_dns_server::authority::catalog:515:request: 58954 found authority: rustls.com.
1578161576.026015:DEBUG:trust_dns_server::authority::catalog:535:no DAU in request, used default SupportAlgorithms
1578161576.026074:INFO:trust_dns_server::authority::catalog:544:request: 58954 supported_algs:
1578161576.026136:DEBUG:trust_dns_server::authority::catalog:551:performing name: rustls.com. type: MX class: IN on rustls.com.
1578161576.026244:DEBUG:trust_dns_server::store::in_memory::authority:958:searching InMemoryAuthority for: name: rustls.com. type: MX class: IN

thread 'trust-dns-server-runtime' has overflowed its stack
fatal runtime error: stack overflow

@bluejekyll
Member

Ok, that's feasible. I think that function is recursive, so it might be doing something incorrectly, and it's thus very possible for it to overflow the stack.

@Darkspirit
Contributor Author

This domain usually just redirects to the GitHub project. I registered it last month before some idiot could take it away from us all.

[[zones]]
zone = "rustls.com"
zone_type = "Master"
file = "rustls.com"
enable_dnssec = true
stores = { type = "sqlite", zone_file_path = "rustls.com", journal_file_path = "rustls.com.jrnl", allow_update = true }
keys = [{key_path="keys/rustls.com.pk8", algorithm="ECDSAP384SHA384", is_zone_signing_key=true}, {key_path="auth.pk8", algorithm="ED25519", is_zone_update_auth=true}]

@ 86400 IN SOA ns1.darkspirit.eu. hostmaster.terrax.net. (
 201903313       ; Serial
 3600            ; Refresh
 600             ; Retry
 86400           ; Expire
 600)            ; Negative TTL
@ 600 IN NS ns1.darkspirit.eu.
@ 600 IN NS ns2.darkspirit.eu.
@ 600 IN MX 0 .
@ 600 IN TXT "v=spf1 mx -all"
@ 600 IN CAA 0 issue "letsencrypt.org; validationmethods=dns-01"
@ 600 IN CAA 0 iodef "mailto:caa@terrax.net"
@ 600 IN AAAA 2001:67c:1400:2190::2
@ 600 IN A 217.197.83.185
www 600 IN AAAA 2001:67c:1400:2190::2
www 600 IN A 217.197.83.185
www 86400 IN MX 0 .
_dmarc 600 IN TXT "v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s; rua=mailto:postmaster@terrax.net; ruf=mailto:postmaster@terrax.net; rf=afrf; pct=100; ri=86400"
terrax._domainkey 600 IN CNAME terrax._domainkey.terrax.net.
_443._tcp 60 IN TLSA 3 1 1 C900788909DBCDDE7DC3752A10AC7EF485B8C7B98610E1FEDDC08F64A2179A2C
_443._tcp.www 60 IN TLSA 3 1 1 C900788909DBCDDE7DC3752A10AC7EF485B8C7B98610E1FEDDC08F64A2179A2C

@Darkspirit
Contributor Author

RUST_BACKTRACE=full RUST_LOG=debug gdb --args named --config /home/trustdns/config.toml --zonedir /home/trustdns/zones --debug

gdb.txt
(Please instruct me what to do, sorry.)

@bluejekyll
Member

Ok, this is strange. It looks like a wildcard lookup is being triggered, but there is no wildcard in your domain.

@Darkspirit
Contributor Author

I've mailed you the full log, including the startup.

@bluejekyll
Member

bluejekyll commented Jan 4, 2020

I'm looking at this, and we have decent test coverage here. I'm hoping to reproduce it with the authority_battery/basic.rs test case, though it doesn't look simple for some reason.

@bluejekyll
Member

Ok, I've reproduced it. It looks like it's the @ 600 IN MX 0 . record.

Can you explain what the intention of that is?

@Darkspirit
Contributor Author

https://www.hardenize.com/report/rustls.com#email

This host doesn't specify any MX servers. According to the SMTP specification, in that case it should be assumed that the host itself is willing to receive email. We have checked and that's not the case. This host should probably deploy a NULL MX (RFC 7505) to indicate that email is not wanted, but in practice it doesn't matter a great deal.
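For reference, RFC 7505 defines a null MX as a single MX record with preference 0 and the root name "." as its exchange, which is what the @ 600 IN MX 0 . line in the zone file expresses. A minimal sketch of recognizing such a record follows. The Mx struct and both function names are hypothetical and only for illustration; they are not the trust-dns types or API.

```rust
/// Hypothetical, simplified MX record model (not the trust-dns type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Mx {
    preference: u16,
    /// Exchange in presentation format, e.g. "mail.example.com." or ".".
    exchange: String,
}

/// RFC 7505: a null MX has preference 0 and the root name "." as exchange.
fn is_null_mx(mx: &Mx) -> bool {
    mx.preference == 0 && mx.exchange == "."
}

/// Parse the RDATA part of a zone-file MX line, such as "0 ." or
/// "10 mail.example.com.". Returns None on malformed input.
fn parse_mx_rdata(rdata: &str) -> Option<Mx> {
    let mut parts = rdata.split_whitespace();
    let preference = parts.next()?.parse::<u16>().ok()?;
    let exchange = parts.next()?.to_string();
    // Reject trailing fields; MX RDATA has exactly two.
    if parts.next().is_some() {
        return None;
    }
    Some(Mx { preference, exchange })
}
```

With this sketch, parse_mx_rdata("0 .") yields a record for which is_null_mx returns true, while an ordinary record such as "10 mail.example.com." does not qualify.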

@Darkspirit
Contributor Author

Darkspirit commented Jan 4, 2020

darkspirit.eu has had such a record for months, but it seems some mail servers are interested in sending spam to rustls.com, so its MX record was actually requested by someone.

@bluejekyll
Member

Ok, it looks like we aren't properly processing the NULL MX record, with . as the target, in the zone file. So this support will need to be added.

@bluejekyll
Member

Ok, it turns out that with MX records, when we're looking up additional records for the response, we ended up continuing to search forever: a bad base case in the recursion.
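To illustrate the kind of failure described here, the following is a simplified sketch of a recursive wildcard search that walks from a name up toward the root. It is not the trust-dns code and not the actual fix: the functions parent and find_wildcard, and the model of a zone as a set of owner names, are assumptions made for the example, and real wildcard matching (RFC 4592) is more involved. The point is only that the root "." is its own parent, so a search for the MX target "." recurses without end unless the root is treated as a base case.

```rust
use std::collections::HashSet;

/// Parent of a fully qualified name in presentation format.
/// The root "." is treated as its own parent.
fn parent(name: &str) -> &str {
    if name == "." {
        return ".";
    }
    match name.find('.') {
        // Drop the leftmost label: "www.rustls.com." -> "rustls.com."
        Some(i) if i + 1 < name.len() => &name[i + 1..],
        // A single label such as "com." has the root as its parent.
        _ => ".",
    }
}

/// Hypothetical recursive search for a wildcard owner name covering `name`.
/// `zone` is modeled as a plain set of owner names.
fn find_wildcard(zone: &HashSet<String>, name: &str) -> Option<String> {
    // Base case: nothing lies above the root. Without this check a lookup
    // for "." would call itself with "." again until the stack overflows.
    if name == "." {
        return None;
    }
    let up = parent(name);
    let candidate = if up == "." {
        "*.".to_string()
    } else {
        format!("*.{}", up)
    };
    if zone.contains(&candidate) {
        return Some(candidate);
    }
    find_wildcard(zone, up)
}
```

In this sketch, a zone without wildcards returns None for both "rustls.com." and ".", and the recursion depth is bounded by the number of labels in the name.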
