fix(iroh-net): Work around broken windows DNS configuration #2075

Merged (9 commits) on Mar 13, 2024

@flub flub commented Mar 13, 2024

Description

This actively refuses to use the fec0:0:0:ffff::1, fec0:0:0:ffff::2 and fec0:0:0:ffff::3 DNS servers if the system has them configured.

Windows by default adds three IPv6 site-local anycast addresses to its DNS servers: fec0:0:0:ffff::1, fec0:0:0:ffff::2 and fec0:0:0:ffff::3. Supposedly Microsoft DNS servers listen on those by default. They seem to be present as soon as an IPv6 interface is configured, even if it is only a loopback interface, which is extremely common if not the default.
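To make "site-local" concrete: all three addresses fall inside the fec0::/10 prefix, which RFC 3879 deprecated. A minimal sketch using only the standard library; the constant and helper names are made up for illustration and are not part of iroh-net.

```rust
use std::net::Ipv6Addr;

/// The three addresses named in this description.
/// (Hypothetical constant name, for illustration only.)
const WINDOWS_DEFAULT_DNS: [Ipv6Addr; 3] = [
    Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 1),
    Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 2),
    Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 3),
];

/// Hypothetical helper: true if `addr` lies in the deprecated
/// site-local prefix fec0::/10, i.e. its top 10 bits are 1111 1110 11.
fn is_site_local(addr: &Ipv6Addr) -> bool {
    (addr.segments()[0] & 0xffc0) == 0xfec0
}
```

Calling `is_site_local` on each entry of `WINDOWS_DEFAULT_DNS` returns true, while loopback (`::1`) and link-local (`fe80::/10`) addresses are not matched.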

Our hickory-resolver loads the system configuration, which includes these three IPv6 DNS servers. When it needs to make a DNS query it selects a random nameserver and tries it. If that fails it tries another one. The next query is biased: the resolver remembers which servers to avoid or prefer. So if you get lucky and your first query lands on an actual DNS server, you are good. If you get unlucky, recovering is a bit of a tussle because:

Inside netcheck we do DNS queries with a 1s timeout, because all the probes have a 3s timeout. However, hickory-resolver has a 5s timeout configured, so its queries stay alive longer than ours. This means that if you are unlucky enough to land on one of those bad servers, almost all subsequent DNS queries end up reusing an existing connection to it. The interplay of these timeouts and the connection reuse makes recovering to a good DNS server a rather tough prospect for netcheck. It probably would recover eventually, given enough netcheck runs (which happen at intervals of ~30s).
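The timeout mismatch can be put into numbers. This is a deliberately simplified model, not how hickory-resolver tracks connections internally: the constant and function names are hypothetical, and only the 1s and 5s values come from the description above.

```rust
use std::time::Duration;

// Values taken from the description; the names are illustrative only.
const NETCHECK_DNS_TIMEOUT: Duration = Duration::from_secs(1);
const RESOLVER_TIMEOUT: Duration = Duration::from_secs(5);

/// How long the resolver's query keeps running after the caller gave up.
fn lingering_time(caller: Duration, resolver: Duration) -> Duration {
    resolver.saturating_sub(caller)
}

/// Simplified assumption: a query started `elapsed` ago is still in
/// flight (and so can be picked up for reuse) until the resolver's own
/// timeout has passed.
fn still_in_flight(elapsed: Duration, resolver: Duration) -> bool {
    elapsed < resolver
}
```

Under this model, a query to a bad server lingers for 4s after netcheck has stopped waiting for it, so a follow-up query issued 2s after the first one still finds the old attempt in flight.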

The odds of these nameservers being the sole way of having working DNS are basically zero. The odds of these nameservers breaking the resolver are about 50%. So remove these deprecated things.
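The removal itself boils down to a filter over the configured nameservers. The sketch below works on plain `SocketAddr` values from the standard library rather than the resolver's config types, and `BAD_SERVERS` and `usable_nameservers` are stand-in names, so treat it as an illustration of the idea rather than the code in this PR.

```rust
use std::net::{IpAddr, Ipv6Addr, SocketAddr};

/// Stand-in for the list of Windows default site-local DNS servers.
const BAD_SERVERS: [IpAddr; 3] = [
    IpAddr::V6(Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 1)),
    IpAddr::V6(Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 2)),
    IpAddr::V6(Ipv6Addr::new(0xfec0, 0, 0, 0xffff, 0, 0, 0, 3)),
];

/// Hypothetical helper: keep every configured nameserver whose IP is
/// not one of the known-bad addresses, preserving the original order.
fn usable_nameservers(configured: &[SocketAddr]) -> Vec<SocketAddr> {
    configured
        .iter()
        .copied()
        .filter(|addr| !BAD_SERVERS.contains(&addr.ip()))
        .collect()
}
```

Given a list such as `[fec0:0:0:ffff::1]:53`, `192.168.1.1:53`, `[fec0:0:0:ffff::3]:53`, only `192.168.1.1:53` survives. Matching on the exact addresses rather than the whole fec0::/10 prefix keeps the workaround narrow.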

Notes & open questions

Unfortunately the resolver returned by get_resolver() does not expose an API that would let us test this. A test would basically be the inverse of the logic that removes the bad servers anyway, so it is perhaps not that useful.

Closes #2069
Closes n0-computer/dumbpipe#17

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.

@flub flub changed the title from "Want to see configured name servers" to "fix(iroh-net): Work around broken windows DNS configuration" Mar 13, 2024
@@ -41,7 +41,7 @@ fn get_resolver() -> Result<TokioAsyncResolver> {
         config.add_search(name.clone());
     }
     for nameserver_cfg in system_config.name_servers() {
-        if WINDOWS_BAD_SITE_LOCAL_DNS_SERVERS.contains(&nameserver_cfg.socket_addr.ip()) {
+        if !WINDOWS_BAD_SITE_LOCAL_DNS_SERVERS.contains(&nameserver_cfg.socket_addr.ip()) {
💙

😓 😭 😎

@flub flub marked this pull request as ready for review March 13, 2024 12:22
@dignifiedquire dignifiedquire left a comment

@flub flub added this pull request to the merge queue Mar 13, 2024
Merged via the queue into main with commit 3747a09 Mar 13, 2024
24 checks passed
@dignifiedquire dignifiedquire deleted the flub/windows-dns branch March 13, 2024 13:19
github-merge-queue bot pushed a commit that referenced this pull request Apr 23, 2024
## Description

They have been stable on CI for a while now.  Maybe the DNS config
tweak we did in #2075 made a difference.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

Closes #2086 

## Change checklist

- [x] Self-review.
- ~~[ ] Documentation updates if relevant.~~
- ~~[ ] Tests if relevant.~~
- ~~[ ] All breaking changes documented.~~
Successfully merging this pull request may close these issues.

test_icmp_probe_eu_derper flaky on windows
Dumbpipe Listen not working on windows 11