security/acme-client: HTTP verification method fails with domain names resolving to IPv6 addresses #1967

Closed
mbunkus opened this issue Aug 9, 2020 · 11 comments
Labels
bug Production bug help wanted Contributor missing

Comments

@mbunkus
Contributor

mbunkus commented Aug 9, 2020

[✓] I have read the contributing guidelines at https://github.com/opnsense/plugins/blob/master/CONTRIBUTING.md
[✓] I have searched the existing issues and I'm convinced that mine is new.
[✓] The title contains the plugin to which this issue belongs

Using the Let's Encrypt (os-acme-client) plugin for domain names that resolve to IPv6 addresses fails because the IPv6 NAT rules do not work. Domain names that resolve only to IPv6 addresses always fail. Names that resolve to both IPv4 and IPv6 addresses fail often, but not always.

The IPv6 NAT rule that is created seems to violate the IPv6 scoping rules (see below). The IPv6 packets are therefore not actually redirected. Instead they're handled the way packets to port 80 are normally handled, which depends on the global "HTTP Redirect" administration setting: if the redirect is enabled, the validation request will be redirected to the HTTPS port, and the haproxy there will happily reply with "not found". If the redirect is disabled, the verification request simply times out.

For domain names that resolve only to IPv6 addresses, this is the end of the line & the process fails there every time. For domain names that resolve to both IPv4 and IPv6 addresses, the Let's Encrypt servers sometimes try to connect via IPv4, too. The NAT rules for IPv4 do work, so for those domain names verification will succeed, but only when it's done via IPv4.

What do I mean by "violate scoping rules"? I'm not a FreeBSD expert. Here's a forum post that talks about redirecting IPv6 to ::1 wrt. transparent Squids, but the Let's Encrypt client uses the same method. That forum post links to this FreeBSD bug report. The important section is:

Your PF rule redirects a packet to ::1, but doesn't change the receiving interface. Thus, it violates scoping rules. You can tell by running 'netstat -s -f inet6 | grep "violated scope"' before and after generating the traffic that you want to redirect. The check is in in6_setscope().

And yes, that seems to be the case here. See below in the "to reproduce" section for proof.

To Reproduce

Steps to reproduce the behavior:

  1. Install OPNsense & assign an IPv6 address to one of its interfaces. Make sure the IPv6 address is actually reachable from outside.
  2. Create a public DNS entry pointing to that IPv6 address only (no IPv4 addresses, please).
  3. Install the Let's Encrypt plugin and configure it:
     1. Add an account.
     2. Add a verification method of challenge type HTTP-01 with method "OPNsense web service". Either enable "IP auto discovery" for the interface having said IPv6 address, or disable "IP auto discovery" and add the IPv6 address manually.
     3. Configure a certificate for the domain created in 2. with the account created in 3.1. and the verification method created in 3.2.
  4. Before starting the issuing/renewal process, log in via ssh and note the current number of IPv6 packets that violated the scope: netstat -s -f inet6 | grep "violated scope"
  5. Trigger certificate issuing/renewal for the certificate created in 3.3.
  6. Observe /var/log/acme.sh.log, which will show, among other things, a failure to verify the domain. The exact failure depends on the global "HTTP redirect" administrative setting as explained above. If the redirect is off, the actual error will be something like this: [Sun Aug 9 20:36:56 CEST 2020] opnsense-test.bunkus.org:Verify error:Fetching http://opnsense-test.bunkus.org/.well-known/acme-challenge/R1LLOfl3EGBAwaBn5vtA3iM1zb9a53gukxTovHFe4uw: Timeout during connect (likely firewall problem)
  7. Look at the number of IPv6 packets with scope violations again: netstat -s -f inet6 | grep "violated scope". That number will be higher than before the issuing/renewal attempt.
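The before/after counter comparison from the steps above can be automated. The following is only a rough sketch: it parses captured `netstat -s -f inet6` output passed in as text, and the exact wording of the counter line ("N packets that violated scope rules") is an assumption about the FreeBSD output format. The function names and the sample snapshots are made up for illustration.

```python
import re

# Assumed shape of the relevant line in `netstat -s -f inet6` output;
# adjust the pattern if your system words the counter differently.
_SCOPE_RE = re.compile(
    r"^\s*(\d+)\s+packets?\s+that\s+violated\s+scope", re.MULTILINE
)


def violated_scope_count(netstat_output: str) -> int:
    """Return the 'violated scope' packet counter found in netstat output text."""
    match = _SCOPE_RE.search(netstat_output)
    if match is None:
        raise ValueError("no 'violated scope' line found in netstat output")
    return int(match.group(1))


def scope_violations_during(before: str, after: str) -> int:
    """Number of scope violations recorded between two netstat snapshots."""
    return violated_scope_count(after) - violated_scope_count(before)


if __name__ == "__main__":
    # Hypothetical snapshots, not real measurements.
    before = "ip6:\n\t12 packets that violated scope rules\n"
    after = "ip6:\n\t15 packets that violated scope rules\n"
    print(scope_violations_during(before, after))
```

A result greater than zero after a failed issuing/renewal attempt would be consistent with the redirect to ::1 being dropped for violating scope rules, although other IPv6 traffic on the box can increase the counter as well.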

I can provide any kind of log file required.

Environment

OPNsense 20.7-amd64 OpenSSL
os-acme-client 1.34

fraenki self-assigned this on Aug 10, 2020
fraenki added the "bug" (Production bug) and "help wanted" (Contributor missing) labels on Nov 19, 2020
@fraenki
Member

fraenki commented Nov 19, 2020

@mbunkus thanks for your detailed analysis and bug description. Currently os-acme-client will always bind to localhost/loopback IPs:

# bind to port
server.bind = "127.0.0.1"
server.port = {{OPNsense.AcmeClient.settings.challengePort}}
$SERVER["socket"] == "127.0.0.1:{{OPNsense.AcmeClient.settings.challengePort}}" { }
{% if helpers.exists('system.ipv6allow') and system.ipv6allow|default("0") == "1" %}
# IPv6
$SERVER["socket"] == "[::1]:{{OPNsense.AcmeClient.settings.challengePort}}" { }
{% endif %}

Maybe this issue could be solved by making the IPv6 address configurable. So instead of always using [::1] it would then be possible for the user to specify one of the configured (LAN) IPv6 addresses. Would this solve the issue, what do you think?

Of course, the code that creates the NAT rules would also have to be changed accordingly:

} elseif (($_ipv6_enabled == true) && (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6))) {
    // IPv6
    $_dst = '::1';
    $_family = 'inet6';
    LeUtils::log("using IPv6 address: ${ip}");

I don't have a box with IPv6 connectivity at hand, so I can't test this myself.

@mbunkus
Contributor Author

mbunkus commented Nov 19, 2020

Thanks for looking into this.

So instead of always using [::1] it would then be possible for the user to specify one of the configured (LAN) IPv6 addresses. Would this solve the issue, what do you think?

Well, the thing is that no matter how you slice it, the daemon must be reachable on the IPv6 address that the certificates' host names resolve to. That could be one or more of the WAN addresses. With IPv6 it's rather common to have multiple addresses used on the same device, you might have multiple WAN connections with different addresses etc.

I pretty much only see two options here:

  1. Make the daemon accept connections to all configured IPv6 addresses (the daemon could listen on only one of them and the NAT rules would then have to redirect traffic for all configured IPv6 addresses)
  2. Make the daemon accept connections only on the first IPv6 address configured on any WAN interface (including NATing only for that address) and let the admin override that with a setting.

Personally I'd very much like to see 1. as it should work out of the box without the need for the admin to have knowledge about the inner workings of this plugin.
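To make option 1 a bit more concrete, here is a minimal sketch of the idea: the daemon listens on one non-loopback IPv6 address, and one redirect rule is generated for every other configured IPv6 address. This is not the plugin's code. The function name, the interface name, the port value and the pf-style `rdr` rule text are all assumptions chosen for illustration, and whether such a redirect actually avoids the scope violation would still need testing.

```python
import ipaddress


def redirect_rules(interface, configured_addrs, listen_addr, challenge_port):
    """Build one pf-style redirect line per configured IPv6 address.

    All traffic to port 80 on any usable configured IPv6 address is sent to
    the single address the challenge daemon listens on. The rule text is
    illustrative only.
    """
    target = ipaddress.ip_address(listen_addr)
    if target.version != 6 or target.is_loopback:
        # Redirecting to ::1 is what triggers the scope violation.
        raise ValueError("listen address must be a non-loopback IPv6 address")

    rules = []
    for raw in configured_addrs:
        addr = ipaddress.ip_address(raw)
        # Skip IPv4, loopback and link-local addresses: public host names
        # cannot usefully resolve to them.
        if addr.version != 6 or addr.is_loopback or addr.is_link_local:
            continue
        rules.append(
            f"rdr pass on {interface} inet6 proto tcp from any to {addr} "
            f"port 80 -> {target} port {challenge_port}"
        )
    return rules


if __name__ == "__main__":
    # Hypothetical interface, addresses and port.
    for rule in redirect_rules(
        "em0",
        ["192.0.2.1", "fe80::1", "2001:db8::1", "2001:db8::2"],
        "2001:db8:1::10",
        43580,
    ):
        print(rule)
```

The loopback check is the important design choice: it rejects the one target address that this issue shows does not work.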

For testing purposes I can easily set up an OPNsense with IPv6 connectivity if you need me to test future patches.

@fraenki
Member

fraenki commented Nov 19, 2020

Well, the thing is that no matter how you slice it, the daemon must be reachable on the IPv6 address that the certificates' host names resolve to.

Are you sure about this? This post suggests that a LAN IPv6 address would be sufficient to have a working NAT rule.

  1. Make the daemon accept connections to all configured IPv6 addresses [...]

By "all" you also mean WAN IPv6 addresses, right? Wouldn't this pose a serious security risk?

@mbunkus
Contributor Author

mbunkus commented Nov 19, 2020

Are you sure about this? This post suggests that a LAN IPv6 address would be sufficient to have a working NAT rule.

What I meant is that connections to the IPv6 addresses that the certificate's host names resolve to must ultimately reach the daemon. This can be achieved by having the daemon listen on any IPv6 address other than ::1 (for the reasons mentioned in my initial post) and having NAT rules that redirect traffic accordingly.

For example, if you have a certificate with host names mail.whatev.er and www.whatev.er, with those names resolving to 2001:db8::1 and 2001:db8::2 and those two being addresses that are configured somewhere on your OPNsense, then you'll need one NAT rule for each of them. You can let the daemon listen on the LAN's IPv6 address. However, now you're assuming that your LAN interface actually has an IPv6 address, and that there is a LAN interface in the first place[1].

By "all" you also mean WAN IPv6 addresses, right? Wouldn't this pose a serious security risk?

Well, you already have that daemon listening on at least some of the addresses while it's active, so listening on all of them for the same period of time isn't that much of an increase in attack surface.

You could also determine which IPv6 addresses you need to accept connections on by resolving the host names of all certificates that are going to be renewed.
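A rough sketch of that last idea, assuming nothing about how the plugin is structured: resolve each certificate host name, keep only the IPv6 results, and intersect them with the addresses configured on the firewall. The function names are hypothetical, and the resolver is passed in as a parameter so the logic can be tried without DNS access.

```python
import ipaddress
import socket


def resolve_ipv6(hostname):
    """Return the IPv6 addresses a host name resolves to (needs working DNS)."""
    try:
        infos = socket.getaddrinfo(
            hostname, 80, socket.AF_INET6, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Drop a possible "%zone" suffix on link-local results.
    return [info[4][0].split("%")[0] for info in infos]


def addresses_to_redirect(hostnames, configured_addrs, resolve=resolve_ipv6):
    """IPv6 addresses that need a redirect rule for the given certificates.

    An address qualifies if at least one host name resolves to it and it is
    configured on this machine.
    """
    local = {ipaddress.ip_address(a) for a in configured_addrs}
    wanted = set()
    for name in hostnames:
        for raw in resolve(name):
            addr = ipaddress.ip_address(raw)
            if addr.version == 6 and addr in local:
                wanted.add(addr)
    return [str(a) for a in sorted(wanted)]


if __name__ == "__main__":
    # Hypothetical DNS data matching the example host names above.
    fake_dns = {
        "mail.whatev.er": ["2001:db8::1"],
        "www.whatev.er": ["2001:db8::2", "2001:db8::99"],
    }
    print(
        addresses_to_redirect(
            ["mail.whatev.er", "www.whatev.er"],
            ["2001:db8::1", "2001:db8::2", "192.0.2.1"],
            resolve=lambda name: fake_dns.get(name, []),
        )
    )
```

Limiting the rules to addresses that are actually named in DNS keeps the exposed surface as small as the certificates require.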

[1] I actually have one OPNsense running without a LAN interface; it's used as an email gateway with rspamd. It delivers mail to a backend server that's located on the public internet, too. Hence no LAN interface, hence no IPv6 address on a LAN interface.

@fraenki
Member

fraenki commented Nov 19, 2020

Thanks for the clarification. So the os-acme-client daemon does not actually need to bind to any WAN IP; a NAT rule is still sufficient.

I think I will add a new option to choose an interface. The plugin will then automatically select the assigned IPv6 address and let the os-acme-client daemon bind on this address. Having this IP as NAT target should work, I think.

However, I don't think selecting the WAN interface should be allowed, because it would mean that the os-acme-client daemon is permanently listening on that IP.

IMHO this is one of these examples where DNS-01 should be preferred over HTTP-01.

@mbunkus
Contributor Author

mbunkus commented Nov 19, 2020

Oh, I agree, DNS-01 is much nicer than HTTP-01. I strongly dislike the fact that you have to interrupt your regular HTTP traffic for the duration of renewing with the "extra daemon" model. And I am using DNS-01 pretty much everywhere. The problem was, though, that there was no support for my DNS provider in os-acme-client, and that's why I had to fall back to trying HTTP-01.

Support for my DNS provider has been added since (see #1968). Therefore HTTP-01 support isn't all that relevant for me anymore. I filed this report mostly so that there is documentation about it in case someone else stumbles across the same issue, and so that there's a chance of it getting fixed one day. Like I said, it isn't time critical for me.

@fraenki
Member

fraenki commented Nov 19, 2020

@mbunkus I'll try to implement this in one of the next releases and would be grateful for a quick review once I've done the coding part :)

@mbunkus
Contributor Author

mbunkus commented Nov 19, 2020

Sure thing. Cannot guarantee an instant review, but give me some time and I'll definitely give it a try.

fraenki changed the title from "Let's Encrypt: HTTP verification method fails with domain names resolving to IPv6 addresses" to "security/acme-client: HTTP verification method fails with domain names resolving to IPv6 addresses" on Dec 13, 2020
@joshuarestivo

It's unclear to me what the outcome of this issue was. I'm trying to use HTTP-01 with a dual-stack interface (acme client v 4.3) and it's failing. Even when I configure the Challenge Type with auto-discovery disabled, select the WAN interface, and configure a single address (IPv6) in 'IP addresses', the logs appear to show that the v6 address is ignored. Logs have two entries, one each for my public and private IPv4 addresses.

I can't use the DNS challenge because a third party controls the DNS. I'm trying to migrate to an EC2-based OPNsense instance, but the NAT that's in play with EC2 doesn't seem to work with the HTTP-01 client, so I'm hoping to make IPv6 work.

@fraenki
Member

fraenki commented Jul 27, 2024

Unfortunately, I never got time to work on this (even after 3+ years). So I'll unassign myself now. If someone wants to help, then please submit a Pull Request.

fraenki removed their assignment on Jul 27, 2024
@OPNsense-bot

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository,
please read https://github.com/opnsense/plugins/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue,
just let us know, so we can reopen the issue and assign an owner to it.

OPNsense-bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 27, 2024