NPTv6 return traffic exiting WAN with wrong IP #4879

Closed
2 tasks done
FingerlessGlov3s opened this issue Mar 29, 2021 · 33 comments · Fixed by #4962

@FingerlessGlov3s
Contributor

FingerlessGlov3s commented Mar 29, 2021

Describe the bug

I have NPTv6 set up so that I can use my limited IPv6 /64 prefix on any network behind OPNsense. When traffic originates from the machine behind OPNsense (fd37:c611:72fb:80::10), I can curl and ping the internet fine, and curl -6 ifconfig.co returns 2001:41d0:800:2647::123:123:123, so I know it's working. Monitoring the traffic with tcpdump -i vtnet0 icmp6 on OPNsense, I can see the traffic leave the firewall with the correct IP.

Now the bug: when I try to access 2001:41d0:800:2647::123:123:123 from the internet (a good example is a ping6 tool), I can see the traffic go through OPNsense, get translated to fd37:c611:72fb:80::10, and then arrive on the host behind OPNsense, using tcpdump there as well. When the reply goes back through OPNsense, it doesn't get translated back to 2001:41d0:800:2647::123:123:123 when it exits the WAN; at least tcpdump isn't showing this.

This used to work fine, and I tested it heavily. I believe it has been broken since going to 21.1, but I can't be 100% sure, since I was only monitoring the IP internally, not externally.

To Reproduce
Create NPT rule

NPT Rule
| interface | WAN |
| Internal IPv6 Source | fd37:c611:72fb:80::10/128 |
| Destination IPv6 Prefix | 2001:41d0:800:2647::123:123:123 |
| Description | Mailcow |

Firewall Rule
| interface | WAN |
| Direction | IN |
| TCP/IP Version | IPv4+IPv6 |
| Source | ANY |
| Destination | ALIAS (contains fd37:c611:72fb:80::10 and 10.111.1.11[One-to-One NAT]) |
| Log | checked |
| Description | Mailcow ICMP |
| Gateway | Default |

Expected behaviour

Traffic to get translated back to the public IPv6, before exiting the WAN.

Describe alternatives you considered

Removing the NPTv6 rule and adding it again.

Relevant log files

Ping started from an external host (tcpdump monitoring the OPNsense WAN). Doesn't work:

18:24:08.169244 IP6 (flowlabel 0xafd41, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo request, seq 1
18:24:08.169588 IP6 (flowlabel 0xce9b9, hlim 62, next-header ICMPv6 (58) payload length: 64) fd37:c611:72fb:80::10 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo reply, seq 1
18:24:09.187685 IP6 (flowlabel 0xafd41, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo request, seq 2
18:24:09.187912 IP6 (flowlabel 0xce9b9, hlim 62, next-header ICMPv6 (58) payload length: 64) fd37:c611:72fb:80::10 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo reply, seq 2
18:24:10.211634 IP6 (flowlabel 0xafd41, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo request, seq 3
18:24:10.211935 IP6 (flowlabel 0xce9b9, hlim 62, next-header ICMPv6 (58) payload length: 64) fd37:c611:72fb:80::10 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo reply, seq 3

Ping started from the guest. Works:

18:24:45.375525 IP6 (flowlabel 0xb6b18, hlim 63, next-header ICMPv6 (58) payload length: 64) 2001:41d0:800:2647::123:123:123 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo request, seq 1
18:24:45.390534 IP6 (flowlabel 0x818e3, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo reply, seq 1
18:24:46.376990 IP6 (flowlabel 0xb6b18, hlim 63, next-header ICMPv6 (58) payload length: 64) 2001:41d0:800:2647::123:123:123 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo request, seq 2
18:24:46.391635 IP6 (flowlabel 0x818e3, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo reply, seq 2
18:24:47.378108 IP6 (flowlabel 0xb6b18, hlim 63, next-header ICMPv6 (58) payload length: 64) 2001:41d0:800:2647::123:123:123 > 2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo request, seq 3
18:24:47.392740 IP6 (flowlabel 0x818e3, hlim 49, next-header ICMPv6 (58) payload length: 64) 2a03:b0c0:3:d0::101:4001 > 2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo reply, seq 3
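
The failure in the first capture can be spotted mechanically: any packet seen on the WAN whose source is still a unique local address (fc00::/7) left the firewall untranslated. The following is a minimal sketch, assuming verbose tcpdump output in the same format as the captures above; the function name `untranslated_sources` and the regular expression are illustrative and not part of OPNsense.

```python
import ipaddress
import re

# Matches the "src > dst:" pair in verbose tcpdump IPv6 output
# (assumes the same line format as the captures in this report).
PAIR = re.compile(r"\)\s+(\S+)\s+>\s+([0-9A-Fa-f:]+):")

# Unique local addresses should never appear as a source on the WAN.
ULA = ipaddress.ip_network("fc00::/7")


def untranslated_sources(lines):
    """Return ULA source addresses found in a WAN capture (hypothetical helper)."""
    leaked = []
    for line in lines:
        match = PAIR.search(line)
        if not match:
            continue
        try:
            src = ipaddress.ip_address(match.group(1))
        except ValueError:
            # Not an address (e.g. a resolved hostname); skip the line.
            continue
        if src.version == 6 and src in ULA:
            leaked.append(str(src))
    return leaked


if __name__ == "__main__":
    capture = [
        "18:24:08.169244 IP6 (flowlabel 0xafd41, hlim 49, next-header ICMPv6 (58) "
        "payload length: 64) 2a03:b0c0:3:d0::101:4001 > "
        "2001:41d0:800:2647::123:123:123: [icmp6 sum ok] ICMP6, echo request, seq 1",
        "18:24:08.169588 IP6 (flowlabel 0xce9b9, hlim 62, next-header ICMPv6 (58) "
        "payload length: 64) fd37:c611:72fb:80::10 > "
        "2a03:b0c0:3:d0::101:4001: [icmp6 sum ok] ICMP6, echo reply, seq 1",
    ]
    print(untranslated_sources(capture))
```

Run against the working capture (ping started from the guest), the same check should report nothing, since every source there is a global address.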

Environment

Software version used and hardware type if relevant, e.g.:

OPNsense 21.1.3_3-amd64 on Proxmox
FreeBSD 12.1-RELEASE-p14-HBSD
OpenSSL 1.1.1j 16 Feb 2021
Intel(R) Xeon(R) E-2236 CPU @ 3.40GHz (2 cores)
VTNET (VirtIO)

AdSchellevis added the support (Community support) label Mar 29, 2021
@FingerlessGlov3s
Contributor Author

Still an issue in 21.1.4

@leifnel

leifnel commented Apr 27, 2021

I believe the issue started at 20.7.6.

I also use NPTv6 to give the internal machines external IPv6 from my only /64.

Without any configuration change, they are no longer reachable after upgrading to 20.7.6.

@FingerlessGlov3s
Contributor Author

I don't have a firewall backup for 20.7.5, are you able to go back to that version and test?

@leifnel

leifnel commented Apr 27, 2021

Yes, luckily I run the firewall on VMware and have a snapshot I can revert to.
Currently at 20.7.5.
First tried going to the current 21.x: didn't work, restored the snapshot.
Tried 20.7.6: didn't work, restored the snapshot, working again at 20.7.5.

@FingerlessGlov3s
Contributor Author

So every time you go back to 20.7.5 it starts working again. I did take a backup but didn't test NPTv6, and it was too late by the time I realised (I don't use it much).

Did you also get the issue where it only works one way, or did it just not work at all?

@leifnel

leifnel commented Apr 27, 2021

I really only tested for incoming access to my webservers, outgoing is a secondary concern.

Could this be the issue (from changelog)? firewall: correctly select current IPv6 field in getInterfaceGateway()

I don't particularly fancy trying upgrading one patch at a time, partly because I don't know how, and partly because I don't know where to start :-(

@FingerlessGlov3s
Contributor Author

Incoming from the WAN is where I discovered the issue too, going out the WAN was working fine.

We'll need to wait for Ad and co. to look into this.

@FingerlessGlov3s
Contributor Author

Updated to 21.1.5, still broken.

@AdSchellevis
Member

Ok, if 20.7.5 works for you and 20.7.6 doesn't, this is the only relevant change I can find: df98263 (#4494)

I expect you can revert the change with opnsense-patch, it would be interesting to see what the difference is in your case between both versions for /tmp/rules.debug

@leifnel

leifnel commented Apr 27, 2021

With the patch ipv6-servers are now available again. (20.7.6)
But they don't have outbound connectivity :-(

@FingerlessGlov3s
Contributor Author

FingerlessGlov3s commented Apr 27, 2021

Did you get /tmp/rules.debug before and after?

Edit: Outbound is a little odd; do you need to restart after the revert?

@leifnel

leifnel commented Apr 27, 2021

I restarted. I have rules.debug after, but not before, but I can always apply the patch again ;-)

@leifnel

leifnel commented Apr 27, 2021

Before and after:

```
186c186
< block in quick on em0 inet6 proto tcp from $blocked to {any} label "3fe0c265e9c80f5e0cff4adb9a55d216"
---
> block in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 proto tcp from $blocked to {any} label "3fe0c265e9c80f5e0cff4adb9a55d216"
198c198
< pass in quick on em0 inet6 proto ipv6-icmp from {any} to {any} keep state label "48a95038a7353e1873eb3e15cccdb3cc" # : Allow ping 4/6
---
> pass in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 proto ipv6-icmp from {any} to {any} keep state label "48a95038a7353e1873eb3e15cccdb3cc" # : Allow ping 4/6
200c200
< pass in quick on em0 inet6 proto tcp from {any} to {any} keep state label "8c4dce26b25101fe96a9db78d02e4887" # : www
---
> pass in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 proto tcp from {any} to {any} keep state label "8c4dce26b25101fe96a9db78d02e4887" # : www
208,209c208,209
< pass in quick on em0 inet6 proto tcp from {any} to {2001:0db8:a:7204::/64} port {80} keep state label "411556fe07d8a24b85117c5b61cf605e" # : http
< pass in quick on em0 inet6 proto tcp from {any} to {2001:0db8:a:7204::/64} port {443} keep state label "cba6ccde496406a638743bec9c90b920" # : https
---
> pass in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 proto tcp from {any} to {2001:0db8:a:7204::/64} port {80} keep state label "411556fe07d8a24b85117c5b61cf605e" # : http
> pass in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 proto tcp from {any} to {2001:0db8:a:7204::/64} port {443} keep state label "cba6ccde496406a638743bec9c90b920" # : https
211c211
< block return in quick on em0 inet6 from {any} to {any} label "1ceca4cc341f753de6fb8de269b0403e" # : Block whats not allowed and log
---
> block return in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 from {any} to {any} label "1ceca4cc341f753de6fb8de269b0403e" # : Block whats not allowed and log
213c213
< block return in quick on em0 inet6 from {any} to {any} label "5fb9e69742d748cd704b7dd9f039e6d3"
---
> block return in quick on em0 reply-to ( em0 2001:0db8:a:72ff:ff:ff:ff:ff ) inet6 from {any} to {any} label "5fb9e69742d748cd704b7dd9f039e6d3"
```

@FingerlessGlov3s
Contributor Author

FingerlessGlov3s commented Apr 27, 2021

21.1.5, with that patch, doesn't work.
Outbound works, Inbound does not.

Removed that patch from 21.1.5: no change. I then re-saved the WAN interface, since I presumed from looking at the code that this was required for the change to take effect.
Working in both directions, success!

rules.debug differences
With Patch (Before) Not working

pass in log quick on vtnet0 reply-to ( vtnet0 51.89.233.254 ) inet proto icmp from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP
pass in log quick on vtnet0 reply-to ( vtnet0 2001:41d0:800:26ff:ff:ff:ff:ff ) inet6 proto ipv6-icmp from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP
pass in log quick on vtnet0 reply-to ( vtnet0 51.89.233.254 ) inet proto tcp from {any} to $Mailcow port {25} keep state label "dde874088a272b1a49da731f2624d661" # : Mailcow SMTP
pass in log quick on vtnet0 reply-to ( vtnet0 2001:41d0:800:26ff:ff:ff:ff:ff ) inet6 proto tcp from {any} to $Mailcow port {25} keep state label "dde874088a272b1a49da731f2624d661" # : Mailcow SMTP

Removing Patch (After) Working

pass in log quick on vtnet0 reply-to ( vtnet0 51.89.233.254 ) inet proto icmp from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP
pass in log quick on vtnet0 inet6 proto ipv6-icmp from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP
pass in log quick on vtnet0 reply-to ( vtnet0 51.89.233.254 ) inet proto tcp from {any} to $Mailcow port {25} keep state label "dde874088a272b1a49da731f2624d661" # : Mailcow SMTP
pass in log quick on vtnet0 inet6 proto tcp from {any} to $Mailcow port {25} keep state label "dde874088a272b1a49da731f2624d661" # : Mailcow SMTP

There are other rules that differ, but they're just for different services etc.; overall the differences are the same.

$Mailcow expands to the addresses fd37:c611:72fb:80::10,10.111.1.11

2001:41d0:800:26ff:ff:ff:ff:ff is the Gateway
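
The difference between the two rulesets comes down to which inet6 rules carry a reply-to clause. As a rough sketch of how to list them, assuming the generated ruleset has the same one-rule-per-line layout as the excerpts above (the helper name `inet6_reply_to_rules` is made up for illustration):

```python
def inet6_reply_to_rules(rules_text):
    """Return the inet6 rules that carry a reply-to clause (hypothetical helper)."""
    hits = []
    for line in rules_text.splitlines():
        # Ignore the trailing "# : description" comment when tokenising.
        tokens = line.split("#", 1)[0].split()
        if "reply-to" in tokens and "inet6" in tokens:
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        'pass in log quick on vtnet0 reply-to ( vtnet0 51.89.233.254 ) inet proto icmp '
        'from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP',
        'pass in log quick on vtnet0 reply-to ( vtnet0 2001:41d0:800:26ff:ff:ff:ff:ff ) inet6 '
        'proto ipv6-icmp from {any} to $Mailcow keep state label "0711655b1e0443967c943cce703b74b3" # : Mailcow ICMP',
    ])
    for rule in inet6_reply_to_rules(sample):
        print(rule)
```

On the firewall itself, the input would be the contents of /tmp/rules.debug; in the working state described above, the list for the Mailcow rules would be empty.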

@leifnel

leifnel commented Apr 27, 2021

Are you saying that 21.1.5 does not need to be patched, however the interfaces should just be re-saved?

@FingerlessGlov3s
Contributor Author

FingerlessGlov3s commented Apr 27, 2021

You need to remove that patch, and then resave the WAN interface and it works once again.

Should work from 20.7.6 onwards.

@leifnel

leifnel commented Apr 27, 2021

Just to clarify, do I revert the patch, or do I leave the system alone when upgrading from 20.7.5?
Did I patch 5 or 6 times? Well in the heat of the action, I kind of lost track. So do I feel lucky?

@FingerlessGlov3s
Contributor Author

Leave it alone while upgrading. In case the upgrade reverts the file back to its "broken" state, run the patch command again.

@maurice-w
Member

The patch added 'reply-to' functionality to IPv6 rules which for some reason seems to be incompatible with NPTv6. Instead of reverting the patch, you can just disable reply-to (in the firewall rules or globally in the advanced firewall settings). Should have the same effect.

The question remains whether the reply-to / binat incompatibility is a design limitation or a bug.

@FingerlessGlov3s
Contributor Author

I would have to do some work to separate my rules into IPv4 and IPv6 to use the "disable reply-to" option. It's doable, but I'd rather have the issue mitigated or resolved where needed.

I do use 1:1 BINAT on my IPv4 addresses as well, I'm not sure if NPTv6 is just BINAT under the hood.

@AdSchellevis
Member

I don't think there's a lot we can do; the generic reply-to option isn't very fine-grained (it's either there or not). It's a bit of an inheritance on our end; in reality, quite a few setups are better off assigning their gateways explicitly.

@maurice-w
Member

> I do use 1:1 BINAT on my IPv4 addresses as well

Interesting. And that works with reply-to enabled?

> I'm not sure if NPTv6 is just BINAT under the hood.

It is. Check 'Firewall: Diagnostics: pfInfo: Nat'. Any difference between the IPv4 binat rules and the IPv6 binat rules?

@FingerlessGlov3s
Contributor Author

So currently IPv4 BINAT works with reply-to enabled, as you can see from the rules debug above.

Firewall: Diagnostics: pfInfo: Nat, filtered to show only the interfaces and addresses we're interested in:

@94 nat on vtnet1_vlan80 inet from (vtnet1_vlan80:network:1) to 10.111.1.11 -> (vtnet1_vlan80) port 1024:65535 round-robin
@108 nat on vtnet0 inet from (vtnet0:network:13) to 10.111.1.11 -> (vtnet0) port 1024:65535 round-robin

@8 binat on vtnet0 inet6 from fd37:c611:72fb:80::10 to any -> 2001:41d0:800:2648:0:b19:8008:132
@9 binat on vtnet0 inet6 from 2001:41d0:800:2648:0:b19:8008:132 to any -> fd37:c611:72fb:80::10
@14 binat on vtnet0 inet from 10.111.1.11 to any -> 51.68.209.80

@87 rdr on vtnet1_vlan80 inet from any to 51.68.209.80 -> 10.111.1.11 bitmask

I can do this again once I've flipped it back to the non-working IPv6 state, but I can't do that at the moment.

@maurice-w
Member

The obvious difference is that IPv4 binat only has one rule (binat on $ext_if inet from <internal_IP> to any -> <external_IP>) while IPv6 binat additionally has the reverse rule. I was able to reproduce this. @AdSchellevis, do we know why?
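
To check a ruleset for this pattern, one can look for binat rules whose internal and external addresses also appear swapped on the same interface. The following is only a sketch, assuming the pfInfo NAT output has the same "binat on IFACE FAMILY from A to any -> B" shape as the listing above; `reverse_binat_pairs` is a hypothetical helper, not an OPNsense function.

```python
import re

# Assumes rules of the form shown in the pfInfo listing:
#   binat on vtnet0 inet6 from <internal> to any -> <external>
BINAT = re.compile(r"binat on (\S+) (inet6?) from (\S+) to any -> (\S+)")


def reverse_binat_pairs(nat_text):
    """Return (iface, family, addr_a, addr_b) for each rule that also exists reversed."""
    rules = {match.groups() for match in BINAT.finditer(nat_text)}
    pairs = []
    for iface, family, src, dst in sorted(rules):
        # Report each forward/reverse pair once, ordered by address string.
        if src < dst and (iface, family, dst, src) in rules:
            pairs.append((iface, family, src, dst))
    return pairs


if __name__ == "__main__":
    listing = """
@8 binat on vtnet0 inet6 from fd37:c611:72fb:80::10 to any -> 2001:41d0:800:2648:0:b19:8008:132
@9 binat on vtnet0 inet6 from 2001:41d0:800:2648:0:b19:8008:132 to any -> fd37:c611:72fb:80::10
@14 binat on vtnet0 inet from 10.111.1.11 to any -> 51.68.209.80
"""
    for pair in reverse_binat_pairs(listing):
        print(pair)
```

With the listing above, only the IPv6 pair is reported; the single IPv4 binat rule has no reverse counterpart.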

@AdSchellevis
Member

@maurice-w I think all of this still originates from pfsense/pfsense@462f900. To be honest, I have no clue how many people actually try to use NPTv6 on OPNsense or pfSense.

@FingerlessGlov3s
Contributor Author

I've not come across too many people using NPTv6, I only know of myself and one other within my circles.

I would assume that as more IPv6 gets rolled out, and if ISPs only give out /64 subnets, it may get more use due to the limitations a /64 subnet brings.

@FingerlessGlov3s
Contributor Author

FingerlessGlov3s commented Apr 28, 2021

I've reapplied the patch, so it's back to how it was (broken), and I can confirm the NAT part is still the same.

@94 nat on vtnet1_vlan80 inet from (vtnet1_vlan80:network:1) to 10.111.1.11 -> (vtnet1_vlan80) port 1024:65535 round-robin
@108 nat on vtnet0 inet from (vtnet0:network:13) to 10.111.1.11 -> (vtnet0) port 1024:65535 round-robin

@8 binat on vtnet0 inet6 from fd37:c611:72fb:80::10 to any -> 2001:41d0:800:2648:0:b19:8008:132
@9 binat on vtnet0 inet6 from 2001:41d0:800:2648:0:b19:8008:132 to any -> fd37:c611:72fb:80::10
@14 binat on vtnet0 inet from 10.111.1.11 to any -> 51.68.209.80

@87 rdr on vtnet1_vlan80 inet from any to 51.68.209.80 -> 10.111.1.11 bitmask

I can confirm, disabling reply-to in the Firewall rule where NPTv6 addresses are being used is a workaround.

@maurice-w
Member

maurice-w commented Apr 28, 2021

I tried to reproduce the return traffic translation issue but it seems to be intermittent. Maybe related to firewall state? And I'm using the development version (21.7.a_384) so there might be other differences.

But I'm now pretty confident that the reverse binat rule is erroneous. Not sure whether it's actually responsible for the reply-to incompatibility, but it's pointless. Binat is by definition bidirectional and a single address can't be both internal and external.

Patch 3fbf537 omits the reverse binat rule. NPT keeps working for me and return traffic gets translated correctly, even with reply-to enabled. @FingerlessGlov3s, could you test opnsense-patch 3fbf537?

@FingerlessGlov3s
Contributor Author

I removed my disable reply-to firewall rules, so NPT stopped working. I then applied your patch and re-saved the firewall rules so the live ruleset would be recreated, and boom, working!!! I can confirm my other NPT rules are also working.

Thanks for your help here, it's very much appreciated.

NAT now shows

@94 nat on vtnet1_vlan80 inet from (vtnet1_vlan80:network:1) to 10.111.1.11 -> (vtnet1_vlan80) port 1024:65535 round-robin
@108 nat on vtnet0 inet from (vtnet0:network:13) to 10.111.1.11 -> (vtnet0) port 1024:65535 round-robin

@4 binat on vtnet0 inet6 from fd37:c611:72fb:80::10 to any -> 2001:41d0:800:2648:0:b19:8008:132
@9 binat on vtnet0 inet from 10.111.1.11 to any -> 51.68.209.80

@87 rdr on vtnet1_vlan80 inet from any to 51.68.209.80 -> 10.111.1.11 bitmask

@maurice-w
Member

Great to hear, thanks for testing! Seems like another one of these "someone implemented it like this 10 years ago and no-one knows why" issues. ;-)

@AdSchellevis, if you have no objections, I'll create a PR for the patch.

@AdSchellevis
Member

@maurice-w sure, go ahead. can't promise to ship it in a minor release if it changes legacy behaviour (good or bad)

@FingerlessGlov3s
Contributor Author

It's worth closing this issue off with the command opnsense-patch 3fbf537 in the last message, so that in the meantime, before 21.7, people can easily find and apply the patch.

fichtner added the bug (Production bug) label and removed the support (Community support) label Apr 30, 2021
fichtner added this to the 21.7 milestone Apr 30, 2021
@fichtner
Member

Thanks all! The second rule might have been a workaround and broke when pf(4) changes in FreeBSD made it obsolete at some point, maybe going from 11 to 12 in 20.7?
