[23.7.7] dhcpc6 does not receive dhcp6 advertise from ISP over PPPOE if pf is on #6962

Closed
2 tasks done
spin-lock opened this issue Oct 27, 2023 · 20 comments
Labels
help wanted Contributor missing / timeout support Community support

Comments

@spin-lock

spin-lock commented Oct 27, 2023

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

After upgrading from 23.7.6, I noticed that my WAN (PPPoE) has neither an IPv6 address nor a PD for my LAN.
From the log I can see that dhcp6c keeps sending DHCPv6 solicits without getting an answer.

I ran a packet capture and confirmed that this is the case:
[root@router ~]# tcpdump -nn -i pppoe0 port 546 and port 547
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
03:41:47.451210 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit
03:41:48.551764 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit
03:41:50.638706 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit
03:41:54.622538 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit
03:42:02.689358 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit

I checked and tried different things without success, until I disabled pf (pfctl -d) and restarted dhcp6c. Oddly enough, this time the replies were received correctly.

Packet capture output:
listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
03:43:30.867525 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 solicit
03:43:30.880805 IP6 fe80::200:5eff:fe00:103.547 > fe80::daff:feed:beef.546: dhcp6 advertise
03:43:30.881094 IP6 fe80::daff:feed:beef.546 > ff02::1:2.547: dhcp6 request
03:43:30.921665 IP6 fe80::200:5eff:fe00:103.547 > fe80::daff:feed:beef.546: dhcp6 reply
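
To compare captures like the two above at a glance, the DHCPv6 message types can be counted from tcpdump's one-line output. This is only a sketch: `count_dhcp6` is a hypothetical helper name, and it assumes each line contains `dhcp6 <type>` exactly as in the captures shown here.

```sh
# Count DHCPv6 message types in tcpdump one-line output read from stdin.
# Assumes each packet line contains "dhcp6 <type>" as in the captures above.
count_dhcp6() {
    awk '
        {
            # Find the "dhcp6" token and tally the word that follows it.
            for (i = 1; i < NF; i++)
                if ($i == "dhcp6") { seen[$(i + 1)]++; break }
        }
        END {
            # Print the four message types of a normal exchange in fixed order.
            n = split("solicit advertise request reply", types, " ")
            for (j = 1; j <= n; j++)
                printf "%s=%d\n", types[j], seen[types[j]] + 0
        }
    '
}
```

Piping a capture into it (for example `tcpdump -nn -l -i pppoe0 port 546 or port 547 | count_dhcp6`, stopped with Ctrl-C) would show solicits with zero advertises in the failing case, and all four types in the working one.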

dhcp6c output:
...
Oct/27/2023 03:43:30: Sending Solicit
Oct/27/2023 03:43:30: a new XID (7c989a) is generated
Oct/27/2023 03:43:30: set client ID (len 14)
Oct/27/2023 03:43:30: set identity association
Oct/27/2023 03:43:30: set elapsed time (len 2)
Oct/27/2023 03:43:30: set option request (len 4)
Oct/27/2023 03:43:30: set IA_PD prefix
Oct/27/2023 03:43:30: set IA_PD
Oct/27/2023 03:43:30: send solicit to ff02::1:2%pppoe0
Oct/27/2023 03:43:30: reset a timer on pppoe0, state=SOLICIT, timeo=0, retrans=1091
Oct/27/2023 03:43:30: receive advertise from fe80::200:5eff:fe00:103%pppoe0 on pppoe0
Oct/27/2023 03:43:30: get DHCP option client ID, len 14
Oct/27/2023 03:43:30: DUID: 00:01:00:01:2b:bd:54:2a:7c:2b:e1:13:02:0d
Oct/27/2023 03:43:30: get DHCP option server ID, len 16
Oct/27/2023 03:43:30: DUID: fe:80:00:00:00:00:00:00:02:00:5e:ff:fe:00:01:03
...
Oct/27/2023 03:43:30: Received REPLY for REQUEST
Oct/27/2023 03:43:30: nameserver[0] 2404:8000:11:2::4
Oct/27/2023 03:43:30: nameserver[1] 2404:8000:11:3::2
Oct/27/2023 03:43:30: make an IA: PD-0
Oct/27/2023 03:43:30: create a prefix 2404:8000:1001:193d::/64 pltime=172800, vltime=259200
Oct/27/2023 03:43:30: add an address 2404:8000:1001:193d:7e2b:e1ff:fe13:210/64 on igc3
Oct/27/2023 03:43:30: status code for PD-0: success

It looks like something prevents my ISP from receiving the DHCPv6 solicits when pf is on.
This is very mystifying to me, as this was a straight upgrade from 23.7.6 and I did not make a single config change afterwards.
I compared the rules.debug from 23.7.6 and 23.7.7 and found no discrepancy; both have the auto-generated rules for DHCPv6 in place:
pass in quick on pppoe0 proto udp from {fe80::/10} port {546} to {fe80::/10} port {546} label "8cd6199018ef9eb8a56a803f76d043ba" # allow dhcpv6 client in WAN
pass in quick on pppoe0 proto udp from {any} port {547} to {any} port {546} label "223a20aafe5da09a3dd93ec49dd4a20b" # allow dhcpv6 client in WAN
pass out quick on pppoe0 proto udp from {any} port {546} to {any} port {547} set prio 0 label "edcf3e218111608c15b56710f3080b8b" # allow dhcpv6 client in WAN

I even tried adding a rule passing any to any for IPv6, without success.

Any idea where to look? I'm really at a loss here.
Thanks in advance.

@fichtner fichtner self-assigned this Oct 27, 2023
@fichtner fichtner added the cleanup Low impact changes label Oct 27, 2023
@fichtner fichtner added this to the 24.1 milestone Oct 27, 2023
@fichtner
Member

@spin-lock there is definitely something going on but I'm unsure what it is. Nothing really stands out in the release notes, but 5cb5541 is the closest it comes to influencing this. Can you try to revert it?

# opnsense-patch 5cb5541

Cheers,
Franco

@spin-lock
Author

spin-lock commented Oct 27, 2023

@fichtner Hi Franco, thanks but I tried the patch and still no dice.
Right now, as a temporary fix, I'm using a cron job that monitors IPv6 on my WAN and, if necessary, disables pf and restarts dhcp6c.
Will investigate it again later when I'm free.
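
The monitoring half of such a cron job could look roughly like the sketch below. All function names are hypothetical, and the recovery step is deliberately left as a placeholder: the thread only mentions `pfctl -d` and does not say how dhcp6c was restarted. It assumes FreeBSD-style `ifconfig` output, where address lines start with `inet6`.

```sh
# Succeeds if the ifconfig output on stdin lists a non-link-local inet6 address.
has_global_inet6() {
    awk '$1 == "inet6" && $2 !~ /^fe80:/ { found = 1 } END { exit !found }'
}

# Hypothetical recovery hook. The thread does not show the exact commands used,
# so this only logs; fill in the local steps (e.g. pfctl -d, restart dhcp6c).
recover_wan6() {
    echo "no global IPv6 address found; recovery steps go here" >&2
}

# Check one interface (name passed as $1) and call the hook if needed.
check_wan6() {
    ifconfig "$1" | has_global_inet6 || recover_wan6
}
```

Since the delegated prefix ended up on igc3 in the log above, a call like `check_wan6 igc3` may be the more telling check. Note that leaving pf disabled turns off filtering entirely, so any real recovery step should re-enable it once the lease is obtained.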

@fichtner
Member

Thanks, appreciate all the details I can get. Still a bit baffled about this one.

@hendrer

hendrer commented Oct 28, 2023

I can confirm that exactly this is happening in my setup as well.

@fichtner
Member

Details matter, please.

@hendrer

hendrer commented Oct 28, 2023

This is what I see in the log:

<7>cannot forward src fe80:5::785d:c2ff:feb7:f7fb, dst 2a03:2880:f045:12:face:b00c:0:2, nxt 17, rcvif ixv4, outif pppoe0  

@fichtner
Member

That’s just Android phones being silly when they can’t get an IPv6 address via SLAAC.

@fichtner
Member

ifconfig of the WAN/LAN plus /var/etc/radvd.conf would probably help.

@spin-lock
Author

@fichtner OK, I tried different things over the weekend before finally giving up, destroying the WAN and recreating it from scratch with the same settings. It magically started working again (without having to toggle pf).
Chalk this one up as an unknown glitch.
Thanks Franco!

@fichtner fichtner removed their assignment Oct 30, 2023
@fichtner fichtner added the support Community support label Oct 30, 2023
@fichtner fichtner removed this from the 24.1 milestone Oct 30, 2023
@fichtner fichtner removed the cleanup Low impact changes label Oct 30, 2023
@fichtner
Member

@spin-lock weird :) I'm not against that, but if you could, can you check the configuration history diff to see if maybe the settings are slightly different now?

@hendrer

hendrer commented Oct 30, 2023

I would appreciate that also since I'm still stuck. Will get you the details you asked for

@spin-lock
Author

spin-lock commented Oct 31, 2023

> @spin-lock weird :) I'm not against that, but if you could, can you check the configuration history diff to see if maybe the settings are slightly different now?

With the new config, I noticed that the internal device name for my PPPoE interface has changed from "wan" to "opt8".
So every tag/value of "wan" in config.xml became "opt8" in the working version.
Aside from that, I don't see any other differences.

HTH.

@spin-lock
Author

@fichtner I edited the config of the working version and changed every occurrence of "opt8" to "wan", then reloaded the new config and rebooted; dhcp6c stopped working again (I had to toggle pf to get DHCPv6 replies from the ISP).

I then edited the config again and changed every "wan" back to "opt8", reloaded and rebooted, and it was working normally again.

I'm not sure what to make out of this, hope this will give you some clue, Franco.

HTH.

@fichtner
Member

> With the new config, I noticed the internal device name for my PPPOE has changed from "wan" to "opt8".

Ok, that is expected when redoing the interface as the "lan" and "wan" identifiers are reserved values for the console and factory reset configuration only. Think we should close then.

> I'm not sure what to make out of this, hope this will give you some clue, Franco.

Funky. Can you make a diff -u of the working version vs. the non-working version of the file /tmp/rules.debug -- I think we should see an issue here.

Cheers,
Franco

@spin-lock
Author

spin-lock commented Oct 31, 2023

Here are the excerpts of the diff that I picked out for pppoe:

@@ -148,8 +148,8 @@
 __openvpn_network = "<__openvpn_network>"
 table <__opt14_network>  persist  
 __opt14_network = "<__opt14_network>"
-table <__opt8_network>  persist  
-__opt8_network = "<__opt8_network>"
+table <__wan_network>  persist  
+__wan_network = "<__wan_network>"

@@ -279,9 +279,9 @@
 pass in quick on vlan01 inet6 proto udp from {ff02::/16} to {fe80::/10} port {547} label "83602f6f157b7063427dac458c3c1f17" # allow access to DHCPv6 server on VLAN50_VPN
 pass in quick on vlan01 inet6 proto udp from {fe80::/10} to {(self)} port {546} label "b884b7166e5ac8d3a40d9d2b10e79bd3" # allow access to DHCPv6 server on VLAN50_VPN
 pass out quick on vlan01 inet6 proto udp from {(self)} port {547} to {fe80::/10} label "f49c2bf2475cc72c935119eedd09d3f8" # allow access to DHCPv6 server on VLAN50_VPN
-pass in quick on pppoe0 proto udp from {fe80::/10} port {546} to {fe80::/10} port {546} label "699421e0b4fd390da837b84c6a8f9c15" # allow dhcpv6 client in WAN
-pass in quick on pppoe0 proto udp from {any} port {547} to {any} port {546} label "f6242abe481fee6ca8a2643f53642f7c" # allow dhcpv6 client in WAN
-pass out quick on pppoe0 proto udp from {any} port {546} to {any} port {547} label "4b266c6fc567cdaea179399d4c18b813" # allow dhcpv6 client in WAN
+pass in quick on pppoe0 proto udp from {fe80::/10} port {546} to {fe80::/10} port {546} label "15178fb2a3769d47b7e6b0b3c94cd170" # allow dhcpv6 client in WAN
+pass in quick on pppoe0 proto udp from {any} port {547} to {any} port {546} label "c671121c53904a58a4286cbb7f8500ca" # allow dhcpv6 client in WAN
+pass out quick on pppoe0 proto udp from {any} port {546} to {any} port {547} label "00cbb180fdcff7449da6542ac6da7501" # allow dhcpv6 client in WAN

@@ -383,16 +383,16 @@
 pass out route-to ( gif6 10.2.2.0 ) from {(gif6)} to {!(gif6:network)} keep state allow-opts label "c07d4d62dc804d56bae5a448235ab737" # let out anything from firewall host itself (force gw)
 pass out route-to ( gif2 10.1.2.0 ) from {(gif2)} to {!(gif2:network)} keep state allow-opts label "44ca1a6db5753db09d0a7590009f8498" # let out anything from firewall host itself (force gw)
 pass out route-to ( gif1 10.3.2.0 ) from {(gif1)} to {!(gif1:network)} keep state allow-opts label "33fd75a393e4c83e79ea4694f0ddf766" # let out anything from firewall host itself (force gw)
-pass out route-to ( pppoe0 182.253.230.1 ) from {(pppoe0)} to {!(pppoe0:network)} keep state allow-opts label "bd9a12eff5ea112bddf6772dee0a4da9" # let out anything from firewall host itself (force gw)
-pass out route-to ( pppoe0 fe80::200:5eff:fe00:103 ) from {(pppoe0)} to {!(pppoe0:network)} keep state allow-opts label "b2279682eacf9dfa6d1d368a1cab0c89" # let out anything from firewall host itself (force gw)
+pass out route-to ( pppoe0 182.253.230.1 ) from {(pppoe0)} to {!(pppoe0:network)} keep state allow-opts label "2da77f48c21d6dfc536f1fcbc2e2498e" # let out anything from firewall host itself (force gw)
+pass out route-to ( pppoe0 fe80::200:5eff:fe00:103 ) from {(pppoe0)} to {!(pppoe0:network)} keep state allow-opts label "9a5ed6991dd79497bbb7cbc22c7c0462" # let out anything from firewall host itself (force gw)

@@ -430,11 +430,11 @@
 pass in quick on wg1 inet6 from {any} to {any} keep state label "695755cbead8a77dab08363e3ad1a596"
 #debug:Interface opt7 not found
 # pass in quick on ##opt7## inet from {any} to {any} keep state label "d14c352e9def6f109eafc9d6446ec832"
-pass in quick on pppoe0 inet from $N_WAN_IN to {any} keep state label "4fa6c97fa22be9e93c682771886ec1f8"
-pass in quick on pppoe0 inet6 from $N_WAN_IN to {any} keep state label "4fa6c97fa22be9e93c682771886ec1f8"
-pass in quick on pppoe0 inet proto udp from {any} to {any} port {43580:43581} keep state label "83d91b6320d19f0559bb73309a2bb852" # Lets Encrypt renewals
-pass in quick on pppoe0 inet6 proto udp from {any} to {any} port {43580:43581} keep state label "83d91b6320d19f0559bb73309a2bb852" # Lets Encrypt renewals
-block in quick on pppoe0 inet proto icmp from !$RFC1918_NETv4 to {any} icmp-type {echoreq} label "47fd47d86c097f8e603c172f93151e5c" # Block IPv4 PING from outside
+pass in quick on pppoe0 inet from $N_WAN_IN to {any} keep state label "c35de4ed1b8fcc29e57f31417028b196"
+pass in quick on pppoe0 inet6 from $N_WAN_IN to {any} keep state label "c35de4ed1b8fcc29e57f31417028b196"
+pass in quick on pppoe0 inet proto udp from {any} to {any} port {43580:43581} keep state label "2528b5466a24cb135578005268b181f4" # Lets Encrypt renewals
+pass in quick on pppoe0 inet6 proto udp from {any} to {any} port {43580:43581} keep state label "2528b5466a24cb135578005268b181f4" # Lets Encrypt renewals
+block in quick on pppoe0 inet proto icmp from !$RFC1918_NETv4 to {any} icmp-type {echoreq} label "b88559e3aaae182895587d72fb26d189" # Block IPv4 PING from outside

I don't see any difference other than labels.
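
One way to double-check the "only labels differ" observation is to replace the labels with a fixed token before diffing, so that any real rule change stands out. This is a sketch: `strip_labels` and `diff_rules` are hypothetical helpers, and it assumes the labels are 32 hex digits as in the excerpts above.

```sh
# Replace every rule label with a fixed token so only real rule changes remain.
# Assumes labels are 32 lowercase hex digits, as in the excerpts above.
strip_labels() {
    sed -E 's/label "[0-9a-f]{32}"/label "-"/g'
}

# Compare two rules files ($1 and $2) with labels normalised.
# Prints a unified diff and returns diff's exit status
# (0 means the files are identical apart from labels).
diff_rules() {
    a=$(mktemp) && b=$(mktemp) || return 2
    strip_labels < "$1" > "$a"
    strip_labels < "$2" > "$b"
    diff -u "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Run against saved copies of the working and non-working /tmp/rules.debug, this should still report the `__opt8_network` versus `__wan_network` table and macro lines, since those differ in more than the label.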

@fichtner
Member

OK, I keep going back to "weird" as the word to describe this. I don't have another idea. Maybe a states problem?

@hendrer

hendrer commented Nov 5, 2023

I resolved mine, think it was my switch's IGMP snooping causing issues.

@hendrer

hendrer commented Nov 5, 2023

I decided to push my luck: I changed the MTU on WAN and then changed it back, lost all IPv6 and can't get it back. Now I'm convinced it's not a switch issue but an OPNsense issue.

@hendrer

hendrer commented Nov 5, 2023

I enabled promiscuous mode, then disabled it, then removed the VLAN priority and rebooted, and IPv6 is back. Totally strange. Let me know if there's anything you need to help triage this.

@OPNsense-bot

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository,
please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue,
just let us know, so we can reopen the issue and assign an owner to it.

@OPNsense-bot OPNsense-bot closed this as not planned Apr 24, 2024
@OPNsense-bot OPNsense-bot added the help wanted Contributor missing / timeout label Apr 24, 2024

4 participants