After reboot - dpinger not starting on secondary IPv6 WAN interface. #7400

Closed
Wireheadbe opened this issue Apr 21, 2024 · 22 comments

@Wireheadbe

Wireheadbe commented Apr 21, 2024

Describe the bug

After a reboot, or complete link loss, dpinger for the secondary IPv6 gateway fails to restart. This happened on previous versions as well (24.x)

To Reproduce

Steps to reproduce the behavior:
-Reboot the system
-Log in
-Dpinger not started for secondary IPv6 gateway

Expected behavior

-Dpinger to be started for IPv6 interface

Describe alternatives you considered

-Restarting dpinger manually works (naturally, this is a workaround)
-Going to the gateways page and saving reconfigures dpinger, after which all dpinger instances start correctly

Screenshots
image
image

Relevant log files

See above screenshots. Upon reboot, a dpinger_configure is executed for all gateways except WAN02_DHCP6.
It's seemingly skipped because "WAN02_DHCP6 IPv6 interface address could not be found, skipping."
I also see a mention of "Skipping gateway WAN02_DHCP6 due to empty 'gateway' property."
But it has an IPv6 address (Interfaces -> overview):
image

Additional context

The opnsense system uses a LAGG towards a switch. If the switch goes completely down (as mentioned above), the same behaviour occurs: WAN01 (IPv4 & IPv6) -> dpinger starts correctly. WAN02: only IPv4 dpinger starts.

Environment

Software version used and hardware type if relevant, e.g.:

OPNsense 24.1.6 (amd64).
Intel(R) Core(TM) i5-7400 CPU
NIC: Intel igb -> LAGG to Switch.

@Wireheadbe
Author

The message seems to originate from:

if (empty($gwifip)) {
    log_msg(sprintf('The required %s IPv6 interface address could not be found, skipping.', $name), LOG_WARNING);
    continue;
}

... in /usr/local/etc/inc/plugins.inc.d/dpinger.inc - but as you can see in the last screenshot, the IPv6 gateway isn't empty for said WAN02

@Wireheadbe
Author

Wireheadbe commented Apr 21, 2024

Seems like a timing issue, where the information isn't available quickly enough when the dpinger configure kicks in.
If I randomly add a 2 second delay:

function dpinger_configure_do($verbose = false, $gwname = null, $bootup = false)
{
    sleep(2);
    service_log(sprintf('Setting up gateway monitor%s...', empty($gwname) ? 's' : " {$gwname}"), $verbose);

in /usr/local/etc/inc/plugins.inc.d/dpinger.inc
-> then it "just works"

Hope this helps you in finding some way to fix this race condition.

@fichtner
Member

fichtner commented Apr 21, 2024

Could you try adding a tunable "net.inet6.ip6.dad_count" with value "0" to see if the problem goes away?

To be frank we wait for this to settle in rc.newwanipv6, but it's been known to still fail:

/* wait for DAD to complete to avoid discarding tentative address */
$dad_delay = (int)get_single_sysctl('net.inet6.ip6.dad_count');
if ($dad_delay) {
    /* XXX this is also required but missed for IPv6 VIPs created later in the script */
    sleep($dad_delay + 1);
}
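
For illustration only, the wait above boils down to the arithmetic below. This is a Python sketch rather than the PHP in rc.newwanipv6, and `dad_wait_seconds` and its `margin` parameter are hypothetical names that do not exist in OPNsense.

```python
def dad_wait_seconds(dad_count: int, margin: int = 1) -> int:
    """Seconds to sleep before looking up IPv6 addresses.

    dad_count mirrors the value of net.inet6.ip6.dad_count.
    margin is the extra slack added on top (the "+ 1" in the snippet above).
    Both names are illustrative assumptions, not OPNsense code.
    """
    if dad_count <= 0:
        # DAD disabled: nothing to wait for.
        return 0
    return dad_count + margin


if __name__ == "__main__":
    for count in (0, 1, 3):
        print(count, dad_wait_seconds(count), dad_wait_seconds(count, margin=2))
```

With the tunable at 0 there is no wait at all, and with a non-zero count the only knob left is the margin.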

Debugging this is impossible without a setup where this happens for "reasons".

Cheers,
Franco

@Wireheadbe
Author

I think you broke a fix-record in replying on a rainy Sunday. 😄
That did fix it yes. Would have never guessed it was duplicate address detection causing that.

Any way I can help you guys out with a more permanent fix, seeing I have a "special set-up where this happens for reasons"? (Or we document this behaviour in the OPNsense documentation?)

@fichtner
Member

Hmm, you said dual WAN…. Are both DHCPv6? If yes does it work when you disable the primary one?

@fichtner
Member

(Just the IPv6 part I mean)

@Wireheadbe
Author

Hmm, you said dual WAN…. Are both DHCPv6? If yes does it work when you disable the primary one?

I'll give it a try! Both are DHCPv6 yes.

@Wireheadbe
Author

Wireheadbe commented Apr 21, 2024

Removed the dad_count - disabled IPv6 Gateway on WAN01 -> upon reboot: IPv6 Gateway on WAN02 doesn't come up.

For what it's worth, this is a dual-port, copper, IGB-based NIC.

@fichtner
Member

Let me get back tomorrow with a fresh idea. Thanks for the help!

@fichtner fichtner added the support Community support label Apr 21, 2024
@Wireheadbe
Author

Seems like Linux suffers from this as well..
https://www.agwa.name/blog/post/beware_the_ipv6_dad_race_condition
It mentions:

The problem is that until DAD can confirm that there is no other host with the same address, the address is considered to be "tentative." While it is in this state, attempts to bind() to the address fail with EADDRNOTAVAIL, as if the address doesn't exist. That means that if you have a service configured to listen on a particular IPv6 address, and that IPv6 address is still tentative when the service starts, it will fail to bind to that address. Very few programs will try to bind again later.

If FreeBSD has implemented this in a similar fashion, then yes, we'll run into this if we go "too fast" - not sure if there's a way on FreeBSD to check that "tentative" status.

In any case - many thanks for the lightning fast response 💯
If no clear-cut solution exists - it's probably a good idea to document such an edge case maybe here: https://docs.opnsense.org/troubleshooting/gateways.html

@Wireheadbe
Author

Wireheadbe commented Apr 21, 2024

https://reviews.freebsd.org/D40103
Found some interesting references to NoDAD -> maybe there are provisions to get the actual interface status? (Tentative / normal)

Apparently, the address flag "IN6_IFF_TENTATIVE" could be looked at.

https://redmine.pfsense.org/projects/pfsense/repository/2/revisions/3a335f0798cae05f86d61a43148fd0efc83408d7/diff
Seems like here they did a sleep(1) and were done with it 🤔 - but that doesn't seem to be the cleanest solution. Checking whether IN6_IFF_TENTATIVE is still set for the relevant interface's address would probably be better. But since the DAD process is launched asynchronously, you'd probably need something like a loop to check, which would also not be super clean.
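
To make the "loop to check" idea concrete, here is a rough Python sketch of a bounded poll. Everything in it is an assumption for illustration: `wait_until_not_tentative` is a hypothetical helper, and how `is_tentative` actually reads the flag (for example by inspecting `ifconfig` output) is left to the caller.

```python
import time
from typing import Callable


def wait_until_not_tentative(
    is_tentative: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the address stops being tentative or the timeout expires.

    is_tentative is a caller-supplied probe (hypothetical); it should return
    True while DAD is still running for the address of interest.
    Returns True if the address became usable, False on timeout.
    """
    deadline = clock() + timeout
    while is_tentative():
        if clock() >= deadline:
            return False
        sleep(interval)
    return True


if __name__ == "__main__":
    # Simulated probe: tentative for the first two checks, usable afterwards.
    states = iter([True, True, False])
    print(wait_until_not_tentative(lambda: next(states), interval=0.01))
```

The timeout keeps the loop from hanging forever if the address never settles, but it still means one more polling spot in the system.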

@fichtner
Member

Let's not get carried away here... tentative is what it is and we exclude it when we look for viable addresses because it wouldn't work anyway:

if ($addr['family'] != 'inet6' || $addr['deprecated'] || $addr['tentative'] || $addr['alias']) {

The problem is that when we look for a dynamic "primary" IP address we don't know what we are looking for. We can't push tentative addresses to the caller, as we would end up patching every spot that tries to use a non-usable address, so we exclude them from the lookup.
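
As a rough illustration of why the lookup comes back empty during DAD, here is a Python sketch of the same exclusion rule. It is not the OPNsense implementation: `first_usable_inet6` and the `address` key are made up for the example, while the `family`, `deprecated`, `tentative` and `alias` keys mirror the PHP condition quoted above.

```python
from typing import Optional


def first_usable_inet6(addrs: list[dict]) -> Optional[dict]:
    """Return the first IPv6 entry that is not deprecated, tentative or an alias.

    Mirrors the exclusion rule quoted above; returns None when nothing
    qualifies, which is the case that leads to a "could not be found" skip.
    """
    for addr in addrs:
        if addr.get("family") != "inet6":
            continue
        if addr.get("deprecated") or addr.get("tentative") or addr.get("alias"):
            continue
        return addr
    return None


if __name__ == "__main__":
    # 2001:db8::/32 is the documentation prefix, used here as sample data.
    during_dad = [{"family": "inet6", "address": "2001:db8::1", "tentative": True}]
    after_dad = [{"family": "inet6", "address": "2001:db8::1", "tentative": False}]
    print(first_usable_inet6(during_dad))
    print(first_usable_inet6(after_dad))
```

While the only candidate is still tentative the result is empty, so a caller that runs too early sees no address even though one shows up in the interface overview a moment later.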

rc.newwanipv6 is supposed to be started by dhcp6c after addresses are assigned, which means we get a tentative count and the initial wait should do the trick as mentioned.

There could be complications, however:

  • Tentative takes longer for unknown reasons. You could test this theory by applying a longer wait (increase the "+ 1" delay by a second or two).
  • The address is created later than rc.newwanipv6 is triggered by dhcp6c (we could check that in dhcp6c code but it's unlikely).
  • The address is removed and recreated while rc.newwanipv6 is sleeping so it's missing the new address (it's even hard to prove it).
  • The sleep() in rc.newwanipv6 is interrupted by a signal and does not wait for the appropriate amount of time, but that seems unlikely given that it's so easily reproducible for you and not for others.

Netlink could help with this in the future, but ideally we don't want individual code spots to loop until they have a viable address like done elsewhere as that just clogs up subsystems.

@fichtner
Member

That being said, if the sleep ends up "+ 2" instead of "+ 1" in order to work I'm not against it I guess.

@fichtner
Member

Which address does it find on WAN BTW? IA-NA or one from the prefix or a SLAAC on WAN?

@fichtner
Member

Interfaces: Settings: Log level mode to "Info" might help, but needs a reboot.

@Wireheadbe
Author

Wireheadbe commented Apr 22, 2024

Which address does it find on WAN BTW? IA-NA or one from the prefix or a SLAAC on WAN?

It gets one from the prefix in this case.

That said, I don't think there's an issue with using net.inet6.ip6.dad_count "0" in the case of an "authoritative" device on your network. It should be the one having the specific IP. But maybe we should document the case better?

I've set the log level - will reboot as soon as possible.

@fichtner
Member

fichtner commented Apr 22, 2024

It gets one from the prefix in this case.

Ok, that means it already "swings" to a tracking LAN because there is no GUA IPv6 on WAN itself. Theoretically speaking that is the slowest form of address acquisition in the chain.

@Wireheadbe
Author

DHCP log level is set to info - any specific log output you would like? 👍🏻

@fichtner
Member

The full general ("system") log on reboot for both dad_count unset and set to zero (set to debug level or just grab the file on the disk). You can send it to franco AT OPNsense DOT org

thanks!

@Wireheadbe
Author

Log sent - many thanks! 👍🏻

@Wireheadbe
Author

As a test - we updated the following line to " + 2"
https://github.com/opnsense/core/blob/master/src/etc/rc.newwanipv6#L75
... and removed the tunable.

That made the system behave correctly. I'll monitor over the course of a week, with some interface reloads / simulated link-loss how the system behaves.

@fichtner fichtner self-assigned this Apr 30, 2024
@fichtner fichtner added cleanup Low impact changes and removed support Community support labels Apr 30, 2024
@fichtner fichtner added this to the 24.7 milestone Apr 30, 2024
fichtner added a commit that referenced this issue Apr 30, 2024
The + 1 was completely arbitrary to begin with (derived from
FreeBSD scripting), but if part of the system needs longer to
cope with tentative state then this would be an easy way to
make it more reliable.

If + 3 makes sense for the next person is something I want to
doubt, however.

Special thanks go to @Wireheadbe for pursuing and testing this.
@Wireheadbe
Author

Closing - f2e60c1 fixes this 👍🏻

fichtner added a commit that referenced this issue May 6, 2024

(cherry picked from commit f2e60c1)