PIMD causes random kernel panics: needs to validate interfaces before attempted use #218

Open
MrPeteH opened this issue Mar 6, 2022 · 9 comments

@MrPeteH

MrPeteH commented Mar 6, 2022

My pfSense started reliably crashing (random but always within 1-2 hr of boot) with kernel panics pointing to pimd.

After much sleuthing, I discovered the root cause:

  • Over a year ago, I eliminated some VLANs and subnets on my network (now combining wifi/wired instead of separate for each)
  • As a result, several interfaces were deleted
  • Not surprisingly, the system is unable to auto-delete the interfaces (and associated subnets) from various places in the overall config (firewall rules and aliases, pimd config, pfblocker config)
  • In most cases, nonexistent interfaces are ignored

However, pimd doesn't ignore the invalid interfaces. It apparently attempts to use them or at least do something with them.

Most of the time, this doesn't cause an issue (I did run for about a year like this!) But sometimes it does... leading to a kernel panic.

I think I have backup copies of various config files if necessary.

(Note: I am still eventually hoping to track down #171 )

@troglobit
Owner

I'm sorry to hear that, but kernel crashes are not really the responsibility of userland applications. The kernel should handle such issues. You do of course have the option of disabling pimd on select interfaces (or disabling all, and then selectively enabling), which could be a way to work around this problem.

@MrPeteH
Author

MrPeteH commented Mar 11, 2022

Hmmm. That makes sense. (BTW, I do disable all and selectively enable. It's a bad *.conf file containing references to invalid interfaces due to having deleted some VLANs. Thus, pimd isn't validating interface references in the conf file? )

Yet, here's the strange thing I noted: this is a kernel panic in the "pim" area of the kernel.

Do I understand correctly that pimd is modifying various parameters of the built-in "pim" support?

On further reflection, presumably:

  • there ought to be nothing (even 100% invalid info) that pimd can do to cause a kernel crash.
  • If there IS such a crash, then the kernel is not fully validating the information passed to it.

Does that make sense? Or am I simply out to lunch on this? ;)

@troglobit
Owner

Yes, correct. You are spot on. Regardless of what little old pimd does, it shouldn't crash the kernel.

@MrPeteH
Author

MrPeteH commented Mar 13, 2022

So I will report this up-chain 😏

@MrPeteH
Author

MrPeteH commented Mar 15, 2022

On further reflection... it's TRUE that userland "shouldn't" crash the kernel.
At the same time, defense-in-depth design suggests that all layers are well served by handling things well.
Can't be much overhead to validate interfaces at startup time and remove (or complain about) any that are invalid.

@troglobit
Owner

Of course. And at startup pimd (master) does this by calling getifaddrs() and comparing the result with the interfaces enabled on the command line and in pimd.conf. There is of course a certain window, a matter of milliseconds, between this initial probe and the point where pimd actually registers VIFs with the kernel for multicast routing, and interfaces may go missing in that window. At runtime, however, there is no check for interfaces that suddenly disappear. Here it really is up to the kernel to clean up any VIFs registered with physical interfaces, and to return EIO or similar for any outstanding recvfrom(), setsockopt(), or similar call on a socket open on any of these interfaces.
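
(Illustration, not pimd code.) The startup probe described above can be sketched roughly as follows. The helper name iface_exists() is hypothetical and does not appear in the pimd sources; it only shows the general idea of looking a configured name up in the list returned by getifaddrs(). A result of 1 only says the interface was present at the moment of the call, so the race window mentioned in the comment still applies.

```c
/* Sketch only: iface_exists() is a hypothetical helper, not part of pimd. */
#include <sys/types.h>
#include <ifaddrs.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the system currently has an interface called `name`,
 * 0 if it does not, and -1 if the interface list could not be read.
 */
int iface_exists(const char *name)
{
    struct ifaddrs *list, *ifa;
    int found = 0;

    if (name == NULL)
        return 0;
    if (getifaddrs(&list) == -1)
        return -1;

    /* The same interface may be listed several times, once per address. */
    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_name != NULL && strcmp(ifa->ifa_name, name) == 0) {
            found = 1;
            break;
        }
    }

    freeifaddrs(list);
    return found;
}
```

A caller would run this once per interface named in the configuration, before registering anything with the kernel, and decide what to do with names that return 0.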

@MrPeteH
Author

MrPeteH commented Mar 17, 2022

In my case, we're not dealing with dynamic change.
Six months ago I removed some VLANs, and thus the associated interfaces.
For whatever reason, now having those in pimd.conf leads to kernel panics within seconds to hours of startup.

I hear you saying you already validate the interfaces. Should an invalid interface in pimd.conf...

  • Produce a startup error?
  • Simply be ignored by PIMD?
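
(Illustration, not pimd code.) The two behaviours asked about here could be sketched along these lines. Everything in the block is an assumption made for the example: the enum, the function name check_configured_ifaces(), and the message texts are hypothetical, and the lookup uses the POSIX call if_nametoindex(), which returns 0 for a name the system does not know.

```c
/* Sketch only: names and messages are hypothetical, not taken from pimd. */
#include <net/if.h>
#include <stddef.h>
#include <stdio.h>

enum missing_policy {
    MISSING_IS_ERROR,   /* refuse to start on the first unknown name */
    MISSING_IS_SKIPPED  /* warn, drop the name, and carry on */
};

/*
 * Check `count` configured interface names against the running system.
 * Names that exist are compacted to the front of `names`.
 * Returns the number of names kept, or -1 under MISSING_IS_ERROR as
 * soon as an unknown interface is seen.
 */
int check_configured_ifaces(const char *names[], int count,
                            enum missing_policy policy)
{
    int kept = 0;

    for (int i = 0; i < count; i++) {
        const char *name = names[i];

        if (name != NULL && if_nametoindex(name) != 0) {
            names[kept++] = name;
            continue;
        }

        if (policy == MISSING_IS_ERROR) {
            fprintf(stderr, "error: configured interface %s does not exist\n",
                    name ? name : "(null)");
            return -1;
        }
        fprintf(stderr, "warning: ignoring nonexistent interface %s\n",
                name ? name : "(null)");
    }

    return kept;
}
```

Either policy keeps stale names away from the kernel; the difference is only whether a leftover entry in the config stops the daemon or is logged and dropped.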

Something strange certainly happens.

@troglobit
Owner

You know, I'm not really sure what it is you expect me to do for you here in this bug report. You've not said how you start pimd, or what your pimd.conf contains. I don't know what version of pimd you are using, or what patches pfSense may have applied to integrate it into FreeBSD. I don't use FreeBSD myself, so for the pimd project that's like a second- or third-tier platform.

There's lots of error handling in pimd, and the code is here on GitHub for you to inspect and learn from. Check out config.c, vif.c and kern.c and you can see the error handling and log levels used for each such message.

@MrPeteH
Author

MrPeteH commented Mar 24, 2022

I understand. Time for me to put on my coder/diagnosis hat and dig in. ;)

2 participants