PIMD working, nevertheless some doubts, e.g. because of lots(!) of alarms related to e.g. 169.254.0.1 #236

Open
LouisAtGH opened this issue Nov 19, 2022 · 10 comments

Comments

@LouisAtGH
Contributor

Joachim,

pimd is working as expected. However, there are two things to note:

  • I have an enormous(!) number of alarms in the system log, which is definitely not OK
  • Perhaps OK, but a bit strange in my eyes: pimctl shows a lot of interfaces as PIM interfaces, although they are / should be disabled for PIM!?

I attached my pimd config file and a file with some data I collected while trying to understand these issues. It contains pimctl output and a Wireshark trace.

Sincerely,

Louis

I notice the following alarms
<28>1 2022-11-17T20:54:13.978780+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-17T20:54:54.935547+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-17T20:55:22.174577+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 10.60.142.62
<28>1 2022-11-17T20:56:14.808738+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-17T20:56:54.159538+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-17T20:57:14.637808+01:00 pfSense.lan pimd 61739 - - Timeout waiting for reply from routing socket for 169.254.0.1

It is strange that "169.254.x.y" shows up: it is of course a special (link-local) range, so why oh why is pimd using that range!?
The other alarm is also strange, since 10.60.142.62 is perhaps in a range used by my provider, but not by me.

Strange!? All interfaces show up in the pimd interface table
Probably OK, but I am not sure about the PIM interface table that is shown: it lists nearly all interfaces, even though I defined all interfaces as off except the ones I use (default bind: none).

Used interfaces
lagg0.16 Up 192.168.1.1 1 30 0 192.168.1.1 (Normal PC-LAN that is where the hifi receivers are)
lagg0.26 Up 192.168.2.1 1 30 0 192.168.2.1 1 (PC-zone-2)
lagg1.11 Up 192.168.11.1 1 30 0 192.168.11.1 1 (PC-zone-3)
lagg0.13 Up 192.168.13.1 1 30 0 192.168.13.1 1 (IOT zone not yet used)
lagg1.14 Up 192.168.14.1 1 30 0 192.168.14.1 (redzone, where the TWONKY media player is)

em0.4 Up 10.236.170.200 1 30 0 10.236.170.200 1 (IP-TV not used, not enabled)

The media player (TWONKY) is situated in the redzone at IP 192.168.14.15

pimd.conf.txt
pimd_issues_maybe.txt

@troglobit
Owner

I'll try to respond. I'd very much appreciate, however, if you could post one problem per issue in GitHub, and keep it as concise as possible.

  1. What (git) version of pimd is used?
  2. How is pimd started?
    - In particular, the --disable-vifs command line flag is key to answering the second question -- the enable and disable keywords in pimd.conf change behavior slightly depending on that option.
    - The amount and type of logs also depend heavily on the command line options. The log level and subsystem decide what gets logged.
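
A minimal sketch of how the flag and the keywords might combine, assuming pimd is started with `--disable-vifs` so that every interface is off unless explicitly enabled. The interface names are taken from the report above; the exact `phyint` lines are an assumption and should be checked against the pimd.conf manual for the version in use:

```
# pimd.conf (sketch, assumes pimd was started with --disable-vifs)
# Only the interfaces listed with "enable" take part in PIM.
phyint lagg0.16 enable    # PC-LAN with the hifi receivers
phyint lagg1.14 enable    # redzone with the media player

# Without --disable-vifs the default is reversed: all interfaces are
# enabled and unwanted ones are switched off one by one, for example:
# phyint em0.4 disable
```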

Regarding the 169.254 issue you bring up, lots of old UNIX daemons, and in particular the mrouted+pimd family, often denote the multicast VIF or base interface using the first IP address it found when scanning for interfaces. The routing socket backend of pimd could definitely use some help here to be cleaned up and made to follow the log syntax used by the Linux netlink backend. I hope that answers that question.
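
To make the 169.254 point concrete, here is a small C sketch of a check that recognizes IPv4 link-local addresses (169.254.0.0/16, RFC 3927). The helper name `is_ipv4_link_local` is hypothetical and not part of pimd; it only illustrates how a daemon that picks the first address found on an interface could tell that the address is link-local and, for example, prefer the interface name in its log output.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from pimd.
 * Return 1 if addr is a dotted-quad IPv4 address inside 169.254.0.0/16
 * (RFC 3927 link-local), 0 otherwise, including on parse failure. */
int is_ipv4_link_local(const char *addr)
{
    struct in_addr in;

    assert(addr != NULL);
    if (inet_pton(AF_INET, addr, &in) != 1)
        return 0;

    /* s_addr is in network byte order; convert before masking.
     * 0xa9fe0000 is 169.254.0.0. */
    return (ntohl(in.s_addr) & 0xffff0000u) == 0xa9fe0000u;
}
```

With the addresses from the log above, `is_ipv4_link_local("169.254.0.1")` is true, while `10.60.142.62` and `192.168.1.1` are not link-local.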

Regarding the question of "Strange!? All interfaces show up in the pimd interface table", this too is related to the second point addressed above, command line options.

@LouisAtGH
Contributor Author

Joachim,

ad 1) I built it a couple of days ago from the latest master branch
ad 2) pimd is not started with any options as far as I can see; it just uses the config file

The alarms occur in the pfSense system.log file
<28>1 2022-11-21T13:02:17.036665+01:00 pfSense.lan pimd 67308 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-21T13:02:36.786660+01:00 pfSense.lan pimd 67308 - - Timeout waiting for reply from routing socket for 169.254.0.1
<28>1 2022-11-21T13:03:16.602660+01:00 pfSense.lan pimd 67308 - - Timeout waiting for reply from routing socket for 169.254.0.1

Thousands!!!! I really must get rid of those!

Regarding the second point, I am just wondering about it: I explicitly selected ^default do not include^ and nevertheless the interfaces appear in that output.

  • I am not sure if that is correct
  • I am not sure if pimd is processing those ^not selected^ interfaces correctly. That is why I attached the extra info

If I should have opened two issues, one for the '169.254 issue' and one for the 'vague interface selection behavior', my apologies.

@troglobit
Owner

The Timeout log message seems to come from 3e7fb03, introduced by @stormshield-damiend in 2021. It's possible it should be a LOG_DEBUG level log message instead. As the code is constructed, it looks like it was set to LOG_WARNING during development, only to verify the refactor worked as intended. Since I don't have any BSD system up and running, it'll take me a while to verify this theory.

I'll have to look into the issue of enable/disable of interfaces separately. Hopefully I don't need a FreeBSD system for that. Let me get back to you on that.
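
The `<28>` prefix on the log lines quoted earlier fits the LOG_WARNING theory. In the syslog format (RFC 5424) the PRI value is facility * 8 + severity, so 28 = 3 * 8 + 4, which is the daemon facility at warning severity. The same message at debug severity (7) would carry `<31>`. A minimal C sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* RFC 5424: PRI = facility * 8 + severity, with PRI in 0..191.
 * Both helpers are illustrative and not part of pimd. */
int syslog_pri_facility(int pri)
{
    assert(pri >= 0 && pri <= 191);
    return pri / 8;   /* 3 is the "daemon" facility */
}

int syslog_pri_severity(int pri)
{
    assert(pri >= 0 && pri <= 191);
    return pri % 8;   /* 4 is warning, 7 is debug */
}
```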

@LouisAtGH
Contributor Author

Joachim, I have a FreeBSD14-current system here, because I needed it to compile your code for pfSense ....

Also note that my feeling is that not only the alarm is an issue, but also all the actual messages on the network to which the alarms refer.
I mean, what is the use of sending an endless number of messages to an IPv4 link-local address like "169.254.x.y"?

@troglobit
Owner

Yes, if the message should be kept (maybe @stormshield-damiend can weigh in on that), it should probably not refer to the interface by IP address, but maybe by its name instead. I don't know, I'm not that familiar with the BSD routing sockets, frankly.

@LouisAtGH
Contributor Author

Joachim, I found a problem with the (automatically created) config file. Starting with a modified file:

  • solved the strange interface configuration behavior and
  • almost solved the 169.254 problem

I will of course take care of the config file issues myself. Sorry that I did not detect that issue earlier.

@troglobit
Owner

Great to hear! No problem ✋😃

Let's leave this issue open to give @stormshield-damiend some time to answer our question about log level above.

@LouisAtGH
Contributor Author

Yep, it should stay open. The 169.254.x.y message is still there: "pimd[44512]: Timeout waiting for reply from routing socket for 169.254.0.1", which to my feeling (I am not a high-level expert on this) is complete nonsense; 169.254.x.y is by no means a normal range. Let me add for @stormshield-damiend that at this moment I am running pimd on the pfSense development release based on FreeBSD14-current (which works fine, as far as I know at the moment, apart from this issue).

@LouisAtGH
Contributor Author

@stormshield-damiend Assuming there is no fundamental problem behind this issue, it is probably not a big thing to fix. I would appreciate it.

@stormshield-damiend
Contributor

Hi,

I do not work on pimd anymore; development on our side is now done by other people.

What I can say is that this message is an error message that should only occur when you fail to get a route for an unknown address or when the routing part of the kernel fails to answer for whatever reason.
About the flooding part, maybe this log could be guarded by some kind of rate limit to prevent it from being printed too often (once per second, maybe?).
What could also be done is changing the log level so it is not printed by default, and maybe adding a debug counter that could be printed elsewhere, for example "# amount of request blocked due to kernel timeout".
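
The two suggestions above (at most one message per interval, plus a counter of suppressed messages) could be sketched in C roughly as follows. This is not pimd code: the struct and function names are hypothetical, and the caller is assumed to pass in the current time and to do the actual logging itself.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical rate-limit state for one log message. Zero-initialize it. */
struct log_ratelimit {
    time_t        last;        /* time the message was last emitted */
    int           started;     /* 0 until the first message is emitted */
    unsigned long suppressed;  /* messages dropped since the last emit */
};

/* Return 1 if the caller should log now, 0 if the message is suppressed.
 * At most one message per `interval` seconds is allowed. When 1 is
 * returned, *dropped holds the number suppressed since the previous emit. */
int log_ratelimit_allow(struct log_ratelimit *rl, time_t now, time_t interval,
                        unsigned long *dropped)
{
    assert(rl != NULL && dropped != NULL);

    if (rl->started && now - rl->last < interval) {
        rl->suppressed++;
        return 0;
    }

    *dropped = rl->suppressed;
    rl->suppressed = 0;
    rl->last = now;
    rl->started = 1;
    return 1;
}
```

A caller would invoke `log_ratelimit_allow(&rl, time(NULL), 1, &dropped)` before printing the timeout message and, when it returns 1, could append the `dropped` count to the log line so that the suppressed timeouts remain visible.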
