Services not publishing Tunnel Only Nodes - Nightly 1112 (LUA) #324
Comments
Interesting... I've not seen this. Could you attach a support bundle taken after you see the issue? THX |
Some background on this. This only happens on hardware which has at least two Ethernet devices (eth0 and eth1). When Mesh WiFi is disabled, we create a "dummy" mesh device called either eth0.3975 or eth1.3975, which we assign the mesh IP address to. We do this for a number of reasons, one of which is to keep OLSRD happy: it needs to see its primary IP address attached to a real network device, and since we disabled the Mesh WiFi we can't use that. OLSRD will only publish services associated with IP addresses attached to active network devices. OLSRD considers a network device to be active if the device is up and has a carrier. For an Ethernet device (and any VLAN associated with it) this means a cable must be plugged in (with the other end in a switch, etc.). Generally this is fine because, if WiFi is disabled, the device is probably connected via DtD or LAN to something, and the VLAN is created on that physical network device. However, if only the WAN is connected, no carrier will be detected for the VLAN, and so OLSRD will not publish any services associated with that IP address. I think this is what happens here. |
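To make the "up and has a carrier" rule concrete, here is a minimal sketch of how one might check it from user space on Linux. It is not OLSRD's code; the function name `is_active` and the `sysfs_root` parameter are made up for illustration. It relies only on the standard sysfs files `flags` and `carrier` under /sys/class/net/<device>.

```python
from pathlib import Path

IFF_UP = 0x1  # "administratively up" bit from linux/if.h


def is_active(dev: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if `dev` is up and reports a carrier.

    Hypothetical helper: it mirrors the rule described in the comment
    above, not OLSRD's actual implementation.
    """
    base = Path(sysfs_root) / dev
    try:
        # `flags` holds a hex string such as "0x1003".
        flags = int((base / "flags").read_text().strip(), 16)
        if not flags & IFF_UP:
            return False
        # `carrier` holds "1" or "0"; reading it can fail when the
        # device is down, which is treated as "no carrier" here.
        return (base / "carrier").read_text().strip() == "1"
    except (OSError, ValueError):
        return False
```

On an affected node one would expect something like `is_active("eth1.3975")` to return False while only the WAN is connected, assuming the VLAN sits on an unplugged port.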
That explanation makes sense, @aanon4 Tim -- I've just never seen this issue on any of my non-RF tunnel nodes. They all display the services list even though they only have the "dummy" eth0.3975 interface. I haven't been able to reproduce the issue on any of my wifi-wan tunnel nodes. I was hoping maybe there'd be a log entry if an error was occurring. |
This occurs on both the GL.iNet AR750 Creta and the MikroTik hAP ac so far. @ab7pa I can post a support file if needed. |
@k1ky So the issue is only occurring on dual-radio nodes? |
I have not tried it on single-radio nodes, but can check. I'm connecting to the hAP/AR750 via Wi-Fi from my laptop; the node then connects to a home Wi-Fi network using the Wi-Fi WAN. The tunnel is enabled and connected to an offsite station. |
Why doesn't this system support attaching files in .tgz format? Here is the .tgz support file, zipped. |
So the important information is in data.txt; specifically:
and
The first is how Linux views the state of the network devices, while the second is how OLSR views the state of the devices. If we did an 'ip link' dump (which we don't - maybe we should add it?) you'd see that eth1.3975 and eth1.2 both have NO-CARRIER set. |
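If an 'ip link' dump were added to the support data, picking out the devices flagged NO-CARRIER could look roughly like the sketch below. The sample text is invented for illustration and is not taken from the attached support file; `no_carrier_devices` is a hypothetical helper name.

```python
import re

# Illustrative `ip link` output only; NOT from the support bundle.
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:00:5e:00:53:01 brd ff:ff:ff:ff:ff:ff
7: eth1.3975@eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:5e:00:53:02 brd ff:ff:ff:ff:ff:ff
"""

# Matches "<index>: <name>[@<parent>]: <FLAG,FLAG,...>" header lines.
HEADER = re.compile(r"^\d+:\s+(?P<name>[^:@\s]+)(?:@[^:\s]+)?:\s+<(?P<flags>[^>]*)>")


def no_carrier_devices(ip_link_output: str) -> list[str]:
    """Return the names of devices whose flag list contains NO-CARRIER."""
    found = []
    for line in ip_link_output.splitlines():
        match = HEADER.match(line)
        if match and "NO-CARRIER" in match.group("flags").split(","):
            found.append(match.group("name"))
    return found


print(no_carrier_devices(SAMPLE))
```

The indented `link/ether` lines do not match the header pattern, so only the per-device header lines are inspected.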
As for solutions, unless there's a flag somewhere to tell Linux to fake the carrier on a device (I can't find one), there are two options I can think of. One is to switch from using the LAN/DtD device for the VLAN to something which is always up. There are probably candidates for this, although the obvious one is the loopback address, and OLSRD won't accept that. The second solution is for OLSR to ignore (selectively?) the NO-CARRIER state of a device. It is explicitly filtering the information it publishes based on the interface state, so we could modify the code to not care under some circumstances. |
Just for the fun of it, I turned off Wi-Fi on my computer (which was connecting to the MESH Node AR750) and connected the computer directly via Ethernet to the LAN port on the node. It made no difference: the Services are still not listed. The node was still connected to my home Wi-Fi and tunneled to the outside world. |
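The second option can be pictured as a small change to the publish decision. The sketch below is only a model of that idea in Python: OLSRD itself is written in C, and neither `IfaceState` nor `should_publish` nor the `ignore_carrier_for` allow-list exists there or in #327. All three are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IfaceState:
    """Hypothetical snapshot of one interface's state."""
    name: str
    is_up: bool
    has_carrier: bool


def should_publish(iface: IfaceState,
                   ignore_carrier_for: frozenset[str] = frozenset()) -> bool:
    """Decide whether services bound to `iface` would be published.

    With an empty allow-list this is the current rule (up AND carrier).
    Devices named in `ignore_carrier_for` only need to be up.
    """
    if not iface.is_up:
        return False
    return iface.has_carrier or iface.name in ignore_carrier_for


dummy = IfaceState("eth1.3975", is_up=True, has_carrier=False)
print(should_publish(dummy))                                        # current rule
print(should_publish(dummy, ignore_carrier_for=frozenset({"eth1.3975"})))
```

Keeping the exemption to an explicit list, rather than dropping the carrier check everywhere, is one way to read the "(selectively?)" in the comment above.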
Could you attach updated system data with this changed configuration? |
Here ya go. Laptop Wi-Fi=Off, connected to AR750 direct via Ethernet. |
Thanks. Could I have you reboot the node in this configuration (so the ethernet is connected during the reboot rather than being plugged in after the fact)? |
Interesting to note that after sitting overnight, the Services are being published. Here is a support file after reboot with Ethernet from Laptop connected to LAN and no DTD or MESH RF Connections. Services are listed in this configuration. N-1112 |
For whatever reason your node had stopped updating the host and service files (something OLSRD does) for quite a while before you plugged in your Ethernet cable. It's not entirely obvious to me why, as OLSRD was still working fine, so perhaps there were no incoming changes from the network (which seems very unlikely on a mesh network, but maybe more likely if your only connection is a tunnel). I suspect plugging in a cable wasn't sufficient reason for OLSRD to update these files, which perhaps it should be, but at this point we're in corner-cases-of-corner-cases territory. There are options to fix this a few comments back, so I'll let people provide feedback on those. |
I'm wondering what changed with the LUA migration such that this function no longer works under these conditions on these units? |
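One way to spot this kind of stall is to look at how long ago the host and service files were last written. The sketch below is an assumption-heavy illustration: the two paths are where I would expect OLSRD's name service output to live on a node, but they are not stated anywhere in this thread, and `stale_files` and the 10-minute threshold are invented for the example.

```python
import os
import time

# ASSUMED locations of the OLSRD-maintained files; adjust for your node.
WATCHED = ("/var/run/hosts_olsr", "/var/run/services_olsr")


def stale_files(paths=WATCHED, max_age_s=600.0, now=None):
    """Return the paths not modified within the last `max_age_s` seconds.

    A file that is missing or unreadable is reported as stale too.
    `now` can be passed explicitly to make the check repeatable.
    """
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        try:
            age = now - os.stat(path).st_mtime
        except OSError:
            stale.append(path)
            continue
        if age > max_age_s:
            stale.append(path)
    return stale
```

Running `stale_files()` before and after plugging in a cable would show whether the change actually triggered a rewrite of either file.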
You can verify if this is LUA related by trying this on the current release build (which is non-LUA), but I think you'll see the same thing. |
Proposed fix: #327 |
This issue appears to have been fixed in one of the nightly releases from a week or so ago. It appears to be good with Nightly 1191 (4/26/22). |
Services are not publishing on Tunnel Only connected nodes when connected via Wi-Fi as WAN. Once they connect to another node via RF or DTD, the services reappear. Disconnect the "buddy" and the services disappear. This dates back several Nightlies, to the full LUA migration. (Tim has the support data; see the comments from @aanon4 for an expansion on this problem.)