Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LAN Client not getting router from DHCP when "prevent WAN access" is enabled #474

Closed
WZ0C opened this issue Aug 8, 2022 · 21 comments
Closed
Assignees
Labels
bug Something isn't working

Comments

@WZ0C
Copy link

WZ0C commented Aug 8, 2022

The client is a Reolink 823A camera. It successfully gets a DHCP lease and is accessible and usable on the local LAN segment. However, it is not accessible to clients on other mesh nodes because it does not have a route out of the local segment. If a remote client tries to access the camera, the TCP SYN reaches the camera; however, no SYN|ACK is returned (no route on which to send it), and the remote client can not use the camera. If a local client tries to access the camera, the handshake completes normally, and the camera can be accessed.

The DHCP responses to the client do not have Gateway addresses in the main headers, and when "prevent WAN access" is enabled, the option 3 router is not sent to the client. Additionally, the option 121 and 249 routes are never sent to this client. (And if the options 121 and 249 are force sent to the client, the client doesn't appear to use them.)

If "prevent WAN access" is disabled, then clients on other mesh nodes are able to access the camera without issue.

"Prevent access" not set: (images 1, 2, 3)
image
image
image

"Prevent access" set: (images 4, 5, 6)
image
image
image

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

After further investigation, it appears there may be a design assumption that a lan client could be multi-homed and/or have a default route on another router. And so the option 3 router would only be appropriate to send to the client if it was accessing the WAN through the node. And so DHCP option 121 is used to send CIDR routes for 10/8 and 172.16/12 so the LAN clients can access the mesh.

However, if the LAN client does not support DHCP option 121, then it has no way to get a route to the mesh, short of a static network configuration. If "Prevent WAN access" is not enabled, then the LAN client does not receive DHCP option 3. Separately, DHCP option 33 may also not work, and there are no other DHCP options that could set these routes.

Perhaps this issue is less bug and more feature request: the ability to have LAN clients receive a default route through the node (DHCP option 3 or a DHCP gateway address), even if "Prevent WAN access" is not enabled.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

I'll be honest, I'm not entirely sure what the code we have here is trying to do exactly. As you observed, when "Prevent WAN access" is enabled, no dhcp option 3 is setup. Instead we provide a 0.0.0.0/0 route as part of option 121 and 249. This route appears to be the LAN address of this issuing device ... which feels like what option 3 would have been doing anyway, so why we do things differently here is not clear. Nor is it clear why that setup would stop WAN access if it were available on the node anyway. I don't immediately see any firewall-ing which is what I'd have expected to find.

Going to kick this to @ae6xe for his thoughts, but it seems a little broken to me - but I could well be missing something.

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

The routes from options 121 and 249 are actually 10.0.0.0/8 and 172.16.0.0/12, directed to the node. So by default, only AREDN traffic is going to the node, unless WAN access is enabled.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

This is the code:

if not is_null(cfg.lan_dhcp_noroute) then
    c:set("dhcp", "@dhcp[0]", "dhcp_option", {
        "121,10.0.0.0/8," .. cfg.lan_ip .. ",172.16.0.0/12," .. cfg.lan_ip,
        "249,10.0.0.0/8," .. cfg.lan_ip .. ",172.16.0.0/12," .. cfg.lan_ip,
        "3"
    })
else
    c:set("dhcp", "@dhcp[0]", "dhcp_option", {
        "121,10.0.0.0/8," .. cfg.lan_ip .. ",172.16.0.0/12," .. cfg.lan_ip .. ",0.0.0.0/0," .. cfg.lan_ip,
        "249,10.0.0.0/8," .. cfg.lan_ip .. ",172.16.0.0/12," .. cfg.lan_ip .. ",0.0.0.0/0," .. cfg.lan_ip
    })
end

The top is WAN enabled, the bottom WAN disabled (let us coast past the three negatives involved in that if statement). Note the default route on the end of the bottom version.
This is somewhat orthogonal to the Option 3 problem, but as I said in my original comment, it's not entirely clear what this code is trying to achieve.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

It might make more sense to stick with using option 3 and block WAN access in the firewall @ae6xe ?

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

I think always sending option 3 will break users that have dual-homed (one interface on the home LAN and one interface on the mesh node) computers. It seems to me at this point that the feature is working as designed but doesn't account for hosts that don't accept DHCP option 121.

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

If I unwrap that triple-negative in the code, it appears to me that:
if WAN is allowed,
the LAN device gets DHCP option 3 and DHCP option 121 with 10/8, 172.16/12, and 0/0.
else
the LAN device gets DHCP option 121 with 10/8, 172.16/12.

This is also what I find empirically on the node in /var/etc/dnsmasq.conf in the two cases.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

I don't follow you - I see the "3" on one side of the if...else and the "0.0.0.0/0" on the other. I don't see how you can get both as you describe. On my node I have this ([correction] wan was disabled, but LAN-to-WAN was not):

config dhcp
        option interface 'lan'
        option start '6'
        option limit '1'
        option leasetime '1h'
        option ignore '1'
        list dhcp_option '121,10.0.0.0/8,10.49.102.5,172.16.0.0/12,10.49.102.5,0.0.0.0/0,10.49.102.5'
        list dhcp_option '249,10.0.0.0/8,10.49.102.5,172.16.0.0/12,10.49.102.5,0.0.0.0/0,10.49.102.5'

A 0.0.0.0/0 route but no option 3.
I didn't check what version you're running (this is on 3.22.6.0)

Regarding dual homes nodes, that why I'm hoping @ae6xe will chime in as I dont know that the intention was here. That said, I'm not sure why this 0.0.0.0/0 route would be less problematic than the Option 3 route from that perspective. Unfortunately I dont know what Windows or Linux will do when present by two default route options. Mac will use the interface order to determine which is prefers. I'd need to look at the dhcp config for dnsmasq to see if we can throw a higher metric on the Option 3 default route to make it less tempting for a client.

@aanon4 aanon4 self-assigned this Aug 9, 2022
@aanon4 aanon4 added the bug Something isn't working label Aug 9, 2022
@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

I've been using 3.22.6.0, and I loaded the latest nightly build today.

On Linux, if there are multiple default routes, it uses the one listed first in the table. The one that is first will of course vary by which interface came up last.

On the DHCP options, if I have "Prevent WAN" checked, I don't get either option 3 or the 0.0.0.0/0 in option 121. If I have "Prevent WAN" unchecked, I get both option 3 and the 0.0.0.0/0 in option 121.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

Could you post the output from "/etc/config/dhcp" for both of these configurations? - thanks

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

The test up there has: ... not is_null ... noroute (three negatives). The not and is_null cancel out. Then flip noroute to a hypothetical "route" and invert the test. So that's then conceptually:

if route then
10 and 172 and default
else
10 and 172

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

I get the following with this configuration:
Screen Shot 2022-08-08 at 8 56 04 PM

config dhcp
        option interface 'lan'
        option start '6'
        option limit '1'
        option leasetime '1h'
        option ignore '1'
        list dhcp_option '121,10.0.0.0/8,10.49.102.5,172.16.0.0/12,10.49.102.5'
        list dhcp_option '249,10.0.0.0/8,10.49.102.5,172.16.0.0/12,10.49.102.5'
        list dhcp_option '3'

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

With "Prevent WAN access" not checked. (WAN is enabled) DHCP sends option 3.

image

With "Prevent WAN access" checked. (WAN is disabled) DHCP does not send option 3.
image

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

list dhcp_option '3' is a null option. Because the option has no args, it doesn't get sent to the client. If that line is not there (or if an arg is added), the option does get sent.

See images 3 and 6 in the original issue text.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

Ah - right - now I understand I'm being thick - thanks - figured it was something like that.

Still .. begs the next question, if we're sending Option 3, why are we bothering to send those other routes at all? Doesn't option 3 cover those situations? All the routes go to the same address after all? Still feel like I'm missing something.

As for always sending option 3 and the multi-home problem, Microsoft have a vendor extension to allow us to bundle a default route metric which we could use to make it "if all else fails" ... but I'm not seeing support for Linux at the moment.

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

The option 121 10/8 and 172.16/12 are always there to get mesh traffic to the mesh.

Then the option 121 0/0 and option 3 are there to get all the remaining traffic to the WAN via the node. This does encompass the other routes, but I presume this way makes the engineering simpler. If these were turned on all the time, and "Prevent WAN" is checked, then all this traffic gets dropped by the node, instead of going out somewhere else.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

The option 121 10/8 and 172.16/12 are always there to get mesh traffic to the mesh.

A default route achieves the same thing here. LAN traffic ends up on the node, and the node send it on to the mesh.

Then the option 121 0/0 and option 3 are there to get all the remaining traffic to the WAN via the node. This does encompass the other routes, but I presume this way makes the engineering simpler.

Not obviously. This is generated by one bit of code in one place and consumed by one system.

If these were turned on all the time, and "Prevent WAN" is checked, then all this traffic gets dropped by the node, instead of going out somewhere else.

This options configures those routes to be handed to the clients, but nothing else. And, as I just checked, it turns out that if you set a default route on your LAN device, point it at the node with "Prevent WAN" set .. the node will happily route you to the internet and back. Which seems like a problem separate from this one.

So .. as it stands today ... all "Prevent WAN" is doing is decided whether to send a default route (via option 3) to your LAN device or not. Nothing else is going on to prevent any traffic from going to the WAN from the LAN regardless of the state of this checkbox.

A "proper fix" as you propose is to allow WAN access or not (via the firewall), and to provide a default route to your LAN device or not (via DHCP options) ... the combination of WAN access but no default route isn't a useful option, even though that's one of the ones we currently have :-( At this point I need to chat with someone better at the UX/UI - the current option is easy to understand but broken, and we need a good way to convey what the new options would be.

@WZ0C
Copy link
Author

WZ0C commented Aug 9, 2022

If this is being reworked, it would be nice if "Prevent WAN access" were flipped to "Allow WAN access". Forcing users to think in the negative is frustrating.

@aanon4
Copy link
Contributor

aanon4 commented Aug 9, 2022

Tagging @ab7pa who's a much better communicator than I am.

@aanon4
Copy link
Contributor

aanon4 commented Aug 17, 2022

Are we able to close this now?

@WZ0C
Copy link
Author

WZ0C commented Aug 17, 2022

The problem I encountered has been resolved in the latest build.

@aanon4 aanon4 closed this as completed Aug 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants