How to Resolve Packet Masquerading Issues When Running netbird up on a Server Acting as DHCP and IGW? #1978

Closed
thorleifjacobsen opened this issue May 14, 2024 · 3 comments

@thorleifjacobsen
Contributor

thorleifjacobsen commented May 14, 2024

Describe the problem

We have a server configured to act as both a DHCP server and an Internet Gateway (IGW) on a specific VLAN. After installing NetBird and running netbird up on this server, the server stops masquerading packets from that VLAN. The images below illustrate the issue.

image

The first image shows the packet behavior when netbird is up and running, while the second image shows the behavior when netbird is down. As indicated by the arrow, the packet is not being correctly masqueraded when netbird is active.

image

Our setup utilizes iptables for masquerading, which functions correctly when netbird is not running.
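
A quick check that can help narrow this down is whether the masquerade rule is still present, and still the first NAT rule to match, while netbird is up. The following is a minimal sketch, not a confirmed diagnosis: masq_rules is a hypothetical helper name, and the rule shown in the comment is an assumed example, since the actual iptables rules are not included in this report.

```shell
# masq_rules: read `iptables -t nat -S` output on stdin and keep only
# the rules that rewrite the source address (MASQUERADE or SNAT).
masq_rules() {
    grep -E -- '-j (MASQUERADE|SNAT)'
}

# Usage (as root), once with netbird up and once with netbird down:
#   iptables -t nat -S POSTROUTING | masq_rules
#
# Assumed example of a line this would print (hypothetical rule):
#   -A POSTROUTING -s 10.16.0.0/18 -o em2 -j MASQUERADE
#
# Packet counters show whether the rule is still being hit:
#   iptables -t nat -L POSTROUTING -v -n --line-numbers
```

If the rule is present in both states but its packet counter stops increasing while netbird is up, the packets are probably being handled before they reach that rule.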

Could you help us understand why netbird interferes with packet masquerading and how we can resolve this issue?

What we have tried so far / additional information

We compared the output of ip route show table all with netbird up and with netbird down. The only differences we found are these:

100.66.0.0/16 dev wt0 proto kernel scope link src 100.66.148.58
local 100.66.148.58 dev wt0 table local proto kernel scope host src 100.66.148.58
broadcast 100.66.255.255 dev wt0 table local proto kernel scope link src 100.66.148.58
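
A comparison like this can be scripted so that the same procedure applies to every command being checked. The sketch below is an illustration under assumptions: snapshot and only_in_first are hypothetical helper names, and the /tmp paths are placeholders.

```shell
# snapshot OUTFILE CMD...: run CMD and save its output to OUTFILE.
snapshot() {
    snap_out=$1
    shift
    "$@" > "$snap_out"
}

# only_in_first A B: print the lines that appear in file A but not in file B.
# Both files are sorted first because comm requires sorted input.
only_in_first() {
    oif_a=$(mktemp)
    oif_b=$(mktemp)
    LC_ALL=C sort "$1" > "$oif_a"
    LC_ALL=C sort "$2" > "$oif_b"
    LC_ALL=C comm -23 "$oif_a" "$oif_b"
    rm -f "$oif_a" "$oif_b"
}

# Usage (as root; file names are placeholders):
#   netbird up   && snapshot /tmp/routes.up   ip route show table all
#   netbird down && snapshot /tmp/routes.down ip route show table all
#   only_in_first /tmp/routes.up /tmp/routes.down
```

The same two helpers work unchanged for ip rule and sysctl -a, since they only compare saved text.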

This is the whole output:

default via 192.168.1.1 dev em2 proto dhcp metric 100
10.16.0.0/18 dev xx_xxxxx_int proto kernel scope link src 10.16.27.10
10.50.0.0/21 dev nrtap proto kernel scope link src 10.50.x.x
10.52.0.0/23 dev yy_yyyyy_int proto kernel scope link src 10.52.0.10
100.66.0.0/16 dev wt0 proto kernel scope link src 100.66.154.148 metric 450
192.168.1.0/24 dev em2 proto kernel scope link src 192.168.1.209 metric 100
broadcast 10.16.0.0 dev xx_xxxxx_int table local proto kernel scope link src 10.16.27.10
local 10.16.27.10 dev xx_xxxxx_int table local proto kernel scope host src 10.16.27.10
broadcast 10.16.63.255 dev xx_xxxxx_int table local proto kernel scope link src 10.16.27.10
broadcast 10.50.0.0 dev nrtap table local proto kernel scope link src 10.50.x.x
local 10.50.x.x dev nrtap table local proto kernel scope host src 10.50.x.x
broadcast 10.50.x.255 dev nrtap table local proto kernel scope link src 10.50.x.x
broadcast 10.52.0.0 dev yy_yyyyy_int table local proto kernel scope link src 10.52.0.10
local 10.52.0.10 dev yy_yyyyy_int table local proto kernel scope host src 10.52.0.10
broadcast 10.52.1.255 dev yy_yyyyy_int table local proto kernel scope link src 10.52.0.10
broadcast 100.66.0.0 dev wt0 table local proto kernel scope link src 100.66.154.148
local 100.66.154.148 dev wt0 table local proto kernel scope host src 100.66.154.148
broadcast 100.66.255.255 dev wt0 table local proto kernel scope link src 100.66.154.148
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
broadcast 192.168.1.0 dev em2 table local proto kernel scope link src 192.168.1.209
local 192.168.1.209 dev em2 table local proto kernel scope host src 192.168.1.209
broadcast 192.168.1.255 dev em2 table local proto kernel scope link src 192.168.1.209
unreachable default dev lo proto kernel metric 4294967295 error -101 pref medium
unreachable ::/96 dev lo metric 1024 error -113 pref medium
unreachable ::ffff:0.0.0.0/96 dev lo metric 1024 error -113 pref medium
unreachable 2002:a00::/24 dev lo metric 1024 error -113 pref medium
unreachable 2002:7f00::/24 dev lo metric 1024 error -113 pref medium
unreachable 2002:a9fe::/32 dev lo metric 1024 error -113 pref medium
unreachable 2002:ac10::/28 dev lo metric 1024 error -113 pref medium
unreachable 2002:c0a8::/32 dev lo metric 1024 error -113 pref medium
unreachable 2002:e000::/19 dev lo metric 1024 error -113 pref medium
unreachable 3ffe:ffff::/32 dev lo metric 1024 error -113 pref medium
unreachable default dev lo proto kernel metric 4294967295 error -101 pref medium
unreachable default dev lo proto kernel metric 4294967295 error -101 pref medium

We also compared the output of ip rule. The only difference is that rules 100 and 110 are present when netbird is up:

0:      from all lookup local
100:    from all lookup main suppress_prefixlength 0
110:    not from all fwmark 0x1bd00 lookup netbird
32766:  from all lookup main
32767:  from all lookup default
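
For context, rule 100 consults the main table but ignores any default route (prefix length 0), and rule 110 sends every packet that does not carry fwmark 0x1bd00 to the netbird table. One way to see whether these rules change the path of forwarded VLAN traffic is to dump every table the rules reference and then ask the kernel for its routing decision. This is a minimal sketch: rule_tables is a hypothetical helper name, and the client address 10.16.27.50 and destination 1.1.1.1 are placeholders, not values from this report.

```shell
# rule_tables: read `ip rule` output on stdin and print each routing table
# named after "lookup", once, in the order the rules are evaluated.
rule_tables() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "lookup") print $(i + 1) }' |
        awk '!seen[$0]++'
}

# Usage (as root): dump every table the policy rules can consult.
#   ip rule | rule_tables | while read -r table; do
#       echo "== table $table"
#       ip route show table "$table"
#   done
#
# Ask the kernel how it would route a packet forwarded from the VLAN
# (source, destination, and interface are placeholders):
#   ip route get 1.1.1.1 from 10.16.27.50 iif xx_xxxxx_int
```

Running the ip route get query with netbird up and again with netbird down shows whether the chosen output interface or gateway differs between the two states.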

We compared the output of sysctl -a and found no noteworthy differences. Lines marked < are from when netbird is up, and lines marked > are from when netbird is down.

image

The result of ip address is this:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 50:9a:4c:87:70:5a brd ff:ff:ff:ff:ff:ff
3: em2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 50:9a:4c:87:70:5b brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.209/24 brd 192.168.1.255 scope global noprefixroute dynamic em2
       valid_lft 25261sec preferred_lft 25261sec
6: xx_xxxxx_int@em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:9a:4c:87:70:5a brd ff:ff:ff:ff:ff:ff
    inet 10.16.27.10/18 brd 10.16.63.255 scope global xx_xxxxx_int
       valid_lft forever preferred_lft forever
7: yy_yyyyy_int@em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:9a:4c:87:70:5a brd ff:ff:ff:ff:ff:ff
    inet 10.52.0.10/23 brd 10.52.1.255 scope global yy_yyyyy_int
       valid_lft forever preferred_lft forever
8: nrtap: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 76:02:64:ca:1e:3a brd ff:ff:ff:ff:ff:ff
    inet 10.50.x.x/21 brd 10.50.x.x scope global nrtap
       valid_lft forever preferred_lft forever
23: wt0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1280 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 100.66.154.148/16 brd 100.66.255.255 scope global wt0
       valid_lft forever preferred_lft forever

To Reproduce

I have not managed to reproduce it other than on a server set up like this one. The server is running CentOS 7.

Expected behavior

I expected netbird not to disable the masquerading we have set up with iptables, which applies to a different network and a completely different VLAN and interface than wt0.

Are you using NetBird Cloud?

No

NetBird version

0.27.5

@thorleifjacobsen
Contributor Author

Might be fixed in #2038

@mlsmaycon
Collaborator

@thorleifjacobsen can you test it with the #2038 build? You can download the binaries here: https://github.com/netbirdio/netbird/actions/runs/9209265609/artifacts/1531262295. Then extract them and replace the currently installed binaries with the built ones.

@thorleifjacobsen
Contributor Author

Fixed in 0.27.9, thanks @mlsmaycon
