ip6tables rules break Neighbor Solicitation #45460

Closed
lordgurke opened this issue May 3, 2023 · 2 comments
Labels
area/networking/firewalling area/networking/ipv6 Issues related to ipv6 area/networking kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/confirmed

Comments

@lordgurke

Description

With ip6tables enabled, Docker creates various ip6tables rules that are meant to isolate networks from each other.
The way this is done breaks Neighbor Solicitation: traffic to or from addresses outside the configured subnet is dropped, which also filters traffic to/from fe80::/7 (as written, this prefix spans both link-local fe80::/10 and multicast ff00::/8 addresses).
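To make the description above concrete, here is a minimal sketch (not Docker code) that checks which addresses fall outside the configured subnet and would therefore trip a `! -s` / `! -d` match. The helper name `outside_subnet` is made up for illustration; the addresses are the ones that appear elsewhere in this issue.

```python
import ipaddress

# Subnet used in the reproduction steps of this issue.
SUBNET = ipaddress.ip_network("fdf1:a844:380c:b247::/64")


def outside_subnet(addr: str) -> bool:
    """Return True when addr would satisfy a '! -s SUBNET' or '! -d SUBNET' match."""
    return ipaddress.ip_address(addr) not in SUBNET


# Addresses that show up in this issue (hypothetical labels, real values).
candidates = {
    "container address": "fdf1:a844:380c:b247::3",
    "link-local address": "fe80::42:acff:fe12:2",
    "solicited-node multicast": "ff02::1:ff00:2",
    "unspecified address": "::",
}

for label, addr in candidates.items():
    print(f"{label:26} {addr:24} outside subnet: {outside_subnet(addr)}")
```

Only the container address is inside the subnet; the other three are exactly the kinds of addresses Neighbor Discovery depends on.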

Reproduce

  • Configure the Docker daemon with ip6tables: true and experimental: true
  • Add a network with an IPv6 subnet: docker network create --subnet 'fdf1:a844:380c:b247::/64' --ipv6 --internal internal1
  • Create at least two containers and assign them to the new network
  • Try to ping each container on its IPv6 address

This will fail because neighbor solicitation does not work, as Docker creates rules like these:

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination
  796 53728 DROP       0    --  *      br-73d147f51290 !fdf1:a844:380c:b247::/64  ::/0
 5388  388K DROP       0    --  br-73d147f51290 *       ::/0                !fdf1:a844:380c:b247::/64
14601 9931K RETURN     0    --  *      *       ::/0                 ::/0 

This drops traffic from/to fe80::/7 which is needed for neighbor solicitation.
As soon as you add a rule to explicitly allow ICMPv6 type 135 to pass, neighbor resolution will start to work:

ip6tables -I DOCKER-ISOLATION-STAGE-1 -i br-73d147f51290 -d fe80::/7 -p ipv6-icmp --icmpv6-type 135 -m hl --hl-eq 255 -j ACCEPT

(There might be a better-fitting rule, but this one worked for me and should be safe enough, as it only accepts NDP packets with hop limit 255, i.e. only locally generated packets.)

Now you should be able to ping each container.
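As a rough illustration of why inserting that rule at the top of the chain helps, the following sketch models the chain as an ordered list of (predicate, target) pairs with first-match semantics. It is a simplified model, not ip6tables itself: the `Packet` class, `verdict` function and rule names are invented here, the sample packet is modelled on the Neighbor Solicitation traces in this thread, and the assumption that `fe80::/7` is treated as `fe00::/7` (host bits masked) is mine.

```python
import ipaddress
from dataclasses import dataclass

BRIDGE = "br-73d147f51290"
SUBNET = ipaddress.ip_network("fdf1:a844:380c:b247::/64")
# Assumption: host bits are masked, so fe80::/7 behaves like fe00::/7.
WORKAROUND_DST = ipaddress.ip_network("fe80::/7", strict=False)


@dataclass
class Packet:
    in_iface: str
    out_iface: str
    src: str
    dst: str
    icmpv6_type: int
    hop_limit: int


def in_net(addr: str, net: ipaddress.IPv6Network) -> bool:
    return ipaddress.ip_address(addr) in net


# The three rules Docker created, in listing order.
original_chain = [
    (lambda p: p.out_iface == BRIDGE and not in_net(p.src, SUBNET), "DROP"),
    (lambda p: p.in_iface == BRIDGE and not in_net(p.dst, SUBNET), "DROP"),
    (lambda p: True, "RETURN"),
]

# The workaround rule: -i BRIDGE -d fe80::/7 --icmpv6-type 135 --hl-eq 255 -j ACCEPT
accept_ns = (
    lambda p: p.in_iface == BRIDGE
    and in_net(p.dst, WORKAROUND_DST)
    and p.icmpv6_type == 135
    and p.hop_limit == 255,
    "ACCEPT",
)
patched_chain = [accept_ns] + original_chain  # -I inserts at the top


def verdict(chain, pkt: Packet) -> str:
    """Return the target of the first matching rule."""
    for matches, target in chain:
        if matches(pkt):
            return target
    return "RETURN"


# Neighbor Solicitation to a solicited-node multicast address.
ns = Packet(BRIDGE, "", "fdf1:a844:380c:b247::3", "ff02::1:ff00:2", 135, 255)
print("original chain:", verdict(original_chain, ns))
print("with workaround:", verdict(patched_chain, ns))
```

In this model the solicitation hits the second DROP rule in the original chain and the ACCEPT rule once the workaround is inserted first.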

Expected behavior

Docker should add a rule to explicitly let IPv6 NDP pass (ICMPv6 type 135).
It needs to be inserted before the DROP rules that drop traffic with "wrong" IP addresses.

docker version

Client:
 Version:           23.0.4
 API version:       1.42
 Go version:        go1.20.3
 Git commit:        f480fb1e37
 Built:             Fri Apr 21 22:05:37 2023
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          23.0.4
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.20.3
  Git commit:       cbce331930
  Built:            Fri Apr 21 22:05:37 2023
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          v1.7.0
  GitCommit:        1fbd70374134b891f97ce19c70b6e50c7b9f4e0d.m
 runc:
  Version:          1.1.7
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  compose: Docker Compose (Docker Inc.)
    Version:  2.17.3
    Path:     /usr/lib/docker/cli-plugins/docker-compose

Server:
 Containers: 12
  Running: 7
  Paused: 0
  Stopped: 5
 Images: 11
 Server Version: 23.0.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1fbd70374134b891f97ce19c70b6e50c7b9f4e0d.m
 runc version: 
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.13-arch1-1
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 6
 Total Memory: 2.907GiB
 Name: docker1
 ID: 490a25a0-0035-44da-99f0-e67e88f2eb20
 Docker Root Dir: /docker/stor
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

No response

@lordgurke lordgurke added kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage labels May 3, 2023
@akerouanton
Member

Hi @lordgurke, thanks for your report. I analyzed a bit more what's going on here, and I'm drafting a PR.

Although resolving <container-hostname> returns the IPv6 address allocated by Docker, both rules in DOCKER-ISOLATION-STAGE-1 also disallow the use of link-local addresses and of the unspecified address (which may be used as the source of Neighbor Solicitations, as specified by RFC 4861).
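For readers wondering where the `ff02::1:ff…` destinations in the traces below come from: a Neighbor Solicitation is normally sent to the solicited-node multicast address of the target, which is `ff02::1:ff00:0/104` combined with the low 24 bits of the target address (RFC 4291). A small sketch, with a helper name of my own choosing, and assuming test1 was given `fdf1:a844:380c:b247::2`:

```python
import ipaddress


def solicited_node_multicast(unicast: str) -> ipaddress.IPv6Address:
    """Build ff02::1:ffXX:XXXX from the low 24 bits of a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


# Assumed address of test1 on the network, and the link-local address pinged above.
for target in ("fdf1:a844:380c:b247::2", "fe80::42:acff:fe12:2"):
    print(target, "->", solicited_node_multicast(target))
```

Neither resulting multicast address lies inside `fdf1:a844:380c:b247::/64`, so a `! -d <subnet>` rule matches it.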

In term1:

$ docker network create  --subnet 'fdf1:a844:380c:b247::/64' --ipv6 --internal testnet
$ docker run --rm -d --network testnet --name test1 nicolaka/netshoot /bin/sleep infinity
# Both pings fail
$ docker run --rm -t --network testnet --name test2 nicolaka/netshoot ping -c1 -6 test1
$ docker run --rm -t --network testnet --name test2 nicolaka/netshoot ping -c1 -6 fe80::42:acff:fe12:2%eth0

In term2:

# First ping command
$ sudo ./bin/iptables-tracer -family ipv6 -iface=br-21502e5b2c6c -filter='icmp6 and (ip6[40] == 135 || ip6[40] == 136)' -filter-chain=filter/DOCKER-ISOLATION-STAGE-1           
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff00:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=aa2b 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=:: DST=ff02::1:ff12:3 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=2d89 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#1): ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff00:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=aa2b 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff00:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=aa2b 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP

# Second ping command
$ sudo ./bin/iptables-tracer -family ipv6 -iface=br-21502e5b2c6c -filter='icmp6 and (ip6[40] == 135 || ip6[40] == 136)' -filter-chain=filter/DOCKER-ISOLATION-STAGE-1
INFO[0000] Waiting for trace events...                  
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff12:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=90ce 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=:: DST=ff02::1:ff12:3 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=ba26 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#1): ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff12:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=90ce 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP
IN=br-21502e5b2c6c OUT= SRC=fdf1:a844:380c:b247::3 DST=ff02::1:ff12:2 LEN=32 HOP=255 PROTO=ICMPv6 TYPE/CODE=NeighborSolicitation CSUM=90ce 
	filter DOCKER-ISOLATION-STAGE-1 NFMARK=0x0 
		MATCH RULE (#2): ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
		=> DROP
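A note on the capture filter used above, `ip6[40] == 135 || ip6[40] == 136`: the fixed IPv6 header is 40 bytes long, so when no extension headers are present, byte 40 is the ICMPv6 type (135 is Neighbor Solicitation, 136 is Neighbor Advertisement). The sketch below builds a bare-bones packet to show the offset; `build_packet` and `matches_filter` are illustrative helpers, the checksum is left at zero, and the NS body is omitted.

```python
import ipaddress
import struct

ICMPV6 = 58  # IPv6 next-header value for ICMPv6


def build_packet(src: str, dst: str, icmp_type: int, hop_limit: int = 255) -> bytes:
    """Build a fixed IPv6 header followed by a 4-byte ICMPv6 stub (no extension headers)."""
    body = struct.pack("!BBH", icmp_type, 0, 0)  # type, code, checksum (zero here)
    header = struct.pack("!IHBB", 6 << 28, len(body), ICMPV6, hop_limit)
    header += ipaddress.IPv6Address(src).packed
    header += ipaddress.IPv6Address(dst).packed
    return header + body


def matches_filter(packet: bytes) -> bool:
    """Approximate 'icmp6 and (ip6[40] == 135 || ip6[40] == 136)'."""
    is_icmp6 = packet[6] == ICMPV6  # byte 6 is the next-header field
    return is_icmp6 and packet[40] in (135, 136)


ns = build_packet("fdf1:a844:380c:b247::3", "ff02::1:ff00:2", 135)
echo = build_packet("fdf1:a844:380c:b247::3", "fdf1:a844:380c:b247::2", 128)
print("NS matches:", matches_filter(ns))
print("Echo Request matches:", matches_filter(echo))
```

If extension headers were present, the ICMPv6 type would no longer sit at offset 40 and this fixed-offset filter would miss those packets.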

akerouanton added a commit to akerouanton/docker that referenced this issue May 30, 2023
IPv6 ipt rules are exactly the same as IPv4 rules, although the two
protocols don't use the same networking model. This has bad consequences,
for instance: 1. the current v6 rules disallow Neighbor
Solicitation/Advertisement; 2. and more generally, any datagram using
link-local addresses.

To solve this, this commit changes the following rules:

```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
```

into:

```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 ! -i br-21502e5b2c6c   -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64   -i br-21502e5b2c6c ! -o br-21502e5b2c6c -j DROP
```

These rules only limit the traffic ingressing/egressing the bridge, but
not traffic between veth interfaces on the same bridge.

Note that the kernel takes care of dropping invalid IPv6 packets, e.g.
loopback spoofing, thus these rules don't need to be more specific.

Solve moby#45460.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
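To illustrate the rule change described in the commit message above, here is a simplified model comparing the old and new DROP rules. It is not the actual patch: the `Packet` class and `verdict` helper are invented for this sketch, and it assumes that traffic between two veth interfaces on the same bridge is seen with both the input and the output interface set to the bridge. `eth0` and `2001:db8::1` are placeholder values.

```python
import ipaddress
from dataclasses import dataclass

BRIDGE = "br-21502e5b2c6c"
SUBNET = ipaddress.ip_network("fdf1:a844:380c:b247::/64")


@dataclass
class Packet:
    in_iface: str
    out_iface: str
    src: str
    dst: str


def in_subnet(addr: str) -> bool:
    return ipaddress.ip_address(addr) in SUBNET


# Old rules: match on one interface only.
old_rules = [
    lambda p: not in_subnet(p.src) and p.out_iface == BRIDGE,
    lambda p: not in_subnet(p.dst) and p.in_iface == BRIDGE,
]

# New rules: only match traffic that actually crosses the bridge boundary.
new_rules = [
    lambda p: not in_subnet(p.src) and p.in_iface != BRIDGE and p.out_iface == BRIDGE,
    lambda p: not in_subnet(p.dst) and p.in_iface == BRIDGE and p.out_iface != BRIDGE,
]


def verdict(rules, pkt: Packet) -> str:
    """DROP if any rule matches, otherwise fall through to RETURN."""
    return "DROP" if any(rule(pkt) for rule in rules) else "RETURN"


# Neighbor Solicitation between two containers on the same bridge (assumed in == out).
ns = Packet(BRIDGE, BRIDGE, "fdf1:a844:380c:b247::3", "ff02::1:ff00:2")
# Traffic leaving the internal network towards a placeholder external address.
leaving = Packet(BRIDGE, "eth0", "fdf1:a844:380c:b247::3", "2001:db8::1")

for name, pkt in (("NS on bridge", ns), ("leaving bridge", leaving)):
    print(f"{name:15} old={verdict(old_rules, pkt)} new={verdict(new_rules, pkt)}")
```

Under these assumptions the on-bridge solicitation is dropped by the old rules and passes the new ones, while traffic leaving the bridge is dropped in both cases.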
akerouanton added a commit to akerouanton/docker that referenced this issue May 30, 2023
IPv6 ipt rules are exactly the same as IPv4 rules, although the two
protocols don't use the same networking model. This has bad consequences,
for instance: 1. the current v6 rules disallow Neighbor
Solicitation/Advertisement; 2. multicast addresses can't be used; 3.
link-local addresses are blocked too.

To solve this, this commit changes the following rules:

```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64 -i br-21502e5b2c6c -j DROP
```

into:

```
-A DOCKER-ISOLATION-STAGE-1 ! -s fdf1:a844:380c:b247::/64 ! -i br-21502e5b2c6c   -o br-21502e5b2c6c -j DROP
-A DOCKER-ISOLATION-STAGE-1 ! -d fdf1:a844:380c:b247::/64   -i br-21502e5b2c6c ! -o br-21502e5b2c6c -j DROP
```

These rules only limit the traffic ingressing/egressing the bridge, but
not traffic between veth interfaces on the same bridge.

Note that the kernel takes care of dropping invalid IPv6 packets, e.g.
loopback spoofing, thus these rules don't need to be more specific.

Solve moby#45460.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
akerouanton added a commit to akerouanton/docker that referenced this issue Jul 27, 2023
@akerouanton
Member

Resolved by #45649.

thaJeztah pushed a commit to thaJeztah/docker that referenced this issue Aug 13, 2023
(cherry picked from commit da9e44a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
4 participants
@akerouanton @lordgurke @sam-thibault and others