bpf: Fix bpf masquerade issue when host connecting to remote pod #15206

Merged
merged 1 commit into from
Mar 4, 2021

@borkmann borkmann commented Mar 4, 2021

See commit msg.

Invoking the agent as follows on two nodes ...

  # ./daemon/cilium-agent --identity-allocation-mode=crd --enable-ipv6=true    \
      --enable-ipv4=true --disable-envoy-version-check=true --tunnel=disabled  \
      --k8s-kubeconfig-path=$HOME/.kube/config --kube-proxy-replacement=strict \
      --enable-l7-proxy=false --enable-bpf-masquerade=true                     \
      --enable-host-legacy-routing=false --auto-direct-node-routes=true        \
      --enable-bandwidth-manager=true --native-routing-cidr=10.217.0.0/16

... I ran into the issue that the host namespace (192.168.180.29) cannot
connect to a remote Pod (10.217.1.175):

  # tcpdump -i enp2s0np0 -n
  [...]
  11:59:01.002065 IP 192.168.180.29.38233 > 10.217.1.175.12865: Flags [S], seq 3173960079, win 64240, options [mss 1460,sackOK,TS val 444671211 ecr 0,nop,wscale 7], length 0
  11:59:01.002113 IP 192.168.180.28.59227 > 192.168.180.29.38233: Flags [S.], seq 2874324629, ack 3173960080, win 65160, options [mss 1460,sackOK,TS val 3030265373 ecr 444671211,nop,wscale 7], length 0
  11:59:01.002242 IP 192.168.180.29.38233 > 192.168.180.28.59227: Flags [R], seq 3173960080, win 0, length 0

The SYN/ACK reply from the remote Pod gets wrongly masqueraded to the IP address
of the Pod's node (192.168.180.28). The host does not recognize that source and
answers with the subsequent RST.
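
The RST in the capture can be reproduced with a small model. This is only an illustrative sketch, not kernel code: the names `Endpoint` and `client_reaction` are made up here, and the model reduces TCP's socket lookup to a comparison of the reply's source against the peer the client connected to.

```python
# Simplified model of why the client resets the connection: a TCP reply is
# only accepted if its source matches the peer the socket connected to.
from typing import NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int


def client_reaction(connected_to: Endpoint, reply_from: Endpoint) -> str:
    """Return what the client does with a SYN/ACK arriving from reply_from."""
    if reply_from == connected_to:
        return "ACK"  # source matches the pending connection, handshake completes
    return "RST"      # no socket matches this source, so the segment is reset


pod = Endpoint("10.217.1.175", 12865)            # destination of the SYN
masqueraded = Endpoint("192.168.180.28", 59227)  # source seen on the SYN/ACK

print(client_reaction(pod, pod))          # ACK
print(client_reaction(pod, masqueraded))  # RST
```

The addresses and ports are taken from the tcpdump output above; everything else is an assumption made for illustration.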

Debugging further shows that in snat_v4_needed() we do find an ipcache entry
for the destination, but only the catchall one. The check info->sec_label ==
REMOTE_NODE_ID therefore does not match, and we masq traffic to the remote node.
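
The decision described here can be modeled roughly as follows. This is a sketch of the described behavior, not the BPF code in snat_v4_needed(): the helper names are hypothetical, the catchall is assumed to be a 0.0.0.0/0 entry carrying the world identity, and all other conditions the real function evaluates are left out.

```python
import ipaddress

REMOTE_NODE_ID = 6  # Cilium's reserved remote-node identity
WORLD_ID = 2        # reserved world identity, assumed here for the catchall


def lookup(ipcache, dst):
    """Longest-prefix match of dst against the ipcache prefixes."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, identity in ipcache.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, identity)
    return None if best is None else best[1]


def snat_needed(ipcache, dst):
    """Masquerade unless the matching entry marks dst as a remote node."""
    identity = lookup(ipcache, dst)
    if identity is None:
        return False  # simplification: the real code has further checks
    return identity != REMOTE_NODE_ID


# Without remote-node identity only the catchall matches the node IP.
without_entry = {"0.0.0.0/0": WORLD_ID}
# With it, a /32 entry for the node carries REMOTE_NODE_ID.
with_entry = {"0.0.0.0/0": WORLD_ID, "192.168.180.29/32": REMOTE_NODE_ID}

print(snat_needed(without_entry, "192.168.180.29"))  # True: reply is masqueraded
print(snat_needed(with_entry, "192.168.180.29"))     # False: reply keeps the Pod IP
```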

By default, --enable-remote-node-identity is false on the daemon side, in which
case the ipcache has no entry for the remote node's own IP either:

  # ./cilium/cilium bpf ipcache list | grep 192.168.180.29
  10.217.0.152/32                          4 0 192.168.180.29
  10.217.0.208/32                          23768 0 192.168.180.29
  10.217.0.50/32                           16762 0 192.168.180.29
  10.217.0.69/32                           104 0 192.168.180.29
  f00d::a1d:0:0:a1bf/128                   104 0 192.168.180.29
  f00d::a1d:0:0:1be3/128                   23768 0 192.168.180.29
  f00d::a1d:0:0:ce91/128                   104 0 192.168.180.29
  10.217.0.91/32                           104 0 192.168.180.29
  10.217.0.219/32                          42983 0 192.168.180.29
  f00d::a1d:0:0:3088/128                   16762 0 192.168.180.29
  f00d::a1d:0:0:ae10/128                   42983 0 192.168.180.29
  f00d::a1d:0:0:dfe1/128                   4 0 192.168.180.29
  10.217.0.85/32                           1 0 192.168.180.29
  f00d::a1d:0:0:9dc8/128                   1 0 192.168.180.29

Rerunning the agent with ...

  # ./daemon/cilium-agent --identity-allocation-mode=crd --enable-ipv6=true    \
      --enable-ipv4=true --disable-envoy-version-check=true --tunnel=disabled  \
      --k8s-kubeconfig-path=$HOME/.kube/config --kube-proxy-replacement=strict \
      --enable-l7-proxy=false --enable-host-legacy-routing=false               \
      --auto-direct-node-routes=true --enable-bandwidth-manager=true           \
      --native-routing-cidr=10.217.0.0/16 --enable-bpf-masquerade=true         \
      --enable-remote-node-identity=true

... fixes the situation, and connectivity works as expected. The ipcache then
also has an entry for the node IP with the REMOTE_NODE_ID sec label:

  # ./cilium/cilium bpf ipcache list | grep 192.168.180.29
  10.217.0.85/32                           6 0 192.168.180.29
  10.217.0.91/32                           104 0 192.168.180.29
  10.217.0.50/32                           16762 0 192.168.180.29
  10.217.0.219/32                          42983 0 192.168.180.29
  10.217.0.152/32                          4 0 192.168.180.29
  f00d::a1d:0:0:dfe1/128                   4 0 192.168.180.29
  10.217.0.69/32                           104 0 192.168.180.29
  f00d::a1d:0:0:3088/128                   16762 0 192.168.180.29
  f00d::a1d:0:0:4a54/128                   4 0 192.168.180.29
  f00d::a1d:0:0:9dc8/128                   6 0 192.168.180.29
  f00d::a1d:0:0:a1bf/128                   104 0 192.168.180.29
  f00d::a1d:0:0:ae10/128                   42983 0 192.168.180.29
  f00d::a1d:0:0:1be3/128                   23768 0 192.168.180.29
  10.217.0.32/32                           4 0 192.168.180.29
  10.217.0.208/32                          23768 0 192.168.180.29
  192.168.180.29/32                        6 0 0.0.0.0               <-----
  f00d::a1d:0:0:ce91/128                   104 0 192.168.180.29
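
A quick way to check a listing like the two above for the node entry is sketched below. The column layout (prefix, identity, encrypt key, tunnel endpoint) is assumed from the listings shown here, and `has_remote_node_entry` is a hypothetical helper, not part of the cilium CLI.

```python
REMOTE_NODE_ID = 6  # Cilium's reserved remote-node identity


def has_remote_node_entry(listing, node_ip):
    """Return True if the listing has a /32 entry for node_ip with identity 6."""
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and anything whose second column is not an identity.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        prefix, identity = fields[0], int(fields[1])
        if prefix == f"{node_ip}/32" and identity == REMOTE_NODE_ID:
            return True
    return False


# Excerpts from the two listings above.
before = """\
10.217.0.85/32                           1 0 192.168.180.29
10.217.0.91/32                           104 0 192.168.180.29
"""
after = """\
10.217.0.85/32                           6 0 192.168.180.29
192.168.180.29/32                        6 0 0.0.0.0               <-----
"""

print(has_remote_node_entry(before, "192.168.180.29"))  # False
print(has_remote_node_entry(after, "192.168.180.29"))   # True
```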

Given how the code works, make --enable-remote-node-identity=true a hard
dependency of the --enable-bpf-masquerade=true option. If the latter cannot be
enabled, we also need to disable BPF host routing here.
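
The dependency chain can be sketched as follows. This is a model only: the actual change lives in the agent's option handling, the `Options` type and `resolve` function are hypothetical, and the text does not say whether the agent aborts or falls back when the dependency is unmet. The sketch assumes a fallback.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Options:
    enable_bpf_masquerade: bool
    enable_remote_node_identity: bool
    enable_host_legacy_routing: bool


def resolve(opts):
    """Apply the option dependencies and return (options, notes)."""
    notes = []
    # BPF masquerade needs remote-node identities in the ipcache.
    if opts.enable_bpf_masquerade and not opts.enable_remote_node_identity:
        opts = replace(opts, enable_bpf_masquerade=False)
        notes.append("BPF masquerade requires --enable-remote-node-identity=true")
    # Without BPF masquerade, BPF host routing has to be turned off as well,
    # which here means falling back to legacy host routing.
    if not opts.enable_bpf_masquerade and not opts.enable_host_legacy_routing:
        opts = replace(opts, enable_host_legacy_routing=True)
        notes.append("BPF host routing disabled, BPF masquerade is not enabled")
    return opts, notes


failing = Options(enable_bpf_masquerade=True,
                  enable_remote_node_identity=False,
                  enable_host_legacy_routing=False)
resolved, notes = resolve(failing)
print(resolved)
for note in notes:
    print(note)
```

With `enable_remote_node_identity=True` the same input passes through unchanged and no notes are produced.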

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann added kind/bug This is a bug in the Cilium logic. pending-review sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Mar 4, 2021
@borkmann borkmann requested review from brb and a team March 4, 2021 14:31
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Mar 4, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.5 Mar 4, 2021
borkmann commented Mar 4, 2021

test-me-please


@brb brb left a comment

Thanks!

@borkmann borkmann merged commit 66dc917 into master Mar 4, 2021
@borkmann borkmann deleted the pr/fix-bpf-masq branch March 4, 2021 20:38
1.10.0 automation moved this from In progress to Done Mar 4, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.5 Mar 8, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.5 Mar 8, 2021
Labels
kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.9.5
Backport done to v1.9

4 participants