bpf: Fix bpf masquerade issue when host connecting to remote pod #15206
Merged
Invoking the agent as follows on two nodes ...

    # ./daemon/cilium-agent --identity-allocation-mode=crd --enable-ipv6=true \
        --enable-ipv4=true --disable-envoy-version-check=true --tunnel=disabled \
        --k8s-kubeconfig-path=$HOME/.kube/config --kube-proxy-replacement=strict \
        --enable-l7-proxy=false --enable-bpf-masquerade=true \
        --enable-host-legacy-routing=false --auto-direct-node-routes=true \
        --enable-bandwidth-manager=true --native-routing-cidr=10.217.0.0/16

... I ran into the issue that the hostns (192.168.180.29) cannot connect to a remote Pod (10.217.1.175):

    # tcpdump -i enp2s0np0 -n
    [...]
    11:59:01.002065 IP 192.168.180.29.38233 > 10.217.1.175.12865: Flags [S], seq 3173960079, win 64240, options [mss 1460,sackOK,TS val 444671211 ecr 0,nop,wscale 7], length 0
    11:59:01.002113 IP 192.168.180.28.59227 > 192.168.180.29.38233: Flags [S.], seq 2874324629, ack 3173960080, win 65160, options [mss 1460,sackOK,TS val 3030265373 ecr 444671211,nop,wscale 7], length 0
    11:59:01.002242 IP 192.168.180.29.38233 > 192.168.180.28.59227: Flags [R], seq 3173960080, win 0, length 0

What can be seen is that the SYN/ACK reply from the remote Pod gets wrongly masqueraded to the IP address of the Pod's node (192.168.180.28), hence the subsequent RST. Debugging further, what can be seen is that in snat_v4_needed() we do find an ipcache entry (the catchall case), but its sec label is not REMOTE_NODE_ID, so the info->sec_label == REMOTE_NODE_ID check does not match, and therefore we masquerade traffic destined to the remote node.
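The decision described above can be illustrated with a small model. This is only a sketch of the reasoning, not the actual BPF code: `ipcache_lookup` and `needs_masquerade` are hypothetical helpers, the ipcache is reduced to a prefix-to-identity dictionary, and the catchall entry is assumed to carry the reserved world identity (2). The remote-node identity value 6 matches the ipcache dumps in this description.

```python
import ipaddress

# Reserved identity values; WORLD_ID for the catchall entry is an assumption.
WORLD_ID = 2
REMOTE_NODE_ID = 6


def ipcache_lookup(ipcache, addr):
    """Longest-prefix match of addr against a {prefix: identity} dict."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, identity in ipcache.items():
        net = ipaddress.ip_network(prefix)
        if net.version != ip.version or ip not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, identity)
    return None if best is None else best[1]


def needs_masquerade(ipcache, dst, native_cidr):
    """Simplified model of the check: skip SNAT for remote nodes and for
    destinations inside the native routing CIDR, masquerade otherwise."""
    if ipcache_lookup(ipcache, dst) == REMOTE_NODE_ID:
        return False
    if ipaddress.ip_address(dst) in ipaddress.ip_network(native_cidr):
        return False
    return True


# Without remote-node identity: only the catchall covers the node IP.
ipcache_without = {"0.0.0.0/0": WORLD_ID, "10.217.0.85/32": 1}
# With remote-node identity: the node IP has its own /32 entry.
ipcache_with = dict(ipcache_without, **{"192.168.180.29/32": REMOTE_NODE_ID})

for name, table in (("without", ipcache_without), ("with", ipcache_with)):
    print(name, needs_masquerade(table, "192.168.180.29", "10.217.0.0/16"))
```

In this model the reply to 192.168.180.29 is masqueraded only when the node IP falls through to the catchall entry, which is the failure mode seen in the tcpdump trace.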
By default on the daemon side, --enable-remote-node-identity is false, in which case there is also no ipcache entry for the remote node IP:

    # ./cilium/cilium bpf ipcache list | grep 192.168.180.29
    10.217.0.152/32          4       0   192.168.180.29
    10.217.0.208/32          23768   0   192.168.180.29
    10.217.0.50/32           16762   0   192.168.180.29
    10.217.0.69/32           104     0   192.168.180.29
    f00d::a1d:0:0:a1bf/128   104     0   192.168.180.29
    f00d::a1d:0:0:1be3/128   23768   0   192.168.180.29
    f00d::a1d:0:0:ce91/128   104     0   192.168.180.29
    10.217.0.91/32           104     0   192.168.180.29
    10.217.0.219/32          42983   0   192.168.180.29
    f00d::a1d:0:0:3088/128   16762   0   192.168.180.29
    f00d::a1d:0:0:ae10/128   42983   0   192.168.180.29
    f00d::a1d:0:0:dfe1/128   4       0   192.168.180.29
    10.217.0.85/32           1       0   192.168.180.29
    f00d::a1d:0:0:9dc8/128   1       0   192.168.180.29

Rerunning the agent with ...

    # ./daemon/cilium-agent --identity-allocation-mode=crd --enable-ipv6=true \
        --enable-ipv4=true --disable-envoy-version-check=true --tunnel=disabled \
        --k8s-kubeconfig-path=$HOME/.kube/config --kube-proxy-replacement=strict \
        --enable-l7-proxy=false --enable-host-legacy-routing=false \
        --auto-direct-node-routes=true --enable-bandwidth-manager=true \
        --native-routing-cidr=10.217.0.0/16 --enable-bpf-masquerade=true \
        --enable-remote-node-identity=true

... fixes the situation, and connectivity works as expected.
The ipcache then also has an entry for the node IP with the REMOTE_NODE_ID sec label:

    # ./cilium/cilium bpf ipcache list | grep 192.168.180.29
    10.217.0.85/32           6       0   192.168.180.29
    10.217.0.91/32           104     0   192.168.180.29
    10.217.0.50/32           16762   0   192.168.180.29
    10.217.0.219/32          42983   0   192.168.180.29
    10.217.0.152/32          4       0   192.168.180.29
    f00d::a1d:0:0:dfe1/128   4       0   192.168.180.29
    10.217.0.69/32           104     0   192.168.180.29
    f00d::a1d:0:0:3088/128   16762   0   192.168.180.29
    f00d::a1d:0:0:4a54/128   4       0   192.168.180.29
    f00d::a1d:0:0:9dc8/128   6       0   192.168.180.29
    f00d::a1d:0:0:a1bf/128   104     0   192.168.180.29
    f00d::a1d:0:0:ae10/128   42983   0   192.168.180.29
    f00d::a1d:0:0:1be3/128   23768   0   192.168.180.29
    10.217.0.32/32           4       0   192.168.180.29
    10.217.0.208/32          23768   0   192.168.180.29
    192.168.180.29/32        6       0   0.0.0.0          <-----
    f00d::a1d:0:0:ce91/128   104     0   192.168.180.29

Given the code, make --enable-remote-node-identity=true a hard dependency for the --enable-bpf-masquerade=true option. If the latter cannot be enabled, we also need to disable BPF host routing here.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
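The option dependency stated above could be modeled roughly as follows. This is a hedged sketch and not the agent's actual option handling: `AgentOptions` and `resolve_options` are hypothetical names, and the choice to fall back with warnings (rather than abort) is an assumption made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentOptions:
    # Field names mirror the CLI flags; the class itself is hypothetical.
    enable_bpf_masquerade: bool = False
    enable_remote_node_identity: bool = False
    enable_host_legacy_routing: bool = False


def resolve_options(opts):
    """Return (effective options, warnings) after applying the dependency."""
    warnings = []
    if opts.enable_bpf_masquerade and not opts.enable_remote_node_identity:
        warnings.append(
            "BPF masquerade requires --enable-remote-node-identity=true; "
            "disabling BPF masquerade"
        )
        opts = replace(opts, enable_bpf_masquerade=False)
        # BPF masquerade could not be enabled, so BPF host routing is
        # disabled as well by falling back to legacy host routing.
        if not opts.enable_host_legacy_routing:
            warnings.append("falling back to legacy host routing")
            opts = replace(opts, enable_host_legacy_routing=True)
    return opts, warnings


requested = AgentOptions(enable_bpf_masquerade=True)
effective, notes = resolve_options(requested)
print(effective)
for note in notes:
    print("warning:", note)
```

With --enable-remote-node-identity=true set in the request, the sketch returns the options unchanged and emits no warnings.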
test-me-please
brb approved these changes on Mar 4, 2021
Thanks!
Labels
kind/bug
This is a bug in the Cilium logic.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
sig/datapath
Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
See commit msg.