Revert "datapath: Remove 2005 route table"
This reverts commit 2b58e0f.

After the 2005 rtable was removed to fix the L7 issue, kube-proxy NodePort
with L7 netpol started to fail in the CI. On closer inspection, removing the
rtable causes the reply from the Envoy proxy to be delivered to lo instead of
cilium_host:

14:54:33.585708 eth0  In  IP6 fc00:f853:ccd:e793::3.52394 > fc00:f853:ccd:e793::4.30239: Flags [S], seq 504540809, win 64800, options [mss 1440,sackOK,TS val 3651151592 ecr 0,nop,wscale 7], length 0
14:54:33.585852 cilium_host Out IP6 fc00:f853:ccd:e793::4.13607 > fd00:10:244:2::c527.80: Flags [S], seq 504540809, win 64800, options [mss 1440,sackOK,TS val 3651151592 ecr 0,nop,wscale 7], length 0
14:54:33.585856 cilium_net P   IP6 fc00:f853:ccd:e793::4.13607 > fd00:10:244:2::c527.80: Flags [S], seq 504540809, win 64800, options [mss 1440,sackOK,TS val 3651151592 ecr 0,nop,wscale 7], length 0
14:54:33.585916 lo    In  IP6 fd00:10:244:2::c527.80 > fc00:f853:ccd:e793::4.13607: Flags [S.], seq 2619962850, ack 504540810, win 65464, options [mss 65476,sackOK,TS val 1096880080 ecr 3651151592,nop,wscale 7], length 0
14:54:33.585960 cilium_host Out IP6 fc00:f853:ccd:e793::4.13607 > fd00:10:244:2::c527.80: Flags [R], seq 504540810, win 0, length 0

The NodePort request gets SNAT-ed by iptables to the cilium_host IP address.
The trace was taken on the node fc00:f853:ccd:e793::4, which runs the selected
NodePort endpoint.
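
To spot this symptom in a longer capture, it can help to list the interface on which each SYN-ACK shows up. The following is only a sketch: it assumes the capture uses the same line layout as the trace above (timestamp, interface, direction, then the packet), and `synack_ifaces` is a made-up helper name, not something from Cilium.

```shell
#!/bin/sh
# Print "<interface> <direction>" for every SYN-ACK in a capture.
# Assumed line layout (as in the trace above):
#   <time> <iface> <In|Out|P> IP6 <src> > <dst>: Flags [S.], ...
# synack_ifaces is a hypothetical helper name used for illustration only.
synack_ifaces() {
    # "Flags [S.]" marks a SYN-ACK; field 2 is the interface, field 3 the direction.
    # With no file arguments, awk reads the capture from stdin.
    awk '/Flags \[S\.\]/ { print $2, $3 }' "$@"
}
```

Run as `synack_ifaces trace.txt` on the trace above, this prints `lo In`, which is the misdelivery described in the message; per the message, the reply should go to cilium_host instead.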

Signed-off-by: Martynas Pumputis <m@lambda.lt>
brb authored and joestringer committed Jan 26, 2023
1 parent 759f8cd commit 3ed62d5
Showing 1 changed file with 36 additions and 12 deletions.
48 changes: 36 additions & 12 deletions bpf/init.sh
@@ -109,7 +109,8 @@ function move_local_rules()
 
 function setup_proxy_rules()
 {
-	# TODO(brb): remove $PROXY_RT_TABLE -related code in v1.14
+	# Any packet from an ingress proxy uses a separate routing table that routes
+	# the packet back to the cilium host device.
 	from_ingress_rulespec="fwmark 0xA00/0xF00 pref 10 lookup $PROXY_RT_TABLE"
 
 	# Any packet to an ingress or egress proxy uses a separate routing table
@@ -123,16 +124,27 @@ function setup_proxy_rules()
 			if [ -z "$(ip -4 rule list $to_proxy_rulespec)" ]; then
 				ip -4 rule add $to_proxy_rulespec
 			fi
-
-			ip -4 rule delete $from_ingress_rulespec || true
+			if [ "$ENDPOINT_ROUTES" = "true" ]; then
+				if [ ! -z "$(ip -4 rule list $from_ingress_rulespec)" ]; then
+					ip -4 rule delete $from_ingress_rulespec
+				fi
+			else
+				if [ -z "$(ip -4 rule list $from_ingress_rulespec)" ]; then
+					ip -4 rule add $from_ingress_rulespec
+				fi
+			fi
 		fi
 
 		# Traffic to the host proxy is local
 		ip route replace table $TO_PROXY_RT_TABLE local 0.0.0.0/0 dev lo
-
-		# The $PROXY_RT_TABLE is no longer in use, so delete it
-		ip route delete table $PROXY_RT_TABLE $IP4_HOST/32 dev $HOST_DEV1 2>/dev/null || true
-		ip route delete table $PROXY_RT_TABLE default via $IP4_HOST 2>/dev/null || true
+		# Traffic from ingress proxy goes to Cilium address space via the cilium host device
+		if [ "$ENDPOINT_ROUTES" = "true" ]; then
+			ip route delete table $PROXY_RT_TABLE $IP4_HOST/32 dev $HOST_DEV1 2>/dev/null || true
+			ip route delete table $PROXY_RT_TABLE default via $IP4_HOST 2>/dev/null || true
+		else
+			ip route replace table $PROXY_RT_TABLE $IP4_HOST/32 dev $HOST_DEV1
+			ip route replace table $PROXY_RT_TABLE default via $IP4_HOST
+		fi
 	else
 		ip -4 rule del $to_proxy_rulespec 2> /dev/null || true
 		ip -4 rule del $from_ingress_rulespec 2> /dev/null || true
@@ -143,17 +155,29 @@ function setup_proxy_rules()
 			if [ -z "$(ip -6 rule list $to_proxy_rulespec)" ]; then
 				ip -6 rule add $to_proxy_rulespec
 			fi
-
-			ip -6 rule delete $from_ingress_rulespec || true
+			if [ "$ENDPOINT_ROUTES" = "true" ]; then
+				if [ ! -z "$(ip -6 rule list $from_ingress_rulespec)" ]; then
+					ip -6 rule delete $from_ingress_rulespec
+				fi
+			else
+				if [ -z "$(ip -6 rule list $from_ingress_rulespec)" ]; then
+					ip -6 rule add $from_ingress_rulespec
+				fi
+			fi
 		fi
 
 		IP6_LLADDR=$(ip -6 addr show dev $HOST_DEV2 | grep inet6 | head -1 | awk '{print $2}' | awk -F'/' '{print $1}')
 		if [ -n "$IP6_LLADDR" ]; then
 			# Traffic to the host proxy is local
 			ip -6 route replace table $TO_PROXY_RT_TABLE local ::/0 dev lo
-			# The $PROXY_RT_TABLE is no longer in use, so delete it
-			ip -6 route delete table $PROXY_RT_TABLE ${IP6_LLADDR}/128 dev $HOST_DEV1 2>/dev/null || true
-			ip -6 route delete table $PROXY_RT_TABLE default via $IP6_LLADDR dev $HOST_DEV1 2>/dev/null || true
+			# Traffic from ingress proxy goes to Cilium address space via the cilium host device
+			if [ "$ENDPOINT_ROUTES" = "true" ]; then
+				ip -6 route delete table $PROXY_RT_TABLE ${IP6_LLADDR}/128 dev $HOST_DEV1 2>/dev/null || true
+				ip -6 route delete table $PROXY_RT_TABLE default via $IP6_LLADDR dev $HOST_DEV1 2>/dev/null || true
+			else
+				ip -6 route replace table $PROXY_RT_TABLE ${IP6_LLADDR}/128 dev $HOST_DEV1
+				ip -6 route replace table $PROXY_RT_TABLE default via $IP6_LLADDR dev $HOST_DEV1
+			fi
 		fi
 	else
 		ip -6 rule del $to_proxy_rulespec 2> /dev/null || true
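
The restored code repeats the same add-if-missing / delete-if-present check for IPv4 and IPv6. A minimal sketch of how that pattern could be factored out, assuming (as the script itself does) that `ip rule list <rulespec>` prints nothing when no rule matches. `ensure_rule` and its arguments are hypothetical names and are not part of bpf/init.sh.

```shell
#!/bin/sh
# ensure_rule <4|6> <present|absent> <rulespec words...>
# Hypothetical helper, not part of bpf/init.sh: brings one policy routing
# rule into the requested state and does nothing if it is already there.
ensure_rule() {
    family=$1
    state=$2
    shift 2
    # Empty output means no rule matches the given rulespec.
    existing=$(ip "-$family" rule list "$@")
    if [ "$state" = "present" ] && [ -z "$existing" ]; then
        ip "-$family" rule add "$@"
    elif [ "$state" = "absent" ] && [ -n "$existing" ]; then
        ip "-$family" rule delete "$@"
    fi
}
```

With such a helper, the IPv6 branch for `ENDPOINT_ROUTES=true` would read `ensure_rule 6 absent $from_ingress_rulespec`; the variable is left unquoted on purpose so that the rulespec is split into separate words, as in the original script.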