Reply from pod to outside is dropped when L7 ingress policy is used #21954
Unfortunately, removing the 2005 rtable causes legitimate CI failures. After the 2005 rtable was removed to fix the L7 issue, the kube-proxy NodePort test with an L7 netpol started to fail in CI. On closer inspection, removing the rtable causes the reply from the Envoy proxy to be passed to lo instead of cilium_host:
The NodePort request gets SNAT-ed by iptables to the cilium_host IP address. The trace was taken on the fc00:f853:ccd:e793::4 node, which runs the selected NodePort endpoint.
x-posting #23346 (comment)
Still puzzled why this doesn't happen in the v4 path:
The request got SNAT-ed to the |
OK, I believe I have finally figured this out. TL;DR: ip6tables is missing some rules that allow the SYN+ACK reply from the proxy to be reverse-NATed. These are the current ip6tables rules:
This is what we expect:
Long version:
Route table 2005 simply routes skbs with mark … Thanks to this routing, the reply skb goes to cilium_host, then cilium_net, and finally back to the kernel stack, but the mark … Once we delete route table 2005, the reply skb is routed by the following rule in the local table:
According to the kernel source code, route rules with type …
Therefore, the reply skb will go to … This explains why my PR #24208 fails to fix this issue once route table 2005 is deleted.
The IPv4 iptables setup has "NOTRACK for proxy return traffic" rules, which are only applied for … This will be fixed together in #24208.
…-policy when running on < 1.14.0 Cilium cilium/cilium#21954 for the IPv6 path was resolved only for v1.14, but not for v1.13. In order to be able to run the latest connectivity tests on v1.13, we need to skip curl requests to the IPv6 addresses in that particular test. Fixes: cilium#1627 Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
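The version gate described in this commit message can be sketched roughly as follows. This is only an illustration of the idea, not the cilium-cli implementation: the helper names `parse_version` and `skip_ipv6_curl` are hypothetical, and pre-release suffixes such as `-rc.1` are ignored for simplicity.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'v1.13.4' or '1.13.4' into (1, 13, 4); suffixes are ignored."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def skip_ipv6_curl(cilium_version: str) -> bool:
    """Hypothetical gate: skip IPv6 curl requests on Cilium < 1.14.0."""
    # Tuples compare numerically, so 1.9.0 sorts before 1.14.0.
    return parse_version(cilium_version) < (1, 14, 0)
```

Comparing integer tuples rather than raw strings avoids the lexical trap where "1.9.0" would sort after "1.14.0".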
This reverts commit 3ed62d5 partially and only removes ipv4 2005 route table. Fixes: cilium#21954 Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
This commit adds e2e test to cover issue cilium#21954. Test cases for IPv6 are deleted and PR cilium#24882 will take care of them. Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
The test case was introduced to cover issue cilium#21954, but it turned out the test is buggy and caused a number of CI flakes (cilium#25119). Consequently, PR cilium#25236 put the test case under quarantine. This commit removes that problematic test, as the target scenario has been covered by connectivity test in cilium-cli (cilium/cilium-cli#1547). Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
[ upstream commit eb5bf06 ] The test case was introduced to cover issue cilium#21954, but it turned out the test is buggy and caused a number of CI flakes (cilium#25119). Consequently, PR cilium#25236 put the test case under quarantine. This commit removes that problematic test, as the target scenario has been covered by connectivity test in cilium-cli (cilium/cilium-cli#1547). Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
This test case covers cilium#21954. A new policy `echo-ingress-l7-policy-from-anywhere` is added to allow HTTP GET / on echo pods from outside. Use `cilium connectivity test --test north-south-loadbalancing --datapath` to run this test. Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
This reverts commit 9dd6cfc.

The 2005 route table is meant to push packets with mark 0xa00 to cilium_host:

```
$ ip ru
10: from all fwmark 0xa00/0xf00 lookup 2005
$ ip r s t 2005
default via 10.244.1.237 dev cilium_host
10.244.1.237 dev cilium_host scope link
```

The 2005 route table was deleted to fix cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used). We decided to do so because we thought it was no longer used and was causing trouble. However, we recently realized it is still critical for correct encryption when IPsec is enabled and an L7 policy is applied.

Consider a reply packet from the L7 proxy. This packet must carry mark 0xa00 to indicate that it comes from the proxy. With the 2005 route table, this from-proxy packet is routed to cilium_host, where the from_host BPF program processes it for IPsec encryption. Without the 2005 table, the packet has no chance to get encrypted and goes out with a plaintext payload.

This commit brings back the 2005 route table for IPv6. The IPv4 2005 route table and the resurfaced issue cilium#21954 will be handled later in separate patches.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
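To illustrate how the `fwmark 0xa00/0xf00` rule from this commit message steers from-proxy packets, here is a minimal sketch of policy-rule matching. It is a toy model, not kernel or Cilium code: only the priority-10 rule and the `cilium_host` device come from the commit message, while the catch-all `main` rule, the `eth0` device, and the names `Rule` and `egress_dev` are assumptions made for the example. It also assumes every table has a matching route, whereas real policy routing continues to the next rule when a table lookup fails.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    priority: int
    table: str
    fwmark: Optional[int] = None  # None means "match any mark"
    mask: int = 0xFFFFFFFF

    def matches(self, skb_mark: int) -> bool:
        # `fwmark VALUE/MASK` matches when the masked packet mark equals VALUE.
        return self.fwmark is None or (skb_mark & self.mask) == self.fwmark


# Only the 2005 rule is taken from the `ip ru` output in the commit message.
# The "main" catch-all and its device are assumptions for this sketch.
RULES: List[Rule] = [
    Rule(priority=10, table="2005", fwmark=0xA00, mask=0xF00),
    Rule(priority=32766, table="main"),
]
TABLE_DEV = {"2005": "cilium_host", "main": "eth0"}


def egress_dev(skb_mark: int, rules: Optional[List[Rule]] = None) -> str:
    """Return the device chosen by the first matching rule (lowest priority first)."""
    for rule in sorted(RULES if rules is None else rules, key=lambda r: r.priority):
        if rule.matches(skb_mark):
            return TABLE_DEV[rule.table]
    raise LookupError("no matching rule")
```

With the 2005 rule in place, `egress_dev(0xA00)` selects `cilium_host`. Passing a rule list without it, for example `egress_dev(0xA00, rules=RULES[1:])`, falls through to the catch-all, which mirrors the situation where the from-proxy packet never reaches `cilium_host`.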
This commit fixes cilium#21954, as the original patch 9dd6cfc (datapath: remove 2005 route table for ipv6 only) has been reverted due to IPsec + L7 policy issues. This commit simply allows packets to WORLD as long as they are from the proxy. This was one of the solutions suggested by Martynas, as recorded in the commit message of cilium/cilium@c534bb7: One fix was to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`. Fixes: cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used) Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
This is an alternative approach to fix cilium#21954, so that we can re-introduce the 2005 from-proxy routing rule in following patches to fix L7 proxy issues. This commit simply allows packets to WORLD as long as they are from the ingress proxy. This was one of the solutions suggested by Martynas, as recorded in the commit message of cilium/cilium@c534bb7: One fix was to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`. To tell whether an skb originated from the ingress proxy, the commit extends the semantics of the existing flags `TC_INDEX_F_SKIP_{INGRESS,EGRESS}_PROXY` and renames them to clarify the changed meaning. Fixes: cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used) Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
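The behaviour change described in this commit message can be modelled with a small sketch. This is a simplified illustration only, not the code in `bpf_host.c`: the function names and the `"drop"`/`"pass"` verdict strings are hypothetical, and the numeric value of `WORLD_ID` is stated here as an assumption about Cilium's reserved identities.

```python
WORLD_ID = 2  # assumed value of Cilium's reserved "world" identity


def verdict_before(dst_identity: int) -> str:
    """Hypothetical model of the old check: anything towards WORLD is dropped."""
    return "drop" if dst_identity == WORLD_ID else "pass"


def verdict_after(dst_identity: int, from_ingress_proxy: bool) -> str:
    """Hypothetical model of the fix: proxy replies towards WORLD are allowed."""
    if dst_identity == WORLD_ID and not from_ingress_proxy:
        return "drop"
    return "pass"
```

In this model, the SYN-ACK that the ingress proxy sends to an outside client is dropped by `verdict_before` but passes `verdict_after`, while non-proxy traffic towards WORLD is still dropped at this point.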
[ upstream commit eb5bf06 ] The test case was introduced to cover issue #21954, but it turned out the test is buggy and caused a number of CI flakes (#25119). Consequently, PR #25236 put the test case under quarantine. This commit removes that problematic test, as the target scenario has been covered by connectivity test in cilium-cli (cilium/cilium-cli#1547). Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
[ upstream commit ac63856 ] This is an alternative approach to fix cilium#21954, so that we can re-introduce the 2005 from-proxy routing rule in following patches to fix L7 proxy issues. This commit simply allows packets to WORLD as long as they are from the ingress proxy. This was one of the solutions suggested by Martynas, as recorded in the commit message of cilium/cilium@c534bb7: One fix was to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`. To tell whether an skb originated from the ingress proxy, the commit extends the semantics of the existing flags `TC_INDEX_F_SKIP_{INGRESS,EGRESS}_PROXY` and renames them to clarify the changed meaning. Fixes: cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used) Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
This issue is about an L7 ingress policy problem that occurs when a pod is reached from an outside client, either directly or via a NodePort BPF service.
Let's consider the following L7 netpol:
When the netpol is applied, accessing the `echo` pod from outside the cluster fails with:
The drop is triggered by https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626.
What happens is that the L7 proxy sends the SYN-ACK, which gets handled by `bpf_host @ cilium_host` and is then dropped. See the `pwru` output (ifindex=9 is `cilium_host`):
The packet is sent to `cilium_host` because of the mark and the following IP rules / routes:
One fix is to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`.