Cilium dropping IPIP packets w/ unknown drop reason of 119 #32473
Comments
Yes. How do I do this? I actually have a repro in a dev environment, so I can collect any debug info required |
Oh nvm. Missed that this is a link (on mobile). Will try to do today |
haven't run pwru yet; but i've confirmed that the drop is indeed in i've added printk there:
and in bpf tracelog
|
@squeed pwru output:
|
and sending an ipip4 packet from the dev server to the k8s pod which is running w/ kind on the same devserver (and cilium is installed on that cluster; with default config as generated during |
pwru.txt
|
generated packet was:
|
so w/ patch like
and the issue seems to be that ct_buffer.ret is an int, but drop_for_direction expects an unsigned value, so the negative two's complement value gets misinterpreted as unsigned. i've got
as expected. but the question is - is there any config option for cilium to pass ipip (e.g. conntrack should check against the inner packet, not the outer ipip)? i thought it is supported |
So I made this work by calculating offsets so that it looks into the inner ipv4 header and transport ports. But i have no idea what this could possibly break. So I wonder who could give us more info on how ipip is supposed to be processed on the ingress side etc |
changes which made this work (for ipv4; this is just to continue the discussion on what to do w/ ipip. mb there is a config option which allows doing the same, i.e. allowing ingress ipip into a pod which is running cilium?)
|
I think @borkmann already has an implementation for this, but we have a tunnel iface on each node which we pass as --device along with eth0 😉 |
Hey, Daniel! In our setup each container (aka pod) has tunl interface in its namespace. So we terminate ipip there |
Ok, so that is currently not supported and needs to be extended for Cilium. I had some old code in https://github.com/cilium/cilium/pull/30547/commits for extracting inner tuple for service lookup, maybe it can be of help, or a diff properly cooked as patch as above. |
Yeah. I think recalculating the offset as I proposed above seems easier. And in our internal setup it actually works as expected (at least all FW features seem to work as expected on inner packets). Ok, I think I will put something together in a few weeks. Just need to run more internal tests etc to make sure nothing else is required |
I think I am facing same or related issue: As part of organization policy, we use https://github.com/facebookincubator/katran as part of edge fabric.
So, when I use nodeIP as Real (upstream in terms of Katran) and 10.15.12.33 as VIP, traffic is correctly routed to the ingress pod. Using hostNetwork is undesirable for scaling reasons, so my goal is to make it work with Cilium LB. For this setup I've dropped the VIP IP from lo and created the following Service
Both LB IPs are announced by BGP policy, but VIP (10.15.12.33) got denied by router, thus not conflicting with Katran's advertisements. Now when I change Katran's real to
There are no drops in |
@borkmann, may I ask you why #30547 did not get merged? As far as I understand, 'external Cilium in L4LB mode' works exactly as Katran does in our case. IPIP termination in the worker node NS is also exactly the same use case we are planning to use as part of our flow. So, the complete flow that we are expecting is:
Thanks. |
DROP_* reasons are negative values. Reported-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Relates: cilium#32473 Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
fyi, I extracted this part into #33551. Thanks for tracing it down @tehnerd ! |
Thanks. We are mostly done with testing and I'm planning to upstream our internal |
DROP_* reasons are negative values. Reported-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Relates: #32473 Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
[ oss commit 83fa353 ] [ backporter's notes: minor conflict due to different function signature ] DROP_* reasons are negative values. Reported-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Relates: #32473 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit 83fa353 ] [ backporter's notes: minor conflict due to different function signature ] DROP_* reasons are negative values. Reported-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Relates: #32473 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io>
@julianwiedmann @tehnerd Is this issue resolved, should we close it? |
119 is resolved. Ipip is still not supported upstream. But we have some internal patches and are testing them with v4, and things look promising right now. Hopefully I will upstream them at some point. We can close the issue |
What happened?
Cilium is dropping packets w/ unknown drop reason.
expected behavior: not getting error code 119, but a known drop reason instead (if it is a misconfiguration etc).
Cilium Version
Client: 1.15.1 a368c8f 2024-02-14T22:16:57+00:00 go version go1.21.6 linux/amd64
Daemon: 1.15.1 a368c8f 2024-02-14T22:16:57+00:00 go version go1.21.6 linux/amd64
Kernel Version
Linux dfw5a-rg19-9b 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Client Version: v1.28.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
environment where it is happening:
LB (not controlled by cilium) is sending ipip packets to the pod/k8s cluster where we have cilium installed. cilium is w/ default configuration. flow from logs above (e.g. 10.80.107.38:xxx -> 10.220.23.9:3991) is from the payload of ipip (aka inner packets etc)
it feels like drop happens here somewhere: https://github.com/cilium/cilium/blob/v1.15.1/bpf/bpf_lxc.c#L283
https://github.com/cilium/cilium/blob/v1.15.1/bpf/lib/conntrack.h#L884
https://github.com/cilium/cilium/blob/v1.15.1/bpf/lib/conntrack.h#L715
as ct_extract_ports4 does not have a case for ipip
and 119 is 256 - DROP_CT_UNKNOWN_PROTO (137), but so far i have failed to find how/where this could be miscalculated.
also in general it is unclear why in logs we have a line for the inner flow but ct_lookup is being done (theory; unfortunately even w/ debug-verbose datapath there are 0 log lines related to this) against the ipip packet.
Does cilium even support passing IPIP from an external load balancer (e.g. ipvs)?