Cilium dropping IPIP packets w/ unknown drop reason of 119 #32473
Comments
Yes. How do I do this? I actually have a repro in a dev environment, so I can collect any debug info required.
Oh, never mind. I missed that this is a link (I'm on mobile). Will try to do it today.
I haven't run pwru yet, but I've confirmed that the drop is indeed in the suspected code path. I've added a printk there:
and in the bpf tracelog:
@squeed pwru output:
and sending an IPIP (IPv4-in-IPv4) packet from the dev server to the k8s pod, which is running with kind on the same dev server (Cilium is installed on that cluster with the default config as generated during
pwru.txt
generated packet was:
So with a patch like
and the issue, it seems, is that ct_buffer.ret is an int, but drop_for_direction expects an unsigned value, so the negative (two's complement) return code is misinterpreted when converted to unsigned. I've got
as expected. But the question is: is there any config option for Cilium to pass IPIP (e.g. conntrack should check against the inner packet, not the outer IPIP header)? I thought it was supported.
So I made this work by calculating offsets so that the code looks into the inner IPv4 header and transport ports. But I have no idea what this could possibly break. So I wonder who could give us more info on how IPIP is supposed to be processed on the ingress side, etc.
Changes that made this work (for IPv4 only; this is just to continue the discussion on what to do with IPIP. Maybe there is a config option that does the same, i.e. allows ingress IPIP into a pod running Cilium?):
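To make the offset idea concrete for readers of this thread, here is a minimal Python sketch of "skip the outer header and read the inner transport ports". It is not the patch from the comment above and not Cilium code: `l4_offset` and `ports` are hypothetical helper names, the source port 40000 is a placeholder, and the sketch assumes exactly one level of IPv4-in-IPv4 encapsulation with no fragmentation.

```python
import struct

IPPROTO_IPIP = 4  # IPv4-in-IPv4 protocol number
IPPROTO_TCP = 6


def l4_offset(packet: bytes, l3_off: int = 0) -> tuple[int, int]:
    """Return (transport header offset, transport protocol).

    If the outer IPv4 header carries IPIP, step over it and use the
    inner IPv4 header instead (hypothetical helper, one level only).
    """
    ihl = (packet[l3_off] & 0x0F) * 4      # header length in bytes
    proto = packet[l3_off + 9]             # protocol field
    if proto == IPPROTO_IPIP:
        inner_l3 = l3_off + ihl
        inner_ihl = (packet[inner_l3] & 0x0F) * 4
        return inner_l3 + inner_ihl, packet[inner_l3 + 9]
    return l3_off + ihl, proto


def ports(packet: bytes) -> tuple[int, int]:
    """Read (source port, destination port) at the computed offset."""
    off, _ = l4_offset(packet)
    return struct.unpack_from("!HH", packet, off)


# Skeleton headers: only version/IHL and protocol are filled in.
outer = bytearray(20)
outer[0], outer[9] = 0x45, IPPROTO_IPIP
inner = bytearray(20)
inner[0], inner[9] = 0x45, IPPROTO_TCP
stub = struct.pack("!HH", 40000, 3991)     # placeholder source port
pkt = bytes(outer + inner) + stub

print(l4_offset(pkt), ports(pkt))
```

With the outer header skipped, the ports are read at offset 40 instead of offset 20, which is where a lookup on the inner flow would need to start.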
I think @borkmann already has an implementation for this, but we have a tunnel iface on each node which we pass as --device along with eth0 😉
Hey, Daniel! In our setup each container (aka pod) has a tunl interface in its namespace, so we terminate IPIP there.
OK, so that is currently not supported and needs to be extended for Cilium. I had some old code in https://github.com/cilium/cilium/pull/30547/commits for extracting the inner tuple for service lookup; maybe it can be of help, or a diff like the one above, properly cooked as a patch.
Yeah. I think recalculating the offset as I proposed above seems easier. And in our internal setup it actually works as expected (at least all FW features seem to work as expected on inner packets). OK, I think I will put something together in a few weeks. I just need to run more internal tests etc. to make sure nothing else is required.
Is there an existing issue for this?
What happened?
Cilium is dropping packets with an unknown drop reason.
Expected behavior: not error code 119, but something else (if it is a misconfiguration, etc.).
Cilium Version
Client: 1.15.1 a368c8f 2024-02-14T22:16:57+00:00 go version go1.21.6 linux/amd64
Daemon: 1.15.1 a368c8f 2024-02-14T22:16:57+00:00 go version go1.21.6 linux/amd64
Kernel Version
Linux dfw5a-rg19-9b 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Client Version: v1.28.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
environment where it is happening:
An LB (not controlled by Cilium) is sending IPIP packets to the pod/k8s cluster where we have Cilium installed. Cilium runs with the default configuration. The flow from the logs above (e.g. 10.80.107.38:xxx -> 10.220.23.9:3991) comes from the payload of the IPIP packet (i.e. the inner packet).
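For anyone trying to picture the traffic described here, the following Python sketch builds the byte layout of such an IPIP packet: an outer IPv4 header with protocol 4 wrapping an inner IPv4 packet. It is only an illustration. The outer addresses (198.51.100.x) and the source port 40000 are placeholders, TCP as the inner protocol is an assumption, the L4 part is a 4-byte stub holding just the ports, and `ipv4_header` / `ipip_packet` are hypothetical helper names.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IPv4-in-IPv4 encapsulation
IPPROTO_TCP = 6   # assumed inner protocol


def checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071) over an even-length byte string."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a 20-byte IPv4 header (no options) with a valid checksum."""
    def pack(csum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, 20 + payload_len,   # version/IHL, TOS, total length
            0, 0,                        # identification, flags/fragment
            64, proto, csum,             # TTL, protocol, header checksum
            socket.inet_aton(src), socket.inet_aton(dst),
        )
    return pack(checksum(pack(0)))


def ipip_packet(outer_src, outer_dst, inner_src, inner_dst, l4: bytes) -> bytes:
    """Wrap an inner IPv4 packet in an outer IPv4 header with protocol 4."""
    inner = ipv4_header(inner_src, inner_dst, IPPROTO_TCP, len(l4)) + l4
    return ipv4_header(outer_src, outer_dst, IPPROTO_IPIP, len(inner)) + inner


l4 = struct.pack("!HH", 40000, 3991)  # ports only; source port is a placeholder
pkt = ipip_packet("198.51.100.1", "198.51.100.2",
                  "10.80.107.38", "10.220.23.9", l4)
print(pkt[9], pkt[29], len(pkt))  # outer protocol, inner protocol, length
```

The point of the layout is that the protocol field seen first by the datapath (byte 9) is 4, while the flow reported in the logs lives 20 bytes further in.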
It feels like the drop happens somewhere here: https://github.com/cilium/cilium/blob/v1.15.1/bpf/bpf_lxc.c#L283
https://github.com/cilium/cilium/blob/v1.15.1/bpf/lib/conntrack.h#L884
https://github.com/cilium/cilium/blob/v1.15.1/bpf/lib/conntrack.h#L715
as ct_extract_ports4 does not have a case for IPIP,
and 119 is 256 - DROP_CT_UNKNOWN_PROTO (137), but so far I have failed to find how/where this could be miscalculated.
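The arithmetic behind 119 can be checked with a small Python model. This is a sketch, not Cilium source: `extract_ports_result` is a hypothetical stand-in for a protocol switch without an IPIP case, the set of handled protocols is illustrative, and it assumes the drop code is stored as the negative value -137 and ends up truncated to one unsigned byte somewhere along the reporting path.

```python
# Illustrative model only; not Cilium source code.
DROP_CT_UNKNOWN_PROTO = -137  # assumption: stored as a negative int

IPPROTO_ICMP = 1
IPPROTO_IPIP = 4
IPPROTO_TCP = 6
IPPROTO_UDP = 17


def extract_ports_result(nexthdr: int) -> int:
    """Hypothetical protocol switch with no case for IPIP."""
    if nexthdr in (IPPROTO_ICMP, IPPROTO_TCP, IPPROTO_UDP):
        return 0
    return DROP_CT_UNKNOWN_PROTO


def as_u8(value: int) -> int:
    """Reinterpret the low 8 bits of a signed value as an unsigned byte."""
    return value & 0xFF


ret = extract_ports_result(IPPROTO_IPIP)
print(ret, as_u8(ret), as_u8(-ret))  # -137 119 137
```

Under these assumptions, passing the signed value straight into an unsigned 8-bit field yields 119, while negating it first yields the expected 137.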
Also, in general it is unclear why the logs have a line for the inner flow while ct_lookup is done against the outer IPIP packet (this is a theory; unfortunately, even with debug-verbose datapath there are 0 log lines related to this).
Does Cilium even support passing IPIP from an external load balancer (e.g. IPVS)?