Unable to enforce policy on running endpoints with debug=false and kernel==4.10 + some stable #1822
Comments
Btw: If you run the cilium-agent in debug mode the issue does not happen.
I tried to reproduce this on a 4.10 kernel with the object files @aanm provided me offline. I used the git-tagged 4.10 kernel and the matching 4.10-tagged iproute2 version to test it. The error I get below is different from the one in the log above, which suggests that the log above may have come from an older iproute2 version that doesn't expand the log further. Anyway, the error I get is due to complexity:
The issue is that 4.10 was declared EOL [1] last May. So it still has the old complexity limit of 68k, whereas 4.9, 4.11, 4.12 and 4.13 have a complexity limit of 98k, and 4.14 has one of 128k. If we want to support 4.10, we need to split [1] http://lkml.iu.edu/hypermail/linux/kernel/1705.2/04097.html
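To make the limits quoted above concrete, here is a minimal Python sketch that looks up the limit for a kernel release and checks whether a given verifier instruction count would fit. The table holds only the rounded figures mentioned in this thread (the exact kernel constants may differ), and the names `COMPLEXITY_LIMIT`, `series` and `fits_limit` are hypothetical helpers, not part of cilium or the kernel.

```python
import re

# Rounded verifier complexity limits per kernel series, as quoted in this
# thread. These are assumptions taken from the discussion, not kernel constants.
COMPLEXITY_LIMIT = {
    "4.9": 98_000,
    "4.10": 68_000,
    "4.11": 98_000,
    "4.12": 98_000,
    "4.13": 98_000,
    "4.14": 128_000,
}


def series(release):
    """Reduce a release string such as '4.10.0-19-generic' to '4.10'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return f"{match.group(1)}.{match.group(2)}"


def fits_limit(release, processed_insns):
    """Return True/False if the limit is known, None for unlisted kernels."""
    limit = COMPLEXITY_LIMIT.get(series(release))
    if limit is None:
        return None
    return processed_insns <= limit


if __name__ == "__main__":
    # 233k is the bpf_lxc figure reported later in this thread.
    for release in ("4.10.0", "4.13.0", "4.14.0"):
        print(release, fits_limit(release, 233_000))
```

A program that fits on one kernel series can therefore still be rejected on another, which is why the same object files behave differently across the versions discussed here.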
What does |
Switching to iproute2 version == kernel version will then result in the same message I got above.
Hey @borkmann, I thought that I had done something wrong, but it looks like the Ubuntu Zesty iproute package is not correct:
And here are the links: Maybe we should have a warning or something on cilium start? Do you want me to report this to Ubuntu? Regards
And it looks like the same version is running on the current Ubuntu stable release.
@eloycoto If you have a chance to report that to Ubuntu upstream, in case there's no ticket open on it yet, that would be great. It doesn't solve the verifier complexity issue (I'm working on a fix for cilium for that), but it definitely seems strange on the Ubuntu side that the two are not aligned. It also means that users will miss all the other new iproute2 features that went in alongside the corresponding kernels. Thanks!
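As a rough illustration of the startup warning suggested above, the following Python sketch compares the major.minor part of a kernel release with that of an iproute2 version. The function names and the example version strings are hypothetical and chosen only for illustration; how the strings would be obtained on a real system is left out on purpose.

```python
import re


def major_minor(version):
    """Extract (major, minor) from strings like '4.10.0-19-generic'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def iproute2_matches_kernel(kernel_release, iproute2_version):
    """True when both versions share the same major.minor series."""
    return major_minor(kernel_release) == major_minor(iproute2_version)


if __name__ == "__main__":
    # Illustrative values only; not taken from this thread.
    kernel = "4.10.0-19-generic"
    iproute2 = "4.9.0"
    if not iproute2_matches_kernel(kernel, iproute2):
        print(f"warning: iproute2 {iproute2} does not match kernel {kernel}")
```

A check like this would only flag the mismatch; it would not address the verifier complexity limit itself.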
The last commit I found that worked in this combination (4.10 kernel + recent clang) was back in May (!): commit fe5555e ("cilium: add proxy6map support"). The first one that breaks due to hitting the complexity limit is commit 6c0c690 ("cilium: ct_state was not being returned in ipv6 ct lookup case"). CI with a kernel/llvm/iproute2 test matrix (#824) would help identify such issues upfront.
The issue also happens on -stable kernels (e.g. 4.9.y and 4.13.y) with an LLVM version other than 3.8.1, which is what currently runs on our Jenkins. LLVM's code generation there seems to differ in a way that makes us hit the upper complexity limit of 98k (I've seen 233k from bpf_lxc). Therefore bumping prio to high again. Adding a tail call hack into the fast-path to reduce complexity.
This is due to upstream backports that are missing in non-LTS kernel versions. It is unclear how we could avoid this behaviour, given that the missing changes are on the kernel side.
Already fixed via #2784, therefore this can be closed. Ubuntu side via https://bugs.launchpad.net/ubuntu/+source/iproute2/+bug/1735032 .
Steps to reproduce:
This only happens when debug mode is off
"clang (4.0.0) and kernel (4.10.0) versions: OK!"
kernels 4.9, 4.12 and 4.13 are not affected
Reported-by @eloycoto