New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: Report dst identity in drop notifications #5052
bpf: Report dst identity in drop notifications #5052
Conversation
test-me-please |
Likely legitimate failure (pod in
Failure from logs:
|
Found that even on master, it's possible to configure a set of Interestingly, in this particular case it seems that the entire policy lookup is compiled out and yet the complexity limit is still exceeded. |
What if we change the following line: And add some kind of assert message that checks after the test? Like deadlock, NACK or something similar? Maybe if we can fail on that, added a warning at least. Regards |
@eloycoto yeah, I think that your suggestion would be helpful to identify when the error is due to bpf complexity issues. In practice, the test always fails, but we don't always know whether this was the reason. If we detected this in the after-tests checks, it would become more obvious. My other approach is to try to maximise the complexity generated using the headers in the tree ( |
Issue at hand seems to require the following datapath diff to reproduce:
|
Next step: Investigate using |
f76f490
to
93322cf
Compare
test-me-please |
Couldn't import a policy in the test Strictly speaking, the limit is only lower for IPv6 so if we taught the limit about IPv4 vs IPv6 then this test would pass. |
93322cf
to
45ac598
Compare
return net.IPv6len*8 + 1 | ||
if ipv6 { | ||
return net.IPv6len*8 + 1 | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if block ends with a return statement, so drop this else and outdent its block
test-me-please |
45ac598
to
3e48d61
Compare
test-me-please |
Hit #5137 |
Due to features merged for the 1.2 feature window, the complexity of the datapath has increased. To prevent issues at runtime when attempting to compile and load / verify the datapath code, reduce the maximum number of prefix lengths that we support. As always, this limitation will only affect users running Cilium on Linux versions earlier than 4.11. Signed-off-by: Joe Stringer <joe@covalent.io>
This significantly reduces the complexity of the IPV6_FROM_LXC tail call when performing load/verification under particular sets of config options, such as with egress policy disabled. Before, Linux 4.15 with policy, monitor aggregation disabled: $ make -j2 -C bpf && sudo ./test/bpf/check-complexity.sh ... Prog section '2/10' (tail_call IPV6_FROM_LXC) loaded (28)! processed 57631 insns, stack depth 320 ... After: $ make -j2 -C bpf && sudo ./test/bpf/check-complexity.sh ... Prog section '2/10' (tail_call IPV6_FROM_LXC) loaded (28)! processed 23365 insns, stack depth 312 ... Signed-off-by: Joe Stringer <joe@covalent.io>
Due to the complexity of the datapaths, IPv6 and IPv4 have different unique prefix length limits from each other. To maximise the number of supported unique prefix lengths, split this logic by L3 protocol and enforce limits using the known good values for each protocol. Signed-off-by: Joe Stringer <joe@covalent.io>
Previously, the destination identity for drop messages was omitted even if it had been looked up during the egress handling of the packet. Retain the destination identity and report it in the drop monitor notifications. Fixes: cilium#5007 Signed-off-by: Joe Stringer <joe@covalent.io>
3e48d61
to
bcca4e4
Compare
test-me-please |
Hit #5065:
|
Hit also #5154, which should be harmless. Re-attempting CI run. |
test-me-please |
Previously, the destination identity for drop messages was omitted even
if it had been looked up during the egress handling of the packet.
Retain the destination identity and report it in the drop monitor
notifications.
To achieve this, the complexity of the programs must be reduced:
relax_verifier()
call into IPv6 conntrack lookup drop caseFixes: #5007
Upgrade Impact
Users running Cilium on Linux versions older than 4.11 with CIDR policies containing many unique prefix lengths may find that upon upgrade, their policy cannot be installed as the limit for the number of unique prefix lengths has been lowered in Cilium 1.2. The limit in this PR is four unique prefix lengths.
This change is