Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: add metrics for fragmented ipv4 packets #13347

Merged
merged 7 commits into from
Nov 16, 2020
Merged

bpf: add metrics for fragmented ipv4 packets #13347

merged 7 commits into from
Nov 16, 2020

Conversation

jibi
Copy link
Member

@jibi jibi commented Sep 30, 2020

This commit introduces 2 new metrics in the datapath logic related to
fragmented IPv4 packets:

  • REASON_FRAG_PACKET: number of received fragmented packets
  • REASON_FRAG_PACKET_UPDATE: number of failures in updating the
    IPV4_FRAG_DATAGRAMS_MAP map to register the first logical fragment of
    a datagram

Regarding testing: I did a smoke test by running make build_all and I'm now waiting for e2e tests to finish

@jibi jibi requested a review from a team September 30, 2020 14:45
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 30, 2020
@jibi jibi marked this pull request as draft September 30, 2020 14:48
@maintainer-s-little-helper
Copy link

Commit c4f46f8f434938630d2fc4868c61c1e6da1a84d7 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. and removed dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. labels Sep 30, 2020
@jibi jibi marked this pull request as ready for review September 30, 2020 15:53
@jibi
Copy link
Member Author

jibi commented Sep 30, 2020

Including "metrics.h" in lib/ipv4.h broke the tests:

➜  cilium git:(pr/jibi/data-path-add-metrics-for-frag-packets) make -C test/bpf
make: Entering directory '/home/gilberto/go/src/github.com/cilium/cilium/test/bpf'
clang -Wall -Wextra -Werror -Wshadow -Wno-unused-parameter -Wno-address-of-packed-member -Wno-unknown-warning-option -Wno-gnu-variable-sized-type-not-at-end -Wdeclaration-after-statement -I../../bpf/ -I../../bpf/include -I. -D__NR_CPUS__=8 -O2 -I../../bpf/ unit-test.c -o unit-test
In file included from unit-test.c:28:
../../bpf/tests/ipv6_test.h:60:30: error: unexpected type name '__be32': expected identifier
LPM_LOOKUP_FN(lpm4_lookup32, __be32, PREFIX32, dummy_map, match_dummy_prefix)
                             ^
../../bpf/tests/ipv6_test.h:60:38: error: expected identifier
LPM_LOOKUP_FN(lpm4_lookup32, __be32, PREFIX32, dummy_map, match_dummy_prefix)
                                     ^
../../bpf/tests/ipv6_test.h:55:18: note: expanded from macro 'PREFIX32'
#define PREFIX32 32,
                 ^
../../bpf/tests/ipv6_test.h:61:1: error: expected function body after function declarator
LPM_LOOKUP_FN(lpm4_lookup31, __be32, PREFIX31, dummy_map, match_dummy_prefix)
^

(looking into it)

@qmonnet qmonnet self-requested a review September 30, 2020 16:43
@qmonnet qmonnet added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Sep 30, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 30, 2020
@qmonnet
Copy link
Member

qmonnet commented Sep 30, 2020

I suspect that injecting "lib/metrics.h" (which in turns includes "lib/maps.h") in the inclusion chain somehow prevents LPM_LOOKUP_FN from being defined, because SKIP_UNDEF_LPM_LOOKUP_FN is undefined at that stage. Previously, the header was likely only included from bpf/tests/ipv6_test.h, where the macro is defined.

Includes: "tests/conntrack_test.h" -> "lib/conntrack.h" -> "ipv4.h" -> "metrics.h" ->"maps.h".

Some solutions could be to reorder the includes in test/bpf/unit-test.c to have "tests/ipv6_tests.h" first, or moving the #define SKIP_UNDEF_LPM_LOOKUP_FN to that file, above the includes.

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, with a few nitpicks and pending the unit test build is fixed :)

bpf/lib/ipv4.h Outdated Show resolved Hide resolved
bpf/lib/ipv4.h Outdated Show resolved Hide resolved
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and you probably want to add a description in pkg/monitor/api/drop.go for the new metrics.

@pchaigno pchaigno added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/metrics Impacts statistics / metrics gathering, eg via Prometheus. labels Oct 1, 2020
@jibi
Copy link
Member Author

jibi commented Oct 1, 2020

Oh and you probably want to add a description in pkg/monitor/api/drop.go for the new metrics.

There it is :) I looked for that mapping but couldn't find anything by grepping for the constants defined in common.h.

Do you think it makes sense to add a comment in common.h pointing to that file?

@qmonnet
Copy link
Member

qmonnet commented Oct 1, 2020

Do you think it makes sense to add a comment in common.h pointing to that file?

I can't see how that would hurt :) Please go ahead!

Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need to extend the existing e2e fragment tracking test (K8sServicesTest Checks service across nodes Supports IPv4 fragments) to ensure we don't miss events or count them several times.

bpf/lib/ipv4.h Outdated Show resolved Hide resolved
@pchaigno pchaigno added the needs/e2e-test This issue is not covered by existing CI tests, but should be. label Oct 1, 2020
@jibi jibi marked this pull request as draft October 1, 2020 14:31
@jibi
Copy link
Member Author

jibi commented Oct 1, 2020

I suspect that injecting "lib/metrics.h" (which in turns includes "lib/maps.h") in the inclusion chain somehow prevents LPM_LOOKUP_FN from being defined, because SKIP_UNDEF_LPM_LOOKUP_FN is undefined at that stage. Previously, the header was likely only included from bpf/tests/ipv6_test.h, where the macro is defined.

Includes: "tests/conntrack_test.h" -> "lib/conntrack.h" -> "ipv4.h" -> "metrics.h" ->"maps.h".

Some solutions could be to reorder the includes in test/bpf/unit-test.c to have "tests/ipv6_tests.h" first, or moving the #define SKIP_UNDEF_LPM_LOOKUP_FN to that file, above the includes.

I went for the latter and moved #define SKIP_UNDEF_LPM_LOOKUP_FN in test/bpf/unit-test.c 👍

@jibi
Copy link
Member Author

jibi commented Oct 1, 2020

I should have addressed all the comments, beside e2e testing (looking into now).

For the REASON_FRAG_PACKET metric it should just be a matter of parsing the output of cilium monitor when we send a fragmented packet in test/k8sT/Services.go, making sure metrics are updated accordingly.

The other 2 cases are a bit more tricky as they require to:

  • forge an IP fragmented packet which is not the initial one (so we can't just use nc)
  • make map_update_elem(&IPV4_FRAG_DATAGRAMS_MAP, ..) fail (not sure when that can happen)

@pchaigno
Copy link
Member

pchaigno commented Oct 1, 2020

For the REASON_FRAG_PACKET metric it should just be a matter of parsing the output of cilium monitor when we send a fragmented packet in test/k8sT/Services.go, making sure metrics are updated accordingly.

I think you want cilium metrics list. It's likely that some existing tests already test metrics, so worth checking.

@pchaigno
Copy link
Member

pchaigno commented Oct 1, 2020

test-me-please

@qmonnet
Copy link
Member

qmonnet commented Oct 1, 2020

  • forge an IP fragmented packet which is not the initial one (so we can't just use nc)

Surely there are libraries for creating packets in go? Otherwise, stick to nc but add some egress rule on the source pod to drop the first fragment by targetting the UDP port?

  • make map_update_elem(&IPV4_FRAG_DATAGRAMS_MAP, ..) fail (not sure when that can happen)

We can't destroy the map since the program uses it, we can't freeze it after it's been created. We use the BPF_ANY flag so we can't play on that without modifying the eBPF program, same thing for using a wrong size for the keys and values. We can't block on a full map, it's a LRU. We can configure its size but we set a lower limit, so we can't set it to zero. 🤔 I can't find a way to have this update fail, besides running out of kernel memory. Not sure if there's a good way to test this one.

test/k8sT/Services.go Outdated Show resolved Hide resolved
@jibi jibi marked this pull request as ready for review October 9, 2020 08:39
@jibi jibi requested a review from a team as a code owner October 9, 2020 08:39
@jibi
Copy link
Member Author

jibi commented Nov 13, 2020

test-me-please

@pchaigno
Copy link
Member

On a positive note, this pull request seems to fix #13931 in the dev. VM 🎉

@jibi
Copy link
Member Author

jibi commented Nov 13, 2020

(@brb I re-requested your review as for some reason the approval for cilium/agent is still missing 🤔)

@jibi jibi removed the request for review from brb November 16, 2020 09:10
@pchaigno pchaigno added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 16, 2020
@jibi jibi merged commit a78f75e into cilium:master Nov 16, 2020
@jibi jibi deleted the pr/jibi/data-path-add-metrics-for-frag-packets branch November 16, 2020 17:05
@joestringer joestringer added this to Needs backport from master in 1.9.1 Nov 17, 2020
@aanm aanm added this to Needs backport from master in 1.9.2 Dec 4, 2020
@aanm aanm removed this from Needs backport from master in 1.9.1 Dec 4, 2020
@aanm aanm mentioned this pull request Dec 4, 2020
@joestringer joestringer removed this from Needs backport from master in 1.9.2 Jan 18, 2021
@pchaigno
Copy link
Member

I backported the four commits related to relax_verifier() to v1.8 to fix a complexity issue (cf. #15440):

datapath/probes: add support for misc features
bpf: reintroduce relax_verifier()
bpf: optimize relax_verifier()
bpf: fix complexity issue on kernels <5.3 

I'm not adding backport labels considering the metric introduced in this PR wasn't backported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics Impacts statistics / metrics gathering, eg via Prometheus. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants