Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hubble: fix traffic direction and is reply when IPSec is enabled #31211

Conversation

kaworu
Copy link
Member

@kaworu kaworu commented Mar 6, 2024

Backport note: this patch won't apply cleanly on top of v1.13 which doesn't have the SRV6 encap/decap trace reasons, in v1.13 we only need the TraceReasonIsKnown() change in datapath_trace.go and the couple of TRACE_FROM_LXC test cases added in parser_test.go.

Before this patch, Hubble would wrongly report known traffic direction and reply status when IPSec was enabled.

The Cilium datapath uses trace_reason to convey both the trace reason and encryption status of trace notifications.The trace reason is decoded as-is in by userspace in the TraceNotify.Reason field, and users of the Reason field must carefully check the encrypt bit (see for example here or here). Unfortunately, TraceReasonIsKnown, TraceReasonIsEncap, and TraceReasonIsDecap didn't take the encrypt bit into account and thus returned wrong results when the encrypt bit was set.

One proper fix would be to refactor the monitor code to provide enough helpers for users so that they would not need to directly access the Reason field. This will be implemented in a separate PR in order to avoid backporting refactoring commits.

Related to #31202

@kaworu kaworu added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/hubble Impacts hubble server or relay needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Mar 6, 2024
@kaworu kaworu self-assigned this Mar 6, 2024
@kaworu kaworu requested review from a team as code owners March 6, 2024 19:33
@kaworu kaworu requested review from aspsk and chancez March 6, 2024 19:33
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.15.2 Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.8 Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.13 Mar 6, 2024
Before this patch, Hubble would wrongly report known traffic direction
and reply status when IPSec was enabled.

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
@kaworu kaworu force-pushed the pr/kaworu/hubble/traffic-direction-and-is-reply-fix-for-encrypted-trace-notifications branch from c32237f to e76fce6 Compare March 7, 2024 14:30
@kaworu
Copy link
Member Author

kaworu commented Mar 8, 2024

After closer inspection: while it's clear that in the datapath TRACE_REASON_ENCRYPTED is intended to be used as a mask (see also here), it is effectively always provided as-is (i.e. without additional trace reason), see for example

cilium/bpf/bpf_host.c

Lines 1093 to 1094 in 7e4ad4a

send_trace_notify(ctx, TRACE_FROM_STACK, identity, 0, 0,
ctx->ingress_ifindex, TRACE_REASON_ENCRYPTED, 0);

cc @learnitall to confirm this it's related to #29616 and the coccinelle script assumes no masking of TRACE_REASON_ENCRYPTED.

If correct and with the masking logic in mind, TRACE_REASON_ENCRYPTED is equivalent to TRACE_REASON_POLICY | TRACE_REASON_ENCRYPTED, and thus we never hit the bugs fixed by this patch as it would require either

  • TRACE_REASON_UNKNOWN | TRACE_REASON_ENCRYPTED
  • TRACE_REASON_SRV6_ENCAP | TRACE_REASON_ENCRYPTED
  • TRACE_REASON_SRV6_DECAP | TRACE_REASON_ENCRYPTED

Consequently, while this patch fixes the monitor helper functions the behaviour should be unchanged by the patch. So I'm unsure whether it is correct to qualify the PR as either bug fix nor need backport. Since it doesn't fix any actual bug it could be misleading to report it as bug fix in the release notes, and while the backport should be very low risk IMHO it's not worth the effort (we could close this PR and the fix will land in main through #31226). cc @joestringer for guidance here 🙏

@kaworu kaworu added the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Mar 8, 2024
@joestringer
Copy link
Member

If TRACE_REASON_ENCRYPTED is a property bit then I'd expect every trace message with that bit set to also contain the actual reason, at minimum with TRACE_REASON_UNKNOWN (which unfortunately has a non-zero value for historic reasons, so needs to be explicitly set). Though ideally we as developers should have an idea about why we're emitting the trace notification and allocate a reason for that purpose, otherwise we're apparently just emitting additional notifications for no reason, which is a waste of resources.

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Datapath logic here looks good. Not sure if we're going far enough on fixing up usage of the encrypted trace mask, but that doesn't have to be addressed here.

@kaworu
Copy link
Member Author

kaworu commented Mar 11, 2024

/test

@kaworu kaworu removed the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Mar 11, 2024
@jrajahalme jrajahalme removed this from Needs backport from main in 1.15.2 Mar 13, 2024
@thorn3r thorn3r removed this from Needs backport from main in 1.13.13 Mar 13, 2024
@thorn3r thorn3r added this to Needs backport from main in 1.13.13 Mar 13, 2024
@thorn3r thorn3r added this to Needs backport from main in 1.13.14 Mar 13, 2024
@thorn3r thorn3r removed this from Needs backport from main in 1.13.13 Mar 13, 2024
@thorn3r thorn3r added this to Needs backport from main in 1.14.9 Mar 13, 2024
@thorn3r thorn3r removed this from Needs backport from main in 1.14.8 Mar 13, 2024
@kaworu
Copy link
Member Author

kaworu commented Mar 13, 2024

/ci-ginkgo

@kaworu
Copy link
Member Author

kaworu commented Mar 13, 2024

/ci-runtime

@kaworu
Copy link
Member Author

kaworu commented Mar 15, 2024

/ci-ginkgo

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 15, 2024
@dylandreimerink dylandreimerink added this pull request to the merge queue Mar 19, 2024
Merged via the queue into cilium:main with commit 9939fa2 Mar 19, 2024
62 checks passed
@gandro gandro mentioned this pull request Mar 19, 2024
8 tasks
@gandro gandro added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Mar 19, 2024
@kaworu kaworu deleted the pr/kaworu/hubble/traffic-direction-and-is-reply-fix-for-encrypted-trace-notifications branch March 21, 2024 13:20
@github-actions github-actions bot added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Mar 21, 2024
@sayboras sayboras mentioned this pull request Mar 24, 2024
6 tasks
@sayboras sayboras added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Mar 24, 2024
@sayboras sayboras mentioned this pull request Mar 24, 2024
4 tasks
@sayboras sayboras added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Mar 24, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Mar 25, 2024
@thorn3r thorn3r moved this from Needs backport from main to Backport done to v1.13 in 1.13.14 Mar 26, 2024
@jrajahalme jrajahalme moved this from Needs backport from main to Backport done to v1.14 in 1.14.9 Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/hubble Impacts hubble server or relay
Projects
No open projects
1.13.14
Backport done to v1.13
1.14.9
Backport done to v1.14
Status: Released
Status: Released
Status: Released
Development

Successfully merging this pull request may close these issues.

None yet

7 participants