Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13 Backports 2023-12-20 #29997

Merged
merged 8 commits into from
Jan 11, 2024
Merged

Conversation

YutaroHayakawa
Copy link
Member

@YutaroHayakawa YutaroHayakawa commented Dec 20, 2023

@YutaroHayakawa YutaroHayakawa added kind/backports This PR provides functionality previously merged into master. backport/1.13 labels Dec 20, 2023
leblowl and others added 3 commits December 20, 2023 12:44
[ upstream commit 28975e3 ]

[ backporter's note:
  - v1.13 still uses net package insteaf of netip package. Changed
    netip.Addr to net.IP and use .Equal() for equality check.
  - v1.13 branch doesn't have IsWorld() (and WorldV4 and WorldV6
    identity). Use == ResearvedIdentityWorld instead.
]

In Hubble, ignore certain cases where the datapath security ID does
not match the userspace security ID.

Signed-off-by: Lucas Leblow <lucasleblow@mailbox.org>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 23ef3c0 ]

Cilium-agent is overriding K8S_NODE_NAME env variable based on node
name. Cilium-preflight was not overriding it so in case that hostname
did not match node name, preflight was stuck waiting for node
information.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 8c2dfda ]

This commit changes the log fields for the "stale identity observed"
message to include the full context (i.e. trace observation point,
source and destination addresses etc) for identifying which flow caused
the message to be emitted. In addition, the "old" and "new" identity
field names are replaced with the better fitting "userspace" and
"datapath" terminology.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
gandro and others added 3 commits December 20, 2023 12:59
[ upstream commit caef2d3 ]

[ backporter's note: v1.13 branch still uses net package instead of
  netip package. Use ip.Equal for IP equality check. ]

This was reported by Paul in [1]. After investigating the sysdump, it is
clear that Hubble should not complain about such packets.

[1] #15283 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 4196935 ]

[ backporter's note: cilium time package doesn't exist on v1.13. Use
  standard time package instead. ]

This limits the amount of "stale identities observed" messages to one
every 30 seconds. This is particularly important if monitor aggregation
is disabled, as otherwise any unknown discrepancy fills the log buffers
and thus makes it hard to e.g. debug CI flakes.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit 0d35af0 ]

To make it easy for callers, ipv4_handle_fragmentation() should always
return a DROP_* reason on error. But for errors from l4_load_ports() we're
currently just propagating those raw errors back.

Return a drop reason instead. This also makes us consistent with the
non-fragment path in ipv4_load_l4_ports().

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa YutaroHayakawa force-pushed the pr/v1.13-backport-2023-12-20 branch 2 times, most recently from 2b3fe7a to 04c5223 Compare December 20, 2023 06:59
@YutaroHayakawa
Copy link
Member Author

YutaroHayakawa commented Dec 20, 2023

/test-backport-1.13

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathEgressGatewayTest tunnel disabled with endpointRoutes enabled and XDP egress gw policy both egress gw and basic connectivity work

Failure Output

FAIL: Expected command: kubectl exec -n kube-system log-gatherer-qxg57 -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://10.0.1.39:80 -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" 

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/1122/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@YutaroHayakawa YutaroHayakawa marked this pull request as ready for review December 20, 2023 09:30
@YutaroHayakawa YutaroHayakawa requested a review from a team as a code owner December 20, 2023 09:30
@YutaroHayakawa
Copy link
Member Author

/mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next

@YutaroHayakawa
Copy link
Member Author

k8s-1.26-kernel-net-next: #30021

@YutaroHayakawa
Copy link
Member Author

/test-1.26-net-next

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, but I'm not sure I understand the backporter note that you left. It says:

[ backporter's note: backporter's note: Conflict due to the
  do_netdev_encrypt_encap doesn't exist on v1.13 branch. Removed it. ]

But the commit applied here is 1:1 with the upstream commit. The change on line 733 in bpf/bpf_host.c is inside the function do_netdev_encrypt_encap. Maybe I'm missing something?

learnitall and others added 2 commits January 10, 2024 19:04
[ upstream commit a495abd ]

[ backporter's note: Upstream calls do_netdev_encrypt_encap instead of
  wrapping it with do_netdev_encrypt. Use do_netdev_encrypt. ]

This commit modifies calls to `send_trace_notify` in the datapath to
drop trace events related to encrypted packets when monitor aggregation
is enabled. More specifically, this commit ensures that whenever
`send_trace_notify` is called with a `trace_reason` of `TRACE_REASON_ENCRYPTED`,
the `monitor` argument is set to zero. A Coccinelle script is provided
in this commit to add a build-time check for this requirement moving forward.

This change helps to reduce the overall CPU usage of Cilium Agents when IPSec
encryption is enabled, by reducing the number of trace events emitted by
the datapath. Normally monitor aggregation can be used in order to
reduce the number of trace events, however IPSec-related trace events
are not able to be aggregated since they lack the necessary connection
tracking information. See the function `emit_trace_notify` in `bpf/lib/trace.h`
for more information.

This same change was applied in `bpf/bpf_lxc.c` in commit 3e52822.

Thank you to Lorenz who added a workaround for passing the verifier with
Clang 10.

See also: #27168

Co-authored-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
[ upstream commit e95f2e0 ]

[ backporter's note: Makefile target structure is different between
  v1.13 and main. Took the v1.13's structure.]

Add an initial bpf_network configuration. The defines are taken from a
sysdump for a failing CI run that seems to exhibit verifier problems.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
@YutaroHayakawa
Copy link
Member Author

@learnitall Same for #30004 (comment). Sorry for the confusion. Fixed the message in the latest push.

@YutaroHayakawa
Copy link
Member Author

/test-backport-1.13

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@julianwiedmann
Copy link
Member

@gandro did your review cover #27894 ? If not, who else could cover the review for that PR?

@gandro
Copy link
Member

gandro commented Jan 11, 2024

@gandro did your review cover #27894 ? If not, who else could cover the review for that PR?

Ah, yes it did. Sorry, forgot to check that too.

@julianwiedmann julianwiedmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 11, 2024
@dylandreimerink dylandreimerink merged commit 240958e into v1.13 Jan 11, 2024
140 checks passed
@dylandreimerink dylandreimerink deleted the pr/v1.13-backport-2023-12-20 branch January 11, 2024 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

8 participants