v1.14 Backports 2024-04-30 #32251

Merged
merged 15 commits into from
May 2, 2024

squeed and others added 15 commits April 30, 2024 10:53
[ upstream commit 34caeb2 ]
[ backporter note: Adopt change to v1.14 CI infrastructure ]

This timeout can be CPU sensitive, and the CI environments can be CPU
constrained.

Bumping this timeout ensures that performance regressions will still be
caught, as those tend to cause delays of 1+ seconds. This will, however,
cut down on CI flakes due to noise.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
[ upstream commit 284ee43 ]

This commit passes the Helm value `kubeConfigPath` to the
initContainer of the Cilium agent, where it was previously
missing.

Signed-off-by: darox <maderdario@gmail.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit a758d21 ]

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit c76677d ]

This pulls in cilium/dns#11 which fixes a bug where the `SharedClient`
logic did not respect the `c.Client.Timeout` field.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 931b816 ]

This fixes a bug where DNS requests would time out after 2 seconds,
instead of the intended 10 seconds. This resulted in a `Timeout waiting
for response to forwarded proxied DNS lookup` error message whenever the
response took longer than 2 seconds.

The `dns.Client` used by the proxy is [already configured][1] to use the
`ProxyForwardTimeout` value of 10 seconds, which would also apply to
`dns.Client.DialTimeout` if it were not for the custom `net.Dialer` we
use in Cilium. The logic in [dns.Client.getTimeoutForRequest][2]
overwrites the request timeout with the timeout from the custom
`Dialer`. Therefore, the intended `ProxyForwardTimeout` 10 second
timeout value was overwritten with the much shorter `net.Dialer.Timeout`
value of two seconds. This commit fixes that issue by using
`ProxyForwardTimeout` for the `net.Dialer` too.

Fixes: cf3cc16 ("fqdn: dnsproxy: fix forwarding of the original security identity for TCP")

[1]: https://github.com/cilium/cilium/blob/50943dbc02496c42a4375947a988fc233417e163/pkg/fqdn/dnsproxy/proxy.go#L1042
[2]: https://github.com/cilium/cilium/blob/94f6553f5b79383b561e8630bdf40bd824769ede/vendor/github.com/cilium/dns/client.go#L405

Reported-by: Andrii Iuspin <andrii.iuspin@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit cf9bde5 ]

LinkList is prone to interrupts, which are surfaced by the netlink library. This leads to stability issues when using the ENI datapath. This change makes the LinkList call part of the retry loop in waitForNetlinkDevices.

Fixes: #31974
Signed-off-by: Jason Aliyetti <jaliyetti@gmail.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 715906a ]

Those workflows are failing to run on push events in private forks. They
fail in the "Deduce required tests from code changes" step, in which we
compute a diff of changes. To compute that diff, the dorny/paths-filter
GitHub action needs to be able to check out older git references.
Unfortunately, we check out only the latest reference and drop
credentials afterwards.

This commit fixes it by checking out the full repository. This takes
a few seconds longer, which is probably not a big issue.

Reported-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 49334a5 ]

Signed-off-by: James Bodkin <james.bodkin@amphora.net>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 8f0b106 ]

The warning log on failure to queue endpoint build is most likely not
meaningful when the context is canceled, as this typically happens when
the endpoint is deleted.

Skip the warning log if the error is context.Canceled. This fixes CI flakes
like this:

    Found 1 k8s-app=cilium logs matching list of errors that must be investigated:
    2024-04-22T07:48:47.779499679Z time="2024-04-22T07:48:47Z" level=warning msg="unable to queue endpoint build" ciliumEndpointName=kube-system/coredns-76f75df574-9k8sp containerID=3791acef13 containerInterface=eth0 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=637 error="context canceled" identity=25283 ipv4=10.0.0.151 ipv6="fd02::82" k8sPodName=kube-system/coredns-76f75df574-9k8sp subsys=endpoint

Fixes: #31827
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit b971e46 ]

Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.3.0 to 2.4.0.
- [Release notes](https://github.com/pydantic/pydantic/releases)
- [Changelog](https://github.com/pydantic/pydantic/blob/main/HISTORY.md)
- [Commits](pydantic/pydantic@v2.3.0...v2.4.0)

[ Quentin: The pydantic update requires an update of pydantic_core, too.
    Bump both packages to their latest available version (pydantic 2.7.1
    and pydantic_core 2.18.2). ]

---
updated-dependencies:
- dependency-name: pydantic
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 6e53ad7 ]

Signed-off-by: Cilium Imagebot <noreply@cilium.io>
[ upstream commit a206965 ]
[ backporter notes: regenerated documentation ]

For some reason the renovate configuration added in commit ac804b6
("install/kubernetes: use renovate to update
quay.io/cilium/startup-script") did not pick up the update. Bump the
image manually for now while we keep investigating.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 77b1e6c ]
[ backporter note: minor conflict in setupCNIConfFile ]

If the daemon is configured to write a CNI configuration file, we should
not go ready until that CNI configuration file has been written. This
prevents a race condition where the controller removes the taint from a
node too early, meaning pods may be created with a different CNI
provider.

In #29405, Cilium was configured in chaining mode, but the "primary" CNI
provider hadn't written its configuration yet. This caused the not-ready
taint to be removed from the node too early, and pods were created in a
bad state.

By hooking in the CNI cell's status in the daemon's Status type, we
prevent the daemon's healthz endpoint from returning a successful
response until the CNI cell has been successful.

Fixes: #29405

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
[ upstream commit 889b688 ]

Currently, Envoy's idle timeout can be configured via the Helm value
`envoy.idleTimeoutDurationSeconds` and the agent/operator Go flag
`proxy-idle-timeout-seconds`.

Unfortunately, changing the value in the Helm values has no effect
when running in embedded mode, because the Helm value isn't passed to the
Cilium ConfigMap.

This commit fixes this by setting the value in the ConfigMap.

Fixes: #25214

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 8cea46d ]

Followup for #27706

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro added kind/backports This PR provides functionality previously merged into master. backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. labels Apr 30, 2024
@gandro gandro marked this pull request as ready for review April 30, 2024 09:22
@gandro gandro requested review from a team as code owners April 30, 2024 09:22
Member

@tklauser tklauser left a comment

My change looks good, thanks Sebastian!

@gandro
Member Author

gandro commented Apr 30, 2024

/test-backport-1.14

Member

@pchaigno pchaigno left a comment

My PR looks good. Thanks!

@gandro
Member Author

gandro commented May 2, 2024

Reviews for non-trivial PRs are in. Merging.

@gandro gandro merged commit 7c5677a into v1.14 May 2, 2024
235 checks passed
@gandro gandro deleted the pr/v1.14-backport-2024-04-30-10-46 branch May 2, 2024 08:00
Labels
backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
No open projects
Status: Released

10 participants