xds: Avoid xds timeout due to agent restart in envoy DS mode #31061

Merged
1 commit merged into cilium:main from tam/xds-timeout on Feb 29, 2024

sayboras
Member

With external Envoy, the xDS server and Envoy have different life cycles: each runs in its own pod and can be deployed or restarted independently. This commit handles the case where the xDS server in the Cilium agent has been restarted and the nonce value is always 0.

Sample error

2024-02-05T12:49:51.771714518Z level=warning msg="Regeneration of endpoint failed" bpfCompilation=0s bpfLoadProg=105.68356ms bpfWaitForELF="24.396µs" bpfWriteELF=1.802221ms ciliumEndpointName=cilium-test/client-56f8968958-fqdl4 containerID=245b2aaac2 containerInterface=eth0 datapathPolicyRevision=5 desiredPolicyRevision=6 endpointID=134 error="Error while configuring proxy redirects: proxy state changes failed: context canceled" identity=1713 ipv4=10.244.1.1 ipv6="fd00:10:244:1::9544" k8sPodName=cilium-test/client-56f8968958-fqdl4 mapSync=2.476505ms policyCalculation=1.240346ms prepareBuild="437.049µs" proxyConfiguration="837.119µs" proxyPolicyCalculation="234.369µs" proxyWaitForAck=2m34.697546384s reason="policy rules added" subsys=endpoint total=2m34.818201428s waitingForCTClean=270ns waitingForLock="2.605µs"
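
The restart case described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the code from this PR: it assumes that the first request on a new xDS stream carries an empty response nonce (parsed here as 0) while still reporting the version the proxy last applied, and that the server compares the two values when deciding whether a request is an ACK. The helper name `effectiveNonce` is hypothetical.

```go
package main

import "fmt"

// effectiveNonce is a hypothetical helper, not taken from the PR's diff.
// It treats a zero nonce paired with a non-zero version as a sign that
// the xDS server restarted while the proxy kept running, and falls back
// to the reported version so the request is not stuck waiting for an ACK.
func effectiveNonce(versionInfo, nonce uint64) uint64 {
	if nonce == 0 && versionInfo > 0 {
		// Assumption: the proxy reconnected to a fresh server instance.
		return versionInfo
	}
	return nonce
}

func main() {
	// Reconnect after a server restart: version known, nonce empty.
	fmt.Println(effectiveNonce(6, 0))
	// Normal request on an established stream: nonce is kept.
	fmt.Println(effectiveNonce(5, 6))
	// Very first request from a new proxy: nothing to infer.
	fmt.Println(effectiveNonce(0, 0))
}
```

Without handling of this kind, a server that only trusts the nonce could wait for an acknowledgement that never arrives, which would be consistent with the long `proxyWaitForAck` value in the sample error.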

Signed-off-by: Tam Mach <tam.mach@cilium.io>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 29, 2024
@sayboras sayboras added release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Feb 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.15.2 Feb 29, 2024
@sayboras sayboras added the needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch label Feb 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.8 Feb 29, 2024
@sayboras sayboras marked this pull request as ready for review February 29, 2024 13:55
@sayboras sayboras requested a review from a team as a code owner February 29, 2024 13:55
@sayboras
Member Author

/test

@sayboras sayboras added this pull request to the merge queue Feb 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 29, 2024
Merged via the queue into cilium:main with commit d7dba5e Feb 29, 2024
63 checks passed
@sayboras sayboras deleted the tam/xds-timeout branch February 29, 2024 15:26
@pippolo84 pippolo84 mentioned this pull request Mar 5, 2024
13 tasks
@pippolo84 pippolo84 added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Mar 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.15 in 1.15.2 Mar 5, 2024
@pippolo84 pippolo84 mentioned this pull request Mar 5, 2024
7 tasks
@pippolo84 pippolo84 added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Mar 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.8 Mar 5, 2024
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Mar 11, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.8 Mar 11, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Backport done to v1.14 in 1.14.8 Mar 11, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Mar 11, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.15 in 1.15.2 Mar 11, 2024
@jrajahalme jrajahalme added the needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch label Apr 19, 2024
@giorio94 giorio94 mentioned this pull request Apr 19, 2024
6 tasks
@giorio94 giorio94 added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Apr 19, 2024
@github-actions github-actions bot added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Apr 22, 2024
Labels
backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.14.8
Backport done to v1.14

4 participants