v1.11 backports 2021-12-03 #18119
Conversation
[ upstream commit 33bd95c ]

Previously, it was possible for a backend or a service to be allocated an ID which fell between two existing backend IDs (ID_backend_A < ID < ID_backend_B). This could happen after a cilium-agent restart, because nextID was not advanced upon the restoration of IDs. This could lead to the per-packet LB selecting a backend which did not belong to the requested service, when the following happened in chronological order:

1. The same client previously made a request to the service, and the backend with ID_x was chosen.
2. The service endpoint (backend) with ID_x was removed.
3. cilium-agent was restarted.
4. A new service backend which does not belong to the initial service was created and was allocated ID_x.
5. The CT_SERVICE entry for the old connection was not removed by the CT GC.
6. The same client made a new connection to the same service from the same src port.

The above led lb{4,6}_local() to select the wrong backend, as it found the CT_SERVICE entry with the backend ID_x. Advancing nextID upon restoration only partly mitigates the issue. The real fix would be to introduce a match map whose key is (svc_id, backend_id), populated by the agent; the lb{4,6}_local() routines would consult the map to detect whether the backend belongs to the service.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
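A minimal sketch of the partial mitigation described above, using a hypothetical allocator (the type and function names are illustrative and do not match Cilium's actual code): advancing nextID past every restored ID means freed IDs such as ID_x are not immediately reissued after a restart, so a stale CT_SERVICE entry is less likely to point at an unrelated new backend.

```go
package main

import "fmt"

// idAllocator is a hypothetical stand-in for the agent's service/backend
// ID allocator; it is not Cilium's actual implementation.
type idAllocator struct {
	nextID uint32
	inUse  map[uint32]bool
}

// restore re-registers IDs that survived an agent restart and advances
// nextID past the highest restored ID, so a fresh allocation can never
// land on or between IDs that predate the restart.
func (a *idAllocator) restore(ids []uint32) {
	for _, id := range ids {
		a.inUse[id] = true
		if id >= a.nextID {
			a.nextID = id + 1
		}
	}
}

// allocate hands out the lowest free ID at or above nextID.
func (a *idAllocator) allocate() uint32 {
	for a.inUse[a.nextID] {
		a.nextID++
	}
	id := a.nextID
	a.inUse[id] = true
	a.nextID++
	return id
}

func main() {
	a := &idAllocator{nextID: 1, inUse: map[uint32]bool{}}
	// IDs restored after the restart. Without the nextID advancement,
	// the next allocation would be 1 — an ID that a stale CT_SERVICE
	// entry may still reference.
	a.restore([]uint32{3, 7})
	fmt.Println(a.allocate()) // prints 8
}
```

As the commit message notes, this only delays ID reuse; the real fix would be a (svc_id, backend_id) match map consulted by lb{4,6}_local() to verify that a backend actually belongs to the requested service.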
[ upstream commit 915a7f5 ]

Tested by applying this patch to the v1.11 branch and validating that the digest matches the correct cloud image vs. the v1.11.0-rc3 images on Quay.io:

$ helm template cilium ./install/kubernetes/cilium/ --version 1.11.0-rc3 \
    --namespace kube-system --set eni.enabled=true --set ipam.mode=eni \
    --set egressMasqueradeInterfaces=eth0 --set tunnel=disabled \
    | grep operator.*sha
  image: quay.io/cilium/operator-aws:v1.11.0-rc3@sha256:5ea0ccb6a866a5fb13f4bdfcf1ed8bce12a1355cb10a0914ea52af25f3a8f931

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
[ upstream commit 2273b04 ]

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
[ upstream commit ce68d37 ]

This upgrade guide contained all other versions in it. To prevent users from mistakenly reading an old upgrade guide, we should remove those leftovers.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
[ upstream commit 6bd3833 ]

This bumps the Hubble CLI to the recently released version 0.9.0. Hubble CLI v0.9.0 includes the Hubble protobuf API changes present in Cilium v1.11-rc3 and is thus intended to be bundled with the final Cilium v1.11 release.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
[ upstream commit cc1ded8 ]

Make it clear how users can select devices for prefiltering.

Reported-by: André Martins <andre@cilium.io>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
[ upstream commit 0027542 ]

This commit fixes the eksctl ClusterConfig so that it can be copied. It is merely a workaround for now until a proper fix is available.

Fixes: 706c900 ("docs: re-write docs to create clusters with tainted nodes")

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: nathanjsweet <nathanjsweet@pm.me>
/test-backport-1.11

Job 'Cilium-PR-K8s-1.23-kernel-4.9' failed and has not been observed before, so may be related to your PR. If it is a flake, comment

Job 'Cilium-PR-K8s-1.19-kernel-4.9' failed and has not been observed before, so may be related to your PR. If it is a flake, comment
For my PR: 👍
ci-aks-1.11 hit #18125. Failures are known on the branch; good to merge.
Once this PR is merged, you can update the PR labels via:
or with