v1.10 backports 2021-12-16 #18276
Conversation
[ upstream commit 1cf3ef3 ] The `reason` argument of `send_trace_notify` is intended to contain connection tracking state (see TRACE_REASON_*). It is used by user-space to filter out reply packets where possible and thus should be zero if no ct state is available, to avoid misclassification. The code in bpf_host erroneously populated this value with the verdict instead. This commit removes those values and adds documentation of what value may be passed in `send_trace_notify`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 3d78837 ] This commit introduces an enum for the `TRACE_REASON_*` values, to ensure callers of `send_trace_notify` do not accidentally pass in wrong values. Suggested-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 5c0eb01 ] This fixes a bug where Hubble wrongly populated the `is_reply` field for `to-network` trace events, as it assumed these events populated the `TraceNotify.Reason` field with connection tracking state. This however turned out to be an error, and thus TO_NETWORK needs to be removed from the list of observation points with CT state. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit f6c7825 ] Currently, we check for the DSR IP option and then create the CT entry with the DSR flag only for new connections (CT_NEW). If a stale CT entry exists (without the DSR flag), DSR is not handled for the new flow, which means the rev-DNAT xlation is not applied. This commit fixes this problem and updates the DSR flag for connections in the CT_REOPENED state. Signed-off-by: ivan <i.makarychev@tinkoff.ru> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 13a7125 ] Signed-off-by: chaosbox <ram29@bskyb.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 9f50a91 ] Using it to upgrade to a new minor Cilium version has never been supported and may break the Helm template in subtle ways due to the lack of default values. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 979cae5 ] Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 2ab15e9 ] Create the tunnel map as non-persistent instead of marking it as such in the package level init func. Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit c083925 ] Currently, when deleteTunnelMapping(oldCIDR *cidr.CIDR, quietMode bool) is called with quietMode = true for a nonexistent entry - as could be the case on initial node addition - we would get ENOENT on the underlying map operation in pkg/bpf/(*map).deleteMapEntry. This would lead to the bpf-map-sync-cilium_tunnel_map controller being started to reconcile the error. However, in some reported cases (see cilium#16488), the controller seems to stay in an error state, failing to delete the non-existent entry. Avoid this situation entirely by ignoring any map delete error in case deleteTunnelMapping is called with quietMode = true. Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 8b9e890 ] The tunnel map and thus its user space cache are only needed if either tunneling or the IPv4 egress gateway is enabled. Currently, the user space cache of the map is created regardless of whether it is actually used, leading e.g. to the bpf-map-sync-cilium_tunnel_map controller being spawned unnecessarily. This controller will e.g. show up in `cilium status --all-controllers` and might lead to confusion in setups where tunneling and the egress gateway feature are disabled. In some reported cases (see cilium#16488) with tunneling disabled, that controller also ended up in an error state, unable to recover. Thus, only create the user space cache map, and thus the sync controller, in case it is actually needed, to avoid this class of errors. Signed-off-by: Tobias Klauser <tobias@cilium.io>
[ upstream commit 386029b ] The datapath.LocalNodeConfig.EnableEncapsulation field is immutable at runtime [1], i.e. it is defined once at agent startup. The tunnel map is created as non-persistent, meaning that any potentially pinned map would be deleted on startup [2], [3]. In combination this means that there cannot possibly be a case that there are left over old tunnel map entries in the tunnel map if encapsulation is disabled. [1] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/datapath/node.go#L59-L64 [2] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/maps/tunnel/tunnel.go#L48 [3] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/bpf/map_linux.go#L104-L106 Signed-off-by: Tobias Klauser <tobias@cilium.io>
Commits: be38faa to 0f3ca44
/test-backport-1.10

The CI bot reported the following jobs as failed and not observed before, so they may be related to this PR (if a failure is a flake, comment accordingly):
- Cilium-PR-K8s-1.19-kernel-4.9
- Cilium-PR-K8s-1.19-kernel-5.4
- Cilium-PR-K8s-1.19-kernel-4.9
- Cilium-PR-K8s-GKE
- Cilium-PR-K8s-1.16-net-next
Looks good for my commits! Thanks
/test-1.16-netnext VM provisioning failure: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.16-net-next/2170/
/test-1.19-4.9 Hit #16847 https://jenkins.cilium.io/job/Cilium-PR-K8s-1.19-kernel-4.9/1943/
Cilium L4LB XDP test hit #18211
/test-1.19-5.4 previous failure: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.19-kernel-5.4/1386/ (triaging)
/test-1.19-4.9 previous failure: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.19-kernel-4.9/1944/ (triaging)
/test-gke previous failure: https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/7197/
/test-1.16-netnext
/test-gke
All remaining failures are known flakes, merging.
Referenced PRs:
- #18196 -- hubble: Fix misclassification of `to-network` reply packets (@gandro)
- #18259 -- docs: Warn against Helm's `--reuse-values` in Cilium upgrades (@gandro)

Once this PR is merged, you can update the PR labels via: