Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.12 backports 2022-10-10 #21639

Merged
merged 6 commits into from
Oct 18, 2022
Merged

v1.12 backports 2022-10-10 #21639

merged 6 commits into from
Oct 18, 2022

Conversation

jibi
Copy link
Member

@jibi jibi commented Oct 10, 2022

Once this PR is merged, you can update the PR labels via:

$ for pr in 21499 ; do contrib/backporting/set-labels.py $pr done 1.12; done

or with

$ make add-label BRANCH=v1.12 ISSUES=21499

@jibi jibi requested a review from a team as a code owner October 10, 2022 09:52
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.12 kind/backports This PR provides functionality previously merged into master. labels Oct 10, 2022
@pchaigno
Copy link
Member

pchaigno commented Oct 10, 2022

Any backporting conflicts? If not, why was this carried outside the normal backporting process? (We currently have 4 in-flight 1.12 backport PR 😱 ) If yes, please document 🙂

@jibi
Copy link
Member Author

jibi commented Oct 10, 2022

Any backporting conflicts? If not, why was this carried outside the normal backporting process? (We currently have 4 in-flight 1.12 backport PR scream ) If yes, please document slightly_smiling_face

All commits had non trivial conflicts as pointed in the first backport attempt and I had already synced offline with the tophat about that.

Not sure there's much value in documenting each individual conflict as I literally started from a blank state and manually reapplied the changes

@pchaigno
Copy link
Member

Not sure there's much value in documenting each individual conflict as I literally started from a blank state and manually reapplied the changes

Just that would be useful to know when reviewing & when we'll come back to these while trying to understand some buggy behavior.

@jibi
Copy link
Member Author

jibi commented Oct 10, 2022

/test-backport-1.12

@sayboras
Copy link
Member

all 3 jenkins jobs are failing for the same tests, I am not sure if there are some regression here.

Appreciate if someone can chime in.

  • K8sDatapathConfig AutoDirectNodeRoutes Check connectivity with automatic direct nodes routes
  • K8sDatapathConfig AutoDirectNodeRoutes Check direct connectivity with per endpoint routes
  • K8sDatapathConfig Iptables Skip conntrack for pod traffic

https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-5.4/780/
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.23-kernel-4.19/583/
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-net-next/626

bpf/lib/nat.h Outdated Show resolved Hide resolved
[ upstream commit 4532996 ]

In order to reclaim orphan entries in the NAT table the agent uses the
connection tracking state: if a NAT entry is not backed by a connection,
it should be considered orphaned and can be reclaimed.

Currently the SNAT logic assumes that all connections that needs to be
translated are already tracked. The only exception for this is
connections originating from the local host, which are tracked by
snat_v4_track_local().

In practice egress gateway is another case that requires special
attention: in fact, all connections tunneled from a different node to a
gateway node are currently not tracked. This means that whenever the CT
garbage collector kicks in, all NAT entries for egress gateway will be
deleted, potentially leading to connections breakage.

This commit fix this by making sure we always track these connections.

While at it, also rename snat_v4_track_local() to
snat_v4_track_connection().

Towards: #21346
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit 19c2d59 ]

For consistency with its v4 counterpart.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit c62b209 ]

No functional changes introduced.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit 658045e ]

In snat_v4_needed(), whenever NAT is required because of an egress
gateway policy, we unconditionally set the from_endpoint parameter to
true, as egress gateway traffic is always originating from an endpoint.

This is wrong as that parameter is supposed to track whether traffic is
coming from a _local_ endpoint, and so that variable should be set to true
only if the __lookup_ip4_endpoint() lookup is successful.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit 59ac84f ]

Move the from_endpoint parameter of snat_v4_needed() into
ipv4_nat_target, and rename the function to snat_v4_prepare_state().

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
[ upstream commit f9697d2 ]

Use the information stored in the SNAT target state to determine if the
packet is coming from a local endpoint, saving one map lookup.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
Signed-off-by: Gilberto Bertin <jibi@cilium.io>
@jibi jibi force-pushed the pr/v1.12-backport-2022-10-10 branch from 0f75a20 to 76cd011 Compare October 17, 2022 16:54
@jibi
Copy link
Member Author

jibi commented Oct 17, 2022

/test-backport-1.12

Job 'Cilium-PR-K8s-GKE' failed:

Click to show.

Test Name

K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master

Failure Output

FAIL: Timed out after 40.813s.

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-GKE so I can create one.

@jibi
Copy link
Member Author

jibi commented Oct 17, 2022

/test-1.19-4.9

Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤞

@jibi
Copy link
Member Author

jibi commented Oct 18, 2022

CI status:

  • ci-aks-1.12 didn't really failed:
✅ All 30 tests (224 actions) successful, 0 tests skipped, 1 scenarios skipped.
Error: The operation was canceled.
  • test-1.22-4.9 failed because of a known flake (K8sServicesTest Checks E/W loadbalancing (ClusterIP, NodePort from inside cluster, etc))

  • test-gke and test-1.23-4.9 are not marked as required

@jibi jibi added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 18, 2022
@qmonnet qmonnet merged commit dd93f2c into v1.12 Oct 18, 2022
@qmonnet qmonnet deleted the pr/v1.12-backport-2022-10-10 branch October 18, 2022 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants