egressgateway: provide a very basic Cell #24330

Merged
merged 1 commit into from Mar 30, 2023

Contributor

@lmb lmb commented Mar 13, 2023

These are the minimal changes needed to get the EGW into a hive Cell. The hacky part is that enabling the EGW is gated on the global option.Config.EnableIPv4EgressGateway, which is referenced from a bunch of places. It's not entirely clear to me how to migrate that toggle yet, and it'll probably require a bunch of changes. I've kept the PR simple in the interest of making this easier to review.
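
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a constructor that takes a Params struct and is gated on a global toggle. It does not use the real hive API: `enableIPv4EgressGateway`, `Params`, its `NodeName` field, and `provideManager` are hypothetical stand-ins, and returning nil when the toggle is off is an assumption based on the discussion in this PR.

```go
package main

import "fmt"

// enableIPv4EgressGateway is a hypothetical stand-in for the global
// option.Config.EnableIPv4EgressGateway toggle.
var enableIPv4EgressGateway bool

// Params bundles the constructor's dependencies in one struct, the shape a
// dependency-injection framework can populate field by field.
// The field is invented for illustration.
type Params struct {
	NodeName string
}

// Manager is a stand-in for the egress gateway manager.
type Manager struct {
	nodeName string
}

// NewManager builds a Manager from its Params.
func NewManager(p Params) *Manager {
	return &Manager{nodeName: p.NodeName}
}

// provideManager runs the constructor only when the global toggle is set;
// otherwise consumers receive a nil *Manager (assumed behavior).
func provideManager(p Params) *Manager {
	if !enableIPv4EgressGateway {
		return nil
	}
	return NewManager(p)
}

func main() {
	fmt.Println(provideManager(Params{NodeName: "node-1"}) == nil)
	enableIPv4EgressGateway = true
	fmt.Println(provideManager(Params{NodeName: "node-1"}) == nil)
}
```

The gate lives in the provider rather than in the constructor, so the constructor itself stays free of global state.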

@lmb lmb added release-note/misc This PR makes changes that have no direct user impact. feature/egress-gateway Impacts the egress IP gateway feature. area/modularization labels Mar 13, 2023
@lmb lmb force-pushed the egw-modularization branch 2 times, most recently from cc23d04 to 041d52c Compare March 13, 2023 11:54
@lmb
Contributor Author

lmb commented Mar 13, 2023

/test

@lmb
Contributor Author

lmb commented Mar 14, 2023

/test-1.16-4.19
/test-1.26-net-next
/test-runtime

@lmb lmb marked this pull request as ready for review March 14, 2023 09:19
@lmb lmb requested review from a team as code owners March 14, 2023 09:19
@lmb lmb requested review from bimmlerd, rolinh and aspsk March 14, 2023 09:19
@lmb
Contributor Author

lmb commented Mar 14, 2023

cc @jibi @joamaki

Member

@bimmlerd bimmlerd left a comment

Couple of questions and some docs nits.

cell.Config(egressgateway.DefaultConfig),
cell.Provide(
	func(params egressgateway.Params) *egressgateway.Manager {
		if !option.Config.EnableIPv4EgressGateway {
Member

If there's not too much churn, maybe pull this into the egress config and out of option.Config. But this might be what you alluded to re minimal changes; fine in that case.

Contributor Author

Yes, this is what I was referring to. I had a chat with Gilberto and maybe there is an easy way to work around this; I'll take another look.

Contributor Author

I've replaced a bunch of checks for Config.Enable.. with egressGatewayManager != nil. That works nicely for the k8swatcher, the daemon is less clear cut. I've put the latter in a separate commit.
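
A minimal sketch of that nil-check pattern on the consumer side, using hypothetical stand-in types (`Manager`, `Watcher`, `OnUpdateEndpoint`, `EndpointUpdated`) rather than the actual Cilium code:

```go
package main

import "fmt"

// Manager is a hypothetical stand-in for *egressgateway.Manager.
type Manager struct {
	endpoints []string
}

// OnUpdateEndpoint records an endpoint the manager should track.
func (m *Manager) OnUpdateEndpoint(name string) {
	m.endpoints = append(m.endpoints, name)
}

// Watcher is a stand-in for a component such as the k8s watcher. It holds
// a concrete *Manager pointer, which is nil when the feature is disabled.
type Watcher struct {
	egressGatewayManager *Manager
}

// EndpointUpdated forwards the event only when a manager was injected,
// and reports whether it did so.
func (w *Watcher) EndpointUpdated(name string) bool {
	if w.egressGatewayManager == nil {
		return false
	}
	w.egressGatewayManager.OnUpdateEndpoint(name)
	return true
}

func main() {
	disabled := &Watcher{}
	enabled := &Watcher{egressGatewayManager: &Manager{}}
	fmt.Println(disabled.EndpointUpdated("ep-1"), enabled.EndpointUpdated("ep-1"))
}
```

One caveat with this approach in Go: the field should stay a concrete pointer type. A nil `*Manager` stored in an interface-typed field compares unequal to nil, so the check would silently pass.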

We're still left with a bunch of hairy global usage unfortunately:

pkg/datapath/loader/base.go
343:	if option.Config.Tunnel == option.TunnelDisabled && option.Config.EnableIPv4EgressGateway {

pkg/k8s/synced/crd.go
56:	if option.Config.EnableIPv4EgressGateway {

pkg/k8s/watchers/cilium_endpoint.go
212:	if option.Config.EnableIPv4EgressGateway {
239:	if option.Config.EnableIPv4EgressGateway {

pkg/datapath/linux/config/config.go
270:	if option.Config.EnableIPv4EgressGateway {

I think to fix this in a nicer way we need a way to define flags in hive cells, but somehow plumb them into option.Config during the transition phase. Otherwise we need to do gigantic refactors again. cc @joamaki

Contributor

Preferably I'd of course see these in EgressGatewayConfig, but seeing where the enable flag is referenced, it'd get painful to do that in this PR since you'd need to plumb the EgressGatewayConfig struct to all these places that have not yet been modularized. Let's keep that out of this PR for the time being.

As a transitional step we could already add EgressGatewayConfig, keep option.Config.EnableIPv4EgressGateway, move the flag registration from daemon/cmd/daemon_main.go into EgressGatewayConfig.Flags(), and finally set option.Config.EnableIPv4EgressGateway somewhere... and that last point is where it gets a bit hairy. It can be made to work by having a cell.Invoke at the very top of daemon/cmd/cells.go which depends on *DaemonConfig and EgressGatewayConfig and bridges the two before the "daemonCell" or anyone else gets access to it. I don't like it that much though. I think it'd be preferable to just do it cleanly once, when we're far enough along that we can.

Another thing here is that EgressGatewayConfig is most likely a "datapath" configuration, i.e. it affects the operation of the datapath (see e.g. pkg/datapath/loader/base.go above) more than the control plane, so it should likely live under pkg/datapath and possibly stay internal to it. Control-plane components that manage the "egress gateway datapath" would then depend on the config either directly, or via some "egress gateway datapath API" that tells them what can and cannot be done. This is maybe less clear with egress gateway, but for datapath components that require probing to figure out the "actual" configuration, it makes sense to expose not the user-provided config struct but the configuration that results from merging user configuration and probe results.
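
As an illustration of exposing the merged result rather than the raw user config, here is a small sketch. All names (`UserConfig`, `ProbeResult`, `EffectiveConfig`, `Merge`) are hypothetical, and the "requested and supported" rule is just one plausible merge policy.

```go
package main

import "fmt"

// UserConfig is what the operator asked for (hypothetical field).
type UserConfig struct {
	EnableEgressGateway bool
}

// ProbeResult is what the system was found to support (hypothetical field).
type ProbeResult struct {
	EgressGatewaySupported bool
}

// EffectiveConfig is the only view handed to control-plane components.
// Its field is unexported so callers cannot bypass the merge.
type EffectiveConfig struct {
	egressGateway bool
}

// EgressGatewayEnabled reports whether the feature is both requested and
// supported.
func (e EffectiveConfig) EgressGatewayEnabled() bool {
	return e.egressGateway
}

// Merge combines the user's request with the probe results.
func Merge(u UserConfig, p ProbeResult) EffectiveConfig {
	return EffectiveConfig{
		egressGateway: u.EnableEgressGateway && p.EgressGatewaySupported,
	}
}

func main() {
	eff := Merge(
		UserConfig{EnableEgressGateway: true},
		ProbeResult{EgressGatewaySupported: false},
	)
	fmt.Println(eff.EgressGatewayEnabled())
}
```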

tl;dr: let's do nothing in this PR about the config 😁

@lmb lmb requested a review from a team as a code owner March 21, 2023 18:17
@lmb lmb requested a review from youngnick March 21, 2023 18:17
@lmb lmb force-pushed the egw-modularization branch 2 times, most recently from e1873b7 to 1bc0159 Compare March 22, 2023 09:25
@lmb
Contributor Author

lmb commented Mar 22, 2023

/test

@lmb
Contributor Author

lmb commented Mar 22, 2023

/ci-datapath

@lmb
Contributor Author

lmb commented Mar 22, 2023

/test

@lmb
Contributor Author

lmb commented Mar 27, 2023

/test

Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed:

Test Name

K8sDatapathServicesTest Checks E/W loadbalancing (ClusterIP, NodePort from inside cluster, etc) Tests NodePort inside cluster (kube-proxy) with IPSec and externalTrafficPolicy=Local

Failure Output

FAIL: Request from k8s1 to service http://[fd04::11]:31703 failed

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 so I can create one.

@lmb
Contributor Author

lmb commented Mar 27, 2023

/test-1.25-4.19

Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed:

Test Name

K8sDatapathServicesTest Checks E/W loadbalancing (ClusterIP, NodePort from inside cluster, etc) Tests NodePort inside cluster (kube-proxy) with IPSec and externalTrafficPolicy=Local

Failure Output

FAIL: Request from k8s1 to service http://[fd04::11]:32322 failed

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 so I can create one.

@lmb
Contributor Author

lmb commented Mar 27, 2023

/mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19

@lmb
Contributor Author

lmb commented Mar 27, 2023

/test-runtime

@lmb
Contributor Author

lmb commented Mar 28, 2023

/test

Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed:

Test Name

K8sDatapathServicesTest Checks E/W loadbalancing (ClusterIP, NodePort from inside cluster, etc) Tests NodePort inside cluster (kube-proxy) with IPSec and externalTrafficPolicy=Local

Failure Output

FAIL: Request from k8s1 to service http://[fd04::11]:30654 failed

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 so I can create one.

@lmb
Contributor Author

lmb commented Mar 28, 2023

/mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19

👍 created #24602

@lmb
Contributor Author

lmb commented Mar 28, 2023

/test-1.26-net-next

@lmb
Contributor Author

lmb commented Mar 28, 2023

/test

@lmb
Contributor Author

lmb commented Mar 29, 2023

/test

@lmb
Contributor Author

lmb commented Mar 29, 2023

runtime test flaked due to #24342

@lmb lmb requested a review from a team as a code owner March 29, 2023 13:30
@lmb lmb requested a review from nebril March 29, 2023 13:30
@lmb
Contributor Author

lmb commented Mar 29, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing Tests with XDP, vxlan tunnel, SNAT and Random

Failure Output

FAIL: Can not connect to service "tftp://[fd04::11]:32066/hello" from outside cluster (6/10)

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

@rolinh rolinh removed their request for review March 29, 2023 14:52
@ciliumbot

Build finished.

Introduce a constructor which takes a hive-compatible Params struct
and wire things up into a global Cell.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
@lmb
Contributor Author

lmb commented Mar 30, 2023

/test

@lmb
Contributor Author

lmb commented Mar 30, 2023

This is what CI-driven development looks like! The test-1.24-5.4 failure is, I believe, unrelated. Jenkins doesn't even have logs for the cilium pods and one of @jibi's PRs has the same failure.

@lmb lmb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 30, 2023
@julianwiedmann julianwiedmann merged commit 8bbe924 into cilium:master Mar 30, 2023
42 checks passed
@lmb lmb deleted the egw-modularization branch March 30, 2023 17:16
@pchaigno
Member

The test-1.24-5.4 is, I believe, unrelated. Jenkins doesn't even have logs for the cilium pods and one of @jibi PRs has the same failure.

Do we have an issue for that?

@jibi
Member

jibi commented Mar 31, 2023

Filed #24667 for the CI failure

Labels
area/modularization feature/egress-gateway Impacts the egress IP gateway feature. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

8 participants