New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
egressgateway: provide a very basic Cell #24330
Conversation
cc23d04
to
041d52c
Compare
/test |
/test-1.16-4.19 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of questions and some docs nit
daemon/cmd/cells.go
Outdated
cell.Config(egressgateway.DefaultConfig), | ||
cell.Provide( | ||
func(params egressgateway.Params) *egressgateway.Manager { | ||
if !option.Config.EnableIPv4EgressGateway { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there's not too much churn maybe pull this into the egress config and out of option.config. But this might be what you alluded to re minimal changes; fine in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is what i was referring to. I had a chat with gilberto and maybe there is an easy way to work around this, i'll take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced a bunch of checks for Config.Enable..
with egressGatewayManager != nil
. That works nicely for the k8swatcher, the daemon is less clear cut. I've put the latter in a separate commit.
We're still left with a bunch of hairy global usage unfortunately:
pkg/datapath/loader/base.go
343: if option.Config.Tunnel == option.TunnelDisabled && option.Config.EnableIPv4EgressGateway {
pkg/k8s/synced/crd.go
56: if option.Config.EnableIPv4EgressGateway {
pkg/k8s/watchers/cilium_endpoint.go
212: if option.Config.EnableIPv4EgressGateway {
239: if option.Config.EnableIPv4EgressGateway {
pkg/datapath/linux/config/config.go
270: if option.Config.EnableIPv4EgressGateway {
I think to fix this in a nicer way we need a way to define flags in hive cells, but somehow plumb them into option.Config
during the transition phase. Otherwise we need to do gigantic refactors again. cc @joamaki
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preferably I'd of course see these in EgressGatewayConfig
, but seeing where the enable flag is referenced it'd get painful to do that in this PR since you'd need to plumb the EgressGatewayConfig struct to all these places that have not yet been modularized. Let's keep that out from this PR for the time being.
As a transitionary thing we could already add EgressGatewayConfig, keep the option.Config.EnableIPv4EgressGateway, remove the flag registration from daemon/cmd/daemon_main.go
and do it in EgressGatewayConfig.Flags() instead, and finally set option.Config.EnableIPv4EgressGateway
somewhere... and the last point is where it gets a bit hairy. It can be made to work by having a cell.Invoke
at the very top in daemon/cmd/cells.go
which would depend on *DaemonConfig
and EgressGatewayConfig
and bridge the two before the "daemonCell" or anyone else gets access to it. I don't like it that much though. I think it'd be preferable to just do it cleanly once when we're far enough that we can do it.
Another thing here is that most likely EgressGatewayConfig
is a "datapath" configuration, e.g. it affects the operation of datapath (above e.g. pkg/datapath/loader/base.go) more than control-plane and it should likely live under pkg/datapath
and possibly stay internal to it. Control-plane components that manage the "egress gateway datapath" would then depend on the config either directly, or via some "egress gateway datapath API" to tell them what can and cannot be done. This is maybe less clear with egress gateway, but with datapath components that require probing to figure out the "actual" configuration, it makes sense to not expose the user provided config struct, but rather the configuration that's the result of merging user configuration and probe results.
tl;dr: let's do nothing in this PR about the config 😁
e1873b7
to
1bc0159
Compare
/test |
/ci-datapath |
/test |
/test Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
/test-1.25-4.19 Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
/mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 |
/test-runtime |
/test Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
/mlh new-flake Cilium-PR-K8s-1.25-kernel-4.19 👍 created #24602 |
/test-1.26-net-next |
/test |
/test |
runtime test flaked due to #24342 |
/test Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
Build finished. |
Introduce a constructor which takes a hive-compatible Params struct and wire things up into a global Cell. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
/test |
This is what CI driven development looks like! The test-1.24-5.4 is, I believe, unrelated. Jenkins doesn't even have logs for the cilium pods and one of @jibi PRs has the same failure. |
Do we have an issue for that? |
Filled #24667 for the CI failure |
These are the minimal changes needed to get the EGW into a hive Cell. The hacky part is that enabling the EGW is gated on the global option.Config.EnableIPv4EgressGateway which is references from a bunch of places. It's not entirely clear to me how to migrate that toggle yet, and it'll probably require a bunch of changes. I've kept the PR simple in the interests of making this easier to review.