docs: Fix egress gateway getting started guide #15984
Conversation
This commit fixes various issues discovered while testing the egress gateway getting started guide:

- Use the correct Helm value (`egressGateway.enabled`) to enable the feature.
- Use the correct field names in the example `CiliumEgressNATPolicy`.
- Set the Kubernetes namespace in `helm install` as we do in other `helm install` invocations.
- Ensure that the `egress-ip-assign` deployment is always co-located with the example workload via pod affinity.
- The examples use `curl` to access the external service, so use `curl` in the access log as well.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
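For illustration, the corrected Helm value from the first fix could also be supplied via a values file instead of a `--set` flag. A minimal sketch (the file name and surrounding chart values are assumed; only the `egressGateway.enabled` key comes from this PR):

```yaml
# values.yaml (fragment only, hypothetical file name)
# Enables the egress gateway feature using the corrected Helm value.
egressGateway:
  enabled: true
```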
🚀
/me casts resurrect spell Hey @gandro, do you still remember why this part was needed:
Discussing with @jibi, having the workload on the same node as the Egress Gateway shouldn't be necessary at all.
It's been a while, so I might be wrong here - but from what I can reconstruct, this was not about the egress gateway itself, but about the node configuration. My understanding is that the node where the egress gateway is running needs the egress IP to be set, and this change made it such that we're setting the IP on the gateway node, and not just on a random node. Looking at the change, I guess the affinity should be based not on the workload, but on the gateway itself, though I'm not sure how to achieve that. The docs here also refer to the OSS version of egress gateway, which might behave differently from what Cilium Enterprise offers - I honestly lack the knowledge here. @jibi would know better.
Assigning the IP to the node makes it the gateway node. But maybe what we'd want here is the opposite of the current action: to ensure that the egress gateway is not the node on which the client pod is running. That way users can see the full effect of the egress gateway feature (i.e., redirect to egress gateway + SNAT instead of just SNAT currently).
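The "opposite of the current action" suggested above could be sketched with a pod anti-affinity rule. This is illustrative only, not part of the PR: the `app: client` label on the client workload and the container spec are assumptions.

```yaml
# Sketch: keep the egress-ip-assign pod OFF the node running the client
# workload, so the redirect-to-gateway hop is actually exercised.
# "app: client" is a hypothetical label on the client pod.
apiVersion: v1
kind: Pod
metadata:
  name: egress-ip-assign
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: client
        topologyKey: kubernetes.io/hostname
  containers:
  - name: egress-ip-assign
    image: docker.io/library/busybox:1.31.1  # placeholder image
    command: ["sleep", "infinity"]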
For reference, the test-VM instead suggests using a
correct
We are setting the IP on the node where the workload is running, which works, I guess, but at the same time it was confusing us a bit (the comment seems to imply that we specifically need the node with the workload, while we actually don't).
agree with this 👍
yep, we should probably change the example
Won't this be hard to achieve in an arbitrary customer environment? We don't know the names and labels of nodes. It makes sense for the CI, where we control all this, but I'm not sure it does for a user guide.
I mean, we should just use the same example selector:

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumEgressNATPolicy
metadata:
  name: egress-test
spec:
  #[..]
  egressGateway:
    nodeSelector:
      matchLabels:
        example-label: test
```

and then in the pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: egress-ip-assign
spec:
  #[..]
  nodeSelector:
    example-label: test
```

(also not sure why we are currently suggesting to use a
That requires the user to add the appropriate label on the node. One more step that isn't really necessary in this context, IMO. Why do we care which exact node the egress gateway ends up being? I understand for tests (to reduce randomness), but I don't see the point for a user guide. An anti-affinity rule seems enough to me.
with the current state of the feature: yes, but once we add support for
With these fixes applied, I was able to successfully validate this guide.