
egressgw: change special values for gatewayIP #24449

Merged (3 commits), Apr 12, 2023

Conversation

@MrFreezeex (Member) commented Mar 19, 2023

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request

  • All code is covered by unit and/or runtime tests where feasible.

  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.

  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.

  • All commits are signed off. See the section Developer’s Certificate of Origin

  • Provide a title or release-note blurb suitable for the release notes.

  • Are you a user of Cilium? Please add yourself to the Users doc

  • Thanks for contributing!

  • Release note: Change excluded CIDR "special" value of gatewayIP from 0.0.0.0 -> 0.0.0.1

@MrFreezeex MrFreezeex requested a review from a team as a code owner March 19, 2023 17:51
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 19, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Mar 19, 2023
@MrFreezeex (Member, Author) commented Mar 19, 2023

I know this needs further polishing on tests and possibly documentation, but I opened this to get initial feedback on whether this looks like the right way to do it, and whether traffic on egressgw that does not match any node should be dropped by default (or maybe this behavior should be guarded behind a non-default option?). I assumed this was the default behavior and had a bad surprise on my home setup (where I use this as some sort of proxy to reach the external world from some specific workloads).

@MrFreezeex MrFreezeex force-pushed the egressgw-drop-nonexistent branch 2 times, most recently from 47d318a to 5b0e2ff Compare March 19, 2023 18:04
@MrFreezeex MrFreezeex changed the title from "egressgw: drop traffic that does not any node" to "egressgw: drop traffic that does not match any node" Mar 19, 2023
@MrFreezeex MrFreezeex force-pushed the egressgw-drop-nonexistent branch 2 times, most recently from 7b8109c to 6dd2492 Compare March 19, 2023 20:50
@julianwiedmann julianwiedmann added feature/egress-gateway Impacts the egress IP gateway feature. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. labels Mar 28, 2023
Contributor

@ldelossa ldelossa left a comment

The change looks reasonable; however, I'd like either @julianwiedmann or @jibi to give a final ack for full context.

@jibi jibi self-requested a review March 30, 2023 07:30
@jibi (Member) commented Mar 30, 2023

Thanks for the PR @MrFreezeex! Couple of comments below.

First I think 2158822e021025b4c25ac0d054aef1a14de2cbf0 warrants its own PR as that's a separate bug fix.

Next, for the other 2 commits: I'm not opposed to the idea of explicitly dropping traffic selected by a CEGP that doesn't match any node (I think that's desirable), but this will cause some issues during upgrades/downgrades of the agent. For example:

  • let's say we have a policy with an excluded CIDR

  • Cilium gets upgraded and the agent installs the new bpf programs

  • these programs will now interpret the 0.0.0.0 gateway IP as a drop rather than an excluded CIDR, causing a short time window where traffic that should leave the cluster non-masqueraded gets dropped

  • eventually the egressgw manager updates the bpf maps, and connectivity is restored

One way to work around this would be to introduce the change over a couple of releases:

  • 1.x.y release: teach the datapath about 255.255.255.255 being an excluded CIDR while also keeping the existing 0.0.0.0 case, and make the agent use 255.255.255.255 for excluded CIDRs

  • 1.x.y+1 release: switch the 0.0.0.0 case in the datapath to a drop

but I need to think a bit more here (as that could still cause some drops, for example when downgrading from 1.x.y), so perhaps we can start with getting the first commit in?
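To make the upgrade window concrete, here is a minimal Go sketch of the two interpretations jibi describes. The constant and function names are invented for illustration, and the real sentinel handling lives in Cilium's BPF C datapath, not in Go:

```go
package main

import "fmt"

// Illustrative only: these names are hypothetical, and the actual
// checks are implemented in the BPF datapath programs.
const (
	excludedCIDROld = "0.0.0.0"         // pre-change sentinel for excluded CIDRs
	excludedCIDRNew = "255.255.255.255" // proposed 1.x.y sentinel for excluded CIDRs
)

// action models what a datapath program does with the gateway IP
// stored in an egress policy map entry.
func action(gatewayIP string, newDatapath bool) string {
	if !newDatapath {
		// Old programs: 0.0.0.0 means "excluded CIDR", skip egress NAT.
		if gatewayIP == excludedCIDROld {
			return "skip egress NAT"
		}
		return "SNAT via gateway"
	}
	// Proposed 1.x.y+1 programs: 0.0.0.0 becomes a drop.
	switch gatewayIP {
	case excludedCIDROld:
		return "drop"
	case excludedCIDRNew:
		return "skip egress NAT"
	}
	return "SNAT via gateway"
}

func main() {
	// The upgrade window: new programs are already installed while the
	// map still holds 0.0.0.0 entries written by the old agent.
	fmt.Println(action(excludedCIDROld, true)) // "drop": transient packet loss
}
```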

@MrFreezeex (Member, Author)

> Thanks for the PR @MrFreezeex! Couple of comments below.
>
> First I think 2158822 warrants its own PR as that's a separate bug fix.

Sure, I will do that.

> One way to work around this would be to introduce the change over a couple of releases:
>
> * `1.x.y` release: teach the datapath about `255.255.255.255` being an excluded CIDR while also keeping the existing `0.0.0.0` case, and make the agent use `255.255.255.255` for excluded CIDRs
>
> * `1.x.y+1` release: switch the `0.0.0.0` case in the datapath to a drop
>
> but I need to think a bit more here (as that could still cause some drops, for example when downgrading from 1.x.y), so perhaps we can start with getting the first commit in?

Migration is a very good point that I indeed didn't have in mind! I am thinking about an alternative solution, though: we could say that 255.255.255.255 would be the blocking special value and keep 0.0.0.0 as the current excluded CIDR case. It makes a bit less sense in terms of what the value means, but at least it would make for a smoother upgrade. WDYT?

@MrFreezeex (Member, Author)

> > First I think 2158822 warrants its own PR as that's a separate bug fix.
>
> Sure, I will do that.

Here is the separate PR for the first commit: #24646

@jibi (Member) commented Mar 30, 2023

> I am thinking about an alternative solution, though: we could say that 255.255.255.255 would be the blocking special value and keep 0.0.0.0 as the current excluded CIDR case. It makes a bit less sense in terms of what the value means, but at least it would make for a smoother upgrade. WDYT?

I thought about that as well, but as you pointed out it makes things less clear, so I'm not 100% sure we want to go this way 🤔

@pchaigno (Member)

> > I am thinking about an alternative solution, though: we could say that 255.255.255.255 would be the blocking special value and keep 0.0.0.0 as the current excluded CIDR case. [...] WDYT?
>
> I thought about that as well, but as you pointed out it makes things less clear, so I'm not 100% sure we want to go this way 🤔

Aren't those values implementation details? We dump them in the CLI commands, but maybe that's our mistake and we should convert them to e.g. block and excluded when showing the map content?
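As a rough illustration of that suggestion, a hedged Go sketch of how a map dump could translate the sentinels (the helper name is made up, this is not the actual `cilium bpf egress list` code, and the values follow the scheme being discussed at this point in the thread):

```go
// gatewayLabel renders sentinel gateway IPs as symbolic names when
// dumping the egress policy map, so the raw special values stay an
// implementation detail. Hypothetical helper, not actual Cilium code.
func gatewayLabel(gatewayIP string) string {
	switch gatewayIP {
	case "255.255.255.255":
		return "block" // proposed: policy matched no gateway node
	case "0.0.0.0":
		return "excluded" // existing: destination in an excluded CIDR
	default:
		return gatewayIP // a real gateway node IP
	}
}
```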

@MrFreezeex (Member, Author)

> > > I am thinking about an alternative solution, though: we could say that 255.255.255.255 would be the blocking special value and keep 0.0.0.0 as the current excluded CIDR case. [...]
> >
> > I thought about that as well, but as you pointed out it makes things less clear, so I'm not 100% sure we want to go this way 🤔
>
> Aren't those values implementation details? We dump them in the CLI commands, but maybe that's our mistake and we should convert them to e.g. block and excluded when showing the map content?

Yep, AFAIK they are, and they are never shown except with cilium bpf egress list anyway. The issue is more about the scenario described in #24449 (comment): people using excluded CIDRs would see a short connection drop at upgrade time if this PR goes in as is. Doing the opposite is not a problem for the user; it's more of a maintainability/readability issue, I guess. It's easier to conceptualize that 0/no IP (for 0.0.0.0) would mean block and "1"/broadcast (for 255.255.255.255) would mean excluded/route traffic without egressgw than the opposite. So there is a choice to make between code readability and possibly splitting the change across multiple releases (or something else?).

@MrFreezeex (Member, Author)

Ok, actually I might have another idea: we could add an optional setting, disabled by default for now, to block traffic when egressgw doesn't match anything. If enabled, it would block on 0.0.0.0; if disabled, it would ignore both 0.0.0.0 and 255.255.255.255. Then in a future release you could decide to either keep this option disabled as-is (so it wouldn't be a breaking change from the current behavior whatsoever) or enable it/drop the option.
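A short Go sketch of what that proposal could look like; the flag name and function are invented for illustration, and nothing like this was merged:

```go
// egressGatewayDropUnmatched models the proposed opt-in setting,
// disabled by default. Hypothetical: no such option exists in Cilium.
var egressGatewayDropUnmatched = false

// handleSentinel models the proposed behavior for the two sentinels.
func handleSentinel(gatewayIP string) string {
	switch gatewayIP {
	case "0.0.0.0":
		if egressGatewayDropUnmatched {
			return "drop" // opt-in: policy matched no gateway node
		}
		return "skip egress NAT" // default: preserve current behavior
	case "255.255.255.255":
		return "skip egress NAT" // excluded CIDR, always ignored
	}
	return "SNAT via gateway"
}
```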

@pchaigno (Member)

> Doing the opposite is not a problem for the user; it's more of a maintainability/readability issue, I guess. It's easier to conceptualize that 0/no IP (for 0.0.0.0) would mean block and "1"/broadcast (for 255.255.255.255) would mean excluded/route traffic without egressgw than the opposite.

Honestly, if it's just a matter of code readability, then I think that's a job for clear constant names (e.g., const specialExcludedValue = "0.0.0.0") and code comments. I agree ideally the other way around would be better, but I'm not sure it's worth the additional complexity of special upgrade/downgrade handling or new settings.
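Spelled out, that suggestion might look like the following (expanding pchaigno's inline example; constant names are illustrative, not actual Cilium code):

```go
const (
	// specialExcludedValue marks an excluded-CIDR entry in the egress
	// policy map: the datapath skips egress gateway SNAT for it.
	specialExcludedValue = "0.0.0.0"

	// specialBlockValue marks an entry whose policy matched no gateway
	// node: the datapath drops such traffic.
	specialBlockValue = "255.255.255.255"
)
```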

@MrFreezeex MrFreezeex force-pushed the egressgw-drop-nonexistent branch 4 times, most recently from 761161b to eed99de Compare April 3, 2023 15:35
@MrFreezeex MrFreezeex requested a review from a team as a code owner April 3, 2023 16:26
@MrFreezeex MrFreezeex requested a review from nebril April 3, 2023 16:26
@jibi (Member) commented Apr 11, 2023

/test

@jibi (Member) commented Apr 11, 2023

That worked to fix most of the failures; the only one left is for the new end-to-end test:

[2023-04-11T12:53:02.290Z] FAIL: Failed waiting for egress policy map entries
[2023-04-11T12:53:02.290Z] Expected
[2023-04-11T12:53:02.291Z]     <*errors.errorString | 0xc00060dc00>: {
[2023-04-11T12:53:02.291Z]         s: "could not ensure egress policy entries count is equal to 2: 4m0s timeout expired",
[2023-04-11T12:53:02.291Z]     }
[2023-04-11T12:53:02.291Z] to be nil
[2023-04-11T12:53:02.291Z] ===================== TEST FAILED =====================
[2023-04-11T12:53:02.291Z] 12:52:48 STEP: Running AfterFailed block for EntireTestsuite K8sDatapathEgressGatewayTest egress gw policy when the gateway is not found
[..]
[2023-04-11T12:53:02.291Z] Fetching command output from pods [cilium-c9gtp cilium-m5j7k]
[2023-04-11T12:53:02.291Z] cmd: kubectl exec -n kube-system cilium-c9gtp -c cilium-agent -- cilium bpf egress list
[2023-04-11T12:53:02.291Z] Exitcode: 0 
[2023-04-11T12:53:02.291Z] Stdout:
[2023-04-11T12:53:02.291Z]  	 

and the reason for that is that the source pods are not deployed (as we are using a separate context for that test).

Moving Context("egress gw policy when the gateway is not found", .. together with the other tests in the doContext function should ensure that everything we need for that test is deployed (and the new test will also run with all the configurations we are testing).
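In Ginkgo terms, the suggestion is roughly the following. This is a simplified sketch: the real K8sDatapathEgressGatewayTest suite uses Cilium's own test helpers, and deployEgressGwTestSetup here is a made-up placeholder for them:

```go
package k8sTest

import (
	. "github.com/onsi/ginkgo"
)

// deployEgressGwTestSetup is a hypothetical stand-in for the suite's
// deployment helpers (Cilium install plus client/source pods).
func deployEgressGwTestSetup(ciliumOpts map[string]string) {}

// doContext groups the egress gateway tests under one deployment, so
// the new "gateway is not found" Context shares the source pods and
// runs under every configuration the suite exercises.
func doContext(name string, ciliumOpts map[string]string) {
	Context(name, func() {
		BeforeEach(func() {
			// Deploys Cilium with the given options plus the source
			// pods that the standalone context was missing.
			deployEgressGwTestSetup(ciliumOpts)
		})

		Context("egress gw policy when the gateway is not found", func() {
			It("handles traffic for the unmatched policy", func() {
				// assert on egress policy map entries and connectivity
			})
		})

		// ...the other egress gateway tests live here as well...
	})
}
```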

@MrFreezeex (Member, Author)

> That worked to fix most of the failures; the only one left is for the new end-to-end test: [...]
>
> Moving Context("egress gw policy when the gateway is not found", .. together with the other tests in the doContext function should ensure that everything we need for that test is deployed (and the new test will also run with all the configurations we are testing).

Oh yes, indeed, sorry for that 🤦‍♂️. I was struggling to find the root cause for this, thanks!

@jibi (Member) commented Apr 11, 2023

/test

Member

@jibi jibi left a comment

Looks good now, thanks for bearing with me through the few iterations and for all the extra tests!

With regard to part 2: I don't think we need to wait for anything in particular, we can already start getting that into master, and label it for backport only once this PR is backported and released in v1.13

Review comment on bpf/tests/tc_egressgw_redirect.c (outdated, resolved)
0.0.0.0 now represents that the gateway has not been found (while it previously meant both gateway not found AND excluded CIDR), while 0.0.0.1 now represents an excluded CIDR. Future special values should all be included in the 0.0.0.0/8 range.

Both values (0.0.0.0, 0.0.0.1) currently behave like an excluded CIDR and skip the egressgw NAT-ing without dropping the packet, for backward compatibility during Cilium upgrades. In a future release, gateway not found (0.0.0.0) will drop the traffic.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Add a new test to cover the case where the egress gateway policy doesn't match any gateway.

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
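Restating the commit message's scheme as a small Go sketch (the names and Go form are illustrative; the enforcement actually lives in the BPF C datapath and the agent's map-writing code):

```go
package egressgw

import "net/netip"

// Sentinel gateway IPs per the commit message above. Names are
// hypothetical; the real definitions live in the datapath headers.
var (
	GatewayNotFound = netip.MustParseAddr("0.0.0.0") // no node matched the policy
	ExcludedCIDR    = netip.MustParseAddr("0.0.0.1") // destination excluded from egress NAT
	// Future special values are reserved inside 0.0.0.0/8.
	specialRange = netip.MustParsePrefix("0.0.0.0/8")
)

// IsSpecial reports whether a gateway IP is a sentinel rather than a
// real gateway node address.
func IsSpecial(gw netip.Addr) bool {
	return specialRange.Contains(gw)
}

// Action describes the transitional behavior: both sentinels skip
// egress gateway SNAT for now; in a future release GatewayNotFound is
// intended to drop instead.
func Action(gw netip.Addr) string {
	switch gw {
	case GatewayNotFound, ExcludedCIDR:
		return "skip egress NAT"
	}
	return "SNAT via gateway"
}
```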
@MrFreezeex (Member, Author) commented Apr 11, 2023

> Looks good now, thanks for bearing with me through the few iterations and for all the extra tests!

Thanks for bearing with me too and for all your guidance :D.

> With regard to part 2: I don't think we need to wait for anything in particular, we can already start getting that into master, and label it for backport only once this PR is backported and released in v1.13

Sure 👍, I can prepare something for it as soon as this gets merged! :)

@jibi (Member) commented Apr 11, 2023

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 12, 2023
@dylandreimerink dylandreimerink merged commit 85f1ae9 into cilium:master Apr 12, 2023
@jibi jibi added the release-blocker/1.13 This issue will prevent the release of the next version of Cilium. label Apr 12, 2023
@gandro (Member) commented Apr 12, 2023

@jibi Unfortunately this does not apply cleanly to v1.13. Marking with backport/author, assuming you will be able to do the backport to the v1.13 branch.

@gandro gandro added the backport/author The backport will be carried out by the author of the PR. label Apr 12, 2023
@jibi jibi added backport/author The backport will be carried out by the author of the PR. and removed backport/author The backport will be carried out by the author of the PR. labels Apr 12, 2023
@jibi jibi mentioned this pull request Apr 12, 2023
1 task
@jibi jibi added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Apr 12, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.13 in 1.13.2 Apr 12, 2023
@julianwiedmann (Member)

#24849 is merged, updating the label to done.

@julianwiedmann julianwiedmann added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Apr 14, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.2 Apr 14, 2023
Labels

  • backport/author: The backport will be carried out by the author of the PR.
  • backport-done/1.13: The backport for Cilium 1.13.x for this PR is done.
  • feature/egress-gateway: Impacts the egress IP gateway feature.
  • kind/community-contribution: This was a contribution made by a community member.
  • ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
  • release-blocker/1.13: This issue will prevent the release of the next version of Cilium.
  • release-note/misc: This PR makes changes that have no direct user impact.
  • sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects

1.13.2: Backport done to v1.13 (Status: Released)

9 participants