Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better error reporting/catching in agent on nativeRoutingCIDR #16646

Merged
merged 2 commits into from Jun 29, 2021

Conversation

jibi
Copy link
Member

@jibi jibi commented Jun 24, 2021

See individual commits

@jibi jibi added area/daemon Impacts operation of the Cilium daemon. release-note/misc This PR makes changes that have no direct user impact. area/helm Impacts helm charts and user deployment experience labels Jun 24, 2021
@jibi jibi requested review from brb, borkmann and a team June 24, 2021 10:44
@jibi jibi requested review from a team as code owners June 24, 2021 10:44
@jibi jibi requested review from a team, ldelossa and qmonnet June 24, 2021 10:44
@jibi jibi requested a review from errordeveloper June 24, 2021 10:44
@jibi jibi force-pushed the pr/jibi/better-native-routing-cidr-errors branch 2 times, most recently from fedf731 to eff456e Compare June 24, 2021 10:51
@qmonnet qmonnet added the upgrade-impact This PR has potential upgrade or downgrade impact. label Jun 24, 2021
Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@jibi jibi force-pushed the pr/jibi/better-native-routing-cidr-errors branch from eff456e to d775eb4 Compare June 25, 2021 07:59
@bmcustodio
Copy link
Contributor

@jibi could you please rebase now that #16469 has been merged?

@jibi jibi force-pushed the pr/jibi/better-native-routing-cidr-errors branch from d775eb4 to 66433c5 Compare June 25, 2021 09:16
@jibi
Copy link
Member Author

jibi commented Jun 25, 2021

@jibi could you please rebase now that #16469 has been merged?

@bmcustodio done ✔️

I think there are still some issues with the helm documentation as for example the monitor key is getting the documentation for ipv4NativeRoutingCIDR (since that entire block is commented out)

@bmcustodio
Copy link
Contributor

I think there are still some issues with the helm documentation as for example the monitor key is getting the documentation for ipv4NativeRoutingCIDR (since that entire block is commented out)

Thank you 👍 I'm going to check that one out!

Make sure the provided range is actually a v4 CIDR.

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
@jibi jibi force-pushed the pr/jibi/better-native-routing-cidr-errors branch from 910a7b7 to dd69fd6 Compare June 28, 2021 07:14
@jibi
Copy link
Member Author

jibi commented Jun 28, 2021

test-me-please

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I question the use of cidr.MustParseCIDR causing a panic while the rest of error conditions in config.go perform a normal logged error and exit the application. But this isn't a change in your PR.

@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.2 Jun 28, 2021
@jibi
Copy link
Member Author

jibi commented Jun 28, 2021

Looks good to me.

I question the use of cidr.MustParseCIDR causing a panic while the rest of error conditions in config.go perform a normal logged error and exit the application. But this isn't a change in your PR.

Makes sense 👍 perhaps as a follow up we can replace MustParseCIDR which something that returns an error rather than panicking (and move the v4 range check inside that function)

@ldelossa
Copy link
Contributor

ldelossa commented Jun 28, 2021

Looks good to me.
I question the use of cidr.MustParseCIDR causing a panic while the rest of error conditions in config.go perform a normal logged error and exit the application. But this isn't a change in your PR.

Makes sense perhaps as a follow up we can replace MustParseCIDR which something that returns an error rather than panicking (and move the v4 range check inside that function)

That's my thinking, but low hanging fruit IMO. Nice to have.

@borkmann
Copy link
Member

retest-1.21-4.9

@jibi jibi added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 29, 2021
@borkmann borkmann merged commit 504e19e into master Jun 29, 2021
@borkmann borkmann deleted the pr/jibi/better-native-routing-cidr-errors branch June 29, 2021 06:27
@tklauser
Copy link
Member

While backporting for 1.10, I noticed that this PR was marked needs-backport/1.10.

However, it deprecates an existing option in the first commit (("daemon: rename native-routing-cidr option to ipv4-native-routing-cidr")), which is probably not what we want to do in a stable release?

Given that the deprecation notice was added to the 1.11 upgrade notes, do we really want to backport the full PR to 1.10? Or was the intention just to backport the second commit (config: add validation for IPv4NativeRoutingCIDR)?

@joestringer
Copy link
Member

@tklauser thanks for keeping an eye out for this, I agree that change should not be made in v1.10 as it can cause upgrade impact or break existing deployments during a point release.

@borkmann @jibi what was the intent here?

I'll drop the backport label until we figure this out.

@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Needs backport from master in 1.10.2 Jun 29, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 29, 2021
@jibi
Copy link
Member Author

jibi commented Jun 30, 2021

@tklauser thanks for keeping an eye out for this, I agree that change should not be made in v1.10 as it can cause upgrade impact or break existing deployments during a point release.

@borkmann @jibi what was the intent here?

I'll drop the backport label until we figure this out.

@joestringer I marked this for backport as I thought the upgrade impact was minimal (i.e. only a warning that --native-routing-cidr is deprecated and that --ipv4-native-routing-cidr should be used instead).

What I just realized is that this has a non trivial downgrade impact: if for example a user starts on 1.10.2 following the documentation (using the new --ipv4-native-routing-cidr) and then downgrades to 1.10.1, the downgrade will break their setup as the flag does not exist in that release. So I agree with the decision of dropping the label (sorry for the noise)

@joestringer
Copy link
Member

FWIW taking a look back over, while this could potentially catch some misconfiguration from users in a few circumstances, I don't think this PR really reaches the threshold where we should be backporting it. There is a cost of performing the backport and some degree of risk we take on. In this case I think it's OK for users to pick up these improvements with the next release.

For reference, the backport criteria definition is here: https://docs.cilium.io/en/latest/contributing/release/backports/#backport-criteria-for-current-minor-release.

qmonnet added a commit to qmonnet/cilium that referenced this pull request Jul 1, 2021
Fix Helm documentation after a recent update of the Helm values. Update
the spelling file accordingly.

Fixes: 792ed5a ("daemon: rename native-routing-cidr option to ipv4-native-routing-cidr")
Fixes: cilium#16646
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
joestringer pushed a commit that referenced this pull request Jul 1, 2021
Fix Helm documentation after a recent update of the Helm values. Update
the spelling file accordingly.

Fixes: 792ed5a ("daemon: rename native-routing-cidr option to ipv4-native-routing-cidr")
Fixes: #16646
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. area/helm Impacts helm charts and user deployment experience release-note/misc This PR makes changes that have no direct user impact. upgrade-impact This PR has potential upgrade or downgrade impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet