Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Also take secondary CIDRs into account when checking for validity of IPv4NativeRoutingCIDR #18653

Merged
merged 2 commits into from
Feb 10, 2022

Conversation

codablock
Copy link
Contributor

@codablock codablock commented Jan 28, 2022

The given IPv4NativeRoutingCIDR is not necessarily part of the primary
VPC CIDR and may as well be part of one of the secondary CIDRs. We should
take these into account as well before bailing out.

This PR also contains a refactoring commit to first move out the auto detection logic into its own function.

I encountered this issue in our VPC setup, which has a primary CIDR and multiple secondary CIDRs. These CIDRs are from two different classes of CIDRs: 172.16.0.0/12 and 100.64.0.0/10. We use the 172.16.0.0/12 CIDRs for nodes that must be routable inside our VPN. The 100.64.0.0/10 CIDRs are used for k8s/cilium internal IPs, e.g. POD IPs. The IPv4NativeRoutingCIDR that we specify via configuration is a CIDR that contains all the smaller CIDRs from the 100.64.0.0/10 class. This IPv4NativeRoutingCIDR however does not contain the primary VPC CIDR (which is from the 172.16.0.0/12 class), which then causes cilium to abort with a fatal message.

Also take secondary CIDRs into account when checking for validity of IPv4NativeRoutingCIDR

Related: #17762

@codablock codablock requested review from a team and gandro January 28, 2022 13:20
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 28, 2022
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me! One question around the missing ENI status however.

pkg/ipam/crd.go Outdated Show resolved Hide resolved
@gandro gandro added area/eni Impacts ENI based IPAM. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/ipam IP address management, including cloud IPAM labels Jan 31, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jan 31, 2022
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!

@gandro
Copy link
Member

gandro commented Jan 31, 2022

/test

Job 'Cilium-PR-K8s-GKE' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sLRPTests Checks local redirect policy LRP connectivity

Failure Output

FAIL: Cannot retrieve services on cilium pod

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-GKE so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.23-kernel-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sLRPTests Checks local redirect policy LRP connectivity

Failure Output

FAIL: Cannot retrieve services on cilium pod

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.23-kernel-net-next so I can create a new GitHub issue to track it.

@codablock
Copy link
Contributor Author

codablock commented Feb 2, 2022

I assume that test failures are not related to this PR?

Edit: I rebased on master to ensure that I don't miss any fixes in this PR. Looking at the code again, it can't be related as the tests are using the cluster-pool IPAM so the changed code path is not even executed.

@gandro
Copy link
Member

gandro commented Feb 2, 2022

I assume that test failures are not related to this PR?

Yes, looking at the output, I think so too. Thanks for the rebase. I'll re-run the tests.

@gandro
Copy link
Member

gandro commented Feb 2, 2022

/test

Job 'Cilium-PR-K8s-1.22-kernel-4.19' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDatapathConfig Iptables Skip conntrack for pod traffic

Failure Output

FAIL: Cannot flush conntrack table

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.22-kernel-4.19 so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.21-kernel-5.4' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDatapathConfig Iptables Skip conntrack for pod traffic

Failure Output

FAIL: Cannot flush conntrack table

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.21-kernel-5.4 so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.23-kernel-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDatapathConfig Iptables Skip conntrack for pod traffic

Failure Output

FAIL: Cannot flush conntrack table

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.23-kernel-net-next so I can create a new GitHub issue to track it.

Signed-off-by: Alexander Block <ablock84@gmail.com>
…gCIDR

The given IPv4NativeRoutingCIDR is not necessarely part of the primary
VPC CIDR and may as well be part of one of the secondary CIDRs. We should
take these into account as well before bailing out.

Signed-off-by: Alexander Block <ablock84@gmail.com>
@codablock
Copy link
Contributor Author

@gandro Tests were still failing, so I did a rebase + force-push again. Can you trigger tests again?

@gandro
Copy link
Member

gandro commented Feb 9, 2022

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 9, 2022
@jibi jibi merged commit e8b1210 into cilium:master Feb 10, 2022
@codablock
Copy link
Contributor Author

@gandro @jibi Is it possible to also backport this into 1.11? This would allow us to switch to a vanilla release when it gets released.

@codablock codablock deleted the vpc-secondary-ips branch February 10, 2022 14:56
@jibi jibi removed the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Feb 11, 2022
@jibi jibi added release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.11 labels Feb 11, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.11.2 Feb 11, 2022
@jibi
Copy link
Member

jibi commented Feb 11, 2022

@gandro @jibi Is it possible to also backport this into 1.11? This would allow us to switch to a vanilla release when it gets released.

@codablock I nominated this PR for v1.11 backporting as I think this should be classified as a bugfix

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.11.2 Feb 11, 2022
@jibi jibi added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.11 in 1.11.2 Feb 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.11 in 1.11.2 Feb 14, 2022
@christarazi
Copy link
Member

christarazi commented May 31, 2022

This likely fixes a bug in ENI mode where the user is utilizing secondary CIDRs and the cilium_host restoration logic doesn't account for these CIDRs (as it relies on the autodetection of the VPC CIDR which this PR fixes), introduced from #17762.

Marking as needs backport to v1.10.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/eni Impacts ENI based IPAM. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/ipam IP address management, including cloud IPAM
Projects
No open projects
1.11.2
Backport done to v1.11
Development

Successfully merging this pull request may close these issues.

None yet

5 participants