Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Node IP families configurable #251

Merged
merged 1 commit into from Sep 2, 2021

Conversation

olemarkus
Copy link
Member

@olemarkus olemarkus commented Aug 21, 2021

Many pods (e.g metrics-server) will always pick the first InternalIP regardless of the family the pod itself has. This will break for any pod not running on host network if the Pod only has an ipv6 address.

This PR will make which IP families to add to Node objects configurable through cloud config.

Does this PR introduce a user-facing change?:

IP families added to Node objects are now configurable though cloudconfig. The default configuration is ipv4 only.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 21, 2021
@hakman
Copy link
Member

hakman commented Aug 21, 2021

#230 (comment)

@olemarkus
Copy link
Member Author

Right. This is likely to break the other way if someone has an ipv4-only cluster then. I think we need that config option that says which families to add and the ordering.

@hakman
Copy link
Member

hakman commented Aug 22, 2021

Right. This is likely to break the other way if someone has an ipv4-only cluster then. I think we need that config option that says which families to add and the ordering.

Or look at cluster/pod CIDR?

@olemarkus
Copy link
Member Author

olemarkus commented Aug 22, 2021

What CIDR we use for the Pods doesn't say anything about what families we should use for Nodes really.

Also, --cluster-cidr flag is probably just copy/pasted from KCM. It is used by the RangeAllocator of the NodeIpam controller, but the code for this isn't included in AWS CCM. Also the suggestion I have for a cloud Node ipam doesn't use this flag, but looks at the instance data instead.

Kubelet uses --bind-address for this. We could do the same. I do find this behavior somewhat surprising, but at least consistent.

@hakman
Copy link
Member

hakman commented Aug 22, 2021

What CIDR we use for the Pods doesn't say anything about what families we should use for Nodes really.

If one has the Pod CIDR set to some IPv6 address or a dual-stack list, it kind of expresses the preference about the order in which the CCM should send the addresses. Was hoping that such info can be retrieved via the k8s api or some more elegant way that to set it as a flag.

--bind-address not quite sure what CCM does with this, but I think k8s treats 0.0.0.0 or :: in the same way and by default it listens to all.

@olemarkus
Copy link
Member Author

What CIDR we use for the Pods doesn't say anything about what families we should use for Nodes really.

If one has the Pod CIDR set to some IPv6 address or a dual-stack list, it kind of expresses the preference about the order in which the CCM should send the addresses. Was hoping that such info can be retrieved via the k8s api or some more elegant way that to set it as a flag.

It is CCM and KCM (depending on which of these runs the NodeIpam controller) that is the canonical source for the CIDRs that Nodes can assign to Pods. So this has to be decided by the flags set on which of these have the IPAM controller running.

But as mentioned I think the Node's IP adresses and PodCIDRs are two different concerns.

--bind-address not quite sure what CCM does with this, but I think k8s treats 0.0.0.0 or :: in the same way and by default it listens to all.

Sorry, I confused this with the --node-ip flag.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 25, 2021
@olemarkus olemarkus force-pushed the ipv6-first branch 2 times, most recently from 96bd86c to 93170fe Compare August 25, 2021 10:00
@olemarkus olemarkus changed the title Put ipv6 internal IPs before ipv4 Make Node IP families configurable Aug 25, 2021
for _, family := range c.cfg.Global.NodeIPFamilies {
switch family {
case "ipv4":
if c.selfAWSInstance.nodeName == name || len(name) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At best, this path seems like an unnecessary optimization given that kubelet is no longer a caller (as opposed to the legacy-cloud-provider). At worst, it maintains a non-obvious dependency on instance metadata. I think its right to leave the behavior as-is in this PR, and we can consider removing this path in a future change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I was wondering if not NodeAddresses(nodename) should just perform the instanceID lookup and then call NodeAddressesByProviderID(instanceID). The code is overly redundant right now. I can do that in another PR after this one has merged.

sort.Slice(macIDs, func(i, j int) bool {
return macDevNum[macIDs[i]] < macDevNum[macIDs[j]]
})
func (c *Cloud) ipv4Addresses() ([]v1.NodeAddress, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this function is only exercised by CCM for the instance its running on, and exclusively retrieves addresses via metadata, I'd advocate for "metadata", "IMDS", or "local" in the function name to make it clear when looking at the caller.

return addresses, nil
}

func (c *Cloud) ipv6Addresses() ([]v1.NodeAddress, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above, rename function to be specific to metadata.

@@ -1621,7 +1710,7 @@ func parseMetadataLocalHostname(metadata string) (string, []string) {
}

// extractNodeAddresses maps the instance information from EC2 to an array of NodeAddresses
func extractNodeAddresses(instance *ec2.Instance) ([]v1.NodeAddress, error) {
func extractIPv4NodeAddresses(instance *ec2.Instance) ([]v1.NodeAddress, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment or rename the variables to make it clear that privateDNSName and publicDNSName are ipv4 specific?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this?

Both which families to include and which the ordering of the families
@nckturner
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 2, 2021
@nckturner
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nckturner, olemarkus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit a43d818 into kubernetes:master Sep 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants