Changes to support IPv6 addresses on nodes #268

Merged 1 commit on Sep 1, 2021

@sdmodi sdmodi commented Aug 20, 2021

These changes populate the Node object with the IPv6 address as well as the IPv6 podCIDR. This is done only for clusters with stackType as IPV4_IPV6. This patch handles both the internal and external IPv6 addresses.
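
The address handling described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `NodeAddress` is a local stand-in for `v1.NodeAddress`, and `nodeAddresses` is a hypothetical helper.

```go
package main

import "fmt"

// NodeAddress is a simplified stand-in for v1.NodeAddress (assumption).
type NodeAddress struct {
	Type    string
	Address string
}

const dualStack = "IPV4_IPV6"

// nodeAddresses always reports the IPv4 address and adds the IPv6
// address only when the cluster's stack type is IPV4_IPV6.
func nodeAddresses(stackType, ipv4, ipv6 string) []NodeAddress {
	addrs := []NodeAddress{{Type: "InternalIP", Address: ipv4}}
	if stackType == dualStack && ipv6 != "" {
		addrs = append(addrs, NodeAddress{Type: "InternalIP", Address: ipv6})
	}
	return addrs
}

func main() {
	fmt.Println(nodeAddresses("IPV4_IPV6", "10.0.0.5", "fd20::5"))
	fmt.Println(nodeAddresses("", "10.0.0.5", "fd20::5"))
}
```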

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 20, 2021
@k8s-ci-robot
Contributor

Hi @sdmodi. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 20, 2021
@sdmodi
Author

sdmodi commented Aug 20, 2021

/cc @basantsa1989

@k8s-ci-robot
Contributor

@sdmodi: GitHub didn't allow me to request PR reviews from the following users: basant1989.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @Basant1989

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@MrHohn
Member

MrHohn commented Aug 20, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 20, 2021
@sdmodi
Author

sdmodi commented Aug 24, 2021

/test cloud-provider-gcp-e2e-full

@sdmodi
Author

sdmodi commented Aug 24, 2021

Joe/Jiahui, would you please review this? Thanks!

@sdmodi
Author

sdmodi commented Aug 24, 2021

/test cloud-provider-gcp-e2e-full

@sdmodi
Author

sdmodi commented Aug 25, 2021

/assign @bowei

@@ -392,6 +397,10 @@ func generateCloudConfig(configFile *ConfigFile) (cloudConfig *CloudConfig, err
		cloudConfig.SecondaryRangeName = configFile.Global.SecondaryRangeName
	}

	if configFile != nil && configFile.Global.StackType != "" {
Contributor

Can we skip this check and just let it be set to "" anyway?
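
A small sketch of why the extra check is redundant: the zero value of a Go string field is already "", so copying an empty value changes nothing. The struct types here are simplified stand-ins for the real config types, and both helper functions are hypothetical.

```go
package main

import "fmt"

// Simplified stand-ins for ConfigFile and CloudConfig (assumptions).
type globalConfig struct{ StackType string }
type configFile struct{ Global globalConfig }
type cloudConfig struct{ StackType string }

// withCheck copies StackType only when it is non-empty.
func withCheck(cf *configFile) cloudConfig {
	var cc cloudConfig
	if cf != nil && cf.Global.StackType != "" {
		cc.StackType = cf.Global.StackType
	}
	return cc
}

// withoutCheck copies StackType unconditionally; an empty value
// leaves the field at its zero value, so the result is the same.
func withoutCheck(cf *configFile) cloudConfig {
	var cc cloudConfig
	if cf != nil {
		cc.StackType = cf.Global.StackType
	}
	return cc
}

func main() {
	inputs := []*configFile{nil, {}, {Global: globalConfig{StackType: "IPV4_IPV6"}}}
	for _, cf := range inputs {
		fmt.Println(withCheck(cf) == withoutCheck(cf))
	}
}
```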

@@ -116,6 +118,27 @@ func (g *Cloud) NodeAddresses(ctx context.Context, nodeName types.NodeName) ([]v
	}
	nodeAddresses = append(nodeAddresses, v1.NodeAddress{Type: v1.NodeInternalIP, Address: internalIP})

	if g.stackType == "IPV4_IPV6" {
		// Handling only the internal v6 address. External v6 addresses will be handled once vendor apis are updated.
Contributor

We should remove or update this comment now that we support external too.

		if internalIPV6 != "" {
			nodeAddresses = append(nodeAddresses, v1.NodeAddress{Type: v1.NodeInternalIP, Address: internalIPV6})
		} else {
			klog.Warningf("internal IPV6 range is empty")
Contributor

Handle the external IPV6 addresses right here?

Author

We don't need to. Both the internal and external IPv6 addresses are written to this same field in the metadata.

Contributor

Cool!
Nit: Rename 'internalIPV6s' and 'internalIPV6Arr' to something generic like ipv6s, ipv6Arr.

Member

Could we include the node name in the warning log - might be helpful when debugging.

Author

Yes. Done.
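
For illustration, a warning that carries the node name might be built like this. The exact wording used in the merged change is not shown in this conversation, so the message text and the `emptyIPv6Warning` helper are assumptions; the standard `log` package stands in for klog.

```go
package main

import (
	"fmt"
	"log"
)

// emptyIPv6Warning builds a warning that names the affected node,
// which makes the log line searchable when debugging a single node.
func emptyIPv6Warning(nodeName string) string {
	return fmt.Sprintf("IPv6 range is empty for node %s", nodeName)
}

func main() {
	log.Print(emptyIPv6Warning("node-a"))
}
```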

}

return nodeAddresses, nil
}

func getIPV6AddressFromInterface(nic *computealpha.NetworkInterface) string {
	ipv6Addr := nic.Ipv6Address
	if ipv6Addr == "" && nic.Ipv6AccessType == "EXTERNAL" {
Contributor

The GCE team has discussed supporting both public and private IPv6 addresses on the same interface. We may have to revisit this if that happens.

Author

Yes. For now, VMs just get one IPv6 address.

Member

Question about the access type - does "INTERNAL" correspond to directpath? What happens if the customer enables both dualstack and directpath?

Author

INTERNAL does not directly correspond to directpath. INTERNAL means that the subnet has only private IPv6 addresses. Today directpath uses INTERNAL addresses. When dual stack is enabled with directpath on a subnet with INTERNAL addresses, everything just works. Directpath and dual stack use the same IPs. Directpath on subnets with EXTERNAL addresses is not working currently. The GCE team is trying to figure something out.

Member

Got it, thanks for the clarification. Makes sense; it seems we can worry about the directpath use case later.
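
The diff only shows the first lines of getIPV6AddressFromInterface, so the following is a hedged sketch of how the internal and external cases could be resolved, not the merged implementation. The struct types are local stand-ins; the `Ipv6AccessConfigs` and `ExternalIpv6` field names are assumed to mirror the compute API client.

```go
package main

import "fmt"

// Local stand-ins for the compute API types (assumptions).
type accessConfig struct{ ExternalIpv6 string }

type networkInterface struct {
	Ipv6Address       string
	Ipv6AccessType    string
	Ipv6AccessConfigs []*accessConfig
}

// ipv6FromInterface prefers the interface's own IPv6 address and falls
// back to the first external address when the access type is EXTERNAL.
// It assumes a VM has at most one IPv6 address, as stated in the thread.
func ipv6FromInterface(nic *networkInterface) string {
	if nic.Ipv6Address != "" {
		return nic.Ipv6Address
	}
	if nic.Ipv6AccessType == "EXTERNAL" {
		for _, ac := range nic.Ipv6AccessConfigs {
			if ac != nil && ac.ExternalIpv6 != "" {
				return ac.ExternalIpv6
			}
		}
	}
	return ""
}

func main() {
	internal := &networkInterface{Ipv6Address: "fd20::5", Ipv6AccessType: "INTERNAL"}
	external := &networkInterface{
		Ipv6AccessType:    "EXTERNAL",
		Ipv6AccessConfigs: []*accessConfig{{ExternalIpv6: "2600:1900::1"}},
	}
	fmt.Println(ipv6FromInterface(internal))
	fmt.Println(ipv6FromInterface(external))
}
```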

Author

@sdmodi sdmodi left a comment

Addressed the other comments

@sdmodi
Author

sdmodi commented Aug 25, 2021

/test cloud-provider-gcp-verify-all

@jiahuif
Member

jiahuif commented Aug 25, 2021

There is a govet error

[*] Verifying govet...
# k8s.io/cloud-provider-gcp/cmd/gke-gcloud-auth-plugin
cmd/gke-gcloud-auth-plugin/main.go:99:37: k8s.io/apimachinery/pkg/apis/meta/v1.Time composite literal uses unkeyed fields
cmd/gke-gcloud-auth-plugin/main.go:113:27: k8s.io/apimachinery/pkg/apis/meta/v1.Time composite literal uses unkeyed fields

@jiahuif
Member

jiahuif commented Aug 25, 2021

sent a PR to fix that #269
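
For context, the vet finding above is about struct literals of imported types written without field names. A minimal sketch of the keyed form follows; `Time` here is a local stand-in for metav1.Time (which embeds time.Time), and `keyedYear` is a hypothetical helper used only to make the example checkable.

```go
package main

import (
	"fmt"
	"time"
)

// Time is a local stand-in for metav1.Time (assumption).
type Time struct {
	time.Time
}

// keyedYear builds the wrapper with a keyed field, the form that
// go vet's composites check accepts for imported struct types.
// The unkeyed form it reports would look like: metav1.Time{now}
func keyedYear() int {
	now := time.Date(2021, time.August, 25, 0, 0, 0, 0, time.UTC)
	keyed := Time{Time: now}
	return keyed.Year()
}

func main() {
	fmt.Println(keyedYear())
}
```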

@@ -166,6 +166,9 @@ type Cloud struct {
	s *cloud.Service

	metricsCollector loadbalancerMetricsCollector
	// stackType indicates whether the cluster is a single stack IPv4, single
	// stack IPv6 or a dual stack cluster
	stackType string
Member

Can we have the list of all possible values as constants? Something like:

type StackType string
const NetworkStackDualStack StackType = "IPV4_IPV6"

Member

This way we can eliminate the magic string below.

Author

Done
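
A short sketch of the typed-constant approach suggested above. The `StackType` type and `NetworkStackDualStack` constant come from the review; the `cloud` struct and `isDualStack` method are hypothetical. Note that a field compared against the typed constant needs to be of type StackType too, or be converted explicitly.

```go
package main

import "fmt"

type StackType string

const NetworkStackDualStack StackType = "IPV4_IPV6"

// cloud is a minimal stand-in for the provider's Cloud struct (assumption).
type cloud struct {
	stackType StackType
}

// isDualStack compares against the named constant instead of a literal.
func (c *cloud) isDualStack() bool {
	return c.stackType == NetworkStackDualStack
}

func main() {
	raw := "IPV4_IPV6" // for example, a value read from a config file
	c := &cloud{stackType: StackType(raw)}
	fmt.Println(c.isDualStack())
}
```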

@sdmodi sdmodi force-pushed the ipv6-changes branch 2 times, most recently from 597be0f to 8e0bf35 Compare August 30, 2021 17:48
@jiahuif
Member

jiahuif commented Aug 30, 2021

I wonder if there is a way to test this part of the code right now. There is a fakeGCP, so it seems that integration tests are possible.

@sdmodi
Author

sdmodi commented Aug 30, 2021

I thought through how we could test this. The real issue is that the underlying code calls the metadata server which does not have the right architecture to fake.

I can add tests for the part of the code that calls GCP API. Let me take a stab at doing that. Thanks!

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 31, 2021
@sdmodi
Author

sdmodi commented Aug 31, 2021

I have added unit tests wherever I could. Please take a look.

@basantsa1989 would you please review this again. Thanks!

Contributor

@basantsa1989 basantsa1989 left a comment

/lgtm
Kudos on the test coverage :)

@@ -117,6 +119,27 @@ func (g *Cloud) NodeAddresses(ctx context.Context, nodeName types.NodeName) ([]v
	}
	nodeAddresses = append(nodeAddresses, v1.NodeAddress{Type: v1.NodeInternalIP, Address: internalIP})

	if g.stackType == NetworkStackDualStack {
		// Both internal and external IPv6 addresses are written to this array
		internalIPV6s, err := metadata.Get(fmt.Sprintf(networkInterfaceIPV6, nic))
Contributor

Nit: Since we handle both internal and external IPv6 addresses, we can rename 'internalIPV6s' and 'internalIPV6Arr' to something generic like ipv6s, ipv6Arr.

Author

Done
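
As a rough illustration of handling the metadata value with the generic naming suggested above: the sketch below assumes the metadata server returns whitespace- or newline-separated addresses, which this conversation does not confirm. `firstIPv6` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// firstIPv6 returns the first valid IPv6 address found in a
// whitespace-separated metadata response, or "" if there is none.
func firstIPv6(raw string) string {
	for _, field := range strings.Fields(raw) {
		ip := net.ParseIP(field)
		// To4 returns nil for addresses that are not IPv4.
		if ip != nil && ip.To4() == nil {
			return ip.String()
		}
	}
	return ""
}

func main() {
	ipv6s := "fd20::5\n"
	fmt.Println(firstIPv6(ipv6s))
}
```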

@k8s-ci-robot
Contributor

@basantsa1989: changing LGTM is restricted to collaborators

In response to this:

/lgtm
Kudos on the test coverage :)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jiahuif
Member

jiahuif commented Sep 1, 2021

lgtm for the change to the cloud provider. I do not have enough expertise with networking, though.

@sdmodi
Author

sdmodi commented Sep 1, 2021

/assign @MrHohn

Zihong, would you please take a look at these changes from a networking point of view. Thanks!

Member

@MrHohn MrHohn left a comment

Got a few questions on the stacktype but overall LGTM.

providers/gce/gce.go
@@ -393,6 +404,10 @@ func generateCloudConfig(configFile *ConfigFile) (cloudConfig *CloudConfig, err
		cloudConfig.SecondaryRangeName = configFile.Global.SecondaryRangeName
	}

	if configFile != nil {
		cloudConfig.StackType = configFile.Global.StackType
Member

What would be the default value for this StackType field? Is empty the same as IPV4?

Author

We should treat empty the same as IPV4

Member

Yep, seems good. I was wondering what would happen when this is used with old clusters.

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2021
@@ -94,6 +94,12 @@ var _ cloudprovider.Zones = (*Cloud)(nil)
var _ cloudprovider.PVLabeler = (*Cloud)(nil)
var _ cloudprovider.Clusters = (*Cloud)(nil)

type StackType string

const NetworkStackDualStack StackType = "IPV4_IPV6"
Member

Is there a reference for where this constant comes from?

Member

These are in the comments of the Go SDK.
https://pkg.go.dev/google.golang.org/api@v0.56.0/compute/v0.alpha
(search for "IPV4_IPV6")
Unfortunately, the generated code does not have the constants directly available.

Author

This is actually being set by the GKE cluster server. This value is going to be written to gce.conf.
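
To illustrate how a value written to gce.conf could be picked up, and why older clusters without the key fall back to an empty stack type (treated as IPv4 per the thread), here is a minimal sketch. It assumes an INI-style file with a `[global]` section; the `stack-type` key name and the `stackTypeFromConf` helper are assumptions, and the real provider uses its own config parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stackTypeFromConf scans INI-style text for a stack-type key inside
// the [global] section and returns "" when the key is absent.
func stackTypeFromConf(conf string) string {
	section := ""
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";"):
			// Skip blank lines and comments.
		case strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]"):
			section = strings.ToLower(strings.TrimSpace(line[1 : len(line)-1]))
		case section == "global":
			parts := strings.SplitN(line, "=", 2)
			if len(parts) == 2 && strings.TrimSpace(parts[0]) == "stack-type" {
				return strings.Trim(strings.TrimSpace(parts[1]), `"`)
			}
		}
	}
	return ""
}

func main() {
	newConf := "[global]\nproject-id = demo\nstack-type = IPV4_IPV6\n"
	oldConf := "[global]\nproject-id = demo\n"
	fmt.Printf("%q\n", stackTypeFromConf(newConf))
	fmt.Printf("%q\n", stackTypeFromConf(oldConf))
}
```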

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: basantsa1989, MrHohn, sdmodi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@MrHohn
Member

MrHohn commented Sep 1, 2021

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2021
@k8s-ci-robot k8s-ci-robot merged commit 7f05bd0 into kubernetes:master Sep 1, 2021
Fedosin added a commit to Fedosin/cloud-provider-gcp that referenced this pull request Oct 14, 2021
A recent change, kubernetes#268, introduced IPv6 support, but it also began to
require Google Cloud Compute Alpha API access. Unfortunately, projects don't
have this access by default, and the CCM therefore fails with:

googleapi: Error 403: Required 'Alpha Access' permission for 'Compute API', forbidden

This commit starts using Alpha API only for dual stack deployments,
where it's really required. For all other cases we will continue to
use GA API.

Fixes: kubernetes#281
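
The idea in this follow-up commit can be sketched as a simple selection on the stack type. This is an illustration only: `computeAPIVersion` is a hypothetical helper, and the "alpha" and "v1" strings are labels for the two API surfaces rather than values taken from the actual fix.

```go
package main

import "fmt"

const dualStack = "IPV4_IPV6"

// computeAPIVersion picks the Alpha API only for dual-stack clusters,
// where the IPv6 fields are needed, and the GA API otherwise.
func computeAPIVersion(stackType string) string {
	if stackType == dualStack {
		return "alpha"
	}
	return "v1"
}

func main() {
	for _, st := range []string{"", "IPV4", "IPV4_IPV6"} {
		fmt.Printf("%q -> %s\n", st, computeAPIVersion(st))
	}
}
```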
Fedosin added a commit to Fedosin/cloud-provider-gcp that referenced this pull request Oct 27, 2021
aojea pushed a commit to aojea/cloud-provider-gcp that referenced this pull request Feb 9, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants