Cloud controller is unable to patch node with IP addresses from cloud provider #266

Closed · ivanfilippov opened this issue on Oct 11, 2019 · 25 comments
Labels: kind/bug (Categorizes issue or PR as related to a bug.)
Milestone: v1.1.0

@ivanfilippov:

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
Cloud controller is unable to patch node with IP addresses from cloud provider on a multi-NIC VM.

```
I1011 05:22:49.646435       1 instances.go:158] instances.InstanceExistsByProviderID() CACHED with 42348b4e-4ba1-e452-e946-6d0e9a323160
I1011 05:22:49.646457       1 instances.go:78] instances.NodeAddressesByProviderID() CACHED with 42348b4e-4ba1-e452-e946-6d0e9a323160
E1011 05:22:49.656899       1 node_controller.go:188] Error patching node with cloud ip addresses = [failed to patch status "{\"status\":{\"$setElementOrder/addresses\":[{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"Hostname\"},{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"}],\"addresses\":[{\"address\":\"172.16.202.44\",\"type\":\"ExternalIP\"},{\"address\":\"172.16.201.44\",\"type\":\"ExternalIP\"},{\"address\":\"172.16.45.44\",\"type\":\"ExternalIP\"},{\"address\":\"172.16.202.44\",\"type\":\"InternalIP\"},{\"address\":\"172.16.201.44\",\"type\":\"InternalIP\"},{\"address\":\"172.16.45.44\",\"type\":\"InternalIP\"}]}}" for node "k8s-prod5-worker-2": The order in patch list:
[map[address:172.16.202.44 type:ExternalIP] map[address:172.16.201.44 type:ExternalIP] map[address:172.16.45.44 type:ExternalIP] map[address:172.16.202.44 type:InternalIP] map[address:172.16.201.44 type:InternalIP] map[address:172.16.45.44 type:InternalIP]]
 doesn't match $setElementOrder list:
[map[type:ExternalIP] map[type:InternalIP] map[type:Hostname] map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP]]
]
```
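
For context (editor's note, not from the original report): node status is updated with a strategic merge patch, and the `addresses` list is merged using the address `type` field as its merge key. A multi-NIC VM reports several addresses per type, so the generated `$setElementOrder` directive contains duplicate keys and applying the patch fails. A minimal sketch that reproduces the same class of failure with the apimachinery library (the addresses here are illustrative, not the reporter's exact set):

```go
package main

import (
	"encoding/json"
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

func main() {
	// Old status: one address per type, as on a single-NIC VM.
	oldNode := v1.Node{Status: v1.NodeStatus{Addresses: []v1.NodeAddress{
		{Type: v1.NodeExternalIP, Address: "172.16.202.44"},
		{Type: v1.NodeInternalIP, Address: "172.16.202.44"},
		{Type: v1.NodeHostName, Address: "k8s-prod5-worker-2"},
	}}}

	// New status: a multi-NIC VM reports several addresses per type,
	// i.e. duplicate values of the merge key "type".
	newNode := v1.Node{Status: v1.NodeStatus{Addresses: []v1.NodeAddress{
		{Type: v1.NodeExternalIP, Address: "172.16.202.44"},
		{Type: v1.NodeExternalIP, Address: "172.16.201.44"},
		{Type: v1.NodeInternalIP, Address: "172.16.202.44"},
		{Type: v1.NodeInternalIP, Address: "172.16.201.44"},
	}}}

	oldJSON, _ := json.Marshal(oldNode)
	newJSON, _ := json.Marshal(newNode)

	// Build the two-way strategic merge patch for the status update.
	patch, err := strategicpatch.CreateTwoWayMergePatch(oldJSON, newJSON, v1.Node{})
	if err != nil {
		fmt.Println("create patch:", err)
		return
	}

	// Applying it trips over the duplicate merge keys, producing the
	// "doesn't match $setElementOrder list" error quoted above.
	if _, err := strategicpatch.StrategicMergePatch(oldJSON, patch, v1.Node{}); err != nil {
		fmt.Println("apply patch:", err)
	}
}
```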

What you expected to happen:
Nodes are patched with IP addresses from cloud provider.

How to reproduce it (as minimally and precisely as possible):
Build a worker node with 3 NICs and attempt to deploy the cloud controller.

Anything else we need to know?:
This was encountered in the openstack provider as well and was fixed there (kubernetes/cloud-provider-openstack#407).

Environment:

  • vsphere-cloud-controller-manager version: latest from github
  • OS (e.g. from /etc/os-release): RancherOS v1.5.4
  • Kernel (e.g. uname -a): 4.14.138-rancher
  • Install tools: cluster installed via RKE
  • Others:
@k8s-ci-robot added the kind/bug label on Oct 11, 2019.
@frapposelli (Member) commented Oct 11, 2019:

/assign @andrewsykim @dvonthenen

@k8s-ci-robot (Contributor):

@frapposelli: GitHub didn't allow me to assign the following users: vonthenend.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information, please see the contributor guide.

In response to this:

> /assign @andrewsykim @vonthenend

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dvonthenen (Contributor):

/assign

@andrewsykim (Member):

This is the upstream fix for this bug: kubernetes/kubernetes#79391. It should be fixed when we pull that in.

@dvonthenen (Contributor):

It looks like that was merged and possibly made it into 1.16. Any confirmation on that?

@andrewsykim (Member):

yeah, it's in v1.16

@dvonthenen (Contributor) commented Oct 21, 2019:

> yeah, it's in v1.16

Since I need to rebuild my dev env, I will be trying out 1.16 today or tomorrow. Does the CPI need to pull in 1.16 for this to work, or just on the k8s side?

@andrewsykim (Member):

It seems like just k8s.io/kubernetes needs to be updated.

@dvonthenen (Contributor):

Looks like it's only on the k8s side. I tested with the latest CPI binary without updating the k8s.io/kubernetes that the CPI is built on, and it works just fine with 1.16.2. Closing this out.

/close

@k8s-ci-robot (Contributor):

@dvonthenen: Closing this issue.

In response to this:

> Looks like it's only on the k8s side. I tested with the latest CPI binary without updating the k8s.io/kubernetes that the CPI is built on, and it works just fine with 1.16.2. Closing this out.
>
> /close


@andrewsykim (Member):

@dvonthenen you used multiple vNICs in your tests? At first glance it sounds like we do need to update k8s.io/kubernetes in the controller manager, since PatchNodeStatus is still called by the cloud node controller.
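
To make the dependency concrete (editor's sketch, not code from the thread): the cloud node controller builds and submits the node status patch through a utility vendored from k8s.io/kubernetes, so the patch body is produced by whatever version the CPI binary was compiled against, roughly like this:

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	nodeutil "k8s.io/kubernetes/pkg/util/node"
)

// patchAddresses is a hypothetical wrapper showing the call path: the
// vendored PatchNodeStatus diffs oldNode against newNode and submits a
// strategic merge patch to the node's status subresource.
func patchAddresses(cs kubernetes.Interface, oldNode, newNode *v1.Node) error {
	_, _, err := nodeutil.PatchNodeStatus(cs.CoreV1(), types.NodeName(newNode.Name), oldNode, newNode)
	return err
}
```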

@dvonthenen (Contributor):

Ahhh, I missed that condition... will re-test right now.

@andrewsykim (Member):

/reopen

@k8s-ci-robot reopened this on Oct 24, 2019.
@k8s-ci-robot (Contributor):

@andrewsykim: Reopened this issue.

In response to this:

> /reopen


@dvonthenen (Contributor) commented Oct 25, 2019:

Yep, I re-tested with 3 NICs, each with a different IP, and I don't see that error. It works with the latest CPI binary without updating the k8s.io/kubernetes that the CPI is built on.

Keep in mind, we also made the change so that only the first IP is patched. That change isn't in the 1.0.0 release, only on master.

Also keep in mind that PR #271 is trying to re-introduce multiple IPs.

EDIT: I should say it selects a single IP from the multiple ones available.
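
For readers following along, the "single IP" behavior described above amounts to keeping one address per type. A hypothetical helper (not the project's actual code) that illustrates the idea:

```go
package sketch

import v1 "k8s.io/api/core/v1"

// firstAddressPerType keeps only the first address reported for each
// address type, so the resulting list has unique merge keys and the
// strategic merge patch shown earlier can no longer fail on duplicates.
func firstAddressPerType(addrs []v1.NodeAddress) []v1.NodeAddress {
	seen := make(map[v1.NodeAddressType]bool, len(addrs))
	out := make([]v1.NodeAddress, 0, len(addrs))
	for _, a := range addrs {
		if !seen[a.Type] {
			seen[a.Type] = true
			out = append(out, a)
		}
	}
	return out
}
```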

@andrewsykim (Member) commented Oct 25, 2019:

So it sounds like this is already fixed in master, but we're going to allow users to configure which networks the VM addresses should come from in #271? So a VM can have multiple NICs, but there can only be one InternalIP and one ExternalIP on the Node, both of which will be configurable after #271.

Is that correct @dvonthenen?

@dvonthenen (Contributor):

@andrewsykim that is correct

@frapposelli (Member):

@dvonthenen can you verify and close?

@frapposelli added this to the v1.1.0 milestone on Dec 4, 2019.
@dvonthenen (Contributor):

Same as #270

@frapposelli (Member):

closed via #284

/close

@k8s-ci-robot (Contributor):

@frapposelli: Closing this issue.

In response to this:

> closed via #284
>
> /close


@deshui123:

Hit the same issue for pure IPv6; related ticket: #312.

@deshui123:

When using vSphere VM Network names, the error disappeared.
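
For anyone landing here later: the network-name based selection mentioned above comes out of the #271 work. A sketch of what the vsphere.conf `[Nodes]` section can look like; the exact key names are an assumption on the editor's part, so verify against the documentation for your CPI release:

```
; Key names below are assumed from the #271 work; check your release's
; documentation before relying on them.
[Nodes]
internal-vm-network-name = "vm-internal-net"   ; NIC whose IP becomes the InternalIP
external-vm-network-name = "vm-external-net"   ; NIC whose IP becomes the ExternalIP
```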

@andrewsykim (Member):

FYI, we also need to backport kubernetes/kubernetes#79391.

@lubronzhan (Contributor):

Hit this issue with k8s 1.18 + CCM 1.1.0
