When using Azure cloud worker nodes will never successfully provision #16887

Closed
davidnuzik opened this issue Dec 5, 2018 · 6 comments
Labels: area/cloud-provider, kind/bug-qa (issues that have not yet hit a real release; bugs introduced by a new feature or enhancement), status/autoclosed

davidnuzik commented Dec 5, 2018

Version: master 2.2 (12/5/18)
Related to: #16781

What kind of request is this (question/bug/enhancement/feature request):
Bug

Steps to reproduce (least amount of steps as possible):

  • Create an Azure RKE cluster. Under cluster options, choose Azure as the Cloud Provider. Create one node with the etcd and control plane roles and three worker nodes.
  • Enter the necessary minimum requirements in the Azure Cloud section (aadClientId, aadClientSecret, subscriptionId, tenantId) and otherwise use default options.
  • Notice that the worker nodes continually have issues provisioning. If you look at logs you will see:
    E1205 19:40:35.238785 607 kubelet_node_status.go:66] Unable to construct v1.Node object for kubelet: failed to get instance ID from cloud provider: compute.VirtualMachinesClient#Get: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="InvalidApiVersionParameter" Message="The api-version '2018-04-01' is invalid. The supported versions are '2018-11-01,2018-09-01,2018-08-01,2018-07-01,2018-06-01,2018-05-01,2018-02-01,2018-01-01,2017-12-01,2017-08-01,2017-06-01,2017-05-10,2017-05-01,2017-03-01,2016-09-01,2016-07-01,2016-06-01,2016-02-01,2015-11-01,2015-01-01,2014-04-01-preview,2014-04-01,2014-01-01,2013-03-01,2014-02-26,2014-04'."
  • Try again with the older k8s v1.11.3-rancher1-1; similar issues occur.
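
The error in the log above names both the rejected api-version and the list the service says it supports, so the mismatch can be checked mechanically. The following is a minimal sketch, not part of Rancher or Kubernetes: the function name `parse_api_version_error` is hypothetical, and the message text is copied from the log line in this report.

```python
import re

# Message text copied from the kubelet log line in the report above.
ERROR_MESSAGE = (
    "The api-version '2018-04-01' is invalid. The supported versions are "
    "'2018-11-01,2018-09-01,2018-08-01,2018-07-01,2018-06-01,2018-05-01,"
    "2018-02-01,2018-01-01,2017-12-01,2017-08-01,2017-06-01,2017-05-10,"
    "2017-05-01,2017-03-01,2016-09-01,2016-07-01,2016-06-01,2016-02-01,"
    "2015-11-01,2015-01-01,2014-04-01-preview,2014-04-01,2014-01-01,"
    "2013-03-01,2014-02-26,2014-04'."
)


def parse_api_version_error(message):
    """Return (rejected_version, supported_versions) parsed from the message.

    Hypothetical helper: it only understands the wording seen in this report.
    """
    rejected = re.search(r"api-version '([^']+)' is invalid", message)
    supported = re.search(r"supported versions are '([^']+)'", message)
    if rejected is None or supported is None:
        raise ValueError("not an InvalidApiVersionParameter-style message")
    return rejected.group(1), supported.group(1).split(",")


rejected, supported = parse_api_version_error(ERROR_MESSAGE)
print("rejected version:", rejected)
print("present in supported list:", rejected in supported)
```

Run against the message from this report, the check confirms that the version the kubelet requested does not appear in the list returned by the service.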

Result:
Unable to provision worker nodes when using Azure cloud

Other details that may be helpful:
Related to #16781 and #16632

Environment information

  • master 2.2 (12/5/18)
@moelsayed
Contributor

It seems that Azure has disabled API version 2018-04-01, which is the API version used in k8s:v1.12 code. This is not related to Rancher code.

alena1108 commented Dec 6, 2018

Our version of the Azure SDK (azure-sdk-for-go) is ahead of what k8s 1.12 uses, but it seems both versions support 2018-04-01 as an endpoint.

rancher master:

4e8cbbfb1aeab140cd0fa97fd16b64ee18c3ca6a v19.1.0 (2018-04-01 version is available)

k8s 1.12.3 upstream:

520918e6c8e8e1064154f51d13e02fad92b287b8 v19.0.0 (2018-04-01 version is available)

@moelsayed
Contributor

This is what happened when I tried to call the metadata API using that API version on a standard VM:

ubuntu@melsayed-test:~$ AZURE_META_URL="http://169.254.169.254/metadata/instance/compute"
ubuntu@melsayed-test:~$ curl -s -H Metadata:true "${AZURE_META_URL}/name?api-version=2017-08-01&format=text"
melsayed-test
ubuntu@melsayed-test:~$ curl -s -H Metadata:true "${AZURE_META_URL}/name?api-version=2018-04-01&format=text"
Not found. Invalid version requested
ubuntu@melsayed-test:~$ 
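
The same check could be scripted to try several API versions in one pass. This is a rough sketch under stated assumptions: the helper names `build_url`, `fetch`, and `probe` are hypothetical, the base URL and `Metadata: true` header are taken from the curl session above, and the real request only works from inside an Azure VM.

```python
import urllib.error
import urllib.request

# Base URL taken from the curl session above.
AZURE_META_URL = "http://169.254.169.254/metadata/instance/compute"


def build_url(api_version, field="name"):
    """Build the same URL the curl commands use for one compute field."""
    return f"{AZURE_META_URL}/{field}?api-version={api_version}&format=text"


def fetch(url, timeout=2.0):
    """GET the URL with the 'Metadata: true' header; return (status, body)."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode()
    except urllib.error.HTTPError as err:
        # If the service answers with an HTTP error status, keep its body
        # so the reason it gives is not lost.
        return err.code, err.read().decode()


def probe(api_versions, fetch_fn=fetch):
    """Map each API version to the (status, body) returned for it.

    fetch_fn is injectable so the logic can be exercised without network access.
    """
    return {version: fetch_fn(build_url(version)) for version in api_versions}
```

On an Azure VM, `probe(["2017-08-01", "2018-04-01"])` would issue the same two requests as the curl commands and collect the responses side by side.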

@davidnuzik
Contributor Author

I tested today on master (12/10) and this seems to be working now. I did notice that node creation fails multiple times, but after a long time it works. This is probably Azure's fault: provisioning sometimes takes so long that it seems like we might be timing out.

@cjellick

Marked as low priority (p2) because @davidnuzik said it's working now.

@josenobile

Is there any estimate of when the commit with the working Azure support will be released in a stable version? Or will it only be in version 2.2, in about 1.5 months?
