Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal of cross resource group nodes for Azure #2479

Merged
merged 1 commit into from Aug 21, 2018

Conversation

feiskyer
Copy link
Member

@feiskyer feiskyer commented Aug 6, 2018

Add proposal of cross resource group nodes for Azure.

/sig azure

/assign @brendandburns @khenidak @colemickens
/cc @justaugustus @andyzhangx @kubernetes/sig-azure-misc

@k8s-ci-robot k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 6, 2018
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. labels Aug 6, 2018
- The name of resource group, which is used for Azure nodes.
- `on-prem`, which is used for on-prem nodes.

When initializing nodes, these two labels should be set for kubelet by provisioning tools, e.g.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these looking to add new labels that kubelets set on their own Node objects? xref #911 for reasons why this is not desirable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also yep, it's for kubelet add labels for itself. We can add those labels to the whitelist after #911.

Note that azure cloud provider would validate resource group by calling Azure API. If the value is not consistent with Azure, then errors would be reported (InstanceNotFound).


## Design

Instance metadata is a general way to fetch node information for Azure, but it doesn't work if cloud-controller-manager is used (`kubelet --cloud-provider=external`). So it won't be used in this proposal. Instead, following two labels are proposed for providing required information:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use of the external cloud provider typically involves a controller that pulls information from metadata and sets it on Node objects. Using a source of truth outside the individual kubelets is preferable for setting topology information on Node objects, since it doesn't require trusting all nodes to categorize themselves appropriately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Azure cloud provider would validate the resource group from Azure API. The label provides an easy way to find which resource group should be used when checking whether the node exists or not on Azure.

There's also an alternative way (see last part) of querying node information directly from Azure API. But it is not feasible because that's time-consuming and easy to hit rate limits.

- `alpha.service-controller.kubernetes.io/exclude-balancer=true`, which is used to exclude the node from load balancer
- `kubernetes.azure.com/resource-group=<rg-name>`, which provides external RG and is used to get node information. Candidate values are
- The name of resource group, which is used for Azure nodes.
- `on-prem`, which is used for on-prem nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a reserved name (is an invalid resource group name)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it's a valid resource group name. Actually, all label values are valid resource group name. So it's a reserved name here.


- `alpha.service-controller.kubernetes.io/exclude-balancer=true`, which is used to exclude the node from load balancer
- `kubernetes.azure.com/resource-group=<rg-name>`, which provides external RG and is used to get node information. Candidate values are
- The name of resource group, which is used for Azure nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all resource group names valid label values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding this.

No, resource group name is in length 1-90, while label value only accepts 63. I should add a limitation part for this.

@justaugustus
Copy link
Member

@feiskyer -- this should be submitted as a KEP.
Can you move it /keps/sig-azure and make sure the proposal adheres to the KEP template?

Additionally, if we're planning work for this in the 1.12 Release cycle you'll need to:

  • Submit a Feature issue detailing any work planned and link to this KEP
  • File an Exception for that work to be accepted in the 1.12 cycle

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 8, 2018
@feiskyer
Copy link
Member Author

feiskyer commented Aug 8, 2018

@justaugustus it's already tracked by feature kubernetes/enhancements#604. Let me convert this to KEP tomorrow.

@k8s-ci-robot k8s-ci-robot added the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label Aug 9, 2018
@feiskyer
Copy link
Member Author

feiskyer commented Aug 9, 2018

@justaugustus Moved to keps/sig-azure

@justaugustus
Copy link
Member

@feiskyer -- thanks for moving this over (and for already having the feature issue opened)! :)
I'll try and review the proposal tomorrow.

@feiskyer
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 10, 2018
@feiskyer
Copy link
Member Author

ping @justaugustus @khenidak Any new comments?


### Assumptions

While news nodes (either from different RGs or on-prem) would be supported in this proposal, not all features would be supported for them. For example, AzureDisk will not work for on-prem nodes.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: news => new

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@kkmsft
Copy link

kkmsft commented Aug 15, 2018

/lgtm

@k8s-ci-robot
Copy link
Contributor

@kkmsft: changing LGTM is restricted to assignees, and only kubernetes/community repo collaborators may be assigned issues.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@khenidak
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 16, 2018
Copy link
Member

@justaugustus justaugustus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feiskyer -- some doc nits to address.

Additionally, going back to @liggitt's point, we've got a problem with the RG naming. Kubernetes label values and RG names seem to have similar enough restrictions, which means there doesn't appear to be a simple way to differentiate between nodes that should and shouldn't be managed within the context of the Azure cloud provider.

Are we saying that resource groups named on-prem are marked as reserved by ARM? No Azure documentation suggestions that, so we'd need clarification there. I also don't like that that encodes a Kubernetes implementation detail to the cloud provider and hides it from the cluster administrator.

As an idea, could we instead do something like:

  • set kubernetes.azure.com/resource-group=
  • set an additional new label e.g., alpha.service-controller.kubernetes.io/cloud-provider-managed=false

So instances might hold the following labels:

Cross RG Azure instances

  • kubernetes.azure.com/resource-group=foo
  • alpha.service-controller.kubernetes.io/cloud-provider-managed=true

On-prem / non-Azure managed instances

  • kubernetes.azure.com/resource-group=
  • alpha.service-controller.kubernetes.io/cloud-provider-managed=false

Thoughts?

/hold

- sig-azure
reviewers:
- name: "@khenidak"
- name: "@colemickens"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"@justaugustus" **
Cole is no longer a Technical Lead for SIG Azure.


## Summary

This KEP aims to add cross resource group (RG) nodes and on-prem nodes support to Azure cloud provider.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add support for cross resource group (RG) and on-prem nodes to the Azure cloud provider.**


## Motivation

Today, Azure cloud provider only supports nodes from specified RG (which is set in cloud provider configure file). For nodes in different RG, Azure cloud provider reports `InstanceNotFound` error and thus they would be removed by controller manager. So as on-prem nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the** Azure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from a** specified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the** cloud provider configuration** file

Maybe we should also link out to a configuration file example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in a** different RG

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/So as on-prem nodes./The same holds true for on-prem nodes.**


Today, Azure cloud provider only supports nodes from specified RG (which is set in cloud provider configure file). For nodes in different RG, Azure cloud provider reports `InstanceNotFound` error and thus they would be removed by controller manager. So as on-prem nodes.

With managed cluster (limited access to nodes) users may need nodes where they can customize in ways that are not possible in a managed service. So this document proposes to support joining arbitrary nodes to the cluster and make required changes for both Azure cloud provider and provision setups, which includes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With managed clusters, like AKS, there is limited access to configure nodes. There are instances where users may need to customize nodes in ways that are not possible in a managed service.**

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document proposes support for joining arbitrary nodes to a cluster and the required changes to make in both the Azure cloud provider and provisioned setups, which include:**


With managed cluster (limited access to nodes) users may need nodes where they can customize in ways that are not possible in a managed service. So this document proposes to support joining arbitrary nodes to the cluster and make required changes for both Azure cloud provider and provision setups, which includes

- Provision tools should setup kubelet with required labels (e.g. via `--node-labels`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provisioning**


- Nodes are in same region and set with required labels (as clarified in the following design part)
- Nodes will not be part of the load balancer managed by cloud provider
- Both node and container networking have setup properly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/have setup properly/are properly configured

- Both node and container networking have setup properly
- AzureDisk is supported for Azure cross-RG nodes, but not for on-prem nodes

Besides, feature gate [ServiceNodeExclusion](https://github.com/kubernetes/kubernetes/blob/master/pkg/features/kube_features.go#L174) should also be enabled for Kubernetes cluster.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Besides,/In addition,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should/must


### Non-Goals

Note that provisioning Kubernetes cluster, setting up networking and provisioning new nodes are out of this proposal scope. Those could be done by external provisioning tools (e.g. acs-engine).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provisioning the**

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 19, 2018
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2018
@feiskyer
Copy link
Member Author

feiskyer commented Aug 20, 2018

Offline chat with @justaugustus, we have agreed to use label kubernetes.azure.com/on-prem=true kubernetes.azure.com/managed=false for on-prem nodes. In this way, we don't need to reserve some names for resource group names.

@justaugustus Addressed comments, PTAL

@justaugustus
Copy link
Member

Perfect. Thanks for addressing all of my nits, @feiskyer!
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2018
@justaugustus
Copy link
Member

justaugustus commented Aug 20, 2018

/assign @calebamiles @idvoretskyi @jdumars
(...although, shouldn't I be able to /approve + merge myself as there's an OWNERS file in /keps/sig-azure?)

@feiskyer
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2018
@idvoretskyi
Copy link
Member

@justaugustus you should be able to.

/uncc

@justaugustus
Copy link
Member

/approve
(crosses fingers)

@idvoretskyi
Copy link
Member

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 21, 2018
Copy link
Member

@idvoretskyi idvoretskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: idvoretskyi, justaugustus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 79d2def into kubernetes:master Aug 21, 2018
@feiskyer feiskyer deleted the azure branch August 21, 2018 06:56
@feiskyer
Copy link
Member Author

@idvoretskyi @justaugustus Thank you very much.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Aug 23, 2018
Automatic merge from submit-queue (batch tested with PRs 66980, 67604, 67741, 67715). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add support of Azure cross resource group nodes

**What this PR does / why we need it**:

Part of feature [Cross resource group nodes](kubernetes/enhancements#604).

This PR adds support of Azure cross resource group nodes that are labeled with `kubernetes.azure.com/resource-group=<rg-name>` and `alpha.service-controller.kubernetes.io/exclude-balancer=true`

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

See designs [here](kubernetes/community#2479).

**Release note**:

```release-note
Azure cloud provider now supports cross resource group nodes that are labeled with `kubernetes.azure.com/resource-group=<rg-name>` and `alpha.service-controller.kubernetes.io/exclude-balancer=true`
```

/sig azure
/kind feature
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet