
Remaining work for Azure support #10412

Closed
2 of 10 tasks
kenji-cloudnatix opened this issue Dec 11, 2020 · 28 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@kenji-cloudnatix
Contributor

kenji-cloudnatix commented Dec 11, 2020

This issue tracks the remaining TODOs from #3957 (Azure support):

  • Be able to configure some more parameters for VM Scale Set such as Windows
  • Be able to configure some more parameters for Azure Disks such as zone
  • Be able to configure some more parameters for subnets such as security groups
  • Support DNS
  • Support tag update of virtual networks
  • Support ACL for Azure blob VFS
  • Implement azureCloudImplementation.GetApiIngressStatus. Currently we don't create any resources for ingress to the API server.
  • Be able to mount etcd volumes from protokube. Currently we rely on etcd-manager for volume mount
  • Populate node labels from nodeIdentifier.IdentifyNodeChange.
  • Implement azureCloudImplementation.buildCloudInstanceGroup so that kops can correctly tell whether a VM is up-to-date or not.
@Aut0R3V

Aut0R3V commented Jan 9, 2021

Anything I can do here?

@olemarkus
Member

Any of these are up for grabs, I would think.

@rifelpet
Member

rifelpet commented Jan 9, 2021

I think we can remove "Be able to mount etcd volumes from protokube. Currently we rely on etcd-manager for volume mount" from the list. That is only needed for the "legacy" etcd option that predates etcd-manager. Given that it will eventually go away, I don't see a need to add support for it in Azure.

@ngalantowicz
Contributor

Not sure how tasks are divided up for an issue like this, or whether started work needs to be reported, but the folks I work with and I are interested in starting on:

Implement azureCloudImplementation.GetApiIngressStatus. Currently we don't create any resources for ingress to the API server.

This will be our first time contributing to kops, so any pertinent contributing info that may be omitted from the relevant docs is much appreciated.

@rifelpet
Member

@ngalantowicz thanks for stepping up! Here is the function that needs implementing. You'll see the return type contains either an IP or a Hostname. For clusters that use API load balancers with hostnames we specify the hostname; otherwise we use the API load balancer's IP(s). For clusters that don't use API load balancers we return a list of the master instances' IPs. A good example of this is the OpenStack implementation.

For Azure, we don't yet support API load balancers so you'll only need to handle getting the master VMs' IPs.

Unfortunately we don't yet have Azure code for listing VMs, only VMScaleSets. You'll likely need to add a new file and client for VMs (it only needs to support listing them, not create/update/delete) and add the client to the AzureCloud interface.

With the VM listing functionality you'll be able to list them all and filter for the VMs that are associated with the cluster and that have the master role tag.
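To make the shape of that concrete, here is a rough sketch (not compilable as-is) of what the Azure implementation could look like; the vmsClient and resourceGroupName fields, the tag keys, and the ipAddressesOf helper are illustrative assumptions, not existing kops code:

// Hypothetical sketch of GetApiIngressStatus for Azure: list the cluster's VMs,
// keep the ones tagged with the master role, and report their IPs.
func (c *azureCloudImplementation) GetApiIngressStatus(cluster *kops.Cluster) ([]fi.ApiIngressStatus, error) {
	// vmsClient is the new "list only" VMs client suggested above (assumed name).
	vms, err := c.vmsClient.List(context.TODO(), c.resourceGroupName)
	if err != nil {
		return nil, fmt.Errorf("listing VMs: %w", err)
	}

	var statuses []fi.ApiIngressStatus
	for _, vm := range vms {
		// Keep only VMs that belong to this cluster and run the master role
		// (the tag keys here are placeholders for the real kops cloud tags).
		if !hasTag(vm.Tags, "KubernetesCluster", cluster.Name) || !hasTag(vm.Tags, "k8s.io_role_master", "1") {
			continue
		}
		// No API load balancer support on Azure yet, so report the master VM IPs directly.
		// ipAddressesOf would look up the VM's NIC via a network interfaces client.
		for _, ip := range ipAddressesOf(vm) {
			statuses = append(statuses, fi.ApiIngressStatus{IP: ip})
		}
	}
	return statuses, nil
}

// hasTag reports whether tags contains key with the given value (Azure tag values are *string).
func hasTag(tags map[string]*string, key, value string) bool {
	v, ok := tags[key]
	return ok && v != nil && *v == value
}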

I hope this helps. Let me know if you need any additional assistance.

@kenji-cloudnatix
Contributor Author

Thanks, @ngalantowicz, @rifelpet! Just wanted to make a quick comment about listing VMs. I think we can reuse the code in etcd-manager like this.

@NickSchleicher
Contributor

Is there any additional information regarding
Populate node labels from nodeIdentifier.IdentifyNodeChange?

I assume it just means labeling the nodes with the labels that can be specified in the instance group, as seen below, by grabbing the tags from the cloud, but I want to be sure I cover the whole depth of it.

  nodeLabels:
    tier: web
    ...

I was going to start work on that, but it looks like there might be an existing error in this area:

I0120 18:14:43.199009       1 node_controller.go:144] sending patch for node "web000001": "{\"metadata\":{\"labels\":{\"\":\"web\"}}}"
W0120 18:14:43.204913       1 node_controller.go:110] failed to patch node labels on web000001: error applying patch to node: Node "web000001" is invalid: [metadata.labels: Invalid value: "": name part must be non-empty, metadata.labels: Invalid value: "": name part must consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyName',  or 'my.name',  or '123-abc', regex used for validation is '([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]')]

InstanceGroupNameTag = "kops.k8s.io_instancegroup"

info.Labels[strings.TrimPrefix(k, InstanceGroupNameTag)] = *v

[Screenshot attached: Screen Shot 2021-01-20 at 1.47.30 PM]

The tag that is in the cloud for the VMSS is also the "prefix", so the key is left as an empty string. Do we need this label, and if so, what should the key be?

Also partially related: I noticed an instance group is required to have the label kops.k8s.io/cluster, or else you'll get the error
error: must specify "kops.k8s.io/cluster" label with cluster name to create instanceGroup
Is it an issue that this label has a "/" in it and never makes it to the cloud as a tag?

@kenji-cloudnatix
Contributor Author

Thanks for looking into this, @NickSchleicher! This looks like a bug in my original code. I believe we need to trim with clusterNodeTemplateLabel, defined as:

clusterNodeTemplateLabel = "k8s.io_cluster_node-template_label_"

You're right about the kops.k8s.io/cluster label as well. Currently CloudTagsForInstanceGroup has logic to replace all occurrences of "/" with "_". I believe we need to convert "_" back to "/" in:

info.Labels[strings.TrimPrefix(k, InstanceGroupNameTag)] = *v
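Reading those two suggestions together, a self-contained sketch of the tag-to-label conversion could look like this (the function name is illustrative; the prefix constants and the "_" to "/" conversion are the parts taken from the comments above):

import "strings"

const (
	InstanceGroupNameTag     = "kops.k8s.io_instancegroup"
	clusterNodeTemplateLabel = "k8s.io_cluster_node-template_label_"
)

// labelsFromVMSSTags is a hypothetical helper showing the intended conversion:
// the instance-group tag has its "_" restored to "/", and node-template tags are
// trimmed with clusterNodeTemplateLabel instead of InstanceGroupNameTag.
func labelsFromVMSSTags(tags map[string]*string) map[string]string {
	labels := map[string]string{}
	for k, v := range tags {
		if v == nil {
			continue
		}
		switch {
		case k == InstanceGroupNameTag:
			// "kops.k8s.io_instancegroup" -> "kops.k8s.io/instancegroup"
			labels[strings.ReplaceAll(k, "_", "/")] = *v
		case strings.HasPrefix(k, clusterNodeTemplateLabel):
			// "k8s.io_cluster_node-template_label_tier" -> "tier"
			labels[strings.TrimPrefix(k, clusterNodeTemplateLabel)] = *v
		}
	}
	return labels
}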

@ngalantowicz
Contributor

Reaching out for more info on
Support tag update of virtual networks

Seeing:

// Only allow tags to be updated.
if changes.Tags == nil {
	return nil
}

// TODO(kenji): Fix this. Update is not supported yet as updating the tag will recreate a virtual network,
// which causes an InUseSubnetCannotBeDeleted error.
klog.Infof("Skip updating a Virtual Network with name: %s", fi.StringValue(e.Name))
return nil

An interesting behavior to note regarding route tables (whose task and client files seem to mirror the vnet files):
When creating a cluster with an existing route table, no new route table is created; however, the InUse error does occur, and cluster delete fails because of it. If I manually remove the tags from the route table and run delete again, there are no issues.

Are these two things similar, and is there any vision/insight on how you'd go about implementing tag update support?
I'm thinking the goal is to update tags when using pre-existing resources, and to remove those tags again when the cluster is deleted?

@kenji-cloudnatix

@kenji-cloudnatix
Contributor Author

I'm thinking the goal is to update tags when using pre-existing resources, and to remove those tags again when the cluster is deleted?

Yes, I think so, but I'm not certain whether removing the tag is what kops does for other CSPs. I think other people can chime in.

The issue with updating a tag of the vnet was that it causes an InUseSubnetCannotBeDeleted error. I haven't fully figured out the cause, but my guess is that we need to set the Subnet field of VirtualNetworkPropertiesFormat.

https://raw.githubusercontent.com/Azure/azure-sdk-for-go/master/services/network/mgmt/2020-06-01/network/models.go

Otherwise, the update request attempts to delete an existing subnet (and then the request fails).
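If that guess is right, a minimal sketch of a tag-only update that echoes the existing properties back might look like the following (assuming the azure-sdk-for-go network package; the function name is illustrative and this hasn't been tested against the InUseSubnetCannotBeDeleted error):

import (
	"context"
	"fmt"

	"github.com/Azure/azure-sdk-for-go/services/network/mgmt/2020-06-01/network"
)

// updateVNetTags is a hypothetical sketch: read the current virtual network and send
// its properties (including Subnets) back in the update, so that only the tags change
// and Azure doesn't try to delete the existing subnets.
func updateVNetTags(ctx context.Context, client network.VirtualNetworksClient, resourceGroup, vnetName string, tags map[string]*string) error {
	vnet, err := client.Get(ctx, resourceGroup, vnetName, "" /* expand */)
	if err != nil {
		return fmt.Errorf("getting virtual network %s: %w", vnetName, err)
	}
	// Only the tags change; VirtualNetworkPropertiesFormat (with its Subnets) is preserved.
	vnet.Tags = tags
	future, err := client.CreateOrUpdate(ctx, resourceGroup, vnetName, vnet)
	if err != nil {
		return fmt.Errorf("updating virtual network %s: %w", vnetName, err)
	}
	if err := future.WaitForCompletionRef(ctx, client.Client); err != nil {
		return fmt.Errorf("waiting for virtual network %s update: %w", vnetName, err)
	}
	return nil
}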

@rifelpet
Member

FWIW, based on the code in pkg/resources it doesn't seem that Kops removes any tags on shared resources when a cluster is deleted.

@NickSchleicher
Contributor

At a glance, it looks like we don't have support yet for multi-AZ instances:
https://github.com/kubernetes/kops/blob/master/pkg/model/azuremodel/vmscaleset.go#L68

I'm coming from an AWS background where subnets are zonal, but from looking at the other models, it seems that implementing this would more closely track the GCE model?

@olemarkus
Member

Is the CSI driver used for Azure Disks?
K8s is looking to disable the in-tree disk plugins as of kubernetes 1.22 and we need to be ready for that. See #10777

@kenji-cloudnatix
Contributor Author

@olemarkus , I see. No, the CSI driver is not used for Azure Disks. Thanks!

@hakman
Member

hakman commented Feb 23, 2021

kOps v1.20.0-beta.1 was just released today. The plan is to have another beta in 1-2 weeks and the final release in 3-4 weeks.
Please test the new Azure support as much as possible and report any issues you find.

@rifelpet
Member

Would folks find terraform support for azure clusters useful? If someone is interested in implementing it I can provide assistance.

@finnje

finnje commented Mar 9, 2021

I wanted to test out kOps in azure. Where can I get the kops binary that supports the --azure args?

@ngalantowicz
Contributor

I wanted to test out kOps in azure. Where can I get the kops binary that supports the --azure args?

@finnje You can build from the master branch, or use the kops v1.20.0-beta.1 release, which supports the Azure features.

@hakman
Member

hakman commented Apr 17, 2021

@kenji-cloudnatix Do you think we could also add an entry with a recommended image in the channels file for Azure?
I guess the URN would be something like Canonical:0001-com-ubuntu-server-focal:20_04-lts:20.04.20210315 and the entry would be similar to the GCE one:

kops/channels/alpha

Lines 77 to 80 in 5bb54a3

- name: "ubuntu-os-cloud/ubuntu-2004-focal-v20210315"
providerID: gce
architectureID: amd64
kubernetesVersion: ">=1.18.0"
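
For reference, a hypothetical Azure entry mirroring the GCE one might then look roughly like this (the providerID value is an assumption; the URN is the one suggested above, not a committed entry):

  - name: "Canonical:0001-com-ubuntu-server-focal:20_04-lts:20.04.20210315"
    providerID: azure
    architectureID: amd64
    kubernetesVersion: ">=1.18.0"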

@kenji-cloudnatix
Contributor Author

I see. @hakman, let me take a look!

@alexivanoff

Is it possible to configure Availability zone for the master and node Scale Sets?
I tried to use "--zones eastus-1, eastus-2, eastus-3" but kops just created 3 Scale Sets without "Availability zone" settings.

@johngmyers
Member

Would it be possible for Azure to implement a fi.Authenticator/fi.Verifier pair so that instances can authenticate bootstrap requests to kops-controller using cloud-provided instance credentials?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 3, 2021
@rifelpet
Member

rifelpet commented Oct 4, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 4, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 1, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
