Introduce scheduled event checking for Azure VMs via instance metadata #9170

Draft
wants to merge 2 commits into
base: master
from

Conversation

kon-angelo
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Implements additional polling for scheduled events via the instance metadata service and exposes the results through the node conditions.
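
For readers unfamiliar with the scheduled events endpoint, here is a minimal sketch of what such a poll could look like. It is not the code in this PR: the names `scheduledEvent`, `eventResponse`, `parseScheduledEvents` and `fetchScheduledEvents` are hypothetical, the struct covers only a subset of the response fields, and the `api-version` value is an assumption (the PR reads its URI from `consts.ImdsScheduledEventsURI`, whose value is not shown here).

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// scheduledEventsURL points at the IMDS scheduled events endpoint.
// Assumption: the api-version used here; the PR may use a different one.
const scheduledEventsURL = "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"

// scheduledEvent holds a subset of the per-event fields (hypothetical type).
type scheduledEvent struct {
	EventID     string   `json:"EventId"`
	EventType   string   `json:"EventType"`
	EventStatus string   `json:"EventStatus"`
	Resources   []string `json:"Resources"`
	NotBefore   string   `json:"NotBefore"`
}

// eventResponse is the top-level document returned by the endpoint (hypothetical type).
type eventResponse struct {
	DocumentIncarnation int              `json:"DocumentIncarnation"`
	Events              []scheduledEvent `json:"Events"`
}

// parseScheduledEvents decodes a raw response body.
func parseScheduledEvents(body []byte) (*eventResponse, error) {
	var resp eventResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding scheduled events: %w", err)
	}
	return &resp, nil
}

// fetchScheduledEvents performs one GET against the given URL and parses the result.
func fetchScheduledEvents(ctx context.Context, client *http.Client, url string) (*eventResponse, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// IMDS requires this header on every request.
	req.Header.Set("Metadata", "true")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected IMDS status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseScheduledEvents(body)
}

func main() {
	// A short timeout keeps the example from hanging outside of Azure.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	events, err := fetchScheduledEvents(ctx, http.DefaultClient, scheduledEventsURL)
	if err != nil {
		fmt.Println("could not fetch scheduled events:", err)
		return
	}
	for _, e := range events.Events {
		fmt.Printf("%s %s (%s), not before %q\n", e.EventID, e.EventType, e.EventStatus, e.NotBefore)
	}
}
```

Keeping the parsing separate from the HTTP call makes the decoding testable without a metadata server. How the events are mapped to node conditions is left out, since that mapping is specific to this PR.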

Which issue(s) this PR fixes:

Fixes #9169

Special notes for your reviewer:

Kept as a draft until the test implementation is finished.

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 10, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kon-angelo
Once this PR has been reviewed and has the lgtm label, please assign andyzhangx for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 10, 2025
@k8s-ci-robot
Contributor

Hi @kon-angelo. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@github-actions github-actions bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 10, 2025
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 10, 2025
tests/go.mod Outdated
github.com/Azure/azure-kusto-go/azkustoingest v1.0.3
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.18.0
github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v6 v6.4.0
github.com/Azure/azure-kusto-go/azkustodata v1.0.0-preview-5

Why the downgrade in these library versions?

@@ -133,7 +134,7 @@ type Config struct {
// `nodeIP`: vm private IPs will be attached to the inbound backend pool of the load balancer;
// `podIP`: pod IPs will be attached to the inbound backend pool of the load balancer (not supported yet).
LoadBalancerBackendPoolConfigurationType string `json:"loadBalancerBackendPoolConfigurationType,omitempty" yaml:"loadBalancerBackendPoolConfigurationType,omitempty"`
// PutVMSSVMBatchSize defines how many requests the client send concurrently when putting the VMSS VMs.
// PutVMSSVMBatchSize defines how many reque hssts the client send concurrently when putting the VMSS VMs.

🤔

@@ -0,0 +1,139 @@
/*
Copyright 2019 The Kubernetes Authors.

Should be 2025, no?

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 11, 2025
@@ -135,6 +137,7 @@ func (o *CloudNodeManagerOptions) Flags() cliflag.NamedFlagSets {
fs.Int32Var(&o.ClientConnection.Burst, "kube-api-burst", 30, "Burst to use while talking with kubernetes apiserver.")
fs.BoolVar(&o.WaitForRoutes, "wait-routes", false, "Whether the nodes should wait for routes created on Azure route table. It should be set to true when using kubenet plugin.")
fs.BoolVar(&o.UseInstanceMetadata, "use-instance-metadata", true, "Should use Instance Metadata Service for fetching node information; if false will use ARM instead.")
fs.BoolVar(&o.EnableNodeEventChecker, "enable-node-event-checker", true, "Should enable the NodeEventChecker to check for Azure scheduled events. Can only be set to true if --use-instance-metadata is also true. If false, the NodeEventChecker will not run and no events will be recorded in the node status.")
Contributor

Can we default to false, and enable as needed?

Member

+1. We'd need users to enable this feature explicitly via the new flag.

@@ -255,6 +272,42 @@ func (ims *InstanceMetadataService) getLoadBalancerMetadata() (*LoadBalancerMeta
return &obj, nil
}

func (ims *InstanceMetadataService) GetScheduledEvents() (*EventResponse, error) {
req, err := http.NewRequest("GET", ims.imdsServer+consts.ImdsScheduledEventsURI, nil)
if err != nil {
Member

Would the request fail if there are no scheduled events? And would it fail during node bootstrap?

I'm wondering whether we should skip the errors on the caller side, as failures here may block node lifecycle logic.

@@ -20,6 +20,7 @@ import (
"strings"

"sigs.k8s.io/cloud-provider-azure/pkg/azclient/configloader"

Member

Revert this file?

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
6 participants