"Get-AzVM -Status" sometimes fails to get power state for Linux VMs #13178

Closed
o-l-a-v opened this issue Oct 11, 2020 · 15 comments
Labels: Compute - VM, customer-reported, question, Service Attention

Comments

@o-l-a-v

o-l-a-v commented Oct 11, 2020

Description

We have a runbook that starts and stops VMs based on tags and the current time. When we started using some Linux VMs (CentOS 7), we sometimes got an error because Az.Compute Get-AzVM -Status failed to return the power state.

Until recently we used AzureRM, but last week we updated the runbook to use the latest Az modules (Az.Accounts, Az.Compute, Az.Resources) instead, hoping that would fix the problem. It did not.

Most of the time, around 98%, Get-AzVM -Status returns the power state, but sometimes it does not. The runbook even tries twice, which suggests this might be an error in the ARM backend rather than in Az.Compute, but I don't know where this should be reported, so I am starting here.

Steps to reproduce

$VMs = [array](
    (Get-AzVM).ForEach{
        $null = Add-Member -InputObject $_ -MemberType 'NoteProperty' -Name 'Statuses' -Value (
            (Get-AzVM -ResourceGroupName $_.'ResourceGroupName' -Name $_.'Name' -Status).'Statuses'
        )
        $_
    }
)

$VMs | Select-Object -Property @(
    'Name', 'Location',
    @{'Name'='PowerState'; 'Expression'={$_.'Statuses'.Where{$_.'Code' -like 'PowerState/*'}.'DisplayStatus'}},
    @{'Name'='OSType'; 'Expression'={if ($_.'OSProfile'.'WindowsConfiguration'.'ProvisionVMAgent') {'Windows'} else {'Linux'}}}
)

Environment data

Because this runs in an Automation Account, I ran the following to get environment info.

$PSVersionTable

Import-Module -Name 'Az.Accounts','Az.Compute','Az.Resources'
Get-Module -Name 'Az.*' | Select-Object -Property 'Name','Version'

Output

Name                           Value                                                                                    
----                           -----                                                                                    
PSVersion                      5.1.15063.726                                                                            
PSEdition                      Desktop                                                                                  
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                  
BuildVersion                   10.0.15063.726                                                                           
CLRVersion                     4.0.30319.42000                                                                          
WSManStackVersion              3.0                                                                                      
PSRemotingProtocolVersion      2.3                                                                                      
SerializationVersion           1.1.0.1                                                                                  


Name         Version
----         -------
Az.Accounts  1.9.4
Az.Compute   4.4.0
Az.Resources 2.5.1

Debug output

Not feasible. I can't reproduce the results on demand because the problem seems to happen at a specific time or when the VM is in a specific state. When the same runbook runs 30 minutes later, it does not hit the same issue.

Error output

No error message is given. The object returned by Get-AzVM -Status simply has no value for the power state.
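Because the failure is silent, a runbook has to detect the missing entry itself. A minimal sketch of such a check follows, assuming the 'Statuses' collection has the same shape used in the reproduction steps above (objects with a 'Code' property such as 'PowerState/running'). The function name Get-PowerStateCode is hypothetical and not part of Az.Compute.

```powershell
# Hypothetical helper (not part of Az.Compute): returns the power state
# suffix, e.g. 'running', or $null when no 'PowerState/*' entry exists.
function Get-PowerStateCode {
    [CmdletBinding()]
    [OutputType([string])]
    param(
        # The 'Statuses' collection from Get-AzVM -Status; may be $null or empty.
        [Parameter()]
        [AllowNull()]
        [AllowEmptyCollection()]
        [object[]] $Statuses
    )

    foreach ($status in $Statuses) {
        if ($null -ne $status -and $status.Code -like 'PowerState/*') {
            # 'PowerState/running' -> 'running'
            return ($status.Code -split '/', 2)[1]
        }
    }

    # No power state entry was found: the caller decides how to handle it.
    return $null
}
```

A caller could then compare the result against $null explicitly instead of relying on an empty 'DisplayStatus' value.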

@o-l-a-v o-l-a-v added the triage label Oct 11, 2020
@ghost ghost added question The issue doesn't require a change to the product in order to be resolved. Most issues start as that customer-reported labels Oct 11, 2020
@dingmeng-xue dingmeng-xue added Compute - VM Service Attention This issue is responsible by Azure service team. and removed triage labels Oct 11, 2020
@ghost

ghost commented Oct 11, 2020

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Drewm3, @avirishuv.

@dingmeng-xue
Member

Compute team, please help to look into this question.

@amjads1

amjads1 commented Oct 12, 2020

Looking into this issue.

@amjads1

amjads1 commented Oct 12, 2020

Assigned to Avi.

@o-l-a-v
Author

o-l-a-v commented Oct 16, 2020

@dingmeng-xue
@amjads1

Any updates to share here?

@amjads1

amjads1 commented Oct 17, 2020

@o-l-a-v - Avi is looking into the issue and you should expect an update sometime next week.

@o-l-a-v
Author

o-l-a-v commented Oct 22, 2020

@amjads1

Should I create an Azure support ticket instead?

@avirishuv

avirishuv commented Oct 23, 2020

@o-l-a-v thanks for reporting this. Is it possible to check that when the power state is missing in the result of the Azure PowerShell query, at that time the direct GET API call on the VM also shows missing power state?
Sometimes, Power state may be missing for VMs due to reliability issues in the implementation, which we are continuing to improve upon.
If you would like more details of the specific incident that you have experienced, feel free to open a support ticket as well.

@o-l-a-v
Author

o-l-a-v commented Nov 3, 2020

> @o-l-a-v thanks for reporting this. Is it possible to check that when the power state is missing in the result of the Azure PowerShell query, at that time the direct GET API call on the VM also shows missing power state?
> Sometimes, Power state may be missing for VMs due to reliability issues in the implementation, which we are continuing to improve upon.
> If you would like more details of the specific incident that you have experienced, feel free to open a support ticket as well.

Sorry for the late reply. It happens randomly, and on the next run 30 minutes later it usually works again, so it is not feasible to sit on standby until it happens again. I could maybe add the GET API call to the runbook using a token from Get-AzContext, but I don't think I'll get approval to do that for the sake of troubleshooting only.

We have now scripted a workaround, so I will no longer even be alerted when it happens. The workaround is to invoke the Start or Stop cmdlet even if we fail to get the power state for a VM. If the VM is already running and you invoke Start-AzVM, nothing happens, so it seems like a safe workaround.

Edit: PowerShell to get the power state with the REST API
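The decision logic behind that workaround might look roughly like the sketch below. It is not the actual runbook: the function name Resolve-VMAction, the 'Running'/'Stopped' desired-state values, and the choice to treat only 'deallocated' as already stopped are all assumptions made for illustration.

```powershell
# Hypothetical helper: decides which action a runbook should take for a VM.
# When the power state is unknown, it returns the action for the desired
# state anyway instead of skipping the VM.
function Resolve-VMAction {
    [CmdletBinding()]
    [OutputType([string])]
    param(
        # State the schedule wants the VM to be in (assumed values).
        [Parameter(Mandatory)]
        [ValidateSet('Running', 'Stopped')]
        [string] $DesiredState,

        # Observed state, e.g. 'running' or 'deallocated'; $null or empty
        # when Get-AzVM -Status returned no power state.
        [Parameter()]
        [AllowNull()]
        [AllowEmptyString()]
        [string] $PowerStateCode
    )

    if ([string]::IsNullOrEmpty($PowerStateCode)) {
        # Power state missing: issue the command regardless.
        if ($DesiredState -eq 'Running') { return 'Start' }
        return 'Stop'
    }

    if ($DesiredState -eq 'Running' -and $PowerStateCode -ne 'running') { return 'Start' }
    # Assumption: only a deallocated VM counts as already stopped.
    if ($DesiredState -eq 'Stopped' -and $PowerStateCode -ne 'deallocated') { return 'Stop' }
    return 'None'
}
```

The caller would map 'Start' to Start-AzVM and 'Stop' to Stop-AzVM. Keeping the decision in a pure function makes the unknown-state branch easy to test without touching Azure.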

# Assets
$ResourceGroupName = ''
$ResourceName = ''

# Get
$AzContext = Get-AzContext
$ArmToken = [Microsoft.Azure.Commands.Common.Authentication.AzureSession]::Instance.AuthenticationFactory.Authenticate(
    $AzContext.'Account',
    $AzContext.'Environment',
    $AzContext.'Tenant'.'Id',
    $null,
    [Microsoft.Azure.Commands.Common.Authentication.ShowDialog]::Never,
    $null,
    'https://management.azure.com/'
)
$Uri = 'https://management.azure.com/subscriptions/{0}/resourceGroups/{1}/providers/Microsoft.Compute/virtualMachines/{2}/instanceView?api-version=2020-06-01' -f $AzContext.'Subscription'.'Id', $ResourceGroupName, $ResourceName
$Headers = [hashtable]@{'Authorization' = 'Bearer {0}' -f $ArmToken.'AccessToken'}
(Invoke-RestMethod -Method 'Get' -Headers $Headers -Uri $Uri).'Statuses' | Format-List

@avirishuv

@o-l-a-v thanks for the update about your workaround. We will be updating the public doc to reflect that the power state may sometimes be missing due to reliability issues.

@avirishuv

Apologies for the delay due to holidays, still working on the documentation for this, will update here once done.

@avirishuv

Quick update: the doc update is under review.

@avirishuv

The document is updated: https://docs.microsoft.com/en-us/azure/virtual-machines/states-billing

Closing the issue now, please feel free to reopen if you have any additional questions on this topic.

@philhines

@avirishuv : I am noticing that this issue hasn't been fixed yet. There has been more than enough time to not just document a workaround, but actually fix the bug. Please fix it. Thanks!
This is affecting me as well!

@philhines philhines reopened this Sep 29, 2022
@Drewm3 Drewm3 assigned Drewm3 and unassigned avirishuv Oct 3, 2022
@Drewm3
Member

Drewm3 commented Oct 3, 2022

@philhines, the power state is not expected to be returned 100% of the time. In a small number of cases it is expected to be null due to various possible failures. The failure rate should be pretty low, though. If you have a scenario where it returns null consistently, please provide detailed steps that reproduce the issue consistently.
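Since a null power state is described as expected in rare cases, callers may want to retry before falling back. The sketch below shows one way to do that with a generic helper. The name Invoke-UntilValue and its default attempt count and delay are assumptions, and the original report notes that trying twice in a row did not always help, so retries with a delay may reduce misses but are not guaranteed to remove them.

```powershell
# Hypothetical helper: runs a script block until it returns a non-null
# value or the attempt limit is reached, sleeping between attempts.
function Invoke-UntilValue {
    [CmdletBinding()]
    param(
        # Code that returns the value of interest, or nothing on a miss.
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock,

        [Parameter()]
        [ValidateRange(1, 20)]
        [int] $MaxAttempts = 3,

        [Parameter()]
        [ValidateRange(0, 300)]
        [int] $DelaySeconds = 10
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $result = & $ScriptBlock
        if ($null -ne $result) {
            return $result
        }
        # Wait before the next attempt, but not after the last one.
        if ($attempt -lt $MaxAttempts) {
            Start-Sleep -Seconds $DelaySeconds
        }
    }

    # Every attempt came back empty.
    return $null
}
```

In a runbook, the script block would wrap the Get-AzVM -Status call and return only the power state entry, so that a missing entry counts as a miss and triggers another attempt.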

@Drewm3 Drewm3 closed this as completed Oct 3, 2022