Failed to update image when upgrade the scale set with another custom image using task Azure VM scale set deployment #11684

Closed
luweilevi opened this issue Nov 4, 2019 · 27 comments

@luweilevi

luweilevi commented Nov 4, 2019

When trying to upgrade the scale set with another custom image, the task fails with an error saying the scale set "can not be updated as it uses a platform image". However, the scale set was in fact created with a custom image.

Failed to update image for VMSS testvmssapp. Error: VMSS testvmssapp can not be updated as it uses a platform image. Only a VMSS which is currently using a custom image can be updated.

"storageProfile": {
  "osDisk": {
    "createOption": "FromImage",
    "caching": "ReadWrite",
    "managedDisk": {
      "storageAccountType": "Standard_LRS"
    },
    "diskSizeGB": 127
  },
  "imageReference": {
    "id": "/subscriptions/xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx/resourceGroups/testvmssdeployment/providers/Microsoft.Compute/images/MyCustomImage"
  }
},

[Screenshots attached: vmss1, vmss2]

luweilevi changed the title from "Failed to update image when upgrade the scale set with another custom image using Azure VM scale set deployment" to "Failed to update image when upgrade the scale set with another custom image using task Azure VM scale set deployment" on Nov 4, 2019

@chernovol

We have the same issue, it's a BIG pain for us.
Any update on when it will be resolved?

@bishal-pdMSFT
Contributor

@luweilevi the custom image in the error message means a VHD image which has been set using a URI. This task does not support updating a VMSS created using a managed image.
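
The distinction drawn above can be checked against a scale set's own storage profile. The following is a minimal sketch, not the task's actual logic: it assumes that a VHD-based (unmanaged) custom image appears as `osDisk.image.uri`, a managed image as `imageReference.id`, and a platform image as a publisher/offer/sku triple. The function name `classify_image_source` and the shortened resource id are hypothetical.

```python
def classify_image_source(storage_profile: dict) -> str:
    """Return 'vhd-uri', 'managed-image', 'platform-image', or 'unknown'."""
    os_disk = storage_profile.get("osDisk") or {}
    image_ref = storage_profile.get("imageReference") or {}

    # Assumption: an unmanaged custom image points the OS disk at a VHD blob URI.
    if (os_disk.get("image") or {}).get("uri"):
        return "vhd-uri"
    # Assumption: a managed image is referenced by resource id.
    if image_ref.get("id"):
        return "managed-image"
    # Assumption: a platform image is identified by publisher/offer/sku.
    if all(image_ref.get(key) for key in ("publisher", "offer", "sku")):
        return "platform-image"
    return "unknown"


# Profile shaped like the one in the original report (id shortened, hypothetical).
reported = {
    "osDisk": {
        "createOption": "FromImage",
        "managedDisk": {"storageAccountType": "Standard_LRS"},
    },
    "imageReference": {
        "id": "/subscriptions/xxxx/resourceGroups/testvmssdeployment"
              "/providers/Microsoft.Compute/images/MyCustomImage"
    },
}
print(classify_image_source(reported))
```

Under these assumptions the reported profile falls into the managed-image case, which would explain why the task rejects it even though the image is custom.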

@chernovol

chernovol commented Nov 11, 2019

@bishal-pdMSFT

What about this? In our Azure release pipeline we use the "Azure VM scale set deployment" task with these settings:
Action: "Update VM Scale set by using image"
OS type: "Windows"
Image Details, Image URL: "$(currentdiskuri)"
Control Options, Timeout: 0

Here is the log:
2019-11-08T16:52:26.9213771Z ##[section]Starting: Azure VMSS: Update image
2019-11-08T16:52:26.9391518Z ==============================================================================
2019-11-08T16:52:26.9391660Z Task : Azure VM scale set deployment
2019-11-08T16:52:26.9392001Z Description : Deploy a virtual machine scale set image
2019-11-08T16:52:26.9392286Z Version : 0.1.3
2019-11-08T16:52:26.9392383Z Author : Microsoft Corporation
2019-11-08T16:52:26.9392494Z Help : Learn more about this task
2019-11-08T16:52:26.9392645Z ==============================================================================
2019-11-08T16:52:31.1170459Z URL for new VMSS image: https://images.blob.core.windows.net/system/Microsoft.Compute/Images/images/current.uat/-osDisk.933c3073-996b-4994-837c-d73c0ad244d9.vhd.
2019-11-08T16:52:31.1171306Z Updating VMSS to use new image...
2019-11-08T16:53:22.5313515Z ##[error]Failed to update image for VMSS. Error: Provisioning of VM extension 'PowershellDSCExtn' has timed out. Extension installation may be taking too long, or extension status could not be obtained.
2019-11-08T16:53:22.5318770Z ##[section]Finishing: Azure VMSS: Update image

This task has worked well for us for the last two years. Here is the log of the last successful run:
2019-11-05T16:34:13.8448330Z ##[section]Starting: Azure VMSS: Update image
2019-11-05T16:34:13.8625197Z ==============================================================================
2019-11-05T16:34:13.8625375Z Task : Azure VM scale set deployment
2019-11-05T16:34:13.8625496Z Description : Deploy a virtual machine scale set image
2019-11-05T16:34:13.8625594Z Version : 0.1.3
2019-11-05T16:34:13.8625705Z Author : Microsoft Corporation
2019-11-05T16:34:13.8625841Z Help : Learn more about this task
2019-11-05T16:34:13.8625982Z ==============================================================================
2019-11-05T16:34:18.3529195Z URL for new VMSS image: https://---images.blob.core.windows.net/system/Microsoft.Compute/Images/images/current.uat/----osDisk.7f05add1-3f37-42d6-86b3-4070a54d6d3d.vhd.
2019-11-05T16:34:18.3529946Z Updating VMSS to use new image...
2019-11-05T16:35:09.8821767Z Successfully updated VMSS image.
2019-11-05T16:35:09.8860322Z ##[section]Finishing: Azure VMSS: Update image

@chernovol

@bishal-pdMSFT @damccorm
Can we have any update, as we are blocked currently by this issue?

@bishal-pdMSFT
Contributor

@chernovol can you please attach debug logs for the failed task?

Also, it looks like a VM extension is failing to install on the VMSS. The task does not install any such extension, but I think that when the VMSS image gets updated, the VMSS tries to reinstall all extensions that were previously installed on it. If any such extension fails, the image update itself may be reported as a failure. Can you please check the VMSS in the Azure portal to see whether it is reporting an extension failure? You can remove the extension and try the task again.

@hanif-everycity

hanif-everycity commented Nov 12, 2019

@bishal-pdMSFT Yes, I can see a failure report related to provisioning.

[Screenshot attached]

What's the quickest way of generating the debug logs for the failed task?

@bishal-pdMSFT
Contributor

@hanif-everycity can you delete the extension and retry the task?

To generate debug logs, add a pipeline variable named system.debug with the value true and then create a new run.

@hanif-everycity

hanif-everycity commented Nov 12, 2019

@bishal-pdMSFT Please find attached the log for the task. I've masked some sensitive data.

18_Azure VMSS MyCompany-UAT Update image.log

With regards to deleting the extension, I removed it from Azure > VMSS > Extensions prior to the above run.

@chernovol

@bishal-pdMSFT
Did you have a chance to check the log provided by @hanif-everycity?

@bishal-pdMSFT
Contributor

@chernovol the task does not install this extension explicitly, so I can't comment on why it is failing. But I have a hunch that updating the image on the VMSS somehow triggers this extension's installation. This could be due to something specific to this VMSS. Can you make a REST call to get more details about this VMSS https://management.azure.com/subscriptions/xxxx-xxxx-xxxx-xxxx/resourceGroups/ccvmss-uat/providers/Microsoft.Compute/virtualMachineScaleSets/MyCompany-UAT?$expand=undefined&api-version=2016-03-30 and attach the response here (after scrubbing)?

Another option is to try this image update with a different VMSS (preferably a new one). If the update succeeds there, that points to an issue with the current VMSS.
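
For readers who want to script the REST lookup suggested above, here is a minimal sketch. It only builds the request URL and reads extension names out of a response body; it does not send the request, which would also need an Azure AD bearer token. The helper names are hypothetical, the `$expand=undefined` parameter from the URL above is deliberately left out, and the response field path (`properties.virtualMachineProfile.extensionProfile.extensions`) is an assumption to verify against a real response.

```python
from urllib.parse import quote, urlencode


def vmss_get_url(subscription_id, resource_group, vmss_name, api_version="2016-03-30"):
    """Build the ARM GET URL for a scale set (api-version taken from the thread)."""
    segments = [
        "subscriptions", subscription_id,
        "resourceGroups", resource_group,
        "providers", "Microsoft.Compute",
        "virtualMachineScaleSets", vmss_name,
    ]
    path = "/".join(quote(segment, safe="") for segment in segments)
    return "https://management.azure.com/" + path + "?" + urlencode({"api-version": api_version})


def extension_names(vmss):
    """List extension names from a VMSS GET response body (field path is an assumption)."""
    profile = vmss.get("properties", {}).get("virtualMachineProfile", {})
    extensions = profile.get("extensionProfile", {}).get("extensions", [])
    return [extension.get("name") for extension in extensions]


url = vmss_get_url("xxxx-xxxx-xxxx-xxxx", "ccvmss-uat", "MyCompany-UAT")
print(url)

# Hypothetical, heavily trimmed response body for illustration only.
sample = {"properties": {"virtualMachineProfile": {"extensionProfile": {
    "extensions": [{"name": "PowershellDSCExtn"}]}}}}
print(extension_names(sample))
```

If the failing extension shows up in that list, it is part of the scale set model and will be applied again whenever instances are reimaged.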

@hanif-everycity

Thanks @bishal-pdMSFT. I uninstalled the PowerShell DSC extension and restarted the scale set, after which the deployment completed successfully. I then reinstated the exact same PowerShell DSC extension and script, restarted the scale set, and ran the deployment successfully again. It seems to me the extension got stuck somewhere during installation. The script itself works fine on a standalone VM.

@charitycheckout-azure

charitycheckout-azure commented Nov 14, 2019

@bishal-pdMSFT We are now seeing the following error:

2019-11-14T15:18:05.9429239Z ##[error]Long running operation failed with status 'Failed'. Additional Info:'VM has reported a failure when processing extension 'PowershellDSCExt'. Error message: "The DSC Extension received an incorrect input: Value for password key 'registrationKeyPrivate' is missing. Please provide a valid password..

In Azure, under VMSS > Extension, we are installing the PowerShell DSC extension with the following configuration for LCM. It's defining the credentials for the DSC found in Azure Automation. How do we pass in the Azure Automation key (RegistrationKey)? Where does the extension look for this key?

{
  "Properties": [
    {
      "Name": "RegistrationKey",
      "Value": {
        "UserName": "PLACEHOLDER_DONOTUSE",
        "Password": "PrivateSettingsRef:registrationKeyPrivate"
      },
      "TypeName": "PSCredential"
    },
    {
      "Name": "RegistrationUrl",
      "Value": "https://uks-agentservice-prod-1.azure-automation.net/accounts/xxxx",
      "TypeName": "System.String"
    },
    {
      "Name": "NodeConfigurationName",
      "Value": "DNSBindings.localhost",
      "TypeName": "System.String"
    },
    {
      "Name": "ConfigurationMode",
      "Value": "ApplyandAutoCorrect",
      "TypeName": "System.String"
    }
  ]
}
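
The error above says the key named by `PrivateSettingsRef:registrationKeyPrivate` could not be resolved. A minimal sketch of a pre-deployment check follows. It assumes, based on the DSC extension documentation as best recalled, that such references are looked up in an `Items` map inside the extension's protected settings; confirm this against the current documentation before relying on it. The function name and sample values are hypothetical.

```python
PREFIX = "PrivateSettingsRef:"


def missing_private_settings(properties, protected_items):
    """Return referenced private-setting keys that have no value in protected_items."""
    missing = []
    for prop in properties:
        value = prop.get("Value")
        if not isinstance(value, dict):
            continue  # plain string settings carry no private reference
        password = value.get("Password", "")
        if isinstance(password, str) and password.startswith(PREFIX):
            key = password[len(PREFIX):]
            if not protected_items.get(key):
                missing.append(key)
    return missing


# Public settings trimmed from the configuration quoted above.
properties = [
    {
        "Name": "RegistrationKey",
        "Value": {
            "UserName": "PLACEHOLDER_DONOTUSE",
            "Password": "PrivateSettingsRef:registrationKeyPrivate",
        },
        "TypeName": "PSCredential",
    },
    {"Name": "ConfigurationMode", "Value": "ApplyandAutoCorrect", "TypeName": "System.String"},
]

# Assumed shape of protectedSettings["Items"]; empty here to mimic the failure.
print(missing_private_settings(properties, {}))
```

With an empty map the check reports `registrationKeyPrivate` as missing, which matches the wording of the error; supplying that key (for example, the Azure Automation registration key) would clear it under the stated assumption.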

@chernovol

Morning @bishal-pdMSFT. Can you help us with the above? This is a huge blocker for us.

@bishal-pdMSFT
Contributor

@chernovol I am not the right person for the PowerShell DSC extension. As per @hanif-everycity's comment, the VMSS image update works if the PowerShell DSC extension is not installed. The culprit is this extension, and you should try to fix it. Unfortunately I can't help with this extension.

@pl-pack-01

We are also experiencing this issue.

@bishal-pdMSFT
Contributor

@pl-pack-01 do you mean you are also hitting the PowerShell DSC extension error?

@chernovol

Hi @bishal-pdMSFT
Who is the right person for the PowerShell DSC extension? Can that person be added to this ticket?
Or should we create a ticket in another repo? If so, where is the right place for it?

@pl-pack-01

I apologize for the ambiguous comment. To clarify, we have VMSS nodes that continue to fail with error messages similar to the ones mentioned above. We have seen the following: reports of failed provisioning when provisioning appears to have succeeded; reports of failed provisioning with an error message about the DSC extension timing out; and reports of failed provisioning with an error message that it was unable to connect to either the license or upgrade store. In each of these cases all the nodes are responsive, but a few (between 1 and 4 out of 12) will not complete the DSC PowerShell script. Please let me know who can help us with this issue.

@chernovol

@bishal-pdMSFT @damccorm
Any update?

@bishal-pdMSFT
Contributor

@chernovol I do not have a contact for the PowerShell DSC extension. I did a quick internet search and found this page. You might want to use the support section there.

@bishal-pdMSFT
Contributor

Closing this issue as the problem is not in the VM scale set deployment task.

@pl-pack-01

pl-pack-01 commented Nov 20, 2019

I disagree with closing this ticket without helping us find the group responsible. VMSS is where we are experiencing this issue, and I am not experiencing it with other Azure resources. At the very least this is related to VMSS. It would be different if this were a third-party extension totally outside your company's control.

@chernovol

@pl-pack-01 Totally agree with you!

@bishal-pdMSFT @damccorm @vincent1173 @hiyadav @DS-MS @luweilevi @jahsu-MSFT
Can somebody help us?

@pl-pack-01

@chernovol not sure if you are still experiencing this issue, but I found that mine was due to an out-of-memory exception that was being masked by the timeout. I was able to refactor my script to be more memory efficient; another option would be to increase the MaxMemoryPerShellMB setting. Hope this helps.

@uF7264

uF7264 commented Jan 2, 2020

Has this problem been solved? We are struggling with the same error message.

Steps to reproduce

  1. Create VMSS using Azure Portal
    1.1 Select "Browse all public and private images" => My Items => My Images => Select Custom Image
    1.2 Confirm in the template definition that the imageReference is based on an id and that the interface confirms we are using a "Custom Image"

  2. Create DevOps Pipeline
    2.1 Building the image runs smoothly; a new .vhd file is created in our storage unit.
    2.2 The last step, "Azure VM scale set deployment", fails after 1s with the error message: "*** can not be updated as it uses a platform image. Only a VMSS which is currently using a custom image can be updated."

How can we fix that, or is it a general problem with the pipeline?

@hanif-everycity

@uF7264 Your issue sounds different from the one we were facing. We use a DSC extension to register the DSC pull server on each of the VMs in the scale set. The extension was failing due to permission issues, and that failure was being reported in the DevOps pipeline during the "Azure VM scale set deployment" step.

In the end, we decided to use a PowerShell script with the PS extension to register the DSC pull server on VM startup instead.

@pl-pack-01

We are no longer experiencing this issue. The changes we made mentioned above resolved it for us.
