ALAR2 repair scripts failing when using arm64 CPU on a SKU with an NVMe temp disk #35

@C-Lock

Issue

The ALAR2 repair scripts fail when both of the following conditions are met:

  • arm64 CPU Architecture
  • VM SKU with NVMe temp disk

This has been tested with:

  • Both a common and an uncommon distribution
  • Multiple SKUs with an NVMe temp disk
  • Two ALAR actions: fstab and grubfix

Steps to Reproduce

  1. Create a test VM with an arm64 image and select a VM SKU with an NVMe temp disk
  2. Follow the ALAR Repair instructions here: https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/repair-linux-vm-using-alar
  3. Observe failures referencing the src/nvme.rs file

Expected Results

The ALAR2 script runs without error.

Observed Results

The ALAR2 script aborts with a panic at src/nvme.rs:99:39 (assertion failed: self.is_char_boundary(at)). Full output is under Error Output.

Workaround

  1. Manually resize the repair VM to a SKU with no temp disk
  2. Retry the ALAR script

Error Output

$ az vm repair create --verbose --resource-group RG --name RHEL_arm64_01

...
Fetching architecture type of the source VM...
Checking if source VM is gen2
ARM64 VM detected
No specific distro was provided , using the default ARM64 Ubuntu distro
Disabling trusted launch on ARM
Using source VM size Standard_E2pds_v6 for repair VM
Checking if provided VM size is available...
Selected VM size 'Standard_E2pds_v6' is available. Selecting it to create repair VM.

Checking for existing resource groups with identical name within subscription...
Pre-existing repair resource group with the same name is 'False'
Creating resource group for repair VM and its resources...
Source VM uses managed disks. Creating repair VM with managed disks.

Copying OS disk of source VM...
Creating repair VM with command: az vm create -g repair-RHEL_arm64_01-20260520214417 -n repair-RHEL_ar_ --image Canonical:ubuntu-24_04-lts:server-arm64:latest --admin-username ******** --admin-password ******** --public-ip-address "" --tags repair_source=rg>/RHEL_arm64_01 --custom-data /home/********/.azure/cliextensions/vm-repair/azext_vm_repair/scripts/linux-build_setup-cloud-init.txt --size Standard_E2pds_v6 --security-type Standard
copy_disk_id: /subscriptions/SubID/resourceGroups/RG/providers/Microsoft.Compute/disks/RHEL_arm64_01-DiskCopy-20260520214417
fix_uuid: True
Validating VM template before continuing...
Creating repair VM...
Attaching copied disk to repair VM as data disk...

Your repair VM 'repair-RHEL_ar_' has been created in the resource group 'repair-RHEL_arm64_01-20260520214417' with disk 'RHEL_arm64_01-DiskCopy-20260520214417' attached as data disk. Please use this VM to troubleshoot and repair. Once the repairs are complete use the command 'az vm repair restore -n RHEL_arm64_01 -g RG --verbose' to restore disk to the source VM. Note that the copied disk is created within the original resource group 'RG'.

{
"copied_disk_name": "RHEL_arm64_01-DiskCopy-20260520214417",
"copied_disk_uri": "/subscriptions/SubID/resourceGroups/RG/providers/Microsoft.Compute/disks/RHEL_arm64_01-DiskCopy-20260520214417",
"created_resources": [
"/subscriptions/SubID/resourceGroups/repair-RHEL_arm64_01-20260520214417/providers/Microsoft.Network/networkSecurityGroups/repair-RHEL_ar_NSG",
"/subscriptions/SubID/resourceGroups/repair-RHEL_arm64_01-20260520214417/providers/Microsoft.Network/virtualNetworks/repair-RHEL_ar_VNET",
"/subscriptions/SubID/resourceGroups/repair-RHEL_arm64_01-20260520214417/providers/Microsoft.Network/networkInterfaces/repair-RHEL_ar_VMNic",
"/subscriptions/SubID/resourceGroups/repair-RHEL_arm64_01-20260520214417/providers/Microsoft.Compute/virtualMachines/repair-RHEL_ar_",
"/subscriptions/SubID/resourceGroups/REPAIR-RHEL_ARM64_01-20260520214417/providers/Microsoft.Compute/disks/repair-RHEL_ar__OsDisk_1_5ca0a3517bf843a2ae6edd76134e70f2",
"/subscriptions/SubID/resourceGroups/RG/providers/Microsoft.Compute/disks/RHEL_arm64_01-DiskCopy-20260520214417"
],
"message": "Your repair VM 'repair-RHEL_ar_' has been created in the resource group 'repair-RHEL_arm64_01-20260520214417' with disk 'RHEL_arm64_01-DiskCopy-20260520214417' attached as data disk. Please use this VM to troubleshoot and repair. Once the repairs are complete use the command 'az vm repair restore -n RHEL_arm64_01 -g RG --verbose' to restore disk to the source VM. Note that the copied disk is created within the original resource group 'RG'.",
"repair_resource_group": "repair-RHEL_arm64_01-20260520214417",
"repair_vm_name": "repair-RHEL_ar_",
"status": "SUCCESS"
}
Command ran in 286.147 seconds (init: 0.095, invoke: 286.053)

$ az vm repair run --verbose --resource-group RG --name RHEL_arm64_01 --run-id linux-alar2 --parameters fstab --run-on-repair

Searching for repair-vm within subscription...
Found repair VM: /subscriptions/SubID/resourceGroups/repair-RHEL_arm64_01-20260520214417/providers/Microsoft.Compute/virtualMachines/repair-RHEL_ar_

Running script on repair VM: repair-RHEL_ar_

Script returned with error:
[Error 05/20/2026 21:49:41]--2026-05-20 21:49:40-- https://raw.githubusercontent.com/Azure/ALAR/main/src/run-alar.sh Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 1031 (1.0K) [text/plain] Saving to: ‘run-alar.sh’ 0K . 100% 106M=0s 2026-05-20 21:49:41 (106 MB/s) - ‘run-alar.sh’ saved [1031/1031] thread 'main' (1766) panicked at src/nvme.rs:99:39: assertion failed: self.is_char_boundary(at) note: run with RUST_BACKTRACE=1 environment variable to display a backtrace ./run-alar.sh: line 32: 1766 Aborted (core dumped) RUST_LOG=info ./alar "${args[@]}"

{
"err": "--2026-05-20 21:49:40-- https://raw.githubusercontent.com/Azure/ALAR/main/src/run-alar.sh\nResolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...\nConnecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 1031 (1.0K) [text/plain]\nSaving to: ‘run-alar.sh’\n\n 0K . 100% 106M=0s\n\n2026-05-20 21:49:41 (106 MB/s) - ‘run-alar.sh’ saved [1031/1031]\n\n\nthread 'main' (1766) panicked at src/nvme.rs:99:39:\nassertion failed: self.is_char_boundary(at)\nnote: run with RUST_BACKTRACE=1 environment variable to display a backtrace\n./run-alar.sh: line 32: 1766 Aborted (core dumped) RUST_LOG=info ./alar "${args[@]}"",
"log_full_path": "/var/lib/waagent/run-command/download/0/repair-files-20260520214940/logs-20260520214940.txt",
"logs": "[Log-Start 05/20/2026 21:49:40] [Error 05/20/2026 21:49:41]--2026-05-20 21:49:40-- https://raw.githubusercontent.com/Azure/ALAR/main/src/run-alar.sh Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 1031 (1.0K) [text/plain] Saving to: ‘run-alar.sh’ 0K . 100% 106M=0s 2026-05-20 21:49:41 (106 MB/s) - ‘run-alar.sh’ saved [1031/1031] thread 'main' (1766) panicked at src/nvme.rs:99:39: assertion failed: self.is_char_boundary(at) note: run with RUST_BACKTRACE=1 environment variable to display a backtrace ./run-alar.sh: line 32: 1766 Aborted (core dumped) RUST_LOG=info ./alar "${args[@]}" [Log-End 05/20/2026 21:49:41]/var/lib/waagent/run-command/download/0/repair-files-20260520214940/logs-20260520214940.txt",
"message": "Script completed with errors.",
"output": "[Error 05/20/2026 21:49:41]--2026-05-20 21:49:40-- https://raw.githubusercontent.com/Azure/ALAR/main/src/run-alar.sh Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 1031 (1.0K) [text/plain] Saving to: ‘run-alar.sh’ 0K . 100% 106M=0s 2026-05-20 21:49:41 (106 MB/s) - ‘run-alar.sh’ saved [1031/1031] thread 'main' (1766) panicked at src/nvme.rs:99:39: assertion failed: self.is_char_boundary(at) note: run with RUST_BACKTRACE=1 environment variable to display a backtrace ./run-alar.sh: line 32: 1766 Aborted (core dumped) RUST_LOG=info ./alar "${args[@]}" ",
"resource_group": "repair-RHEL_arm64_01-20260520214417",
"script_status": "ERROR",
"status": "SUCCESS",
"vm_name": "repair-RHEL_ar_"
}
Command ran in 34.762 seconds (init: 0.083, invoke: 34.679)
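
Note on the panic message

The text "assertion failed: self.is_char_boundary(at)" appears to match the assertion inside the Rust standard library's String::split_off, which panics when the byte offset is past the end of the string or falls inside a multi-byte character. The code at src/nvme.rs:99 has not been inspected for this report, so the following is only a minimal sketch of that failure mode and one possible guard. The device names, the offset 7, and the helper checked_split_off are hypothetical and are not taken from the ALAR source.

```rust
use std::panic;

/// Hypothetical helper: split `s` at byte offset `at`, returning `None`
/// instead of panicking when `at` is past the end of the string or
/// inside a multi-byte character.
fn checked_split_off(s: &mut String, at: usize) -> Option<String> {
    if s.is_char_boundary(at) {
        Some(s.split_off(at))
    } else {
        None
    }
}

fn main() {
    // Illustrative device names only; not what nvme.rs actually parses.
    let mut long_name = String::from("nvme0n1p1");
    let tail = checked_split_off(&mut long_name, 7);
    println!("head={:?} tail={:?}", long_name, tail);

    // A shorter name than the offset assumes: the guard returns None
    // and leaves the string untouched.
    let mut short_name = String::from("nvme0");
    let tail = checked_split_off(&mut short_name, 7);
    println!("head={:?} tail={:?}", short_name, tail);

    // The unguarded call panics on the same input. catch_unwind is used
    // here only so the sketch can report the panic and keep running.
    let result = panic::catch_unwind(|| {
        let mut s = String::from("nvme0");
        s.split_off(7)
    });
    println!("unguarded split_off panicked: {}", result.is_err());
}
```

If the real cause is along these lines, a device name or path on the arm64 NVMe temp disk SKUs that is shorter than, or shaped differently from, what the code expects would be enough to trigger the abort.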
