[AI Generated] BugFix: fix IndexError in _hot_add_disk_serial when no new device appears #4435

Open
johnsongeorge-w wants to merge 1 commit into main from bugfix/hot-add-disk-serial-index-error_240426_101331

@johnsongeorge-w
Collaborator

Summary

Fix an IndexError: list index out of range crash in _hot_add_disk_serial() when a hot-added data disk does not appear as a new Linux block device after attachment.

Root Cause

The method computed the set difference of device keys before vs. after hot-add, then immediately indexed [0] on the resulting list without checking whether it was empty. If the disk was not yet visible (timing race) or failed to appear, the empty-list access raised IndexError.
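
The failure mode can be reproduced in isolation. The sketch below assumes, for illustration only, that the before/after LUN mappings are dictionaries from device name to LUN; the device names and the helper `first_new_device_lun` are hypothetical and are not taken from the patched file.

```python
from typing import Dict


def first_new_device_lun(before: Dict[str, int], after: Dict[str, int]) -> int:
    # Keys present after the hot-add but absent before it.
    new_keys = list(set(after) - set(before))
    # Unchecked indexing: raises IndexError when new_keys is empty.
    return after[new_keys[0]]


before = {"/dev/sda": 0}
# Simulate the race: the disk is not visible yet, so nothing changed.
after_missing = dict(before)

try:
    first_new_device_lun(before, after_missing)
    message = ""
except IndexError as error:
    message = str(error)
```

When the new device does show up, the same helper returns its LUN, which is why the bug only surfaces in the timing-race or failed-attach case.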

Fix

  • Replace the bare [0] indexing with an explicit assert_that(...).is_not_empty() assertion that produces a clear diagnostic message including the expected LUN number.
  • Use next(iter(...)) to retrieve the first element safely after the assertion guarantees non-emptiness.
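
A minimal sketch of the guard-then-retrieve pattern described in the two bullets above. It uses a plain `AssertionError` in place of the `assert_that(...).is_not_empty()` call so that it runs without third-party packages; the function name `new_device_lun` and the dictionary shape of the mappings are assumptions for illustration.

```python
from typing import Dict


def new_device_lun(before: Dict[str, int], after: Dict[str, int], lun: int) -> int:
    new_keys = set(after) - set(before)
    # Fail with a diagnostic message instead of an opaque IndexError.
    if not new_keys:
        raise AssertionError(
            f"Expected new device at lun {lun} but no new device appeared. "
            f"Before: {before}, After: {after}."
        )
    # Safe here: the check above guarantees at least one element.
    return after[next(iter(new_keys))]
```

With exactly one new key, `next(iter(new_keys))` is deterministic; with more than one it returns an arbitrary member of the set.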

Validation

Tested on 3 distro families with verify_hot_add_disk_serial_premium_ssd:

| Distro | Kernel | VM Size | Region | Result |
| --- | --- | --- | --- | --- |
| Ubuntu 22.04 LTS | 6.x | Standard_D2s_v3 | westus2 | PASSED |
| RHEL 9.7 | 5.14.0-611.45.1.el9_7 | Standard_D2s_v3 | westus2 | PASSED |
| SLES 15 SP5 | 5.14.21-150500.33.75-azure | Standard_D2s_v3 | westus2 | PASSED |

All lint checks pass (black, flake8, mypy).

… new device appears

Replace bare [0] indexing on the lun diff list with an explicit
assertion that the new device set is non-empty, providing a clear
error message about max_data_disk_count mismatch instead of an
opaque IndexError.
Copilot AI review requested due to automatic review settings April 24, 2026 21:45
Contributor

Copilot AI left a comment

Pull request overview

Fixes a crash in the Storage test suite hot-add helper by replacing an unsafe empty-list index with an assertion and a safer retrieval approach when no new Linux block device is detected after attaching a disk.

Changes:

  • Add an explicit non-empty assertion for newly discovered device keys when comparing Disk.get_luns() before/after hot-add.
  • Replace direct [0] indexing with next(iter(...)) after validating the set difference is non-empty.
  • Improve failure diagnostics by including LUN and before/after device mappings in the assertion message.

Comment on lines +861 to +865
linux_device_lun_diff = linux_device_luns_after[next(iter(new_device_keys))]
log.debug(f"linux_device_lun_diff: {linux_device_lun_diff}")
assert_that(linux_device_lun_diff, "New device lun mismatch").is_equal_to(
    lun
)
Copilot AI Apr 24, 2026

next(iter(new_device_keys)) picks an arbitrary element because set iteration order is not stable. If more than one new device key appears (e.g., due to renames/duplicate discoveries), this can become non-deterministic and fail even when a device at the expected LUN exists. Consider asserting there is exactly one new key, or selecting the key whose LUN matches the expected lun.

Suggested change
- linux_device_lun_diff = linux_device_luns_after[next(iter(new_device_keys))]
- log.debug(f"linux_device_lun_diff: {linux_device_lun_diff}")
- assert_that(linux_device_lun_diff, "New device lun mismatch").is_equal_to(
-     lun
- )
+ new_device_luns = [
+     linux_device_luns_after[device_key] for device_key in new_device_keys
+ ]
+ log.debug(f"new_device_luns: {new_device_luns}")
+ assert_that(
+     new_device_luns,
+     f"Expected a newly discovered device at lun {lun}. "
+     f"New device keys: {list(new_device_keys)}. "
+     f"Before: {linux_device_luns}. After: {linux_device_luns_after}.",
+ ).contains(lun)
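
The alternative mentioned in the comment, selecting the key whose LUN matches the expected one, could look roughly like the following. This is a sketch under the assumption that the mappings are device-name-to-LUN dictionaries; `find_new_device_at_lun` is a hypothetical helper, not part of the reviewed change.

```python
from typing import Dict


def find_new_device_at_lun(
    before: Dict[str, int], after: Dict[str, int], lun: int
) -> str:
    new_keys = set(after) - set(before)
    # Sort so the result does not depend on set iteration order.
    matches = sorted(key for key in new_keys if after[key] == lun)
    if not matches:
        raise AssertionError(
            f"Expected a newly discovered device at lun {lun}. "
            f"New device keys: {sorted(new_keys)}. "
            f"Before: {before}. After: {after}."
        )
    return matches[0]
```

Unlike `next(iter(...))`, this returns the same device on every run even when several new keys appear at once.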

Copilot uses AI. Check for mistakes.
Comment on lines 853 to +865
  assert_that(
-     linux_device_lun_diff, "No new device lun found on VM"
- ).is_equal_to(lun)
+     list(new_device_keys),
+     f"Expected new device at lun {lun} but no new device "
+     f"appeared. Before: {linux_device_luns}, "
+     f"After: {linux_device_luns_after}. This may indicate "
+     f"the VM size does not support this many data disks "
+     f"(max_data_disk_count={max_data_disk_count}).",
+ ).is_not_empty()
+ linux_device_lun_diff = linux_device_luns_after[next(iter(new_device_keys))]
+ log.debug(f"linux_device_lun_diff: {linux_device_lun_diff}")
+ assert_that(linux_device_lun_diff, "New device lun mismatch").is_equal_to(
+     lun
+ )
Copilot AI Apr 24, 2026

If the new-device assertion fails (or any assertion in the loop fails), the function exits before the cleanup loop runs, leaving any already-attached managed disks on the VM. Consider wrapping the add/verify loop in a try/finally so disk.remove_data_disk(disks_added) is always executed to avoid resource leaks and cascading failures/cost.
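
The try/finally shape suggested here can be sketched as follows. `FakeDiskFeature`, its `add_data_disk` signature, and `hot_add_and_verify` are hypothetical stand-ins so the example runs on its own; only the `remove_data_disk(disks_added)` call mirrors the name used in the comment.

```python
from typing import List


class FakeDiskFeature:
    """Hypothetical stand-in that only records attach/remove calls."""

    def __init__(self) -> None:
        self.attached: List[str] = []
        self.removed: List[str] = []

    def add_data_disk(self, count: int) -> List[str]:
        start = len(self.attached)
        names = [f"disk{start + offset}" for offset in range(count)]
        self.attached.extend(names)
        return names

    def remove_data_disk(self, names: List[str]) -> None:
        self.removed.extend(names)


def hot_add_and_verify(disk: FakeDiskFeature, disk_count: int, failing_lun: int) -> None:
    disks_added: List[str] = []
    try:
        for lun in range(disk_count):
            disks_added.extend(disk.add_data_disk(1))
            # Simulated verification failure for one LUN.
            if lun == failing_lun:
                raise AssertionError(f"no new device appeared at lun {lun}")
    finally:
        # Runs on success and on failure, so attached disks never leak.
        disk.remove_data_disk(disks_added)
```

The assertion still propagates to the test runner; the `finally` block only guarantees that cleanup happens first.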

@github-actions
❌ AI Test Selection — FAILED

74 test case(s) selected (view run)

Marketplace image: canonical 0001-com-ubuntu-server-jammy 22_04-lts-gen2 latest

| Status | Count |
| --- | --- |
| ✅ Passed | 56 |
| ❌ Failed | 4 |
| ⏭️ Skipped | 14 |
| Total | 74 |

Test case details
| Test Case | Status | Time (s) | Message |
| --- | --- | --- | --- |
| verify_deployment_provision_standard_ssd_disk (lisa_0_4) | ✅ PASSED | 34.119 | |
| smoke_test_check_serial_console_pattern (lisa_0_2) | ✅ PASSED | 37.292 | |
| verify_deployment_provision_synthetic_nic (lisa_0_3) | ✅ PASSED | 43.360 | |
| smoke_test (lisa_0_1) | ✅ PASSED | 41.429 | |
| verify_deployment_provision_ephemeral_managed_disk (lisa_0_5) | ✅ PASSED | 47.154 | |
| verify_deployment_provision_premium_disk (lisa_0_6) | ✅ PASSED | 37.567 | |
| verify_deployment_provision_sriov (lisa_0_8) | ✅ PASSED | 55.565 | |
| verify_deployment_provision_premiumv2_disk (lisa_0_7) | ✅ PASSED | 38.733 | |
| verify_deployment_provision_ultra_datadisk (lisa_0_10) | ✅ PASSED | 43.187 | |
| verify_stop_start_in_platform (lisa_0_11) | ✅ PASSED | 83.705 | |
| verify_reboot_in_platform (lisa_0_9) | ✅ PASSED | 196.583 | |
| verify_deployment_provision_swiotlb_force (lisa_0_13) | ✅ PASSED | 110.914 | |
| stress_reboot (lisa_0_12) | ✅ PASSED | 574.261 | |
| verify_vmbus_devices_channels_bsd (lisa_0_14) | ⏭️ SKIPPED | 0.000 | check skipped: OS type mismatch: ["requires [<class 'lisa.operating_system.BSD'>] but VM supports [<class 'lisa.operatin |
| verify_vmbus_devices_channels (lisa_0_15) | ✅ PASSED | 19.424 | |
| verify_vmbus_heartbeat_properties (lisa_0_16) | ✅ PASSED | 19.806 | |
| verify_serial_console (lisa_0_0) | ✅ PASSED | 37.539 | |
| verify_default_targetpw (lisa_0_36) | ✅ PASSED | 6.862 | |
| verify_grub (lisa_0_37) | ✅ PASSED | 2.201 | |
| verify_network_file_configuration (lisa_0_39) | ⏭️ SKIPPED | 0.260 | skipped: unsupported distro type: <class 'lisa.operating_system.Ubuntu'> |
| verify_ifcfg_eth0 (lisa_0_40) | ⏭️ SKIPPED | 0.254 | skipped: unsupported distro type: <class 'lisa.operating_system.Ubuntu'> |
| verify_udev_rules_moved (lisa_0_41) | ⏭️ SKIPPED | 0.421 | skipped: Unsupported distro type : <class 'lisa.operating_system.Ubuntu'> |
| verify_dhcp_file_configuration (lisa_0_42) | ⏭️ SKIPPED | 0.246 | skipped: Unsupported distro type : <class 'lisa.operating_system.Ubuntu'> |
| verify_repository_installed (lisa_0_46) | ✅ PASSED | 29.597 | |
| verify_serial_console_is_enabled (lisa_0_47) | ✅ PASSED | 1.865 | |
| verify_no_pre_exist_users (lisa_0_52) | ✅ PASSED | 4.024 | |
| verify_resource_disk_file_system (lisa_0_54) | ✅ PASSED | 8.282 | |
| verify_waagent_version (lisa_0_55) | ✅ PASSED | 2.427 | |
| verify_python_version (lisa_0_56) | ✅ PASSED | 1.817 | |
| verify_openssl_version (lisa_0_57) | ✅ PASSED | 1.847 | |
| verify_azure_64bit_os (lisa_0_58) | ✅ PASSED | 1.876 | |
| verify_omi_version (lisa_0_59) | ✅ PASSED | 2.315 | |
| verify_no_swap_on_osdisk (lisa_0_60) | ✅ PASSED | 1.975 | |
| verify_essential_kernel_modules (lisa_0_61) | ✅ PASSED | 2.948 | |
| verify_yum_conf (lisa_0_43) | ⏭️ SKIPPED | 1.675 | skipped: Unsupported distro type : <class 'lisa.operating_system.Ubuntu'> |
| verify_hv_kvp_daemon_installed (lisa_0_45) | ✅ PASSED | 4.357 | |
| verify_cloud_init_error_status (lisa_0_50) | ⏭️ SKIPPED | 0.231 | skipped: Unsupported system: 'Ubuntu 22.04.5 LTS'. unsupported distro to run verify_cloud_init test. |
| verify_client_active_interval (lisa_0_51) | ✅ PASSED | 2.138 | |
| verify_resource_disk_readme_file (lisa_0_53) | ✅ PASSED | 8.034 | |
| verify_os_update (lisa_0_44) | ✅ PASSED | 122.170 | |
| verify_network_manager_not_installed (lisa_0_38) | ⏭️ SKIPPED | 0.247 | skipped: unsupported distro type: <class 'lisa.operating_system.Ubuntu'> |
| verify_boot_error_fail_warnings (lisa_0_49) | ❌ FAILED | 7.014 | failed. AssertionError: [unexpected error/failure/warnings shown up in bootup log of distro Ubuntu 22.4.0] Expected <['A |
| verify_bash_history_is_empty (lisa_0_48) | ✅ PASSED | 9.709 | |
| verify_l3_cache (lisa_0_33) | ✅ PASSED | 2.118 | |
| verify_cpu_count (lisa_0_34) | ✅ PASSED | 0.269 | |
| verify_vmbus_interrupts (lisa_0_35) | ❌ FAILED | 10.865 | failed. AssertionError: [Hypervisor synthetic timer interrupt should be processed by all vCPU's] Expected to be |
| verify_dhcp_client_timeout (lisa_0_22) | ✅ PASSED | 4.081 | |
| verify_dns_name_resolution (lisa_0_71) | ✅ PASSED | 2.916 | |
| verify_dns_name_resolution_after_upgrade (lisa_0_72) | ✅ PASSED | 164.481 | |
| verify_floppy_module_is_blacklisted (lisa_0_18) | ✅ PASSED | 1.784 | |
| verify_initrd_modules (lisa_0_24) | ✅ PASSED | 29.204 | |
| verify_hyperv_modules (lisa_0_25) | ✅ PASSED | 6.543 | |
| verify_lis_modules_version (lisa_0_23) | ⏭️ SKIPPED | 8.487 | skipped: Ubuntu not supported. This test case only supports Redhat distros. |
| verify_reload_hyperv_modules (lisa_0_26) | ✅ PASSED | 193.741 | |
| verify_enable_kprobe (lisa_0_17) | ✅ PASSED | 3.882 | |
| verify_kvp (lisa_0_27) | ✅ PASSED | 10.844 | |
| verify_hyperv_platform_id (lisa_0_28) | ✅ PASSED | 6.940 | |
| verify_pmu_disabled_for_arm64 (lisa_0_67) | ⏭️ SKIPPED | 0.420 | skipped: This test case does not support CpuArchitecture.X64. This validation is only for ARM64. |
| verify_timedrift_corrected (lisa_0_68) | ✅ PASSED | 75.453 | |
| verify_timesync_ptp (lisa_0_62) | ✅ PASSED | 3.346 | |
| verify_timesync_unbind_clockevent (lisa_0_64) | ❌ FAILED | 3.404 | failed. AssertionError: [Expected clockevent name is Hyper-V clockevent, but actual it is lapic.] Expected to be |
| verify_timesync_unbind_clocksource (lisa_0_63) | ✅ PASSED | 28.496 | |
| verify_timesync_chrony (lisa_0_66) | ✅ PASSED | 28.754 | |
| verify_timesync_ntp (lisa_0_65) | ✅ PASSED | 57.383 | |
| verify_vdso (lisa_0_20) | ✅ PASSED | 163.387 | |
| verify_vm_hot_resize_decrease (lisa_0_30) | ✅ PASSED | 188.626 | |
| verify_vm_resize_decrease (lisa_0_32) | ✅ PASSED | 233.653 | |
| verify_vm_resize_increase (lisa_0_31) | ❌ FAILED | 474.102 | failed. HttpResponseError: (InvalidParameter) The VM size 'Standard_E192ids_v6' cannot boot with DiskControllerType 'SCS |
| verify_vm_hot_resize (lisa_0_29) | ✅ PASSED | 699.613 | |
| verify_gdb (lisa_0_19) | ✅ PASSED | 80.699 | |
| verify_sched_core_basic (lisa_0_21) | ⏭️ SKIPPED | 8.797 | before_case skipped: Unsupported system: 'Ubuntu 22.04.5 LTS'. SCHED_CORE support is only tested on AzureLinux 3.0 and l |
| verify_boot_with_debug_kernel (lisa_0_73) | ⏭️ SKIPPED | 0.227 | skipped: Ubuntu not supported. This test case only supports redhat/centos distro. |
| verify_zram_crypto_zstd (lisa_0_69) | ⏭️ SKIPPED | 0.229 | before_case skipped: Unsupported system: 'Ubuntu 22.04.5 LTS'. zram compression test requires Azure Linux 3.0+. |
| verify_zram_crypto_lz4 (lisa_0_70) | ⏭️ SKIPPED | 0.259 | before_case skipped: Unsupported system: 'Ubuntu 22.04.5 LTS'. zram compression test requires Azure Linux 3.0+. |

3 participants