Update README #25

Merged
sgaud-quic merged 1 commit into qcom-next-staging-test from sgaud-quic-patch-8
Jun 17, 2025

@sgaud-quic
Owner

No description provided.

Signed-off-by: Salendarsingh Gaud <sgaud@qti.qualcomm.com>
@sgaud-quic sgaud-quic merged commit f6eebe3 into qcom-next-staging-test Jun 17, 2025
1 of 3 checks passed
@sgaud-quic sgaud-quic deleted the sgaud-quic-patch-8 branch June 17, 2025 07:28
sgaud-quic pushed a commit that referenced this pull request Sep 3, 2025
If we hit an error path in GEM obj creation before msm_gem_new_handle()
updates obj->resv to point to the gpuvm resv object, then obj->resv
still points to &obj->_resv.  In this case we don't want to decrement
the refcount of the object being freed (since the refcount is already
zero).  This fixes the following splat:

   ------------[ cut here ]------------
   refcount_t: underflow; use-after-free.
   WARNING: CPU: 9 PID: 7013 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
   Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_da>
    qcom_pil_info i2c_hid drm_kms_helper qcom_common qcom_q6v5 phy_snps_eusb2 qcom_geni_serial drm qcom_sysmon pinctrl_s>
   CPU: 9 UID: 1000 PID: 7013 Comm: deqp-vk Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
   Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
   pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
   pc : refcount_warn_saturate+0xf4/0x148
   lr : refcount_warn_saturate+0xf4/0x148
   sp : ffff8000a2073920
   x29: ffff8000a2073920 x28: 0000000000000010 x27: 0000000000000010
   x26: 0000000000000042 x25: ffff000810e09800 x24: 0000000000000010
   x23: ffff8000a2073b94 x22: ffff000ddb22de00 x21: ffff000ddb22dc00
   x20: ffff000ddb22ddf8 x19: ffff0008024934e0 x18: 000000000000000a
   x17: 0000000000000000 x16: ffff9f8c67d77340 x15: 0000000000000000
   x14: 00000000ffffffff x13: 2e656572662d7265 x12: 7466612d65737520
   x11: 3b776f6c66726564 x10: 00000000ffff7fff x9 : ffff9f8c67506c70
   x8 : ffff9f8c69fa26f0 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
   x5 : ffff000f53e14548 x4 : ffff6082ea2b2000 x3 : ffff0008b86ab080
   x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0008b86ab080
   Call trace:
    refcount_warn_saturate+0xf4/0x148 (P)
    msm_gem_free_object+0x248/0x260 [msm]
    drm_gem_object_free+0x24/0x40 [drm]
    msm_gem_new+0x1c4/0x1e0 [msm]
    msm_gem_new_handle+0x3c/0x1a0 [msm]
    msm_ioctl_gem_new+0x38/0x70 [msm]
    drm_ioctl_kernel+0xc8/0x138 [drm]
    drm_ioctl+0x2c8/0x618 [drm]
    __arm64_sys_ioctl+0xac/0x108
    invoke_syscall.constprop.0+0x64/0xe8
    el0_svc_common.constprop.0+0x40/0xe8
    do_el0_svc+0x24/0x38
    el0_svc+0x54/0x1d8
    el0t_64_sync_handler+0x10c/0x138
    el0t_64_sync+0x19c/0x1a0
   irq event stamp: 3698694
   hardirqs last  enabled at (3698693): [<ffff9f8c675021dc>] __up_console_sem+0x74/0x90
   hardirqs last disabled at (3698694): [<ffff9f8c68ce8164>] el1_dbg+0x24/0x90
   softirqs last  enabled at (3697578): [<ffff9f8c6744ec5c>] handle_softirqs+0x454/0x4b0
   softirqs last disabled at (3697567): [<ffff9f8c67360244>] __do_softirq+0x1c/0x28
   ---[ end trace 0000000000000000 ]---

Fixes: b58e12a ("drm/msm: Add _NO_SHARE flag")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/665355/
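
The guard described in this commit message can be sketched in plain userspace C. This is a simplified model, not the msm driver code: `toy_resv`, `toy_obj`, `toy_obj_init()`, `toy_obj_attach_shared()` and `toy_obj_free()` are hypothetical names, and the refcount is modelled as a bare integer rather than the kernel's refcount_t. It only illustrates the idea of skipping the reference drop while `resv` still points at the embedded `_resv`.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
struct toy_resv {
    int refcount;              /* references held on the shared resv owner */
};

struct toy_obj {
    struct toy_resv _resv;     /* embedded resv, used until a shared one is attached */
    struct toy_resv *resv;     /* points at _resv or at a shared resv */
};

/* A freshly created object always starts out pointing at its own resv. */
static void toy_obj_init(struct toy_obj *obj)
{
    obj->_resv.refcount = 0;
    obj->resv = &obj->_resv;
}

/* Later setup step: take a reference on the shared resv and switch to it. */
static void toy_obj_attach_shared(struct toy_obj *obj, struct toy_resv *shared)
{
    shared->refcount++;
    obj->resv = shared;
}

/*
 * Free path. Returns true if a shared reference was dropped.
 * If creation failed before the shared resv was attached, resv still
 * points at the embedded _resv and there is nothing to drop.
 */
static bool toy_obj_free(struct toy_obj *obj)
{
    if (obj->resv == &obj->_resv)
        return false;

    assert(obj->resv->refcount > 0);
    obj->resv->refcount--;
    obj->resv = &obj->_resv;
    return true;
}
```

In this model, an object freed straight after `toy_obj_init()` leaves every counter untouched, while an object that went through `toy_obj_attach_shared()` gives back exactly the one reference it took.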
sgaud-quic pushed a commit that referenced this pull request Sep 3, 2025
submit_unpin_objects() should come before we unlock the objects.  This
fixes the splat:

   WARNING: CPU: 2 PID: 2171 at drivers/gpu/drm/msm/msm_gem.h:395 msm_gem_unpin_locked+0x8c/0xd8 [msm]
   Modules linked in: uinput snd_seq_dummy snd_hrtimer aes_ce_ccm snd_soc_wsa884x regmap_sdw q6prm_clocks q6apm_lpass_dais q6apm_dai snd_q6dsp_common q6prm snd_q6apm qcom_pd_mapper cdc_mbim cdc_wdm cdc_ncm r8153_ecm cdc_ether usbnet sunrpc nls_ascii nls_cp437 vfat fat snd_soc_x1e80100 snd_soc_lpass_rx_macro snd_soc_lpass_tx_macro snd_soc_lpass_va_macro snd_soc_lpass_wsa_macro snd_soc_qcom_common soundwire_qcom snd_soc_lpass_macro_common snd_soc_hdmi_codec snd_soc_qcom_sdw ext4 snd_soc_core snd_compress soundwire_bus snd_pcm_dmaengine snd_seq mbcache jbd2 snd_seq_device snd_pcm pm8941_pwrkey snd_timer r8152 qcom_spmi_temp_alarm industrialio snd lenovo_yoga_slim7x ath12k mii arm_smccc_trng soundcore rng_core evdev loop panel_samsung_atna33xc20 msm ubwc_config drm_client_lib drm_gpuvm drm_exec gpu_sched drm_display_helper pmic_glink_altmode aux_hpd_bridge ucsi_glink qcom_battmgr phy_qcom_qmp_combo ps883x cec aux_bridge drm_dp_aux_bus i2c_hid_of aes_ce_blk drm_kms_helper aes_ce_cipher i2c_hid qcom_q6v5_pas
    ghash_ce qcom_pil_info drm sha1_ce qcom_common phy_snps_eusb2 qcom_geni_serial qcom_q6v5 qcom_sysmon pinctrl_sm8550_lpass_lpi lpasscc_sc8280xp sbsa_gwdt mdt_loader gpio_keys pmic_glink i2c_dev efivarfs autofs4
   CPU: 2 UID: 1000 PID: 2171 Comm: gnome-shell Not tainted 6.16.0-rc4-debug+ #25 PREEMPT(voluntary)
   Hardware name: LENOVO 83ED/LNVNB161216, BIOS NHCN53WW 08/02/2024
   pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
   pc : msm_gem_unpin_locked+0x8c/0xd8 [msm]
   lr : msm_gem_unpin_locked+0x88/0xd8 [msm]
   sp : ffff80009c963820
   x29: ffff80009c963820 x28: ffff80009c9639f8 x27: ffff00080552a830
   x26: 0000000000000000 x25: ffff0009d5655800 x24: 0000000000000000
   x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
   x20: ffff000831db5480 x19: ffff000816e74400 x18: 0000000000000000
   x17: 0000000000000000 x16: ffffc1396afdd720 x15: 0000000000000000
   x14: 0000000000000000 x13: 0000000000000000 x12: ffff0008c065bc00
   x11: ffff0008c065c000 x10: 0000000000000000 x9 : ffffc13945b19074
   x8 : 0000000000000000 x7 : 0000000000000209 x6 : 0000000000000002
   x5 : 0000000000019d01 x4 : ffff0008ba8db080 x3 : 000000000004093f
   x2 : ffff3ed5e727f000 x1 : 0000000000000000 x0 : 0000000000000000
   Call trace:
    msm_gem_unpin_locked+0x8c/0xd8 [msm] (P)
    msm_ioctl_gem_submit+0x32c/0x1760 [msm]
    drm_ioctl_kernel+0xc8/0x138 [drm]
    drm_ioctl+0x2c8/0x618 [drm]
    __arm64_sys_ioctl+0xac/0x108
    invoke_syscall.constprop.0+0x64/0xe8
    el0_svc_common.constprop.0+0x40/0xe8
    do_el0_svc+0x24/0x38
    el0_svc+0x54/0x1d8
    el0t_64_sync_handler+0x10c/0x138
    el0t_64_sync+0x19c/0x1a0
   irq event stamp: 2185036
   hardirqs last  enabled at (2185035): [<ffffc1396afeef9c>] _raw_spin_unlock_irqrestore+0x74/0x80
   hardirqs last disabled at (2185036): [<ffffc1396afd8164>] el1_dbg+0x24/0x90
   softirqs last  enabled at (2184778): [<ffffc13969675e44>] fpsimd_restore_current_state+0x3c/0x328
   softirqs last disabled at (2184776): [<ffffc13969675e14>] fpsimd_restore_current_state+0xc/0x328
   ---[ end trace 0000000000000000 ]---

Fixes: 111fdd2 ("drm/msm: drm_gpuvm conversion")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/665357/
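
The ordering rule in this commit message (unpin while the objects are still locked, then unlock) can be illustrated with a small userspace model. All names here (`toy_bo`, `toy_unpin_locked()`, `toy_unlock()`, `toy_submit_cleanup()`) are hypothetical and the lock is just a boolean flag; the real driver uses its own locking, and the assumption that the warning comes from a lock-held check is inferred from the `_locked` suffix in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy buffer object: a lock flag and a pin count, nothing more. */
struct toy_bo {
    bool locked;
    int pin_count;
};

/*
 * Unpin helper that requires the caller to hold the lock.
 * Returns false (where a kernel helper would warn) if it does not.
 */
static bool toy_unpin_locked(struct toy_bo *bo)
{
    if (!bo->locked)
        return false;
    if (bo->pin_count > 0)
        bo->pin_count--;
    return true;
}

static void toy_unlock(struct toy_bo *bo)
{
    bo->locked = false;
}

/*
 * Cleanup in the order the commit message asks for:
 * first unpin every object, then unlock every object.
 * Returns the number of unpin calls made without the lock held.
 */
static size_t toy_submit_cleanup(struct toy_bo *bos, size_t n)
{
    size_t violations = 0;

    for (size_t i = 0; i < n; i++)
        if (!toy_unpin_locked(&bos[i]))
            violations++;

    for (size_t i = 0; i < n; i++)
        toy_unlock(&bos[i]);

    return violations;
}
```

Swapping the two loops in `toy_submit_cleanup()` would make every `toy_unpin_locked()` call fail its lock check, which is the toy equivalent of the warning shown in the splat.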
sgaud-quic pushed a commit that referenced this pull request Nov 1, 2025
…machine

After GPU reset with VRAM loss, a general protection fault occurs
during user queue restoration when accessing vm_bo->vm after
spinlock release in amdgpu_vm_bo_reset_state_machine.

The root cause is that vm_bo points to the last entry from the
list_for_each_entry loop, but this becomes invalid after the
spinlock is released. Accessing vm_bo->vm at this point leads
to memory corruption.

Crash log shows:
[  326.981811] Oops: general protection fault, probably for non-canonical address 0x4156415741e58ac8: 0000 [#1] SMP NOPTI
[  326.981820] CPU: 13 UID: 0 PID: 1035 Comm: kworker/13:3 Tainted: G            E       6.16.0+ #25 PREEMPT(voluntary)
[  326.981826] Tainted: [E]=UNSIGNED_MODULE
[  326.981827] Hardware name: Gigabyte Technology Co., Ltd. X870E AORUS PRO ICE/X870E AORUS PRO ICE, BIOS F3i 12/19/2024
[  326.981831] Workqueue: events amdgpu_userq_restore_worker [amdgpu]
[  326.981999] RIP: 0010:amdgpu_vm_assert_locked+0x16/0x70 [amdgpu]
[  326.982094] Code: 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 85 ff 74 45 48 8b 87 80 03 00 00 48 85 c0 74 40 <48> 8b b8 80 01 00 00 48 85 ff 74 3b 8b 05 0c b7 0e f0 85 c0 75 05
[  326.982098] RSP: 0018:ffffaa91c2a6bc20 EFLAGS: 00010206
[  326.982100] RAX: 4156415741e58948 RBX: ffff9e8f013e8330 RCX: 0000000000000000
[  326.982102] RDX: 0000000000000005 RSI: 000000001d254e88 RDI: ffffffffc144814a
[  326.982104] RBP: ffffaa91c2a6bc68 R08: 0000004c21a25674 R09: 0000000000000001
[  326.982106] R10: 0000000000000001 R11: dccaf3f2f82863fc R12: ffff9e8f013e8000
[  326.982108] R13: ffff9e8f013e8000 R14: 0000000000000000 R15: ffff9e8f09980000
[  326.982110] FS:  0000000000000000(0000) GS:ffff9e9e79995000(0000) knlGS:0000000000000000
[  326.982112] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  326.982114] CR2: 000055ed6c9caa80 CR3: 0000000797060000 CR4: 0000000000750ef0
[  326.982116] PKRU: 55555554

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
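
The hazard described in this commit message, using a loop cursor after the loop (and the lock protecting the list) is finished, can be sketched in userspace C. This is an illustration under stated assumptions, not the amdgpu change itself: `toy_vm`, `toy_vm_bo` and `toy_reset_state()` are hypothetical, the list is a simple NULL-terminated chain, and the lock is represented only by comments. In the kernel, the cursor of a completed list_for_each_entry loop is computed from the list head and does not refer to a real entry, so dereferencing it is never safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; not the amdgpu structures. */
struct toy_vm {
    int id;
    int moved_count;
};

struct toy_vm_bo {
    struct toy_vm *vm;         /* back-pointer, tempting to use via the cursor */
    bool moved;
    struct toy_vm_bo *next;
};

/*
 * Marks every entry as moved, then updates the VM.
 * The VM is passed in explicitly, so nothing after the loop needs the
 * cursor. Returns the id of the VM that was updated.
 */
static int toy_reset_state(struct toy_vm *vm, struct toy_vm_bo *head)
{
    struct toy_vm_bo *cur;
    int n = 0;

    /* a lock protecting the list would be taken here */
    for (cur = head; cur != NULL; cur = cur->next) {
        cur->moved = true;
        n++;
    }
    /* lock released here; cur must not be dereferenced past this point */

    vm->moved_count += n;      /* uses vm, never cur->vm */
    return vm->id;
}
```

Because `toy_reset_state()` never reads `cur` after the loop, it also behaves correctly for an empty list, where a cursor-based access would have nothing valid to point at.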


1 participant