-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S0ix fixes for AMD laptops #102
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Generally, the C-state latency is provided by the _CST method or FADT, but some OEM platforms using AMD Picasso, Renoir, Van Gogh, and Cezanne set the C2 latency greater than C3's which causes the C2 state to be skipped. That will block the core entering PC6, which prevents S0ix working properly on Linux systems. In other operating systems, the latency values are not validated and this does not cause problems by skipping states. To avoid this issue on Linux, detect when latencies are not an arithmetic progression and sort them. Link: https://gitlab.freedesktop.org/agd5f/linux/-/commit/026d186e4592c1ee9c1cb44295912d0294508725 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1230#note_712174 Suggested-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The XHCI controller is required to enter D3hot rather than D3cold for AMD s2idle on this hardware generation. Otherwise, the 'Controller Not Ready' (CNR) bit is not being cleared by host in resume and eventually this results in xhci resume failures during the s2idle wakeup. Link: https://lore.kernel.org/linux-usb/1612527609-7053-1-git-send-email-Prike.Liang@amd.com/ Suggested-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Cc: stable <stable@vger.kernel.org> # 5.11+ Link: https://lore.kernel.org/r/20210527154534.8900-1-mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Renoir needs a similar delay. Signed-off-by: Marcin Bachry <hegel666@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The documentation around the StorageD3Enable property hints that it should be made on the PCI device. This is where newer AMD systems set the property and it's required for S0i3 support. So rather than look for nodes of the root port only present on Intel systems, switch to the companion ACPI device for all systems. David Box from Intel indicated this should work on Intel as well. Link: https://lore.kernel.org/linux-nvme/YK6gmAWqaRmvpJXb@google.com/T/#m900552229fa455867ee29c33b854845fce80ba70 Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro Fixes: df4f9bc ("nvme-pci: add support for ACPI StorageD3Enable property") Suggested-by: Liang Prike <Prike.Liang@amd.com> Acked-by: Raul E Rangel <rrangel@chromium.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Although first implemented for NVME, this check may be usable by other drivers as well. Microsoft's specification explicitly mentions that is may be usable by SATA and AHCI devices. Google also indicates that they have used this with SDHCI in a downstream kernel tree that a user can plug a storage device into. Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro Suggested-by: Keith Busch <kbusch@kernel.org> CC: Shyam-sundar S-k <Shyam-sundar.S-k@amd.com> CC: Alexander Deucher <Alexander.Deucher@amd.com> CC: Rafael J. Wysocki <rjw@rjwysocki.net> CC: Prike Liang <prike.liang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
AMD systems from Renoir and Lucienne require that the NVME controller is put into D3 over a Modern Standby / suspend-to-idle cycle. This is "typically" accomplished using the `StorageD3Enable` property in the _DSD, but this property was introduced after many of these systems launched and most OEM systems don't have it in their BIOS. On AMD Renoir without these drives going into D3 over suspend-to-idle the resume will fail with the NVME controller being reset and a trace like this in the kernel logs: ``` [ 83.556118] nvme nvme0: I/O 161 QID 2 timeout, aborting [ 83.556178] nvme nvme0: I/O 162 QID 2 timeout, aborting [ 83.556187] nvme nvme0: I/O 163 QID 2 timeout, aborting [ 83.556196] nvme nvme0: I/O 164 QID 2 timeout, aborting [ 95.332114] nvme nvme0: I/O 25 QID 0 timeout, reset controller [ 95.332843] nvme nvme0: Abort status: 0x371 [ 95.332852] nvme nvme0: Abort status: 0x371 [ 95.332856] nvme nvme0: Abort status: 0x371 [ 95.332859] nvme nvme0: Abort status: 0x371 [ 95.332909] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -16 [ 95.332936] nvme 0000:03:00.0: PM: failed to resume async: error -16 ``` The Microsoft documentation for StorageD3Enable mentioned that Windows has a hardcoded allowlist for D3 support, which was used for these platforms. Introduce quirks to hardcode them for Linux as well. As this property is now "standardized", OEM systems using AMD Cezanne and newer APU's have adopted this property, and quirks like this should not be necessary. CC: Shyam-sundar S-k <Shyam-sundar.S-k@amd.com> CC: Alexander Deucher <Alexander.Deucher@amd.com> CC: Prike Liang <prike.liang@amd.com> Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Julian Sikorski <belegdol@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
These are supposedly not required for AMD platforms, but at least some HP laptops seem to require it to properly turn off the keyboard backlight. Based on a patch from Marcin Bachry <hegel666@gmail.com>. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1230 Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI_LPS0_ENTRY_AMD/ACPI_LPS0_EXIT_AMD are supposedly not required for AMD platforms, and on some platforms they are not even listed in the function mask but at least some HP laptops seem to require it to properly support s0ix. Based on a patch from Marcin Bachry <hegel666@gmail.com>. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1230 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marcin Bachry <hegel666@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
The protocol to submit a job request to SMU is to wait for AMD_PMC_REGISTER_RESPONSE to return 1,meaning SMU is ready to take requests. PMC driver has to make sure that the response code is always AMD_PMC_RESULT_OK before making any command submissions. Also, when we submit a message to SMU, we have to wait until it processes the request. Adding a read_poll_timeout() check as this was missing in the existing code. Fixes: 156ec47 ("platform/x86: amd-pmc: Add AMD platform support for S2Idle") Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
It was lately understood that the current mechanism available in the driver to get SMU firmware info works only on internal SMU builds and there is a separate way to get all the SMU logging counters (addressed in the next patch). Hence remove all the smu info shown via debugfs as it is no more useful. Also, use dump registers routine only at one place i.e. after the command submission to SMU is done. Fixes: 156ec47 ("platform/x86: amd-pmc: Add AMD platform support for S2Idle") Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
SMU provides a way to dump the s0ix debug statistics in the form of a metrics table via a of set special mailbox commands. Add support to the driver which can send these commands to SMU and expose the information received via debugfs. The information contains the s0ix entry/exit, active time of each IP block etc. As a side note, SMU subsystem logging is not supported on Picasso based SoC's. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Even the FCH SSC registers provides certain level of information about the s0ix entry and exit times which comes handy when the SMU fails to report the statistics via the mailbox communication. This information is captured via a new debugfs file "s0ix_stats". A non-zero entry in this counters would mean that the system entered the s0ix state. If s0ix entry time and exit time don't change during suspend to idle, the silicon has not entered the deepest state. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Some newer BIOSes have added another ACPI ID for the uPEP device. SMU statistics behave identically on this device. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
The upcoming PMC controller would have a newer acpi id, add that to the supported acpid device list. Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
AMD spec mentions only revision 0. With this change, device constraint list is populated properly. Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
qzed
pushed a commit
that referenced
this pull request
Jul 20, 2021
[ Upstream commit 45c8ddd ] Before referencing 'host->data', the driver needs to check whether it is null pointer, otherwise it will cause a null pointer reference. This log reveals it: [ 29.355199] BUG: kernel NULL pointer dereference, address: 0000000000000014 [ 29.357323] #PF: supervisor write access in kernel mode [ 29.357706] #PF: error_code(0x0002) - not-present page [ 29.358088] PGD 0 P4D 0 [ 29.358280] Oops: 0002 [#1] PREEMPT SMP PTI [ 29.358595] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4- g70e7f0549188-dirty #102 [ 29.359164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 29.359978] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.360314] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.361661] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.362042] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.362564] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.363085] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.363604] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.364128] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.364651] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.365235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.365655] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.366170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.366683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.367197] Call Trace: [ 29.367381] <IRQ> [ 29.367537] __handle_irq_event_percpu+0x53/0x3e0 [ 29.367916] handle_irq_event_percpu+0x35/0x90 [ 29.368247] handle_irq_event+0x39/0x60 [ 29.368632] handle_fasteoi_irq+0xc2/0x1d0 [ 29.368950] __common_interrupt+0x7f/0x150 [ 29.369254] common_interrupt+0xb4/0xd0 [ 29.369547] </IRQ> [ 29.369708] asm_common_interrupt+0x1e/0x40 [ 29.370016] RIP: 0010:native_safe_halt+0x17/0x20 [ 29.370360] Code: 07 0f 00 2d db 80 43 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 c2 37 e5 01 55 48 89 e5 85 c0 7e 07 0f 00 2d bb 80 43 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d f9 91 [ 29.371696] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 29.372079] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 29.372595] RDX: 0000000000000000 RSI: ffffffff854f67a4 RDI: ffffffff85403406 [ 29.373122] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 29.373646] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86009188 [ 29.374160] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100258000 [ 29.374690] default_idle+0x9/0x10 [ 29.374944] arch_cpu_idle+0xa/0x10 [ 29.375198] default_idle_call+0x6e/0x250 [ 29.375491] do_idle+0x1f0/0x2d0 [ 29.375740] cpu_startup_entry+0x18/0x20 [ 29.376034] start_secondary+0x11f/0x160 [ 29.376328] secondary_startup_64_no_verify+0xb0/0xbb [ 29.376705] Modules linked in: [ 29.376939] Dumping ftrace buffer: [ 29.377187] (ftrace buffer empty) [ 29.377460] CR2: 0000000000000014 [ 29.377712] ---[ end trace 51a473dffb618c47 ]--- [ 29.378056] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.378380] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.379714] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.380098] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.380614] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.381134] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.381653] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.382176] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.382697] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.383277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.383697] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.384223] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.384736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.385260] Kernel panic - not syncing: Fatal exception in interrupt [ 29.385882] Dumping ftrace buffer: [ 29.386135] (ftrace buffer empty) [ 29.386401] Kernel Offset: disabled [ 29.386656] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Link: https://lore.kernel.org/r/1622727200-15808-1-git-send-email-zheyuma97@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
qzed
pushed a commit
that referenced
this pull request
Jul 20, 2021
[ Upstream commit 45c8ddd ] Before referencing 'host->data', the driver needs to check whether it is null pointer, otherwise it will cause a null pointer reference. This log reveals it: [ 29.355199] BUG: kernel NULL pointer dereference, address: 0000000000000014 [ 29.357323] #PF: supervisor write access in kernel mode [ 29.357706] #PF: error_code(0x0002) - not-present page [ 29.358088] PGD 0 P4D 0 [ 29.358280] Oops: 0002 [#1] PREEMPT SMP PTI [ 29.358595] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4- g70e7f0549188-dirty #102 [ 29.359164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 29.359978] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.360314] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.361661] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.362042] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.362564] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.363085] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.363604] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.364128] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.364651] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.365235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.365655] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.366170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.366683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.367197] Call Trace: [ 29.367381] <IRQ> [ 29.367537] __handle_irq_event_percpu+0x53/0x3e0 [ 29.367916] handle_irq_event_percpu+0x35/0x90 [ 29.368247] handle_irq_event+0x39/0x60 [ 29.368632] handle_fasteoi_irq+0xc2/0x1d0 [ 29.368950] __common_interrupt+0x7f/0x150 [ 29.369254] common_interrupt+0xb4/0xd0 [ 29.369547] </IRQ> [ 29.369708] asm_common_interrupt+0x1e/0x40 [ 29.370016] RIP: 0010:native_safe_halt+0x17/0x20 [ 29.370360] Code: 07 0f 00 2d db 80 43 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 c2 37 e5 01 55 48 89 e5 85 c0 7e 07 0f 00 2d bb 80 43 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d f9 91 [ 29.371696] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 29.372079] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 29.372595] RDX: 0000000000000000 RSI: ffffffff854f67a4 RDI: ffffffff85403406 [ 29.373122] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 29.373646] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86009188 [ 29.374160] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100258000 [ 29.374690] default_idle+0x9/0x10 [ 29.374944] arch_cpu_idle+0xa/0x10 [ 29.375198] default_idle_call+0x6e/0x250 [ 29.375491] do_idle+0x1f0/0x2d0 [ 29.375740] cpu_startup_entry+0x18/0x20 [ 29.376034] start_secondary+0x11f/0x160 [ 29.376328] secondary_startup_64_no_verify+0xb0/0xbb [ 29.376705] Modules linked in: [ 29.376939] Dumping ftrace buffer: [ 29.377187] (ftrace buffer empty) [ 29.377460] CR2: 0000000000000014 [ 29.377712] ---[ end trace 51a473dffb618c47 ]--- [ 29.378056] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.378380] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.379714] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.380098] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.380614] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.381134] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.381653] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.382176] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.382697] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.383277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.383697] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.384223] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.384736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.385260] Kernel panic - not syncing: Fatal exception in interrupt [ 29.385882] Dumping ftrace buffer: [ 29.386135] (ftrace buffer empty) [ 29.386401] Kernel Offset: disabled [ 29.386656] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Link: https://lore.kernel.org/r/1622727200-15808-1-git-send-email-zheyuma97@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
qzed
pushed a commit
that referenced
this pull request
Jul 20, 2021
[ Upstream commit 45c8ddd ] Before referencing 'host->data', the driver needs to check whether it is null pointer, otherwise it will cause a null pointer reference. This log reveals it: [ 29.355199] BUG: kernel NULL pointer dereference, address: 0000000000000014 [ 29.357323] #PF: supervisor write access in kernel mode [ 29.357706] #PF: error_code(0x0002) - not-present page [ 29.358088] PGD 0 P4D 0 [ 29.358280] Oops: 0002 [#1] PREEMPT SMP PTI [ 29.358595] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4- g70e7f0549188-dirty #102 [ 29.359164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 29.359978] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.360314] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.361661] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.362042] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.362564] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.363085] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.363604] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.364128] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.364651] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.365235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.365655] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.366170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.366683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.367197] Call Trace: [ 29.367381] <IRQ> [ 29.367537] __handle_irq_event_percpu+0x53/0x3e0 [ 29.367916] handle_irq_event_percpu+0x35/0x90 [ 29.368247] handle_irq_event+0x39/0x60 [ 29.368632] handle_fasteoi_irq+0xc2/0x1d0 [ 29.368950] __common_interrupt+0x7f/0x150 [ 29.369254] common_interrupt+0xb4/0xd0 [ 29.369547] </IRQ> [ 29.369708] asm_common_interrupt+0x1e/0x40 [ 29.370016] RIP: 0010:native_safe_halt+0x17/0x20 [ 29.370360] Code: 07 0f 00 2d db 80 43 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 c2 37 e5 01 55 48 89 e5 85 c0 7e 07 0f 00 2d bb 80 43 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d f9 91 [ 29.371696] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 29.372079] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 29.372595] RDX: 0000000000000000 RSI: ffffffff854f67a4 RDI: ffffffff85403406 [ 29.373122] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 29.373646] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86009188 [ 29.374160] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100258000 [ 29.374690] default_idle+0x9/0x10 [ 29.374944] arch_cpu_idle+0xa/0x10 [ 29.375198] default_idle_call+0x6e/0x250 [ 29.375491] do_idle+0x1f0/0x2d0 [ 29.375740] cpu_startup_entry+0x18/0x20 [ 29.376034] start_secondary+0x11f/0x160 [ 29.376328] secondary_startup_64_no_verify+0xb0/0xbb [ 29.376705] Modules linked in: [ 29.376939] Dumping ftrace buffer: [ 29.377187] (ftrace buffer empty) [ 29.377460] CR2: 0000000000000014 [ 29.377712] ---[ end trace 51a473dffb618c47 ]--- [ 29.378056] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.378380] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.379714] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.380098] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.380614] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.381134] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.381653] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.382176] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.382697] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.383277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.383697] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.384223] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.384736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.385260] Kernel panic - not syncing: Fatal exception in interrupt [ 29.385882] Dumping ftrace buffer: [ 29.386135] (ftrace buffer empty) [ 29.386401] Kernel Offset: disabled [ 29.386656] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Link: https://lore.kernel.org/r/1622727200-15808-1-git-send-email-zheyuma97@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
kitakar5525
pushed a commit
to kitakar5525/linux-kernel
that referenced
this pull request
Aug 6, 2021
[ Upstream commit 45c8ddd ] Before referencing 'host->data', the driver needs to check whether it is null pointer, otherwise it will cause a null pointer reference. This log reveals it: [ 29.355199] BUG: kernel NULL pointer dereference, address: 0000000000000014 [ 29.357323] #PF: supervisor write access in kernel mode [ 29.357706] #PF: error_code(0x0002) - not-present page [ 29.358088] PGD 0 P4D 0 [ 29.358280] Oops: 0002 [#1] PREEMPT SMP PTI [ 29.358595] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4- g70e7f0549188-dirty linux-surface#102 [ 29.359164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 29.359978] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.360314] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.361661] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.362042] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.362564] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.363085] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.363604] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.364128] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.364651] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.365235] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.365655] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.366170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.366683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.367197] Call Trace: [ 29.367381] <IRQ> [ 29.367537] __handle_irq_event_percpu+0x53/0x3e0 [ 29.367916] handle_irq_event_percpu+0x35/0x90 [ 29.368247] handle_irq_event+0x39/0x60 [ 29.368632] handle_fasteoi_irq+0xc2/0x1d0 [ 29.368950] __common_interrupt+0x7f/0x150 [ 29.369254] common_interrupt+0xb4/0xd0 [ 29.369547] </IRQ> [ 29.369708] asm_common_interrupt+0x1e/0x40 [ 29.370016] RIP: 0010:native_safe_halt+0x17/0x20 [ 29.370360] Code: 07 0f 00 2d db 80 43 00 f4 5d c3 0f 1f 84 00 00 00 00 00 8b 05 c2 37 e5 01 55 48 89 e5 85 c0 7e 07 0f 00 2d bb 80 43 00 fb f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d f9 91 [ 29.371696] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246 [ 29.372079] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 29.372595] RDX: 0000000000000000 RSI: ffffffff854f67a4 RDI: ffffffff85403406 [ 29.373122] RBP: ffffc9000008fe90 R08: 0000000000000001 R09: 0000000000000001 [ 29.373646] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86009188 [ 29.374160] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888100258000 [ 29.374690] default_idle+0x9/0x10 [ 29.374944] arch_cpu_idle+0xa/0x10 [ 29.375198] default_idle_call+0x6e/0x250 [ 29.375491] do_idle+0x1f0/0x2d0 [ 29.375740] cpu_startup_entry+0x18/0x20 [ 29.376034] start_secondary+0x11f/0x160 [ 29.376328] secondary_startup_64_no_verify+0xb0/0xbb [ 29.376705] Modules linked in: [ 29.376939] Dumping ftrace buffer: [ 29.377187] (ftrace buffer empty) [ 29.377460] CR2: 0000000000000014 [ 29.377712] ---[ end trace 51a473dffb618c47 ]--- [ 29.378056] RIP: 0010:via_sdc_isr+0x21f/0x410 [ 29.378380] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00 10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43 18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77 [ 29.379714] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046 [ 29.380098] RAX: 0000000000000000 RBX: ffff888107d77880 RCX: 0000000000000000 [ 29.380614] RDX: 0000000000000000 RSI: ffffffff835d20bb RDI: 00000000ffffffff [ 29.381134] RBP: ffffc90000118ed8 R08: 0000000000000001 R09: 0000000000000001 [ 29.381653] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000008600 [ 29.382176] R13: ffff888107d779c8 R14: ffffc90009c00200 R15: 0000000000008000 [ 29.382697] FS: 0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000 [ 29.383277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.383697] CR2: 0000000000000014 CR3: 0000000005a2e000 CR4: 00000000000006e0 [ 29.384223] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 29.384736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 29.385260] Kernel panic - not syncing: Fatal exception in interrupt [ 29.385882] Dumping ftrace buffer: [ 29.386135] (ftrace buffer empty) [ 29.386401] Kernel Offset: disabled [ 29.386656] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Link: https://lore.kernel.org/r/1622727200-15808-1-git-send-email-zheyuma97@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
qzed
pushed a commit
that referenced
this pull request
Jun 27, 2022
Commit a7259df ("memblock: make memblock_find_in_range method private") changed the API using which memory is reserved for the pKVM hypervisor. However, memblock_phys_alloc() differs from the original API in terms of kmemleak semantics -- the old one didn't report the reserved regions to kmemleak while the new one does. Unfortunately, when protected KVM is enabled, all kernel accesses to pKVM-private memory result in a fatal exception, which can now happen because of kmemleak scans: $ echo scan > /sys/kernel/debug/kmemleak [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290! [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000 [ 34.991813] Kernel panic - not syncing: HYP panic: [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 34.991813] VCPU:0000000000000000 [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102 [ 34.994059] Hardware name: linux,dummy-virt (DT) [ 34.994452] Call trace: [ 34.994641] dump_backtrace.part.0+0xcc/0xe0 [ 34.994932] show_stack+0x18/0x6c [ 34.995094] dump_stack_lvl+0x68/0x84 [ 34.995276] dump_stack+0x18/0x34 [ 34.995484] panic+0x16c/0x354 [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60 [ 34.995933] scan_block+0x74/0x12c [ 34.996129] scan_gray_list+0xd8/0x19c [ 34.996332] kmemleak_scan+0x2c8/0x580 [ 34.996535] kmemleak_write+0x340/0x4a0 [ 34.996744] full_proxy_write+0x60/0xbc [ 34.996967] vfs_write+0xc4/0x2b0 [ 34.997136] ksys_write+0x68/0xf4 [ 34.997311] __arm64_sys_write+0x20/0x2c [ 34.997532] invoke_syscall+0x48/0x114 [ 34.997779] el0_svc_common.constprop.0+0x44/0xec [ 34.998029] do_el0_svc+0x2c/0xc0 [ 34.998205] el0_svc+0x2c/0x84 [ 34.998421] el0t_64_sync_handler+0xf4/0x100 [ 34.998653] el0t_64_sync+0x18c/0x190 [ 34.999252] SMP: stopping secondary CPUs [ 35.000034] Kernel Offset: disabled [ 35.000261] CPU features: 0x800,00007831,00001086 [ 35.000642] Memory Limit: none [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic: [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 35.001329] VCPU:0000000000000000 ]--- Fix this by explicitly excluding the hypervisor's memory pool from kmemleak like we already do for the hyp BSS. Cc: Mike Rapoport <rppt@kernel.org> Fixes: a7259df ("memblock: make memblock_find_in_range method private") Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220616161135.3997786-1-qperret@google.com
qzed
pushed a commit
that referenced
this pull request
Jul 3, 2022
[ Upstream commit 56961c6 ] Commit a7259df ("memblock: make memblock_find_in_range method private") changed the API using which memory is reserved for the pKVM hypervisor. However, memblock_phys_alloc() differs from the original API in terms of kmemleak semantics -- the old one didn't report the reserved regions to kmemleak while the new one does. Unfortunately, when protected KVM is enabled, all kernel accesses to pKVM-private memory result in a fatal exception, which can now happen because of kmemleak scans: $ echo scan > /sys/kernel/debug/kmemleak [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290! [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000 [ 34.991813] Kernel panic - not syncing: HYP panic: [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 34.991813] VCPU:0000000000000000 [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102 [ 34.994059] Hardware name: linux,dummy-virt (DT) [ 34.994452] Call trace: [ 34.994641] dump_backtrace.part.0+0xcc/0xe0 [ 34.994932] show_stack+0x18/0x6c [ 34.995094] dump_stack_lvl+0x68/0x84 [ 34.995276] dump_stack+0x18/0x34 [ 34.995484] panic+0x16c/0x354 [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60 [ 34.995933] scan_block+0x74/0x12c [ 34.996129] scan_gray_list+0xd8/0x19c [ 34.996332] kmemleak_scan+0x2c8/0x580 [ 34.996535] kmemleak_write+0x340/0x4a0 [ 34.996744] full_proxy_write+0x60/0xbc [ 34.996967] vfs_write+0xc4/0x2b0 [ 34.997136] ksys_write+0x68/0xf4 [ 34.997311] __arm64_sys_write+0x20/0x2c [ 34.997532] invoke_syscall+0x48/0x114 [ 34.997779] el0_svc_common.constprop.0+0x44/0xec [ 34.998029] do_el0_svc+0x2c/0xc0 [ 34.998205] el0_svc+0x2c/0x84 [ 34.998421] el0t_64_sync_handler+0xf4/0x100 [ 34.998653] el0t_64_sync+0x18c/0x190 [ 34.999252] SMP: stopping secondary CPUs [ 35.000034] Kernel Offset: disabled [ 35.000261] CPU features: 0x800,00007831,00001086 [ 35.000642] Memory Limit: none [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic: [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 35.001329] VCPU:0000000000000000 ]--- Fix this by explicitly excluding the hypervisor's memory pool from kmemleak like we already do for the hyp BSS. Cc: Mike Rapoport <rppt@kernel.org> Fixes: a7259df ("memblock: make memblock_find_in_range method private") Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220616161135.3997786-1-qperret@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Various fixes for s0ix on AMD laptops. In hope for improvement of s0ix/suspend on the SL4A.