mwifiex: add fw reset quirks for Surface devices #58

kitakar5525 · 2020-09-24T09:50:06Z

This PR adds

DMI tables of Surface devices (first and second patch)
fw reset quirk for Surface gen4+ devices (third patch)
fw reset quirk for Surface 3 (fourth patch)

While here, created new files for quirks (mwifiex/pcie_quirks.c and mwifiex/pcie_quirks.h) because the changes are a little bit too big to add into pcie.c.

For Surface gen4+ devices, we can reset fw by putting wifi into D3cold then bring back into D0 (power-cycle).

For Surface 3, we can reset fw by calling _DSM method available in \_SB.WSID device (https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_3/dsdt.dsl#L11947-L12011).

Note that for Surface Pro 3, there is the same WSID device exists (https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_pro_3/dsdt.dsl#L12164-L12216). However, I'm not sure if it will work in the same way as Surface 3. So, I didn't add support for SP3 in this PR. Might look into it later.

kitakar5525 · 2020-09-24T09:55:48Z

You can trigger the card reset manually via sysfs entry:

$ echo 1 | sudo tee /sys/kernel/debug/mwifiex/mlan0/reset

kitakar5525

mwifiex: pcie: add reset_wsid quirk for Surface 3

Forgot to mention that why this quirk is required.

To reset mwifiex on Surface 3, it seems that calling the _DSM method
exists in \_SB.WSID [1] device is required.

[1] https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_3/dsdt.dsl#L11947-L12011

EDIT: Added.

kitakar5525

mwifiex: pcie: add reset_wsid quirk for Surface 3

Hmm, some indentations look weird. I'll fix them later.

EDIT: Fixed. (Changed tab size setting to 8 in my editor)

kitakar5525 · 2020-09-24T14:10:23Z

Addressed the following checkpatch warnings:
mwifiex: pcie: add Surface devices DMI table for quirks

WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#33: FILE: drivers/net/wireless/marvell/mwifiex/pcie_quirks.h:1:
+/*

mwifiex: pcie: add reset_d3cold quirk for Surface gen4+ devices

WARNING: Block comments use a trailing */ on a separate line
#53: FILE: drivers/net/wireless/marvell/mwifiex/pcie.c:398:
+	 * before performing FLR */

WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#66: FILE: drivers/net/wireless/marvell/mwifiex/pcie_quirks.c:1:
+/*

WARNING: Block comments use a trailing */ on a separate line
#109: FILE: drivers/net/wireless/marvell/mwifiex/pcie_quirks.c:44:
+	 * "Cannot transition to power state D0 for parent in D3hot". */

mwifiex: pcie: add reset_wsid quirk for Surface 3

WARNING: Block comments use a trailing */ on a separate line
#39: FILE: drivers/net/wireless/marvell/mwifiex/pcie.c:2830:
+	 * reset-related works. */

Ignoring WARNING: networking block comments don't use an empty /* line, use /* Comment... at top of files in accordance with the other upstream files:

WARNING: networking block comments don't use an empty /* line, use /* Comment...
#35: FILE: drivers/net/wireless/marvell/mwifiex/pcie_quirks.h:3:
+/*
+ * Header file for PCIe quirks.

drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

drivers/net/wireless/marvell/mwifiex/pcie.c

drivers/net/wireless/marvell/mwifiex/pcie_quirks.h

kitakar5525 · 2020-09-25T14:29:09Z

Thanks for the review! I've pushed four new patches.

The first two patches addressed the review.
The rest is for simplification/upstreaming.

If they look OK, I'll squash the patches.

qzed

Looks mostly good, a couple of nitpicks.

drivers/net/wireless/marvell/mwifiex/pcie.c

drivers/net/wireless/marvell/mwifiex/pcie_quirks.h

drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

kitakar5525 · 2020-09-29T14:37:59Z

Squashed patches with addressing the review feedbacks:

moved mwifiex_initialize_quirks() before mwifiex_add_card()
switched to using GUID_INIT()
changed acpi_evaluate_dsm() failure return value to -EFAULT
added ACPI_FREE(obj); after acpi_evaluate_dsm() success

Other changes:

do not use mwifiex_dbg() to avoid troubles in pcie_quirks.c
(e.g. to avoid using mwifiex_adapter struct before init or wifi is powered down, or causes NULL ptr deref).
Added comment regarding this to the top of pcie_quirks.c:

kernel/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

Lines 21 to 25 in 5dcea7f

    
           /* The low-level PCI operations will be performed in this file. Therefore, 
        
            * let's use dev_*() instead of mwifiex_dbg() here to avoid troubles (e.g. 
        
            * to avoid using mwifiex_adapter struct before init or wifi is powered 
        
            * down, or causes NULL ptr deref). 
        
            */

kitakar5525 · 2020-09-29T15:42:38Z

I've rebased again.

Changes:

Fixed changes squashed into wrong patch (the print dev_info(&pdev->dev, "no quirks applied\n"); went into second patch by mistake, now in first patch)

Moved comment for (do not use mwifiex_dbg() to avoid troubles in pcie_quirks.c) higher (now above #include):

kernel/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

Lines 1 to 15 in a90d243

    
           // SPDX-License-Identifier: GPL-2.0 
        
           /* 
        
            * File for PCIe quirks. 
        
            */ 
        
           /* The low-level PCI operations will be performed in this file. Therefore, 
        
            * let's use dev_*() instead of mwifiex_dbg() here to avoid troubles (e.g. 
        
            * to avoid using mwifiex_adapter struct before init or wifi is powered 
        
            * down, or causes NULL ptr deref). 
        
            */ 
        
           #include <linux/acpi.h> 
        
           #include <linux/dmi.h> 
        
           #include "pcie_quirks.h"

Use result of mwifiex_pcie_set_power_d0()

kernel/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

Lines 186 to 191 in a90d243

    
           ret = mwifiex_pcie_set_power_d0(parent_pdev); 
        
           if (ret) 
        
           	return ret; 
        
           ret = mwifiex_pcie_set_power_d0(pdev); 
        
           if (ret) 
        
           	return ret;

kitakar5525 · 2020-09-30T09:18:42Z

Rebased with addressing review feedbacks:

dropped QUIRK_NONE and used .driver_data = 0 for the skeleton
dropped path variable in mwifiex_pcie_reset_wsid_quirk()
changed return value of acpi_evaluate_dsm() failure to -EIO

Other changes:

added _DSM check for also wifi power on function
There was only check for wifi power off function before, so also add for on accordingly

kernel/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c

Lines 216 to 228 in 5591da4

    
           if (!acpi_check_dsm(handle, &wsid_dsm_guid, 
        
           		    WSID_REV, WSID_FUNC_WIFI_PWR_OFF)) { 
        
           	dev_err(&pdev->dev, 
        
           		"_DSM method doesn't support wifi power off func\n"); 
        
           	return -ENODEV; 
        
           } 
        
           if (!acpi_check_dsm(handle, &wsid_dsm_guid, 
        
           		    WSID_REV, WSID_FUNC_WIFI_PWR_ON)) { 
        
           	dev_err(&pdev->dev, 
        
           		"_DSM method doesn't support wifi power on func\n"); 
        
           	return -ENODEV; 
        
           }

change wording no quirks applied -> no quirks enabled in mwifiex_initialize_quirks() in accordance with the other print there
just use "NOT operator" instead of == in mwifiex_initialize_quirks()
card->quirks == 0 -> !card->quirks
moved patch mwifiex: pcie: (OEMB) add quirk for Surface 3 with broken DMI table to last so that the first three patches can be sent upstream easily

qzed · 2020-09-30T11:34:37Z

Okay, looks good to me. Feel free to merge when you're ready.

kitakar5525 · 2020-09-30T12:38:58Z

Thanks!

There are still issues remaining:

For the reset_wsid quirk (for S3 and SP3), I think that creating a new device driver for the WSID device is better, then export the reset function, and call it from mwifiex. As the comments in code states, on S3, calling _DSM methods immediately remove/re-probe mwifiex. So, resetting the wifi externally is a better idea. (Also in this way, hopefully supporting SP3 may become easier.)
After reset, ASPM settings don't get restored completely.
sudo lspci -nnvvv diff before/after fw reset on SB1:

#
# 03:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88W8897 [AVASTAR] 802.11ac Wireless [11ab:2b38]
#
@@ -574,9 +574,9 @@
        Capabilities: [168 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=70us PortTPowerOnTime=10us
-               L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
-                          T_CommonMode=0us LTR1.2_Threshold=163840ns
-               L1SubCtl2: T_PwrOn=44us
+               L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
+                          T_CommonMode=0us LTR1.2_Threshold=0ns
+               L1SubCtl2: T_PwrOn=10us
        Kernel driver in use: mwifiex_pcie
        Kernel modules: mwifiex_pcie

#
# no changes on root port of wifi regarding ASPM
#

All of the L1 substates are disabled after fw reset. LTR value is also changed.

sudo lspci -nnvvv diff before/after fw reset on S3:

#
# root port of wifi:
# 00:1c.0 PCI bridge [0604]: Intel Corporation Atom/Celeron/Pentium Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port #1 [8086:22c8] (rev 20) (prog-if 00 [Normal decode])
#
@@ -103,7 +103,7 @@
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
-               LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+
+               LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok)
                        TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-

#
# no changes on wifi regarding ASPM
#

It's worse in this case. The ASPM of wifi root port is disabled after fw reset (thus breaks S0ix).

Anyway, we can improve these later. This PR just works regarding fw reset. So, I'll go this way for now.
I'll merge this later when the other open PR becomes ready.

qzed · 2020-09-30T12:59:49Z

Keep in mind that with an external driver you might need to ensure correct suspend/resume ordering though, which might complicate things. I know too little about the whole situation to say what's better here though-

kitakar5525 · 2020-09-30T13:04:31Z

I think what we need to do is just exposing reset function via EXPORT_SYMBOL_GPL(). So, I hope there are no issues with suspend/resume ordering, but I may be wrong though.

qzed · 2020-09-30T13:10:59Z

You'd have to make sure that reset can be called both before and after mwifiex has resumed and also executed concurrently. At least if you plan on calling it during resume. Otherwise you'd have to enforce an ordering. You could probably also call it during the complete phase, assuming that mwifiex doesn't have a handler for that phase (then it's guaranteed to run after the basic resume steps have been executed). You probably still want to make sure that it can executed concurrently though.

kitakar5525 · 2020-09-30T13:18:27Z

Hmm, anyway I'll try something first.

This commit adds quirk implementation based on DMI matching with DMI table for Surface devices. This implementation can be used for quirks later. Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

To reset mwifiex on Surface gen4+ (Pro 4 or later gen) devices, it seems that putting the wifi device into D3cold is required according to errata.inf file on Windows installation (Windows/INF/errata.inf). This patch adds a function that performs power-cycle (put into D3cold then D0) and call the function at the end of reset_prepare(). Note: Need to also reset the parent device (bridge) of wifi on SB1; it might be because the bridge of wifi always reports it's in D3hot. When I tried to reset only the wifi device (not touching parent), it gave the following error and the reset failed: acpi device:4b: Cannot transition to power state D0 for parent in D3hot mwifiex_pcie 0000:03:00.0: can't change power state from D3cold to D0 (config space inaccessible) Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

This commit adds reset_wsid quirk and uses this quirk for Surface 3 on card reset. To reset mwifiex on Surface 3, it seems that calling the _DSM method exists in \_SB.WSID [1] device is required. On Surface 3, calling the _DSM method removes/re-probes the card by itself. So, need to place the reset function before performing FLR and skip performing any other reset-related works. Note that Surface Pro 3 also has the WSID device [2], but it seems to need more work. This commit only supports Surface 3 yet. [1] https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_3/dsdt.dsl#L11947-L12011 [2] https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_pro_3/dsdt.dsl#L12164-L12216 Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

(made referring to http://git.osdn.net/view?p=android-x86/kernel.git;a=commitdiff;h=18e2e857c57633b25b3b4120f212224a108cd883) On some Surface 3, the DMI table gets corrupted for unknown reasons and breaks existing DMI matching used for device-specific quirks. This commit adds the (broken) DMI info for the affected Surface 3. On affected systems, DMI info will look like this: $ grep . /sys/devices/virtual/dmi/id/{bios_vendor,board_name,board_vendor,\ chassis_vendor,product_name,sys_vendor} /sys/devices/virtual/dmi/id/bios_vendor:American Megatrends Inc. /sys/devices/virtual/dmi/id/board_name:OEMB /sys/devices/virtual/dmi/id/board_vendor:OEMB /sys/devices/virtual/dmi/id/chassis_vendor:OEMB /sys/devices/virtual/dmi/id/product_name:OEMB /sys/devices/virtual/dmi/id/sys_vendor:OEMB Expected: $ grep . /sys/devices/virtual/dmi/id/{bios_vendor,board_name,board_vendor,\ chassis_vendor,product_name,sys_vendor} /sys/devices/virtual/dmi/id/bios_vendor:American Megatrends Inc. /sys/devices/virtual/dmi/id/board_name:Surface 3 /sys/devices/virtual/dmi/id/board_vendor:Microsoft Corporation /sys/devices/virtual/dmi/id/chassis_vendor:Microsoft Corporation /sys/devices/virtual/dmi/id/product_name:Surface 3 /sys/devices/virtual/dmi/id/sys_vendor:Microsoft Corporation Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

kitakar5525 · 2020-09-30T14:55:25Z

Rebased on top of current v5.8-surface-devel due to conflict with #59 (with correcting grammar in commit messages)

fast_second_level_miss handler for the TLBTEMP area has an assumption that page table directory entry for the TLBTEMP address range is 0. For it to be true the TLBTEMP area must be aligned to 4MB boundary and not share its 4MB region with anything that may use a page table. This is not true currently: TLBTEMP shares space with vmalloc space which results in the following kinds of runtime errors when fast_second_level_miss loads page table directory entry for the vmalloc space instead of fixing up the TLBTEMP area: Unable to handle kernel paging request at virtual address c7ff0e00 pc = d0009275, ra = 90009478 Oops: sig: 9 [#1] PREEMPT CPU: 1 PID: 61 Comm: kworker/u9:2 Not tainted 5.10.0-rc3-next-20201110-00007-g1fe4962fa983-dirty linux-surface#58 Workqueue: xprtiod xs_stream_data_receive_workfn a00: 90009478 d11e1dc0 c7ff0e00 00000020 c7ff0000 00000001 7f8b8107 00000000 a08: 900c5992 d11e1d90 d0cc88b8 5506e97c 00000000 5506e97c d06c8074 d11e1d90 pc: d0009275, ps: 00060310, depc: 00000014, excvaddr: c7ff0e00 lbeg: d0009275, lend: d0009287 lcount: 00000003, sar: 00000010 Call Trace: xs_stream_data_receive_workfn+0x43c/0x770 process_one_work+0x1a1/0x324 worker_thread+0x1cc/0x3c0 kthread+0x10d/0x124 ret_from_kernel_thread+0xc/0x18 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>

commit 481535c upstream. fast_second_level_miss handler for the TLBTEMP area has an assumption that page table directory entry for the TLBTEMP address range is 0. For it to be true the TLBTEMP area must be aligned to 4MB boundary and not share its 4MB region with anything that may use a page table. This is not true currently: TLBTEMP shares space with vmalloc space which results in the following kinds of runtime errors when fast_second_level_miss loads page table directory entry for the vmalloc space instead of fixing up the TLBTEMP area: Unable to handle kernel paging request at virtual address c7ff0e00 pc = d0009275, ra = 90009478 Oops: sig: 9 [#1] PREEMPT CPU: 1 PID: 61 Comm: kworker/u9:2 Not tainted 5.10.0-rc3-next-20201110-00007-g1fe4962fa983-dirty #58 Workqueue: xprtiod xs_stream_data_receive_workfn a00: 90009478 d11e1dc0 c7ff0e00 00000020 c7ff0000 00000001 7f8b8107 00000000 a08: 900c5992 d11e1d90 d0cc88b8 5506e97c 00000000 5506e97c d06c8074 d11e1d90 pc: d0009275, ps: 00060310, depc: 00000014, excvaddr: c7ff0e00 lbeg: d0009275, lend: d0009287 lcount: 00000003, sar: 00000010 Call Trace: xs_stream_data_receive_workfn+0x43c/0x770 process_one_work+0x1a1/0x324 worker_thread+0x1cc/0x3c0 kthread+0x10d/0x124 ret_from_kernel_thread+0xc/0x18 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

[ Upstream commit 8d3ea3d ] GCC12 appears to be much smarter about its dependency tracking and is aware that the relaxed variants are just normal loads and stores and this is causing problems like: [ 210.074549] ------------[ cut here ]------------ [ 210.079223] NETDEV WATCHDOG: enabcm6e4ei0 (bcmgenet): transmit queue 1 timed out [ 210.086717] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:529 dev_watchdog+0x234/0x240 [ 210.095044] Modules linked in: genet(E) nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat] [ 210.146561] ACPI CPPC: PCC check channel failed for ss: 0. ret=-110 [ 210.146927] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 5.17.0-rc7G12+ #58 [ 210.153226] CPPC Cpufreq:cppc_scale_freq_workfn: failed to read perf counters [ 210.161349] Hardware name: Raspberry Pi Foundation Raspberry Pi 4 Model B/Raspberry Pi 4 Model B, BIOS EDK2-DEV 02/08/2022 [ 210.161353] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 210.161358] pc : dev_watchdog+0x234/0x240 [ 210.161364] lr : dev_watchdog+0x234/0x240 [ 210.161368] sp : ffff8000080a3a40 [ 210.161370] x29: ffff8000080a3a40 x28: ffffcd425af87000 x27: ffff8000080a3b20 [ 210.205150] x26: ffffcd425aa00000 x25: 0000000000000001 x24: ffffcd425af8ec08 [ 210.212321] x23: 0000000000000100 x22: ffffcd425af87000 x21: ffff55b142688000 [ 210.219491] x20: 0000000000000001 x19: ffff55b1426884c8 x18: ffffffffffffffff [ 210.226661] x17: 64656d6974203120 x16: 0000000000000001 x15: 6d736e617274203a [ 210.233831] x14: 2974656e65676d63 x13: ffffcd4259c300d8 x12: ffffcd425b07d5f0 [ 210.241001] x11: 00000000ffffffff x10: ffffcd425b07d5f0 x9 : ffffcd4258bdad9c [ 210.248171] x8 : 00000000ffffdfff x7 : 000000000000003f x6 : 0000000000000000 [ 210.255341] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000001000 [ 210.262511] x2 : 0000000000001000 x1 : 0000000000000005 x0 : 0000000000000044 [ 210.269682] Call trace: [ 210.272133] dev_watchdog+0x234/0x240 [ 210.275811] call_timer_fn+0x3c/0x15c [ 210.279489] __run_timers.part.0+0x288/0x310 [ 210.283777] run_timer_softirq+0x48/0x80 [ 210.287716] __do_softirq+0x128/0x360 [ 210.291392] __irq_exit_rcu+0x138/0x140 [ 210.295243] irq_exit_rcu+0x1c/0x30 [ 210.298745] el1_interrupt+0x38/0x54 [ 210.302334] el1h_64_irq_handler+0x18/0x24 [ 210.306445] el1h_64_irq+0x7c/0x80 [ 210.309857] arch_cpu_idle+0x18/0x2c [ 210.313445] default_idle_call+0x4c/0x140 [ 210.317470] cpuidle_idle_call+0x14c/0x1a0 [ 210.321584] do_idle+0xb0/0x100 [ 210.324737] cpu_startup_entry+0x30/0x8c [ 210.328675] secondary_start_kernel+0xe4/0x110 [ 210.333138] __secondary_switched+0x94/0x98 The assumption when these were relaxed seems to be that device memory would be mapped non reordering, and that other constructs (spinlocks/etc) would provide the barriers to assure that packet data and in memory rings/queues were ordered with respect to device register reads/writes. This itself seems a bit sketchy, but the real problem with GCC12 is that it is moving the actual reads/writes around at will as though they were independent operations when in truth they are not, but the compiler can't know that. When looking at the assembly dumps for many of these routines its possible to see very clean, but not strictly in program order operations occurring as the compiler would be free to do if these weren't actually register reads/write operations. Its possible to suppress the timeout with a liberal bit of dma_mb()'s sprinkled around but the device still seems unable to reliably send/receive data. A better plan is to use the safer readl/writel everywhere. Since this partially reverts an older commit, which notes the use of the relaxed variants for performance reasons. I would suggest that any performance problems with this commit are targeted at relaxing only the performance critical code paths after assuring proper barriers. Fixes: 69d2ea9 ("net: bcmgenet: Use correct I/O accessors") Reported-by: Peter Robinson <pbrobinson@gmail.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Peter Robinson <pbrobinson@gmail.com> Tested-by: Peter Robinson <pbrobinson@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220310045358.224350-1-jeremy.linton@arm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

When trying to use copy_from_kernel_nofault() to read vsyscall page through a bpf program, the following oops was reported: BUG: unable to handle page fault for address: ffffffffff600000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0 Oops: 0000 [linux-surface#1] PREEMPT SMP PTI CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ linux-surface#58 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... RIP: 0010:copy_from_kernel_nofault+0x6f/0x110 ...... Call Trace: <TASK> ? copy_from_kernel_nofault+0x6f/0x110 bpf_probe_read_kernel+0x1d/0x50 bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d trace_call_bpf+0xc5/0x1c0 perf_call_bpf_enter.isra.0+0x69/0xb0 perf_syscall_enter+0x13e/0x200 syscall_trace_enter+0x188/0x1c0 do_syscall_64+0xb5/0xe0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 </TASK> ...... ---[ end trace 0000000000000000 ]--- The oops is triggered when: 1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall page and invokes copy_from_kernel_nofault() which in turn calls __get_user_asm(). 2) Because the vsyscall page address is not readable from kernel space, a page fault exception is triggered accordingly. 3) handle_page_fault() considers the vsyscall page address as a user space address instead of a kernel space address. This results in the fix-up setup by bpf not being applied and a page_fault_oops() is invoked due to SMAP. Considering handle_page_fault() has already considered the vsyscall page address as a userspace address, fix the problem by disallowing vsyscall page read for copy_from_kernel_nofault(). Originally-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com Reported-by: xingwei lee <xrivendell7@gmail.com> Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240202103935.3154011-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>

katamayko · 2024-04-25T19:07:15Z

friends
i installed debian 12 for the first time this week and i'm having wifi problems on my surface pro 4
I don't understand how to overcome this, can you give me a step-by-step guide if there is a solution

kitakar5525 commented Sep 24, 2020

View reviewed changes

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch from 4d81c3d to 17d699f Compare September 24, 2020 10:24

kitakar5525 commented Sep 24, 2020

View reviewed changes

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch 3 times, most recently from f0edf12 to 43662e8 Compare September 24, 2020 14:07

qzed requested changes Sep 24, 2020

View reviewed changes

drivers/net/wireless/marvell/mwifiex/pcie_quirks.c Outdated Show resolved Hide resolved

drivers/net/wireless/marvell/mwifiex/pcie.c Outdated Show resolved Hide resolved

drivers/net/wireless/marvell/mwifiex/pcie_quirks.h Outdated Show resolved Hide resolved

kitakar5525 changed the title ~~mwifiex: add fw reset quirks for Surface devices~~ WIP: mwifiex: add fw reset quirks for Surface devices Sep 25, 2020

qzed requested changes Sep 25, 2020

View reviewed changes

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch from 9fad2a8 to 5dcea7f Compare September 29, 2020 14:22

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch from 5dcea7f to a90d243 Compare September 29, 2020 15:39

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch from a90d243 to 5591da4 Compare September 30, 2020 09:16

qzed approved these changes Sep 30, 2020

View reviewed changes

kitakar5525 added 4 commits September 30, 2020 23:52

mwifiex: pcie: add DMI-based quirk impl for Surface devices

71b9751

This commit adds quirk implementation based on DMI matching with DMI table for Surface devices. This implementation can be used for quirks later. Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

kitakar5525 force-pushed the mwifiex/fw-reset-quirk branch from 5591da4 to 72b00a2 Compare September 30, 2020 14:53

kitakar5525 changed the title ~~WIP: mwifiex: add fw reset quirks for Surface devices~~ mwifiex: add fw reset quirks for Surface devices Sep 30, 2020

kitakar5525 merged commit 86944d6 into v5.8-surface-devel Sep 30, 2020

kitakar5525 deleted the mwifiex/fw-reset-quirk branch September 30, 2020 15:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mwifiex: add fw reset quirks for Surface devices #58

mwifiex: add fw reset quirks for Surface devices #58

kitakar5525 commented Sep 24, 2020 •

edited

kitakar5525 commented Sep 24, 2020

kitakar5525 left a comment •

edited

kitakar5525 left a comment •

edited

kitakar5525 commented Sep 24, 2020

kitakar5525 commented Sep 25, 2020

qzed left a comment

kitakar5525 commented Sep 29, 2020

kitakar5525 commented Sep 29, 2020

kitakar5525 commented Sep 30, 2020 •

edited

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020 •

edited

katamayko commented Apr 25, 2024

mwifiex: add fw reset quirks for Surface devices #58

mwifiex: add fw reset quirks for Surface devices #58

Conversation

kitakar5525 commented Sep 24, 2020 • edited

kitakar5525 commented Sep 24, 2020

kitakar5525 left a comment • edited

Choose a reason for hiding this comment

kitakar5525 left a comment • edited

Choose a reason for hiding this comment

kitakar5525 commented Sep 24, 2020

kitakar5525 commented Sep 25, 2020

qzed left a comment

Choose a reason for hiding this comment

kitakar5525 commented Sep 29, 2020

kitakar5525 commented Sep 29, 2020

kitakar5525 commented Sep 30, 2020 • edited

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

qzed commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020

kitakar5525 commented Sep 30, 2020 • edited

katamayko commented Apr 25, 2024

kitakar5525 commented Sep 24, 2020 •

edited

kitakar5525 left a comment •

edited

kitakar5525 left a comment •

edited

kitakar5525 commented Sep 30, 2020 •

edited

kitakar5525 commented Sep 30, 2020 •

edited