RFC: SMC hwmon driver by chadmed · Pull Request #299 · AsahiLinux/linux

chadmed · 2024-06-05T05:35:24Z

As it says on the tin, just an RFC for an improved hwmon driver which pulls data from the SMC.

The only real major issue is that there seems to be no reliable or accurate source of information
for which keys are present on which machine. The data in this MR comes mostly from poking
around with m1n1 on various devices. Some additional sensors come from exelban/stats,
however granularity at the SoC level seems insufficient as many devices are missing sensors which
are supposed to be there (e.g. random CPU core temperatures).

My preferred approach for this would be to simply ditch the attempt to keep SMC sensors in .dtsi files
and just experimentally determine which are present on each device (or class of device) and populate
&smc at that level.

jannau · 2024-06-10T19:39:05Z

+	if (!hwmon->fan)
+		return -ENOMEM;
+
+	hwmon_node = of_find_node_by_name(NULL, "macsmc-hwmon");


use of_get_child_by_name(pdev->dev.parent->of_node, "macsmc-hwmon"); the "macsmc-" isn't really needed in the name.
I would put the device_node into hwmon struct

Is there a reason to keep a reference to it around? We don't use it after the sensors have been populated.

jannau · 2024-06-10T20:02:03Z

+	for (idx = 0; idx < num_keys; idx += 1)
+		configs[idx] = flags;
+
+	configs[idx + 1] = 0;


this function can be replaced by memset32, it's already 0-termination though the kzalloc

Getting rid of the 0 assignment makes sense, but I don't see how we do away with the loop since we need the config flag mask set for every sensor/channel. We can't just memset32 one location.

chadmed · 2024-06-16T02:25:03Z

I am very unhappy with how we are handling the "shared" keys here. There doesn't seem to be much at all in common between logic boards, even keys you'd expect wouldn't change. I don't think having a series of cascaded includes is worth it given how wildly the boards seem to differ, and we should probably just specify the keys directly at the highest level we can empirically demonstrate that they are shared, e.g. t600x-j314-j416.dtsi

[ Upstream commit 4f580e9 ] The necessity of asserting the reset on removal was previously questioned, as DSA's own cleanup methods should suffice to prevent traffic leakage[1]. When a driver has subdrivers controlled by devres, they will be unregistered after the main driver's .remove is executed. If it asserts a reset, the subdrivers will be unable to communicate with the hardware during their cleanup. For LEDs, this means that they will fail to turn off, resulting in a timeout error. [1] https://lore.kernel.org/r/20240123215606.26716-9-luizluca@gmail.com/ Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 7c52c70 ] Like various other Asus Vivobook and Expertbook laptops, the Asus Vivobook Pro N6506MV has a DSDT table that describes IRQ 1 as ActiveLow while the kernel is overriding it to Edge_High. This prevents the internal keyboard from working. This patch prevents this issue by adding this laptop to the override table that prevents the kernel from overriding this IRQ Link: https://bugzilla.kernel.org/show_bug.cgi?id=218745 Tested-by: Gianni <gianni.casagrande.mail@gmail.com> Signed-off-by: Tamim Khan <tamim@fusetak.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit c2e6a87 ] KCSAN detected a race condition in netpoll: BUG: KCSAN: data-race in net_rx_action / netpoll_send_skb write (marked) to 0xffff8881164168b0 of 4 bytes by interrupt on cpu 10: net_rx_action (./include/linux/netpoll.h:90 net/core/dev.c:6712 net/core/dev.c:6822) <snip> read to 0xffff8881164168b0 of 4 bytes by task 1 on cpu 2: netpoll_send_skb (net/core/netpoll.c:319 net/core/netpoll.c:345 net/core/netpoll.c:393) netpoll_send_udp (net/core/netpoll.c:?) <snip> value changed: 0x0000000a -> 0xffffffff This happens because netpoll_owner_active() needs to check if the current CPU is the owner of the lock, touching napi->poll_owner non atomically. The ->poll_owner field contains the current CPU holding the lock. Use an atomic read to check if the poll owner is the current CPU. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240429100437.3487432-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit bf76b14 ] Currently, when using WCN7850 or QCN9274 as AP, ath12k always performs down grade phy mode operation regardless of whether the firmware supports EHT capability or not and then vdev will start in HE mode. When stations that support EHT capability try to connect to the AP, the AP will set phy mode to EHT after receiving the association request packet, and then send WMI_PEER_ASSOC_CMDID command to firmware, AP’s firmware will crash. This is because when the ath12k_mac_copy_sband_iftype_data() function handles EHT capability, it does not copy the EHT capability into the iftype[band][type] array according to the interface type. So, interface type should not be used as an index to get eht_cap in ath12k_mac_check_down_grade_phy_mode() function. To address this issue, use types_mask to select the eht_cap in ath12k_mac_check_down_grade_phy_mode() function. This patch affects QCN9274 and WCN7850 because they have the same issue. Hostapd log: wlo1: STA 02:03:7f:37:12:34 IEEE 802.11: Could not set STA to kernel driver Kernel log: [270894.816076] ath12k_pci 0000:03:00.0: failed to send WMI_PEER_SET_PARAM cmd [270894.816111] ath12k_pci 0000:03:00.0: failed to setup peer SMPS for vdev 0: -108 [270894.816122] ath12k_pci 0000:03:00.0: Failed to associate station: 02:03:7f:37:12:34 [270894.843389] ieee80211 phy5: Hardware restart was requested [270894.843517] ath12k_pci 0000:03:00.0: failed to lookup peer 02:03:7f:37:12:34 on vdev 0 [270894.843616] ath12k_pci 0000:03:00.0: failed to send WMI_PEER_DELETE cmd [270894.843650] ath12k_pci 0000:03:00.0: failed to delete peer vdev_id 0 addr 02:03:7f:37:12:34 ret -108 [270894.843663] ath12k_pci 0000:03:00.0: Failed to delete peer: 02:03:7f:37:12:34 for VDEV: 0 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240425083837.5340-1-quic_lingbok@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit ecf0b2b ] During chip recovery (e.g. chip reset), there is a possible situation that kernel worker reset_work is holding the lock and waiting for kernel thread stat_worker to be parked, while stat_worker is waiting for the release of the same lock. It causes a deadlock resulting in the dumping of hung tasks messages and possible rebooting of the device. This patch prevents the execution of stat_worker during the chip recovery. Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Ming Yen Hsieh <MingYen.Hsieh@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit dd2c345 ] This device sometimes doesn't send touch release signals when moving from >=4 fingers to <4 fingers. Using MT_QUIRK_NOT_SEEN_MEANS_UP instead of MT_QUIRK_ALWAYS_VALID makes sure that no touches become stuck. MT_QUIRK_FORCE_MULTI_INPUT is not necessary for this device, but does no harm. Signed-off-by: Sean O'Brien <seobrien@chromium.org> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 59d2f5b ] Adjusts the report descriptor for N-Key devices to make the output count 0x01 which completely avoids the need for a block of filtering. Signed-off-by: Luke D. Jones <luke@ljones.dev> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit c901f63 ] Lenovo Slim 7 16ARH7 is a machine with switchable graphics between AMD and Nvidia, and the backlight can't be adjusted properly unless acpi_backlight=native is passed. Although nvidia-wmi-backlight is present and loaded, this doesn't work as expected at all. For making it working as default, add the corresponding quirk entry with a DMI matching "LENOVO" "82UX". Link: https://bugzilla.suse.com/show_bug.cgi?id=1217750 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit cd3fc3b ] [Changes from V1: - The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized. - This warning is only supported in GCC.] The BPF selftest verifier_global_subprogs.c contains code that purposedly performs out of bounds access to memory, to check whether the kernel verifier is able to catch them. For example: __noinline int global_unsupp(const int *mem) { if (!mem) return 0; return mem[100]; /* BOOM */ } With -O1 and higher and no inlining, GCC notices this fact and emits a "maybe uninitialized" warning. This is by design. Note that the emission of these warnings is highly dependent on the precise optimizations that are performed. This patch adds a compiler pragma to verifier_global_subprogs.c to ignore these warnings. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 2d3b8df ] On slow machines the SND timestamp sometimes doesn't arrive before we quit. The test only waits as long as the packet delay, so it's easy for a race condition to happen. Double the wait but do a bit of polling, once the SND timestamp arrives there's no point to wait any longer. This fixes the "TXTIME abs" failures on debug kernels, like: Case ICMPv4 - TXTIME abs returned '', expected 'OK' Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240510005705.43069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit b37b98a ] Validate that `first` flag is set only for the first descriptor in multi-buffer packets. In case of an invalid descriptor, a reset will occur. A new reset reason for RX data corruption has been added. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240512134637.25299-4-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 68aa210 ] This fixes some CHECKs reported by the checkpatch script. Issues reported in ath3k.c: ------- ath3k.c ------- CHECK: Please don't use multiple blank lines + + CHECK: Blank lines aren't necessary after an open brace '{' +static const struct usb_device_id ath3k_blist_tbl[] = { + CHECK: Alignment should match open parenthesis +static int ath3k_load_firmware(struct usb_device *udev, + const struct firmware *firmware) CHECK: Alignment should match open parenthesis + err = usb_bulk_msg(udev, pipe, send_buf, size, + &len, 3000); CHECK: Unnecessary parentheses around 'len != size' + if (err || (len != size)) { CHECK: Alignment should match open parenthesis +static int ath3k_get_version(struct usb_device *udev, + struct ath3k_version *version) CHECK: Alignment should match open parenthesis +static int ath3k_load_fwfile(struct usb_device *udev, + const struct firmware *firmware) CHECK: Alignment should match open parenthesis + err = usb_bulk_msg(udev, pipe, send_buf, size, + &len, 3000); CHECK: Unnecessary parentheses around 'len != size' + if (err || (len != size)) { CHECK: Blank lines aren't necessary after an open brace '{' + switch (fw_version.ref_clock) { + CHECK: Alignment should match open parenthesis + snprintf(filename, ATH3K_NAME_LEN, "ar3k/ramps_0x%08x_%d%s", + le32_to_cpu(fw_version.rom_version), clk_value, ".dfu"); CHECK: Alignment should match open parenthesis +static int ath3k_probe(struct usb_interface *intf, + const struct usb_device_id *id) CHECK: Alignment should match open parenthesis + BT_ERR("Firmware file \"%s\" not found", + ATH3K_FIRMWARE); CHECK: Alignment should match open parenthesis + BT_ERR("Firmware file \"%s\" request failed (err=%d)", + ATH3K_FIRMWARE, ret); total: 0 errors, 0 warnings, 14 checks, 540 lines checked Signed-off-by: Uri Arev <me@wantyapps.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit f30a3be ] [WHY] PSP can access DCN registers during command submission and we need to ensure that DCN is not in PG before doing so. [HOW] Add a callback to DM to lock and notify DC for idle optimization exit. It can't be DC directly because of a potential race condition with the link protection thread and the rest of DM operation. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit b5b6d62 ] [Why] Cursor update can be pre-empted by a request for setting target flip submission. This causes an issue where we're in the middle of the exit sequence trying to log to DM, but the pre-emption starts another DMCUB command submission that requires being out of idle. The DC lock aqusition can fail, and depending on the DM/OS interface it's possible that the function inserted into this thread must not fail. This means that lock aqusition must be skipped and exit *must* occur. [How] Modify when we consider idle as active. Consider it exited only once the exit has fully finished. Consider it as entered prior to actual notification. Since we're on the same core/thread the cached values are coherent and we'll see that we still need to exit. Once the cursor update resumes it'll continue doing the double exit but this won't cause a functional issue, just a (potential) redundant operation. Reviewed-by: Duncan Ma <duncan.ma@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 23f1d8b ] The Z830 has some buttons that will only work properly as "quickstart" buttons. To enable them in that mode, a value between 1 and 7 must be used for HCI_HOTKEY_EVENT. Windows uses 0x5 on this laptop so use that for maximum predictability and compatibility. As there is not yet a known way of auto detection, this patch uses a DMI quirk table. A module parameter is exposed to allow setting this on other models for testing. Signed-off-by: Arvid Norlander <lkml@vorpal.se> Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240131111641.4418-3-W_Armin@gmx.de Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

"ioft" is a 48.16 fixed point type. Signed-off-by: Janne Grunau <j@jannau.net>

Apple Silicon devices, particularly Macs, continue Apple's tradition of integrating a ridiculous number of sensors on their products. Most of these report to the System Management Controller, and their data is exposed by the SMC firmware as simple key-value pairs. This commit adds an hwmon drive for retrieving the data reported by the temperature, voltage, current and power sensors, as well as fan data. Devices each expose their own unique set of sensors and fans, even with the SMC being baked into the SoC. This driver walks a devicetree node in order to discover the sensors present on the current device at runtime. Co-developed-by: Janne Grunau <j@jannau.net> Signed-off-by: Janne Grunau <j@jannau.net> Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

Signed-off-by: Janne Grunau <j@jannau.net>

Each SoC's SMC exposes a different set of hardware sensor keys, however there are a number that are shared between all currently supported SoCs. Describe these in a .dtsi so that we don't need to duplicate them across every SoC. Likewise, the fans on every machine are exposed as the same set of keys on each. Add .dtsis for these too. Co-developed-by: Janne Grunau <j@jannau.net> Signed-off-by: Janne Grunau <j@jannau.net> Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

chadmed · 2024-07-01T06:09:25Z

I have pulled in @jannau's fixes and squashed the fan changes into a single commit adding all the "common" files. I have also split out each SoC into a separate commit and completely removed keys that were not readable on any of the machines I have access to.

EDIT: also rebased

jannau · 2024-07-08T07:58:15Z

I've pulled the code into bits/110-smc but haven't merged any of the hwmon devicetree changes. Mostly over the CPU core temp sensors. I still not convinced that there are temp sensors for the efficiency cores. I can't identify any temp sensor with a clear temp increase signal when running a CPU stressor on any of the efficiency cores. I would expect that ~0.6 W power use on the area of an efficiency core would be clearly identifiable.

misc issues spotted so far: hwmon-t8112.dtsi has only 7 CPU core sensors, we can't use t600x / t602x as includes since we need to handle M1 / M2 Ultra separately, we should think about how we can handle disabled CPU cores like on M1/M2 Pro.

chadmed · 2024-07-08T09:21:47Z

I agree with not merging the DT changes until we figure out more stuff. FWIW, the reason t8112 lost a sensor is because I have no t8112 devices where the dropped CPU temp sensor was able to be read, and part of the final cleanup I did was getting rid of stuff that I could not verify as working on any hardware.

Same deal with the ecore sensors on t8112 - I could not get them to light up with an ecore load however they did materially change when burning even one of the pcores... The only solution here may be to brute force every T*** combination with a m1n1 experiment or something :/

userfaultfd_move() checks whether the PTE entry is present or a swap entry. - If the PTE entry is present, move_present_pte() handles folio migration by setting: src_folio->index = linear_page_index(dst_vma, dst_addr); - If the PTE entry is a swap entry, move_swap_pte() simply copies the PTE to the new dst_addr. This approach is incorrect because, even if the PTE is a swap entry, it can still reference a folio that remains in the swap cache. This creates a race window between steps 2 and 4. 1. add_to_swap: The folio is added to the swapcache. 2. try_to_unmap: PTEs are converted to swap entries. 3. pageout: The folio is written back. 4. Swapcache is cleared. If userfaultfd_move() occurs in the window between steps 2 and 4, after the swap PTE has been moved to the destination, accessing the destination triggers do_swap_page(), which may locate the folio in the swapcache. However, since the folio's index has not been updated to match the destination VMA, do_swap_page() will detect a mismatch. This can result in two critical issues depending on the system configuration. If KSM is disabled, both small and large folios can trigger a BUG during the add_rmap operation due to: page_pgoff(folio, page) != linear_page_index(vma, address) [ 13.336953] page: refcount:6 mapcount:1 mapping:00000000f43db19c index:0xffffaf150 pfn:0x4667c [ 13.337520] head: order:2 mapcount:1 entire_mapcount:0 nr_pages_mapped:1 pincount:0 [ 13.337716] memcg:ffff00000405f000 [ 13.337849] anon flags: 0x3fffc0000020459(locked|uptodate|dirty|owner_priv_1|head|swapbacked|node=0|zone=0|lastcpupid=0xffff) [ 13.338630] raw: 03fffc0000020459 ffff80008507b538 ffff80008507b538 ffff000006260361 [ 13.338831] raw: 0000000ffffaf150 0000000000004000 0000000600000000 ffff00000405f000 [ 13.339031] head: 03fffc0000020459 ffff80008507b538 ffff80008507b538 ffff000006260361 [ 13.339204] head: 0000000ffffaf150 0000000000004000 0000000600000000 ffff00000405f000 [ 13.339375] head: 03fffc0000000202 fffffdffc0199f01 ffffffff00000000 0000000000000001 [ 13.339546] head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 13.339736] page dumped because: VM_BUG_ON_PAGE(page_pgoff(folio, page) != linear_page_index(vma, address)) [ 13.340190] ------------[ cut here ]------------ [ 13.340316] kernel BUG at mm/rmap.c:1380! [ 13.340683] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [ 13.340969] Modules linked in: [ 13.341257] CPU: 1 UID: 0 PID: 107 Comm: a.out Not tainted 6.14.0-rc3-gcf42737e247a-dirty AsahiLinux#299 [ 13.341470] Hardware name: linux,dummy-virt (DT) [ 13.341671] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 13.341815] pc : __page_check_anon_rmap+0xa0/0xb0 [ 13.341920] lr : __page_check_anon_rmap+0xa0/0xb0 [ 13.342018] sp : ffff80008752bb20 [ 13.342093] x29: ffff80008752bb20 x28: fffffdffc0199f00 x27: 0000000000000001 [ 13.342404] x26: 0000000000000000 x25: 0000000000000001 x24: 0000000000000001 [ 13.342575] x23: 0000ffffaf0d0000 x22: 0000ffffaf0d0000 x21: fffffdffc0199f00 [ 13.342731] x20: fffffdffc0199f00 x19: ffff000006210700 x18: 00000000ffffffff [ 13.342881] x17: 6c203d2120296567 x16: 6170202c6f696c6f x15: 662866666f67705f [ 13.343033] x14: 6567617028454741 x13: 2929737365726464 x12: ffff800083728ab0 [ 13.343183] x11: ffff800082996bf8 x10: 0000000000000fd7 x9 : ffff80008011bc40 [ 13.343351] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000829eebf8 [ 13.343498] x5 : c0000000fffff000 x4 : 0000000000000000 x3 : 0000000000000000 [ 13.343645] x2 : 0000000000000000 x1 : ffff0000062db980 x0 : 000000000000005f [ 13.343876] Call trace: [ 13.344045] __page_check_anon_rmap+0xa0/0xb0 (P) [ 13.344234] folio_add_anon_rmap_ptes+0x22c/0x320 [ 13.344333] do_swap_page+0x1060/0x1400 [ 13.344417] __handle_mm_fault+0x61c/0xbc8 [ 13.344504] handle_mm_fault+0xd8/0x2e8 [ 13.344586] do_page_fault+0x20c/0x770 [ 13.344673] do_translation_fault+0xb4/0xf0 [ 13.344759] do_mem_abort+0x48/0xa0 [ 13.344842] el0_da+0x58/0x130 [ 13.344914] el0t_64_sync_handler+0xc4/0x138 [ 13.345002] el0t_64_sync+0x1ac/0x1b0 [ 13.345208] Code: aa1503e0 f000f801 910f6021 97ff5779 (d4210000) [ 13.345504] ---[ end trace 0000000000000000 ]--- [ 13.345715] note: a.out[107] exited with irqs disabled [ 13.345954] note: a.out[107] exited with preempt_count 2 If KSM is enabled, Peter Xu also discovered that do_swap_page() may trigger an unexpected CoW operation for small folios because ksm_might_need_to_copy() allocates a new folio when the folio index does not match linear_page_index(vma, addr). This patch also checks the swapcache when handling swap entries. If a match is found in the swapcache, it processes it similarly to a present PTE. However, there are some differences. For example, the folio is no longer exclusive because folio_try_share_anon_rmap_pte() is performed during unmapping. Furthermore, in the case of swapcache, the folio has already been unmapped, eliminating the risk of concurrent rmap walks and removing the need to acquire src_folio's anon_vma or lock. Note that for large folios, in the swapcache handling path, we directly return -EBUSY since split_folio() will return -EBUSY regardless if the folio is under writeback or unmapped. This is not an urgent issue, so a follow-up patch may address it separately. [v-songbaohua@oppo.com: minor cleanup according to Peter Xu] Link: https://lkml.kernel.org/r/20250226024411.47092-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20250226001400.9129-1-21cnbao@gmail.com Fixes: adef440 ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

commit c50f8e6 upstream. userfaultfd_move() checks whether the PTE entry is present or a swap entry. - If the PTE entry is present, move_present_pte() handles folio migration by setting: src_folio->index = linear_page_index(dst_vma, dst_addr); - If the PTE entry is a swap entry, move_swap_pte() simply copies the PTE to the new dst_addr. This approach is incorrect because, even if the PTE is a swap entry, it can still reference a folio that remains in the swap cache. This creates a race window between steps 2 and 4. 1. add_to_swap: The folio is added to the swapcache. 2. try_to_unmap: PTEs are converted to swap entries. 3. pageout: The folio is written back. 4. Swapcache is cleared. If userfaultfd_move() occurs in the window between steps 2 and 4, after the swap PTE has been moved to the destination, accessing the destination triggers do_swap_page(), which may locate the folio in the swapcache. However, since the folio's index has not been updated to match the destination VMA, do_swap_page() will detect a mismatch. This can result in two critical issues depending on the system configuration. If KSM is disabled, both small and large folios can trigger a BUG during the add_rmap operation due to: page_pgoff(folio, page) != linear_page_index(vma, address) [ 13.336953] page: refcount:6 mapcount:1 mapping:00000000f43db19c index:0xffffaf150 pfn:0x4667c [ 13.337520] head: order:2 mapcount:1 entire_mapcount:0 nr_pages_mapped:1 pincount:0 [ 13.337716] memcg:ffff00000405f000 [ 13.337849] anon flags: 0x3fffc0000020459(locked|uptodate|dirty|owner_priv_1|head|swapbacked|node=0|zone=0|lastcpupid=0xffff) [ 13.338630] raw: 03fffc0000020459 ffff80008507b538 ffff80008507b538 ffff000006260361 [ 13.338831] raw: 0000000ffffaf150 0000000000004000 0000000600000000 ffff00000405f000 [ 13.339031] head: 03fffc0000020459 ffff80008507b538 ffff80008507b538 ffff000006260361 [ 13.339204] head: 0000000ffffaf150 0000000000004000 0000000600000000 ffff00000405f000 [ 13.339375] head: 03fffc0000000202 fffffdffc0199f01 ffffffff00000000 0000000000000001 [ 13.339546] head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 13.339736] page dumped because: VM_BUG_ON_PAGE(page_pgoff(folio, page) != linear_page_index(vma, address)) [ 13.340190] ------------[ cut here ]------------ [ 13.340316] kernel BUG at mm/rmap.c:1380! [ 13.340683] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [ 13.340969] Modules linked in: [ 13.341257] CPU: 1 UID: 0 PID: 107 Comm: a.out Not tainted 6.14.0-rc3-gcf42737e247a-dirty #299 [ 13.341470] Hardware name: linux,dummy-virt (DT) [ 13.341671] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 13.341815] pc : __page_check_anon_rmap+0xa0/0xb0 [ 13.341920] lr : __page_check_anon_rmap+0xa0/0xb0 [ 13.342018] sp : ffff80008752bb20 [ 13.342093] x29: ffff80008752bb20 x28: fffffdffc0199f00 x27: 0000000000000001 [ 13.342404] x26: 0000000000000000 x25: 0000000000000001 x24: 0000000000000001 [ 13.342575] x23: 0000ffffaf0d0000 x22: 0000ffffaf0d0000 x21: fffffdffc0199f00 [ 13.342731] x20: fffffdffc0199f00 x19: ffff000006210700 x18: 00000000ffffffff [ 13.342881] x17: 6c203d2120296567 x16: 6170202c6f696c6f x15: 662866666f67705f [ 13.343033] x14: 6567617028454741 x13: 2929737365726464 x12: ffff800083728ab0 [ 13.343183] x11: ffff800082996bf8 x10: 0000000000000fd7 x9 : ffff80008011bc40 [ 13.343351] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000829eebf8 [ 13.343498] x5 : c0000000fffff000 x4 : 0000000000000000 x3 : 0000000000000000 [ 13.343645] x2 : 0000000000000000 x1 : ffff0000062db980 x0 : 000000000000005f [ 13.343876] Call trace: [ 13.344045] __page_check_anon_rmap+0xa0/0xb0 (P) [ 13.344234] folio_add_anon_rmap_ptes+0x22c/0x320 [ 13.344333] do_swap_page+0x1060/0x1400 [ 13.344417] __handle_mm_fault+0x61c/0xbc8 [ 13.344504] handle_mm_fault+0xd8/0x2e8 [ 13.344586] do_page_fault+0x20c/0x770 [ 13.344673] do_translation_fault+0xb4/0xf0 [ 13.344759] do_mem_abort+0x48/0xa0 [ 13.344842] el0_da+0x58/0x130 [ 13.344914] el0t_64_sync_handler+0xc4/0x138 [ 13.345002] el0t_64_sync+0x1ac/0x1b0 [ 13.345208] Code: aa1503e0 f000f801 910f6021 97ff5779 (d4210000) [ 13.345504] ---[ end trace 0000000000000000 ]--- [ 13.345715] note: a.out[107] exited with irqs disabled [ 13.345954] note: a.out[107] exited with preempt_count 2 If KSM is enabled, Peter Xu also discovered that do_swap_page() may trigger an unexpected CoW operation for small folios because ksm_might_need_to_copy() allocates a new folio when the folio index does not match linear_page_index(vma, addr). This patch also checks the swapcache when handling swap entries. If a match is found in the swapcache, it processes it similarly to a present PTE. However, there are some differences. For example, the folio is no longer exclusive because folio_try_share_anon_rmap_pte() is performed during unmapping. Furthermore, in the case of swapcache, the folio has already been unmapped, eliminating the risk of concurrent rmap walks and removing the need to acquire src_folio's anon_vma or lock. Note that for large folios, in the swapcache handling path, we directly return -EBUSY since split_folio() will return -EBUSY regardless if the folio is under writeback or unmapped. This is not an urgent issue, so a follow-up patch may address it separately. [v-songbaohua@oppo.com: minor cleanup according to Peter Xu] Link: https://lkml.kernel.org/r/20250226024411.47092-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20250226001400.9129-1-21cnbao@gmail.com Fixes: adef440 ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ surenb: resolved merged conflict caused by the difference in move_swap_pte() arguments] Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

zzywysm reviewed Jun 7, 2024

View reviewed changes

Comment thread drivers/hwmon/Kconfig

chadmed force-pushed the asahi-hwmon-merge-v2 branch from c747afc to 50cce7b Compare June 9, 2024 00:29

jannau reviewed Jun 10, 2024

View reviewed changes

chadmed force-pushed the asahi-hwmon-merge-v2 branch from b96b0ac to 31c9fad Compare June 15, 2024 13:05

jannau force-pushed the asahi-wip branch from 85037f1 to a226fbf Compare June 15, 2024 13:31

chadmed force-pushed the asahi-hwmon-merge-v2 branch 2 times, most recently from 070a639 to 9345bce Compare June 16, 2024 02:20

jannau force-pushed the asahi-wip branch 5 times, most recently from dc04487 to 138f32e Compare June 22, 2024 08:58

chadmed force-pushed the asahi-hwmon-merge-v2 branch 2 times, most recently from db96da1 to 2b0ad20 Compare June 22, 2024 13:26

luizluca and others added 15 commits June 27, 2024 13:52

jannau added 2 commits June 28, 2024 08:17

Merge branch 'bits/230-touchbar' into asahi-wip

e9eaf1e

Merge branch 'bits/240-isp' into asahi-wip

98953ab

jannau force-pushed the asahi-wip branch from 138f32e to 98953ab Compare June 28, 2024 06:39

chadmed force-pushed the asahi-hwmon-merge-v2 branch from 2b0ad20 to 10e36ca Compare July 1, 2024 06:05

jannau and others added 4 commits July 1, 2024 16:06

platform/apple: smc: Add apple_smc_read_ioft_scaled

5292cd7

"ioft" is a 48.16 fixed point type. Signed-off-by: Janne Grunau <j@jannau.net>

hwmon: macsmc: Avoid warning on 32-bit compile test builds

81c2834

Signed-off-by: Janne Grunau <j@jannau.net>

hwmon: macsmc: Avoid global writable hwmon_chip_info

2d6f806

Signed-off-by: Janne Grunau <j@jannau.net>

chadmed force-pushed the asahi-hwmon-merge-v2 branch from 10e36ca to 2806038 Compare July 1, 2024 06:06

chadmed added 5 commits July 1, 2024 16:07

arm64: apple: dts: t8112: add initial hwmon keys

6e6fa60

Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

arm64: dts: apple: t8103: add initial hwmon keys

c602e22

Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

arm64: apple: dts: t600x: add initial hwmon keys

143e828

Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

arm64: apple: dts: t602x: add initial hwmon keys

77b479a

Signed-off-by: James Calligeros <jcalligeros99@gmail.com>

chadmed force-pushed the asahi-hwmon-merge-v2 branch from 2806038 to 77b479a Compare July 1, 2024 06:08

jannau force-pushed the asahi-wip branch from 98953ab to 3d4b457 Compare July 7, 2024 08:47

jannau force-pushed the asahi-wip branch from f0760b4 to db6542f Compare July 11, 2024 19:42

jannau force-pushed the asahi-wip branch 4 times, most recently from 83a2784 to 15b881f Compare July 29, 2024 19:30

jannau force-pushed the asahi-wip branch from 15b881f to bfbb944 Compare August 4, 2024 17:49

jannau force-pushed the asahi-wip branch 2 times, most recently from ccc12d2 to 24c1838 Compare August 21, 2024 20:21

chadmed closed this Aug 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: SMC hwmon driver#299

RFC: SMC hwmon driver#299
chadmed wants to merge 2545 commits intoAsahiLinux:asahi-wipfrom
chadmed:asahi-hwmon-merge-v2

chadmed commented Jun 5, 2024

Uh oh!

Uh oh!

Uh oh!

jannau Jun 10, 2024

Uh oh!

chadmed Jun 15, 2024

Uh oh!

Uh oh!

Uh oh!

jannau Jun 10, 2024

Uh oh!

chadmed Jun 15, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chadmed commented Jun 16, 2024

Uh oh!

chadmed commented Jul 1, 2024 •

edited

Loading

Uh oh!

jannau commented Jul 8, 2024

Uh oh!

chadmed commented Jul 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Conversation

chadmed commented Jun 5, 2024

Uh oh!

Uh oh!

Uh oh!

jannau Jun 10, 2024

Choose a reason for hiding this comment

Uh oh!

chadmed Jun 15, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jannau Jun 10, 2024

Choose a reason for hiding this comment

Uh oh!

chadmed Jun 15, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chadmed commented Jun 16, 2024

Uh oh!

chadmed commented Jul 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jannau commented Jul 8, 2024

Uh oh!

chadmed commented Jul 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

chadmed commented Jul 1, 2024 •

edited

Loading