From 1d62b7ac83e0791200563b9c2deb658c2d196dcf Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Fri, 4 Jun 2021 17:53:04 +0200 Subject: [PATCH 001/269] btrfs: tree-checker: do not error out if extent ref hash doesn't match MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 1119a72e223f3073a604f8fccb3a470ccd8a4416 upstream. The tree checker checks the extent ref hash at read and write time to make sure we do not corrupt the file system. Generally extent references go inline, but if we have enough of them we need to make an item, which looks like key.objectid = key.type = key.offset = hash(tree, owner, offset) However if key.offset collide with an unrelated extent reference we'll simply key.offset++ until we get something that doesn't collide. Obviously this doesn't match at tree checker time, and thus we error while writing out the transaction. This is relatively easy to reproduce, simply do something like the following xfs_io -f -c "pwrite 0 1M" file offset=2 for i in {0..10000} do xfs_io -c "reflink file 0 ${offset}M 1M" file offset=$(( offset + 2 )) done xfs_io -c "reflink file 0 17999258914816 1M" file xfs_io -c "reflink file 0 35998517829632 1M" file xfs_io -c "reflink file 0 53752752058368 1M" file btrfs filesystem sync And the sync will error out because we'll abort the transaction. The magic values above are used because they generate hash collisions with the first file in the main subvol. The fix for this is to remove the hash value check from tree checker, as we have no idea which offset ours should belong to. Reported-by: Tuomas Lähdekorpi Fixes: 0785a9aacf9d ("btrfs: tree-checker: Add EXTENT_DATA_REF check") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Reviewed-by: David Sterba [ add comment] Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/tree-checker.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index 40845428b739c..d4a3a56726aa8 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -1440,22 +1440,14 @@ static int check_extent_data_ref(struct extent_buffer *leaf, return -EUCLEAN; } for (; ptr < end; ptr += sizeof(*dref)) { - u64 root_objectid; - u64 owner; u64 offset; - u64 hash; + /* + * We cannot check the extent_data_ref hash due to possible + * overflow from the leaf due to hash collisions. + */ dref = (struct btrfs_extent_data_ref *)ptr; - root_objectid = btrfs_extent_data_ref_root(leaf, dref); - owner = btrfs_extent_data_ref_objectid(leaf, dref); offset = btrfs_extent_data_ref_offset(leaf, dref); - hash = hash_extent_data_ref(root_objectid, owner, offset); - if (hash != key->offset) { - extent_err(leaf, slot, - "invalid extent data ref hash, item has 0x%016llx key has 0x%016llx", - hash, key->offset); - return -EUCLEAN; - } if (!IS_ALIGNED(offset, leaf->fs_info->sectorsize)) { extent_err(leaf, slot, "invalid extent data backref offset, have %llu expect aligned to %u", From 70df000fb8808e2ea63e6beb2ff570c083592614 Mon Sep 17 00:00:00 2001 From: Grant Grundler Date: Tue, 19 Jan 2021 17:12:08 -0800 Subject: [PATCH 002/269] net: usb: cdc_ncm: don't spew notifications [ Upstream commit de658a195ee23ca6aaffe197d1d2ea040beea0a2 ] RTL8156 sends notifications about every 32ms. Only display/log notifications when something changes. This issue has been reported by others: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472 https://lkml.org/lkml/2020/8/27/1083 ... [785962.779840] usb 1-1: new high-speed USB device number 5 using xhci_hcd [785962.929944] usb 1-1: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=30.00 [785962.929949] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6 [785962.929952] usb 1-1: Product: USB 10/100/1G/2.5G LAN [785962.929954] usb 1-1: Manufacturer: Realtek [785962.929956] usb 1-1: SerialNumber: 000000001 [785962.991755] usbcore: registered new interface driver cdc_ether [785963.017068] cdc_ncm 1-1:2.0: MAC-Address: 00:24:27:88:08:15 [785963.017072] cdc_ncm 1-1:2.0: setting rx_max = 16384 [785963.017169] cdc_ncm 1-1:2.0: setting tx_max = 16384 [785963.017682] cdc_ncm 1-1:2.0 usb0: register 'cdc_ncm' at usb-0000:00:14.0-1, CDC NCM, 00:24:27:88:08:15 [785963.019211] usbcore: registered new interface driver cdc_ncm [785963.023856] usbcore: registered new interface driver cdc_wdm [785963.025461] usbcore: registered new interface driver cdc_mbim [785963.038824] cdc_ncm 1-1:2.0 enx002427880815: renamed from usb0 [785963.089586] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected [785963.121673] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected [785963.153682] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected ... This is about 2KB per second and will overwrite all contents of a 1MB dmesg buffer in under 10 minutes rendering them useless for debugging many kernel problems. This is also an extra 180 MB/day in /var/logs (or 1GB per week) rendering the majority of those logs useless too. When the link is up (expected state), spew amount is >2x higher: ... [786139.600992] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected [786139.632997] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink [786139.665097] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected [786139.697100] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink [786139.729094] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected [786139.761108] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink ... Chrome OS cannot support RTL8156 until this is fixed. Signed-off-by: Grant Grundler Reviewed-by: Hayes Wang Link: https://lore.kernel.org/r/20210120011208.3768105-1-grundler@chromium.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/usb/cdc_ncm.c | 12 +++++++++++- include/linux/usb/usbnet.h | 2 ++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index 854c6624e6859..1d3bf810f2ca1 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -1827,6 +1827,15 @@ cdc_ncm_speed_change(struct usbnet *dev, uint32_t rx_speed = le32_to_cpu(data->DLBitRRate); uint32_t tx_speed = le32_to_cpu(data->ULBitRate); + /* if the speed hasn't changed, don't report it. + * RTL8156 shipped before 2021 sends notification about every 32ms. + */ + if (dev->rx_speed == rx_speed && dev->tx_speed == tx_speed) + return; + + dev->rx_speed = rx_speed; + dev->tx_speed = tx_speed; + /* * Currently the USB-NET API does not support reporting the actual * device speed. Do print it instead. @@ -1867,7 +1876,8 @@ static void cdc_ncm_status(struct usbnet *dev, struct urb *urb) * USB_CDC_NOTIFY_NETWORK_CONNECTION notification shall be * sent by device after USB_CDC_NOTIFY_SPEED_CHANGE. */ - usbnet_link_change(dev, !!event->wValue, 0); + if (netif_carrier_ok(dev->net) != !!event->wValue) + usbnet_link_change(dev, !!event->wValue, 0); break; case USB_CDC_NOTIFY_SPEED_CHANGE: diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h index 2e4f7721fc4e7..8110c29fab42d 100644 --- a/include/linux/usb/usbnet.h +++ b/include/linux/usb/usbnet.h @@ -83,6 +83,8 @@ struct usbnet { # define EVENT_LINK_CHANGE 11 # define EVENT_SET_RX_MODE 12 # define EVENT_NO_IP_ALIGN 13 + u32 rx_speed; /* in bps - NOT Mbps */ + u32 tx_speed; /* in bps - NOT Mbps */ }; static inline struct usb_driver *driver_of(struct usb_interface *intf) From 0d0df2e53223167d305dde815664bc830cd4cce6 Mon Sep 17 00:00:00 2001 From: Armin Wolf Date: Thu, 13 May 2021 17:45:46 +0200 Subject: [PATCH 003/269] hwmon: (dell-smm-hwmon) Fix index values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 35d470b5fbc9f82feb77b56bb0d5d0b5cd73e9da ] When support for up to 10 temp sensors and for disabling automatic BIOS fan control was added, noone updated the index values used for disallowing fan support and fan type calls. Fix those values. Signed-off-by: Armin Wolf Reviewed-by: Pali Rohár Link: https://lore.kernel.org/r/20210513154546.12430-1-W_Armin@gmx.de Fixes: 1bb46a20e73b ("hwmon: (dell-smm) Support up to 10 temp sensors") Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin --- drivers/hwmon/dell-smm-hwmon.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c index 73b9db9e3aab6..63b74e781c5d9 100644 --- a/drivers/hwmon/dell-smm-hwmon.c +++ b/drivers/hwmon/dell-smm-hwmon.c @@ -838,10 +838,10 @@ static struct attribute *i8k_attrs[] = { static umode_t i8k_is_visible(struct kobject *kobj, struct attribute *attr, int index) { - if (disallow_fan_support && index >= 8) + if (disallow_fan_support && index >= 20) return 0; if (disallow_fan_type_call && - (index == 9 || index == 12 || index == 15)) + (index == 21 || index == 25 || index == 28)) return 0; if (index >= 0 && index <= 1 && !(i8k_hwmon_flags & I8K_HWMON_HAVE_TEMP1)) From 46e65061793405766c8368738cb5ecf5eca6caf3 Mon Sep 17 00:00:00 2001 From: Grant Peltier Date: Fri, 14 May 2021 16:19:55 -0500 Subject: [PATCH 004/269] hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_3 for RAA228228 [ Upstream commit 2a29db088c7ae7121801a0d7a60740ed2d18c4f3 ] The initial version of the RAA228228 datasheet claimed that the device supported READ_TEMPERATURE_3 but not READ_TEMPERATURE_1. It has since been discovered that the datasheet was incorrect. The RAA228228 does support READ_TEMPERATURE_1 but does not support READ_TEMPERATURE_3. Signed-off-by: Grant Peltier Fixes: 51fb91ed5a6f ("hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_1 telemetry for RAA228228") Link: https://lore.kernel.org/r/20210514211954.GA24646@raspberrypi Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin --- drivers/hwmon/pmbus/isl68137.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/pmbus/isl68137.c b/drivers/hwmon/pmbus/isl68137.c index 7cad76e07f701..3f1b826dac8a0 100644 --- a/drivers/hwmon/pmbus/isl68137.c +++ b/drivers/hwmon/pmbus/isl68137.c @@ -244,8 +244,8 @@ static int isl68137_probe(struct i2c_client *client) info->read_word_data = raa_dmpvr2_read_word_data; break; case raa_dmpvr2_2rail_nontc: - info->func[0] &= ~PMBUS_HAVE_TEMP; - info->func[1] &= ~PMBUS_HAVE_TEMP; + info->func[0] &= ~PMBUS_HAVE_TEMP3; + info->func[1] &= ~PMBUS_HAVE_TEMP3; fallthrough; case raa_dmpvr2_2rail: info->pages = 2; From 39a909a9720dd5a7b423f97c3737541d0ef76c06 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 4 May 2021 16:40:00 +0200 Subject: [PATCH 005/269] netfilter: conntrack: unregister ipv4 sockopts on error unwind [ Upstream commit 22cbdbcfb61acc78d5fc21ebb13ccc0d7e29f793 ] When ipv6 sockopt register fails, the ipv4 one needs to be removed. Fixes: a0ae2562c6c ("netfilter: conntrack: remove l3proto abstraction") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/nf_conntrack_proto.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c index 47e9319d2cf31..71892822bbf5d 100644 --- a/net/netfilter/nf_conntrack_proto.c +++ b/net/netfilter/nf_conntrack_proto.c @@ -660,7 +660,7 @@ int nf_conntrack_proto_init(void) #if IS_ENABLED(CONFIG_IPV6) cleanup_sockopt: - nf_unregister_sockopt(&so_getorigdst6); + nf_unregister_sockopt(&so_getorigdst); #endif return ret; } From 5148066edbdc89c6fe5bc419c31a5c22e5f83bdb Mon Sep 17 00:00:00 2001 From: Changbin Du Date: Wed, 24 Mar 2021 22:54:35 +0800 Subject: [PATCH 006/269] efi/fdt: fix panic when no valid fdt found [ Upstream commit 668a84c1bfb2b3fd5a10847825a854d63fac7baa ] setup_arch() would invoke efi_init()->efi_get_fdt_params(). If no valid fdt found then initial_boot_params will be null. So we should stop further fdt processing here. I encountered this issue on risc-v. Signed-off-by: Changbin Du Fixes: b91540d52a08b ("RISC-V: Add EFI runtime services") Signed-off-by: Ard Biesheuvel Signed-off-by: Sasha Levin --- drivers/firmware/efi/fdtparams.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/firmware/efi/fdtparams.c b/drivers/firmware/efi/fdtparams.c index bb042ab7c2be6..e901f8564ca0c 100644 --- a/drivers/firmware/efi/fdtparams.c +++ b/drivers/firmware/efi/fdtparams.c @@ -98,6 +98,9 @@ u64 __init efi_get_fdt_params(struct efi_memory_map_data *mm) BUILD_BUG_ON(ARRAY_SIZE(target) != ARRAY_SIZE(name)); BUILD_BUG_ON(ARRAY_SIZE(target) != ARRAY_SIZE(dt_params[0].params)); + if (!fdt) + return 0; + for (i = 0; i < ARRAY_SIZE(dt_params); i++) { node = fdt_path_offset(fdt, dt_params[i].path); if (node < 0) From b828601c752b1053a18f750bc413949cefa2a58b Mon Sep 17 00:00:00 2001 From: Heiner Kallweit Date: Fri, 30 Apr 2021 16:22:51 +0200 Subject: [PATCH 007/269] efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared [ Upstream commit 45add3cc99feaaf57d4b6f01d52d532c16a1caee ] UEFI spec 2.9, p.108, table 4-1 lists the scenario that both attributes are cleared with the description "No memory access protection is possible for Entry". So we can have valid entries where both attributes are cleared, so remove the check. Signed-off-by: Heiner Kallweit Fixes: 10f0d2f577053 ("efi: Implement generic support for the Memory Attributes table") Signed-off-by: Ard Biesheuvel Signed-off-by: Sasha Levin --- drivers/firmware/efi/memattr.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c index 5737cb0fcd44e..0a9aba5f9ceff 100644 --- a/drivers/firmware/efi/memattr.c +++ b/drivers/firmware/efi/memattr.c @@ -67,11 +67,6 @@ static bool entry_is_valid(const efi_memory_desc_t *in, efi_memory_desc_t *out) return false; } - if (!(in->attribute & (EFI_MEMORY_RO | EFI_MEMORY_XP))) { - pr_warn("Entry attributes invalid: RO and XP bits both cleared\n"); - return false; - } - if (PAGE_SIZE > EFI_PAGE_SIZE && (!PAGE_ALIGNED(in->phys_addr) || !PAGE_ALIGNED(in->num_pages << EFI_PAGE_SHIFT))) { From 951f8ef71d691f754483cc9d871cb493379fdb35 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 23 Apr 2021 14:48:31 +0300 Subject: [PATCH 008/269] efi/libstub: prevent read overflow in find_file_option() [ Upstream commit c4039b29fe9637e1135912813f830994af4c867f ] If the buffer has slashes up to the end then this will read past the end of the array. I don't anticipate that this is an issue for many people in real life, but it's the right thing to do and it makes static checkers happy. Fixes: 7a88a6227dc7 ("efi/libstub: Fix path separator regression") Signed-off-by: Dan Carpenter Signed-off-by: Ard Biesheuvel Signed-off-by: Sasha Levin --- drivers/firmware/efi/libstub/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/firmware/efi/libstub/file.c b/drivers/firmware/efi/libstub/file.c index 4e81c6077188e..dd95f330fe6e1 100644 --- a/drivers/firmware/efi/libstub/file.c +++ b/drivers/firmware/efi/libstub/file.c @@ -103,7 +103,7 @@ static int find_file_option(const efi_char16_t *cmdline, int cmdline_len, return 0; /* Skip any leading slashes */ - while (cmdline[i] == L'/' || cmdline[i] == L'\\') + while (i < cmdline_len && (cmdline[i] == L'/' || cmdline[i] == L'\\')) i++; while (--result_len > 0 && i < cmdline_len) { From 8d27efbb0ee49251169feda005ec244aef73f8a6 Mon Sep 17 00:00:00 2001 From: Rasmus Villemoes Date: Wed, 21 Apr 2021 21:46:36 +0200 Subject: [PATCH 009/269] efi: cper: fix snprintf() use in cper_dimm_err_location() [ Upstream commit 942859d969de7f6f7f2659a79237a758b42782da ] snprintf() should be given the full buffer size, not one less. And it guarantees nul-termination, so doing it manually afterwards is pointless. It's even potentially harmful (though probably not in practice because CPER_REC_LEN is 256), due to the "return how much would have been written had the buffer been big enough" semantics. I.e., if the bank and/or device strings are long enough that the "DIMM location ..." output gets truncated, writing to msg[n] is a buffer overflow. Signed-off-by: Rasmus Villemoes Fixes: 3760cd20402d4 ("CPER: Adjust code flow of some functions") Signed-off-by: Ard Biesheuvel Signed-off-by: Sasha Levin --- drivers/firmware/efi/cper.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index e15d484b6a5a7..ea7ca74fc1730 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -276,8 +276,7 @@ static int cper_dimm_err_location(struct cper_mem_err_compact *mem, char *msg) if (!msg || !(mem->validation_bits & CPER_MEM_VALID_MODULE_HANDLE)) return 0; - n = 0; - len = CPER_REC_LEN - 1; + len = CPER_REC_LEN; dmi_memdev_name(mem->mem_dev_handle, &bank, &device); if (bank && device) n = snprintf(msg, len, "DIMM location: %s %s ", bank, device); @@ -286,7 +285,6 @@ static int cper_dimm_err_location(struct cper_mem_err_compact *mem, char *msg) "DIMM location: not present. DMI handle: 0x%.4x ", mem->mem_dev_handle); - msg[n] = '\0'; return n; } From c303db1211a751d1e2c6a016687ef5e9734ee4b8 Mon Sep 17 00:00:00 2001 From: Zhen Lei Date: Sat, 15 May 2021 10:04:58 +0800 Subject: [PATCH 010/269] vfio/pci: Fix error return code in vfio_ecap_init() [ Upstream commit d1ce2c79156d3baf0830990ab06d296477b93c26 ] The error code returned from vfio_ext_cap_len() is stored in 'len', not in 'ret'. Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Hulk Robot Signed-off-by: Zhen Lei Reviewed-by: Max Gurtovoy Message-Id: <20210515020458.6771-1-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson Signed-off-by: Sasha Levin --- drivers/vfio/pci/vfio_pci_config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index a402adee8a215..47f21a6ca7fe9 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -1581,7 +1581,7 @@ static int vfio_ecap_init(struct vfio_pci_device *vdev) if (len == 0xFF) { len = vfio_ext_cap_len(vdev, ecap, epos); if (len < 0) - return ret; + return len; } } From c25454a4f4cbb172c47a6a49506de8ab133928a1 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Sat, 15 May 2021 12:08:56 -0700 Subject: [PATCH 011/269] vfio/pci: zap_vma_ptes() needs MMU [ Upstream commit 2a55ca37350171d9b43d561528f23d4130097255 ] zap_vma_ptes() is only available when CONFIG_MMU is set/enabled. Without CONFIG_MMU, vfio_pci.o has build errors, so make VFIO_PCI depend on MMU. riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `vfio_pci_mmap_open': vfio_pci.c:(.text+0x1ec): undefined reference to `zap_vma_ptes' riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `.L0 ': vfio_pci.c:(.text+0x165c): undefined reference to `zap_vma_ptes' Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Randy Dunlap Reported-by: kernel test robot Cc: Alex Williamson Cc: Cornelia Huck Cc: kvm@vger.kernel.org Cc: Jason Gunthorpe Cc: Eric Auger Message-Id: <20210515190856.2130-1-rdunlap@infradead.org> Signed-off-by: Alex Williamson Signed-off-by: Sasha Levin --- drivers/vfio/pci/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig index 0f28bf99efebc..4e1107767e29b 100644 --- a/drivers/vfio/pci/Kconfig +++ b/drivers/vfio/pci/Kconfig @@ -2,6 +2,7 @@ config VFIO_PCI tristate "VFIO support for PCI devices" depends on VFIO && PCI && EVENTFD + depends on MMU select VFIO_VIRQFD select IRQ_BYPASS_MANAGER help From 2adb0313b132cdade0827820f1d0db4833dfb4ef Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Thu, 20 May 2021 13:36:41 +0000 Subject: [PATCH 012/269] samples: vfio-mdev: fix error handing in mdpy_fb_probe() [ Upstream commit 752774ce7793a1f8baa55aae31f3b4caac49cbe4 ] Fix to return a negative error code from the framebuffer_alloc() error handling case instead of 0, also release regions in some error handing cases. Fixes: cacade1946a4 ("sample: vfio mdev display - guest driver") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Message-Id: <20210520133641.1421378-1-weiyongjun1@huawei.com> Signed-off-by: Alex Williamson Signed-off-by: Sasha Levin --- samples/vfio-mdev/mdpy-fb.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/samples/vfio-mdev/mdpy-fb.c b/samples/vfio-mdev/mdpy-fb.c index 21dbf63d6e415..9ec93d90e8a5a 100644 --- a/samples/vfio-mdev/mdpy-fb.c +++ b/samples/vfio-mdev/mdpy-fb.c @@ -117,22 +117,27 @@ static int mdpy_fb_probe(struct pci_dev *pdev, if (format != DRM_FORMAT_XRGB8888) { pci_err(pdev, "format mismatch (0x%x != 0x%x)\n", format, DRM_FORMAT_XRGB8888); - return -EINVAL; + ret = -EINVAL; + goto err_release_regions; } if (width < 100 || width > 10000) { pci_err(pdev, "width (%d) out of range\n", width); - return -EINVAL; + ret = -EINVAL; + goto err_release_regions; } if (height < 100 || height > 10000) { pci_err(pdev, "height (%d) out of range\n", height); - return -EINVAL; + ret = -EINVAL; + goto err_release_regions; } pci_info(pdev, "mdpy found: %dx%d framebuffer\n", width, height); info = framebuffer_alloc(sizeof(struct mdpy_fb_par), &pdev->dev); - if (!info) + if (!info) { + ret = -ENOMEM; goto err_release_regions; + } pci_set_drvdata(pdev, info); par = info->par; From 46ae882bb19a12e8df9936e8dcf1924b6abb2b56 Mon Sep 17 00:00:00 2001 From: Max Gurtovoy Date: Tue, 18 May 2021 22:21:31 +0300 Subject: [PATCH 013/269] vfio/platform: fix module_put call in error flow [ Upstream commit dc51ff91cf2d1e9a2d941da483602f71d4a51472 ] The ->parent_module is the one that use in try_module_get. It should also be the one the we use in module_put during vfio_platform_open(). Fixes: 32a2d71c4e80 ("vfio: platform: introduce vfio-platform-base module") Signed-off-by: Max Gurtovoy Message-Id: <20210518192133.59195-1-mgurtovoy@nvidia.com> Signed-off-by: Alex Williamson Signed-off-by: Sasha Levin --- drivers/vfio/platform/vfio_platform_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index fb4b385191f28..e83a7cd15c956 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -289,7 +289,7 @@ static int vfio_platform_open(void *device_data) vfio_platform_regions_cleanup(vdev); err_reg: mutex_unlock(&driver_lock); - module_put(THIS_MODULE); + module_put(vdev->parent_module); return ret; } From 4b1aba653642e469d954afefafa842025e10f00a Mon Sep 17 00:00:00 2001 From: Julian Anastasov Date: Mon, 24 May 2021 22:54:57 +0300 Subject: [PATCH 014/269] ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service [ Upstream commit 56e4ee82e850026d71223262c07df7d6af3bd872 ] syzbot reported memory leak [1] when adding service with HASHED flag. We should ignore this flag both from sockopt and netlink provided data, otherwise the service is not hashed and not visible while releasing resources. [1] BUG: memory leak unreferenced object 0xffff888115227800 (size 512): comm "syz-executor263", pid 8658, jiffies 4294951882 (age 12.560s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [] kmalloc include/linux/slab.h:556 [inline] [] kzalloc include/linux/slab.h:686 [inline] [] ip_vs_add_service+0x598/0x7c0 net/netfilter/ipvs/ip_vs_ctl.c:1343 [] do_ip_vs_set_ctl+0x810/0xa40 net/netfilter/ipvs/ip_vs_ctl.c:2570 [] nf_setsockopt+0x68/0xa0 net/netfilter/nf_sockopt.c:101 [] ip_setsockopt+0x259/0x1ff0 net/ipv4/ip_sockglue.c:1435 [] raw_setsockopt+0x18c/0x1b0 net/ipv4/raw.c:857 [] __sys_setsockopt+0x1b0/0x360 net/socket.c:2117 [] __do_sys_setsockopt net/socket.c:2128 [inline] [] __se_sys_setsockopt net/socket.c:2125 [inline] [] __x64_sys_setsockopt+0x22/0x30 net/socket.c:2125 [] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 [] entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-and-tested-by: syzbot+e562383183e4b1766930@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Julian Anastasov Reviewed-by: Simon Horman Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/ipvs/ip_vs_ctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index d45dbcba8b49c..c25097092a060 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -1367,7 +1367,7 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u, ip_vs_addr_copy(svc->af, &svc->addr, &u->addr); svc->port = u->port; svc->fwmark = u->fwmark; - svc->flags = u->flags; + svc->flags = u->flags & ~IP_VS_SVC_F_HASHED; svc->timeout = u->timeout * HZ; svc->netmask = u->netmask; svc->ipvs = ipvs; From 39b92726a38092c704c4e1bcc8262a8959ae978e Mon Sep 17 00:00:00 2001 From: Tom Rix Date: Fri, 7 May 2021 12:18:19 -0700 Subject: [PATCH 015/269] HID: logitech-hidpp: initialize level variable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 81c8bf9170477d453b24a6bc3300d201d641e645 ] Static analysis reports this representative problem hid-logitech-hidpp.c:1356:23: warning: Assigned value is garbage or undefined hidpp->battery.level = level; ^ ~~~~~ In some cases, 'level' is never set in hidpp20_battery_map_status_voltage() Since level is not available on all hw, initialize level to unknown. Fixes: be281368f297 ("hid-logitech-hidpp: read battery voltage from newer devices") Signed-off-by: Tom Rix Reviewed-by: Filipe Laíns Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/hid-logitech-hidpp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index 74ebfb12c360e..66b1051620390 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -1259,6 +1259,7 @@ static int hidpp20_battery_map_status_voltage(u8 data[3], int *voltage, int status; long flags = (long) data[2]; + *level = POWER_SUPPLY_CAPACITY_LEVEL_UNKNOWN; if (flags & 0x80) switch (flags & 0x07) { From 744db828d6f9a0908f64d337642bb8ee227a7ea9 Mon Sep 17 00:00:00 2001 From: Zhen Lei Date: Sat, 8 May 2021 10:47:37 +0800 Subject: [PATCH 016/269] HID: pidff: fix error return code in hid_pidff_init() [ Upstream commit 3dd653c077efda8152f4dd395359617d577a54cd ] Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 224ee88fe395 ("Input: add force feedback driver for PID devices") Reported-by: Hulk Robot Signed-off-by: Zhen Lei Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/usbhid/hid-pidff.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/usbhid/hid-pidff.c b/drivers/hid/usbhid/hid-pidff.c index fddac7c72f645..07a9fe97d2e05 100644 --- a/drivers/hid/usbhid/hid-pidff.c +++ b/drivers/hid/usbhid/hid-pidff.c @@ -1292,6 +1292,7 @@ int hid_pidff_init(struct hid_device *hid) if (pidff->pool[PID_DEVICE_MANAGED_POOL].value && pidff->pool[PID_DEVICE_MANAGED_POOL].value[0] == 0) { + error = -EPERM; hid_notice(hid, "device does not support device managed pool\n"); goto fail; From 56c45ab00abab9481bb55151bf5719ec7c93f01a Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Fri, 14 May 2021 15:58:50 +0200 Subject: [PATCH 017/269] HID: i2c-hid: fix format string mismatch [ Upstream commit dc5f9f55502e13ba05731d5046a14620aa2ff456 ] clang doesn't like printing a 32-bit integer using %hX format string: drivers/hid/i2c-hid/i2c-hid-core.c:994:18: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [-Werror,-Wformat] client->name, hid->vendor, hid->product); ^~~~~~~~~~~ drivers/hid/i2c-hid/i2c-hid-core.c:994:31: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [-Werror,-Wformat] client->name, hid->vendor, hid->product); ^~~~~~~~~~~~ Use an explicit cast to truncate it to the low 16 bits instead. Fixes: 9ee3e06610fd ("HID: i2c-hid: override HID descriptors for certain devices") Signed-off-by: Arnd Bergmann Reviewed-by: Nathan Chancellor Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/i2c-hid/i2c-hid-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index cb7758d59014e..13bc59ed1d9e7 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -1131,8 +1131,8 @@ static int i2c_hid_probe(struct i2c_client *client, hid->vendor = le16_to_cpu(ihid->hdesc.wVendorID); hid->product = le16_to_cpu(ihid->hdesc.wProductID); - snprintf(hid->name, sizeof(hid->name), "%s %04hX:%04hX", - client->name, hid->vendor, hid->product); + snprintf(hid->name, sizeof(hid->name), "%s %04X:%04X", + client->name, (u16)hid->vendor, (u16)hid->product); strlcpy(hid->phys, dev_name(&client->dev), sizeof(hid->phys)); ihid->quirks = i2c_hid_lookup_quirk(hid->vendor, hid->product); From 4f00f9c169d9f6840613a44490d7800be8d73a61 Mon Sep 17 00:00:00 2001 From: Parav Pandit Date: Wed, 26 May 2021 23:00:27 +0300 Subject: [PATCH 018/269] devlink: Correct VIRTUAL port to not have phys_port attributes [ Upstream commit b28d8f0c25a9b0355116cace5f53ea52bd4020c8 ] Physical port name, port number attributes do not belong to virtual port flavour. When VF or SF virtual ports are registered they incorrectly append "np0" string in the netdevice name of the VF/SF. Before this fix, VF netdevice name were ens2f0np0v0, ens2f0np0v1 for VF 0 and 1 respectively. After the fix, they are ens2f0v0, ens2f0v1. With this fix, reading /sys/class/net/ens2f0v0/phys_port_name returns -EOPNOTSUPP. Also devlink port show example for 2 VFs on one PF to ensure that any physical port attributes are not exposed. $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false pci/0000:06:00.3/196608: type eth netdev ens2f0v0 flavour virtual splittable false pci/0000:06:00.4/262144: type eth netdev ens2f0v1 flavour virtual splittable false This change introduces a netdevice name change on systemd/udev version 245 and higher which honors phys_port_name sysfs file for generation of netdevice name. This also aligns to phys_port_name usage which is limited to switchdev ports as described in [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/Documentation/networking/switchdev.rst Fixes: acf1ee44ca5d ("devlink: Introduce devlink port flavour virtual") Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20210526200027.14008-1-parav@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/core/devlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/core/devlink.c b/net/core/devlink.c index 5d397838bceb6..90badb6f72271 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -693,7 +693,6 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, case DEVLINK_PORT_FLAVOUR_PHYSICAL: case DEVLINK_PORT_FLAVOUR_CPU: case DEVLINK_PORT_FLAVOUR_DSA: - case DEVLINK_PORT_FLAVOUR_VIRTUAL: if (nla_put_u32(msg, DEVLINK_ATTR_PORT_NUMBER, attrs->phys.port_number)) return -EMSGSIZE; @@ -8376,7 +8375,6 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, switch (attrs->flavour) { case DEVLINK_PORT_FLAVOUR_PHYSICAL: - case DEVLINK_PORT_FLAVOUR_VIRTUAL: if (!attrs->split) n = snprintf(name, len, "p%u", attrs->phys.port_number); else @@ -8413,6 +8411,8 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, n = snprintf(name, len, "pf%uvf%u", attrs->pci_vf.pf, attrs->pci_vf.vf); break; + case DEVLINK_PORT_FLAVOUR_VIRTUAL: + return -EOPNOTSUPP; } if (n >= len) From f07c548314776231f0d47d73ec6caa5b17e876e8 Mon Sep 17 00:00:00 2001 From: Paul Blakey Date: Wed, 26 May 2021 14:44:09 +0300 Subject: [PATCH 019/269] net/sched: act_ct: Offload connections with commit action [ Upstream commit 0cc254e5aa37cf05f65bcdcdc0ac5c58010feb33 ] Currently established connections are not offloaded if the filter has a "ct commit" action. This behavior will not offload connections of the following scenario: $ tc_filter add dev $DEV ingress protocol ip prio 1 flower \ ct_state -trk \ action ct commit action goto chain 1 $ tc_filter add dev $DEV ingress protocol ip chain 1 prio 1 flower \ action mirred egress redirect dev $DEV2 $ tc_filter add dev $DEV2 ingress protocol ip prio 1 flower \ action ct commit action goto chain 1 $ tc_filter add dev $DEV2 ingress protocol ip prio 1 chain 1 flower \ ct_state +trk+est \ action mirred egress redirect dev $DEV Offload established connections, regardless of the commit flag. Fixes: 46475bb20f4b ("net/sched: act_ct: Software offload of established flows") Reviewed-by: Oz Shlomo Reviewed-by: Jiri Pirko Acked-by: Marcelo Ricardo Leitner Signed-off-by: Paul Blakey Link: https://lore.kernel.org/r/1622029449-27060-1-git-send-email-paulb@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/sched/act_ct.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index aba3cd85f284f..6376b7b223256 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -979,7 +979,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, */ cached = tcf_ct_skb_nfct_cached(net, skb, p->zone, force); if (!cached) { - if (!commit && tcf_ct_flow_table_lookup(p, skb, family)) { + if (tcf_ct_flow_table_lookup(p, skb, family)) { skip_add = true; goto do_nat; } @@ -1019,10 +1019,11 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a, * even if the connection is already confirmed. */ nf_conntrack_confirm(skb); - } else if (!skip_add) { - tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo); } + if (!skip_add) + tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo); + out_push: skb_push_rcsum(skb, nh_ofs); From be0d8507268646a6ca524c0f40f29c501b3d78d9 Mon Sep 17 00:00:00 2001 From: Ariel Levkovich Date: Wed, 26 May 2021 20:01:10 +0300 Subject: [PATCH 020/269] net/sched: act_ct: Fix ct template allocation for zone 0 [ Upstream commit fb91702b743dec78d6507c53a2dec8a8883f509d ] Fix current behavior of skipping template allocation in case the ct action is in zone 0. Skipping the allocation may cause the datapath ct code to ignore the entire ct action with all its attributes (commit, nat) in case the ct action in zone 0 was preceded by a ct clear action. The ct clear action sets the ct_state to untracked and resets the skb->_nfct pointer. Under these conditions and without an allocated ct template, the skb->_nfct pointer will remain NULL which will cause the tc ct action handler to exit without handling commit and nat actions, if such exist. For example, the following rule in OVS dp: recirc_id(0x2),ct_state(+new-est-rel-rpl+trk),ct_label(0/0x1), \ in_port(eth0),actions:ct_clear,ct(commit,nat(src=10.11.0.12)), \ recirc(0x37a) Will result in act_ct skipping the commit and nat actions in zone 0. The change removes the skipping of template allocation for zone 0 and treats it the same as any other zone. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Ariel Levkovich Acked-by: Marcelo Ricardo Leitner Link: https://lore.kernel.org/r/20210526170110.54864-1-lariel@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/sched/act_ct.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 6376b7b223256..315a5b2f3add8 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -1199,9 +1199,6 @@ static int tcf_ct_fill_params(struct net *net, sizeof(p->zone)); } - if (p->zone == NF_CT_DEFAULT_ZONE_ID) - return 0; - nf_ct_zone_init(&zone, p->zone, NF_CT_DEFAULT_ZONE_DIR, 0); tmpl = nf_ct_tmpl_alloc(net, &zone, GFP_KERNEL); if (!tmpl) { From b198f77a3613a066cefa91f2cd9e0766612a19ce Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Thu, 27 May 2021 16:31:38 -0700 Subject: [PATCH 021/269] mptcp: always parse mptcp options for MPC reqsk [ Upstream commit 06f9a435b3aa12f4de6da91f11fdce8ce7b46205 ] In subflow_syn_recv_sock() we currently skip options parsing for OoO packet, given that such packets may not carry the relevant MPC option. If the peer generates an MPC+data TSO packet and some of the early segments are lost or get reorder, we server will ignore the peer key, causing transient, unexpected fallback to TCP. The solution is always parsing the incoming MPTCP options, and do the fallback only for in-order packets. This actually cleans the existing code a bit. Fixes: d22f4988ffec ("mptcp: process MP_CAPABLE data option") Reported-by: Matthieu Baerts Signed-off-by: Paolo Abeni Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- net/mptcp/subflow.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index bdd6af38a9ae3..96b6aca9d0ae7 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -527,21 +527,20 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, /* if the sk is MP_CAPABLE, we try to fetch the client key */ if (subflow_req->mp_capable) { - if (TCP_SKB_CB(skb)->seq != subflow_req->ssn_offset + 1) { - /* here we can receive and accept an in-window, - * out-of-order pkt, which will not carry the MP_CAPABLE - * opt even on mptcp enabled paths - */ - goto create_msk; - } - + /* we can receive and accept an in-window, out-of-order pkt, + * which may not carry the MP_CAPABLE opt even on mptcp enabled + * paths: always try to extract the peer key, and fallback + * for packets missing it. + * Even OoO DSS packets coming legitly after dropped or + * reordered MPC will cause fallback, but we don't have other + * options. + */ mptcp_get_options(skb, &mp_opt); if (!mp_opt.mp_capable) { fallback = true; goto create_child; } -create_msk: new_msk = mptcp_sk_clone(listener->conn, &mp_opt, req); if (!new_msk) fallback = true; From df7c913f90c3dcda988a254141bf01eb3bb6f123 Mon Sep 17 00:00:00 2001 From: Sagi Grimberg Date: Thu, 27 May 2021 18:16:38 -0700 Subject: [PATCH 022/269] nvme-rdma: fix in-casule data send for chained sgls [ Upstream commit 12b2aaadb6d5ef77434e8db21f469f46fe2d392e ] We have only 2 inline sg entries and we allow 4 sg entries for the send wr sge. Larger sgls entries will be chained. However when we build in-capsule send wr sge, we iterate without taking into account that the sgl may be chained and still fit in-capsule (which can happen if the sgl is bigger than 2, but lower-equal to 4). Fix in-capsule data mapping to correctly iterate chained sgls. Fixes: 38e1800275d3 ("nvme-rdma: Avoid preallocating big SGL for data") Reported-by: Walker, Benjamin Signed-off-by: Sagi Grimberg Reviewed-by: Max Gurtovoy Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/host/rdma.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 8b326508a480e..e6d58402b829d 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1327,16 +1327,17 @@ static int nvme_rdma_map_sg_inline(struct nvme_rdma_queue *queue, int count) { struct nvme_sgl_desc *sg = &c->common.dptr.sgl; - struct scatterlist *sgl = req->data_sgl.sg_table.sgl; struct ib_sge *sge = &req->sge[1]; + struct scatterlist *sgl; u32 len = 0; int i; - for (i = 0; i < count; i++, sgl++, sge++) { + for_each_sg(req->data_sgl.sg_table.sgl, sgl, count, i) { sge->addr = sg_dma_address(sgl); sge->length = sg_dma_len(sgl); sge->lkey = queue->device->pd->local_dma_lkey; len += sge->length; + sge++; } sg->addr = cpu_to_le64(queue->ctrl->ctrl.icdoff); From 100c872c75112da26630309f4020991ffab2a11d Mon Sep 17 00:00:00 2001 From: Erik Kaneda Date: Fri, 21 May 2021 15:28:08 -0700 Subject: [PATCH 023/269] ACPICA: Clean up context mutex during object deletion [ Upstream commit e4dfe108371214500ee10c2cf19268f53acaa803 ] ACPICA commit bc43c878fd4ff27ba75b1d111b97ee90d4a82707 Fixes: c27f3d011b08 ("Fix race in GenericSerialBus (I2C) and GPIO OpRegion parameter handling") Link: https://github.com/acpica/acpica/commit/bc43c878 Reported-by: John Garry Reported-by: Xiang Chen Tested-by: Xiang Chen Signed-off-by: Erik Kaneda Signed-off-by: Bob Moore Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin --- drivers/acpi/acpica/utdelete.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/acpi/acpica/utdelete.c b/drivers/acpi/acpica/utdelete.c index 4c0d4e4341961..72d2c0b656339 100644 --- a/drivers/acpi/acpica/utdelete.c +++ b/drivers/acpi/acpica/utdelete.c @@ -285,6 +285,14 @@ static void acpi_ut_delete_internal_obj(union acpi_operand_object *object) } break; + case ACPI_TYPE_LOCAL_ADDRESS_HANDLER: + + ACPI_DEBUG_PRINT((ACPI_DB_ALLOCATIONS, + "***** Address handler %p\n", object)); + + acpi_os_delete_mutex(object->address_space.context_mutex); + break; + default: break; From 091283e3d5eb9f424b85e71804fc26092c3c4915 Mon Sep 17 00:00:00 2001 From: Li Huafei Date: Tue, 1 Jun 2021 17:27:50 +0800 Subject: [PATCH 024/269] perf probe: Fix NULL pointer dereference in convert_variable_location() [ Upstream commit 3cb17cce1e76ccc5499915a4d7e095a1ad6bf7ff ] If we just check whether the variable can be converted, 'tvar' should be a null pointer. However, the null pointer check is missing in the 'Constant value' execution path. The following cases can trigger this problem: $ cat test.c #include void main(void) { int a; const int b = 1; asm volatile("mov %1, %0" : "=r"(a): "i"(b)); printf("a: %d\n", a); } $ gcc test.c -o test -O -g $ sudo ./perf probe -x ./test -L "main" 0 void main(void) { 2 int a; const int b = 1; asm volatile("mov %1, %0" : "=r"(a): "i"(b)); 6 printf("a: %d\n", a); } $ sudo ./perf probe -x ./test -V "main:6" Segmentation fault The check on 'tvar' is added. If 'tavr' is a null pointer, we return 0 to indicate that the variable can be converted. Now, we can successfully show the variables that can be accessed. $ sudo ./perf probe -x ./test -V "main:6" Available variables at main:6 @ char* __fmt int a int b However, the variable 'b' cannot be tracked. $ sudo ./perf probe -x ./test -D "main:6 b" Failed to find the location of the 'b' variable at this address. Perhaps it has been optimized out. Use -V with the --range option to show 'b' location range. Error: Failed to add events. This is because __die_find_variable_cb() did not successfully match variable 'b', which has the DW_AT_const_value attribute instead of DW_AT_location. We added support for DW_AT_const_value in __die_find_variable_cb(). With this modification, we can successfully track the variable 'b'. $ sudo ./perf probe -x ./test -D "main:6 b" p:probe_test/main_L6 /home/lhf/test:0x1156 b=\1:s32 Fixes: 66f69b219716 ("perf probe: Support DW_AT_const_value constant value") Signed-off-by: Li Huafei Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Frank Ch. Eigler Cc: Jianlin Lv Cc: Jiri Olsa Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Yang Jihong Cc: Zhang Jinhao http://lore.kernel.org/lkml/20210601092750.169601-1-lihuafei1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin --- tools/perf/util/dwarf-aux.c | 8 ++++++-- tools/perf/util/probe-finder.c | 3 +++ 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c index 7b2d471a6419d..4343356f3cf9a 100644 --- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -975,9 +975,13 @@ static int __die_find_variable_cb(Dwarf_Die *die_mem, void *data) if ((tag == DW_TAG_formal_parameter || tag == DW_TAG_variable) && die_compare_name(die_mem, fvp->name) && - /* Does the DIE have location information or external instance? */ + /* + * Does the DIE have location information or const value + * or external instance? + */ (dwarf_attr(die_mem, DW_AT_external, &attr) || - dwarf_attr(die_mem, DW_AT_location, &attr))) + dwarf_attr(die_mem, DW_AT_location, &attr) || + dwarf_attr(die_mem, DW_AT_const_value, &attr))) return DIE_FIND_CB_END; if (dwarf_haspc(die_mem, fvp->addr)) return DIE_FIND_CB_CONTINUE; diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c index 76dd349aa48d8..fdafbfcef6871 100644 --- a/tools/perf/util/probe-finder.c +++ b/tools/perf/util/probe-finder.c @@ -190,6 +190,9 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr, immediate_value_is_supported()) { Dwarf_Sword snum; + if (!tvar) + return 0; + dwarf_formsdata(&attr, &snum); ret = asprintf(&tvar->value, "\\%ld", (long)snum); From 09fdb6747b7ed3bc4da720301de52ac7b159214a Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 31 May 2021 13:20:45 +0300 Subject: [PATCH 025/269] net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs [ Upstream commit 4ef8d857b5f494e62bce9085031563fda35f9563 ] When using sub-VLANs in the range of 1-7, the resulting value from: rx_vid = dsa_8021q_rx_vid_subvlan(ds, port, subvlan); is wrong according to the description from tag_8021q.c: | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | +-----------+-----+-----------------+-----------+-----------------------+ | DIR | SVL | SWITCH_ID | SUBVLAN | PORT | +-----------+-----+-----------------+-----------+-----------------------+ For example, when ds->index == 0, port == 3 and subvlan == 1, dsa_8021q_rx_vid_subvlan() returns 1027, same as it returns for subvlan == 0, but it should have returned 1043. This is because the low portion of the subvlan bits are not masked properly when writing into the 12-bit VLAN value. They are masked into bits 4:3, but they should be masked into bits 5:4. Fixes: 3eaae1d05f2b ("net: dsa: tag_8021q: support up to 8 VLANs per port using sub-VLANs") Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/dsa/tag_8021q.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/dsa/tag_8021q.c b/net/dsa/tag_8021q.c index 8e3e8a5b85593..a00b513c22a1d 100644 --- a/net/dsa/tag_8021q.c +++ b/net/dsa/tag_8021q.c @@ -64,7 +64,7 @@ #define DSA_8021Q_SUBVLAN_HI_SHIFT 9 #define DSA_8021Q_SUBVLAN_HI_MASK GENMASK(9, 9) #define DSA_8021Q_SUBVLAN_LO_SHIFT 4 -#define DSA_8021Q_SUBVLAN_LO_MASK GENMASK(4, 3) +#define DSA_8021Q_SUBVLAN_LO_MASK GENMASK(5, 4) #define DSA_8021Q_SUBVLAN_HI(x) (((x) & GENMASK(2, 2)) >> 2) #define DSA_8021Q_SUBVLAN_LO(x) ((x) & GENMASK(1, 0)) #define DSA_8021Q_SUBVLAN(x) \ From a5de17bb916a7f5b2e5b35a7c961ebee6d95bb28 Mon Sep 17 00:00:00 2001 From: Alexander Aring Date: Mon, 31 May 2021 17:00:30 -0400 Subject: [PATCH 026/269] net: sock: fix in-kernel mark setting [ Upstream commit dd9082f4a9f94280fbbece641bf8fc0a25f71f7a ] This patch fixes the in-kernel mark setting by doing an additional sk_dst_reset() which was introduced by commit 50254256f382 ("sock: Reset dst when changing sk_mark via setsockopt"). The code is now shared to avoid any further suprises when changing the socket mark value. Fixes: 84d1c617402e ("net: sock: add sock_set_mark") Reported-by: Marcelo Ricardo Leitner Signed-off-by: Alexander Aring Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/core/sock.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index dee29f41beaf8..7de51ea15cdfc 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -807,10 +807,18 @@ void sock_set_rcvbuf(struct sock *sk, int val) } EXPORT_SYMBOL(sock_set_rcvbuf); +static void __sock_set_mark(struct sock *sk, u32 val) +{ + if (val != sk->sk_mark) { + sk->sk_mark = val; + sk_dst_reset(sk); + } +} + void sock_set_mark(struct sock *sk, u32 val) { lock_sock(sk); - sk->sk_mark = val; + __sock_set_mark(sk, val); release_sock(sk); } EXPORT_SYMBOL(sock_set_mark); @@ -1118,10 +1126,10 @@ int sock_setsockopt(struct socket *sock, int level, int optname, case SO_MARK: if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { ret = -EPERM; - } else if (val != sk->sk_mark) { - sk->sk_mark = val; - sk_dst_reset(sk); + break; } + + __sock_set_mark(sk, val); break; case SO_RXQ_OVFL: From 874ece252ed269f5ac1f55167a3f2735ab0f249f Mon Sep 17 00:00:00 2001 From: Maxim Mikityanskiy Date: Tue, 1 Jun 2021 15:07:59 +0300 Subject: [PATCH 027/269] net/tls: Replace TLS_RX_SYNC_RUNNING with RCU [ Upstream commit 05fc8b6cbd4f979a6f25759c4a17dd5f657f7ecd ] RCU synchronization is guaranteed to finish in finite time, unlike a busy loop that polls a flag. This patch is a preparation for the bugfix in the next patch, where the same synchronize_net() call will also be used to sync with the TX datapath. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/net/tls.h | 1 - net/tls/tls_device.c | 10 +++------- 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/include/net/tls.h b/include/net/tls.h index 2bdd802212fe0..d32a067055877 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -193,7 +193,6 @@ struct tls_offload_context_tx { (sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX) enum tls_context_flags { - TLS_RX_SYNC_RUNNING = 0, /* Unlike RX where resync is driven entirely by the core in TX only * the driver knows when things went out of sync, so we need the flag * to be atomic. diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index a3ab2d3d4e4ea..abc04045577d8 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -680,15 +680,13 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx, struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx); struct net_device *netdev; - if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags))) - return; - trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type); + rcu_read_lock(); netdev = READ_ONCE(tls_ctx->netdev); if (netdev) netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn, TLS_OFFLOAD_CTX_DIR_RX); - clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags); + rcu_read_unlock(); TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICERESYNC); } @@ -1298,9 +1296,7 @@ static int tls_device_down(struct net_device *netdev) netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_RX); WRITE_ONCE(ctx->netdev, NULL); - smp_mb__before_atomic(); /* pairs with test_and_set_bit() */ - while (test_bit(TLS_RX_SYNC_RUNNING, &ctx->flags)) - usleep_range(10, 200); + synchronize_net(); dev_put(netdev); list_del_init(&ctx->list); From f1d4184f128dede82a59a841658ed40d4e6d3aa2 Mon Sep 17 00:00:00 2001 From: Maxim Mikityanskiy Date: Tue, 1 Jun 2021 15:08:00 +0300 Subject: [PATCH 028/269] net/tls: Fix use-after-free after the TLS device goes down and up [ Upstream commit c55dcdd435aa6c6ad6ccac0a4c636d010ee367a4 ] When a netdev with active TLS offload goes down, tls_device_down is called to stop the offload and tear down the TLS context. However, the socket stays alive, and it still points to the TLS context, which is now deallocated. If a netdev goes up, while the connection is still active, and the data flow resumes after a number of TCP retransmissions, it will lead to a use-after-free of the TLS context. This commit addresses this bug by keeping the context alive until its normal destruction, and implements the necessary fallbacks, so that the connection can resume in software (non-offloaded) kTLS mode. On the TX side tls_sw_fallback is used to encrypt all packets. The RX side already has all the necessary fallbacks, because receiving non-decrypted packets is supported. The thing needed on the RX side is to block resync requests, which are normally produced after receiving non-decrypted packets. The necessary synchronization is implemented for a graceful teardown: first the fallbacks are deployed, then the driver resources are released (it used to be possible to have a tls_dev_resync after tls_dev_del). A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback mode. It's used to skip the RX resync logic completely, as it becomes useless, and some objects may be released (for example, resync_async, which is allocated and freed by the driver). Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/net/tls.h | 9 ++++++ net/tls/tls_device.c | 52 +++++++++++++++++++++++++++++++---- net/tls/tls_device_fallback.c | 7 +++++ net/tls/tls_main.c | 1 + 4 files changed, 64 insertions(+), 5 deletions(-) diff --git a/include/net/tls.h b/include/net/tls.h index d32a067055877..43891b28fc482 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -193,6 +193,11 @@ struct tls_offload_context_tx { (sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX) enum tls_context_flags { + /* tls_device_down was called after the netdev went down, device state + * was released, and kTLS works in software, even though rx_conf is + * still TLS_HW (needed for transition). + */ + TLS_RX_DEV_DEGRADED = 0, /* Unlike RX where resync is driven entirely by the core in TX only * the driver knows when things went out of sync, so we need the flag * to be atomic. @@ -264,6 +269,7 @@ struct tls_context { /* cache cold stuff */ struct proto *sk_proto; + struct sock *sk; void (*sk_destruct)(struct sock *sk); @@ -446,6 +452,9 @@ static inline u16 tls_user_config(struct tls_context *ctx, bool tx) struct sk_buff * tls_validate_xmit_skb(struct sock *sk, struct net_device *dev, struct sk_buff *skb); +struct sk_buff * +tls_validate_xmit_skb_sw(struct sock *sk, struct net_device *dev, + struct sk_buff *skb); static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk) { diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index abc04045577d8..f718c7346088f 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -50,6 +50,7 @@ static void tls_device_gc_task(struct work_struct *work); static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task); static LIST_HEAD(tls_device_gc_list); static LIST_HEAD(tls_device_list); +static LIST_HEAD(tls_device_down_list); static DEFINE_SPINLOCK(tls_device_lock); static void tls_device_free_ctx(struct tls_context *ctx) @@ -759,6 +760,8 @@ void tls_device_rx_resync_new_rec(struct sock *sk, u32 rcd_len, u32 seq) if (tls_ctx->rx_conf != TLS_HW) return; + if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags))) + return; prot = &tls_ctx->prot_info; rx_ctx = tls_offload_ctx_rx(tls_ctx); @@ -961,6 +964,17 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx, ctx->sw.decrypted |= is_decrypted; + if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags))) { + if (likely(is_encrypted || is_decrypted)) + return 0; + + /* After tls_device_down disables the offload, the next SKB will + * likely have initial fragments decrypted, and final ones not + * decrypted. We need to reencrypt that single SKB. + */ + return tls_device_reencrypt(sk, skb); + } + /* Return immediately if the record is either entirely plaintext or * entirely ciphertext. Otherwise handle reencrypt partially decrypted * record. @@ -1288,6 +1302,26 @@ static int tls_device_down(struct net_device *netdev) spin_unlock_irqrestore(&tls_device_lock, flags); list_for_each_entry_safe(ctx, tmp, &list, list) { + /* Stop offloaded TX and switch to the fallback. + * tls_is_sk_tx_device_offloaded will return false. + */ + WRITE_ONCE(ctx->sk->sk_validate_xmit_skb, tls_validate_xmit_skb_sw); + + /* Stop the RX and TX resync. + * tls_dev_resync must not be called after tls_dev_del. + */ + WRITE_ONCE(ctx->netdev, NULL); + + /* Start skipping the RX resync logic completely. */ + set_bit(TLS_RX_DEV_DEGRADED, &ctx->flags); + + /* Sync with inflight packets. After this point: + * TX: no non-encrypted packets will be passed to the driver. + * RX: resync requests from the driver will be ignored. + */ + synchronize_net(); + + /* Release the offload context on the driver side. */ if (ctx->tx_conf == TLS_HW) netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_TX); @@ -1295,13 +1329,21 @@ static int tls_device_down(struct net_device *netdev) !test_bit(TLS_RX_DEV_CLOSED, &ctx->flags)) netdev->tlsdev_ops->tls_dev_del(netdev, ctx, TLS_OFFLOAD_CTX_DIR_RX); - WRITE_ONCE(ctx->netdev, NULL); - synchronize_net(); + dev_put(netdev); - list_del_init(&ctx->list); - if (refcount_dec_and_test(&ctx->refcount)) - tls_device_free_ctx(ctx); + /* Move the context to a separate list for two reasons: + * 1. When the context is deallocated, list_del is called. + * 2. It's no longer an offloaded context, so we don't want to + * run offload-specific code on this context. + */ + spin_lock_irqsave(&tls_device_lock, flags); + list_move_tail(&ctx->list, &tls_device_down_list); + spin_unlock_irqrestore(&tls_device_lock, flags); + + /* Device contexts for RX and TX will be freed in on sk_destruct + * by tls_device_free_ctx. rx_conf and tx_conf stay in TLS_HW. + */ } up_write(&device_offload_lock); diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c index 28895333701e4..0d40016bf69e0 100644 --- a/net/tls/tls_device_fallback.c +++ b/net/tls/tls_device_fallback.c @@ -430,6 +430,13 @@ struct sk_buff *tls_validate_xmit_skb(struct sock *sk, } EXPORT_SYMBOL_GPL(tls_validate_xmit_skb); +struct sk_buff *tls_validate_xmit_skb_sw(struct sock *sk, + struct net_device *dev, + struct sk_buff *skb) +{ + return tls_sw_fallback(sk, skb); +} + struct sk_buff *tls_encrypt_skb(struct sk_buff *skb) { return tls_sw_fallback(skb->sk, skb); diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 8d93cea99f2cb..32a51b20509c9 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -633,6 +633,7 @@ struct tls_context *tls_ctx_create(struct sock *sk) mutex_init(&ctx->tx_lock); rcu_assign_pointer(icsk->icsk_ulp_data, ctx); ctx->sk_proto = READ_ONCE(sk->sk_prot); + ctx->sk = sk; return ctx; } From c1ea8c0e71ead1efaaba33e241c1e7d35e9cbf51 Mon Sep 17 00:00:00 2001 From: Aya Levin Date: Tue, 25 May 2021 15:35:25 +0300 Subject: [PATCH 029/269] net/mlx5e: Fix incompatible casting [ Upstream commit d8ec92005f806dfa7524e9171eca707c0bb1267e ] Device supports setting of a single fec mode at a time, enforce this by bitmap_weight == 1. Input from fec command is in u32, avoid cast to unsigned long and use bitmap_from_arr32 to populate bitmap safely. Fixes: 4bd9d5070b92 ("net/mlx5e: Enforce setting of a single FEC mode") Signed-off-by: Aya Levin Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 986f0d86e94dc..bc7c1962f9e66 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1618,12 +1618,13 @@ static int mlx5e_set_fecparam(struct net_device *netdev, { struct mlx5e_priv *priv = netdev_priv(netdev); struct mlx5_core_dev *mdev = priv->mdev; + unsigned long fec_bitmap; u16 fec_policy = 0; int mode; int err; - if (bitmap_weight((unsigned long *)&fecparam->fec, - ETHTOOL_FEC_LLRS_BIT + 1) > 1) + bitmap_from_arr32(&fec_bitmap, &fecparam->fec, sizeof(fecparam->fec) * BITS_PER_BYTE); + if (bitmap_weight(&fec_bitmap, ETHTOOL_FEC_LLRS_BIT + 1) > 1) return -EOPNOTSUPP; for (mode = 0; mode < ARRAY_SIZE(pplm_fec_2_ethtool); mode++) { From 730700337593b41551e17427bc33edcbd95d3f05 Mon Sep 17 00:00:00 2001 From: Moshe Shemesh Date: Thu, 8 Apr 2021 07:30:57 +0300 Subject: [PATCH 030/269] net/mlx5: Check firmware sync reset requested is set before trying to abort it [ Upstream commit 5940e64281c09976ce2b560244217e610bf9d029 ] In case driver sent NACK to firmware on sync reset request, it will get sync reset abort event while it didn't set sync reset requested mode. Thus, on abort sync reset event handler, driver should check reset requested is set before trying to stop sync reset poll. Fixes: 7dd6df329d4c ("net/mlx5: Handle sync reset abort event") Signed-off-by: Moshe Shemesh Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c index f9042e147c7f6..ee710ce007950 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c @@ -354,6 +354,9 @@ static void mlx5_sync_reset_abort_event(struct work_struct *work) reset_abort_work); struct mlx5_core_dev *dev = fw_reset->dev; + if (!test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags)) + return; + mlx5_sync_reset_clear_reset_requested(dev, true); mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n"); } From c8972cf28ea11043280135859903ad69b03e0851 Mon Sep 17 00:00:00 2001 From: Roi Dayan Date: Tue, 13 Apr 2021 14:35:22 +0300 Subject: [PATCH 031/269] net/mlx5e: Check for needed capability for cvlan matching [ Upstream commit afe93f71b5d3cdae7209213ec8ef25210b837b93 ] If not supported show an error and return instead of trying to offload to the hardware and fail. Fixes: 699e96ddf47f ("net/mlx5e: Support offloading tc double vlan headers match") Reported-by: Pablo Neira Ayuso Signed-off-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 1bdeb948f56d7..80abdb0b47d7e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2253,11 +2253,13 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, misc_parameters); struct flow_rule *rule = flow_cls_offload_flow_rule(f); struct flow_dissector *dissector = rule->match.dissector; + enum fs_flow_table_type fs_type; u16 addr_type = 0; u8 ip_proto = 0; u8 *match_level; int err; + fs_type = mlx5e_is_eswitch_flow(flow) ? FS_FT_FDB : FS_FT_NIC_RX; match_level = outer_match_level; if (dissector->used_keys & @@ -2382,6 +2384,13 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, if (match.mask->vlan_id || match.mask->vlan_priority || match.mask->vlan_tpid) { + if (!MLX5_CAP_FLOWTABLE_TYPE(priv->mdev, ft_field_support.outer_second_vid, + fs_type)) { + NL_SET_ERR_MSG_MOD(extack, + "Matching on CVLAN is not supported"); + return -EOPNOTSUPP; + } + if (match.key->vlan_tpid == htons(ETH_P_8021AD)) { MLX5_SET(fte_match_set_misc, misc_c, outer_second_svlan_tag, 1); From 2a8cda3867cd06fbc3f414a78e1c692f973d21e4 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Wed, 9 Dec 2020 16:40:38 +0200 Subject: [PATCH 032/269] net/mlx5: DR, Create multi-destination flow table with level less than 64 [ Upstream commit 216214c64a8c1cb9078c2c0aec7bb4a2f8e75397 ] Flow table that contains flow pointing to multiple flow tables or multiple TIRs must have a level lower than 64. In our case it applies to muli- destination flow table. Fix the level of the created table to comply with HW Spec definitions, and still make sure that its level lower than SW-owned tables, so that it would be possible to point from the multi-destination FW table to SW tables. Fixes: 34583beea4b7 ("net/mlx5: DR, Create multi-destination table for SW-steering use") Signed-off-by: Yevgeny Kliteynik Reviewed-by: Alex Vesker Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c | 3 ++- include/linux/mlx5/mlx5_ifc.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c index 1fbcd012bb855..7ccfd40586cee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c @@ -112,7 +112,8 @@ int mlx5dr_fw_create_md_tbl(struct mlx5dr_domain *dmn, int ret; ft_attr.table_type = MLX5_FLOW_TABLE_TYPE_FDB; - ft_attr.level = dmn->info.caps.max_ft_level - 2; + ft_attr.level = min_t(int, dmn->info.caps.max_ft_level - 2, + MLX5_FT_MAX_MULTIPATH_LEVEL); ft_attr.reformat_en = reformat_req; ft_attr.decap_en = reformat_req; diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index cc9ee07769745..af8f4e2cf21d1 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1223,6 +1223,8 @@ enum mlx5_fc_bulk_alloc_bitmask { #define MLX5_FC_BULK_NUM_FCS(fc_enum) (MLX5_FC_BULK_SIZE_FACTOR * (fc_enum)) +#define MLX5_FT_MAX_MULTIPATH_LEVEL 63 + enum { MLX5_STEERING_FORMAT_CONNECTX_5 = 0, MLX5_STEERING_FORMAT_CONNECTX_6DX = 1, From c440cd080761b18a52cac20f2a42e5da1e3995af Mon Sep 17 00:00:00 2001 From: Max Gurtovoy Date: Tue, 1 Jun 2021 19:22:05 +0300 Subject: [PATCH 033/269] nvmet: fix freeing unallocated p2pmem [ Upstream commit bcd9a0797d73eeff659582f23277e7ab6e5f18f3 ] In case p2p device was found but the p2p pool is empty, the nvme target is still trying to free the sgl from the p2p pool instead of the regular sgl pool and causing a crash (BUG() is called). Instead, assign the p2p_dev for the request only if it was allocated from p2p pool. This is the crash that was caused: [Sun May 30 19:13:53 2021] ------------[ cut here ]------------ [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! [Sun May 30 19:13:53 2021] invalid opcode: 0000 [#1] SMP PTI ... [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! ... [Sun May 30 19:13:53 2021] RIP: 0010:gen_pool_free_owner+0xa8/0xb0 ... [Sun May 30 19:13:53 2021] Call Trace: [Sun May 30 19:13:53 2021] ------------[ cut here ]------------ [Sun May 30 19:13:53 2021] pci_free_p2pmem+0x2b/0x70 [Sun May 30 19:13:53 2021] pci_p2pmem_free_sgl+0x4f/0x80 [Sun May 30 19:13:53 2021] nvmet_req_free_sgls+0x1e/0x80 [nvmet] [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! [Sun May 30 19:13:53 2021] nvmet_rdma_release_rsp+0x4e/0x1f0 [nvmet_rdma] [Sun May 30 19:13:53 2021] nvmet_rdma_send_done+0x1c/0x60 [nvmet_rdma] Fixes: c6e3f1339812 ("nvmet: add metadata support for block devices") Reviewed-by: Israel Rukshin Signed-off-by: Max Gurtovoy Reviewed-by: Logan Gunthorpe Reviewed-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/target/core.c | 33 ++++++++++++++++----------------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 46e4f7ea34c8b..8b939e9db470c 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -988,19 +988,23 @@ static unsigned int nvmet_data_transfer_len(struct nvmet_req *req) return req->transfer_len - req->metadata_len; } -static int nvmet_req_alloc_p2pmem_sgls(struct nvmet_req *req) +static int nvmet_req_alloc_p2pmem_sgls(struct pci_dev *p2p_dev, + struct nvmet_req *req) { - req->sg = pci_p2pmem_alloc_sgl(req->p2p_dev, &req->sg_cnt, + req->sg = pci_p2pmem_alloc_sgl(p2p_dev, &req->sg_cnt, nvmet_data_transfer_len(req)); if (!req->sg) goto out_err; if (req->metadata_len) { - req->metadata_sg = pci_p2pmem_alloc_sgl(req->p2p_dev, + req->metadata_sg = pci_p2pmem_alloc_sgl(p2p_dev, &req->metadata_sg_cnt, req->metadata_len); if (!req->metadata_sg) goto out_free_sg; } + + req->p2p_dev = p2p_dev; + return 0; out_free_sg: pci_p2pmem_free_sgl(req->p2p_dev, req->sg); @@ -1008,25 +1012,19 @@ static int nvmet_req_alloc_p2pmem_sgls(struct nvmet_req *req) return -ENOMEM; } -static bool nvmet_req_find_p2p_dev(struct nvmet_req *req) +static struct pci_dev *nvmet_req_find_p2p_dev(struct nvmet_req *req) { - if (!IS_ENABLED(CONFIG_PCI_P2PDMA)) - return false; - - if (req->sq->ctrl && req->sq->qid && req->ns) { - req->p2p_dev = radix_tree_lookup(&req->sq->ctrl->p2p_ns_map, - req->ns->nsid); - if (req->p2p_dev) - return true; - } - - req->p2p_dev = NULL; - return false; + if (!IS_ENABLED(CONFIG_PCI_P2PDMA) || + !req->sq->ctrl || !req->sq->qid || !req->ns) + return NULL; + return radix_tree_lookup(&req->sq->ctrl->p2p_ns_map, req->ns->nsid); } int nvmet_req_alloc_sgls(struct nvmet_req *req) { - if (nvmet_req_find_p2p_dev(req) && !nvmet_req_alloc_p2pmem_sgls(req)) + struct pci_dev *p2p_dev = nvmet_req_find_p2p_dev(req); + + if (p2p_dev && !nvmet_req_alloc_p2pmem_sgls(p2p_dev, req)) return 0; req->sg = sgl_alloc(nvmet_data_transfer_len(req), GFP_KERNEL, @@ -1055,6 +1053,7 @@ void nvmet_req_free_sgls(struct nvmet_req *req) pci_p2pmem_free_sgl(req->p2p_dev, req->sg); if (req->metadata_sg) pci_p2pmem_free_sgl(req->p2p_dev, req->metadata_sg); + req->p2p_dev = NULL; } else { sgl_free(req->sg); if (req->metadata_sg) From 5f3429c05e4028a0e241afdad856dd15dec2ffb9 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Thu, 27 May 2021 21:54:42 +0200 Subject: [PATCH 034/269] netfilter: nft_ct: skip expectations for confirmed conntrack [ Upstream commit 1710eb913bdcda3917f44d383c32de6bdabfc836 ] nft_ct_expect_obj_eval() calls nf_ct_ext_add() for a confirmed conntrack entry. However, nf_ct_ext_add() can only be called for !nf_ct_is_confirmed(). [ 1825.349056] WARNING: CPU: 0 PID: 1279 at net/netfilter/nf_conntrack_extend.c:48 nf_ct_xt_add+0x18e/0x1a0 [nf_conntrack] [ 1825.351391] RIP: 0010:nf_ct_ext_add+0x18e/0x1a0 [nf_conntrack] [ 1825.351493] Code: 41 5c 41 5d 41 5e 41 5f c3 41 bc 0a 00 00 00 e9 15 ff ff ff ba 09 00 00 00 31 f6 4c 89 ff e8 69 6c 3d e9 eb 96 45 31 ed eb cd <0f> 0b e9 b1 fe ff ff e8 86 79 14 e9 eb bf 0f 1f 40 00 0f 1f 44 00 [ 1825.351721] RSP: 0018:ffffc90002e1f1e8 EFLAGS: 00010202 [ 1825.351790] RAX: 000000000000000e RBX: ffff88814f5783c0 RCX: ffffffffc0e4f887 [ 1825.351881] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88814f578440 [ 1825.351971] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88814f578447 [ 1825.352060] R10: ffffed1029eaf088 R11: 0000000000000001 R12: ffff88814f578440 [ 1825.352150] R13: ffff8882053f3a00 R14: 0000000000000000 R15: 0000000000000a20 [ 1825.352240] FS: 00007f992261c900(0000) GS:ffff889faec00000(0000) knlGS:0000000000000000 [ 1825.352343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1825.352417] CR2: 000056070a4d1158 CR3: 000000015efe0000 CR4: 0000000000350ee0 [ 1825.352508] Call Trace: [ 1825.352544] nf_ct_helper_ext_add+0x10/0x60 [nf_conntrack] [ 1825.352641] nft_ct_expect_obj_eval+0x1b8/0x1e0 [nft_ct] [ 1825.352716] nft_do_chain+0x232/0x850 [nf_tables] Add the ct helper extension only for unconfirmed conntrack. Skip rule evaluation if the ct helper extension does not exist. Thus, you can only create expectations from the first packet. It should be possible to remove this limitation by adding a new action to attach a generic ct helper to the first packet. Then, use this ct helper extension from follow up packets to create the ct expectation. While at it, add a missing check to skip the template conntrack too and remove check for IPCT_UNTRACK which is implicit to !ct. Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/nft_ct.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c index a1b0aac46e9e0..70d46e0bbf064 100644 --- a/net/netfilter/nft_ct.c +++ b/net/netfilter/nft_ct.c @@ -1218,7 +1218,7 @@ static void nft_ct_expect_obj_eval(struct nft_object *obj, struct nf_conn *ct; ct = nf_ct_get(pkt->skb, &ctinfo); - if (!ct || ctinfo == IP_CT_UNTRACKED) { + if (!ct || nf_ct_is_confirmed(ct) || nf_ct_is_template(ct)) { regs->verdict.code = NFT_BREAK; return; } From 8d614eebc003bb7763993e6fcdc8f853401bc17e Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Fri, 28 May 2021 13:45:16 +0200 Subject: [PATCH 035/269] netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches [ Upstream commit 8971ee8b087750a23f3cd4dc55bff2d0303fd267 ] The private helper data size cannot be updated. However, updates that contain NFCTH_PRIV_DATA_LEN might bogusly hit EBUSY even if the size is the same. Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/nfnetlink_cthelper.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c index 5b0d0a77379c6..91afbf8ac8cf0 100644 --- a/net/netfilter/nfnetlink_cthelper.c +++ b/net/netfilter/nfnetlink_cthelper.c @@ -380,10 +380,14 @@ static int nfnl_cthelper_update(const struct nlattr * const tb[], struct nf_conntrack_helper *helper) { + u32 size; int ret; - if (tb[NFCTH_PRIV_DATA_LEN]) - return -EBUSY; + if (tb[NFCTH_PRIV_DATA_LEN]) { + size = ntohl(nla_get_be32(tb[NFCTH_PRIV_DATA_LEN])); + if (size != helper->data_len) + return -EBUSY; + } if (tb[NFCTH_POLICY]) { ret = nfnl_cthelper_update_policy(helper, tb[NFCTH_POLICY]); From 4cf297ef595ce98cb4e9d80ae3e00bb5af0a8de0 Mon Sep 17 00:00:00 2001 From: Zhihao Cheng Date: Tue, 1 Jun 2021 09:19:35 +0000 Subject: [PATCH 036/269] drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() [ Upstream commit 10c1f0cbcea93beec5d3bdc02b1a3b577b4985e7 ] In case of error, the function live_context() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Reported-by: Hulk Robot Signed-off-by: Zhihao Cheng Reviewed-by: Tvrtko Ursulin Link: https://patchwork.freedesktop.org/patch/msgid/33c46ef24cd547d0ad21dc106441491a@intel.com [tursulin: Wrap commit text, fix Fixes: tag.] Signed-off-by: Tvrtko Ursulin (cherry picked from commit 8f4caef8d5401b42c6367d46c23da5e0e8111516) Signed-off-by: Jani Nikula Signed-off-by: Sasha Levin --- drivers/gpu/drm/i915/selftests/i915_request.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index e424a6d1a68c9..7a72faf29f272 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -1391,8 +1391,8 @@ static int live_breadcrumbs_smoketest(void *arg) for (n = 0; n < smoke[0].ncontexts; n++) { smoke[0].contexts[n] = live_context(i915, file); - if (!smoke[0].contexts[n]) { - ret = -ENOMEM; + if (IS_ERR(smoke[0].contexts[n])) { + ret = PTR_ERR(smoke[0].contexts[n]); goto out_contexts; } } From cdf3f6db1a86fc1e3d70423f4ee4fa81e4831157 Mon Sep 17 00:00:00 2001 From: Tobias Klauser Date: Wed, 27 Jan 2021 18:46:15 +0100 Subject: [PATCH 037/269] bpf: Simplify cases in bpf_base_func_proto [ Upstream commit 61ca36c8c4eb3bae35a285b1ae18c514cde65439 ] !perfmon_capable() is checked before the last switch(func_id) in bpf_base_func_proto. Thus, the cases BPF_FUNC_trace_printk and BPF_FUNC_snprintf_btf can be moved to that last switch(func_id) to omit the inline !perfmon_capable() checks. Signed-off-by: Tobias Klauser Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20210127174615.3038-1-tklauser@distanz.ch Signed-off-by: Sasha Levin --- kernel/bpf/helpers.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index c489430cac78c..d0fc091d2ab4a 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -707,14 +707,6 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_spin_lock_proto; case BPF_FUNC_spin_unlock: return &bpf_spin_unlock_proto; - case BPF_FUNC_trace_printk: - if (!perfmon_capable()) - return NULL; - return bpf_get_trace_printk_proto(); - case BPF_FUNC_snprintf_btf: - if (!perfmon_capable()) - return NULL; - return &bpf_snprintf_btf_proto; case BPF_FUNC_jiffies64: return &bpf_jiffies64_proto; case BPF_FUNC_per_cpu_ptr: @@ -729,6 +721,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) return NULL; switch (func_id) { + case BPF_FUNC_trace_printk: + return bpf_get_trace_printk_proto(); case BPF_FUNC_get_current_task: return &bpf_get_current_task_proto; case BPF_FUNC_probe_read_user: @@ -739,6 +733,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_probe_read_user_str_proto; case BPF_FUNC_probe_read_kernel_str: return &bpf_probe_read_kernel_str_proto; + case BPF_FUNC_snprintf_btf: + return &bpf_snprintf_btf_proto; default: return NULL; } From ff5039ec75c83d2ed5b781dc7733420ee8c985fc Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 28 May 2021 09:16:31 +0000 Subject: [PATCH 038/269] bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks [ Upstream commit ff40e51043af63715ab413995ff46996ecf9583f ] Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") added an implementation of the locked_down LSM hook to SELinux, with the aim to restrict which domains are allowed to perform operations that would breach lockdown. This is indirectly also getting audit subsystem involved to report events. The latter is problematic, as reported by Ondrej and Serhei, since it can bring down the whole system via audit: 1) The audit events that are triggered due to calls to security_locked_down() can OOM kill a machine, see below details [0]. 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() when trying to wake up kauditd, for example, when using trace_sched_switch() tracepoint, see details in [1]. Triggering this was not via some hypothetical corner case, but with existing tools like runqlat & runqslower from bcc, for example, which make use of this tracepoint. Rough call sequence goes like: rq_lock(rq) -> -------------------------+ trace_sched_switch() -> | bpf_prog_xyz() -> +-> deadlock selinux_lockdown() -> | audit_log_end() -> | wake_up_interruptible() -> | try_to_wake_up() -> | rq_lock(rq) --------------+ What's worse is that the intention of 59438b46471a to further restrict lockdown settings for specific applications in respect to the global lockdown policy is completely broken for BPF. The SELinux policy rule for the current lockdown check looks something like this: allow : lockdown { }; However, this doesn't match with the 'current' task where the security_locked_down() is executed, example: httpd does a syscall. There is a tracing program attached to the syscall which triggers a BPF program to run, which ends up doing a bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does the permission check against 'current', that is, httpd in this example. httpd has literally zero relation to this tracing program, and it would be nonsensical having to write an SELinux policy rule against httpd to let the tracing helper pass. The policy in this case needs to be against the entity that is installing the BPF program. For example, if bpftrace would generate a histogram of syscall counts by user space application: bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' bpftrace would then go and generate a BPF program from this internally. One way of doing it [for the sake of the example] could be to call bpf_get_current_task() helper and then access current->comm via one of bpf_probe_read_kernel{,_str}() helpers. So the program itself has nothing to do with httpd or any other random app doing a syscall here. The BPF program _explicitly initiated_ the lockdown check. The allow/deny policy belongs in the context of bpftrace: meaning, you want to grant bpftrace access to use these helpers, but other tracers on the system like my_random_tracer _not_. Therefore fix all three issues at the same time by taking a completely different approach for the security_locked_down() hook, that is, move the check into the program verification phase where we actually retrieve the BPF func proto. This also reliably gets the task (current) that is trying to install the BPF tracing program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since we're moving this out of the BPF helper's fast-path which can be called several millions of times per second. The check is then also in line with other security_locked_down() hooks in the system where the enforcement is performed at open/load time, for example, open_kcore() for /proc/kcore access or module_sig_check() for module signatures just to pick few random ones. What's out of scope in the fix as well as in other security_locked_down() hook locations /outside/ of BPF subsystem is that if the lockdown policy changes on the fly there is no retrospective action. This requires a different discussion, potentially complex infrastructure, and it's also not clear whether this can be solved generically. Either way, it is out of scope for a suitable stable fix which this one is targeting. Note that the breakage is specifically on 59438b46471a where it started to rely on 'current' as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode"). [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says: I starting seeing this with F-34. When I run a container that is traced with BPF to record the syscalls it is doing, auditd is flooded with messages like: type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality } for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM" scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=lockdown permissive=0 This seems to be leading to auditd running out of space in the backlog buffer and eventually OOMs the machine. [...] auditd running at 99% CPU presumably processing all the messages, eventually I get: Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64 Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000 Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1 Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [...] [1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/, Serhei Makarov says: Upstream kernel 5.11.0-rc7 and later was found to deadlock during a bpf_probe_read_compat() call within a sched_switch tracepoint. The problem is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on ppc64le. Example stack trace: [...] [ 730.868702] stack backtrace: [ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1 [ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 [ 730.873278] Call Trace: [ 730.873770] dump_stack+0x7f/0xa1 [ 730.874433] check_noncircular+0xdf/0x100 [ 730.875232] __lock_acquire+0x1202/0x1e10 [ 730.876031] ? __lock_acquire+0xfc0/0x1e10 [ 730.876844] lock_acquire+0xc2/0x3a0 [ 730.877551] ? __wake_up_common_lock+0x52/0x90 [ 730.878434] ? lock_acquire+0xc2/0x3a0 [ 730.879186] ? lock_is_held_type+0xa7/0x120 [ 730.880044] ? skb_queue_tail+0x1b/0x50 [ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90 [ 730.881656] ? __wake_up_common_lock+0x52/0x90 [ 730.882532] __wake_up_common_lock+0x52/0x90 [ 730.883375] audit_log_end+0x5b/0x100 [ 730.884104] slow_avc_audit+0x69/0x90 [ 730.884836] avc_has_perm+0x8b/0xb0 [ 730.885532] selinux_lockdown+0xa5/0xd0 [ 730.886297] security_locked_down+0x20/0x40 [ 730.887133] bpf_probe_read_compat+0x66/0xd0 [ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820 [ 730.888917] trace_call_bpf+0xe9/0x240 [ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0 [ 730.890579] perf_trace_sched_switch+0x142/0x180 [ 730.891485] ? __schedule+0x6d8/0xb20 [ 730.892209] __schedule+0x6d8/0xb20 [ 730.892899] schedule+0x5b/0xc0 [ 730.893522] exit_to_user_mode_prepare+0x11d/0x240 [ 730.894457] syscall_exit_to_user_mode+0x27/0x70 [ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae [...] Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") Reported-by: Ondrej Mosnacek Reported-by: Jakub Hrozek Reported-by: Serhei Makarov Reported-by: Jiri Olsa Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Tested-by: Jiri Olsa Cc: Paul Moore Cc: James Morris Cc: Jerome Marchand Cc: Frank Eigler Cc: Linus Torvalds Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net Signed-off-by: Sasha Levin --- kernel/bpf/helpers.c | 7 +++++-- kernel/trace/bpf_trace.c | 32 ++++++++++++-------------------- 2 files changed, 17 insertions(+), 22 deletions(-) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index d0fc091d2ab4a..f7e99bb8c3b6c 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -14,6 +14,7 @@ #include #include #include +#include #include "../../lib/kstrtox.h" @@ -728,11 +729,13 @@ bpf_base_func_proto(enum bpf_func_id func_id) case BPF_FUNC_probe_read_user: return &bpf_probe_read_user_proto; case BPF_FUNC_probe_read_kernel: - return &bpf_probe_read_kernel_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_kernel_proto; case BPF_FUNC_probe_read_user_str: return &bpf_probe_read_user_str_proto; case BPF_FUNC_probe_read_kernel_str: - return &bpf_probe_read_kernel_str_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_kernel_str_proto; case BPF_FUNC_snprintf_btf: return &bpf_snprintf_btf_proto; default: diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index fcbfc95649967..01710831fd02f 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -212,16 +212,11 @@ const struct bpf_func_proto bpf_probe_read_user_str_proto = { static __always_inline int bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr) { - int ret = security_locked_down(LOCKDOWN_BPF_READ); + int ret; - if (unlikely(ret < 0)) - goto fail; ret = copy_from_kernel_nofault(dst, unsafe_ptr, size); if (unlikely(ret < 0)) - goto fail; - return ret; -fail: - memset(dst, 0, size); + memset(dst, 0, size); return ret; } @@ -243,10 +238,7 @@ const struct bpf_func_proto bpf_probe_read_kernel_proto = { static __always_inline int bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr) { - int ret = security_locked_down(LOCKDOWN_BPF_READ); - - if (unlikely(ret < 0)) - goto fail; + int ret; /* * The strncpy_from_kernel_nofault() call will likely not fill the @@ -259,11 +251,7 @@ bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr) */ ret = strncpy_from_kernel_nofault(dst, unsafe_ptr, size); if (unlikely(ret < 0)) - goto fail; - - return ret; -fail: - memset(dst, 0, size); + memset(dst, 0, size); return ret; } @@ -1293,16 +1281,20 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_probe_read_user: return &bpf_probe_read_user_proto; case BPF_FUNC_probe_read_kernel: - return &bpf_probe_read_kernel_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_kernel_proto; case BPF_FUNC_probe_read_user_str: return &bpf_probe_read_user_str_proto; case BPF_FUNC_probe_read_kernel_str: - return &bpf_probe_read_kernel_str_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_kernel_str_proto; #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE case BPF_FUNC_probe_read: - return &bpf_probe_read_compat_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_compat_proto; case BPF_FUNC_probe_read_str: - return &bpf_probe_read_compat_str_proto; + return security_locked_down(LOCKDOWN_BPF_READ) < 0 ? + NULL : &bpf_probe_read_compat_str_proto; #endif #ifdef CONFIG_CGROUPS case BPF_FUNC_get_current_cgroup_id: From 2a0ba0125c2c62566023882293face06046698a5 Mon Sep 17 00:00:00 2001 From: Zhen Lei Date: Sat, 8 May 2021 14:25:17 +0800 Subject: [PATCH 039/269] ieee802154: fix error return code in ieee802154_add_iface() [ Upstream commit 79c6b8ed30e54b401c873dbad2511f2a1c525fd5 ] Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: be51da0f3e34 ("ieee802154: Stop using NLA_PUT*().") Reported-by: Hulk Robot Signed-off-by: Zhen Lei Link: https://lore.kernel.org/r/20210508062517.2574-1-thunder.leizhen@huawei.com Signed-off-by: Stefan Schmidt Signed-off-by: Sasha Levin --- net/ieee802154/nl-phy.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/ieee802154/nl-phy.c b/net/ieee802154/nl-phy.c index 2cdc7e63fe172..88215b5c93aa4 100644 --- a/net/ieee802154/nl-phy.c +++ b/net/ieee802154/nl-phy.c @@ -241,8 +241,10 @@ int ieee802154_add_iface(struct sk_buff *skb, struct genl_info *info) } if (nla_put_string(msg, IEEE802154_ATTR_PHY_NAME, wpan_phy_name(phy)) || - nla_put_string(msg, IEEE802154_ATTR_DEV_NAME, dev->name)) + nla_put_string(msg, IEEE802154_ATTR_DEV_NAME, dev->name)) { + rc = -EMSGSIZE; goto nla_put_failure; + } dev_put(dev); wpan_phy_put(phy); From e513d889625b5d8c6c2942243cd7f967455d300a Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Wed, 19 May 2021 14:16:14 +0000 Subject: [PATCH 040/269] ieee802154: fix error return code in ieee802154_llsec_getparams() [ Upstream commit 373e864cf52403b0974c2f23ca8faf9104234555 ] Fix to return negative error code -ENOBUFS from the error handling case instead of 0, as done elsewhere in this function. Fixes: 3e9c156e2c21 ("ieee802154: add netlink interfaces for llsec") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Link: https://lore.kernel.org/r/20210519141614.3040055-1-weiyongjun1@huawei.com Signed-off-by: Stefan Schmidt Signed-off-by: Sasha Levin --- net/ieee802154/nl-mac.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/ieee802154/nl-mac.c b/net/ieee802154/nl-mac.c index d19c40c684e80..71be751123210 100644 --- a/net/ieee802154/nl-mac.c +++ b/net/ieee802154/nl-mac.c @@ -680,8 +680,10 @@ int ieee802154_llsec_getparams(struct sk_buff *skb, struct genl_info *info) nla_put_u8(msg, IEEE802154_ATTR_LLSEC_SECLEVEL, params.out_level) || nla_put_u32(msg, IEEE802154_ATTR_LLSEC_FRAME_COUNTER, be32_to_cpu(params.frame_counter)) || - ieee802154_llsec_fill_key_id(msg, ¶ms.out_key)) + ieee802154_llsec_fill_key_id(msg, ¶ms.out_key)) { + rc = -ENOBUFS; goto out_free; + } dev_put(dev); From e0b61cda5f07b87b07841ee15066e90b6cb2ca6e Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Mon, 10 May 2021 11:38:52 +0200 Subject: [PATCH 041/269] igb: add correct exception tracing for XDP [ Upstream commit 74431c40b9c5fa673fff83ec157a76a69efd5c72 ] Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 9cbc948b5a20 ("igb: add XDP support") Reported-by: Jesper Dangaard Brouer Signed-off-by: Magnus Karlsson Tested-by: Vishakha Jambekar Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/igb/igb_main.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 368f0aac5e1d4..5c87c0a7ce3d7 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -8419,18 +8419,20 @@ static struct sk_buff *igb_run_xdp(struct igb_adapter *adapter, break; case XDP_TX: result = igb_xdp_xmit_back(adapter, xdp); + if (result == IGB_XDP_CONSUMED) + goto out_failure; break; case XDP_REDIRECT: err = xdp_do_redirect(adapter->netdev, xdp, xdp_prog); - if (!err) - result = IGB_XDP_REDIR; - else - result = IGB_XDP_CONSUMED; + if (err) + goto out_failure; + result = IGB_XDP_REDIR; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_DROP: From 8067da904921c1de8c1cd055170ab1f5944945e3 Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Mon, 10 May 2021 11:38:53 +0200 Subject: [PATCH 042/269] ixgbevf: add correct exception tracing for XDP [ Upstream commit faae81420d162551b6ef2d804aafc00f4cd68e0e ] Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 21092e9ce8b1 ("ixgbevf: Add support for XDP_TX action") Reported-by: Jesper Dangaard Brouer Signed-off-by: Magnus Karlsson Tested-by: Vishakha Jambekar Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 82fce27f682bb..a7d0a459969a2 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -1072,11 +1072,14 @@ static struct sk_buff *ixgbevf_run_xdp(struct ixgbevf_adapter *adapter, case XDP_TX: xdp_ring = adapter->xdp_ring[rx_ring->queue_index]; result = ixgbevf_xmit_xdp_ring(xdp_ring, xdp); + if (result == IXGBEVF_XDP_CONSUMED) + goto out_failure; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: From 1dcf3d435bf6147691560b8670ff06bfaa023a69 Mon Sep 17 00:00:00 2001 From: Rahul Lakkireddy Date: Wed, 2 Jun 2021 19:38:59 +0530 Subject: [PATCH 043/269] cxgb4: fix regression with HASH tc prio value update [ Upstream commit a27fb314cba8cb84cd6456a4699c3330a83c326d ] commit db43b30cd89c ("cxgb4: add ethtool n-tuple filter deletion") has moved searching for next highest priority HASH filter rule to cxgb4_flow_rule_destroy(), which searches the rhashtable before the the rule is removed from it and hence always finds at least 1 entry. Fix by removing the rule from rhashtable first before calling cxgb4_flow_rule_destroy() and hence avoid fetching stale info. Fixes: db43b30cd89c ("cxgb4: add ethtool n-tuple filter deletion") Signed-off-by: Rahul Lakkireddy Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c index 1b88bd1c2dbe4..dd9be229819a5 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c @@ -997,20 +997,16 @@ int cxgb4_tc_flower_destroy(struct net_device *dev, if (!ch_flower) return -ENOENT; + rhashtable_remove_fast(&adap->flower_tbl, &ch_flower->node, + adap->flower_ht_params); + ret = cxgb4_flow_rule_destroy(dev, ch_flower->fs.tc_prio, &ch_flower->fs, ch_flower->filter_id); if (ret) - goto err; + netdev_err(dev, "Flow rule destroy failed for tid: %u, ret: %d", + ch_flower->filter_id, ret); - ret = rhashtable_remove_fast(&adap->flower_tbl, &ch_flower->node, - adap->flower_ht_params); - if (ret) { - netdev_err(dev, "Flow remove from rhashtable failed"); - goto err; - } kfree_rcu(ch_flower, rcu); - -err: return ret; } From 09870235827451409ff546b073d754a19fd17e2e Mon Sep 17 00:00:00 2001 From: Coco Li Date: Thu, 3 Jun 2021 07:32:58 +0000 Subject: [PATCH 044/269] ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions [ Upstream commit 821bbf79fe46a8b1d18aa456e8ed0a3c208c3754 ] Reported by syzbot: HEAD commit: 90c911ad Merge tag 'fixes' of git://git.kernel.org/pub/scm.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master dashboard link: https://syzkaller.appspot.com/bug?extid=123aa35098fd3c000eb7 compiler: Debian clang version 11.0.1-2 ================================================================== BUG: KASAN: slab-out-of-bounds in fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline] BUG: KASAN: slab-out-of-bounds in fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732 Read of size 8 at addr ffff8880145c78f8 by task syz-executor.4/17760 CPU: 0 PID: 17760 Comm: syz-executor.4 Not tainted 5.12.0-rc8-syzkaller #0 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x202/0x31e lib/dump_stack.c:120 print_address_description+0x5f/0x3b0 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report+0x15c/0x200 mm/kasan/report.c:416 fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline] fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732 fib6_nh_release+0x9a/0x430 net/ipv6/route.c:3536 fib6_info_destroy_rcu+0xcb/0x1c0 net/ipv6/ip6_fib.c:174 rcu_do_batch kernel/rcu/tree.c:2559 [inline] rcu_core+0x8f6/0x1450 kernel/rcu/tree.c:2794 __do_softirq+0x372/0x7a6 kernel/softirq.c:345 invoke_softirq kernel/softirq.c:221 [inline] __irq_exit_rcu+0x22c/0x260 kernel/softirq.c:422 irq_exit_rcu+0x5/0x20 kernel/softirq.c:434 sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632 RIP: 0010:lock_acquire+0x1f6/0x720 kernel/locking/lockdep.c:5515 Code: f6 84 24 a1 00 00 00 02 0f 85 8d 02 00 00 f7 c3 00 02 00 00 49 bd 00 00 00 00 00 fc ff df 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 3d 00 00 00 00 00 4b c7 44 3d 09 00 00 00 00 43 c7 44 3d RSP: 0018:ffffc90009e06560 EFLAGS: 00000206 RAX: 1ffff920013c0cc0 RBX: 0000000000000246 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90009e066e0 R08: dffffc0000000000 R09: fffffbfff1f992b1 R10: fffffbfff1f992b1 R11: 0000000000000000 R12: 0000000000000000 R13: dffffc0000000000 R14: 0000000000000000 R15: 1ffff920013c0cb4 rcu_lock_acquire+0x2a/0x30 include/linux/rcupdate.h:267 rcu_read_lock include/linux/rcupdate.h:656 [inline] ext4_get_group_info+0xea/0x340 fs/ext4/ext4.h:3231 ext4_mb_prefetch+0x123/0x5d0 fs/ext4/mballoc.c:2212 ext4_mb_regular_allocator+0x8a5/0x28f0 fs/ext4/mballoc.c:2379 ext4_mb_new_blocks+0xc6e/0x24f0 fs/ext4/mballoc.c:4982 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4238 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638 ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848 ext4_bread+0x2a/0x1c0 fs/ext4/inode.c:900 ext4_append+0x1a4/0x360 fs/ext4/namei.c:67 ext4_init_new_dir+0x337/0xa10 fs/ext4/namei.c:2768 ext4_mkdir+0x4b8/0xc00 fs/ext4/namei.c:2814 vfs_mkdir+0x45b/0x640 fs/namei.c:3819 ovl_do_mkdir fs/overlayfs/overlayfs.h:161 [inline] ovl_mkdir_real+0x53/0x1a0 fs/overlayfs/dir.c:146 ovl_create_real+0x280/0x490 fs/overlayfs/dir.c:193 ovl_workdir_create+0x425/0x600 fs/overlayfs/super.c:788 ovl_make_workdir+0xed/0x1140 fs/overlayfs/super.c:1355 ovl_get_workdir fs/overlayfs/super.c:1492 [inline] ovl_fill_super+0x39ee/0x5370 fs/overlayfs/super.c:2035 mount_nodev+0x52/0xe0 fs/super.c:1413 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2903 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3233 do_mount fs/namespace.c:3246 [inline] __do_sys_mount fs/namespace.c:3454 [inline] __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3431 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4665f9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f68f2b87188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9 RDX: 00000000200000c0 RSI: 0000000020000000 RDI: 000000000040000a RBP: 00000000004bfbb9 R08: 0000000020000100 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 R13: 00007ffe19002dff R14: 00007f68f2b87300 R15: 0000000000022000 Allocated by task 17768: kasan_save_stack mm/kasan/common.c:38 [inline] kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:427 [inline] ____kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506 kasan_kmalloc include/linux/kasan.h:233 [inline] __kmalloc+0xb4/0x380 mm/slub.c:4055 kmalloc include/linux/slab.h:559 [inline] kzalloc include/linux/slab.h:684 [inline] fib6_info_alloc+0x2c/0xd0 net/ipv6/ip6_fib.c:154 ip6_route_info_create+0x55d/0x1a10 net/ipv6/route.c:3638 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmsg+0x319/0x400 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Last potentially related work creation: kasan_save_stack+0x27/0x50 mm/kasan/common.c:38 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345 __call_rcu kernel/rcu/tree.c:3039 [inline] call_rcu+0x1b1/0xa30 kernel/rcu/tree.c:3114 fib6_info_release include/net/ip6_fib.h:337 [inline] ip6_route_info_create+0x10c4/0x1a10 net/ipv6/route.c:3718 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmsg+0x319/0x400 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Second to last potentially related work creation: kasan_save_stack+0x27/0x50 mm/kasan/common.c:38 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345 insert_work+0x54/0x400 kernel/workqueue.c:1331 __queue_work+0x981/0xcc0 kernel/workqueue.c:1497 queue_work_on+0x111/0x200 kernel/workqueue.c:1524 queue_work include/linux/workqueue.h:507 [inline] call_usermodehelper_exec+0x283/0x470 kernel/umh.c:433 kobject_uevent_env+0x1349/0x1730 lib/kobject_uevent.c:617 kvm_uevent_notify_change+0x309/0x3b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4809 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:877 [inline] kvm_put_kvm+0x9c/0xd10 arch/x86/kvm/../../../virt/kvm/kvm_main.c:920 kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3120 __fput+0x352/0x7b0 fs/file_table.c:280 task_work_run+0x146/0x1c0 kernel/task_work.c:140 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:174 [inline] exit_to_user_mode_prepare+0x10b/0x1e0 kernel/entry/common.c:208 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff8880145c7800 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 56 bytes to the right of 192-byte region [ffff8880145c7800, ffff8880145c78c0) The buggy address belongs to the page: page:ffffea00005171c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x145c7 flags: 0xfff00000000200(slab) raw: 00fff00000000200 ffffea00006474c0 0000000200000002 ffff888010c41a00 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880145c7780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff8880145c7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8880145c7880: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff8880145c7900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880145c7980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ================================================================== In the ip6_route_info_create function, in the case that the nh pointer is not NULL, the fib6_nh in fib6_info has not been allocated. Therefore, when trying to free fib6_info in this error case using fib6_info_release, the function will call fib6_info_destroy_rcu, which it will access fib6_nh_release(f6i->fib6_nh); However, f6i->fib6_nh doesn't have any refcount yet given the lack of allocation causing the reported memory issue above. Therefore, releasing the empty pointer directly instead would be the solution. Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info") Fixes: 706ec91916462 ("ipv6: Fix nexthop refcnt leak when creating ipv6 route info") Signed-off-by: Coco Li Cc: David Ahern Reviewed-by: Eric Dumazet Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/route.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 71e578ed8699f..ccff4738313c1 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3671,11 +3671,11 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, if (nh) { if (rt->fib6_src.plen) { NL_SET_ERR_MSG(extack, "Nexthops can not be used with source routing"); - goto out; + goto out_free; } if (!nexthop_get(nh)) { NL_SET_ERR_MSG(extack, "Nexthop has been deleted"); - goto out; + goto out_free; } rt->nh = nh; fib6_nh = nexthop_fib6_nh(rt->nh); @@ -3712,6 +3712,10 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, out: fib6_info_release(rt); return ERR_PTR(err); +out_free: + ip_fib_metrics_put(rt->fib6_metrics); + kfree(rt); + return ERR_PTR(err); } int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags, From b94580b055b8a3e90f47dc2f8c7192455306d655 Mon Sep 17 00:00:00 2001 From: Brett Creeley Date: Fri, 26 Feb 2021 13:19:20 -0800 Subject: [PATCH 045/269] ice: Fix allowing VF to request more/less queues via virtchnl [ Upstream commit f0457690af56673cb0c47af6e25430389a149225 ] Commit 12bb018c538c ("ice: Refactor VF reset") caused a regression that removes the ability for a VF to request a different amount of queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers to either increase or decrease the number of queue pairs they are allocated. Fix this by using the variable vf->num_req_qs when determining the vf->num_vf_qs during VF VSI creation. Fixes: 12bb018c538c ("ice: Refactor VF reset") Signed-off-by: Brett Creeley Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_lib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index e1384503dd4d5..fb20c6971f4c7 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -192,6 +192,8 @@ static void ice_vsi_set_num_qs(struct ice_vsi *vsi, u16 vf_id) break; case ICE_VSI_VF: vf = &pf->vf[vsi->vf_id]; + if (vf->num_req_qs) + vf->num_vf_qs = vf->num_req_qs; vsi->alloc_txq = vf->num_vf_qs; vsi->alloc_rxq = vf->num_vf_qs; /* pf->num_msix_per_vf includes (VF miscellaneous vector + From a79883ce1e9f7ccc1616c7659332db1266a9d434 Mon Sep 17 00:00:00 2001 From: Brett Creeley Date: Fri, 26 Feb 2021 13:19:21 -0800 Subject: [PATCH 046/269] ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared [ Upstream commit 8679f07a9922068b9b6be81b632f52cac45d1b91 ] Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the same time as VF_MBX_ARQLEN. Fixes: 82ba01282cf8 ("ice: clear VF ARQLEN register on reset") Signed-off-by: Brett Creeley Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_hw_autogen.h | 1 + drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 12 +++++++----- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h index 90abc8612a6ab..406dd6bd97a7d 100644 --- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h +++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h @@ -31,6 +31,7 @@ #define PF_FW_ATQLEN_ATQOVFL_M BIT(29) #define PF_FW_ATQLEN_ATQCRIT_M BIT(30) #define VF_MBX_ARQLEN(_VF) (0x0022BC00 + ((_VF) * 4)) +#define VF_MBX_ATQLEN(_VF) (0x0022A800 + ((_VF) * 4)) #define PF_FW_ATQLEN_ATQENABLE_M BIT(31) #define PF_FW_ATQT 0x00080400 #define PF_MBX_ARQBAH 0x0022E400 diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c index b3161c5def465..374ccce2a784c 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c @@ -435,13 +435,15 @@ static void ice_trigger_vf_reset(struct ice_vf *vf, bool is_vflr, bool is_pfr) */ clear_bit(ICE_VF_STATE_INIT, vf->vf_states); - /* VF_MBX_ARQLEN is cleared by PFR, so the driver needs to clear it - * in the case of VFR. If this is done for PFR, it can mess up VF - * resets because the VF driver may already have started cleanup - * by the time we get here. + /* VF_MBX_ARQLEN and VF_MBX_ATQLEN are cleared by PFR, so the driver + * needs to clear them in the case of VFR/VFLR. If this is done for + * PFR, it can mess up VF resets because the VF driver may already + * have started cleanup by the time we get here. */ - if (!is_pfr) + if (!is_pfr) { wr32(hw, VF_MBX_ARQLEN(vf->vf_id), 0); + wr32(hw, VF_MBX_ATQLEN(vf->vf_id), 0); + } /* In the case of a VFLR, the HW has already reset the VF and we * just need to clean up, so don't hit the VFRTRIG register. From 8726b9e81be7b30d7a9f4f1e3426352b37e6129d Mon Sep 17 00:00:00 2001 From: Haiyue Wang Date: Fri, 26 Feb 2021 13:19:31 -0800 Subject: [PATCH 047/269] ice: handle the VF VSI rebuild failure [ Upstream commit c7ee6ce1cf60b7fcdbdd2354d377d00bae3fa2d2 ] VSI rebuild can be failed for LAN queue config, then the VF's VSI will be NULL, the VF reset should be stopped with the VF entering into the disable state. Fixes: 12bb018c538c ("ice: Refactor VF reset") Signed-off-by: Haiyue Wang Tested-by: Konrad Jankowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c index 374ccce2a784c..c9f82fd3cf48d 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c @@ -1341,7 +1341,12 @@ bool ice_reset_vf(struct ice_vf *vf, bool is_vflr) } ice_vf_pre_vsi_rebuild(vf); - ice_vf_rebuild_vsi_with_release(vf); + + if (ice_vf_rebuild_vsi_with_release(vf)) { + dev_err(dev, "Failed to release and setup the VF%u's VSI\n", vf->vf_id); + return false; + } + ice_vf_post_vsi_rebuild(vf); return true; From 68db78345f7383dcd3ffd3c20f379f2f8e1b445f Mon Sep 17 00:00:00 2001 From: Paul Greenwalt Date: Wed, 5 May 2021 14:17:58 -0700 Subject: [PATCH 048/269] ice: report supported and advertised autoneg using PHY capabilities [ Upstream commit 5cd349c349d6ec52862e550d3576893d35ab8ac2 ] Ethtool incorrectly reported supported and advertised auto-negotiation settings for a backplane PHY image which did not support auto-negotiation. This can occur when using media or PHY type for reporting ethtool supported and advertised auto-negotiation settings. Remove setting supported and advertised auto-negotiation settings based on PHY type in ice_phy_type_to_ethtool(), and MAC type in ice_get_link_ksettings(). Ethtool supported and advertised auto-negotiation settings should be based on the PHY image using the AQ command get PHY capabilities with media. Add setting supported and advertised auto-negotiation settings based get PHY capabilities with media in ice_get_link_ksettings(). Fixes: 48cb27f2fd18 ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Paul Greenwalt Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_ethtool.c | 51 +++----------------- 1 file changed, 6 insertions(+), 45 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index d70573f5072c6..a7975afecf70f 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -1797,49 +1797,6 @@ ice_phy_type_to_ethtool(struct net_device *netdev, ice_ethtool_advertise_link_mode(ICE_AQ_LINK_SPEED_100GB, 100000baseKR4_Full); } - - /* Autoneg PHY types */ - if (phy_types_low & ICE_PHY_TYPE_LOW_100BASE_TX || - phy_types_low & ICE_PHY_TYPE_LOW_1000BASE_T || - phy_types_low & ICE_PHY_TYPE_LOW_1000BASE_KX || - phy_types_low & ICE_PHY_TYPE_LOW_2500BASE_T || - phy_types_low & ICE_PHY_TYPE_LOW_2500BASE_KX || - phy_types_low & ICE_PHY_TYPE_LOW_5GBASE_T || - phy_types_low & ICE_PHY_TYPE_LOW_5GBASE_KR || - phy_types_low & ICE_PHY_TYPE_LOW_10GBASE_T || - phy_types_low & ICE_PHY_TYPE_LOW_10GBASE_KR_CR1 || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_T || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR_S || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR1 || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR_S || - phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR1 || - phy_types_low & ICE_PHY_TYPE_LOW_40GBASE_CR4 || - phy_types_low & ICE_PHY_TYPE_LOW_40GBASE_KR4) { - ethtool_link_ksettings_add_link_mode(ks, supported, - Autoneg); - ethtool_link_ksettings_add_link_mode(ks, advertising, - Autoneg); - } - if (phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_CR2 || - phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_KR2 || - phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_CP || - phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_KR_PAM4) { - ethtool_link_ksettings_add_link_mode(ks, supported, - Autoneg); - ethtool_link_ksettings_add_link_mode(ks, advertising, - Autoneg); - } - if (phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_CR4 || - phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_KR4 || - phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_KR_PAM4 || - phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_CP2) { - ethtool_link_ksettings_add_link_mode(ks, supported, - Autoneg); - ethtool_link_ksettings_add_link_mode(ks, advertising, - Autoneg); - } } #define TEST_SET_BITS_TIMEOUT 50 @@ -1996,9 +1953,7 @@ ice_get_link_ksettings(struct net_device *netdev, ks->base.port = PORT_TP; break; case ICE_MEDIA_BACKPLANE: - ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg); ethtool_link_ksettings_add_link_mode(ks, supported, Backplane); - ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg); ethtool_link_ksettings_add_link_mode(ks, advertising, Backplane); ks->base.port = PORT_NONE; @@ -2073,6 +2028,12 @@ ice_get_link_ksettings(struct net_device *netdev, if (caps->link_fec_options & ICE_AQC_PHY_FEC_25G_RS_CLAUSE91_EN) ethtool_link_ksettings_add_link_mode(ks, supported, FEC_RS); + /* Set supported and advertised autoneg */ + if (ice_is_phy_caps_an_enabled(caps)) { + ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg); + ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg); + } + done: kfree(caps); return err; From c4b796f20c9581ed8502e42be3d5bde59469143d Mon Sep 17 00:00:00 2001 From: Dave Ertman Date: Wed, 5 May 2021 14:17:59 -0700 Subject: [PATCH 049/269] ice: Allow all LLDP packets from PF to Tx [ Upstream commit f9f83202b7263ac371d616d6894a2c9ed79158ef ] Currently in the ice driver, the check whether to allow a LLDP packet to egress the interface from the PF_VSI is being based on the SKB's priority field. It checks to see if the packets priority is equal to TC_PRIO_CONTROL. Injected LLDP packets do not always meet this condition. SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL (0x0003) and does not set the priority field. There will be other injection methods (even ones used by end users) that will not correctly configure the socket so that SKB fields are correctly populated. Then ethernet header has to have to correct value for the protocol though. Add a check to also allow packets whose ethhdr->h_proto matches ETH_P_LLDP (0x88CC). Fixes: 0c3a6101ff2d ("ice: Allow egress control packets from PF_VSI") Signed-off-by: Dave Ertman Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 0f2544c420ac3..1510091a63e8a 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -2373,6 +2373,7 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_ring *tx_ring) struct ice_tx_offload_params offload = { 0 }; struct ice_vsi *vsi = tx_ring->vsi; struct ice_tx_buf *first; + struct ethhdr *eth; unsigned int count; int tso, csum; @@ -2419,7 +2420,9 @@ ice_xmit_frame_ring(struct sk_buff *skb, struct ice_ring *tx_ring) goto out_drop; /* allow CONTROL frames egress from main VSI if FW LLDP disabled */ - if (unlikely(skb->priority == TC_PRIO_CONTROL && + eth = (struct ethhdr *)skb_mac_header(skb); + if (unlikely((skb->priority == TC_PRIO_CONTROL || + eth->h_proto == htons(ETH_P_LLDP)) && vsi->type == ICE_VSI_PF && vsi->port_info->qos_cfg.is_sw_lldp)) offload.cd_qw1 |= (u64)(ICE_TX_DESC_DTYPE_CTX | From 21d494d4446b020e69e7b22fa6ed9274db1f175c Mon Sep 17 00:00:00 2001 From: Roja Rani Yarubandi Date: Tue, 25 May 2021 18:40:50 +0530 Subject: [PATCH 050/269] i2c: qcom-geni: Add shutdown callback for i2c [ Upstream commit 9f78c607600ce4f2a952560de26534715236f612 ] If the hardware is still accessing memory after SMMU translation is disabled (as part of smmu shutdown callback), then the IOVAs (I/O virtual address) which it was using will go on the bus as the physical addresses which will result in unknown crashes like NoC/interconnect errors. So, implement shutdown callback for i2c driver to suspend the bus during system "reboot" or "shutdown". Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Roja Rani Yarubandi Reviewed-by: Stephen Boyd Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-qcom-geni.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c index 4a6dd05d6dbf9..899ad2c7d67d3 100644 --- a/drivers/i2c/busses/i2c-qcom-geni.c +++ b/drivers/i2c/busses/i2c-qcom-geni.c @@ -654,6 +654,14 @@ static int geni_i2c_remove(struct platform_device *pdev) return 0; } +static void geni_i2c_shutdown(struct platform_device *pdev) +{ + struct geni_i2c_dev *gi2c = platform_get_drvdata(pdev); + + /* Make client i2c transfers start failing */ + i2c_mark_adapter_suspended(&gi2c->adap); +} + static int __maybe_unused geni_i2c_runtime_suspend(struct device *dev) { int ret; @@ -718,6 +726,7 @@ MODULE_DEVICE_TABLE(of, geni_i2c_dt_match); static struct platform_driver geni_i2c_driver = { .probe = geni_i2c_probe, .remove = geni_i2c_remove, + .shutdown = geni_i2c_shutdown, .driver = { .name = "geni_i2c", .pm = &geni_i2c_pm_ops, From 1958a31c035dd8981a7313acf8f91ae1886173cc Mon Sep 17 00:00:00 2001 From: Rahul Lakkireddy Date: Fri, 4 Jun 2021 16:48:18 +0530 Subject: [PATCH 051/269] cxgb4: avoid link re-train during TC-MQPRIO configuration [ Upstream commit 3822d0670c9d4342794d73e0d0e615322b40438e ] When configuring TC-MQPRIO offload, only turn off netdev carrier and don't bring physical link down in hardware. Otherwise, when the physical link is brought up again after configuration, it gets re-trained and stalls ongoing traffic. Also, when firmware is no longer accessible or crashed, avoid sending FLOWC and waiting for reply that will never come. Fix following hung_task_timeout_secs trace seen in these cases. INFO: task tc:20807 blocked for more than 122 seconds. Tainted: G S 5.13.0-rc3+ #122 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:tc state:D stack:14768 pid:20807 ppid: 19366 flags:0x00000000 Call Trace: __schedule+0x27b/0x6a0 schedule+0x37/0xa0 schedule_preempt_disabled+0x5/0x10 __mutex_lock.isra.14+0x2a0/0x4a0 ? netlink_lookup+0x120/0x1a0 ? rtnl_fill_ifinfo+0x10f0/0x10f0 __netlink_dump_start+0x70/0x250 rtnetlink_rcv_msg+0x28b/0x380 ? rtnl_fill_ifinfo+0x10f0/0x10f0 ? rtnl_calcit.isra.42+0x120/0x120 netlink_rcv_skb+0x4b/0xf0 netlink_unicast+0x1a0/0x280 netlink_sendmsg+0x216/0x440 sock_sendmsg+0x56/0x60 __sys_sendto+0xe9/0x150 ? handle_mm_fault+0x6d/0x1b0 ? do_user_addr_fault+0x1c5/0x620 __x64_sys_sendto+0x1f/0x30 do_syscall_64+0x3c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7f73218321 RSP: 002b:00007ffd19626208 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055b7c0a8b240 RCX: 00007f7f73218321 RDX: 0000000000000028 RSI: 00007ffd19626210 RDI: 0000000000000003 RBP: 000055b7c08680ff R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000055b7c085f5f6 R13: 000055b7c085f60a R14: 00007ffd19636470 R15: 00007ffd196262a0 Fixes: b1396c2bd675 ("cxgb4: parse and configure TC-MQPRIO offload") Signed-off-by: Rahul Lakkireddy Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 2 -- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 4 ++-- drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c | 9 ++++++--- drivers/net/ethernet/chelsio/cxgb4/sge.c | 6 ++++++ 4 files changed, 14 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index 27308600da153..2dd4869156291 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -2177,8 +2177,6 @@ int cxgb4_update_mac_filt(struct port_info *pi, unsigned int viid, bool persistent, u8 *smt_idx); int cxgb4_get_msix_idx_from_bmap(struct adapter *adap); void cxgb4_free_msix_idx_in_bmap(struct adapter *adap, u32 msix_idx); -int cxgb_open(struct net_device *dev); -int cxgb_close(struct net_device *dev); void cxgb4_enable_rx(struct adapter *adap, struct sge_rspq *q); void cxgb4_quiesce_rx(struct sge_rspq *q); int cxgb4_port_mirror_alloc(struct net_device *dev); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 23c13f34a5727..04dcb5e4b3161 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -2834,7 +2834,7 @@ static void cxgb_down(struct adapter *adapter) /* * net_device operations */ -int cxgb_open(struct net_device *dev) +static int cxgb_open(struct net_device *dev) { struct port_info *pi = netdev_priv(dev); struct adapter *adapter = pi->adapter; @@ -2882,7 +2882,7 @@ int cxgb_open(struct net_device *dev) return err; } -int cxgb_close(struct net_device *dev) +static int cxgb_close(struct net_device *dev) { struct port_info *pi = netdev_priv(dev); struct adapter *adapter = pi->adapter; diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c index 6c259de96f969..338b04f339b3d 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c @@ -589,7 +589,8 @@ int cxgb4_setup_tc_mqprio(struct net_device *dev, * down before configuring tc params. */ if (netif_running(dev)) { - cxgb_close(dev); + netif_tx_stop_all_queues(dev); + netif_carrier_off(dev); needs_bring_up = true; } @@ -615,8 +616,10 @@ int cxgb4_setup_tc_mqprio(struct net_device *dev, } out: - if (needs_bring_up) - cxgb_open(dev); + if (needs_bring_up) { + netif_tx_start_all_queues(dev); + netif_carrier_on(dev); + } mutex_unlock(&adap->tc_mqprio->mqprio_mutex); return ret; diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c index 546301272271d..ccb6bd002b20d 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/sge.c +++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c @@ -2552,6 +2552,12 @@ int cxgb4_ethofld_send_flowc(struct net_device *dev, u32 eotid, u32 tc) if (!eosw_txq) return -ENOMEM; + if (!(adap->flags & CXGB4_FW_OK)) { + /* Don't stall caller when access to FW is lost */ + complete(&eosw_txq->completion); + return -EIO; + } + skb = alloc_skb(len, GFP_KERNEL); if (!skb) return -ENOMEM; From fbae1a97ce342470dcf2c3f51e63faf2e8e557f0 Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Wed, 2 Dec 2020 16:07:22 +0100 Subject: [PATCH 052/269] i40e: optimize for XDP_REDIRECT in xsk path [ Upstream commit 346497c78d15cdd5bdc3b642a895009359e5457f ] Optimize i40e_run_xdp_zc() for the XDP program verdict being XDP_REDIRECT in the xsk zero-copy path. This path is only used when having AF_XDP zero-copy on and in that case most packets will be directed to user space. This provides a little over 100k extra packets in throughput on my server when running l2fwd in xdpsock. Signed-off-by: Magnus Karlsson Tested-by: George Kuruvinakunnel Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index 8557807b41717..18e8d4e456d64 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -159,6 +159,13 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) xdp_prog = READ_ONCE(rx_ring->xdp_prog); act = bpf_prog_run_xdp(xdp_prog, xdp); + if (likely(act == XDP_REDIRECT)) { + err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); + result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED; + rcu_read_unlock(); + return result; + } + switch (act) { case XDP_PASS: break; @@ -166,10 +173,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->queue_index]; result = i40e_xmit_xdp_tx_ring(xdp, xdp_ring); break; - case XDP_REDIRECT: - err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED; - break; default: bpf_warn_invalid_xdp_action(act); fallthrough; From 274d6eeaafc7c40b0618c66ec9e0cdf2f51c2f4d Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Mon, 10 May 2021 11:38:49 +0200 Subject: [PATCH 053/269] i40e: add correct exception tracing for XDP [ Upstream commit f6c10b48f8c8da44adaff730d8e700b6272add2b ] Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action") Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Reported-by: Jesper Dangaard Brouer Signed-off-by: Magnus Karlsson Tested-by: Kiran Bhandare Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 7 ++++++- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 8 ++++++-- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 011f484606a3a..c40ac82db863e 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2205,15 +2205,20 @@ static int i40e_run_xdp(struct i40e_ring *rx_ring, struct xdp_buff *xdp) case XDP_TX: xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->queue_index]; result = i40e_xmit_xdp_tx_ring(xdp, xdp_ring); + if (result == I40E_XDP_CONSUMED) + goto out_failure; break; case XDP_REDIRECT: err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED; + if (err) + goto out_failure; + result = I40E_XDP_REDIR; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index 18e8d4e456d64..86c79f71c685a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -161,9 +161,10 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) if (likely(act == XDP_REDIRECT)) { err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED; + if (err) + goto out_failure; rcu_read_unlock(); - return result; + return I40E_XDP_REDIR; } switch (act) { @@ -172,11 +173,14 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp) case XDP_TX: xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->queue_index]; result = i40e_xmit_xdp_tx_ring(xdp, xdp_ring); + if (result == I40E_XDP_CONSUMED) + goto out_failure; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: From 7bd82b73d5898a0c8d219740855c823edde5ac00 Mon Sep 17 00:00:00 2001 From: Maciej Fijalkowski Date: Mon, 18 Jan 2021 16:13:11 +0100 Subject: [PATCH 054/269] ice: simplify ice_run_xdp MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 59c97d1b51b119eace6b1e61a6f820701f5a8299 ] There's no need for 'result' variable, we can directly return the internal status based on action returned by xdp prog. Reviewed-by: Björn Töpel Signed-off-by: Maciej Fijalkowski Tested-by: Kiran Bhandare Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 1510091a63e8a..5cc2854a5d48b 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -537,22 +537,20 @@ static int ice_run_xdp(struct ice_ring *rx_ring, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { - int err, result = ICE_XDP_PASS; struct ice_ring *xdp_ring; + int err; u32 act; act = bpf_prog_run_xdp(xdp_prog, xdp); switch (act) { case XDP_PASS: - break; + return ICE_XDP_PASS; case XDP_TX: xdp_ring = rx_ring->vsi->xdp_rings[smp_processor_id()]; - result = ice_xmit_xdp_buff(xdp, xdp_ring); - break; + return ice_xmit_xdp_buff(xdp, xdp_ring); case XDP_REDIRECT: err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; - break; + return !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; default: bpf_warn_invalid_xdp_action(act); fallthrough; @@ -560,11 +558,8 @@ ice_run_xdp(struct ice_ring *rx_ring, struct xdp_buff *xdp, trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_DROP: - result = ICE_XDP_CONSUMED; - break; + return ICE_XDP_CONSUMED; } - - return result; } /** From 9e1eb428849fab331c73876be4a52bbcf884dbb0 Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Wed, 2 Dec 2020 16:07:24 +0100 Subject: [PATCH 055/269] ice: optimize for XDP_REDIRECT in xsk path [ Upstream commit bb52073645a618ab4d93c8d932fb8faf114c55bc ] Optimize ice_run_xdp_zc() for the XDP program verdict being XDP_REDIRECT in the xsk zero-copy path. This path is only used when having AF_XDP zero-copy on and in that case most packets will be directed to user space. This provides a little over 100k extra packets in throughput on my server when running l2fwd in xdpsock. Signed-off-by: Magnus Karlsson Tested-by: George Kuruvinakunnel Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_xsk.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 98101a8e2952d..952148eede30f 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -524,6 +524,14 @@ ice_run_xdp_zc(struct ice_ring *rx_ring, struct xdp_buff *xdp) } act = bpf_prog_run_xdp(xdp_prog, xdp); + + if (likely(act == XDP_REDIRECT)) { + err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); + result = !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; + rcu_read_unlock(); + return result; + } + switch (act) { case XDP_PASS: break; @@ -531,10 +539,6 @@ ice_run_xdp_zc(struct ice_ring *rx_ring, struct xdp_buff *xdp) xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->q_index]; result = ice_xmit_xdp_buff(xdp, xdp_ring); break; - case XDP_REDIRECT: - err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; - break; default: bpf_warn_invalid_xdp_action(act); fallthrough; From ad505705bba64cf698f0fcaa26fdd644f400e57c Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Mon, 10 May 2021 11:38:50 +0200 Subject: [PATCH 056/269] ice: add correct exception tracing for XDP [ Upstream commit 89d65df024c59988291f643b4e45d1528c51aef9 ] Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: efc2214b6047 ("ice: Add support for XDP") Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Reported-by: Jesper Dangaard Brouer Signed-off-by: Magnus Karlsson Tested-by: Kiran Bhandare Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ice/ice_txrx.c | 12 +++++++++--- drivers/net/ethernet/intel/ice/ice_xsk.c | 8 ++++++-- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 5cc2854a5d48b..442a9bcbf60a7 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -538,7 +538,7 @@ ice_run_xdp(struct ice_ring *rx_ring, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { struct ice_ring *xdp_ring; - int err; + int err, result; u32 act; act = bpf_prog_run_xdp(xdp_prog, xdp); @@ -547,14 +547,20 @@ ice_run_xdp(struct ice_ring *rx_ring, struct xdp_buff *xdp, return ICE_XDP_PASS; case XDP_TX: xdp_ring = rx_ring->vsi->xdp_rings[smp_processor_id()]; - return ice_xmit_xdp_buff(xdp, xdp_ring); + result = ice_xmit_xdp_buff(xdp, xdp_ring); + if (result == ICE_XDP_CONSUMED) + goto out_failure; + return result; case XDP_REDIRECT: err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - return !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; + if (err) + goto out_failure; + return ICE_XDP_REDIR; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_DROP: diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 952148eede30f..9f36f8d7a9854 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -527,9 +527,10 @@ ice_run_xdp_zc(struct ice_ring *rx_ring, struct xdp_buff *xdp) if (likely(act == XDP_REDIRECT)) { err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED; + if (err) + goto out_failure; rcu_read_unlock(); - return result; + return ICE_XDP_REDIR; } switch (act) { @@ -538,11 +539,14 @@ ice_run_xdp_zc(struct ice_ring *rx_ring, struct xdp_buff *xdp) case XDP_TX: xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->q_index]; result = ice_xmit_xdp_buff(xdp, xdp_ring); + if (result == ICE_XDP_CONSUMED) + goto out_failure; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; case XDP_DROP: From e369db6cde11b77197382d8cea36e8bc976aaac9 Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Wed, 2 Dec 2020 16:07:23 +0100 Subject: [PATCH 057/269] ixgbe: optimize for XDP_REDIRECT in xsk path [ Upstream commit 7d52fe2eaddfa3d7255d43c3e89ebf2748b7ea7a ] Optimize ixgbe_run_xdp_zc() for the XDP program verdict being XDP_REDIRECT in the xsk zero-copy path. This path is only used when having AF_XDP zero-copy on and in that case most packets will be directed to user space. This provides a little under 100k extra packets in throughput on my server when running l2fwd in xdpsock. Signed-off-by: Magnus Karlsson Tested-by: Vishakha Jambekar Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c index 3771857cf887c..91ad5b902673c 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c @@ -104,6 +104,13 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, xdp_prog = READ_ONCE(rx_ring->xdp_prog); act = bpf_prog_run_xdp(xdp_prog, xdp); + if (likely(act == XDP_REDIRECT)) { + err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); + result = !err ? IXGBE_XDP_REDIR : IXGBE_XDP_CONSUMED; + rcu_read_unlock(); + return result; + } + switch (act) { case XDP_PASS: break; @@ -115,10 +122,6 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, } result = ixgbe_xmit_xdp_ring(adapter, xdpf); break; - case XDP_REDIRECT: - err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? IXGBE_XDP_REDIR : IXGBE_XDP_CONSUMED; - break; default: bpf_warn_invalid_xdp_action(act); fallthrough; From a1bf16616d8351a2e79400d6d19608befb2ce1dd Mon Sep 17 00:00:00 2001 From: Magnus Karlsson Date: Mon, 10 May 2021 11:38:51 +0200 Subject: [PATCH 058/269] ixgbe: add correct exception tracing for XDP [ Upstream commit 8281356b1cab1cccc71412eb4cf28b99d6bb2c19 ] Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 33fdc82f0883 ("ixgbe: add support for XDP_TX action") Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Reported-by: Jesper Dangaard Brouer Signed-off-by: Magnus Karlsson Tested-by: Vishakha Jambekar Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 16 ++++++++-------- drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 14 ++++++++------ 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 0b9fddbc5db4f..1bfba87f1ff60 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -2218,23 +2218,23 @@ static struct sk_buff *ixgbe_run_xdp(struct ixgbe_adapter *adapter, break; case XDP_TX: xdpf = xdp_convert_buff_to_frame(xdp); - if (unlikely(!xdpf)) { - result = IXGBE_XDP_CONSUMED; - break; - } + if (unlikely(!xdpf)) + goto out_failure; result = ixgbe_xmit_xdp_ring(adapter, xdpf); + if (result == IXGBE_XDP_CONSUMED) + goto out_failure; break; case XDP_REDIRECT: err = xdp_do_redirect(adapter->netdev, xdp, xdp_prog); - if (!err) - result = IXGBE_XDP_REDIR; - else - result = IXGBE_XDP_CONSUMED; + if (err) + goto out_failure; + result = IXGBE_XDP_REDIR; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c index 91ad5b902673c..f72d2978263b9 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c @@ -106,9 +106,10 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, if (likely(act == XDP_REDIRECT)) { err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog); - result = !err ? IXGBE_XDP_REDIR : IXGBE_XDP_CONSUMED; + if (err) + goto out_failure; rcu_read_unlock(); - return result; + return IXGBE_XDP_REDIR; } switch (act) { @@ -116,16 +117,17 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter, break; case XDP_TX: xdpf = xdp_convert_buff_to_frame(xdp); - if (unlikely(!xdpf)) { - result = IXGBE_XDP_CONSUMED; - break; - } + if (unlikely(!xdpf)) + goto out_failure; result = ixgbe_xmit_xdp_ring(adapter, xdpf); + if (result == IXGBE_XDP_CONSUMED) + goto out_failure; break; default: bpf_warn_invalid_xdp_action(act); fallthrough; case XDP_ABORTED: +out_failure: trace_xdp_exception(rx_ring->netdev, xdp_prog, act); fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: From d866a6e61a4de0eb13973eb8a85c780323876711 Mon Sep 17 00:00:00 2001 From: Vignesh Raghavendra Date: Mon, 10 May 2021 23:36:01 +0530 Subject: [PATCH 059/269] arm64: dts: ti: j7200-main: Mark Main NAVSS as dma-coherent [ Upstream commit 52ae30f55a2a40cff549fac95de82f25403bd387 ] Traffic through main NAVSS interconnect is coherent wrt ARM caches on J7200 SoC. Add missing dma-coherent property to main_navss node. Also add dma-ranges to be consistent with mcu_navss node and with AM65/J721e main_navss and mcu_navss nodes. Fixes: d361ed88455fe ("arm64: dts: ti: Add support for J7200 SoC") Signed-off-by: Vignesh Raghavendra Reviewed-by: Peter Ujfalusi Signed-off-by: Nishanth Menon Link: https://lore.kernel.org/r/20210510180601.19458-1-vigneshr@ti.com Signed-off-by: Sasha Levin --- arch/arm64/boot/dts/ti/k3-j7200-main.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi index 72d6496e88dd4..689538244392c 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi @@ -78,6 +78,8 @@ #size-cells = <2>; ranges = <0x00 0x30000000 0x00 0x30000000 0x00 0x0c400000>; ti,sci-dev-id = <199>; + dma-coherent; + dma-ranges; main_navss_intr: interrupt-controller1 { compatible = "ti,sci-intr"; From 426ba49ec50b5f1c8c10fdef095b5718b673d121 Mon Sep 17 00:00:00 2001 From: Jens Wiklander Date: Mon, 19 Apr 2021 18:46:30 +0200 Subject: [PATCH 060/269] optee: use export_uuid() to copy client UUID [ Upstream commit 673c7aa2436bfc857b92417f3e590a297c586dde ] Prior to this patch optee_open_session() was making assumptions about the internal format of uuid_t by casting a memory location in a parameter struct to uuid_t *. Fix this using export_uuid() to get a well defined binary representation and also add an octets field in struct optee_msg_param in order to avoid casting. Fixes: c5b4312bea5d ("tee: optee: Add support for session login client UUID generation") Suggested-by: Andy Shevchenko Signed-off-by: Jens Wiklander Signed-off-by: Sasha Levin --- drivers/tee/optee/call.c | 6 ++++-- drivers/tee/optee/optee_msg.h | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c index 780d7c4fd7565..0790de29f0ca2 100644 --- a/drivers/tee/optee/call.c +++ b/drivers/tee/optee/call.c @@ -217,6 +217,7 @@ int optee_open_session(struct tee_context *ctx, struct optee_msg_arg *msg_arg; phys_addr_t msg_parg; struct optee_session *sess = NULL; + uuid_t client_uuid; /* +2 for the meta parameters added below */ shm = get_msg_arg(ctx, arg->num_params + 2, &msg_arg, &msg_parg); @@ -237,10 +238,11 @@ int optee_open_session(struct tee_context *ctx, memcpy(&msg_arg->params[0].u.value, arg->uuid, sizeof(arg->uuid)); msg_arg->params[1].u.value.c = arg->clnt_login; - rc = tee_session_calc_client_uuid((uuid_t *)&msg_arg->params[1].u.value, - arg->clnt_login, arg->clnt_uuid); + rc = tee_session_calc_client_uuid(&client_uuid, arg->clnt_login, + arg->clnt_uuid); if (rc) goto out; + export_uuid(msg_arg->params[1].u.octets, &client_uuid); rc = optee_to_msg_param(msg_arg->params + 2, arg->num_params, param); if (rc) diff --git a/drivers/tee/optee/optee_msg.h b/drivers/tee/optee/optee_msg.h index 7b2d919da2ace..c7ac7d02d6cc9 100644 --- a/drivers/tee/optee/optee_msg.h +++ b/drivers/tee/optee/optee_msg.h @@ -9,7 +9,7 @@ #include /* - * This file defines the OP-TEE message protocol used to communicate + * This file defines the OP-TEE message protocol (ABI) used to communicate * with an instance of OP-TEE running in secure world. * * This file is divided into three sections. @@ -146,9 +146,10 @@ struct optee_msg_param_value { * @tmem: parameter by temporary memory reference * @rmem: parameter by registered memory reference * @value: parameter by opaque value + * @octets: parameter by octet string * * @attr & OPTEE_MSG_ATTR_TYPE_MASK indicates if tmem, rmem or value is used in - * the union. OPTEE_MSG_ATTR_TYPE_VALUE_* indicates value, + * the union. OPTEE_MSG_ATTR_TYPE_VALUE_* indicates value or octets, * OPTEE_MSG_ATTR_TYPE_TMEM_* indicates @tmem and * OPTEE_MSG_ATTR_TYPE_RMEM_* indicates @rmem, * OPTEE_MSG_ATTR_TYPE_NONE indicates that none of the members are used. @@ -159,6 +160,7 @@ struct optee_msg_param { struct optee_msg_param_tmem tmem; struct optee_msg_param_rmem rmem; struct optee_msg_param_value value; + u8 octets[24]; } u; }; From d551b8e857775a6ea48f365d9611fe5c470008a3 Mon Sep 17 00:00:00 2001 From: Tony Lindgren Date: Tue, 18 May 2021 09:47:23 +0300 Subject: [PATCH 061/269] bus: ti-sysc: Fix am335x resume hang for usb otg module [ Upstream commit 4d7b324e231366ea772ab10df46be31273ca39af ] On am335x, suspend and resume only works once, and the system hangs if suspend is attempted again. However, turns out suspend and resume works fine multiple times if the USB OTG driver for musb controller is loaded. The issue is caused my the interconnect target module losing context during suspend, and it needs a restore on resume to be reconfigure again as debugged earlier by Dave Gerlach . There are also other modules that need a restore on resume, like gpmc as noted by Dave. So let's add a common way to restore an interconnect target module based on a quirk flag. For now, let's enable the quirk for am335x otg only to fix the suspend and resume issue. As gpmc is not causing hangs based on tests with BeagleBone, let's patch gpmc separately. For gpmc, we also need a hardware reset done before restore according to Dave. To reinit the modules, we decouple system suspend from PM runtime. We replace calls to pm_runtime_force_suspend() and pm_runtime_force_resume() with direct calls to internal functions and rely on the driver internal state. There no point trying to handle complex system suspend and resume quirks via PM runtime. This is issue should have already been noticed with commit 1819ef2e2d12 ("bus: ti-sysc: Use swsup quirks also for am335x musb") when quirk handling was added for am335x otg for swsup. But the issue went unnoticed as having musb driver loaded hides the issue, and suspend and resume works once without the driver loaded. Fixes: 1819ef2e2d12 ("bus: ti-sysc: Use swsup quirks also for am335x musb") Suggested-by: Dave Gerlach Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin --- drivers/bus/ti-sysc.c | 53 +++++++++++++++++++++++++-- include/linux/platform_data/ti-sysc.h | 1 + 2 files changed, 51 insertions(+), 3 deletions(-) diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index 9afbe4992a1dd..b7f8c6074a157 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -1330,6 +1330,34 @@ static int __maybe_unused sysc_runtime_resume(struct device *dev) return error; } +static int sysc_reinit_module(struct sysc *ddata, bool leave_enabled) +{ + struct device *dev = ddata->dev; + int error; + + /* Disable target module if it is enabled */ + if (ddata->enabled) { + error = sysc_runtime_suspend(dev); + if (error) + dev_warn(dev, "reinit suspend failed: %i\n", error); + } + + /* Enable target module */ + error = sysc_runtime_resume(dev); + if (error) + dev_warn(dev, "reinit resume failed: %i\n", error); + + if (leave_enabled) + return error; + + /* Disable target module if no leave_enabled was set */ + error = sysc_runtime_suspend(dev); + if (error) + dev_warn(dev, "reinit suspend failed: %i\n", error); + + return error; +} + static int __maybe_unused sysc_noirq_suspend(struct device *dev) { struct sysc *ddata; @@ -1340,12 +1368,18 @@ static int __maybe_unused sysc_noirq_suspend(struct device *dev) (SYSC_QUIRK_LEGACY_IDLE | SYSC_QUIRK_NO_IDLE)) return 0; - return pm_runtime_force_suspend(dev); + if (!ddata->enabled) + return 0; + + ddata->needs_resume = 1; + + return sysc_runtime_suspend(dev); } static int __maybe_unused sysc_noirq_resume(struct device *dev) { struct sysc *ddata; + int error = 0; ddata = dev_get_drvdata(dev); @@ -1353,7 +1387,19 @@ static int __maybe_unused sysc_noirq_resume(struct device *dev) (SYSC_QUIRK_LEGACY_IDLE | SYSC_QUIRK_NO_IDLE)) return 0; - return pm_runtime_force_resume(dev); + if (ddata->cfg.quirks & SYSC_QUIRK_REINIT_ON_RESUME) { + error = sysc_reinit_module(ddata, ddata->needs_resume); + if (error) + dev_warn(dev, "noirq_resume failed: %i\n", error); + } else if (ddata->needs_resume) { + error = sysc_runtime_resume(dev); + if (error) + dev_warn(dev, "noirq_resume failed: %i\n", error); + } + + ddata->needs_resume = 0; + + return error; } static const struct dev_pm_ops sysc_pm_ops = { @@ -1462,7 +1508,8 @@ static const struct sysc_revision_quirk sysc_revision_quirks[] = { SYSC_QUIRK("usb_otg_hs", 0, 0x400, 0x404, 0x408, 0x00000050, 0xffffffff, SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_SWSUP_MSTANDBY), SYSC_QUIRK("usb_otg_hs", 0, 0, 0x10, -ENODEV, 0x4ea2080d, 0xffffffff, - SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_SWSUP_MSTANDBY), + SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_SWSUP_MSTANDBY | + SYSC_QUIRK_REINIT_ON_RESUME), SYSC_QUIRK("wdt", 0, 0, 0x10, 0x14, 0x502a0500, 0xfffff0f0, SYSC_MODULE_QUIRK_WDT), /* PRUSS on am3, am4 and am5 */ diff --git a/include/linux/platform_data/ti-sysc.h b/include/linux/platform_data/ti-sysc.h index fafc1beea504a..9837fb011f2fb 100644 --- a/include/linux/platform_data/ti-sysc.h +++ b/include/linux/platform_data/ti-sysc.h @@ -50,6 +50,7 @@ struct sysc_regbits { s8 emufree_shift; }; +#define SYSC_QUIRK_REINIT_ON_RESUME BIT(27) #define SYSC_QUIRK_GPMC_DEBUG BIT(26) #define SYSC_MODULE_QUIRK_ENA_RESETDONE BIT(25) #define SYSC_MODULE_QUIRK_PRUSS BIT(24) From a3716c19330de0f04292ff8dc3eb9af0f1b87164 Mon Sep 17 00:00:00 2001 From: Michael Walle Date: Thu, 8 Apr 2021 13:02:18 +0200 Subject: [PATCH 062/269] arm64: dts: ls1028a: fix memory node [ Upstream commit dabea675faf16e8682aa478ff3ce65dd775620bc ] While enabling EDAC support for the LS1028A it was discovered that the memory node has a wrong endianness setting as well as a wrong interrupt assignment. Fix both. This was tested on a sl28 board. To force ECC errors, you can use the error injection supported by the controller in hardware (with CONFIG_EDAC_DEBUG enabled): # enable error injection $ echo 0x100 > /sys/devices/system/edac/mc/mc0/inject_ctrl # flip lowest bit of the data $ echo 0x1 > /sys/devices/system/edac/mc/mc0/inject_data_lo Fixes: 8897f3255c9c ("arm64: dts: Add support for NXP LS1028A SoC") Signed-off-by: Michael Walle Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi index 62f4dcb96e70d..f3b58bb9b8408 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi @@ -192,8 +192,8 @@ ddr: memory-controller@1080000 { compatible = "fsl,qoriq-memory-controller"; reg = <0x0 0x1080000 0x0 0x1000>; - interrupts = ; - big-endian; + interrupts = ; + little-endian; }; dcfg: syscon@1e00000 { From 4f323ce68e75b5fc8d34e408f7033f2a1477ca59 Mon Sep 17 00:00:00 2001 From: Lucas Stach Date: Fri, 7 May 2021 21:44:40 +0200 Subject: [PATCH 063/269] arm64: dts: zii-ultra: fix 12V_MAIN voltage [ Upstream commit ac0cbf9d13dccfd09bebc2f8f5697b6d3ffe27c4 ] As this is a fixed regulator on the board there was no harm in the wrong voltage being specified, apart from a confusing reporting to userspace. Fixes: 4a13b3bec3b4 ("arm64: dts: imx: add Zii Ultra board support") Signed-off-by: Lucas Stach Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi b/arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi index fa7a041ffcfde..825c83c71a9f1 100644 --- a/arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi +++ b/arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi @@ -45,8 +45,8 @@ reg_12p0_main: regulator-12p0-main { compatible = "regulator-fixed"; regulator-name = "12V_MAIN"; - regulator-min-microvolt = <5000000>; - regulator-max-microvolt = <5000000>; + regulator-min-microvolt = <12000000>; + regulator-max-microvolt = <12000000>; regulator-always-on; }; From 8aa4700de52d2ed132bfedc747e59f306c1071a6 Mon Sep 17 00:00:00 2001 From: Michael Walle Date: Fri, 14 May 2021 20:55:52 +0200 Subject: [PATCH 064/269] arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage [ Upstream commit 25201269c6ec3e9398426962ccdd55428261f7d0 ] During hardware validation it was noticed that the clock isn't continuously enabled when there is no link. This is because the 125MHz clock is derived from the internal PLL which seems to go into some kind of power-down mode every once in a while. The LS1028A expects a contiuous clock. Thus enable the PLL all the time. Also, the RGMII pad voltage is wrong. It was configured to 2.5V (that is the VDDH regulator). The correct voltage is 1.8V, i.e. the VDDIO regulator. This fix is for the freescale/fsl-ls1028a-kontron-sl28-var4.dts. Fixes: 815364d0424e ("arm64: dts: freescale: add Kontron sl28 support") Signed-off-by: Michael Walle Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- .../boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts index df212ed5bb942..e65d1c477e2ce 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts @@ -31,11 +31,10 @@ reg = <0x4>; eee-broken-1000t; eee-broken-100tx; - qca,clk-out-frequency = <125000000>; qca,clk-out-strength = ; - - vddio-supply = <&vddh>; + qca,keep-pll-enabled; + vddio-supply = <&vddio>; vddio: vddio-regulator { regulator-name = "VDDIO"; From a776ea1eca2b7c69d6c036e27a4e78a41bd35517 Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Thu, 20 May 2021 18:42:12 -0300 Subject: [PATCH 065/269] ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property [ Upstream commit 7c8f0338cdacc90fdf6468adafa8e27952987f00 ] According to Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml, the correct name of the property is 'fsl,tuning-step'. Fix it accordingly. Signed-off-by: Fabio Estevam Fixes: ae7b3384b61b ("ARM: dts: Add support for 96Boards Meerkat96 board") Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- arch/arm/boot/dts/imx7d-meerkat96.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx7d-meerkat96.dts b/arch/arm/boot/dts/imx7d-meerkat96.dts index 5339210b63d0f..dd8003bd1fc09 100644 --- a/arch/arm/boot/dts/imx7d-meerkat96.dts +++ b/arch/arm/boot/dts/imx7d-meerkat96.dts @@ -193,7 +193,7 @@ pinctrl-names = "default"; pinctrl-0 = <&pinctrl_usdhc1>; keep-power-in-suspend; - tuning-step = <2>; + fsl,tuning-step = <2>; vmmc-supply = <®_3p3v>; no-1-8-v; broken-cd; From 67ae12a57b342e6dc5de6fae3fb7cda493abc257 Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Thu, 20 May 2021 18:42:13 -0300 Subject: [PATCH 066/269] ARM: dts: imx7d-pico: Fix the 'tuning-step' property [ Upstream commit 0e2fa4959c4f44815ce33e46e4054eeb0f346053 ] According to Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml, the correct name of the property is 'fsl,tuning-step'. Fix it accordingly. Signed-off-by: Fabio Estevam Fixes: f13f571ac8a1 ("ARM: dts: imx7d-pico: Extend peripherals support") Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- arch/arm/boot/dts/imx7d-pico.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx7d-pico.dtsi b/arch/arm/boot/dts/imx7d-pico.dtsi index e57da0d32b98d..e519897fae082 100644 --- a/arch/arm/boot/dts/imx7d-pico.dtsi +++ b/arch/arm/boot/dts/imx7d-pico.dtsi @@ -351,7 +351,7 @@ pinctrl-2 = <&pinctrl_usdhc1_200mhz>; cd-gpios = <&gpio5 0 GPIO_ACTIVE_LOW>; bus-width = <4>; - tuning-step = <2>; + fsl,tuning-step = <2>; vmmc-supply = <®_3p3v>; wakeup-source; no-1-8-v; From 5592731e13cc0bdc81ffd839359ce43f5c5e211e Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Fri, 21 May 2021 09:54:07 +0200 Subject: [PATCH 067/269] ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells [ Upstream commit b73eb6b3b91ff7d76cff5f8c7ab92fe0c51e3829 ] According to the DT bindings, #gpio-cells must be two. Fixes: 63e71fedc07c4ece ("ARM: dts: Add support for emtrion emCON-MX6 series") Signed-off-by: Geert Uytterhoeven Reviewed-by: Laurent Pinchart Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin --- arch/arm/boot/dts/imx6qdl-emcon-avari.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx6qdl-emcon-avari.dtsi b/arch/arm/boot/dts/imx6qdl-emcon-avari.dtsi index 828cf3e39784a..c4e146f3341bb 100644 --- a/arch/arm/boot/dts/imx6qdl-emcon-avari.dtsi +++ b/arch/arm/boot/dts/imx6qdl-emcon-avari.dtsi @@ -126,7 +126,7 @@ compatible = "nxp,pca8574"; reg = <0x3a>; gpio-controller; - #gpio-cells = <1>; + #gpio-cells = <2>; }; }; From 0d83aec6e0102e014eafdd453bdbc61b4d193029 Mon Sep 17 00:00:00 2001 From: Tony Lindgren Date: Tue, 25 May 2021 09:08:23 +0300 Subject: [PATCH 068/269] bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act [ Upstream commit c8692ad416dcc420ce1b403596a425c8f4c2720b ] Looks like the swsup_sidle_act quirk handling is unreliable for serial ports. The serial ports just eventually stop idling until woken up and re-idled again. As the serial port not idling blocks any deeper SoC idle states, it's adds an annoying random flakeyness for power management. Let's just switch to swsup_sidle quirk instead like we already do for omap3 uarts. This means we manually idle the port instead of trying to use the hardware autoidle features when not in use. For more details on why the serial ports have been using swsup_idle_act, see commit 66dde54e978a ("ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes"). It seems that the swsup_idle_act quirk handling is not enough though, and for example the TI Android kernel changed to using swsup_sidle with commit 77c34c84e1e0 ("OMAP4: HWMOD: UART1: disable smart-idle."). Fixes: b4a9a7a38917 ("bus: ti-sysc: Handle swsup idle mode quirks") Cc: Carl Philipp Klemm Cc: Ivan Jelincic Cc: Merlijn Wajer Cc: Pavel Machek Cc: Sebastian Reichel Cc: Sicelo A. Mhlongo Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin --- drivers/bus/ti-sysc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index b7f8c6074a157..818dc7f54f038 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -1450,9 +1450,9 @@ static const struct sysc_revision_quirk sysc_revision_quirks[] = { SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE), /* Uarts on omap4 and later */ SYSC_QUIRK("uart", 0, 0x50, 0x54, 0x58, 0x50411e03, 0xffff00ff, - SYSC_QUIRK_SWSUP_SIDLE_ACT | SYSC_QUIRK_LEGACY_IDLE), + SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE), SYSC_QUIRK("uart", 0, 0x50, 0x54, 0x58, 0x47422e03, 0xffffffff, - SYSC_QUIRK_SWSUP_SIDLE_ACT | SYSC_QUIRK_LEGACY_IDLE), + SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE), /* Quirks that need to be set based on the module address */ SYSC_QUIRK("mcpdm", 0x40132000, 0, 0x10, -ENODEV, 0x50000800, 0xffffffff, From e31ae45ed1d323b4409c3575299f6c203f0b3d2a Mon Sep 17 00:00:00 2001 From: Hoang Le Date: Thu, 25 Mar 2021 08:56:41 +0700 Subject: [PATCH 069/269] tipc: add extack messages for bearer/media failure [ Upstream commit b83e214b2e04204f1fc674574362061492c37245 ] Add extack error messages for -EINVAL errors when enabling bearer, getting/setting properties for a media/bearer Acked-by: Jon Maloy Signed-off-by: Hoang Le Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/tipc/bearer.c | 50 +++++++++++++++++++++++++++++++++++++---------- 1 file changed, 40 insertions(+), 10 deletions(-) diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 6504141104521..4d0e11623e5cc 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -234,7 +234,8 @@ void tipc_bearer_remove_dest(struct net *net, u32 bearer_id, u32 dest) */ static int tipc_enable_bearer(struct net *net, const char *name, u32 disc_domain, u32 prio, - struct nlattr *attr[]) + struct nlattr *attr[], + struct netlink_ext_ack *extack) { struct tipc_net *tn = tipc_net(net); struct tipc_bearer_names b_names; @@ -248,17 +249,20 @@ static int tipc_enable_bearer(struct net *net, const char *name, if (!bearer_name_validate(name, &b_names)) { errstr = "illegal name"; + NL_SET_ERR_MSG(extack, "Illegal name"); goto rejected; } if (prio > TIPC_MAX_LINK_PRI && prio != TIPC_MEDIA_LINK_PRI) { errstr = "illegal priority"; + NL_SET_ERR_MSG(extack, "Illegal priority"); goto rejected; } m = tipc_media_find(b_names.media_name); if (!m) { errstr = "media not registered"; + NL_SET_ERR_MSG(extack, "Media not registered"); goto rejected; } @@ -272,6 +276,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, break; if (!strcmp(name, b->name)) { errstr = "already enabled"; + NL_SET_ERR_MSG(extack, "Already enabled"); goto rejected; } bearer_id++; @@ -283,6 +288,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, name, prio); if (prio == TIPC_MIN_LINK_PRI) { errstr = "cannot adjust to lower"; + NL_SET_ERR_MSG(extack, "Cannot adjust to lower"); goto rejected; } pr_warn("Bearer <%s>: trying with adjusted priority\n", name); @@ -293,6 +299,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, if (bearer_id >= MAX_BEARERS) { errstr = "max 3 bearers permitted"; + NL_SET_ERR_MSG(extack, "Max 3 bearers permitted"); goto rejected; } @@ -306,6 +313,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, if (res) { kfree(b); errstr = "failed to enable media"; + NL_SET_ERR_MSG(extack, "Failed to enable media"); goto rejected; } @@ -322,6 +330,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, if (res) { bearer_disable(net, b); errstr = "failed to create discoverer"; + NL_SET_ERR_MSG(extack, "Failed to create discoverer"); goto rejected; } @@ -894,6 +903,7 @@ int tipc_nl_bearer_get(struct sk_buff *skb, struct genl_info *info) bearer = tipc_bearer_find(net, name); if (!bearer) { err = -EINVAL; + NL_SET_ERR_MSG(info->extack, "Bearer not found"); goto err_out; } @@ -933,8 +943,10 @@ int __tipc_nl_bearer_disable(struct sk_buff *skb, struct genl_info *info) name = nla_data(attrs[TIPC_NLA_BEARER_NAME]); bearer = tipc_bearer_find(net, name); - if (!bearer) + if (!bearer) { + NL_SET_ERR_MSG(info->extack, "Bearer not found"); return -EINVAL; + } bearer_disable(net, bearer); @@ -992,7 +1004,8 @@ int __tipc_nl_bearer_enable(struct sk_buff *skb, struct genl_info *info) prio = nla_get_u32(props[TIPC_NLA_PROP_PRIO]); } - return tipc_enable_bearer(net, bearer, domain, prio, attrs); + return tipc_enable_bearer(net, bearer, domain, prio, attrs, + info->extack); } int tipc_nl_bearer_enable(struct sk_buff *skb, struct genl_info *info) @@ -1031,6 +1044,7 @@ int tipc_nl_bearer_add(struct sk_buff *skb, struct genl_info *info) b = tipc_bearer_find(net, name); if (!b) { rtnl_unlock(); + NL_SET_ERR_MSG(info->extack, "Bearer not found"); return -EINVAL; } @@ -1071,8 +1085,10 @@ int __tipc_nl_bearer_set(struct sk_buff *skb, struct genl_info *info) name = nla_data(attrs[TIPC_NLA_BEARER_NAME]); b = tipc_bearer_find(net, name); - if (!b) + if (!b) { + NL_SET_ERR_MSG(info->extack, "Bearer not found"); return -EINVAL; + } if (attrs[TIPC_NLA_BEARER_PROP]) { struct nlattr *props[TIPC_NLA_PROP_MAX + 1]; @@ -1091,12 +1107,18 @@ int __tipc_nl_bearer_set(struct sk_buff *skb, struct genl_info *info) if (props[TIPC_NLA_PROP_WIN]) b->max_win = nla_get_u32(props[TIPC_NLA_PROP_WIN]); if (props[TIPC_NLA_PROP_MTU]) { - if (b->media->type_id != TIPC_MEDIA_TYPE_UDP) + if (b->media->type_id != TIPC_MEDIA_TYPE_UDP) { + NL_SET_ERR_MSG(info->extack, + "MTU property is unsupported"); return -EINVAL; + } #ifdef CONFIG_TIPC_MEDIA_UDP if (tipc_udp_mtu_bad(nla_get_u32 - (props[TIPC_NLA_PROP_MTU]))) + (props[TIPC_NLA_PROP_MTU]))) { + NL_SET_ERR_MSG(info->extack, + "MTU value is out-of-range"); return -EINVAL; + } b->mtu = nla_get_u32(props[TIPC_NLA_PROP_MTU]); tipc_node_apply_property(net, b, TIPC_NLA_PROP_MTU); #endif @@ -1224,6 +1246,7 @@ int tipc_nl_media_get(struct sk_buff *skb, struct genl_info *info) rtnl_lock(); media = tipc_media_find(name); if (!media) { + NL_SET_ERR_MSG(info->extack, "Media not found"); err = -EINVAL; goto err_out; } @@ -1260,9 +1283,10 @@ int __tipc_nl_media_set(struct sk_buff *skb, struct genl_info *info) name = nla_data(attrs[TIPC_NLA_MEDIA_NAME]); m = tipc_media_find(name); - if (!m) + if (!m) { + NL_SET_ERR_MSG(info->extack, "Media not found"); return -EINVAL; - + } if (attrs[TIPC_NLA_MEDIA_PROP]) { struct nlattr *props[TIPC_NLA_PROP_MAX + 1]; @@ -1278,12 +1302,18 @@ int __tipc_nl_media_set(struct sk_buff *skb, struct genl_info *info) if (props[TIPC_NLA_PROP_WIN]) m->max_win = nla_get_u32(props[TIPC_NLA_PROP_WIN]); if (props[TIPC_NLA_PROP_MTU]) { - if (m->type_id != TIPC_MEDIA_TYPE_UDP) + if (m->type_id != TIPC_MEDIA_TYPE_UDP) { + NL_SET_ERR_MSG(info->extack, + "MTU property is unsupported"); return -EINVAL; + } #ifdef CONFIG_TIPC_MEDIA_UDP if (tipc_udp_mtu_bad(nla_get_u32 - (props[TIPC_NLA_PROP_MTU]))) + (props[TIPC_NLA_PROP_MTU]))) { + NL_SET_ERR_MSG(info->extack, + "MTU value is out-of-range"); return -EINVAL; + } m->mtu = nla_get_u32(props[TIPC_NLA_PROP_MTU]); #endif } From fdf1e5eec3eddef6255cbc149e85877c924458d8 Mon Sep 17 00:00:00 2001 From: Hoang Le Date: Thu, 1 Apr 2021 09:30:48 +0700 Subject: [PATCH 070/269] tipc: fix unique bearer names sanity check [ Upstream commit f20a46c3044c3f75232b3d0e2d09af9b25efaf45 ] When enabling a bearer by name, we don't sanity check its name with higher slot in bearer list. This may have the effect that the name of an already enabled bearer bypasses the check. To fix the above issue, we just perform an extra checking with all existing bearers. Fixes: cb30a63384bc9 ("tipc: refactor function tipc_enable_bearer()") Cc: stable@vger.kernel.org Acked-by: Jon Maloy Signed-off-by: Hoang Le Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/tipc/bearer.c | 46 +++++++++++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 19 deletions(-) diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 4d0e11623e5cc..12e535b43d887 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -246,6 +246,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, int bearer_id = 0; int res = -EINVAL; char *errstr = ""; + u32 i; if (!bearer_name_validate(name, &b_names)) { errstr = "illegal name"; @@ -270,31 +271,38 @@ static int tipc_enable_bearer(struct net *net, const char *name, prio = m->priority; /* Check new bearer vs existing ones and find free bearer id if any */ - while (bearer_id < MAX_BEARERS) { - b = rtnl_dereference(tn->bearer_list[bearer_id]); - if (!b) - break; + bearer_id = MAX_BEARERS; + i = MAX_BEARERS; + while (i-- != 0) { + b = rtnl_dereference(tn->bearer_list[i]); + if (!b) { + bearer_id = i; + continue; + } if (!strcmp(name, b->name)) { errstr = "already enabled"; NL_SET_ERR_MSG(extack, "Already enabled"); goto rejected; } - bearer_id++; - if (b->priority != prio) - continue; - if (++with_this_prio <= 2) - continue; - pr_warn("Bearer <%s>: already 2 bearers with priority %u\n", - name, prio); - if (prio == TIPC_MIN_LINK_PRI) { - errstr = "cannot adjust to lower"; - NL_SET_ERR_MSG(extack, "Cannot adjust to lower"); - goto rejected; + + if (b->priority == prio && + (++with_this_prio > 2)) { + pr_warn("Bearer <%s>: already 2 bearers with priority %u\n", + name, prio); + + if (prio == TIPC_MIN_LINK_PRI) { + errstr = "cannot adjust to lower"; + NL_SET_ERR_MSG(extack, "Cannot adjust to lower"); + goto rejected; + } + + pr_warn("Bearer <%s>: trying with adjusted priority\n", + name); + prio--; + bearer_id = MAX_BEARERS; + i = MAX_BEARERS; + with_this_prio = 1; } - pr_warn("Bearer <%s>: trying with adjusted priority\n", name); - prio--; - bearer_id = 0; - with_this_prio = 1; } if (bearer_id >= MAX_BEARERS) { From 282c9eeda6c2c2f6932aef0fb85b352b6210b9e6 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 16 Apr 2021 16:05:56 +0200 Subject: [PATCH 071/269] serial: stm32: fix threaded interrupt handling [ Upstream commit e359b4411c2836cf87c8776682d1b594635570de ] When DMA is enabled the receive handler runs in a threaded handler, but the primary handler up until very recently neither disabled interrupts in the device or used IRQF_ONESHOT. This would lead to a deadlock if an interrupt comes in while the threaded receive handler is running under the port lock. Commit ad7676812437 ("serial: stm32: fix a deadlock condition with wakeup event") claimed to fix an unrelated deadlock, but unfortunately also disabled interrupts in the threaded handler. While this prevents the deadlock mentioned in the previous paragraph it also defeats the purpose of using a threaded handler in the first place. Fix this by making the interrupt one-shot and not disabling interrupts in the threaded handler. Note that (receive) DMA must not be used for a console port as the threaded handler could be interrupted while holding the port lock, something which could lead to a deadlock in case an interrupt handler ends up calling printk. Fixes: ad7676812437 ("serial: stm32: fix a deadlock condition with wakeup event") Fixes: 3489187204eb ("serial: stm32: adding dma support") Cc: stable@vger.kernel.org # 4.9 Cc: Alexandre TORGUE Cc: Gerald Baeza Reviewed-by: Valentin Caron Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20210416140557.25177-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/tty/serial/stm32-usart.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c index 2cf9fc915510c..844059861f9e1 100644 --- a/drivers/tty/serial/stm32-usart.c +++ b/drivers/tty/serial/stm32-usart.c @@ -213,14 +213,11 @@ static void stm32_usart_receive_chars(struct uart_port *port, bool threaded) struct tty_port *tport = &port->state->port; struct stm32_port *stm32_port = to_stm32_port(port); const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs; - unsigned long c, flags; + unsigned long c; u32 sr; char flag; - if (threaded) - spin_lock_irqsave(&port->lock, flags); - else - spin_lock(&port->lock); + spin_lock(&port->lock); while (stm32_usart_pending_rx(port, &sr, &stm32_port->last_res, threaded)) { @@ -277,10 +274,7 @@ static void stm32_usart_receive_chars(struct uart_port *port, bool threaded) uart_insert_char(port, sr, USART_SR_ORE, c, flag); } - if (threaded) - spin_unlock_irqrestore(&port->lock, flags); - else - spin_unlock(&port->lock); + spin_unlock(&port->lock); tty_flip_buffer_push(tport); } @@ -653,7 +647,8 @@ static int stm32_usart_startup(struct uart_port *port) ret = request_threaded_irq(port->irq, stm32_usart_interrupt, stm32_usart_threaded_interrupt, - IRQF_NO_SUSPEND, name, port); + IRQF_ONESHOT | IRQF_NO_SUSPEND, + name, port); if (ret) return ret; @@ -1126,6 +1121,13 @@ static int stm32_usart_of_dma_rx_probe(struct stm32_port *stm32port, struct dma_async_tx_descriptor *desc = NULL; int ret; + /* + * Using DMA and threaded handler for the console could lead to + * deadlocks. + */ + if (uart_console(port)) + return -ENODEV; + /* Request DMA RX channel */ stm32port->rx_ch = dma_request_slave_channel(dev, "rx"); if (!stm32port->rx_ch) { From 3c23e23c7ad9844a645f4e2bd8ec34a0a2ee5514 Mon Sep 17 00:00:00 2001 From: Jisheng Zhang Date: Fri, 2 Apr 2021 21:29:08 +0800 Subject: [PATCH 072/269] riscv: vdso: fix and clean-up Makefile [ Upstream commit 772d7891e8b3b0baae7bb88a294d61fd07ba6d15 ] Running "make" on an already compiled kernel tree will rebuild the kernel even without any modifications: CALL linux/scripts/checksyscalls.sh CALL linux/scripts/atomic/check-atomics.sh CHK include/generated/compile.h SO2S arch/riscv/kernel/vdso/vdso-syms.S AS arch/riscv/kernel/vdso/vdso-syms.o AR arch/riscv/kernel/vdso/built-in.a AR arch/riscv/kernel/built-in.a AR arch/riscv/built-in.a GEN .version CHK include/generated/compile.h UPD include/generated/compile.h CC init/version.o AR init/built-in.a LD vmlinux.o The reason is "Any target that utilizes if_changed must be listed in $(targets), otherwise the command line check will fail, and the target will always be built" as explained by Documentation/kbuild/makefiles.rst Fix this build bug by adding vdso-syms.S to $(targets) At the same time, there are two trivial clean up modifications: - the vdso-dummy.o is not needed any more after so remove it. - vdso.lds is a generated file, so it should be prefixed with $(obj)/ instead of $(src)/ Fixes: c2c81bb2f691 ("RISC-V: Fix the VDSO symbol generaton for binutils-2.35+") Cc: stable@vger.kernel.org Signed-off-by: Jisheng Zhang Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin --- arch/riscv/kernel/vdso/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index ca2b40dfd24b8..24d936c147cdf 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -23,7 +23,7 @@ ifneq ($(c-gettimeofday-y),) endif # Build rules -targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds vdso-dummy.o +targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds vdso-syms.S obj-vdso := $(addprefix $(obj)/, $(obj-vdso)) obj-y += vdso.o vdso-syms.o @@ -41,7 +41,7 @@ KASAN_SANITIZE := n $(obj)/vdso.o: $(obj)/vdso.so # link rule for the .so file, .lds has to be first -$(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE +$(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE $(call if_changed,vdsold) LDFLAGS_vdso.so.dbg = -shared -s -soname=linux-vdso.so.1 \ --build-id=sha1 --hash-style=both --eh-frame-hdr From 0b2a990e5d2f76d020cb840c456e6ec5f0c27530 Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Fri, 7 May 2021 21:06:38 +0100 Subject: [PATCH 073/269] io_uring: fix link timeout refs [ Upstream commit a298232ee6b9a1d5d732aa497ff8be0d45b5bd82 ] WARNING: CPU: 0 PID: 10242 at lib/refcount.c:28 refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 RIP: 0010:refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] io_put_req fs/io_uring.c:2140 [inline] io_queue_linked_timeout fs/io_uring.c:6300 [inline] __io_queue_sqe+0xbef/0xec0 fs/io_uring.c:6354 io_submit_sqe fs/io_uring.c:6534 [inline] io_submit_sqes+0x2bbd/0x7c50 fs/io_uring.c:6660 __do_sys_io_uring_enter fs/io_uring.c:9240 [inline] __se_sys_io_uring_enter+0x256/0x1d60 fs/io_uring.c:9182 io_link_timeout_fn() should put only one reference of the linked timeout request, however in case of racing with the master request's completion first io_req_complete() puts one and then io_put_req_deferred() is called. Cc: stable@vger.kernel.org # 5.12+ Fixes: 9ae1f8dd372e0 ("io_uring: fix inconsistent lock state") Reported-by: syzbot+a2910119328ce8e7996f@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/ff51018ff29de5ffa76f09273ef48cb24c720368.1620417627.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index 369ec81033d67..958c463c11ebf 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6266,6 +6266,7 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer) if (prev) { io_async_find_and_cancel(ctx, req, prev->user_data, -ETIME); io_put_req_deferred(prev, 1); + io_put_req_deferred(req, 1); } else { io_cqring_add_event(req, -ETIME, 0); io_put_req_deferred(req, 1); From ec72cb50c1db39816eae7296686449bba8ca0b2e Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sun, 28 Feb 2021 22:35:15 +0000 Subject: [PATCH 074/269] io_uring: use better types for cflags [ Upstream commit 8c3f9cd1603d0e4af6c50ebc6d974ab7bdd03cf4 ] __io_cqring_fill_event() takes cflags as long to squeeze it into u32 in an CQE, awhile all users pass int or unsigned. Replace it with unsigned int and store it as u32 in struct io_completion to match CQE. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- fs/io_uring.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 958c463c11ebf..fdbaaf579cc60 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -545,7 +545,7 @@ struct io_statx { struct io_completion { struct file *file; struct list_head list; - int cflags; + u32 cflags; }; struct io_async_connect { @@ -1711,7 +1711,8 @@ static void io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force, } } -static void __io_cqring_fill_event(struct io_kiocb *req, long res, long cflags) +static void __io_cqring_fill_event(struct io_kiocb *req, long res, + unsigned int cflags) { struct io_ring_ctx *ctx = req->ctx; struct io_uring_cqe *cqe; From 58f4d45d8d4d391f60b6f0db6308df1994a265b3 Mon Sep 17 00:00:00 2001 From: James Zhu Date: Mon, 17 May 2021 16:39:17 -0400 Subject: [PATCH 075/269] drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 4a62542ae064e3b645d6bbf2295a6c05136956c6 ] Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu Reviewed-by: Leo Liu Acked-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 700621ddc02e2..c9c888be12285 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -345,15 +345,14 @@ static int vcn_v3_0_hw_init(void *handle) static int vcn_v3_0_hw_fini(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - struct amdgpu_ring *ring; int i; + cancel_delayed_work_sync(&adev->vcn.idle_work); + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) continue; - ring = &adev->vcn.inst[i].ring_dec; - if (!amdgpu_sriov_vf(adev)) { if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) || (adev->vcn.cur_state != AMD_PG_STATE_GATE && From c12946548001bf1591426ebfec56b5d859de6262 Mon Sep 17 00:00:00 2001 From: James Zhu Date: Wed, 19 May 2021 12:04:38 -0400 Subject: [PATCH 076/269] drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 23f10a571da5eaa63b7845d16e2f49837e841ab9 ] Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu Reviewed-by: Leo Liu Acked-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c index 63b3501823898..8c84e35c2719b 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c @@ -187,14 +187,14 @@ static int jpeg_v2_5_hw_init(void *handle) static int jpeg_v2_5_hw_fini(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - struct amdgpu_ring *ring; int i; + cancel_delayed_work_sync(&adev->vcn.idle_work); + for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) { if (adev->jpeg.harvest_config & (1 << i)) continue; - ring = &adev->jpeg.inst[i].ring_dec; if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, i, mmUVD_JRBC_STATUS)) jpeg_v2_5_set_powergating_state(adev, AMD_PG_STATE_GATE); From 7fa8ee00b5fab915856d03b50113be8fef1337dc Mon Sep 17 00:00:00 2001 From: James Zhu Date: Wed, 19 May 2021 12:08:20 -0400 Subject: [PATCH 077/269] drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 20ebbfd22f8115a1e4f60d3d289f66be4d47f1ec ] Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu Reviewed-by: Leo Liu Acked-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c index 9259e35f0f55a..e00c88abeaed1 100644 --- a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c @@ -159,9 +159,9 @@ static int jpeg_v3_0_hw_init(void *handle) static int jpeg_v3_0_hw_fini(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - struct amdgpu_ring *ring; - ring = &adev->jpeg.inst->ring_dec; + cancel_delayed_work_sync(&adev->vcn.idle_work); + if (adev->jpeg.cur_state != AMD_PG_STATE_GATE && RREG32_SOC15(JPEG, 0, mmUVD_JRBC_STATUS)) jpeg_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE); From 3795007c8dfc8bca176529bfeceb17c6f4ef7e44 Mon Sep 17 00:00:00 2001 From: Lin Ma Date: Tue, 25 May 2021 14:39:02 +0200 Subject: [PATCH 078/269] Bluetooth: fix the erroneous flush_work() order commit 6a137caec23aeb9e036cdfd8a46dd8a366460e5d upstream. In the cleanup routine for failed initialization of HCI device, the flush_work(&hdev->rx_work) need to be finished before the flush_work(&hdev->cmd_work). Otherwise, the hci_rx_work() can possibly invoke new cmd_work and cause a bug, like double free, in late processings. This was assigned CVE-2021-3564. This patch reorder the flush_work() to fix this bug. Cc: Marcel Holtmann Cc: Johan Hedberg Cc: Luiz Augusto von Dentz Cc: "David S. Miller" Cc: Jakub Kicinski Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lin Ma Signed-off-by: Hao Xiong Cc: stable Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman --- net/bluetooth/hci_core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index 0152bc6b67967..86ebfc6ae6986 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -1602,8 +1602,13 @@ static int hci_dev_do_open(struct hci_dev *hdev) } else { /* Init failed, cleanup */ flush_work(&hdev->tx_work); - flush_work(&hdev->cmd_work); + + /* Since hci_rx_work() is possible to awake new cmd_work + * it should be flushed first to avoid unexpected call of + * hci_cmd_work() + */ flush_work(&hdev->rx_work); + flush_work(&hdev->cmd_work); skb_queue_purge(&hdev->cmd_q); skb_queue_purge(&hdev->rx_q); From 74caf718cc7422a957aac381c73d798c0a999a65 Mon Sep 17 00:00:00 2001 From: Lin Ma Date: Sun, 30 May 2021 21:37:43 +0800 Subject: [PATCH 079/269] Bluetooth: use correct lock to prevent UAF of hdev object commit e305509e678b3a4af2b3cfd410f409f7cdaabb52 upstream. The hci_sock_dev_event() function will cleanup the hdev object for sockets even if this object may still be in used within the hci_sock_bound_ioctl() function, result in UAF vulnerability. This patch replace the BH context lock to serialize these affairs and prevent the race condition. Signed-off-by: Lin Ma Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman --- net/bluetooth/hci_sock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c index 251b9128f530a..eed0dd066e12c 100644 --- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -762,7 +762,7 @@ void hci_sock_dev_event(struct hci_dev *hdev, int event) /* Detach sockets from device */ read_lock(&hci_sk_list.lock); sk_for_each(sk, &hci_sk_list.head) { - bh_lock_sock_nested(sk); + lock_sock(sk); if (hci_pi(sk)->hdev == hdev) { hci_pi(sk)->hdev = NULL; sk->sk_err = EPIPE; @@ -771,7 +771,7 @@ void hci_sock_dev_event(struct hci_dev *hdev, int event) hci_dev_put(hdev); } - bh_unlock_sock(sk); + release_sock(sk); } read_unlock(&hci_sk_list.lock); } From d64fdbaec09b4143aa1965e92d0ff8c8c84de585 Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:32 +0200 Subject: [PATCH 080/269] wireguard: do not use -O3 commit cc5060ca0285efe2728bced399a1955a7ce808b2 upstream. Apparently, various versions of gcc have O3-related miscompiles. Looking at the difference between -O2 and -O3 for gcc 11 doesn't indicate miscompiles, but the difference also doesn't seem so significant for performance that it's worth risking. Link: https://lore.kernel.org/lkml/CAHk-=wjuoGyxDhAF8SsrTkN0-YfCx7E6jUN3ikC_tn2AKWTTsA@mail.gmail.com/ Link: https://lore.kernel.org/lkml/CAHmME9otB5Wwxp7H8bR_i2uH2esEMvoBMC8uEXBMH9p0q1s6Bw@mail.gmail.com/ Reported-by: Linus Torvalds Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/wireguard/Makefile b/drivers/net/wireguard/Makefile index fc52b2cb500b3..dbe1f8514efc3 100644 --- a/drivers/net/wireguard/Makefile +++ b/drivers/net/wireguard/Makefile @@ -1,5 +1,4 @@ -ccflags-y := -O3 -ccflags-y += -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt' +ccflags-y := -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt' ccflags-$(CONFIG_WIREGUARD_DEBUG) += -DDEBUG wireguard-y := main.o wireguard-y += noise.o From d4275889ac9c9f3a4afafc3ec2a13154cdd0b632 Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:34 +0200 Subject: [PATCH 081/269] wireguard: peer: allocate in kmem_cache commit a4e9f8e3287c9eb6bf70df982870980dd3341863 upstream. With deployments having upwards of 600k peers now, this somewhat heavy structure could benefit from more fine-grained allocations. Specifically, instead of using a 2048-byte slab for a 1544-byte object, we can now use 1544-byte objects directly, thus saving almost 25% per-peer, or with 600k peers, that's a savings of 303 MiB. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: 8b5553ace83c ("wireguard: queueing: get rid of per-peer ring buffers") Suggested-by: Arnd Bergmann Suggested-by: Matthew Wilcox Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/main.c | 7 +++++++ drivers/net/wireguard/peer.c | 21 +++++++++++++++++---- drivers/net/wireguard/peer.h | 3 +++ 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c index 7a7d5f1a80fc7..0a3ebfdac7947 100644 --- a/drivers/net/wireguard/main.c +++ b/drivers/net/wireguard/main.c @@ -28,6 +28,10 @@ static int __init mod_init(void) #endif wg_noise_init(); + ret = wg_peer_init(); + if (ret < 0) + goto err_peer; + ret = wg_device_init(); if (ret < 0) goto err_device; @@ -44,6 +48,8 @@ static int __init mod_init(void) err_netlink: wg_device_uninit(); err_device: + wg_peer_uninit(); +err_peer: return ret; } @@ -51,6 +57,7 @@ static void __exit mod_exit(void) { wg_genetlink_uninit(); wg_device_uninit(); + wg_peer_uninit(); } module_init(mod_init); diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c index cd5cb0292cb67..6936a82364bf7 100644 --- a/drivers/net/wireguard/peer.c +++ b/drivers/net/wireguard/peer.c @@ -15,6 +15,7 @@ #include #include +static struct kmem_cache *peer_cache; static atomic64_t peer_counter = ATOMIC64_INIT(0); struct wg_peer *wg_peer_create(struct wg_device *wg, @@ -29,10 +30,10 @@ struct wg_peer *wg_peer_create(struct wg_device *wg, if (wg->num_peers >= MAX_PEERS_PER_DEVICE) return ERR_PTR(ret); - peer = kzalloc(sizeof(*peer), GFP_KERNEL); + peer = kmem_cache_zalloc(peer_cache, GFP_KERNEL); if (unlikely(!peer)) return ERR_PTR(ret); - if (dst_cache_init(&peer->endpoint_cache, GFP_KERNEL)) + if (unlikely(dst_cache_init(&peer->endpoint_cache, GFP_KERNEL))) goto err; peer->device = wg; @@ -64,7 +65,7 @@ struct wg_peer *wg_peer_create(struct wg_device *wg, return peer; err: - kfree(peer); + kmem_cache_free(peer_cache, peer); return ERR_PTR(ret); } @@ -193,7 +194,8 @@ static void rcu_release(struct rcu_head *rcu) /* The final zeroing takes care of clearing any remaining handshake key * material and other potentially sensitive information. */ - kfree_sensitive(peer); + memzero_explicit(peer, sizeof(*peer)); + kmem_cache_free(peer_cache, peer); } static void kref_release(struct kref *refcount) @@ -225,3 +227,14 @@ void wg_peer_put(struct wg_peer *peer) return; kref_put(&peer->refcount, kref_release); } + +int __init wg_peer_init(void) +{ + peer_cache = KMEM_CACHE(wg_peer, 0); + return peer_cache ? 0 : -ENOMEM; +} + +void wg_peer_uninit(void) +{ + kmem_cache_destroy(peer_cache); +} diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h index 0809cda08bfa4..74227aa2d5b5a 100644 --- a/drivers/net/wireguard/peer.h +++ b/drivers/net/wireguard/peer.h @@ -80,4 +80,7 @@ void wg_peer_put(struct wg_peer *peer); void wg_peer_remove(struct wg_peer *peer); void wg_peer_remove_all(struct wg_device *wg); +int wg_peer_init(void); +void wg_peer_uninit(void); + #endif /* _WG_PEER_H */ From f74da2c2546c402cd2fc4165ef54b48cce6e39cc Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:33 +0200 Subject: [PATCH 082/269] wireguard: use synchronize_net rather than synchronize_rcu commit 24b70eeeb4f46c09487f8155239ebfb1f875774a upstream. Many of the synchronization points are sometimes called under the rtnl lock, which means we should use synchronize_net rather than synchronize_rcu. Under the hood, this expands to using the expedited flavor of function in the event that rtnl is held, in order to not stall other concurrent changes. This fixes some very, very long delays when removing multiple peers at once, which would cause some operations to take several minutes. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/peer.c | 6 +++--- drivers/net/wireguard/socket.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c index 6936a82364bf7..1acd00ab2fbcb 100644 --- a/drivers/net/wireguard/peer.c +++ b/drivers/net/wireguard/peer.c @@ -89,7 +89,7 @@ static void peer_make_dead(struct wg_peer *peer) /* Mark as dead, so that we don't allow jumping contexts after. */ WRITE_ONCE(peer->is_dead, true); - /* The caller must now synchronize_rcu() for this to take effect. */ + /* The caller must now synchronize_net() for this to take effect. */ } static void peer_remove_after_dead(struct wg_peer *peer) @@ -161,7 +161,7 @@ void wg_peer_remove(struct wg_peer *peer) lockdep_assert_held(&peer->device->device_update_lock); peer_make_dead(peer); - synchronize_rcu(); + synchronize_net(); peer_remove_after_dead(peer); } @@ -179,7 +179,7 @@ void wg_peer_remove_all(struct wg_device *wg) peer_make_dead(peer); list_add_tail(&peer->peer_list, &dead_peers); } - synchronize_rcu(); + synchronize_net(); list_for_each_entry_safe(peer, temp, &dead_peers, peer_list) peer_remove_after_dead(peer); } diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index c33e2c81635fa..c8cd385d233b6 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -430,7 +430,7 @@ void wg_socket_reinit(struct wg_device *wg, struct sock *new4, if (new4) wg->incoming_port = ntohs(inet_sk(new4)->inet_sport); mutex_unlock(&wg->socket_update_lock); - synchronize_rcu(); + synchronize_net(); sock_free(old4); sock_free(old6); } From b8d72ac1f2106adb25ef3e178f373b930aee7d8c Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:30 +0200 Subject: [PATCH 083/269] wireguard: selftests: remove old conntrack kconfig value commit acf2492b51c9a3c4dfb947f4d3477a86d315150f upstream. On recent kernels, this config symbol is no longer used. Reported-by: Rui Salvaterra Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- tools/testing/selftests/wireguard/qemu/kernel.config | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index 4eecb432a66c1..74db83a0aedd8 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -19,7 +19,6 @@ CONFIG_NETFILTER_XTABLES=y CONFIG_NETFILTER_XT_NAT=y CONFIG_NETFILTER_XT_MATCH_LENGTH=y CONFIG_NETFILTER_XT_MARK=y -CONFIG_NF_CONNTRACK_IPV4=y CONFIG_NF_NAT_IPV4=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_FILTER=y From 842c21d6a0427ed035e544cc49cc3f53b0cb41e0 Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:31 +0200 Subject: [PATCH 084/269] wireguard: selftests: make sure rp_filter is disabled on vethc commit f8873d11d4121aad35024f9379e431e0c83abead upstream. Some distros may enable strict rp_filter by default, which will prevent vethc from receiving the packets with an unrouteable reverse path address. Reported-by: Hangbin Liu Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- tools/testing/selftests/wireguard/netns.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 7ed7cd95e58fe..ebc4ee0fe179f 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -363,6 +363,7 @@ ip1 -6 rule add table main suppress_prefixlength 0 ip1 -4 route add default dev wg0 table 51820 ip1 -4 rule add not fwmark 51820 table 51820 ip1 -4 rule add table main suppress_prefixlength 0 +n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/vethc/rp_filter' # Flood the pings instead of sending just one, to trigger routing table reference counting bugs. n1 ping -W 1 -c 100 -f 192.168.99.7 n1 ping -W 1 -c 100 -f abab::1111 From 42a667715b1e118f725de85ac748983ced2ecb5e Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:35 +0200 Subject: [PATCH 085/269] wireguard: allowedips: initialize list head in selftest commit 46cfe8eee285cde465b420637507884551f5d7ca upstream. The randomized trie tests weren't initializing the dummy peer list head, resulting in a NULL pointer dereference when used. Fix this by initializing it in the randomized trie test, just like we do for the static unit test. While we're at it, all of the other strings like this have the word "self-test", so add it to the missing place here. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/selftest/allowedips.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 846db14cb046b..0d2a43a2d4008 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -296,6 +296,7 @@ static __init bool randomized_test(void) goto free; } kref_init(&peers[i]->refcount); + INIT_LIST_HEAD(&peers[i]->allowedips_list); } mutex_lock(&mutex); @@ -333,7 +334,7 @@ static __init bool randomized_test(void) if (wg_allowedips_insert_v4(&t, (struct in_addr *)mutated, cidr, peer, &mutex) < 0) { - pr_err("allowedips random malloc: FAIL\n"); + pr_err("allowedips random self-test malloc: FAIL\n"); goto free_locked; } if (horrible_allowedips_insert_v4(&h, From 70a9a71ab3e55903ca171c22bb5432748dda9e7b Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:36 +0200 Subject: [PATCH 086/269] wireguard: allowedips: remove nodes in O(1) commit f634f418c227c912e7ea95a3299efdc9b10e4022 upstream. Previously, deleting peers would require traversing the entire trie in order to rebalance nodes and safely free them. This meant that removing 1000 peers from a trie with a half million nodes would take an extremely long time, during which we're holding the rtnl lock. Large-scale users were reporting 200ms latencies added to the networking stack as a whole every time their userspace software would queue up significant removals. That's a serious situation. This commit fixes that by maintaining a double pointer to the parent's bit pointer for each node, and then using the already existing node list belonging to each peer to go directly to the node, fix up its pointers, and free it with RCU. This means removal is O(1) instead of O(n), and we don't use gobs of stack. The removal algorithm has the same downside as the code that it fixes: it won't collapse needlessly long runs of fillers. We can enhance that in the future if it ever becomes a problem. This commit documents that limitation with a TODO comment in code, a small but meaningful improvement over the prior situation. Currently the biggest flaw, which the next commit addresses, is that because this increases the node size on 64-bit machines from 60 bytes to 68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up using twice as much memory per node, because of power-of-two allocations, which is a big bummer. We'll need to figure something out there. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/allowedips.c | 132 ++++++++++++----------------- drivers/net/wireguard/allowedips.h | 9 +- 2 files changed, 57 insertions(+), 84 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 3725e9cd85f4f..2785cfd3a2212 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -66,60 +66,6 @@ static void root_remove_peer_lists(struct allowedips_node *root) } } -static void walk_remove_by_peer(struct allowedips_node __rcu **top, - struct wg_peer *peer, struct mutex *lock) -{ -#define REF(p) rcu_access_pointer(p) -#define DEREF(p) rcu_dereference_protected(*(p), lockdep_is_held(lock)) -#define PUSH(p) ({ \ - WARN_ON(IS_ENABLED(DEBUG) && len >= 128); \ - stack[len++] = p; \ - }) - - struct allowedips_node __rcu **stack[128], **nptr; - struct allowedips_node *node, *prev; - unsigned int len; - - if (unlikely(!peer || !REF(*top))) - return; - - for (prev = NULL, len = 0, PUSH(top); len > 0; prev = node) { - nptr = stack[len - 1]; - node = DEREF(nptr); - if (!node) { - --len; - continue; - } - if (!prev || REF(prev->bit[0]) == node || - REF(prev->bit[1]) == node) { - if (REF(node->bit[0])) - PUSH(&node->bit[0]); - else if (REF(node->bit[1])) - PUSH(&node->bit[1]); - } else if (REF(node->bit[0]) == prev) { - if (REF(node->bit[1])) - PUSH(&node->bit[1]); - } else { - if (rcu_dereference_protected(node->peer, - lockdep_is_held(lock)) == peer) { - RCU_INIT_POINTER(node->peer, NULL); - list_del_init(&node->peer_list); - if (!node->bit[0] || !node->bit[1]) { - rcu_assign_pointer(*nptr, DEREF( - &node->bit[!REF(node->bit[0])])); - kfree_rcu(node, rcu); - node = DEREF(nptr); - } - } - --len; - } - } - -#undef REF -#undef DEREF -#undef PUSH -} - static unsigned int fls128(u64 a, u64 b) { return a ? fls64(a) + 64U : fls64(b); @@ -224,6 +170,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, RCU_INIT_POINTER(node->peer, peer); list_add_tail(&node->peer_list, &peer->allowedips_list); copy_and_assign_cidr(node, key, cidr, bits); + rcu_assign_pointer(node->parent_bit, trie); rcu_assign_pointer(*trie, node); return 0; } @@ -243,9 +190,9 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, if (!node) { down = rcu_dereference_protected(*trie, lockdep_is_held(lock)); } else { - down = rcu_dereference_protected(CHOOSE_NODE(node, key), - lockdep_is_held(lock)); + down = rcu_dereference_protected(CHOOSE_NODE(node, key), lockdep_is_held(lock)); if (!down) { + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, key)); rcu_assign_pointer(CHOOSE_NODE(node, key), newnode); return 0; } @@ -254,29 +201,37 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, parent = node; if (newnode->cidr == cidr) { + rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(newnode, down->bits)); rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down); - if (!parent) + if (!parent) { + rcu_assign_pointer(newnode->parent_bit, trie); rcu_assign_pointer(*trie, newnode); - else - rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), - newnode); - } else { - node = kzalloc(sizeof(*node), GFP_KERNEL); - if (unlikely(!node)) { - list_del(&newnode->peer_list); - kfree(newnode); - return -ENOMEM; + } else { + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(parent, newnode->bits)); + rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), newnode); } - INIT_LIST_HEAD(&node->peer_list); - copy_and_assign_cidr(node, newnode->bits, cidr, bits); - - rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); - rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); - if (!parent) - rcu_assign_pointer(*trie, node); - else - rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), - node); + return 0; + } + + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (unlikely(!node)) { + list_del(&newnode->peer_list); + kfree(newnode); + return -ENOMEM; + } + INIT_LIST_HEAD(&node->peer_list); + copy_and_assign_cidr(node, newnode->bits, cidr, bits); + + rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(node, down->bits)); + rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, newnode->bits)); + rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); + if (!parent) { + rcu_assign_pointer(node->parent_bit, trie); + rcu_assign_pointer(*trie, node); + } else { + rcu_assign_pointer(node->parent_bit, &CHOOSE_NODE(parent, node->bits)); + rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), node); } return 0; } @@ -335,9 +290,30 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { + struct allowedips_node *node, *child, *tmp; + + if (list_empty(&peer->allowedips_list)) + return; ++table->seq; - walk_remove_by_peer(&table->root4, peer, lock); - walk_remove_by_peer(&table->root6, peer, lock); + list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) { + list_del_init(&node->peer_list); + RCU_INIT_POINTER(node->peer, NULL); + if (node->bit[0] && node->bit[1]) + continue; + child = rcu_dereference_protected( + node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); + if (child) + child->parent_bit = node->parent_bit; + *rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; + kfree_rcu(node, rcu); + + /* TODO: Note that we currently don't walk up and down in order to + * free any potential filler nodes. This means that this function + * doesn't free up as much as it could, which could be revisited + * at some point. + */ + } } int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr) diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index e5c83cafcef4c..f08f552e68529 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -15,14 +15,11 @@ struct wg_peer; struct allowedips_node { struct wg_peer __rcu *peer; struct allowedips_node __rcu *bit[2]; - /* While it may seem scandalous that we waste space for v4, - * we're alloc'ing to the nearest power of 2 anyway, so this - * doesn't actually make a difference. - */ - u8 bits[16] __aligned(__alignof(u64)); u8 cidr, bit_at_a, bit_at_b, bitlen; + u8 bits[16] __aligned(__alignof(u64)); - /* Keep rarely used list at bottom to be beyond cache line. */ + /* Keep rarely used members at bottom to be beyond cache line. */ + struct allowedips_node *__rcu *parent_bit; /* XXX: this puts us at 68->128 bytes instead of 60->64 bytes!! */ union { struct list_head peer_list; struct rcu_head rcu; From c5155c741a484e036e7997420559431a951f2106 Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:37 +0200 Subject: [PATCH 087/269] wireguard: allowedips: allocate nodes in kmem_cache commit dc680de28ca849dfe589dc15ac56d22505f0ef11 upstream. The previous commit moved from O(n) to O(1) for removal, but in the process introduced an additional pointer member to a struct that increased the size from 60 to 68 bytes, putting nodes in the 128-byte slab. With deployed systems having as many as 2 million nodes, this represents a significant doubling in memory usage (128 MiB -> 256 MiB). Fix this by using our own kmem_cache, that's sized exactly right. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Suggested-by: Arnd Bergmann Suggested-by: Matthew Wilcox Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/allowedips.c | 31 ++++++++++++++++++++++++------ drivers/net/wireguard/allowedips.h | 5 ++++- drivers/net/wireguard/main.c | 10 +++++++++- 3 files changed, 38 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 2785cfd3a2212..c540dce8d224e 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -6,6 +6,8 @@ #include "allowedips.h" #include "peer.h" +static struct kmem_cache *node_cache; + static void swap_endian(u8 *dst, const u8 *src, u8 bits) { if (bits == 32) { @@ -40,6 +42,11 @@ static void push_rcu(struct allowedips_node **stack, } } +static void node_free_rcu(struct rcu_head *rcu) +{ + kmem_cache_free(node_cache, container_of(rcu, struct allowedips_node, rcu)); +} + static void root_free_rcu(struct rcu_head *rcu) { struct allowedips_node *node, *stack[128] = { @@ -49,7 +56,7 @@ static void root_free_rcu(struct rcu_head *rcu) while (len > 0 && (node = stack[--len])) { push_rcu(stack, node->bit[0], &len); push_rcu(stack, node->bit[1], &len); - kfree(node); + kmem_cache_free(node_cache, node); } } @@ -164,7 +171,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return -EINVAL; if (!rcu_access_pointer(*trie)) { - node = kzalloc(sizeof(*node), GFP_KERNEL); + node = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; RCU_INIT_POINTER(node->peer, peer); @@ -180,7 +187,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } - newnode = kzalloc(sizeof(*newnode), GFP_KERNEL); + newnode = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!newnode)) return -ENOMEM; RCU_INIT_POINTER(newnode->peer, peer); @@ -213,10 +220,10 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } - node = kzalloc(sizeof(*node), GFP_KERNEL); + node = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!node)) { list_del(&newnode->peer_list); - kfree(newnode); + kmem_cache_free(node_cache, newnode); return -ENOMEM; } INIT_LIST_HEAD(&node->peer_list); @@ -306,7 +313,7 @@ void wg_allowedips_remove_by_peer(struct allowedips *table, if (child) child->parent_bit = node->parent_bit; *rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; - kfree_rcu(node, rcu); + call_rcu(&node->rcu, node_free_rcu); /* TODO: Note that we currently don't walk up and down in order to * free any potential filler nodes. This means that this function @@ -350,4 +357,16 @@ struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, return NULL; } +int __init wg_allowedips_slab_init(void) +{ + node_cache = KMEM_CACHE(allowedips_node, 0); + return node_cache ? 0 : -ENOMEM; +} + +void wg_allowedips_slab_uninit(void) +{ + rcu_barrier(); + kmem_cache_destroy(node_cache); +} + #include "selftest/allowedips.c" diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index f08f552e68529..32d611aaf3cc2 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -19,7 +19,7 @@ struct allowedips_node { u8 bits[16] __aligned(__alignof(u64)); /* Keep rarely used members at bottom to be beyond cache line. */ - struct allowedips_node *__rcu *parent_bit; /* XXX: this puts us at 68->128 bytes instead of 60->64 bytes!! */ + struct allowedips_node *__rcu *parent_bit; union { struct list_head peer_list; struct rcu_head rcu; @@ -53,4 +53,7 @@ struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, bool wg_allowedips_selftest(void); #endif +int wg_allowedips_slab_init(void); +void wg_allowedips_slab_uninit(void); + #endif /* _WG_ALLOWEDIPS_H */ diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c index 0a3ebfdac7947..75dbe77b0b4b4 100644 --- a/drivers/net/wireguard/main.c +++ b/drivers/net/wireguard/main.c @@ -21,10 +21,15 @@ static int __init mod_init(void) { int ret; + ret = wg_allowedips_slab_init(); + if (ret < 0) + goto err_allowedips; + #ifdef DEBUG + ret = -ENOTRECOVERABLE; if (!wg_allowedips_selftest() || !wg_packet_counter_selftest() || !wg_ratelimiter_selftest()) - return -ENOTRECOVERABLE; + goto err_peer; #endif wg_noise_init(); @@ -50,6 +55,8 @@ static int __init mod_init(void) err_device: wg_peer_uninit(); err_peer: + wg_allowedips_slab_uninit(); +err_allowedips: return ret; } @@ -58,6 +65,7 @@ static void __exit mod_exit(void) wg_genetlink_uninit(); wg_device_uninit(); wg_peer_uninit(); + wg_allowedips_slab_uninit(); } module_init(mod_init); From df3b45f6d1ce605120750b983fc4448e3769619a Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 4 Jun 2021 17:17:38 +0200 Subject: [PATCH 088/269] wireguard: allowedips: free empty intermediate nodes when removing single node commit bf7b042dc62a31f66d3a41dd4dfc7806f267b307 upstream. When removing single nodes, it's possible that that node's parent is an empty intermediate node, in which case, it too should be removed. Otherwise the trie fills up and never is fully emptied, leading to gradual memory leaks over time for tries that are modified often. There was originally code to do this, but was removed during refactoring in 2016 and never reworked. Now that we have proper parent pointers from the previous commits, we can implement this properly. In order to reduce branching and expensive comparisons, we want to keep the double pointer for parent assignment (which lets us easily chain up to the root), but we still need to actually get the parent's base address. So encode the bit number into the last two bits of the pointer, and pack and unpack it as needed. This is a little bit clumsy but is the fastest and less memory wasteful of the compromises. Note that we align the root struct here to a minimum of 4, because it's embedded into a larger struct, and we're relying on having the bottom two bits for our flag, which would only be 16-bit aligned on m68k. The existing macro-based helpers were a bit unwieldy for adding the bit packing to, so this commit replaces them with safer and clearer ordinary functions. We add a test to the randomized/fuzzer part of the selftests, to free the randomized tries by-peer, refuzz it, and repeat, until it's supposed to be empty, and then then see if that actually resulted in the whole thing being emptied. That combined with kmemcheck should hopefully make sure this commit is doing what it should. Along the way this resulted in various other cleanups of the tests and fixes for recent graphviz. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/wireguard/allowedips.c | 102 ++++++------ drivers/net/wireguard/allowedips.h | 4 +- drivers/net/wireguard/selftest/allowedips.c | 162 ++++++++++---------- 3 files changed, 137 insertions(+), 131 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index c540dce8d224e..b7197e80f2264 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -30,8 +30,11 @@ static void copy_and_assign_cidr(struct allowedips_node *node, const u8 *src, node->bitlen = bits; memcpy(node->bits, src, bits / 8U); } -#define CHOOSE_NODE(parent, key) \ - parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1] + +static inline u8 choose(struct allowedips_node *node, const u8 *key) +{ + return (key[node->bit_at_a] >> node->bit_at_b) & 1; +} static void push_rcu(struct allowedips_node **stack, struct allowedips_node __rcu *p, unsigned int *len) @@ -112,7 +115,7 @@ static struct allowedips_node *find_node(struct allowedips_node *trie, u8 bits, found = node; if (node->cidr == bits) break; - node = rcu_dereference_bh(CHOOSE_NODE(node, key)); + node = rcu_dereference_bh(node->bit[choose(node, key)]); } return found; } @@ -144,8 +147,7 @@ static bool node_placement(struct allowedips_node __rcu *trie, const u8 *key, u8 cidr, u8 bits, struct allowedips_node **rnode, struct mutex *lock) { - struct allowedips_node *node = rcu_dereference_protected(trie, - lockdep_is_held(lock)); + struct allowedips_node *node = rcu_dereference_protected(trie, lockdep_is_held(lock)); struct allowedips_node *parent = NULL; bool exact = false; @@ -155,13 +157,24 @@ static bool node_placement(struct allowedips_node __rcu *trie, const u8 *key, exact = true; break; } - node = rcu_dereference_protected(CHOOSE_NODE(parent, key), - lockdep_is_held(lock)); + node = rcu_dereference_protected(parent->bit[choose(parent, key)], lockdep_is_held(lock)); } *rnode = parent; return exact; } +static inline void connect_node(struct allowedips_node **parent, u8 bit, struct allowedips_node *node) +{ + node->parent_bit_packed = (unsigned long)parent | bit; + rcu_assign_pointer(*parent, node); +} + +static inline void choose_and_connect_node(struct allowedips_node *parent, struct allowedips_node *node) +{ + u8 bit = choose(parent, node->bits); + connect_node(&parent->bit[bit], bit, node); +} + static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, u8 cidr, struct wg_peer *peer, struct mutex *lock) { @@ -177,8 +190,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, RCU_INIT_POINTER(node->peer, peer); list_add_tail(&node->peer_list, &peer->allowedips_list); copy_and_assign_cidr(node, key, cidr, bits); - rcu_assign_pointer(node->parent_bit, trie); - rcu_assign_pointer(*trie, node); + connect_node(trie, 2, node); return 0; } if (node_placement(*trie, key, cidr, bits, &node, lock)) { @@ -197,10 +209,10 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, if (!node) { down = rcu_dereference_protected(*trie, lockdep_is_held(lock)); } else { - down = rcu_dereference_protected(CHOOSE_NODE(node, key), lockdep_is_held(lock)); + const u8 bit = choose(node, key); + down = rcu_dereference_protected(node->bit[bit], lockdep_is_held(lock)); if (!down) { - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, key)); - rcu_assign_pointer(CHOOSE_NODE(node, key), newnode); + connect_node(&node->bit[bit], bit, newnode); return 0; } } @@ -208,15 +220,11 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, parent = node; if (newnode->cidr == cidr) { - rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(newnode, down->bits)); - rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down); - if (!parent) { - rcu_assign_pointer(newnode->parent_bit, trie); - rcu_assign_pointer(*trie, newnode); - } else { - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(parent, newnode->bits)); - rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), newnode); - } + choose_and_connect_node(newnode, down); + if (!parent) + connect_node(trie, 2, newnode); + else + choose_and_connect_node(parent, newnode); return 0; } @@ -229,17 +237,12 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, INIT_LIST_HEAD(&node->peer_list); copy_and_assign_cidr(node, newnode->bits, cidr, bits); - rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(node, down->bits)); - rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, newnode->bits)); - rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); - if (!parent) { - rcu_assign_pointer(node->parent_bit, trie); - rcu_assign_pointer(*trie, node); - } else { - rcu_assign_pointer(node->parent_bit, &CHOOSE_NODE(parent, node->bits)); - rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), node); - } + choose_and_connect_node(node, down); + choose_and_connect_node(node, newnode); + if (!parent) + connect_node(trie, 2, node); + else + choose_and_connect_node(parent, node); return 0; } @@ -297,7 +300,8 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { - struct allowedips_node *node, *child, *tmp; + struct allowedips_node *node, *child, **parent_bit, *parent, *tmp; + bool free_parent; if (list_empty(&peer->allowedips_list)) return; @@ -307,19 +311,29 @@ void wg_allowedips_remove_by_peer(struct allowedips *table, RCU_INIT_POINTER(node->peer, NULL); if (node->bit[0] && node->bit[1]) continue; - child = rcu_dereference_protected( - node->bit[!rcu_access_pointer(node->bit[0])], - lockdep_is_held(lock)); + child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); if (child) - child->parent_bit = node->parent_bit; - *rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; + child->parent_bit_packed = node->parent_bit_packed; + parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); + *parent_bit = child; + parent = (void *)parent_bit - + offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); + free_parent = !rcu_access_pointer(node->bit[0]) && + !rcu_access_pointer(node->bit[1]) && + (node->parent_bit_packed & 3) <= 1 && + !rcu_access_pointer(parent->peer); + if (free_parent) + child = rcu_dereference_protected( + parent->bit[!(node->parent_bit_packed & 1)], + lockdep_is_held(lock)); call_rcu(&node->rcu, node_free_rcu); - - /* TODO: Note that we currently don't walk up and down in order to - * free any potential filler nodes. This means that this function - * doesn't free up as much as it could, which could be revisited - * at some point. - */ + if (!free_parent) + continue; + if (child) + child->parent_bit_packed = parent->parent_bit_packed; + *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; + call_rcu(&parent->rcu, node_free_rcu); } } diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index 32d611aaf3cc2..2346c797eb4d8 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -19,7 +19,7 @@ struct allowedips_node { u8 bits[16] __aligned(__alignof(u64)); /* Keep rarely used members at bottom to be beyond cache line. */ - struct allowedips_node *__rcu *parent_bit; + unsigned long parent_bit_packed; union { struct list_head peer_list; struct rcu_head rcu; @@ -30,7 +30,7 @@ struct allowedips { struct allowedips_node __rcu *root4; struct allowedips_node __rcu *root6; u64 seq; -}; +} __aligned(4); /* We pack the lower 2 bits of &root, but m68k only gives 16-bit alignment. */ void wg_allowedips_init(struct allowedips *table); void wg_allowedips_free(struct allowedips *table, struct mutex *mutex); diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 0d2a43a2d4008..e173204ae7d78 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -19,32 +19,22 @@ #include -static __init void swap_endian_and_apply_cidr(u8 *dst, const u8 *src, u8 bits, - u8 cidr) -{ - swap_endian(dst, src, bits); - memset(dst + (cidr + 7) / 8, 0, bits / 8 - (cidr + 7) / 8); - if (cidr) - dst[(cidr + 7) / 8 - 1] &= ~0U << ((8 - (cidr % 8)) % 8); -} - static __init void print_node(struct allowedips_node *node, u8 bits) { char *fmt_connection = KERN_DEBUG "\t\"%p/%d\" -> \"%p/%d\";\n"; - char *fmt_declaration = KERN_DEBUG - "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n"; + char *fmt_declaration = KERN_DEBUG "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n"; + u8 ip1[16], ip2[16], cidr1, cidr2; char *style = "dotted"; - u8 ip1[16], ip2[16]; u32 color = 0; + if (node == NULL) + return; if (bits == 32) { fmt_connection = KERN_DEBUG "\t\"%pI4/%d\" -> \"%pI4/%d\";\n"; - fmt_declaration = KERN_DEBUG - "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n"; + fmt_declaration = KERN_DEBUG "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n"; } else if (bits == 128) { fmt_connection = KERN_DEBUG "\t\"%pI6/%d\" -> \"%pI6/%d\";\n"; - fmt_declaration = KERN_DEBUG - "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n"; + fmt_declaration = KERN_DEBUG "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n"; } if (node->peer) { hsiphash_key_t key = { { 0 } }; @@ -55,24 +45,20 @@ static __init void print_node(struct allowedips_node *node, u8 bits) hsiphash_1u32(0xabad1dea, &key) % 200; style = "bold"; } - swap_endian_and_apply_cidr(ip1, node->bits, bits, node->cidr); - printk(fmt_declaration, ip1, node->cidr, style, color); + wg_allowedips_read_node(node, ip1, &cidr1); + printk(fmt_declaration, ip1, cidr1, style, color); if (node->bit[0]) { - swap_endian_and_apply_cidr(ip2, - rcu_dereference_raw(node->bit[0])->bits, bits, - node->cidr); - printk(fmt_connection, ip1, node->cidr, ip2, - rcu_dereference_raw(node->bit[0])->cidr); - print_node(rcu_dereference_raw(node->bit[0]), bits); + wg_allowedips_read_node(rcu_dereference_raw(node->bit[0]), ip2, &cidr2); + printk(fmt_connection, ip1, cidr1, ip2, cidr2); } if (node->bit[1]) { - swap_endian_and_apply_cidr(ip2, - rcu_dereference_raw(node->bit[1])->bits, - bits, node->cidr); - printk(fmt_connection, ip1, node->cidr, ip2, - rcu_dereference_raw(node->bit[1])->cidr); - print_node(rcu_dereference_raw(node->bit[1]), bits); + wg_allowedips_read_node(rcu_dereference_raw(node->bit[1]), ip2, &cidr2); + printk(fmt_connection, ip1, cidr1, ip2, cidr2); } + if (node->bit[0]) + print_node(rcu_dereference_raw(node->bit[0]), bits); + if (node->bit[1]) + print_node(rcu_dereference_raw(node->bit[1]), bits); } static __init void print_tree(struct allowedips_node __rcu *top, u8 bits) @@ -121,8 +107,8 @@ static __init inline union nf_inet_addr horrible_cidr_to_mask(u8 cidr) { union nf_inet_addr mask; - memset(&mask, 0x00, 128 / 8); - memset(&mask, 0xff, cidr / 8); + memset(&mask, 0, sizeof(mask)); + memset(&mask.all, 0xff, cidr / 8); if (cidr % 32) mask.all[cidr / 32] = (__force u32)htonl( (0xFFFFFFFFUL << (32 - (cidr % 32))) & 0xFFFFFFFFUL); @@ -149,42 +135,36 @@ horrible_mask_self(struct horrible_allowedips_node *node) } static __init inline bool -horrible_match_v4(const struct horrible_allowedips_node *node, - struct in_addr *ip) +horrible_match_v4(const struct horrible_allowedips_node *node, struct in_addr *ip) { return (ip->s_addr & node->mask.ip) == node->ip.ip; } static __init inline bool -horrible_match_v6(const struct horrible_allowedips_node *node, - struct in6_addr *ip) +horrible_match_v6(const struct horrible_allowedips_node *node, struct in6_addr *ip) { - return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == - node->ip.ip6[0] && - (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == - node->ip.ip6[1] && - (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == - node->ip.ip6[2] && + return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == node->ip.ip6[0] && + (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == node->ip.ip6[1] && + (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == node->ip.ip6[2] && (ip->in6_u.u6_addr32[3] & node->mask.ip6[3]) == node->ip.ip6[3]; } static __init void -horrible_insert_ordered(struct horrible_allowedips *table, - struct horrible_allowedips_node *node) +horrible_insert_ordered(struct horrible_allowedips *table, struct horrible_allowedips_node *node) { struct horrible_allowedips_node *other = NULL, *where = NULL; u8 my_cidr = horrible_mask_to_cidr(node->mask); hlist_for_each_entry(other, &table->head, table) { - if (!memcmp(&other->mask, &node->mask, - sizeof(union nf_inet_addr)) && - !memcmp(&other->ip, &node->ip, - sizeof(union nf_inet_addr)) && - other->ip_version == node->ip_version) { + if (other->ip_version == node->ip_version && + !memcmp(&other->mask, &node->mask, sizeof(union nf_inet_addr)) && + !memcmp(&other->ip, &node->ip, sizeof(union nf_inet_addr))) { other->value = node->value; kfree(node); return; } + } + hlist_for_each_entry(other, &table->head, table) { where = other; if (horrible_mask_to_cidr(other->mask) <= my_cidr) break; @@ -201,8 +181,7 @@ static __init int horrible_allowedips_insert_v4(struct horrible_allowedips *table, struct in_addr *ip, u8 cidr, void *value) { - struct horrible_allowedips_node *node = kzalloc(sizeof(*node), - GFP_KERNEL); + struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; @@ -219,8 +198,7 @@ static __init int horrible_allowedips_insert_v6(struct horrible_allowedips *table, struct in6_addr *ip, u8 cidr, void *value) { - struct horrible_allowedips_node *node = kzalloc(sizeof(*node), - GFP_KERNEL); + struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; @@ -234,39 +212,43 @@ horrible_allowedips_insert_v6(struct horrible_allowedips *table, } static __init void * -horrible_allowedips_lookup_v4(struct horrible_allowedips *table, - struct in_addr *ip) +horrible_allowedips_lookup_v4(struct horrible_allowedips *table, struct in_addr *ip) { struct horrible_allowedips_node *node; - void *ret = NULL; hlist_for_each_entry(node, &table->head, table) { - if (node->ip_version != 4) - continue; - if (horrible_match_v4(node, ip)) { - ret = node->value; - break; - } + if (node->ip_version == 4 && horrible_match_v4(node, ip)) + return node->value; } - return ret; + return NULL; } static __init void * -horrible_allowedips_lookup_v6(struct horrible_allowedips *table, - struct in6_addr *ip) +horrible_allowedips_lookup_v6(struct horrible_allowedips *table, struct in6_addr *ip) { struct horrible_allowedips_node *node; - void *ret = NULL; hlist_for_each_entry(node, &table->head, table) { - if (node->ip_version != 6) + if (node->ip_version == 6 && horrible_match_v6(node, ip)) + return node->value; + } + return NULL; +} + + +static __init void +horrible_allowedips_remove_by_value(struct horrible_allowedips *table, void *value) +{ + struct horrible_allowedips_node *node; + struct hlist_node *h; + + hlist_for_each_entry_safe(node, h, &table->head, table) { + if (node->value != value) continue; - if (horrible_match_v6(node, ip)) { - ret = node->value; - break; - } + hlist_del(&node->table); + kfree(node); } - return ret; + } static __init bool randomized_test(void) @@ -397,23 +379,33 @@ static __init bool randomized_test(void) print_tree(t.root6, 128); } - for (i = 0; i < NUM_QUERIES; ++i) { - prandom_bytes(ip, 4); - if (lookup(t.root4, 32, ip) != - horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) { - pr_err("allowedips random self-test: FAIL\n"); - goto free; + for (j = 0;; ++j) { + for (i = 0; i < NUM_QUERIES; ++i) { + prandom_bytes(ip, 4); + if (lookup(t.root4, 32, ip) != horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) { + horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip); + pr_err("allowedips random v4 self-test: FAIL\n"); + goto free; + } + prandom_bytes(ip, 16); + if (lookup(t.root6, 128, ip) != horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) { + pr_err("allowedips random v6 self-test: FAIL\n"); + goto free; + } } + if (j >= NUM_PEERS) + break; + mutex_lock(&mutex); + wg_allowedips_remove_by_peer(&t, peers[j], &mutex); + mutex_unlock(&mutex); + horrible_allowedips_remove_by_value(&h, peers[j]); } - for (i = 0; i < NUM_QUERIES; ++i) { - prandom_bytes(ip, 16); - if (lookup(t.root6, 128, ip) != - horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) { - pr_err("allowedips random self-test: FAIL\n"); - goto free; - } + if (t.root4 || t.root6) { + pr_err("allowedips random self-test removal: FAIL\n"); + goto free; } + ret = true; free: From dac53568c6ac4c4678e6c2f4e66314b99ec85f03 Mon Sep 17 00:00:00 2001 From: Pavel Skripkin Date: Thu, 3 Jun 2021 19:38:12 +0300 Subject: [PATCH 089/269] net: caif: added cfserl_release function commit bce130e7f392ddde8cfcb09927808ebd5f9c8669 upstream. Added cfserl_release() function. Cc: stable@vger.kernel.org Signed-off-by: Pavel Skripkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- include/net/caif/cfserl.h | 1 + net/caif/cfserl.c | 5 +++++ 2 files changed, 6 insertions(+) diff --git a/include/net/caif/cfserl.h b/include/net/caif/cfserl.h index 14a55e03bb3ce..67cce8757175a 100644 --- a/include/net/caif/cfserl.h +++ b/include/net/caif/cfserl.h @@ -9,4 +9,5 @@ #include struct cflayer *cfserl_create(int instance, bool use_stx); +void cfserl_release(struct cflayer *layer); #endif diff --git a/net/caif/cfserl.c b/net/caif/cfserl.c index e11725a4bb0ed..40cd57ad0a0f4 100644 --- a/net/caif/cfserl.c +++ b/net/caif/cfserl.c @@ -31,6 +31,11 @@ static int cfserl_transmit(struct cflayer *layr, struct cfpkt *pkt); static void cfserl_ctrlcmd(struct cflayer *layr, enum caif_ctrlcmd ctrl, int phyid); +void cfserl_release(struct cflayer *layer) +{ + kfree(layer); +} + struct cflayer *cfserl_create(int instance, bool use_stx) { struct cfserl *this = kzalloc(sizeof(struct cfserl), GFP_ATOMIC); From d6db727457dd29938524f04b301c83ac67cccb87 Mon Sep 17 00:00:00 2001 From: Pavel Skripkin Date: Thu, 3 Jun 2021 19:38:51 +0300 Subject: [PATCH 090/269] net: caif: add proper error handling commit a2805dca5107d5603f4bbc027e81e20d93476e96 upstream. caif_enroll_dev() can fail in some cases. Ingnoring these cases can lead to memory leak due to not assigning link_support pointer to anywhere. Fixes: 7c18d2205ea7 ("caif: Restructure how link caif link layer enroll") Cc: stable@vger.kernel.org Signed-off-by: Pavel Skripkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- include/net/caif/caif_dev.h | 2 +- include/net/caif/cfcnfg.h | 2 +- net/caif/caif_dev.c | 8 +++++--- net/caif/cfcnfg.c | 16 +++++++++++----- 4 files changed, 18 insertions(+), 10 deletions(-) diff --git a/include/net/caif/caif_dev.h b/include/net/caif/caif_dev.h index 48ecca8530ffa..b655d8666f555 100644 --- a/include/net/caif/caif_dev.h +++ b/include/net/caif/caif_dev.h @@ -119,7 +119,7 @@ void caif_free_client(struct cflayer *adap_layer); * The link_support layer is used to add any Link Layer specific * framing. */ -void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, +int caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, struct cflayer *link_support, int head_room, struct cflayer **layer, int (**rcv_func)( struct sk_buff *, struct net_device *, diff --git a/include/net/caif/cfcnfg.h b/include/net/caif/cfcnfg.h index 2aa5e91d84576..8819ff4db35a6 100644 --- a/include/net/caif/cfcnfg.h +++ b/include/net/caif/cfcnfg.h @@ -62,7 +62,7 @@ void cfcnfg_remove(struct cfcnfg *cfg); * @fcs: Specify if checksum is used in CAIF Framing Layer. * @head_room: Head space needed by link specific protocol. */ -void +int cfcnfg_add_phy_layer(struct cfcnfg *cnfg, struct net_device *dev, struct cflayer *phy_layer, enum cfcnfg_phy_preference pref, diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c index c10e5a55758d2..fffbe41440b3d 100644 --- a/net/caif/caif_dev.c +++ b/net/caif/caif_dev.c @@ -308,7 +308,7 @@ static void dev_flowctrl(struct net_device *dev, int on) caifd_put(caifd); } -void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, +int caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, struct cflayer *link_support, int head_room, struct cflayer **layer, int (**rcv_func)(struct sk_buff *, struct net_device *, @@ -319,11 +319,12 @@ void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, enum cfcnfg_phy_preference pref; struct cfcnfg *cfg = get_cfcnfg(dev_net(dev)); struct caif_device_entry_list *caifdevs; + int res; caifdevs = caif_device_list(dev_net(dev)); caifd = caif_device_alloc(dev); if (!caifd) - return; + return -ENOMEM; *layer = &caifd->layer; spin_lock_init(&caifd->flow_lock); @@ -344,7 +345,7 @@ void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, strlcpy(caifd->layer.name, dev->name, sizeof(caifd->layer.name)); caifd->layer.transmit = transmit; - cfcnfg_add_phy_layer(cfg, + res = cfcnfg_add_phy_layer(cfg, dev, &caifd->layer, pref, @@ -354,6 +355,7 @@ void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev, mutex_unlock(&caifdevs->lock); if (rcv_func) *rcv_func = receive; + return res; } EXPORT_SYMBOL(caif_enroll_dev); diff --git a/net/caif/cfcnfg.c b/net/caif/cfcnfg.c index 399239a14420f..cac30e676ac94 100644 --- a/net/caif/cfcnfg.c +++ b/net/caif/cfcnfg.c @@ -450,7 +450,7 @@ cfcnfg_linkup_rsp(struct cflayer *layer, u8 channel_id, enum cfctrl_srv serv, rcu_read_unlock(); } -void +int cfcnfg_add_phy_layer(struct cfcnfg *cnfg, struct net_device *dev, struct cflayer *phy_layer, enum cfcnfg_phy_preference pref, @@ -459,7 +459,7 @@ cfcnfg_add_phy_layer(struct cfcnfg *cnfg, { struct cflayer *frml; struct cfcnfg_phyinfo *phyinfo = NULL; - int i; + int i, res = 0; u8 phyid; mutex_lock(&cnfg->lock); @@ -473,12 +473,15 @@ cfcnfg_add_phy_layer(struct cfcnfg *cnfg, goto got_phyid; } pr_warn("Too many CAIF Link Layers (max 6)\n"); + res = -EEXIST; goto out; got_phyid: phyinfo = kzalloc(sizeof(struct cfcnfg_phyinfo), GFP_ATOMIC); - if (!phyinfo) + if (!phyinfo) { + res = -ENOMEM; goto out_err; + } phy_layer->id = phyid; phyinfo->pref = pref; @@ -492,8 +495,10 @@ cfcnfg_add_phy_layer(struct cfcnfg *cnfg, frml = cffrml_create(phyid, fcs); - if (!frml) + if (!frml) { + res = -ENOMEM; goto out_err; + } phyinfo->frm_layer = frml; layer_set_up(frml, cnfg->mux); @@ -511,11 +516,12 @@ cfcnfg_add_phy_layer(struct cfcnfg *cnfg, list_add_rcu(&phyinfo->node, &cnfg->phys); out: mutex_unlock(&cnfg->lock); - return; + return res; out_err: kfree(phyinfo); mutex_unlock(&cnfg->lock); + return res; } EXPORT_SYMBOL(cfcnfg_add_phy_layer); From af2806345a37313f01b1c9f15e046745b8ee2daa Mon Sep 17 00:00:00 2001 From: Pavel Skripkin Date: Thu, 3 Jun 2021 19:39:11 +0300 Subject: [PATCH 091/269] net: caif: fix memory leak in caif_device_notify commit b53558a950a89824938e9811eddfc8efcd94e1bb upstream. In case of caif_enroll_dev() fail, allocated link_support won't be assigned to the corresponding structure. So simply free allocated pointer in case of error Fixes: 7c18d2205ea7 ("caif: Restructure how link caif link layer enroll") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+7ec324747ce876a29db6@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/caif/caif_dev.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c index fffbe41440b3d..440139706130a 100644 --- a/net/caif/caif_dev.c +++ b/net/caif/caif_dev.c @@ -370,6 +370,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what, struct cflayer *layer, *link_support; int head_room = 0; struct caif_device_entry_list *caifdevs; + int res; cfg = get_cfcnfg(dev_net(dev)); caifdevs = caif_device_list(dev_net(dev)); @@ -395,8 +396,10 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what, break; } } - caif_enroll_dev(dev, caifdev, link_support, head_room, + res = caif_enroll_dev(dev, caifdev, link_support, head_room, &layer, NULL); + if (res) + cfserl_release(link_support); caifdev->flowctrl = dev_flowctrl; break; From 46403c1f80b0d3f937ff9c4f5edc63bb64bc5051 Mon Sep 17 00:00:00 2001 From: Pavel Skripkin Date: Thu, 3 Jun 2021 19:39:35 +0300 Subject: [PATCH 092/269] net: caif: fix memory leak in cfusbl_device_notify commit 7f5d86669fa4d485523ddb1d212e0a2d90bd62bb upstream. In case of caif_enroll_dev() fail, allocated link_support won't be assigned to the corresponding structure. So simply free allocated pointer in case of error. Fixes: 7ad65bf68d70 ("caif: Add support for CAIF over CDC NCM USB interface") Cc: stable@vger.kernel.org Signed-off-by: Pavel Skripkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/caif/caif_usb.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index a0116b9503d9d..b02e1292f7f19 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -115,6 +115,11 @@ static struct cflayer *cfusbl_create(int phyid, u8 ethaddr[ETH_ALEN], return (struct cflayer *) this; } +static void cfusbl_release(struct cflayer *layer) +{ + kfree(layer); +} + static struct packet_type caif_usb_type __read_mostly = { .type = cpu_to_be16(ETH_P_802_EX1), }; @@ -127,6 +132,7 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what, struct cflayer *layer, *link_support; struct usbnet *usbnet; struct usb_device *usbdev; + int res; /* Check whether we have a NCM device, and find its VID/PID. */ if (!(dev->dev.parent && dev->dev.parent->driver && @@ -169,8 +175,11 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what, if (dev->num_tx_queues > 1) pr_warn("USB device uses more than one tx queue\n"); - caif_enroll_dev(dev, &common, link_support, CFUSB_MAX_HEADLEN, + res = caif_enroll_dev(dev, &common, link_support, CFUSB_MAX_HEADLEN, &layer, &caif_usb_type.func); + if (res) + goto err; + if (!pack_added) dev_add_pack(&caif_usb_type); pack_added = true; @@ -178,6 +187,9 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what, strlcpy(layer->name, dev->name, sizeof(layer->name)); return 0; +err: + cfusbl_release(link_support); + return res; } static struct notifier_block caif_device_notifier = { From a5e554f789815105ac5866f8fb133ab6a1566286 Mon Sep 17 00:00:00 2001 From: Johnny Chuang Date: Tue, 13 Apr 2021 09:20:50 +0800 Subject: [PATCH 093/269] HID: i2c-hid: Skip ELAN power-on command after reset commit ca66a6770bd9d6d99e469debd1c7363ac455daf9 upstream. For ELAN touchscreen, we found our boot code of IC was not flexible enough to receive and handle this command. Once the FW main code of our controller is crashed for some reason, the controller could not be enumerated successfully to be recognized by the system host. therefore, it lost touch functionality. Add quirk for skip send power-on command after reset. It will impact to ELAN touchscreen and touchpad on HID over I2C projects. Fixes: 43b7029f475e ("HID: i2c-hid: Send power-on command after reset"). Cc: stable@vger.kernel.org Signed-off-by: Johnny Chuang Reviewed-by: Harry Cutts Reviewed-by: Douglas Anderson Tested-by: Douglas Anderson Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman --- drivers/hid/i2c-hid/i2c-hid-core.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c index 13bc59ed1d9e7..1f08c848c33de 100644 --- a/drivers/hid/i2c-hid/i2c-hid-core.c +++ b/drivers/hid/i2c-hid/i2c-hid-core.c @@ -50,6 +50,7 @@ #define I2C_HID_QUIRK_BOGUS_IRQ BIT(4) #define I2C_HID_QUIRK_RESET_ON_RESUME BIT(5) #define I2C_HID_QUIRK_BAD_INPUT_SIZE BIT(6) +#define I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET BIT(7) /* flags */ @@ -183,6 +184,11 @@ static const struct i2c_hid_quirks { I2C_HID_QUIRK_RESET_ON_RESUME }, { USB_VENDOR_ID_ITE, I2C_DEVICE_ID_ITE_LENOVO_LEGION_Y720, I2C_HID_QUIRK_BAD_INPUT_SIZE }, + /* + * Sending the wakeup after reset actually break ELAN touchscreen controller + */ + { USB_VENDOR_ID_ELAN, HID_ANY_ID, + I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET }, { 0, 0 } }; @@ -466,7 +472,8 @@ static int i2c_hid_hwreset(struct i2c_client *client) } /* At least some SIS devices need this after reset */ - ret = i2c_hid_set_power(client, I2C_HID_PWR_ON); + if (!(ihid->quirks & I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET)) + ret = i2c_hid_set_power(client, I2C_HID_PWR_ON); out_unlock: mutex_unlock(&ihid->reset_lock); From b5d013c4c76b276890135b5d32803c4c63924b77 Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 17 May 2021 12:04:30 +0200 Subject: [PATCH 094/269] HID: magicmouse: fix NULL-deref on disconnect commit 4b4f6cecca446abcb686c6e6c451d4f1ec1a7497 upstream. Commit 9d7b18668956 ("HID: magicmouse: add support for Apple Magic Trackpad 2") added a sanity check for an Apple trackpad but returned success instead of -ENODEV when the check failed. This means that the remove callback will dereference the never-initialised driver data pointer when the driver is later unbound (e.g. on USB disconnect). Reported-by: syzbot+ee6f6e2e68886ca256a8@syzkaller.appspotmail.com Fixes: 9d7b18668956 ("HID: magicmouse: add support for Apple Magic Trackpad 2") Cc: stable@vger.kernel.org # 4.20 Cc: Claudio Mettler Cc: Marek Wyborski Cc: Sean O'Brien Signed-off-by: Johan Hovold Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-magicmouse.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hid/hid-magicmouse.c b/drivers/hid/hid-magicmouse.c index abd86903875f0..fc4c074597539 100644 --- a/drivers/hid/hid-magicmouse.c +++ b/drivers/hid/hid-magicmouse.c @@ -597,7 +597,7 @@ static int magicmouse_probe(struct hid_device *hdev, if (id->vendor == USB_VENDOR_ID_APPLE && id->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2 && hdev->type != HID_TYPE_USBMOUSE) - return 0; + return -ENODEV; msc = devm_kzalloc(&hdev->dev, sizeof(*msc), GFP_KERNEL); if (msc == NULL) { From 127f25be2ff0fb3706adf4cf91a0e18c43b59885 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ahelenia=20Ziemia=C5=84ska?= Date: Mon, 8 Mar 2021 18:42:03 +0100 Subject: [PATCH 095/269] HID: multitouch: require Finger field to mark Win8 reports as MT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit a2353e3b26012ff43bcdf81d37a3eaddd7ecdbf3 upstream. This effectively changes collection_is_mt from contact ID in report->field to (device is Win8 => collection is finger) && contact ID in report->field Some devices erroneously report Pen for fingers, and Win8 stylus-on-touchscreen devices report contact ID, but mark the accompanying touchscreen device's collection correctly Cc: stable@vger.kernel.org Signed-off-by: Ahelenia Ziemiańska Acked-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman --- drivers/hid/hid-multitouch.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 8429ebe7097e4..8580ace596c25 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -604,9 +604,13 @@ static struct mt_report_data *mt_allocate_report_data(struct mt_device *td, if (!(HID_MAIN_ITEM_VARIABLE & field->flags)) continue; - for (n = 0; n < field->report_count; n++) { - if (field->usage[n].hid == HID_DG_CONTACTID) - rdata->is_mt_collection = true; + if (field->logical == HID_DG_FINGER || td->hdev->group != HID_GROUP_MULTITOUCH_WIN_8) { + for (n = 0; n < field->report_count; n++) { + if (field->usage[n].hid == HID_DG_CONTACTID) { + rdata->is_mt_collection = true; + break; + } + } } } From d11e5b96efde861e6c753c15e694f3ee34e7a513 Mon Sep 17 00:00:00 2001 From: Bob Peterson Date: Tue, 18 May 2021 09:12:10 -0400 Subject: [PATCH 096/269] gfs2: fix scheduling while atomic bug in glocks commit 20265d9a67e40eafd39a8884658ca2e36f05985d upstream. Before this patch, in the unlikely event that gfs2_glock_dq encountered a withdraw, it would do a wait_on_bit to wait for its journal to be recovered, but it never released the glock's spin_lock, which caused a scheduling-while-atomic error. This patch unlocks the lockref spin_lock before waiting for recovery. Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Cc: stable@vger.kernel.org # v5.7+ Reported-by: Alexander Aring Signed-off-by: Bob Peterson Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman --- fs/gfs2/glock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 35a6fd103761b..ea2f2de448063 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -1457,9 +1457,11 @@ void gfs2_glock_dq(struct gfs2_holder *gh) glock_blocked_by_withdraw(gl) && gh->gh_gl != sdp->sd_jinode_gl) { sdp->sd_glock_dqs_held++; + spin_unlock(&gl->gl_lockref.lock); might_sleep(); wait_on_bit(&sdp->sd_flags, SDF_WITHDRAW_RECOVERY, TASK_UNINTERRUPTIBLE); + spin_lock(&gl->gl_lockref.lock); } if (gh->gh_flags & GL_NOCACHE) handle_callback(gl, LM_ST_UNLOCKED, 0, false); From 029c06103e0ad669e6bec74f26b006198041326d Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 2 Jun 2021 13:38:23 +0200 Subject: [PATCH 097/269] ALSA: timer: Fix master timer notification commit 9c1fe96bded935369f8340c2ac2e9e189f697d5d upstream. snd_timer_notify1() calls the notification to each slave for a master event, but it passes a wrong event number. It should be +10 offset, corresponding to SNDRV_TIMER_EVENT_MXXX, but it's incorrectly with +100 offset. Casually this was spotted by UBSAN check via syzkaller. Reported-by: syzbot+d102fa5b35335a7e544e@syzkaller.appspotmail.com Reviewed-by: Jaroslav Kysela Cc: Link: https://lore.kernel.org/r/000000000000e5560e05c3bd1d63@google.com Link: https://lore.kernel.org/r/20210602113823.23777-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/core/timer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/core/timer.c b/sound/core/timer.c index 765ea66665a8c..c15c8314671b7 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -520,9 +520,10 @@ static void snd_timer_notify1(struct snd_timer_instance *ti, int event) return; if (timer->hw.flags & SNDRV_TIMER_HW_SLAVE) return; + event += 10; /* convert to SNDRV_TIMER_EVENT_MXXX */ list_for_each_entry(ts, &ti->slave_active_head, active_list) if (ts->ccallback) - ts->ccallback(ts, event + 100, &tstamp, resolution); + ts->ccallback(ts, event, &tstamp, resolution); } /* start/continue a master timer */ From cfbb57fcb1808d42a410339d7d81936c91abbbf5 Mon Sep 17 00:00:00 2001 From: Carlos M Date: Mon, 31 May 2021 22:20:26 +0200 Subject: [PATCH 098/269] ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx commit 901be145a46eb79879367d853194346a549e623d upstream. For the HP Pavilion 15-CK0xx, with audio subsystem ID 0x103c:0x841c, adding a line in patch_realtek.c to apply the ALC269_FIXUP_HP_MUTE_LED_MIC3 fix activates the mute key LED. Signed-off-by: Carlos M Cc: Link: https://lore.kernel.org/r/20210531202026.35427-1-carlos.marr.pz@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index d8424d226714f..cc13a68197f3c 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8289,6 +8289,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x82bf, "HP G3 mini", ALC221_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x82c0, "HP G3 mini premium", ALC221_FIXUP_HP_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x83b9, "HP Spectre x360", ALC269_FIXUP_HP_MUTE_LED_MIC3), + SND_PCI_QUIRK(0x103c, 0x841c, "HP Pavilion 15-CK0xx", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x103c, 0x8497, "HP Envy x360", ALC269_FIXUP_HP_MUTE_LED_MIC3), SND_PCI_QUIRK(0x103c, 0x84da, "HP OMEN dc0019-ur", ALC295_FIXUP_HP_OMEN), SND_PCI_QUIRK(0x103c, 0x84e7, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3), From 846848c0520f3391a2ef41c90f26ebef858971af Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Wed, 2 Jun 2021 22:54:24 +0800 Subject: [PATCH 099/269] ALSA: hda: update the power_state during the direct-complete commit b8b90c17602689eeaa5b219d104bbc215d1225cc upstream. The patch_realtek.c needs to check if the power_state.event equals PM_EVENT_SUSPEND, after using the direct-complete, the suspend() and resume() will be skipped if the codec is already rt_suspended, in this case, the patch_realtek.c will always get PM_EVENT_ON even the system is really resumed from S3. We could set power_state to PMSG_SUSPEND in the prepare(), if other PM functions are called before complete(), those functions will override power_state; if no other PM functions are called before complete(), we could know the suspend() and resume() are skipped since only S3 pm functions could be skipped by direct-complete, in this case set power_state to PMSG_RESUME in the complete(). This could guarantee the first time of calling hda_codec_runtime_resume() after complete() has the correct power_state. Fixes: 215a22ed31a1 ("ALSA: hda: Refactor codec PM to use direct-complete optimization") Cc: Signed-off-by: Hui Wang Link: https://lore.kernel.org/r/20210602145424.3132-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/hda_codec.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index eec1775dfffe9..4cec1bd77e6fe 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -2973,6 +2973,7 @@ static int hda_codec_runtime_resume(struct device *dev) #ifdef CONFIG_PM_SLEEP static int hda_codec_pm_prepare(struct device *dev) { + dev->power.power_state = PMSG_SUSPEND; return pm_runtime_suspended(dev); } @@ -2980,6 +2981,10 @@ static void hda_codec_pm_complete(struct device *dev) { struct hda_codec *codec = dev_to_hda_codec(dev); + /* If no other pm-functions are called between prepare() and complete() */ + if (dev->power.power_state.event == PM_EVENT_SUSPEND) + dev->power.power_state = PMSG_RESUME; + if (pm_runtime_suspended(dev) && (codec->jackpoll_interval || hda_codec_need_resume(codec) || codec->forced_resume)) pm_request_resume(dev); From 623603e255aa00642a9d3327a652639407eb0501 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Vok=C3=A1=C4=8D?= Date: Tue, 13 Apr 2021 16:45:57 +0200 Subject: [PATCH 100/269] ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 0e4a4a08cd78efcaddbc2e4c5ed86b5a5cb8a15e upstream. The FEC does not have a PHY so it should not have a phy-handle. It is connected to the switch at RGMII level so we need a fixed-link sub-node on both ends. This was not a problem until the qca8k.c driver was converted to PHYLINK by commit b3591c2a3661 ("net: dsa: qca8k: Switch to PHYLINK instead of PHYLIB"). That commit revealed the FEC configuration was not correct. Fixes: 87489ec3a77f ("ARM: dts: imx: Add Y Soft IOTA Draco, Hydra and Ursa boards") Cc: stable@vger.kernel.org Signed-off-by: Michal Vokáč Reviewed-by: Andrew Lunn Signed-off-by: Shawn Guo Signed-off-by: Greg Kroah-Hartman --- arch/arm/boot/dts/imx6dl-yapp4-common.dtsi | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi b/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi index 7d2c72562c735..9148a01ed6d9f 100644 --- a/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi +++ b/arch/arm/boot/dts/imx6dl-yapp4-common.dtsi @@ -105,9 +105,13 @@ phy-reset-gpios = <&gpio1 25 GPIO_ACTIVE_LOW>; phy-reset-duration = <20>; phy-supply = <&sw2_reg>; - phy-handle = <ðphy0>; status = "okay"; + fixed-link { + speed = <1000>; + full-duplex; + }; + mdio { #address-cells = <1>; #size-cells = <0>; From b2057d138f1b1141e00fdd9bee364bb8438cf223 Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Mon, 26 Apr 2021 12:23:21 +0200 Subject: [PATCH 101/269] ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators commit 8967b27a6c1c19251989c7ab33c058d16e4a5f53 upstream. Per schematic, both PU and SOC regulator are supplied from LTC3676 SW1 via VDDSOC_IN rail, add the PU input. Both VDD1P1, VDD2P5 are supplied from LTC3676 SW2 via VDDHIGH_IN rail, add both inputs. While no instability or problems are currently observed, the regulators should be fully described in DT and that description should fully match the hardware, else this might lead to unforseen issues later. Fix this. Fixes: 52c7a088badd ("ARM: dts: imx6q: Add support for the DHCOM iMX6 SoM and PDK2") Reviewed-by: Fabio Estevam Signed-off-by: Marek Vasut Cc: Christoph Niedermaier Cc: Fabio Estevam Cc: Ludwig Zenz Cc: NXP Linux Team Cc: Shawn Guo Cc: stable@vger.kernel.org Reviewed-by: Christoph Niedermaier Signed-off-by: Shawn Guo Signed-off-by: Greg Kroah-Hartman --- arch/arm/boot/dts/imx6q-dhcom-som.dtsi | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/arm/boot/dts/imx6q-dhcom-som.dtsi b/arch/arm/boot/dts/imx6q-dhcom-som.dtsi index 236fc205c3890..d0768ae429faa 100644 --- a/arch/arm/boot/dts/imx6q-dhcom-som.dtsi +++ b/arch/arm/boot/dts/imx6q-dhcom-som.dtsi @@ -406,6 +406,18 @@ vin-supply = <&sw1_reg>; }; +®_pu { + vin-supply = <&sw1_reg>; +}; + +®_vdd1p1 { + vin-supply = <&sw2_reg>; +}; + +®_vdd2p5 { + vin-supply = <&sw2_reg>; +}; + &uart1 { pinctrl-names = "default"; pinctrl-0 = <&pinctrl_uart1>; From 01d349a481f0591230300a9171330136f9159bcd Mon Sep 17 00:00:00 2001 From: Alexey Makhalov Date: Fri, 21 May 2021 07:55:33 +0000 Subject: [PATCH 102/269] ext4: fix memory leak in ext4_fill_super commit afd09b617db3786b6ef3dc43e28fe728cfea84df upstream. Buffer head references must be released before calling kill_bdev(); otherwise the buffer head (and its page referenced by b_data) will not be freed by kill_bdev, and subsequently that bh will be leaked. If blocksizes differ, sb_set_blocksize() will kill current buffers and page cache by using kill_bdev(). And then super block will be reread again but using correct blocksize this time. sb_set_blocksize() didn't fully free superblock page and buffer head, and being busy, they were not freed and instead leaked. This can easily be reproduced by calling an infinite loop of: systemctl start .mount, and systemctl stop .mount ... since systemd creates a cgroup for each slice which it mounts, and the bh leak get amplified by a dying memory cgroup that also never gets freed, and memory consumption is much more easily noticed. Fixes: ce40733ce93d ("ext4: Check for return value from sb_set_blocksize") Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3") Link: https://lore.kernel.org/r/20210521075533.95732-1-amakhalov@vmware.com Signed-off-by: Alexey Makhalov Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman --- fs/ext4/super.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index c7f5b665834fc..21c4ba2513ce5 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4451,14 +4451,20 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) } if (sb->s_blocksize != blocksize) { + /* + * bh must be released before kill_bdev(), otherwise + * it won't be freed and its page also. kill_bdev() + * is called by sb_set_blocksize(). + */ + brelse(bh); /* Validate the filesystem blocksize */ if (!sb_set_blocksize(sb, blocksize)) { ext4_msg(sb, KERN_ERR, "bad block size %d", blocksize); + bh = NULL; goto failed_mount; } - brelse(bh); logical_sb_block = sb_block * EXT4_MIN_BLOCK_SIZE; offset = do_div(logical_sb_block, blocksize); bh = ext4_sb_bread_unmovable(sb, logical_sb_block); @@ -5181,8 +5187,9 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) kfree(get_qf_name(sb, sbi, i)); #endif fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy); - ext4_blkdev_remove(sbi); + /* ext4_blkdev_remove() calls kill_bdev(), release bh before it. */ brelse(bh); + ext4_blkdev_remove(sbi); out_fail: sb->s_fs_info = NULL; kfree(sbi->s_blockgroup_lock); From d3b668b96ad3192c0581a248ae2f596cd054792a Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Thu, 6 May 2021 22:10:42 +0800 Subject: [PATCH 103/269] ext4: fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed commit 082cd4ec240b8734a82a89ffb890216ac98fec68 upstream. We got follow bug_on when run fsstress with injecting IO fault: [130747.323114] kernel BUG at fs/ext4/extents_status.c:762! [130747.323117] Internal error: Oops - BUG: 0 [#1] SMP ...... [130747.334329] Call trace: [130747.334553] ext4_es_cache_extent+0x150/0x168 [ext4] [130747.334975] ext4_cache_extents+0x64/0xe8 [ext4] [130747.335368] ext4_find_extent+0x300/0x330 [ext4] [130747.335759] ext4_ext_map_blocks+0x74/0x1178 [ext4] [130747.336179] ext4_map_blocks+0x2f4/0x5f0 [ext4] [130747.336567] ext4_mpage_readpages+0x4a8/0x7a8 [ext4] [130747.336995] ext4_readpage+0x54/0x100 [ext4] [130747.337359] generic_file_buffered_read+0x410/0xae8 [130747.337767] generic_file_read_iter+0x114/0x190 [130747.338152] ext4_file_read_iter+0x5c/0x140 [ext4] [130747.338556] __vfs_read+0x11c/0x188 [130747.338851] vfs_read+0x94/0x150 [130747.339110] ksys_read+0x74/0xf0 This patch's modification is according to Jan Kara's suggestion in: https://patchwork.ozlabs.org/project/linux-ext4/patch/20210428085158.3728201-1-yebin10@huawei.com/ "I see. Now I understand your patch. Honestly, seeing how fragile is trying to fix extent tree after split has failed in the middle, I would probably go even further and make sure we fix the tree properly in case of ENOSPC and EDQUOT (those are easily user triggerable). Anything else indicates a HW problem or fs corruption so I'd rather leave the extent tree as is and don't try to fix it (which also means we will not create overlapping extents)." Cc: stable@kernel.org Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20210506141042.3298679-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman --- fs/ext4/extents.c | 43 +++++++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 20 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 12eac88373032..e6542ba264330 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3206,7 +3206,10 @@ static int ext4_split_extent_at(handle_t *handle, ext4_ext_mark_unwritten(ex2); err = ext4_ext_insert_extent(handle, inode, ppath, &newex, flags); - if (err == -ENOSPC && (EXT4_EXT_MAY_ZEROOUT & split_flag)) { + if (err != -ENOSPC && err != -EDQUOT) + goto out; + + if (EXT4_EXT_MAY_ZEROOUT & split_flag) { if (split_flag & (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) { if (split_flag & EXT4_EXT_DATA_VALID1) { err = ext4_ext_zeroout(inode, ex2); @@ -3232,25 +3235,22 @@ static int ext4_split_extent_at(handle_t *handle, ext4_ext_pblock(&orig_ex)); } - if (err) - goto fix_extent_len; - /* update the extent length and mark as initialized */ - ex->ee_len = cpu_to_le16(ee_len); - ext4_ext_try_to_merge(handle, inode, path, ex); - err = ext4_ext_dirty(handle, inode, path + path->p_depth); - if (err) - goto fix_extent_len; - - /* update extent status tree */ - err = ext4_zeroout_es(inode, &zero_ex); - - goto out; - } else if (err) - goto fix_extent_len; - -out: - ext4_ext_show_leaf(inode, path); - return err; + if (!err) { + /* update the extent length and mark as initialized */ + ex->ee_len = cpu_to_le16(ee_len); + ext4_ext_try_to_merge(handle, inode, path, ex); + err = ext4_ext_dirty(handle, inode, path + path->p_depth); + if (!err) + /* update extent status tree */ + err = ext4_zeroout_es(inode, &zero_ex); + /* If we failed at this point, we don't know in which + * state the extent tree exactly is so don't try to fix + * length of the original extent as it may do even more + * damage. + */ + goto out; + } + } fix_extent_len: ex->ee_len = orig_ex.ee_len; @@ -3260,6 +3260,9 @@ static int ext4_split_extent_at(handle_t *handle, */ ext4_ext_dirty(handle, inode, path + path->p_depth); return err; +out: + ext4_ext_show_leaf(inode, path); + return err; } /* From fb86acc6236934e2b654f50d2a97fa77eac17859 Mon Sep 17 00:00:00 2001 From: Harshad Shirwadkar Date: Wed, 19 May 2021 14:59:20 -0700 Subject: [PATCH 104/269] ext4: fix fast commit alignment issues commit a7ba36bc94f20b6c77f16364b9a23f582ea8faac upstream. Fast commit recovery data on disk may not be aligned. So, when the recovery code reads it, this patch makes sure that fast commit info found on-disk is first memcpy-ed into an aligned variable before accessing it. As a consequence of it, we also remove some macros that could resulted in unaligned accesses. Cc: stable@kernel.org Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Harshad Shirwadkar Link: https://lore.kernel.org/r/20210519215920.2037527-1-harshads@google.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman --- fs/ext4/fast_commit.c | 182 +++++++++++++++++++++--------------------- fs/ext4/fast_commit.h | 7 -- 2 files changed, 90 insertions(+), 99 deletions(-) diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c index 896e1176e0449..53647fa038773 100644 --- a/fs/ext4/fast_commit.c +++ b/fs/ext4/fast_commit.c @@ -1227,18 +1227,6 @@ static void ext4_fc_cleanup(journal_t *journal, int full) /* Ext4 Replay Path Routines */ -/* Get length of a particular tlv */ -static inline int ext4_fc_tag_len(struct ext4_fc_tl *tl) -{ - return le16_to_cpu(tl->fc_len); -} - -/* Get a pointer to "value" of a tlv */ -static inline u8 *ext4_fc_tag_val(struct ext4_fc_tl *tl) -{ - return (u8 *)tl + sizeof(*tl); -} - /* Helper struct for dentry replay routines */ struct dentry_info_args { int parent_ino, dname_len, ino, inode_len; @@ -1246,28 +1234,29 @@ struct dentry_info_args { }; static inline void tl_to_darg(struct dentry_info_args *darg, - struct ext4_fc_tl *tl) + struct ext4_fc_tl *tl, u8 *val) { - struct ext4_fc_dentry_info *fcd; + struct ext4_fc_dentry_info fcd; - fcd = (struct ext4_fc_dentry_info *)ext4_fc_tag_val(tl); + memcpy(&fcd, val, sizeof(fcd)); - darg->parent_ino = le32_to_cpu(fcd->fc_parent_ino); - darg->ino = le32_to_cpu(fcd->fc_ino); - darg->dname = fcd->fc_dname; - darg->dname_len = ext4_fc_tag_len(tl) - - sizeof(struct ext4_fc_dentry_info); + darg->parent_ino = le32_to_cpu(fcd.fc_parent_ino); + darg->ino = le32_to_cpu(fcd.fc_ino); + darg->dname = val + offsetof(struct ext4_fc_dentry_info, fc_dname); + darg->dname_len = le16_to_cpu(tl->fc_len) - + sizeof(struct ext4_fc_dentry_info); } /* Unlink replay function */ -static int ext4_fc_replay_unlink(struct super_block *sb, struct ext4_fc_tl *tl) +static int ext4_fc_replay_unlink(struct super_block *sb, struct ext4_fc_tl *tl, + u8 *val) { struct inode *inode, *old_parent; struct qstr entry; struct dentry_info_args darg; int ret = 0; - tl_to_darg(&darg, tl); + tl_to_darg(&darg, tl, val); trace_ext4_fc_replay(sb, EXT4_FC_TAG_UNLINK, darg.ino, darg.parent_ino, darg.dname_len); @@ -1357,13 +1346,14 @@ static int ext4_fc_replay_link_internal(struct super_block *sb, } /* Link replay function */ -static int ext4_fc_replay_link(struct super_block *sb, struct ext4_fc_tl *tl) +static int ext4_fc_replay_link(struct super_block *sb, struct ext4_fc_tl *tl, + u8 *val) { struct inode *inode; struct dentry_info_args darg; int ret = 0; - tl_to_darg(&darg, tl); + tl_to_darg(&darg, tl, val); trace_ext4_fc_replay(sb, EXT4_FC_TAG_LINK, darg.ino, darg.parent_ino, darg.dname_len); @@ -1408,9 +1398,10 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino) /* * Inode replay function */ -static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl) +static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl, + u8 *val) { - struct ext4_fc_inode *fc_inode; + struct ext4_fc_inode fc_inode; struct ext4_inode *raw_inode; struct ext4_inode *raw_fc_inode; struct inode *inode = NULL; @@ -1418,9 +1409,9 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl) int inode_len, ino, ret, tag = le16_to_cpu(tl->fc_tag); struct ext4_extent_header *eh; - fc_inode = (struct ext4_fc_inode *)ext4_fc_tag_val(tl); + memcpy(&fc_inode, val, sizeof(fc_inode)); - ino = le32_to_cpu(fc_inode->fc_ino); + ino = le32_to_cpu(fc_inode.fc_ino); trace_ext4_fc_replay(sb, tag, ino, 0, 0); inode = ext4_iget(sb, ino, EXT4_IGET_NORMAL); @@ -1432,12 +1423,13 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl) ext4_fc_record_modified_inode(sb, ino); - raw_fc_inode = (struct ext4_inode *)fc_inode->fc_raw_inode; + raw_fc_inode = (struct ext4_inode *) + (val + offsetof(struct ext4_fc_inode, fc_raw_inode)); ret = ext4_get_fc_inode_loc(sb, ino, &iloc); if (ret) goto out; - inode_len = ext4_fc_tag_len(tl) - sizeof(struct ext4_fc_inode); + inode_len = le16_to_cpu(tl->fc_len) - sizeof(struct ext4_fc_inode); raw_inode = ext4_raw_inode(&iloc); memcpy(raw_inode, raw_fc_inode, offsetof(struct ext4_inode, i_block)); @@ -1505,14 +1497,15 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl) * inode for which we are trying to create a dentry here, should already have * been replayed before we start here. */ -static int ext4_fc_replay_create(struct super_block *sb, struct ext4_fc_tl *tl) +static int ext4_fc_replay_create(struct super_block *sb, struct ext4_fc_tl *tl, + u8 *val) { int ret = 0; struct inode *inode = NULL; struct inode *dir = NULL; struct dentry_info_args darg; - tl_to_darg(&darg, tl); + tl_to_darg(&darg, tl, val); trace_ext4_fc_replay(sb, EXT4_FC_TAG_CREAT, darg.ino, darg.parent_ino, darg.dname_len); @@ -1591,9 +1584,9 @@ static int ext4_fc_record_regions(struct super_block *sb, int ino, /* Replay add range tag */ static int ext4_fc_replay_add_range(struct super_block *sb, - struct ext4_fc_tl *tl) + struct ext4_fc_tl *tl, u8 *val) { - struct ext4_fc_add_range *fc_add_ex; + struct ext4_fc_add_range fc_add_ex; struct ext4_extent newex, *ex; struct inode *inode; ext4_lblk_t start, cur; @@ -1603,15 +1596,14 @@ static int ext4_fc_replay_add_range(struct super_block *sb, struct ext4_ext_path *path = NULL; int ret; - fc_add_ex = (struct ext4_fc_add_range *)ext4_fc_tag_val(tl); - ex = (struct ext4_extent *)&fc_add_ex->fc_ex; + memcpy(&fc_add_ex, val, sizeof(fc_add_ex)); + ex = (struct ext4_extent *)&fc_add_ex.fc_ex; trace_ext4_fc_replay(sb, EXT4_FC_TAG_ADD_RANGE, - le32_to_cpu(fc_add_ex->fc_ino), le32_to_cpu(ex->ee_block), + le32_to_cpu(fc_add_ex.fc_ino), le32_to_cpu(ex->ee_block), ext4_ext_get_actual_len(ex)); - inode = ext4_iget(sb, le32_to_cpu(fc_add_ex->fc_ino), - EXT4_IGET_NORMAL); + inode = ext4_iget(sb, le32_to_cpu(fc_add_ex.fc_ino), EXT4_IGET_NORMAL); if (IS_ERR(inode)) { jbd_debug(1, "Inode not found."); return 0; @@ -1720,32 +1712,33 @@ static int ext4_fc_replay_add_range(struct super_block *sb, /* Replay DEL_RANGE tag */ static int -ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl) +ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl, + u8 *val) { struct inode *inode; - struct ext4_fc_del_range *lrange; + struct ext4_fc_del_range lrange; struct ext4_map_blocks map; ext4_lblk_t cur, remaining; int ret; - lrange = (struct ext4_fc_del_range *)ext4_fc_tag_val(tl); - cur = le32_to_cpu(lrange->fc_lblk); - remaining = le32_to_cpu(lrange->fc_len); + memcpy(&lrange, val, sizeof(lrange)); + cur = le32_to_cpu(lrange.fc_lblk); + remaining = le32_to_cpu(lrange.fc_len); trace_ext4_fc_replay(sb, EXT4_FC_TAG_DEL_RANGE, - le32_to_cpu(lrange->fc_ino), cur, remaining); + le32_to_cpu(lrange.fc_ino), cur, remaining); - inode = ext4_iget(sb, le32_to_cpu(lrange->fc_ino), EXT4_IGET_NORMAL); + inode = ext4_iget(sb, le32_to_cpu(lrange.fc_ino), EXT4_IGET_NORMAL); if (IS_ERR(inode)) { - jbd_debug(1, "Inode %d not found", le32_to_cpu(lrange->fc_ino)); + jbd_debug(1, "Inode %d not found", le32_to_cpu(lrange.fc_ino)); return 0; } ret = ext4_fc_record_modified_inode(sb, inode->i_ino); jbd_debug(1, "DEL_RANGE, inode %ld, lblk %d, len %d\n", - inode->i_ino, le32_to_cpu(lrange->fc_lblk), - le32_to_cpu(lrange->fc_len)); + inode->i_ino, le32_to_cpu(lrange.fc_lblk), + le32_to_cpu(lrange.fc_len)); while (remaining > 0) { map.m_lblk = cur; map.m_len = remaining; @@ -1766,8 +1759,8 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl) } ret = ext4_punch_hole(inode, - le32_to_cpu(lrange->fc_lblk) << sb->s_blocksize_bits, - le32_to_cpu(lrange->fc_len) << sb->s_blocksize_bits); + le32_to_cpu(lrange.fc_lblk) << sb->s_blocksize_bits, + le32_to_cpu(lrange.fc_len) << sb->s_blocksize_bits); if (ret) jbd_debug(1, "ext4_punch_hole returned %d", ret); ext4_ext_replay_shrink_inode(inode, @@ -1909,11 +1902,11 @@ static int ext4_fc_replay_scan(journal_t *journal, struct ext4_sb_info *sbi = EXT4_SB(sb); struct ext4_fc_replay_state *state; int ret = JBD2_FC_REPLAY_CONTINUE; - struct ext4_fc_add_range *ext; - struct ext4_fc_tl *tl; - struct ext4_fc_tail *tail; - __u8 *start, *end; - struct ext4_fc_head *head; + struct ext4_fc_add_range ext; + struct ext4_fc_tl tl; + struct ext4_fc_tail tail; + __u8 *start, *end, *cur, *val; + struct ext4_fc_head head; struct ext4_extent *ex; state = &sbi->s_fc_replay_state; @@ -1940,15 +1933,17 @@ static int ext4_fc_replay_scan(journal_t *journal, } state->fc_replay_expected_off++; - fc_for_each_tl(start, end, tl) { + for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) { + memcpy(&tl, cur, sizeof(tl)); + val = cur + sizeof(tl); jbd_debug(3, "Scan phase, tag:%s, blk %lld\n", - tag2str(le16_to_cpu(tl->fc_tag)), bh->b_blocknr); - switch (le16_to_cpu(tl->fc_tag)) { + tag2str(le16_to_cpu(tl.fc_tag)), bh->b_blocknr); + switch (le16_to_cpu(tl.fc_tag)) { case EXT4_FC_TAG_ADD_RANGE: - ext = (struct ext4_fc_add_range *)ext4_fc_tag_val(tl); - ex = (struct ext4_extent *)&ext->fc_ex; + memcpy(&ext, val, sizeof(ext)); + ex = (struct ext4_extent *)&ext.fc_ex; ret = ext4_fc_record_regions(sb, - le32_to_cpu(ext->fc_ino), + le32_to_cpu(ext.fc_ino), le32_to_cpu(ex->ee_block), ext4_ext_pblock(ex), ext4_ext_get_actual_len(ex)); if (ret < 0) @@ -1962,18 +1957,18 @@ static int ext4_fc_replay_scan(journal_t *journal, case EXT4_FC_TAG_INODE: case EXT4_FC_TAG_PAD: state->fc_cur_tag++; - state->fc_crc = ext4_chksum(sbi, state->fc_crc, tl, - sizeof(*tl) + ext4_fc_tag_len(tl)); + state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur, + sizeof(tl) + le16_to_cpu(tl.fc_len)); break; case EXT4_FC_TAG_TAIL: state->fc_cur_tag++; - tail = (struct ext4_fc_tail *)ext4_fc_tag_val(tl); - state->fc_crc = ext4_chksum(sbi, state->fc_crc, tl, - sizeof(*tl) + + memcpy(&tail, val, sizeof(tail)); + state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur, + sizeof(tl) + offsetof(struct ext4_fc_tail, fc_crc)); - if (le32_to_cpu(tail->fc_tid) == expected_tid && - le32_to_cpu(tail->fc_crc) == state->fc_crc) { + if (le32_to_cpu(tail.fc_tid) == expected_tid && + le32_to_cpu(tail.fc_crc) == state->fc_crc) { state->fc_replay_num_tags = state->fc_cur_tag; state->fc_regions_valid = state->fc_regions_used; @@ -1984,19 +1979,19 @@ static int ext4_fc_replay_scan(journal_t *journal, state->fc_crc = 0; break; case EXT4_FC_TAG_HEAD: - head = (struct ext4_fc_head *)ext4_fc_tag_val(tl); - if (le32_to_cpu(head->fc_features) & + memcpy(&head, val, sizeof(head)); + if (le32_to_cpu(head.fc_features) & ~EXT4_FC_SUPPORTED_FEATURES) { ret = -EOPNOTSUPP; break; } - if (le32_to_cpu(head->fc_tid) != expected_tid) { + if (le32_to_cpu(head.fc_tid) != expected_tid) { ret = JBD2_FC_REPLAY_STOP; break; } state->fc_cur_tag++; - state->fc_crc = ext4_chksum(sbi, state->fc_crc, tl, - sizeof(*tl) + ext4_fc_tag_len(tl)); + state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur, + sizeof(tl) + le16_to_cpu(tl.fc_len)); break; default: ret = state->fc_replay_num_tags ? @@ -2020,11 +2015,11 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh, { struct super_block *sb = journal->j_private; struct ext4_sb_info *sbi = EXT4_SB(sb); - struct ext4_fc_tl *tl; - __u8 *start, *end; + struct ext4_fc_tl tl; + __u8 *start, *end, *cur, *val; int ret = JBD2_FC_REPLAY_CONTINUE; struct ext4_fc_replay_state *state = &sbi->s_fc_replay_state; - struct ext4_fc_tail *tail; + struct ext4_fc_tail tail; if (pass == PASS_SCAN) { state->fc_current_pass = PASS_SCAN; @@ -2051,49 +2046,52 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh, start = (u8 *)bh->b_data; end = (__u8 *)bh->b_data + journal->j_blocksize - 1; - fc_for_each_tl(start, end, tl) { + for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) { + memcpy(&tl, cur, sizeof(tl)); + val = cur + sizeof(tl); + if (state->fc_replay_num_tags == 0) { ret = JBD2_FC_REPLAY_STOP; ext4_fc_set_bitmaps_and_counters(sb); break; } jbd_debug(3, "Replay phase, tag:%s\n", - tag2str(le16_to_cpu(tl->fc_tag))); + tag2str(le16_to_cpu(tl.fc_tag))); state->fc_replay_num_tags--; - switch (le16_to_cpu(tl->fc_tag)) { + switch (le16_to_cpu(tl.fc_tag)) { case EXT4_FC_TAG_LINK: - ret = ext4_fc_replay_link(sb, tl); + ret = ext4_fc_replay_link(sb, &tl, val); break; case EXT4_FC_TAG_UNLINK: - ret = ext4_fc_replay_unlink(sb, tl); + ret = ext4_fc_replay_unlink(sb, &tl, val); break; case EXT4_FC_TAG_ADD_RANGE: - ret = ext4_fc_replay_add_range(sb, tl); + ret = ext4_fc_replay_add_range(sb, &tl, val); break; case EXT4_FC_TAG_CREAT: - ret = ext4_fc_replay_create(sb, tl); + ret = ext4_fc_replay_create(sb, &tl, val); break; case EXT4_FC_TAG_DEL_RANGE: - ret = ext4_fc_replay_del_range(sb, tl); + ret = ext4_fc_replay_del_range(sb, &tl, val); break; case EXT4_FC_TAG_INODE: - ret = ext4_fc_replay_inode(sb, tl); + ret = ext4_fc_replay_inode(sb, &tl, val); break; case EXT4_FC_TAG_PAD: trace_ext4_fc_replay(sb, EXT4_FC_TAG_PAD, 0, - ext4_fc_tag_len(tl), 0); + le16_to_cpu(tl.fc_len), 0); break; case EXT4_FC_TAG_TAIL: trace_ext4_fc_replay(sb, EXT4_FC_TAG_TAIL, 0, - ext4_fc_tag_len(tl), 0); - tail = (struct ext4_fc_tail *)ext4_fc_tag_val(tl); - WARN_ON(le32_to_cpu(tail->fc_tid) != expected_tid); + le16_to_cpu(tl.fc_len), 0); + memcpy(&tail, val, sizeof(tail)); + WARN_ON(le32_to_cpu(tail.fc_tid) != expected_tid); break; case EXT4_FC_TAG_HEAD: break; default: - trace_ext4_fc_replay(sb, le16_to_cpu(tl->fc_tag), 0, - ext4_fc_tag_len(tl), 0); + trace_ext4_fc_replay(sb, le16_to_cpu(tl.fc_tag), 0, + le16_to_cpu(tl.fc_len), 0); ret = -ECANCELED; break; } diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h index 3a6e5a1fa1b80..d8d0998a5c163 100644 --- a/fs/ext4/fast_commit.h +++ b/fs/ext4/fast_commit.h @@ -146,12 +146,5 @@ struct ext4_fc_replay_state { #define region_last(__region) (((__region)->lblk) + ((__region)->len) - 1) -#define fc_for_each_tl(__start, __end, __tl) \ - for (tl = (struct ext4_fc_tl *)start; \ - (u8 *)tl < (u8 *)end; \ - tl = (struct ext4_fc_tl *)((u8 *)tl + \ - sizeof(struct ext4_fc_tl) + \ - + le16_to_cpu(tl->fc_len))) - #endif /* __FAST_COMMIT_H__ */ From 2050c6e5b161e5e25ce3c420fef58b24fa388a49 Mon Sep 17 00:00:00 2001 From: Phillip Potter Date: Mon, 12 Apr 2021 08:38:37 +0100 Subject: [PATCH 105/269] ext4: fix memory leak in ext4_mb_init_backend on error path. commit a8867f4e3809050571c98de7a2d465aff5e4daf5 upstream. Fix a memory leak discovered by syzbot when a file system is corrupted with an illegally large s_log_groups_per_flex. Reported-by: syzbot+aa12d6106ea4ca1b6aae@syzkaller.appspotmail.com Signed-off-by: Phillip Potter Cc: stable@kernel.org Link: https://lore.kernel.org/r/20210412073837.1686-1-phil@philpotter.co.uk Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman --- fs/ext4/mballoc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index b6229fe1aa233..9c390c3d7fb15 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -2738,7 +2738,7 @@ static int ext4_mb_init_backend(struct super_block *sb) */ if (sbi->s_es->s_log_groups_per_flex >= 32) { ext4_msg(sb, KERN_ERR, "too many log groups per flexible block group"); - goto err_freesgi; + goto err_freebuddy; } sbi->s_mb_prefetch = min_t(uint, 1 << sbi->s_es->s_log_groups_per_flex, BLK_MAX_SEGMENT_SIZE >> (sb->s_blocksize_bits - 9)); From 3b713aafa7c928c53cdc08bc948373566762ce1c Mon Sep 17 00:00:00 2001 From: Ritesh Harjani Date: Thu, 29 Apr 2021 16:13:44 +0530 Subject: [PATCH 106/269] ext4: fix accessing uninit percpu counter variable with fast_commit commit b45f189a19b38e01676628db79cd3eeb1333516e upstream. When running generic/527 with fast_commit configuration, the following issue is seen on Power. With fast_commit, during ext4_fc_replay() (which can be called from ext4_fill_super()), if inode eviction happens then it can access an uninitialized percpu counter variable. This patch adds the check before accessing the counters in ext4_free_inode() path. [ 321.165371] run fstests generic/527 at 2021-04-29 08:38:43 [ 323.027786] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: block_validity. Quota mode: none. [ 323.618772] BUG: Unable to handle kernel data access on read at 0x1fbd80000 [ 323.619767] Faulting instruction address: 0xc000000000bae78c cpu 0x1: Vector: 300 (Data Access) at [c000000010706ef0] pc: c000000000bae78c: percpu_counter_add_batch+0x3c/0x100 lr: c0000000006d0bb0: ext4_free_inode+0x780/0xb90 pid = 5593, comm = mount ext4_free_inode+0x780/0xb90 ext4_evict_inode+0xa8c/0xc60 evict+0xfc/0x1e0 ext4_fc_replay+0xc50/0x20f0 do_one_pass+0xfe0/0x1350 jbd2_journal_recover+0x184/0x2e0 jbd2_journal_load+0x1c0/0x4a0 ext4_fill_super+0x2458/0x4200 mount_bdev+0x1dc/0x290 ext4_mount+0x28/0x40 legacy_get_tree+0x4c/0xa0 vfs_get_tree+0x4c/0x120 path_mount+0xcf8/0xd70 do_mount+0x80/0xd0 sys_mount+0x3fc/0x490 system_call_exception+0x384/0x3d0 system_call_common+0xec/0x278 Cc: stable@kernel.org Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Ritesh Harjani Reviewed-by: Harshad Shirwadkar Link: https://lore.kernel.org/r/6cceb9a75c54bef8fa9696c1b08c8df5ff6169e2.1619692410.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman --- fs/ext4/ialloc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index c92558ede623e..b294ebcb4db4b 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -322,14 +322,16 @@ void ext4_free_inode(handle_t *handle, struct inode *inode) if (is_directory) { count = ext4_used_dirs_count(sb, gdp) - 1; ext4_used_dirs_set(sb, gdp, count); - percpu_counter_dec(&sbi->s_dirs_counter); + if (percpu_counter_initialized(&sbi->s_dirs_counter)) + percpu_counter_dec(&sbi->s_dirs_counter); } ext4_inode_bitmap_csum_set(sb, block_group, gdp, bitmap_bh, EXT4_INODES_PER_GROUP(sb) / 8); ext4_group_desc_csum_set(sb, block_group, gdp); ext4_unlock_group(sb, block_group); - percpu_counter_inc(&sbi->s_freeinodes_counter); + if (percpu_counter_initialized(&sbi->s_freeinodes_counter)) + percpu_counter_inc(&sbi->s_freeinodes_counter); if (sbi->s_log_groups_per_flex) { struct flex_groups *fg; From a4ed12f5edc41abbfe2bce4a215de2f6ff4993eb Mon Sep 17 00:00:00 2001 From: Phil Elwell Date: Tue, 8 Jun 2021 13:00:49 +0100 Subject: [PATCH 107/269] usb: dwc2: Fix build in periphal-only mode In branches to which 24d209dba5a3 ("usb: dwc2: Fix hibernation between host and device modes.") has been back-ported, the bus_suspended member of struct dwc2_hsotg is only present in builds that support host-mode. To avoid having to pull in several more non-Fix commits in order to get it to compile, wrap the usage of the member in a macro conditional. Fixes: 24d209dba5a3 ("usb: dwc2: Fix hibernation between host and device modes.") Signed-off-by: Phil Elwell Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc2/core_intr.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c index 510fd0572feb1..e3f429f1575e9 100644 --- a/drivers/usb/dwc2/core_intr.c +++ b/drivers/usb/dwc2/core_intr.c @@ -707,7 +707,11 @@ static inline void dwc_handle_gpwrdn_disc_det(struct dwc2_hsotg *hsotg, dwc2_writel(hsotg, gpwrdn_tmp, GPWRDN); hsotg->hibernated = 0; + +#if IS_ENABLED(CONFIG_USB_DWC2_HOST) || \ + IS_ENABLED(CONFIG_USB_DWC2_DUAL_ROLE) hsotg->bus_suspended = 0; +#endif if (gpwrdn & GPWRDN_IDSTS) { hsotg->op_state = OTG_STATE_B_PERIPHERAL; From 7178be006d495ffb741c329012da289b62dddfe6 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Fri, 4 Jun 2021 20:01:14 -0700 Subject: [PATCH 108/269] pid: take a reference when initializing `cad_pid` commit 0711f0d7050b9e07c44bc159bbc64ac0a1022c7f upstream. During boot, kernel_init_freeable() initializes `cad_pid` to the init task's struct pid. Later on, we may change `cad_pid` via a sysctl, and when this happens proc_do_cad_pid() will increment the refcount on the new pid via get_pid(), and will decrement the refcount on the old pid via put_pid(). As we never called get_pid() when we initialized `cad_pid`, we decrement a reference we never incremented, can therefore free the init task's struct pid early. As there can be dangling references to the struct pid, we can later encounter a use-after-free (e.g. when delivering signals). This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to have been around since the conversion of `cad_pid` to struct pid in commit 9ec52099e4b8 ("[PATCH] replace cad_pid by a struct pid") from the pre-KASAN stone age of v2.6.19. Fix this by getting a reference to the init task's struct pid when we assign it to `cad_pid`. Full KASAN splat below. ================================================================== BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline] BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273 CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1 Hardware name: linux,dummy-virt (DT) Call trace: ns_of_pid include/linux/pid.h:153 [inline] task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 do_notify_parent+0x308/0xe60 kernel/signal.c:1950 exit_notify kernel/exit.c:682 [inline] do_exit+0x2334/0x2bd0 kernel/exit.c:845 do_group_exit+0x108/0x2c8 kernel/exit.c:922 get_signal+0x4e4/0x2a88 kernel/signal.c:2781 do_signal arch/arm64/kernel/signal.c:882 [inline] do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936 work_pending+0xc/0x2dc Allocated by task 0: slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516 slab_alloc_node mm/slub.c:2907 [inline] slab_alloc mm/slub.c:2915 [inline] kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920 alloc_pid+0xdc/0xc00 kernel/pid.c:180 copy_process+0x2794/0x5e18 kernel/fork.c:2129 kernel_clone+0x194/0x13c8 kernel/fork.c:2500 kernel_thread+0xd4/0x110 kernel/fork.c:2552 rest_init+0x44/0x4a0 init/main.c:687 arch_call_rest_init+0x1c/0x28 start_kernel+0x520/0x554 init/main.c:1064 0x0 Freed by task 270: slab_free_hook mm/slub.c:1562 [inline] slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600 slab_free mm/slub.c:3161 [inline] kmem_cache_free+0x224/0x8e0 mm/slub.c:3177 put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114 put_pid+0x30/0x48 kernel/pid.c:109 proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401 proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591 proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617 call_write_iter include/linux/fs.h:1977 [inline] new_sync_write+0x3ac/0x510 fs/read_write.c:518 vfs_write fs/read_write.c:605 [inline] vfs_write+0x9c4/0x1018 fs/read_write.c:585 ksys_write+0x124/0x240 fs/read_write.c:658 __do_sys_write fs/read_write.c:670 [inline] __se_sys_write fs/read_write.c:667 [inline] __arm64_sys_write+0x78/0xb0 fs/read_write.c:667 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall arch/arm64/kernel/syscall.c:49 [inline] el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129 do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168 el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416 el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432 el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701 The buggy address belongs to the object at ffff23794dda0000 which belongs to the cache pid of size 224 The buggy address is located 4 bytes inside of 224-byte region [ffff23794dda0000, ffff23794dda00e0) The buggy address belongs to the page: page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0 head:(____ptrval____) order:1 compound_mapcount:0 flags: 0x3fffc0000010200(slab|head) raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080 raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 ================================================================== Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com Fixes: 9ec52099e4b8678a ("[PATCH] replace cad_pid by a struct pid") Signed-off-by: Mark Rutland Acked-by: Christian Brauner Cc: Cedric Le Goater Cc: Christian Brauner Cc: Eric W. Biederman Cc: Kees Cook Cc: Paul Mackerras Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- init/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/init/main.c b/init/main.c index d9d9141112511..b4449544390ca 100644 --- a/init/main.c +++ b/init/main.c @@ -1505,7 +1505,7 @@ static noinline void __init kernel_init_freeable(void) */ set_mems_allowed(node_states[N_MEMORY]); - cad_pid = task_pid(current); + cad_pid = get_pid(task_pid(current)); smp_prepare_cpus(setup_max_cpus); From c8d5faee46242c3f33b8a71a4d7d52214785bfcc Mon Sep 17 00:00:00 2001 From: Junxiao Bi Date: Fri, 4 Jun 2021 20:01:42 -0700 Subject: [PATCH 109/269] ocfs2: fix data corruption by fallocate commit 6bba4471f0cc1296fe3c2089b9e52442d3074b2e upstream. When fallocate punches holes out of inode size, if original isize is in the middle of last cluster, then the part from isize to the end of the cluster will be zeroed with buffer write, at that time isize is not yet updated to match the new size, if writeback is kicked in, it will invoke ocfs2_writepage()->block_write_full_page() where the pages out of inode size will be dropped. That will cause file corruption. Fix this by zero out eof blocks when extending the inode size. Running the following command with qemu-image 4.2.1 can get a corrupted coverted image file easily. qemu-img convert -p -t none -T none -f qcow2 $qcow_image \ -O qcow2 -o compat=1.1 $qcow_image.conv The usage of fallocate in qemu is like this, it first punches holes out of inode size, then extend the inode size. fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2276196352, 65536) = 0 fallocate(11, 0, 2276196352, 65536) = 0 v1: https://www.spinics.net/lists/linux-fsdevel/msg193999.html v2: https://lore.kernel.org/linux-fsdevel/20210525093034.GB4112@quack2.suse.cz/T/ Link: https://lkml.kernel.org/r/20210528210648.9124-1-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi Reviewed-by: Joseph Qi Cc: Jan Kara Cc: Mark Fasheh Cc: Joel Becker Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/ocfs2/file.c | 55 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 50 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 8880071ee4ee0..2b296d720c9fa 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1855,6 +1855,45 @@ int ocfs2_remove_inode_range(struct inode *inode, return ret; } +/* + * zero out partial blocks of one cluster. + * + * start: file offset where zero starts, will be made upper block aligned. + * len: it will be trimmed to the end of current cluster if "start + len" + * is bigger than it. + */ +static int ocfs2_zeroout_partial_cluster(struct inode *inode, + u64 start, u64 len) +{ + int ret; + u64 start_block, end_block, nr_blocks; + u64 p_block, offset; + u32 cluster, p_cluster, nr_clusters; + struct super_block *sb = inode->i_sb; + u64 end = ocfs2_align_bytes_to_clusters(sb, start); + + if (start + len < end) + end = start + len; + + start_block = ocfs2_blocks_for_bytes(sb, start); + end_block = ocfs2_blocks_for_bytes(sb, end); + nr_blocks = end_block - start_block; + if (!nr_blocks) + return 0; + + cluster = ocfs2_bytes_to_clusters(sb, start); + ret = ocfs2_get_clusters(inode, cluster, &p_cluster, + &nr_clusters, NULL); + if (ret) + return ret; + if (!p_cluster) + return 0; + + offset = start_block - ocfs2_clusters_to_blocks(sb, cluster); + p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset; + return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS); +} + /* * Parts of this function taken from xfs_change_file_space() */ @@ -1865,7 +1904,7 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, { int ret; s64 llen; - loff_t size; + loff_t size, orig_isize; struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); struct buffer_head *di_bh = NULL; handle_t *handle; @@ -1896,6 +1935,7 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, goto out_inode_unlock; } + orig_isize = i_size_read(inode); switch (sr->l_whence) { case 0: /*SEEK_SET*/ break; @@ -1903,7 +1943,7 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, sr->l_start += f_pos; break; case 2: /*SEEK_END*/ - sr->l_start += i_size_read(inode); + sr->l_start += orig_isize; break; default: ret = -EINVAL; @@ -1957,6 +1997,14 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, default: ret = -EINVAL; } + + /* zeroout eof blocks in the cluster. */ + if (!ret && change_size && orig_isize < size) { + ret = ocfs2_zeroout_partial_cluster(inode, orig_isize, + size - orig_isize); + if (!ret) + i_size_write(inode, size); + } up_write(&OCFS2_I(inode)->ip_alloc_sem); if (ret) { mlog_errno(ret); @@ -1973,9 +2021,6 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, goto out_inode_unlock; } - if (change_size && i_size_read(inode) < size) - i_size_write(inode, size); - inode->i_ctime = inode->i_mtime = current_time(inode); ret = ocfs2_mark_inode_dirty(handle, inode, di_bh); if (ret < 0) From 5f2e1e818e9fe3df198259cdbc070989d7f01c24 Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Fri, 4 Jun 2021 20:01:18 -0700 Subject: [PATCH 110/269] mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() commit 04f7ce3f07ce39b1a3ca03a56b238a53acc52cfd upstream. In pmd/pud_advanced_tests(), the vaddr is aligned up to the next pmd/pud entry, and so it does not match the given pmdp/pudp and (aligned down) pfn any more. For s390, this results in memory corruption, because the IDTE instruction used e.g. in xxx_get_and_clear() will take the vaddr for some calculations, in combination with the given pmdp. It will then end up with a wrong table origin, ending on ...ff8, and some of those wrongly set low-order bits will also select a wrong pagetable level for the index addition. IDTE could therefore invalidate (or 0x20) something outside of the page tables, depending on the wrongly picked index, which in turn depends on the random vaddr. As result, we sometimes see "BUG task_struct (Not tainted): Padding overwritten" on s390, where one 0x5a padding value got overwritten with 0x7a. Fix this by aligning down, similar to how the pmd/pud_aligned pfns are calculated. Link: https://lkml.kernel.org/r/20210525130043.186290-2-gerald.schaefer@linux.ibm.com Fixes: a5c3b9ffb0f40 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers") Signed-off-by: Gerald Schaefer Reviewed-by: Anshuman Khandual Cc: Vineet Gupta Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: [5.9+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- mm/debug_vm_pgtable.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index c05d9dcf78911..750bfef26be37 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -163,7 +163,7 @@ static void __init pmd_advanced_tests(struct mm_struct *mm, pr_debug("Validating PMD advanced\n"); /* Align the address wrt HPAGE_PMD_SIZE */ - vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE; + vaddr &= HPAGE_PMD_MASK; pgtable_trans_huge_deposit(mm, pmdp, pgtable); @@ -285,7 +285,7 @@ static void __init pud_advanced_tests(struct mm_struct *mm, pr_debug("Validating PUD advanced\n"); /* Align the address wrt HPAGE_PUD_SIZE */ - vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE; + vaddr &= HPAGE_PUD_MASK; set_pud_at(mm, vaddr, pudp, pud); pudp_set_wrprotect(mm, vaddr, pudp); From 68dcd32b326a3cb875ae579b120dae34718451a2 Mon Sep 17 00:00:00 2001 From: Ding Hui Date: Fri, 4 Jun 2021 20:01:21 -0700 Subject: [PATCH 111/269] mm/page_alloc: fix counting of free pages after take off from buddy commit bac9c6fa1f929213bbd0ac9cdf21e8e2f0916828 upstream. Recently we found that there is a lot MemFree left in /proc/meminfo after do a lot of pages soft offline, it's not quite correct. Before Oscar's rework of soft offline for free pages [1], if we soft offline free pages, these pages are left in buddy with HWPoison flag, and NR_FREE_PAGES is not updated immediately. So the difference between NR_FREE_PAGES and real number of available free pages is also even big at the beginning. However, with the workload running, when we catch HWPoison page in any alloc functions subsequently, we will remove it from buddy, meanwhile update the NR_FREE_PAGES and try again, so the NR_FREE_PAGES will get more and more closer to the real number of available free pages. (regardless of unpoison_memory()) Now, for offline free pages, after a successful call take_page_off_buddy(), the page is no longer belong to buddy allocator, and will not be used any more, but we missed accounting NR_FREE_PAGES in this situation, and there is no chance to be updated later. Do update in take_page_off_buddy() like rmqueue() does, but avoid double counting if some one already set_migratetype_isolate() on the page. [1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages") Link: https://lkml.kernel.org/r/20210526075247.11130-1-dinghui@sangfor.com.cn Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages") Signed-off-by: Ding Hui Suggested-by: Naoya Horiguchi Reviewed-by: Oscar Salvador Acked-by: David Hildenbrand Acked-by: Naoya Horiguchi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- mm/page_alloc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 7ffa706e5c305..81cc7fdc9c8fd 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -8870,6 +8870,8 @@ bool take_page_off_buddy(struct page *page) del_page_from_free_list(page_head, zone, page_order); break_down_buddy_pages(zone, page_head, page, 0, page_order, migratetype); + if (!is_migrate_isolate(migratetype)) + __mod_zone_freepage_state(zone, -1, migratetype); ret = true; break; } From 942c5864de85dc14602ec875e88e0337896db6d9 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 29 May 2021 11:17:30 +0200 Subject: [PATCH 112/269] x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() commit 9bfecd05833918526cc7357d55e393393440c5fa upstream. While digesting the XSAVE-related horrors which got introduced with the supervisor/user split, the recent addition of ENQCMD-related functionality got on the radar and turned out to be similarly broken. update_pasid(), which is only required when X86_FEATURE_ENQCMD is available, is invoked from two places: 1) From switch_to() for the incoming task 2) Via a SMP function call from the IOMMU/SMV code #1 is half-ways correct as it hacks around the brokenness of get_xsave_addr() by enforcing the state to be 'present', but all the conditionals in that code are completely pointless for that. Also the invocation is just useless overhead because at that point it's guaranteed that TIF_NEED_FPU_LOAD is set on the incoming task and all of this can be handled at return to user space. #2 is broken beyond repair. The comment in the code claims that it is safe to invoke this in an IPI, but that's just wishful thinking. FPU state of a running task is protected by fregs_lock() which is nothing else than a local_bh_disable(). As BH-disabled regions run usually with interrupts enabled the IPI can hit a code section which modifies FPU state and there is absolutely no guarantee that any of the assumptions which are made for the IPI case is true. Also the IPI is sent to all CPUs in mm_cpumask(mm), but the IPI is invoked with a NULL pointer argument, so it can hit a completely unrelated task and unconditionally force an update for nothing. Worse, it can hit a kernel thread which operates on a user space address space and set a random PASID for it. The offending commit does not cleanly revert, but it's sufficient to force disable X86_FEATURE_ENQCMD and to remove the broken update_pasid() code to make this dysfunctional all over the place. Anything more complex would require more surgery and none of the related functions outside of the x86 core code are blatantly wrong, so removing those would be overkill. As nothing enables the PASID bit in the IA32_XSS MSR yet, which is required to make this actually work, this cannot result in a regression except for related out of tree train-wrecks, but they are broken already today. Fixes: 20f0afd1fb3d ("x86/mmu: Allocate/free a PASID") Signed-off-by: Thomas Gleixner Signed-off-by: Borislav Petkov Acked-by: Andy Lutomirski Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtsd6gr9.ffs@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/disabled-features.h | 7 +-- arch/x86/include/asm/fpu/api.h | 6 +-- arch/x86/include/asm/fpu/internal.h | 7 --- arch/x86/kernel/fpu/xstate.c | 57 ------------------------ 4 files changed, 3 insertions(+), 74 deletions(-) diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h index 5861d34f97718..09db5b8f1444a 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -56,11 +56,8 @@ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31)) #endif -#ifdef CONFIG_IOMMU_SUPPORT -# define DISABLE_ENQCMD 0 -#else -# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31)) -#endif +/* Force disable because it's broken beyond repair */ +#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31)) /* * Make sure to add features to the correct mask diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 38f4936045ab6..8b9bfaad6e662 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -79,10 +79,6 @@ extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name); */ #define PASID_DISABLED 0 -#ifdef CONFIG_IOMMU_SUPPORT -/* Update current's PASID MSR/state by mm's PASID. */ -void update_pasid(void); -#else static inline void update_pasid(void) { } -#endif + #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 8d33ad80704f2..ceeba9f631722 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -584,13 +584,6 @@ static inline void switch_fpu_finish(struct fpu *new_fpu) pkru_val = pk->pkru; } __write_pkru(pkru_val); - - /* - * Expensive PASID MSR write will be avoided in update_pasid() because - * TIF_NEED_FPU_LOAD was set. And the PASID state won't be updated - * unless it's different from mm->pasid to reduce overhead. - */ - update_pasid(); } #endif /* _ASM_X86_FPU_INTERNAL_H */ diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 5d8047441a0aa..67f1a03b9b235 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1402,60 +1402,3 @@ int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, return 0; } #endif /* CONFIG_PROC_PID_ARCH_STATUS */ - -#ifdef CONFIG_IOMMU_SUPPORT -void update_pasid(void) -{ - u64 pasid_state; - u32 pasid; - - if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) - return; - - if (!current->mm) - return; - - pasid = READ_ONCE(current->mm->pasid); - /* Set the valid bit in the PASID MSR/state only for valid pasid. */ - pasid_state = pasid == PASID_DISABLED ? - pasid : pasid | MSR_IA32_PASID_VALID; - - /* - * No need to hold fregs_lock() since the task's fpstate won't - * be changed by others (e.g. ptrace) while the task is being - * switched to or is in IPI. - */ - if (!test_thread_flag(TIF_NEED_FPU_LOAD)) { - /* The MSR is active and can be directly updated. */ - wrmsrl(MSR_IA32_PASID, pasid_state); - } else { - struct fpu *fpu = ¤t->thread.fpu; - struct ia32_pasid_state *ppasid_state; - struct xregs_state *xsave; - - /* - * The CPU's xstate registers are not currently active. Just - * update the PASID state in the memory buffer here. The - * PASID MSR will be loaded when returning to user mode. - */ - xsave = &fpu->state.xsave; - xsave->header.xfeatures |= XFEATURE_MASK_PASID; - ppasid_state = get_xsave_addr(xsave, XFEATURE_PASID); - /* - * Since XFEATURE_MASK_PASID is set in xfeatures, ppasid_state - * won't be NULL and no need to check its value. - * - * Only update the task's PASID state when it's different - * from the mm's pasid. - */ - if (ppasid_state->pasid != pasid_state) { - /* - * Invalid fpregs so that state restoring will pick up - * the PASID state. - */ - __fpu_invalidate_fpregs_state(fpu); - ppasid_state->pasid = pasid_state; - } - } -} -#endif /* CONFIG_IOMMU_SUPPORT */ From 445477e9274efd08459b7ccf19578a63c3596082 Mon Sep 17 00:00:00 2001 From: Pu Wen Date: Wed, 2 Jun 2021 15:02:07 +0800 Subject: [PATCH 113/269] x86/sev: Check SME/SEV support in CPUID first commit 009767dbf42ac0dbe3cf48c1ee224f6b778aa85a upstream. The first two bits of the CPUID leaf 0x8000001F EAX indicate whether SEV or SME is supported, respectively. It's better to check whether SEV or SME is actually supported before accessing the MSR_AMD64_SEV to check whether SEV or SME is enabled. This is both a bare-metal issue and a guest/VM issue. Since the first generation Hygon Dhyana CPU doesn't support the MSR_AMD64_SEV, reading that MSR results in a #GP - either directly from hardware in the bare-metal case or via the hypervisor (because the RDMSR is actually intercepted) in the guest/VM case, resulting in a failed boot. And since this is very early in the boot phase, rdmsrl_safe()/native_read_msr_safe() can't be used. So check the CPUID bits first, before accessing the MSR. [ tlendacky: Expand and improve commit message. ] [ bp: Massage commit message. ] Fixes: eab696d8e8b9 ("x86/sev: Do not require Hypervisor CPUID bit for SEV guests") Signed-off-by: Pu Wen Signed-off-by: Borislav Petkov Acked-by: Tom Lendacky Cc: # v5.10+ Link: https://lkml.kernel.org/r/20210602070207.2480-1-puwen@hygon.cn Signed-off-by: Greg Kroah-Hartman --- arch/x86/mm/mem_encrypt_identity.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c index a19374d261013..65f599e9075bc 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -504,10 +504,6 @@ void __init sme_enable(struct boot_params *bp) #define AMD_SME_BIT BIT(0) #define AMD_SEV_BIT BIT(1) - /* Check the SEV MSR whether SEV or SME is enabled */ - sev_status = __rdmsr(MSR_AMD64_SEV); - feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT; - /* * Check for the SME/SEV feature: * CPUID Fn8000_001F[EAX] @@ -519,11 +515,16 @@ void __init sme_enable(struct boot_params *bp) eax = 0x8000001f; ecx = 0; native_cpuid(&eax, &ebx, &ecx, &edx); - if (!(eax & feature_mask)) + /* Check whether SEV or SME is supported */ + if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT))) return; me_mask = 1UL << (ebx & 0x3f); + /* Check the SEV MSR whether SEV or SME is enabled */ + sev_status = __rdmsr(MSR_AMD64_SEV); + feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT; + /* Check if memory encryption is enabled */ if (feature_mask == AMD_SME_BIT) { /* From 48ee0db61c8299022ec88c79ad137f290196cac2 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Mon, 31 May 2021 09:21:38 +0200 Subject: [PATCH 114/269] nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect commit 4ac06a1e013cf5fdd963317ffd3b968560f33bba upstream. It's possible to trigger NULL pointer dereference by local unprivileged user, when calling getsockname() after failed bind() (e.g. the bind fails because LLCP_SAP_MAX used as SAP): BUG: kernel NULL pointer dereference, address: 0000000000000000 CPU: 1 PID: 426 Comm: llcp_sock_getna Not tainted 5.13.0-rc2-next-20210521+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1 04/01/2014 Call Trace: llcp_sock_getname+0xb1/0xe0 __sys_getpeername+0x95/0xc0 ? lockdep_hardirqs_on_prepare+0xd5/0x180 ? syscall_enter_from_user_mode+0x1c/0x40 __x64_sys_getpeername+0x11/0x20 do_syscall_64+0x36/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae This can be reproduced with Syzkaller C repro (bind followed by getpeername): https://syzkaller.appspot.com/x/repro.c?x=14def446e00000 Cc: Fixes: d646960f7986 ("NFC: Initial LLCP support") Reported-by: syzbot+80fb126e7f7d8b1a5914@syzkaller.appspotmail.com Reported-by: butt3rflyh4ck Signed-off-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20210531072138.5219-1-krzysztof.kozlowski@canonical.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman --- net/nfc/llcp_sock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c index 53dbe733f9981..6cfd30fc07985 100644 --- a/net/nfc/llcp_sock.c +++ b/net/nfc/llcp_sock.c @@ -110,6 +110,7 @@ static int llcp_sock_bind(struct socket *sock, struct sockaddr *addr, int alen) if (!llcp_sock->service_name) { nfc_llcp_local_put(llcp_sock->local); llcp_sock->local = NULL; + llcp_sock->dev = NULL; ret = -ENOMEM; goto put_dev; } @@ -119,6 +120,7 @@ static int llcp_sock_bind(struct socket *sock, struct sockaddr *addr, int alen) llcp_sock->local = NULL; kfree(llcp_sock->service_name); llcp_sock->service_name = NULL; + llcp_sock->dev = NULL; ret = -EADDRINUSE; goto put_dev; } From 58da0b509e4b8f4a3a4b1b2e23871d108f81338a Mon Sep 17 00:00:00 2001 From: Luben Tuikov Date: Wed, 12 May 2021 12:33:23 -0400 Subject: [PATCH 115/269] drm/amdgpu: Don't query CE and UE errors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit dce3d8e1d070900e0feeb06787a319ff9379212c upstream. On QUERY2 IOCTL don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards, this takes insurmountably long time, in O(n^3), which slows the system down to the point of it being unusable when we have GUI up. Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2") Cc: Alexander Deucher Cc: stable@vger.kernel.org Signed-off-by: Luben Tuikov Reviewed-by: Alexander Deucher Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 16 ---------------- 1 file changed, 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index c80d8339f58c4..2c1c5f7f98deb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -337,7 +337,6 @@ static int amdgpu_ctx_query2(struct amdgpu_device *adev, { struct amdgpu_ctx *ctx; struct amdgpu_ctx_mgr *mgr; - unsigned long ras_counter; if (!fpriv) return -EINVAL; @@ -362,21 +361,6 @@ static int amdgpu_ctx_query2(struct amdgpu_device *adev, if (atomic_read(&ctx->guilty)) out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY; - /*query ue count*/ - ras_counter = amdgpu_ras_query_error_count(adev, false); - /*ras counter is monotonic increasing*/ - if (ras_counter != ctx->ras_counter_ue) { - out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE; - ctx->ras_counter_ue = ras_counter; - } - - /*query ce count*/ - ras_counter = amdgpu_ras_query_error_count(adev, true); - if (ras_counter != ctx->ras_counter_ce) { - out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE; - ctx->ras_counter_ce = ras_counter; - } - mutex_unlock(&mgr->lock); return 0; } From 3a6b69221f96f87c680bbc9fba01db3415b18f27 Mon Sep 17 00:00:00 2001 From: Nirmoy Das Date: Fri, 28 May 2021 16:54:16 +0200 Subject: [PATCH 116/269] drm/amdgpu: make sure we unpin the UVD BO MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 07438603a07e52f1c6aa731842bd298d2725b7be upstream. Releasing pinned BOs is illegal now. UVD 6 was missing from: commit 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Fixes: 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Cc: stable@vger.kernel.org Signed-off-by: Nirmoy Das Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c index 666bfa4a0b8ea..53f0899eb3166 100644 --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c @@ -356,6 +356,7 @@ static int uvd_v6_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout) error: dma_fence_put(fence); + amdgpu_bo_unpin(bo); amdgpu_bo_unreserve(bo); amdgpu_bo_unref(&bo); return r; From 42f75a4381a4ffb1b7488f90c657ea0b5461d3b7 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Tue, 25 May 2021 13:08:41 +0200 Subject: [PATCH 117/269] x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing commit 7d65f9e80646c595e8c853640a9d0768a33e204c upstream. PIC interrupts do not support affinity setting and they can end up on any online CPU. Therefore, it's required to mark the associated vectors as system-wide reserved. Otherwise, the corresponding irq descriptors are copied to the secondary CPUs but the vectors are not marked as assigned or reserved. This works correctly for the IO/APIC case. When the IO/APIC is disabled via config, kernel command line or lack of enumeration then all legacy interrupts are routed through the PIC, but nothing marks them as system-wide reserved vectors. As a consequence, a subsequent allocation on a secondary CPU can result in allocating one of these vectors, which triggers the BUG() in apic_update_vector() because the interrupt descriptor slot is not empty. Imran tried to work around that by marking those interrupts as allocated when a CPU comes online. But that's wrong in case that the IO/APIC is available and one of the legacy interrupts, e.g. IRQ0, has been switched to PIC mode because then marking them as allocated will fail as they are already marked as system vectors. Stay consistent and update the legacy vectors after attempting IO/APIC initialization and mark them as system vectors in case that no IO/APIC is available. Fixes: 69cde0004a4b ("x86/vector: Use matrix allocator for vector assignment") Reported-by: Imran Khan Signed-off-by: Thomas Gleixner Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210519233928.2157496-1-imran.f.khan@oracle.com Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/apic.h | 1 + arch/x86/kernel/apic/apic.c | 1 + arch/x86/kernel/apic/vector.c | 20 ++++++++++++++++++++ 3 files changed, 22 insertions(+) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 51abd44ab8c2d..3b4412c83eec0 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -174,6 +174,7 @@ static inline int apic_is_clustered_box(void) extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask); extern void lapic_assign_system_vectors(void); extern void lapic_assign_legacy_vector(unsigned int isairq, bool replace); +extern void lapic_update_legacy_vectors(void); extern void lapic_online(void); extern void lapic_offline(void); extern bool apic_needs_pit(void); diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 539f3e88ca7cd..24539a05c58c7 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2539,6 +2539,7 @@ static void __init apic_bsp_setup(bool upmode) end_local_APIC_setup(); irq_remap_enable_fault_handling(); setup_IO_APIC(); + lapic_update_legacy_vectors(); } #ifdef CONFIG_UP_LATE_INIT diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 758bbf25ef748..bd557e9f5dd8e 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -687,6 +687,26 @@ void lapic_assign_legacy_vector(unsigned int irq, bool replace) irq_matrix_assign_system(vector_matrix, ISA_IRQ_VECTOR(irq), replace); } +void __init lapic_update_legacy_vectors(void) +{ + unsigned int i; + + if (IS_ENABLED(CONFIG_X86_IO_APIC) && nr_ioapics > 0) + return; + + /* + * If the IO/APIC is disabled via config, kernel command line or + * lack of enumeration then all legacy interrupts are routed + * through the PIC. Make sure that they are marked as legacy + * vectors. PIC_CASCADE_IRQ has already been marked in + * lapic_assign_system_vectors(). + */ + for (i = 0; i < nr_legacy_irqs(); i++) { + if (i != PIC_CASCADE_IR) + lapic_assign_legacy_vector(i, true); + } +} + void __init lapic_assign_system_vectors(void) { unsigned int i, vector = 0; From 5e5e63bacbe8f1ef9688e7804275eb88cf0be51a Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Wed, 19 May 2021 16:17:17 +0530 Subject: [PATCH 118/269] powerpc/kprobes: Fix validation of prefixed instructions across page boundary commit 82123a3d1d5a306fdf50c968a474cc60fe43a80f upstream. When checking if the probed instruction is the suffix of a prefixed instruction, we access the instruction at the previous word. If the probed instruction is the very first word of a module, we can end up trying to access an invalid page. Fix this by skipping the check for all instructions at the beginning of a page. Prefixed instructions cannot cross a 64-byte boundary and as such, we don't expect to encounter a suffix as the very first word in a page for kernel text. Even if there are prefixed instructions crossing a page boundary (from a module, for instance), the instruction will be illegal, so preventing probing on the suffix of such prefix instructions isn't worthwhile. Fixes: b4657f7650ba ("powerpc/kprobes: Don't allow breakpoints on suffixes") Cc: stable@vger.kernel.org # v5.8+ Reported-by: Christophe Leroy Signed-off-by: Naveen N. Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/0df9a032a05576a2fa8e97d1b769af2ff0eafbd6.1621416666.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman --- arch/powerpc/kernel/kprobes.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c index 01ab2163659e4..e8c2a6373157d 100644 --- a/arch/powerpc/kernel/kprobes.c +++ b/arch/powerpc/kernel/kprobes.c @@ -108,7 +108,6 @@ int arch_prepare_kprobe(struct kprobe *p) int ret = 0; struct kprobe *prev; struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->addr); - struct ppc_inst prefix = ppc_inst_read((struct ppc_inst *)(p->addr - 1)); if ((unsigned long)p->addr & 0x03) { printk("Attempt to register kprobe at an unaligned address\n"); @@ -116,7 +115,8 @@ int arch_prepare_kprobe(struct kprobe *p) } else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) { printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n"); ret = -EINVAL; - } else if (ppc_inst_prefixed(prefix)) { + } else if ((unsigned long)p->addr & ~PAGE_MASK && + ppc_inst_prefixed(ppc_inst_read((struct ppc_inst *)(p->addr - 1)))) { printk("Cannot register a kprobe on the second word of prefixed instruction\n"); ret = -EINVAL; } From b547a16b24918edd63042f9d81c0d310212d2e94 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Wed, 19 May 2021 09:38:27 -0400 Subject: [PATCH 119/269] btrfs: mark ordered extent and inode with error if we fail to finish commit d61bec08b904cf171835db98168f82bc338e92e4 upstream. While doing error injection testing I saw that sometimes we'd get an abort that wouldn't stop the current transaction commit from completing. This abort was coming from finish ordered IO, but at this point in the transaction commit we should have gotten an error and stopped. It turns out the abort came from finish ordered io while trying to write out the free space cache. It occurred to me that any failure inside of finish_ordered_io isn't actually raised to the person doing the writing, so we could have any number of failures in this path and think the ordered extent completed successfully and the inode was fine. Fix this by marking the ordered extent with BTRFS_ORDERED_IOERR, and marking the mapping of the inode with mapping_set_error, so any callers that simply call fdatawait will also get the error. With this we're seeing the IO error on the free space inode when we fail to do the finish_ordered_io. CC: stable@vger.kernel.org # 4.19+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/inode.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 94c24b2a211bf..c913875f416cc 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2760,6 +2760,18 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) if (ret || truncated) { u64 unwritten_start = start; + /* + * If we failed to finish this ordered extent for any reason we + * need to make sure BTRFS_ORDERED_IOERR is set on the ordered + * extent, and mark the inode with the error if it wasn't + * already set. Any error during writeback would have already + * set the mapping error, so we need to set it if we're the ones + * marking this ordered extent as failed. + */ + if (ret && !test_and_set_bit(BTRFS_ORDERED_IOERR, + &ordered_extent->flags)) + mapping_set_error(ordered_extent->inode->i_mapping, -EIO); + if (truncated) unwritten_start += logical_len; clear_extent_uptodate(io_tree, unwritten_start, end, NULL); From 5a89982fa2bba459b82323655df986945a853bbe Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Wed, 19 May 2021 10:52:45 -0400 Subject: [PATCH 120/269] btrfs: fix error handling in btrfs_del_csums commit b86652be7c83f70bf406bed18ecf55adb9bfb91b upstream. Error injection stress would sometimes fail with checksums on disk that did not have a corresponding extent. This occurred because the pattern in btrfs_del_csums was while (1) { ret = btrfs_search_slot(); if (ret < 0) break; } ret = 0; out: btrfs_free_path(path); return ret; If we got an error from btrfs_search_slot we'd clear the error because we were breaking instead of goto out. Instead of using goto out, simply handle the cases where we may leave a random value in ret, and get rid of the ret = 0; out: pattern and simply allow break to have the proper error reporting. With this fix we properly abort the transaction and do not commit thinking we successfully deleted the csum. Reviewed-by: Qu Wenruo CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/file-item.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index 8f4f2bd6d9b95..48a2ea6d70921 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -690,7 +690,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans, u64 end_byte = bytenr + len; u64 csum_end; struct extent_buffer *leaf; - int ret; + int ret = 0; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); int blocksize_bits = fs_info->sb->s_blocksize_bits; @@ -709,6 +709,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans, path->leave_spinning = 1; ret = btrfs_search_slot(trans, root, &key, path, -1, 1); if (ret > 0) { + ret = 0; if (path->slots[0] == 0) break; path->slots[0]--; @@ -765,7 +766,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans, ret = btrfs_del_items(trans, root, path, path->slots[0], del_nr); if (ret) - goto out; + break; if (key.offset == bytenr) break; } else if (key.offset < bytenr && csum_end > end_byte) { @@ -809,8 +810,9 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans, ret = btrfs_split_item(trans, root, path, &key, offset); if (ret && ret != -EAGAIN) { btrfs_abort_transaction(trans, ret); - goto out; + break; } + ret = 0; key.offset = end_byte - 1; } else { @@ -820,8 +822,6 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans, } btrfs_release_path(path); } - ret = 0; -out: btrfs_free_path(path); return ret; } From 466d83fdbbe345f3cfd5f7b2633f740ecad67853 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Wed, 19 May 2021 10:52:46 -0400 Subject: [PATCH 121/269] btrfs: return errors from btrfs_del_csums in cleanup_ref_head commit 856bd270dc4db209c779ce1e9555c7641ffbc88e upstream. We are unconditionally returning 0 in cleanup_ref_head, despite the fact that btrfs_del_csums could fail. We need to return the error so the transaction gets aborted properly, fix this by returning ret from btrfs_del_csums in cleanup_ref_head. Reviewed-by: Qu Wenruo CC: stable@vger.kernel.org # 4.19+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/extent-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 51c18da4792ec..728d75e8e792a 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1830,7 +1830,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans, trace_run_delayed_ref_head(fs_info, head, 0); btrfs_delayed_ref_unlock(head); btrfs_put_delayed_ref_head(head); - return 0; + return ret; } static struct btrfs_delayed_ref_head *btrfs_obtain_ref_head( From 48568f3944ee7357e8fed394804745bd981e978a Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Wed, 19 May 2021 13:13:15 -0400 Subject: [PATCH 122/269] btrfs: fixup error handling in fixup_inode_link_counts commit 011b28acf940eb61c000059dd9e2cfcbf52ed96b upstream. This function has the following pattern while (1) { ret = whatever(); if (ret) goto out; } ret = 0 out: return ret; However several places in this while loop we simply break; when there's a problem, thus clearing the return value, and in one case we do a return -EIO, and leak the memory for the path. Fix this by re-arranging the loop to deal with ret == 1 coming from btrfs_search_slot, and then simply delete the ret = 0; out: bit so everybody can break if there is an error, which will allow for proper error handling to occur. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/tree-log.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 9a0cfa0e124da..300951088a11c 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -1752,6 +1752,7 @@ static noinline int fixup_inode_link_counts(struct btrfs_trans_handle *trans, break; if (ret == 1) { + ret = 0; if (path->slots[0] == 0) break; path->slots[0]--; @@ -1764,17 +1765,19 @@ static noinline int fixup_inode_link_counts(struct btrfs_trans_handle *trans, ret = btrfs_del_item(trans, root, path); if (ret) - goto out; + break; btrfs_release_path(path); inode = read_one_inode(root, key.offset); - if (!inode) - return -EIO; + if (!inode) { + ret = -EIO; + break; + } ret = fixup_inode_link_count(trans, root, inode); iput(inode); if (ret) - goto out; + break; /* * fixup on a directory may create new entries, @@ -1783,8 +1786,6 @@ static noinline int fixup_inode_link_counts(struct btrfs_trans_handle *trans, */ key.offset = (u64)-1; } - ret = 0; -out: btrfs_release_path(path); return ret; } From 0df50d47d17401f9f140dfbe752a65e5d72f9932 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Wed, 19 May 2021 14:04:21 -0400 Subject: [PATCH 123/269] btrfs: abort in rename_exchange if we fail to insert the second ref commit dc09ef3562726cd520c8338c1640872a60187af5 upstream. Error injection stress uncovered a problem where we'd leave a dangling inode ref if we failed during a rename_exchange. This happens because we insert the inode ref for one side of the rename, and then for the other side. If this second inode ref insert fails we'll leave the first one dangling and leave a corrupt file system behind. Fix this by aborting if we did the insert for the first inode ref. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/inode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c913875f416cc..4f26dae63b64a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8890,6 +8890,7 @@ static int btrfs_rename_exchange(struct inode *old_dir, int ret2; bool root_log_pinned = false; bool dest_log_pinned = false; + bool need_abort = false; /* we only allow rename subvolume link between subvolumes */ if (old_ino != BTRFS_FIRST_FREE_OBJECTID && root != dest) @@ -8946,6 +8947,7 @@ static int btrfs_rename_exchange(struct inode *old_dir, old_idx); if (ret) goto out_fail; + need_abort = true; } /* And now for the dest. */ @@ -8961,8 +8963,11 @@ static int btrfs_rename_exchange(struct inode *old_dir, new_ino, btrfs_ino(BTRFS_I(old_dir)), new_idx); - if (ret) + if (ret) { + if (need_abort) + btrfs_abort_transaction(trans, ret); goto out_fail; + } } /* Update inode version and ctime/mtime. */ From baa6763123e2b63b8289943c7211ba0e3220432f Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Tue, 25 May 2021 11:05:28 +0100 Subject: [PATCH 124/269] btrfs: fix deadlock when cloning inline extents and low on available space commit 76a6d5cd74479e7ec8a7f9a29bce63d5549b6b2e upstream. There are a few cases where cloning an inline extent requires copying data into a page of the destination inode. For these cases we are allocating the required data and metadata space while holding a leaf locked. This can result in a deadlock when we are low on available space because allocating the space may flush delalloc and two deadlock scenarios can happen: 1) When starting writeback for an inode with a very small dirty range that fits in an inline extent, we deadlock during the writeback when trying to insert the inline extent, at cow_file_range_inline(), if the extent is going to be located in the leaf for which we are already holding a read lock; 2) After successfully starting writeback, for non-inline extent cases, the async reclaim thread will hang waiting for an ordered extent to complete if the ordered extent completion needs to modify the leaf for which the clone task is holding a read lock (for adding or replacing file extent items). So the cloning task will wait forever on the async reclaim thread to make progress, which in turn is waiting for the ordered extent completion which in turn is waiting to acquire a write lock on the same leaf. So fix this by making sure we release the path (and therefore the leaf) every time we need to copy the inline extent's data into a page of the destination inode, as by that time we do not need to have the leaf locked. Fixes: 05a5a7621ce66c ("Btrfs: implement full reflink support for inline extents") CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/reflink.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index eeb66e797e0bf..96ef9fed9a656 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -207,10 +207,7 @@ static int clone_copy_inline_extent(struct inode *dst, * inline extent's data to the page. */ ASSERT(key.offset > 0); - ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset, - inline_data, size, datal, - comp_type); - goto out; + goto copy_to_page; } } else if (i_size_read(dst) <= datal) { struct btrfs_file_extent_item *ei; @@ -226,13 +223,10 @@ static int clone_copy_inline_extent(struct inode *dst, BTRFS_FILE_EXTENT_INLINE) goto copy_inline_extent; - ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset, - inline_data, size, datal, comp_type); - goto out; + goto copy_to_page; } copy_inline_extent: - ret = 0; /* * We have no extent items, or we have an extent at offset 0 which may * or may not be inlined. All these cases are dealt the same way. @@ -244,11 +238,13 @@ static int clone_copy_inline_extent(struct inode *dst, * clone. Deal with all these cases by copying the inline extent * data into the respective page at the destination inode. */ - ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset, - inline_data, size, datal, comp_type); - goto out; + goto copy_to_page; } + /* + * Release path before starting a new transaction so we don't hold locks + * that would confuse lockdep. + */ btrfs_release_path(path); /* * If we end up here it means were copy the inline extent into a leaf @@ -281,11 +277,6 @@ static int clone_copy_inline_extent(struct inode *dst, ret = btrfs_inode_set_file_extent_range(BTRFS_I(dst), 0, aligned_end); out: if (!ret && !trans) { - /* - * Release path before starting a new transaction so we don't - * hold locks that would confuse lockdep. - */ - btrfs_release_path(path); /* * No transaction here means we copied the inline extent into a * page of the destination inode. @@ -306,6 +297,21 @@ static int clone_copy_inline_extent(struct inode *dst, *trans_out = trans; return ret; + +copy_to_page: + /* + * Release our path because we don't need it anymore and also because + * copy_inline_to_page() needs to reserve data and metadata, which may + * need to flush delalloc when we are low on available space and + * therefore cause a deadlock if writeback of an inline extent needs to + * write to the same leaf or an ordered extent completion needs to write + * to the same leaf. + */ + btrfs_release_path(path); + + ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset, + inline_data, size, datal, comp_type); + goto out; } /** From 2eb4ec9c2c3535b9755c484183cc5c4d90fd37ff Mon Sep 17 00:00:00 2001 From: Mina Almasry Date: Fri, 4 Jun 2021 20:01:36 -0700 Subject: [PATCH 125/269] mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY [ Upstream commit d84cf06e3dd8c5c5b547b5d8931015fc536678e5 ] The userfaultfd hugetlb tests cause a resv_huge_pages underflow. This happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on an index for which we already have a page in the cache. When this happens, we allocate a second page, double consuming the reservation, and then fail to insert the page into the cache and return -EEXIST. To fix this, we first check if there is a page in the cache which already consumed the reservation, and return -EEXIST immediately if so. There is still a rare condition where we fail to copy the page contents AND race with a call for hugetlb_no_page() for this index and again we will underflow resv_huge_pages. That is fixed in a more complicated patch not targeted for -stable. Test: Hacked the code locally such that resv_huge_pages underflows produce a warning, then: ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success ./tools/testing/selftests/vm/userfaultfd hugetlb 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success Both tests succeed and produce no warnings. After the test runs number of free/resv hugepages is correct. [mike.kravetz@oracle.com: changelog fixes] Link: https://lkml.kernel.org/r/20210528004649.85298-1-almasrymina@google.com Fixes: 8fb5debc5fcd ("userfaultfd: hugetlbfs: add hugetlb_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Mina Almasry Reviewed-by: Mike Kravetz Cc: Axel Rasmussen Cc: Peter Xu Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/hugetlb.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 900851a4f9146..bc1006a327338 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4708,10 +4708,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, struct page *page; if (!*pagep) { - ret = -ENOMEM; + /* If a page already exists, then it's UFFDIO_COPY for + * a non-missing case. Return -EEXIST. + */ + if (vm_shared && + hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) { + ret = -EEXIST; + goto out; + } + page = alloc_huge_page(dst_vma, dst_addr, 0); - if (IS_ERR(page)) + if (IS_ERR(page)) { + ret = -ENOMEM; goto out; + } ret = copy_huge_page_from_user(page, (const void __user *) src_addr, From 05e41f6f1c4e8c42edb9715b6629d9ab2af61064 Mon Sep 17 00:00:00 2001 From: Dmitry Baryshkov Date: Thu, 1 Apr 2021 05:05:33 +0300 Subject: [PATCH 126/269] drm/msm/dpu: always use mdp device to scale bandwidth commit a670ff578f1fb855fedc7931fa5bbc06b567af22 upstream. Currently DPU driver scales bandwidth and core clock for sc7180 only, while the rest of chips get static bandwidth votes. Make all chipsets scale bandwidth and clock per composition requirements like sc7180 does. Drop old voting path completely. Tested on RB3 (SDM845) and RB5 (SM8250). Signed-off-by: Dmitry Baryshkov Link: https://lore.kernel.org/r/20210401020533.3956787-2-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark Signed-off-by: Amit Pundir Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 3 +- drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 51 +----------------------- 2 files changed, 2 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index e69ea810e18d9..c8217f4858a15 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -931,8 +931,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms) DPU_DEBUG("REG_DMA is not defined"); } - if (of_device_is_compatible(dev->dev->of_node, "qcom,sc7180-mdss")) - dpu_kms_parse_data_bus_icc_path(dpu_kms); + dpu_kms_parse_data_bus_icc_path(dpu_kms); pm_runtime_get_sync(&dpu_kms->pdev->dev); diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c index cd4078807db1b..3416e9617ee9a 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c @@ -31,40 +31,8 @@ struct dpu_mdss { void __iomem *mmio; struct dss_module_power mp; struct dpu_irq_controller irq_controller; - struct icc_path *path[2]; - u32 num_paths; }; -static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev, - struct dpu_mdss *dpu_mdss) -{ - struct icc_path *path0 = of_icc_get(dev->dev, "mdp0-mem"); - struct icc_path *path1 = of_icc_get(dev->dev, "mdp1-mem"); - - if (IS_ERR_OR_NULL(path0)) - return PTR_ERR_OR_ZERO(path0); - - dpu_mdss->path[0] = path0; - dpu_mdss->num_paths = 1; - - if (!IS_ERR_OR_NULL(path1)) { - dpu_mdss->path[1] = path1; - dpu_mdss->num_paths++; - } - - return 0; -} - -static void dpu_mdss_icc_request_bw(struct msm_mdss *mdss) -{ - struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss); - int i; - u64 avg_bw = dpu_mdss->num_paths ? MAX_BW / dpu_mdss->num_paths : 0; - - for (i = 0; i < dpu_mdss->num_paths; i++) - icc_set_bw(dpu_mdss->path[i], avg_bw, kBps_to_icc(MAX_BW)); -} - static void dpu_mdss_irq(struct irq_desc *desc) { struct dpu_mdss *dpu_mdss = irq_desc_get_handler_data(desc); @@ -178,8 +146,6 @@ static int dpu_mdss_enable(struct msm_mdss *mdss) struct dss_module_power *mp = &dpu_mdss->mp; int ret; - dpu_mdss_icc_request_bw(mdss); - ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true); if (ret) { DPU_ERROR("clock enable failed, ret:%d\n", ret); @@ -213,15 +179,12 @@ static int dpu_mdss_disable(struct msm_mdss *mdss) { struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss); struct dss_module_power *mp = &dpu_mdss->mp; - int ret, i; + int ret; ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, false); if (ret) DPU_ERROR("clock disable failed, ret:%d\n", ret); - for (i = 0; i < dpu_mdss->num_paths; i++) - icc_set_bw(dpu_mdss->path[i], 0, 0); - return ret; } @@ -232,7 +195,6 @@ static void dpu_mdss_destroy(struct drm_device *dev) struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss); struct dss_module_power *mp = &dpu_mdss->mp; int irq; - int i; pm_runtime_suspend(dev->dev); pm_runtime_disable(dev->dev); @@ -242,9 +204,6 @@ static void dpu_mdss_destroy(struct drm_device *dev) msm_dss_put_clk(mp->clk_config, mp->num_clk); devm_kfree(&pdev->dev, mp->clk_config); - for (i = 0; i < dpu_mdss->num_paths; i++) - icc_put(dpu_mdss->path[i]); - if (dpu_mdss->mmio) devm_iounmap(&pdev->dev, dpu_mdss->mmio); dpu_mdss->mmio = NULL; @@ -276,12 +235,6 @@ int dpu_mdss_init(struct drm_device *dev) DRM_DEBUG("mapped mdss address space @%pK\n", dpu_mdss->mmio); - if (!of_device_is_compatible(dev->dev->of_node, "qcom,sc7180-mdss")) { - ret = dpu_mdss_parse_data_bus_icc_path(dev, dpu_mdss); - if (ret) - return ret; - } - mp = &dpu_mdss->mp; ret = msm_dss_parse_clock(pdev, mp); if (ret) { @@ -307,8 +260,6 @@ int dpu_mdss_init(struct drm_device *dev) pm_runtime_enable(dev->dev); - dpu_mdss_icc_request_bw(priv->mdss); - return ret; irq_error: From fe910d20e2d8e0736bbea9c1efe6a49535e807ea Mon Sep 17 00:00:00 2001 From: Anand Jain Date: Fri, 30 Apr 2021 19:59:51 +0800 Subject: [PATCH 127/269] btrfs: fix unmountable seed device after fstrim commit 5e753a817b2d5991dfe8a801b7b1e8e79a1c5a20 upstream. The following test case reproduces an issue of wrongly freeing in-use blocks on the readonly seed device when fstrim is called on the rw sprout device. As shown below. Create a seed device and add a sprout device to it: $ mkfs.btrfs -fq -dsingle -msingle /dev/loop0 $ btrfstune -S 1 /dev/loop0 $ mount /dev/loop0 /btrfs $ btrfs dev add -f /dev/loop1 /btrfs BTRFS info (device loop0): relocating block group 290455552 flags system BTRFS info (device loop0): relocating block group 1048576 flags system BTRFS info (device loop0): disk added /dev/loop1 $ umount /btrfs Mount the sprout device and run fstrim: $ mount /dev/loop1 /btrfs $ fstrim /btrfs $ umount /btrfs Now try to mount the seed device, and it fails: $ mount /dev/loop0 /btrfs mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. Block 5292032 is missing on the readonly seed device: $ dmesg -kt | tail BTRFS error (device loop0): bad tree block start, want 5292032 have 0 BTRFS warning (device loop0): couldn't read-tree root BTRFS error (device loop0): open_ctree failed >From the dump-tree of the seed device (taken before the fstrim). Block 5292032 belonged to the block group starting at 5242880: $ btrfs inspect dump-tree -e /dev/loop0 | grep -A1 BLOCK_GROUP item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24 block group used 114688 chunk_objectid 256 flags METADATA >From the dump-tree of the sprout device (taken before the fstrim). fstrim used block-group 5242880 to find the related free space to free: $ btrfs inspect dump-tree -e /dev/loop1 | grep -A1 BLOCK_GROUP item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24 block group used 32768 chunk_objectid 256 flags METADATA BPF kernel tracing the fstrim command finds the missing block 5292032 within the range of the discarded blocks as below: kprobe:btrfs_discard_extent { printf("freeing start %llu end %llu num_bytes %llu:\n", arg1, arg1+arg2, arg2); } freeing start 5259264 end 5406720 num_bytes 147456 Fix this by avoiding the discard command to the readonly seed device. Reported-by: Chris Murphy CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana Signed-off-by: Anand Jain Signed-off-by: David Sterba Signed-off-by: Sudip Mukherjee Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/extent-tree.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 728d75e8e792a..73ebe0c5fdbc9 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1297,16 +1297,20 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, for (i = 0; i < bbio->num_stripes; i++, stripe++) { u64 bytes; struct request_queue *req_q; + struct btrfs_device *device = stripe->dev; - if (!stripe->dev->bdev) { + if (!device->bdev) { ASSERT(btrfs_test_opt(fs_info, DEGRADED)); continue; } - req_q = bdev_get_queue(stripe->dev->bdev); + req_q = bdev_get_queue(device->bdev); if (!blk_queue_discard(req_q)) continue; - ret = btrfs_issue_discard(stripe->dev->bdev, + if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) + continue; + + ret = btrfs_issue_discard(device->bdev, stripe->physical, stripe->length, &bytes); From b3ee3f50ab1bf7b60ba4a8346dca05ba3412fead Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Wed, 21 Apr 2021 19:21:22 -0700 Subject: [PATCH 128/269] KVM: SVM: Truncate GPR value for DR and CR accesses in !64-bit mode commit 0884335a2e653b8a045083aa1d57ce74269ac81d upstream. Drop bits 63:32 on loads/stores to/from DRs and CRs when the vCPU is not in 64-bit mode. The APM states bits 63:32 are dropped for both DRs and CRs: In 64-bit mode, the operand size is fixed at 64 bits without the need for a REX prefix. In non-64-bit mode, the operand size is fixed at 32 bits and the upper 32 bits of the destination are forced to 0. Fixes: 7ff76d58a9dc ("KVM: SVM: enhance MOV CR intercept handler") Fixes: cae3797a4639 ("KVM: SVM: enhance mov DR intercept handler") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Message-Id: <20210422022128.3464144-4-seanjc@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sudip Mukherjee Signed-off-by: Greg Kroah-Hartman --- arch/x86/kvm/svm/svm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 9d4eb114613c2..41d44fb5f753d 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2362,7 +2362,7 @@ static int cr_interception(struct vcpu_svm *svm) err = 0; if (cr >= 16) { /* mov to cr */ cr -= 16; - val = kvm_register_read(&svm->vcpu, reg); + val = kvm_register_readl(&svm->vcpu, reg); trace_kvm_cr_write(cr, val); switch (cr) { case 0: @@ -2408,7 +2408,7 @@ static int cr_interception(struct vcpu_svm *svm) kvm_queue_exception(&svm->vcpu, UD_VECTOR); return 1; } - kvm_register_write(&svm->vcpu, reg, val); + kvm_register_writel(&svm->vcpu, reg, val); trace_kvm_cr_read(cr, val); } return kvm_complete_insn_gp(&svm->vcpu, err); @@ -2439,13 +2439,13 @@ static int dr_interception(struct vcpu_svm *svm) if (dr >= 16) { /* mov to DRn */ if (!kvm_require_dr(&svm->vcpu, dr - 16)) return 1; - val = kvm_register_read(&svm->vcpu, reg); + val = kvm_register_readl(&svm->vcpu, reg); kvm_set_dr(&svm->vcpu, dr - 16, val); } else { if (!kvm_require_dr(&svm->vcpu, dr)) return 1; kvm_get_dr(&svm->vcpu, dr, &val); - kvm_register_write(&svm->vcpu, reg, val); + kvm_register_writel(&svm->vcpu, reg, val); } return kvm_skip_emulated_instruction(&svm->vcpu); From b327c97747595b462a003a11e6728ebd860cd285 Mon Sep 17 00:00:00 2001 From: Marc Zyngier Date: Fri, 14 May 2021 09:05:41 +0100 Subject: [PATCH 129/269] KVM: arm64: Fix debug register indexing commit cb853ded1d25e5b026ce115dbcde69e3d7e2e831 upstream. Commit 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") flipped the register number to 0 for all the debug registers in the sysreg table, hereby indicating that these registers live in a separate shadow structure. However, the author of this patch failed to realise that all the accessors are using that particular index instead of the register encoding, resulting in all the registers hitting index 0. Not quite a valid implementation of the architecture... Address the issue by fixing all the accessors to use the CRm field of the encoding, which contains the debug register index. Fixes: 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") Reported-by: Ricardo Koller Signed-off-by: Marc Zyngier Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman --- arch/arm64/kvm/sys_regs.c | 42 +++++++++++++++++++-------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index b246a4acba416..568f11e23830c 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -464,14 +464,14 @@ static bool trap_bvr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *rd) { - u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg]; + u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm]; if (p->is_write) reg_to_dbg(vcpu, p, dbg_reg); else dbg_to_reg(vcpu, p, dbg_reg); - trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg); + trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg); return true; } @@ -479,7 +479,7 @@ static bool trap_bvr(struct kvm_vcpu *vcpu, static int set_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm]; if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -489,7 +489,7 @@ static int set_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static int get_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm]; if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -499,21 +499,21 @@ static int get_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static void reset_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { - vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg] = rd->val; + vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm] = rd->val; } static bool trap_bcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *rd) { - u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg]; + u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm]; if (p->is_write) reg_to_dbg(vcpu, p, dbg_reg); else dbg_to_reg(vcpu, p, dbg_reg); - trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg); + trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg); return true; } @@ -521,7 +521,7 @@ static bool trap_bcr(struct kvm_vcpu *vcpu, static int set_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm]; if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -532,7 +532,7 @@ static int set_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static int get_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm]; if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -542,22 +542,22 @@ static int get_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static void reset_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { - vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg] = rd->val; + vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm] = rd->val; } static bool trap_wvr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *rd) { - u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg]; + u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm]; if (p->is_write) reg_to_dbg(vcpu, p, dbg_reg); else dbg_to_reg(vcpu, p, dbg_reg); - trace_trap_reg(__func__, rd->reg, p->is_write, - vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg]); + trace_trap_reg(__func__, rd->CRm, p->is_write, + vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm]); return true; } @@ -565,7 +565,7 @@ static bool trap_wvr(struct kvm_vcpu *vcpu, static int set_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm]; if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -575,7 +575,7 @@ static int set_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static int get_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm]; if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -585,21 +585,21 @@ static int get_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static void reset_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { - vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg] = rd->val; + vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm] = rd->val; } static bool trap_wcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *rd) { - u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg]; + u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm]; if (p->is_write) reg_to_dbg(vcpu, p, dbg_reg); else dbg_to_reg(vcpu, p, dbg_reg); - trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg); + trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg); return true; } @@ -607,7 +607,7 @@ static bool trap_wcr(struct kvm_vcpu *vcpu, static int set_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm]; if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -617,7 +617,7 @@ static int set_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static int get_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, const struct kvm_one_reg *reg, void __user *uaddr) { - __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg]; + __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm]; if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0) return -EFAULT; @@ -627,7 +627,7 @@ static int get_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, static void reset_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { - vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg] = rd->val; + vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm] = rd->val; } static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) From 38b858da1c58ad46519a257764e059e663b59ff2 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov Date: Mon, 31 May 2021 16:05:24 +0200 Subject: [PATCH 130/269] x86/kvm: Teardown PV features on boot CPU as well commit 8b79feffeca28c5459458fe78676b081e87c93a4 upstream. Various PV features (Async PF, PV EOI, steal time) work through memory shared with hypervisor and when we restore from hibernation we must properly teardown all these features to make sure hypervisor doesn't write to stale locations after we jump to the previously hibernated kernel (which can try to place anything there). For secondary CPUs the job is already done by kvm_cpu_down_prepare(), register syscore ops to do the same for boot CPU. Krzysztof: This fixes memory corruption visible after second resume from hibernation: BUG: Bad page state in process dbus-daemon pfn:18b01 page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591 flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim) raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim) Signed-off-by: Vitaly Kuznetsov Message-Id: <20210414123544.1060604-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini Signed-off-by: Andrea Righi [krzysztof: Extend the commit message, adjust for v5.10 context] Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/kvm.c | 57 +++++++++++++++++++++++++++++++------------ 1 file changed, 41 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 7f57ede3cb8e7..6af3f9c3956c9 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include #include @@ -460,6 +461,25 @@ static bool pv_tlb_flush_supported(void) static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask); +static void kvm_guest_cpu_offline(void) +{ + kvm_disable_steal_time(); + if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) + wrmsrl(MSR_KVM_PV_EOI_EN, 0); + kvm_pv_disable_apf(); + apf_task_wake_all(); +} + +static int kvm_cpu_online(unsigned int cpu) +{ + unsigned long flags; + + local_irq_save(flags); + kvm_guest_cpu_init(); + local_irq_restore(flags); + return 0; +} + #ifdef CONFIG_SMP static bool pv_ipi_supported(void) @@ -587,31 +607,34 @@ static void __init kvm_smp_prepare_boot_cpu(void) kvm_spinlock_init(); } -static void kvm_guest_cpu_offline(void) +static int kvm_cpu_down_prepare(unsigned int cpu) { - kvm_disable_steal_time(); - if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) - wrmsrl(MSR_KVM_PV_EOI_EN, 0); - kvm_pv_disable_apf(); - apf_task_wake_all(); -} + unsigned long flags; -static int kvm_cpu_online(unsigned int cpu) -{ - local_irq_disable(); - kvm_guest_cpu_init(); - local_irq_enable(); + local_irq_save(flags); + kvm_guest_cpu_offline(); + local_irq_restore(flags); return 0; } -static int kvm_cpu_down_prepare(unsigned int cpu) +#endif + +static int kvm_suspend(void) { - local_irq_disable(); kvm_guest_cpu_offline(); - local_irq_enable(); + return 0; } -#endif + +static void kvm_resume(void) +{ + kvm_cpu_online(raw_smp_processor_id()); +} + +static struct syscore_ops kvm_syscore_ops = { + .suspend = kvm_suspend, + .resume = kvm_resume, +}; static void kvm_flush_tlb_others(const struct cpumask *cpumask, const struct flush_tlb_info *info) @@ -681,6 +704,8 @@ static void __init kvm_guest_init(void) kvm_guest_cpu_init(); #endif + register_syscore_ops(&kvm_syscore_ops); + /* * Hard lockup detection is enabled by default. Disable it, as guests * can get false positives too easily, for example if the host is From 3b0becf8b1ecf642a9edaf4c9628ffc641e490d6 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov Date: Mon, 31 May 2021 16:05:25 +0200 Subject: [PATCH 131/269] x86/kvm: Disable kvmclock on all CPUs on shutdown commit c02027b5742b5aa804ef08a4a9db433295533046 upstream. Currenly, we disable kvmclock from machine_shutdown() hook and this only happens for boot CPU. We need to disable it for all CPUs to guard against memory corruption e.g. on restore from hibernate. Note, writing '0' to kvmclock MSR doesn't clear memory location, it just prevents hypervisor from updating the location so for the short while after write and while CPU is still alive, the clock remains usable and correct so we don't need to switch to some other clocksource. Signed-off-by: Vitaly Kuznetsov Message-Id: <20210414123544.1060604-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini Signed-off-by: Andrea Righi Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/kvm_para.h | 4 ++-- arch/x86/kernel/kvm.c | 1 + arch/x86/kernel/kvmclock.c | 5 +---- 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 3381198525126..9c56e0defd453 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -7,8 +7,6 @@ #include #include -extern void kvmclock_init(void); - #ifdef CONFIG_KVM_GUEST bool kvm_check_and_clear_guest_paused(void); #else @@ -86,6 +84,8 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1, } #ifdef CONFIG_KVM_GUEST +void kvmclock_init(void); +void kvmclock_disable(void); bool kvm_para_available(void); unsigned int kvm_arch_para_features(void); unsigned int kvm_arch_para_hints(void); diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 6af3f9c3956c9..be1c42e663c65 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -468,6 +468,7 @@ static void kvm_guest_cpu_offline(void) wrmsrl(MSR_KVM_PV_EOI_EN, 0); kvm_pv_disable_apf(); apf_task_wake_all(); + kvmclock_disable(); } static int kvm_cpu_online(unsigned int cpu) diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index 5ee705b44560b..327a0de010668 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -221,11 +221,9 @@ static void kvm_crash_shutdown(struct pt_regs *regs) } #endif -static void kvm_shutdown(void) +void kvmclock_disable(void) { native_write_msr(msr_kvm_system_time, 0, 0); - kvm_disable_steal_time(); - native_machine_shutdown(); } static void __init kvmclock_init_mem(void) @@ -352,7 +350,6 @@ void __init kvmclock_init(void) #endif x86_platform.save_sched_clock_state = kvm_save_sched_clock_state; x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state; - machine_ops.shutdown = kvm_shutdown; #ifdef CONFIG_KEXEC_CORE machine_ops.crash_shutdown = kvm_crash_shutdown; #endif From 334c59d58de5faf449d9c9feaa8c50dd8b4046a7 Mon Sep 17 00:00:00 2001 From: Vitaly Kuznetsov Date: Mon, 31 May 2021 16:05:26 +0200 Subject: [PATCH 132/269] x86/kvm: Disable all PV features on crash commit 3d6b84132d2a57b5a74100f6923a8feb679ac2ce upstream. Crash shutdown handler only disables kvmclock and steal time, other PV features remain active so we risk corrupting memory or getting some side-effects in kdump kernel. Move crash handler to kvm.c and unify with CPU offline. Signed-off-by: Vitaly Kuznetsov Message-Id: <20210414123544.1060604-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman --- arch/x86/include/asm/kvm_para.h | 6 ----- arch/x86/kernel/kvm.c | 44 ++++++++++++++++++++++++--------- arch/x86/kernel/kvmclock.c | 21 ---------------- 3 files changed, 32 insertions(+), 39 deletions(-) diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 9c56e0defd453..69299878b200a 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -92,7 +92,6 @@ unsigned int kvm_arch_para_hints(void); void kvm_async_pf_task_wait_schedule(u32 token); void kvm_async_pf_task_wake(u32 token); u32 kvm_read_and_reset_apf_flags(void); -void kvm_disable_steal_time(void); bool __kvm_handle_async_pf(struct pt_regs *regs, u32 token); DECLARE_STATIC_KEY_FALSE(kvm_async_pf_enabled); @@ -137,11 +136,6 @@ static inline u32 kvm_read_and_reset_apf_flags(void) return 0; } -static inline void kvm_disable_steal_time(void) -{ - return; -} - static __always_inline bool kvm_handle_async_pf(struct pt_regs *regs, u32 token) { return false; diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index be1c42e663c65..7462b79c39de6 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -38,6 +38,7 @@ #include #include #include +#include #include DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled); @@ -375,6 +376,14 @@ static void kvm_pv_disable_apf(void) pr_info("Unregister pv shared memory for cpu %d\n", smp_processor_id()); } +static void kvm_disable_steal_time(void) +{ + if (!has_steal_clock) + return; + + wrmsr(MSR_KVM_STEAL_TIME, 0, 0); +} + static void kvm_pv_guest_cpu_reboot(void *unused) { /* @@ -417,14 +426,6 @@ static u64 kvm_steal_clock(int cpu) return steal; } -void kvm_disable_steal_time(void) -{ - if (!has_steal_clock) - return; - - wrmsr(MSR_KVM_STEAL_TIME, 0, 0); -} - static inline void __set_percpu_decrypted(void *ptr, unsigned long size) { early_set_memory_decrypted((unsigned long) ptr, size); @@ -461,13 +462,14 @@ static bool pv_tlb_flush_supported(void) static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask); -static void kvm_guest_cpu_offline(void) +static void kvm_guest_cpu_offline(bool shutdown) { kvm_disable_steal_time(); if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) wrmsrl(MSR_KVM_PV_EOI_EN, 0); kvm_pv_disable_apf(); - apf_task_wake_all(); + if (!shutdown) + apf_task_wake_all(); kvmclock_disable(); } @@ -613,7 +615,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu) unsigned long flags; local_irq_save(flags); - kvm_guest_cpu_offline(); + kvm_guest_cpu_offline(false); local_irq_restore(flags); return 0; } @@ -622,7 +624,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu) static int kvm_suspend(void) { - kvm_guest_cpu_offline(); + kvm_guest_cpu_offline(false); return 0; } @@ -637,6 +639,20 @@ static struct syscore_ops kvm_syscore_ops = { .resume = kvm_resume, }; +/* + * After a PV feature is registered, the host will keep writing to the + * registered memory location. If the guest happens to shutdown, this memory + * won't be valid. In cases like kexec, in which you install a new kernel, this + * means a random memory location will be kept being written. + */ +#ifdef CONFIG_KEXEC_CORE +static void kvm_crash_shutdown(struct pt_regs *regs) +{ + kvm_guest_cpu_offline(true); + native_machine_crash_shutdown(regs); +} +#endif + static void kvm_flush_tlb_others(const struct cpumask *cpumask, const struct flush_tlb_info *info) { @@ -705,6 +721,10 @@ static void __init kvm_guest_init(void) kvm_guest_cpu_init(); #endif +#ifdef CONFIG_KEXEC_CORE + machine_ops.crash_shutdown = kvm_crash_shutdown; +#endif + register_syscore_ops(&kvm_syscore_ops); /* diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index 327a0de010668..c4ac26333bc41 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -20,7 +20,6 @@ #include #include #include -#include #include static int kvmclock __initdata = 1; @@ -204,23 +203,6 @@ static void kvm_setup_secondary_clock(void) } #endif -/* - * After the clock is registered, the host will keep writing to the - * registered memory location. If the guest happens to shutdown, this memory - * won't be valid. In cases like kexec, in which you install a new kernel, this - * means a random memory location will be kept being written. So before any - * kind of shutdown from our side, we unregister the clock by writing anything - * that does not have the 'enable' bit set in the msr - */ -#ifdef CONFIG_KEXEC_CORE -static void kvm_crash_shutdown(struct pt_regs *regs) -{ - native_write_msr(msr_kvm_system_time, 0, 0); - kvm_disable_steal_time(); - native_machine_crash_shutdown(regs); -} -#endif - void kvmclock_disable(void) { native_write_msr(msr_kvm_system_time, 0, 0); @@ -350,9 +332,6 @@ void __init kvmclock_init(void) #endif x86_platform.save_sched_clock_state = kvm_save_sched_clock_state; x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state; -#ifdef CONFIG_KEXEC_CORE - machine_ops.crash_shutdown = kvm_crash_shutdown; -#endif kvm_get_preset_lpj(); /* From f20eef4d068637dc48ed24887ebc7b1faa860ae5 Mon Sep 17 00:00:00 2001 From: Gao Xiang Date: Tue, 15 Dec 2020 20:44:03 -0800 Subject: [PATCH 133/269] lib/lz4: explicitly support in-place decompression commit 89b158635ad79574bde8e94d45dad33f8cf09549 upstream. LZ4 final literal copy could be overlapped when doing in-place decompression, so it's unsafe to just use memcpy() on an optimized memcpy approach but memmove() instead. Upstream LZ4 has updated this years ago [1] (and the impact is non-sensible [2] plus only a few bytes remain), this commit just synchronizes LZ4 upstream code to the kernel side as well. It can be observed as EROFS in-place decompression failure on specific files when X86_FEATURE_ERMS is unsupported, memcpy() optimization of commit 59daa706fbec ("x86, mem: Optimize memcpy by avoiding memory false dependece") will be enabled then. Currently most modern x86-CPUs support ERMS, these CPUs just use "rep movsb" approach so no problem at all. However, it can still be verified with forcely disabling ERMS feature... arch/x86/lib/memcpy_64.S: ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \ - "jmp memcpy_erms", X86_FEATURE_ERMS + "jmp memcpy_orig", X86_FEATURE_ERMS We didn't observe any strange on arm64/arm/x86 platform before since most memcpy() would behave in an increasing address order ("copy upwards" [3]) and it's the correct order of in-place decompression but it really needs an update to memmove() for sure considering it's an undefined behavior according to the standard and some unique optimization already exists in the kernel. [1] https://github.com/lz4/lz4/commit/33cb8518ac385835cc17be9a770b27b40cd0e15b [2] https://github.com/lz4/lz4/pull/717#issuecomment-497818921 [3] https://sourceware.org/bugzilla/show_bug.cgi?id=12518 Link: https://lkml.kernel.org/r/20201122030749.2698994-1-hsiangkao@redhat.com Signed-off-by: Gao Xiang Reviewed-by: Nick Terrell Cc: Yann Collet Cc: Miao Xie Cc: Chao Yu Cc: Li Guifu Cc: Guo Xuenan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Gao Xiang Signed-off-by: Greg Kroah-Hartman --- lib/lz4/lz4_decompress.c | 6 +++++- lib/lz4/lz4defs.h | 1 + 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c index 00cb0d0b73e16..8a7724a6ce2fb 100644 --- a/lib/lz4/lz4_decompress.c +++ b/lib/lz4/lz4_decompress.c @@ -263,7 +263,11 @@ static FORCE_INLINE int LZ4_decompress_generic( } } - LZ4_memcpy(op, ip, length); + /* + * supports overlapping memory regions; only matters + * for in-place decompression scenarios + */ + LZ4_memmove(op, ip, length); ip += length; op += length; diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h index c91dd96ef6291..673bd206aa98b 100644 --- a/lib/lz4/lz4defs.h +++ b/lib/lz4/lz4defs.h @@ -146,6 +146,7 @@ static FORCE_INLINE void LZ4_writeLE16(void *memPtr, U16 value) * environments. This is needed when decompressing the Linux Kernel, for example. */ #define LZ4_memcpy(dst, src, size) __builtin_memcpy(dst, src, size) +#define LZ4_memmove(dst, src, size) __builtin_memmove(dst, src, size) static FORCE_INLINE void LZ4_copy8(void *dst, const void *src) { From eddf2d9f76b01201dd778f2d36d75b8050217cf7 Mon Sep 17 00:00:00 2001 From: Roja Rani Yarubandi Date: Tue, 25 May 2021 18:40:51 +0530 Subject: [PATCH 134/269] i2c: qcom-geni: Suspend and resume the bus during SYSTEM_SLEEP_PM ops commit 57648e860485de39c800a89f849fdd03c2d31d15 upstream. Mark bus as suspended during system suspend to block the future transfers. Implement geni_i2c_resume_noirq() to resume the bus. Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Roja Rani Yarubandi Reviewed-by: Stephen Boyd Signed-off-by: Wolfram Sang Signed-off-by: Greg Kroah-Hartman --- drivers/i2c/busses/i2c-qcom-geni.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c index 899ad2c7d67d3..86f028febce35 100644 --- a/drivers/i2c/busses/i2c-qcom-geni.c +++ b/drivers/i2c/busses/i2c-qcom-geni.c @@ -702,6 +702,8 @@ static int __maybe_unused geni_i2c_suspend_noirq(struct device *dev) { struct geni_i2c_dev *gi2c = dev_get_drvdata(dev); + i2c_mark_adapter_suspended(&gi2c->adap); + if (!gi2c->suspended) { geni_i2c_runtime_suspend(dev); pm_runtime_disable(dev); @@ -711,8 +713,16 @@ static int __maybe_unused geni_i2c_suspend_noirq(struct device *dev) return 0; } +static int __maybe_unused geni_i2c_resume_noirq(struct device *dev) +{ + struct geni_i2c_dev *gi2c = dev_get_drvdata(dev); + + i2c_mark_adapter_resumed(&gi2c->adap); + return 0; +} + static const struct dev_pm_ops geni_i2c_pm_ops = { - SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(geni_i2c_suspend_noirq, NULL) + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(geni_i2c_suspend_noirq, geni_i2c_resume_noirq) SET_RUNTIME_PM_OPS(geni_i2c_runtime_suspend, geni_i2c_runtime_resume, NULL) }; From 316de9a88c83c672c18d35bd76034d84e3769fe9 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Wed, 19 May 2021 13:32:20 +0200 Subject: [PATCH 135/269] netfilter: nf_tables: missing error reporting for not selected expressions commit c781471d67a56d7d4c113669a11ede0463b5c719 upstream. Sometimes users forget to turn on nftables extensions from Kconfig that they need. In such case, the error reporting from userspace is misleading: $ sudo nft add rule x y counter Error: Could not process rule: No such file or directory add rule x y counter ^^^^^^^^^^^^^^^^^^^^ Add missing NL_SET_BAD_ATTR() to provide a hint: $ nft add rule x y counter Error: Could not process rule: No such file or directory add rule x y counter ^^^^^^^ Fixes: 83d9dcba06c5 ("netfilter: nf_tables: extended netlink error reporting for expressions") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman --- net/netfilter/nf_tables_api.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 7bf7bfa0c7d9c..e34d05cc57549 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -3263,8 +3263,10 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk, if (n == NFT_RULE_MAXEXPRS) goto err1; err = nf_tables_expr_parse(&ctx, tmp, &info[n]); - if (err < 0) + if (err < 0) { + NL_SET_BAD_ATTR(extack, tmp); goto err1; + } size += info[n].ops->size; n++; } From 6b53db8c4c14b4e7256f058d202908b54a7b85b4 Mon Sep 17 00:00:00 2001 From: Roger Pau Monne Date: Mon, 7 Jun 2021 15:13:15 +0200 Subject: [PATCH 136/269] xen-netback: take a reference to the RX task thread MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 107866a8eb0b664675a260f1ba0655010fac1e08 upstream. Do this in order to prevent the task from being freed if the thread returns (which can be triggered by the frontend) before the call to kthread_stop done as part of the backend tear down. Not taking the reference will lead to a use-after-free in that scenario. Such reference was taken before but dropped as part of the rework done in 2ac061ce97f4. Reintroduce the reference taking and add a comment this time explaining why it's needed. This is XSA-374 / CVE-2021-28691. Fixes: 2ac061ce97f4 ('xen/netback: cleanup init and deinit code') Signed-off-by: Roger Pau Monné Cc: stable@vger.kernel.org Reviewed-by: Jan Beulich Reviewed-by: Juergen Gross Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman --- drivers/net/xen-netback/interface.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index e02a4fbb74de5..7ce9807fc24c5 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -685,6 +685,7 @@ static void xenvif_disconnect_queue(struct xenvif_queue *queue) { if (queue->task) { kthread_stop(queue->task); + put_task_struct(queue->task); queue->task = NULL; } @@ -745,6 +746,11 @@ int xenvif_connect_data(struct xenvif_queue *queue, if (IS_ERR(task)) goto kthread_err; queue->task = task; + /* + * Take a reference to the task in order to prevent it from being freed + * if the thread function returns before kthread_stop is called. + */ + get_task_struct(task); task = kthread_run(xenvif_dealloc_kthread, queue, "%s-dealloc", queue->name); From d17d47da59f726dc4c87caebda3a50333d7e2fd3 Mon Sep 17 00:00:00 2001 From: David Ahern Date: Mon, 7 Jun 2021 11:35:30 -0600 Subject: [PATCH 137/269] neighbour: allow NUD_NOARP entries to be forced GCed commit 7a6b1ab7475fd6478eeaf5c9d1163e7a18125c8f upstream. IFF_POINTOPOINT interfaces use NUD_NOARP entries for IPv6. It's possible to fill up the neighbour table with enough entries that it will overflow for valid connections after that. This behaviour is more prevalent after commit 58956317c8de ("neighbor: Improve garbage collection") is applied, as it prevents removal from entries that are not NUD_FAILED, unless they are more than 5s old. Fixes: 58956317c8de (neighbor: Improve garbage collection) Reported-by: Kasper Dupont Signed-off-by: Thadeu Lima de Souza Cascardo Signed-off-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/core/neighbour.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index a18c2973b8c6d..c452ebf209394 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -239,6 +239,7 @@ static int neigh_forced_gc(struct neigh_table *tbl) write_lock(&n->lock); if ((n->nud_state == NUD_FAILED) || + (n->nud_state == NUD_NOARP) || (tbl->is_multicast && tbl->is_multicast(n->primary_key)) || time_after(tref, n->updated)) From 951358a824f96be927ae50fad1e72e05bbb57b56 Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Thu, 10 Jun 2021 13:39:29 +0200 Subject: [PATCH 138/269] Linux 5.10.43 Link: https://lore.kernel.org/r/20210608175942.377073879@linuxfoundation.org Tested-by: Shuah Khan Tested-by: Jon Hunter Tested-by: Linux Kernel Functional Testing Tested-by: Jason Self Tested-by: Salvatore Bonaccorso Tested-by: Florian Fainelli Tested-by: Guenter Roeck Tested-by: Pavel Machek (CIP) Tested-by: Sudip Mukherjee Signed-off-by: Greg Kroah-Hartman --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 290903d0e7dab..ec9ee8032a985 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 5 PATCHLEVEL = 10 -SUBLEVEL = 42 +SUBLEVEL = 43 EXTRAVERSION = NAME = Dare mighty things From f70102cb369cde6ab7551ca58152d00fd3478fec Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Tue, 8 Jun 2021 10:12:21 -0700 Subject: [PATCH 139/269] proc: Track /proc/$pid/attr/ opener mm_struct commit 591a22c14d3f45cc38bd1931c593c221df2f1881 upstream. Commit bfb819ea20ce ("proc: Check /proc/$pid/attr/ writes against file opener") tried to make sure that there could not be a confusion between the opener of a /proc/$pid/attr/ file and the writer. It used struct cred to make sure the privileges didn't change. However, there were existing cases where a more privileged thread was passing the opened fd to a differently privileged thread (during container setup). Instead, use mm_struct to track whether the opener and writer are still the same process. (This is what several other proc files already do, though for different reasons.) Reported-by: Christian Brauner Reported-by: Andrea Righi Tested-by: Andrea Righi Fixes: bfb819ea20ce ("proc: Check /proc/$pid/attr/ writes against file opener") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/proc/base.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 297ea12b3cfd2..511d98d09754a 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -2675,6 +2675,11 @@ static int proc_pident_readdir(struct file *file, struct dir_context *ctx, } #ifdef CONFIG_SECURITY +static int proc_pid_attr_open(struct inode *inode, struct file *file) +{ + return __mem_open(inode, file, PTRACE_MODE_READ_FSCREDS); +} + static ssize_t proc_pid_attr_read(struct file * file, char __user * buf, size_t count, loff_t *ppos) { @@ -2705,7 +2710,7 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf, int rv; /* A task may only write when it was the opener. */ - if (file->f_cred != current_real_cred()) + if (file->private_data != current->mm) return -EPERM; rcu_read_lock(); @@ -2755,9 +2760,11 @@ static ssize_t proc_pid_attr_write(struct file * file, const char __user * buf, } static const struct file_operations proc_pid_attr_operations = { + .open = proc_pid_attr_open, .read = proc_pid_attr_read, .write = proc_pid_attr_write, .llseek = generic_file_llseek, + .release = mem_release, }; #define LSM_DIR_OPS(LSM) \ From 3b89db7468f70dc8dd548acd8568b7afbfeefe6d Mon Sep 17 00:00:00 2001 From: Marco Felsch Date: Fri, 23 Apr 2021 15:54:02 +0200 Subject: [PATCH 140/269] ASoC: max98088: fix ni clock divider calculation [ Upstream commit 6c9762a78c325107dc37d20ee21002b841679209 ] The ni1/ni2 ratio formula [1] uses the pclk which is the prescaled mclk. The max98088 datasheet [2] has no such formula but table-12 equals so we can assume that it is the same for both devices. While on it make use of DIV_ROUND_CLOSEST_ULL(). [1] https://datasheets.maximintegrated.com/en/ds/MAX98089.pdf; page 86 [2] https://datasheets.maximintegrated.com/en/ds/MAX98088.pdf; page 82 Signed-off-by: Marco Felsch Link: https://lore.kernel.org/r/20210423135402.32105-1-m.felsch@pengutronix.de Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/max98088.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/sound/soc/codecs/max98088.c b/sound/soc/codecs/max98088.c index 4be24e7f51c89..f8e49e45ce33f 100644 --- a/sound/soc/codecs/max98088.c +++ b/sound/soc/codecs/max98088.c @@ -41,6 +41,7 @@ struct max98088_priv { enum max98088_type devtype; struct max98088_pdata *pdata; struct clk *mclk; + unsigned char mclk_prescaler; unsigned int sysclk; struct max98088_cdata dai[2]; int eq_textcnt; @@ -998,13 +999,16 @@ static int max98088_dai1_hw_params(struct snd_pcm_substream *substream, /* Configure NI when operating as master */ if (snd_soc_component_read(component, M98088_REG_14_DAI1_FORMAT) & M98088_DAI_MAS) { + unsigned long pclk; + if (max98088->sysclk == 0) { dev_err(component->dev, "Invalid system clock frequency\n"); return -EINVAL; } ni = 65536ULL * (rate < 50000 ? 96ULL : 48ULL) * (unsigned long long int)rate; - do_div(ni, (unsigned long long int)max98088->sysclk); + pclk = DIV_ROUND_CLOSEST(max98088->sysclk, max98088->mclk_prescaler); + ni = DIV_ROUND_CLOSEST_ULL(ni, pclk); snd_soc_component_write(component, M98088_REG_12_DAI1_CLKCFG_HI, (ni >> 8) & 0x7F); snd_soc_component_write(component, M98088_REG_13_DAI1_CLKCFG_LO, @@ -1065,13 +1069,16 @@ static int max98088_dai2_hw_params(struct snd_pcm_substream *substream, /* Configure NI when operating as master */ if (snd_soc_component_read(component, M98088_REG_1C_DAI2_FORMAT) & M98088_DAI_MAS) { + unsigned long pclk; + if (max98088->sysclk == 0) { dev_err(component->dev, "Invalid system clock frequency\n"); return -EINVAL; } ni = 65536ULL * (rate < 50000 ? 96ULL : 48ULL) * (unsigned long long int)rate; - do_div(ni, (unsigned long long int)max98088->sysclk); + pclk = DIV_ROUND_CLOSEST(max98088->sysclk, max98088->mclk_prescaler); + ni = DIV_ROUND_CLOSEST_ULL(ni, pclk); snd_soc_component_write(component, M98088_REG_1A_DAI2_CLKCFG_HI, (ni >> 8) & 0x7F); snd_soc_component_write(component, M98088_REG_1B_DAI2_CLKCFG_LO, @@ -1113,8 +1120,10 @@ static int max98088_dai_set_sysclk(struct snd_soc_dai *dai, */ if ((freq >= 10000000) && (freq < 20000000)) { snd_soc_component_write(component, M98088_REG_10_SYS_CLK, 0x10); + max98088->mclk_prescaler = 1; } else if ((freq >= 20000000) && (freq < 30000000)) { snd_soc_component_write(component, M98088_REG_10_SYS_CLK, 0x20); + max98088->mclk_prescaler = 2; } else { dev_err(component->dev, "Invalid master clock frequency\n"); return -EINVAL; From cb24d57ad5720714e4f0dd9dcb0c4b325ab54adf Mon Sep 17 00:00:00 2001 From: Vijendar Mukunda Date: Wed, 28 Apr 2021 01:53:31 +0530 Subject: [PATCH 141/269] ASoC: amd: fix for pcm_read() error [ Upstream commit 6879e8e759bf9e05eaee85e32ca1a936e6b46da1 ] Below phython script throwing pcm_read() error. import subprocess p = subprocess.Popen(["aplay -t raw -D plughw:1,0 /dev/zero"], shell=True) subprocess.call(["arecord -Dhw:1,0 --dump-hw-params"], shell=True) subprocess.call(["arecord -Dhw:1,0 -fdat -d1 /dev/null"], shell=True) p.kill() Handling ACP global external interrupt enable register causing this issue. This register got updated wrongly when there is active stream causing interrupts disabled for active stream. Refactored code to handle enabling and disabling external interrupts. Signed-off-by: Vijendar Mukunda Link: https://lore.kernel.org/r/1619555017-29858-1-git-send-email-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/amd/raven/acp3x-pcm-dma.c | 10 ---------- sound/soc/amd/raven/acp3x.h | 1 + sound/soc/amd/raven/pci-acp3x.c | 15 +++++++++++++++ 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/sound/soc/amd/raven/acp3x-pcm-dma.c b/sound/soc/amd/raven/acp3x-pcm-dma.c index 417cda24030cd..2447a1e6e913f 100644 --- a/sound/soc/amd/raven/acp3x-pcm-dma.c +++ b/sound/soc/amd/raven/acp3x-pcm-dma.c @@ -237,10 +237,6 @@ static int acp3x_dma_open(struct snd_soc_component *component, return ret; } - if (!adata->play_stream && !adata->capture_stream && - !adata->i2ssp_play_stream && !adata->i2ssp_capture_stream) - rv_writel(1, adata->acp3x_base + mmACP_EXTERNAL_INTR_ENB); - i2s_data->acp3x_base = adata->acp3x_base; runtime->private_data = i2s_data; return ret; @@ -367,12 +363,6 @@ static int acp3x_dma_close(struct snd_soc_component *component, } } - /* Disable ACP irq, when the current stream is being closed and - * another stream is also not active. - */ - if (!adata->play_stream && !adata->capture_stream && - !adata->i2ssp_play_stream && !adata->i2ssp_capture_stream) - rv_writel(0, adata->acp3x_base + mmACP_EXTERNAL_INTR_ENB); return 0; } diff --git a/sound/soc/amd/raven/acp3x.h b/sound/soc/amd/raven/acp3x.h index 03fe93913e12e..c3f0c8b7545db 100644 --- a/sound/soc/amd/raven/acp3x.h +++ b/sound/soc/amd/raven/acp3x.h @@ -77,6 +77,7 @@ #define ACP_POWER_OFF_IN_PROGRESS 0x03 #define ACP3x_ITER_IRER_SAMP_LEN_MASK 0x38 +#define ACP_EXT_INTR_STAT_CLEAR_MASK 0xFFFFFFFF struct acp3x_platform_info { u16 play_i2s_instance; diff --git a/sound/soc/amd/raven/pci-acp3x.c b/sound/soc/amd/raven/pci-acp3x.c index 77f2d93896067..df83d2ce75ea7 100644 --- a/sound/soc/amd/raven/pci-acp3x.c +++ b/sound/soc/amd/raven/pci-acp3x.c @@ -76,6 +76,19 @@ static int acp3x_reset(void __iomem *acp3x_base) return -ETIMEDOUT; } +static void acp3x_enable_interrupts(void __iomem *acp_base) +{ + rv_writel(0x01, acp_base + mmACP_EXTERNAL_INTR_ENB); +} + +static void acp3x_disable_interrupts(void __iomem *acp_base) +{ + rv_writel(ACP_EXT_INTR_STAT_CLEAR_MASK, acp_base + + mmACP_EXTERNAL_INTR_STAT); + rv_writel(0x00, acp_base + mmACP_EXTERNAL_INTR_CNTL); + rv_writel(0x00, acp_base + mmACP_EXTERNAL_INTR_ENB); +} + static int acp3x_init(struct acp3x_dev_data *adata) { void __iomem *acp3x_base = adata->acp3x_base; @@ -93,6 +106,7 @@ static int acp3x_init(struct acp3x_dev_data *adata) pr_err("ACP3x reset failed\n"); return ret; } + acp3x_enable_interrupts(acp3x_base); return 0; } @@ -100,6 +114,7 @@ static int acp3x_deinit(void __iomem *acp3x_base) { int ret; + acp3x_disable_interrupts(acp3x_base); /* Reset */ ret = acp3x_reset(acp3x_base); if (ret) { From 4b8b7bc3a726268e5c15d9bafe27863a85fdfc8e Mon Sep 17 00:00:00 2001 From: Saravana Kannan Date: Mon, 26 Apr 2021 16:56:38 -0700 Subject: [PATCH 142/269] spi: Fix spi device unregister flow [ Upstream commit c7299fea67696db5bd09d924d1f1080d894f92ef ] When an SPI device is unregistered, the spi->controller->cleanup() is called in the device's release callback. That's wrong for a couple of reasons: 1. spi_dev_put() can be called before spi_add_device() is called. And it's spi_add_device() that calls spi_setup(). This will cause clean() to get called without the spi device ever being setup. 2. There's no guarantee that the controller's driver would be present by the time the spi device's release function gets called. 3. It also causes "sleeping in atomic context" stack dump[1] when device link deletion code does a put_device() on the spi device. Fix these issues by simply moving the cleanup from the device release callback to the actual spi_unregister_device() function. [1] - https://lore.kernel.org/lkml/CAHp75Vc=FCGcUyS0v6fnxme2YJ+qD+Y-hQDQLa2JhWNON9VmsQ@mail.gmail.com/ Signed-off-by: Saravana Kannan Link: https://lore.kernel.org/r/20210426235638.1285530-1-saravanak@google.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index a6f1e94af13c5..96560853b3a3c 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -47,10 +47,6 @@ static void spidev_release(struct device *dev) { struct spi_device *spi = to_spi_device(dev); - /* spi controllers may cleanup for released devices */ - if (spi->controller->cleanup) - spi->controller->cleanup(spi); - spi_controller_put(spi->controller); kfree(spi->driver_override); kfree(spi); @@ -550,6 +546,12 @@ static int spi_dev_check(struct device *dev, void *data) return 0; } +static void spi_cleanup(struct spi_device *spi) +{ + if (spi->controller->cleanup) + spi->controller->cleanup(spi); +} + /** * spi_add_device - Add spi_device allocated with spi_alloc_device * @spi: spi_device to register @@ -614,11 +616,13 @@ int spi_add_device(struct spi_device *spi) /* Device may be bound to an active driver when this returns */ status = device_add(&spi->dev); - if (status < 0) + if (status < 0) { dev_err(dev, "can't add %s, status %d\n", dev_name(&spi->dev), status); - else + spi_cleanup(spi); + } else { dev_dbg(dev, "registered child %s\n", dev_name(&spi->dev)); + } done: mutex_unlock(&spi_add_lock); @@ -705,6 +709,8 @@ void spi_unregister_device(struct spi_device *spi) if (!spi) return; + spi_cleanup(spi); + if (spi->dev.of_node) { of_node_clear_flag(spi->dev.of_node, OF_POPULATED); of_node_put(spi->dev.of_node); From fb91ab403e6ee65ef8e193198dc5f52f723e2b93 Mon Sep 17 00:00:00 2001 From: Karen Dombroski Date: Wed, 28 Apr 2021 23:38:02 -0600 Subject: [PATCH 143/269] spi: spi-zynq-qspi: Fix stack violation bug [ Upstream commit 6d5ff8e632a4f2389c331e5554cd1c2a9a28c7aa ] When the number of bytes for the op is greater than one, the read could run off the end of the function stack and cause a crash. This patch restores the behaviour of safely reading out of the original opcode location. Signed-off-by: Karen Dombroski Signed-off-by: Amit Kumar Mahapatra Link: https://lore.kernel.org/r/20210429053802.17650-3-amit.kumar-mahapatra@xilinx.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi-zynq-qspi.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/spi/spi-zynq-qspi.c b/drivers/spi/spi-zynq-qspi.c index 5d8a5ee62fa23..2765289028fae 100644 --- a/drivers/spi/spi-zynq-qspi.c +++ b/drivers/spi/spi-zynq-qspi.c @@ -528,18 +528,17 @@ static int zynq_qspi_exec_mem_op(struct spi_mem *mem, struct zynq_qspi *xqspi = spi_controller_get_devdata(mem->spi->master); int err = 0, i; u8 *tmpbuf; - u8 opcode = op->cmd.opcode; dev_dbg(xqspi->dev, "cmd:%#x mode:%d.%d.%d.%d\n", - opcode, op->cmd.buswidth, op->addr.buswidth, + op->cmd.opcode, op->cmd.buswidth, op->addr.buswidth, op->dummy.buswidth, op->data.buswidth); zynq_qspi_chipselect(mem->spi, true); zynq_qspi_config_op(xqspi, mem->spi); - if (op->cmd.nbytes) { + if (op->cmd.opcode) { reinit_completion(&xqspi->data_completion); - xqspi->txbuf = &opcode; + xqspi->txbuf = (u8 *)&op->cmd.opcode; xqspi->rxbuf = NULL; xqspi->tx_bytes = op->cmd.nbytes; xqspi->rx_bytes = op->cmd.nbytes; From 584b2c7ce24450a7c687f976b54333607e14e058 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Wed, 5 May 2021 15:25:29 +0200 Subject: [PATCH 144/269] bpf: Forbid trampoline attach for functions with variable arguments [ Upstream commit 31379397dcc364a59ce764fabb131b645c43e340 ] We can't currently allow to attach functions with variable arguments. The problem is that we should save all the registers for arguments, which is probably doable, but if caller uses more than 6 arguments, we need stack data, which will be wrong, because of the extra stack frame we do in bpf trampoline, so we could crash. Also currently there's malformed trampoline code generated for such functions at the moment as described in: https://lore.kernel.org/bpf/20210429212834.82621-1-jolsa@kernel.org/ Signed-off-by: Jiri Olsa Signed-off-by: Daniel Borkmann Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/20210505132529.401047-1-jolsa@kernel.org Signed-off-by: Sasha Levin --- kernel/bpf/btf.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index ed7d02e8bc939..aaf2fbaa0cc76 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -4960,6 +4960,12 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, m->ret_size = ret; for (i = 0; i < nargs; i++) { + if (i == nargs - 1 && args[i].type == 0) { + bpf_log(log, + "The function %s with variable args is unsupported.\n", + tname); + return -EINVAL; + } ret = __get_type_size(btf, args[i].type, &t); if (ret < 0) { bpf_log(log, @@ -4967,6 +4973,12 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]); return -EINVAL; } + if (ret == 0) { + bpf_log(log, + "The function %s has malformed void argument.\n", + tname); + return -EINVAL; + } m->arg_size[i] = ret; } m->nr_args = nargs; From 1e5cab50208c8fb7351b798cb1d569debfeb994a Mon Sep 17 00:00:00 2001 From: Jeimon Date: Sat, 8 May 2021 11:52:30 +0800 Subject: [PATCH 145/269] net/nfc/rawsock.c: fix a permission check bug [ Upstream commit 8ab78863e9eff11910e1ac8bcf478060c29b379e ] The function rawsock_create() calls a privileged function sk_alloc(), which requires a ns-aware check to check net->user_ns, i.e., ns_capable(). However, the original code checks the init_user_ns using capable(). So we replace the capable() with ns_capable(). Signed-off-by: Jeimon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/nfc/rawsock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c index 9c7eb8455ba8e..5f1d438a0a23f 100644 --- a/net/nfc/rawsock.c +++ b/net/nfc/rawsock.c @@ -329,7 +329,7 @@ static int rawsock_create(struct net *net, struct socket *sock, return -ESOCKTNOSUPPORT; if (sock->type == SOCK_RAW) { - if (!capable(CAP_NET_RAW)) + if (!ns_capable(net->user_ns, CAP_NET_RAW)) return -EPERM; sock->ops = &rawsock_raw_ops; } else { From 2f523cd4a9311cba629facc7d353eabbd492bd5b Mon Sep 17 00:00:00 2001 From: Dinghao Liu Date: Mon, 12 Apr 2021 13:49:07 +0800 Subject: [PATCH 146/269] usb: cdns3: Fix runtime PM imbalance on error [ Upstream commit 07adc0225484fc199e3dc15ec889f75f498c4fca ] When cdns3_gadget_start() fails, a pairing PM usage counter decrement is needed to keep the counter balanced. Signed-off-by: Dinghao Liu Link: https://lore.kernel.org/r/20210412054908.7975-1-dinghao.liu@zju.edu.cn Signed-off-by: Peter Chen Signed-off-by: Sasha Levin --- drivers/usb/cdns3/gadget.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/cdns3/gadget.c b/drivers/usb/cdns3/gadget.c index 0aa85cc07ff19..c24c0e3440e39 100644 --- a/drivers/usb/cdns3/gadget.c +++ b/drivers/usb/cdns3/gadget.c @@ -3255,8 +3255,10 @@ static int __cdns3_gadget_init(struct cdns3 *cdns) pm_runtime_get_sync(cdns->dev); ret = cdns3_gadget_start(cdns); - if (ret) + if (ret) { + pm_runtime_put_sync(cdns->dev); return ret; + } /* * Because interrupt line can be shared with other components in From a5ee8f54d0ab1fd5f953d4a5a161682745461901 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sat, 8 May 2021 17:01:45 +0200 Subject: [PATCH 147/269] ASoC: Intel: bytcr_rt5640: Add quirk for the Glavey TM800A550L tablet [ Upstream commit 28c268d3acdd4cbcd2ac320b85609e77f84e74a7 ] Add a quirk for the Glavey TM800A550L tablet, this BYTCR tablet has no CHAN package in its ACPI tables and uses SSP0-AIF1 rather then SSP0-AIF2 which is the default for BYTCR devices. Signed-off-by: Hans de Goede Acked-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20210508150146.28403-1-hdegoede@redhat.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/intel/boards/bytcr_rt5640.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c index 1ef0464249d1b..d5c905b006853 100644 --- a/sound/soc/intel/boards/bytcr_rt5640.c +++ b/sound/soc/intel/boards/bytcr_rt5640.c @@ -570,6 +570,17 @@ static const struct dmi_system_id byt_rt5640_quirk_table[] = { BYT_RT5640_SSP0_AIF1 | BYT_RT5640_MCLK_EN), }, + { /* Glavey TM800A550L */ + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "AMI Corporation"), + DMI_MATCH(DMI_BOARD_NAME, "Aptio CRB"), + /* Above strings are too generic, also match on BIOS version */ + DMI_MATCH(DMI_BIOS_VERSION, "ZY-8-BI-PX4S70VTR400-X423B-005-D"), + }, + .driver_data = (void *)(BYTCR_INPUT_DEFAULTS | + BYT_RT5640_SSP0_AIF1 | + BYT_RT5640_MCLK_EN), + }, { .matches = { DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), From 4352209ed0f0d5e1ff48ace27931e9fc43d10c40 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sat, 8 May 2021 17:01:46 +0200 Subject: [PATCH 148/269] ASoC: Intel: bytcr_rt5640: Add quirk for the Lenovo Miix 3-830 tablet [ Upstream commit f0353e1f53f92f7b3da91e6669f5d58ee222ebe8 ] The Lenovo Miix 3-830 tablet has only 1 speaker, has an internal analog mic on IN1 and uses JD2 for jack-detect, add a quirk to automatically apply these settings on Lenovo Miix 3-830 tablets. Signed-off-by: Hans de Goede Link: https://lore.kernel.org/r/20210508150146.28403-2-hdegoede@redhat.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/intel/boards/bytcr_rt5640.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c index d5c905b006853..ca14730232ba9 100644 --- a/sound/soc/intel/boards/bytcr_rt5640.c +++ b/sound/soc/intel/boards/bytcr_rt5640.c @@ -659,6 +659,20 @@ static const struct dmi_system_id byt_rt5640_quirk_table[] = { BYT_RT5640_MONO_SPEAKER | BYT_RT5640_MCLK_EN), }, + { /* Lenovo Miix 3-830 */ + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "Lenovo MIIX 3-830"), + }, + .driver_data = (void *)(BYT_RT5640_IN1_MAP | + BYT_RT5640_JD_SRC_JD2_IN4N | + BYT_RT5640_OVCD_TH_2000UA | + BYT_RT5640_OVCD_SF_0P75 | + BYT_RT5640_MONO_SPEAKER | + BYT_RT5640_DIFF_MIC | + BYT_RT5640_SSP0_AIF1 | + BYT_RT5640_MCLK_EN), + }, { /* Linx Linx7 tablet */ .matches = { DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LINX"), From cad3dc73c0645d00adfe96cebc8d950897cc1227 Mon Sep 17 00:00:00 2001 From: Eric Farman Date: Tue, 11 May 2021 21:56:30 +0200 Subject: [PATCH 149/269] vfio-ccw: Reset FSM state to IDLE inside FSM [ Upstream commit 6c02ac4c9211edabe17bda437ac97e578756f31b ] When an I/O request is made, the fsm_io_request() routine moves the FSM state from IDLE to CP_PROCESSING, and then fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL received a cc0. Yet, the error case to go from CP_PROCESSING back to IDLE is done after the FSM call returns. Let's move this up into the FSM proper, to provide some better symmetry when unwinding in this case. Signed-off-by: Eric Farman Reviewed-by: Cornelia Huck Acked-by: Matthew Rosato Message-Id: <20210511195631.3995081-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck Signed-off-by: Sasha Levin --- drivers/s390/cio/vfio_ccw_fsm.c | 1 + drivers/s390/cio/vfio_ccw_ops.c | 2 -- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c index 23e61aa638e4e..e435a9cd92dac 100644 --- a/drivers/s390/cio/vfio_ccw_fsm.c +++ b/drivers/s390/cio/vfio_ccw_fsm.c @@ -318,6 +318,7 @@ static void fsm_io_request(struct vfio_ccw_private *private, } err_out: + private->state = VFIO_CCW_STATE_IDLE; trace_vfio_ccw_fsm_io_request(scsw->cmd.fctl, schid, io_region->ret_code, errstr); } diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index 1ad5f7018ec2d..2280f51dd679d 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -276,8 +276,6 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private, } vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ); - if (region->ret_code != 0) - private->state = VFIO_CCW_STATE_IDLE; ret = (region->ret_code != 0) ? region->ret_code : count; out_unlock: From 01905f3232fdc0737de5c38e9d817f87a06a1a6d Mon Sep 17 00:00:00 2001 From: Eric Farman Date: Tue, 11 May 2021 21:56:31 +0200 Subject: [PATCH 150/269] vfio-ccw: Serialize FSM IDLE state with I/O completion [ Upstream commit 2af7a834a435460d546f0cf0a8b8e4d259f1d910 ] Today, the stacked call to vfio_ccw_sch_io_todo() does three things: 1) Update a solicited IRB with CP information, and release the CP if the interrupt was the end of a START operation. 2) Copy the IRB data into the io_region, under the protection of the io_mutex 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that vfio-ccw can accept more work. The trouble is that step 3 is (A) invoked for both solicited and unsolicited interrupts, and (B) sitting after the mutex for step 2. This second piece becomes a problem if it processes an interrupt for a CLEAR SUBCHANNEL while another thread initiates a START, thus allowing the CP and FSM states to get out of sync. That is: CPU 1 CPU 2 fsm_do_clear() fsm_irq() fsm_io_request() vfio_ccw_sch_io_todo() fsm_io_helper() Since the FSM state and CP should be kept in sync, let's make a note when the CP is released, and rely on that as an indication that the FSM should also be reset at the end of this routine and open up the device for more work. Signed-off-by: Eric Farman Acked-by: Matthew Rosato Reviewed-by: Cornelia Huck Message-Id: <20210511195631.3995081-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck Signed-off-by: Sasha Levin --- drivers/s390/cio/vfio_ccw_drv.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c index 8c625b530035f..9b61e9b131ade 100644 --- a/drivers/s390/cio/vfio_ccw_drv.c +++ b/drivers/s390/cio/vfio_ccw_drv.c @@ -86,6 +86,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) struct vfio_ccw_private *private; struct irb *irb; bool is_final; + bool cp_is_finished = false; private = container_of(work, struct vfio_ccw_private, io_work); irb = &private->irb; @@ -94,14 +95,21 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); if (scsw_is_solicited(&irb->scsw)) { cp_update_scsw(&private->cp, &irb->scsw); - if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) + if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) { cp_free(&private->cp); + cp_is_finished = true; + } } mutex_lock(&private->io_mutex); memcpy(private->io_region->irb_area, irb, sizeof(*irb)); mutex_unlock(&private->io_mutex); - if (private->mdev && is_final) + /* + * Reset to IDLE only if processing of a channel program + * has finished. Do not overwrite a possible processing + * state if the final interrupt was for HSCH or CSCH. + */ + if (private->mdev && cp_is_finished) private->state = VFIO_CCW_STATE_IDLE; if (private->io_trigger) From 369f3caa4d74380efdbf614a01de067171fa19a1 Mon Sep 17 00:00:00 2001 From: Zou Wei Date: Wed, 12 May 2021 11:12:25 +0800 Subject: [PATCH 151/269] ASoC: sti-sas: add missing MODULE_DEVICE_TABLE [ Upstream commit e072b2671606c77538d6a4dd5dda80b508cb4816 ] This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot Signed-off-by: Zou Wei Link: https://lore.kernel.org/r/1620789145-14936-1-git-send-email-zou_wei@huawei.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/sti-sas.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/soc/codecs/sti-sas.c b/sound/soc/codecs/sti-sas.c index ec9933b054ad3..423daac9d5a9f 100644 --- a/sound/soc/codecs/sti-sas.c +++ b/sound/soc/codecs/sti-sas.c @@ -411,6 +411,7 @@ static const struct of_device_id sti_sas_dev_match[] = { }, {}, }; +MODULE_DEVICE_TABLE(of, sti_sas_dev_match); static int sti_sas_driver_probe(struct platform_device *pdev) { From 06e84ea1f4573c13a37e22d522a5d4f8f8a9e378 Mon Sep 17 00:00:00 2001 From: Chunyan Zhang Date: Wed, 12 May 2021 17:35:34 +0800 Subject: [PATCH 152/269] spi: sprd: Add missing MODULE_DEVICE_TABLE [ Upstream commit 7907cad7d07e0055789ec0c534452f19dfe1fc80 ] MODULE_DEVICE_TABLE is used to extract the device information out of the driver and builds a table when being compiled. If using this macro, kernel can find the driver if available when the device is plugged in, and then loads that driver and initializes the device. Signed-off-by: Chunyan Zhang Link: https://lore.kernel.org/r/20210512093534.243040-1-zhang.lyra@gmail.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi-sprd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/spi/spi-sprd.c b/drivers/spi/spi-sprd.c index b41a75749b498..28e70db9bbba8 100644 --- a/drivers/spi/spi-sprd.c +++ b/drivers/spi/spi-sprd.c @@ -1068,6 +1068,7 @@ static const struct of_device_id sprd_spi_of_match[] = { { .compatible = "sprd,sc9860-spi", }, { /* sentinel */ } }; +MODULE_DEVICE_TABLE(of, sprd_spi_of_match); static struct platform_driver sprd_spi_driver = { .driver = { From 2e2145ccfbcb0dd38d8423681d22b595ca735846 Mon Sep 17 00:00:00 2001 From: Li Jun Date: Fri, 14 May 2021 18:59:44 +0800 Subject: [PATCH 153/269] usb: chipidea: udc: assign interrupt number to USB gadget structure [ Upstream commit 9e3927f6373da54cb17e17f4bd700907e1123d2f ] Chipidea also need sync interrupt before unbind the udc while gadget remove driver, otherwise setup irq handling may happen while unbind, see below dump generated from android function switch stress test: [ 4703.503056] android_work: sent uevent USB_STATE=CONNECTED [ 4703.514642] android_work: sent uevent USB_STATE=DISCONNECTED [ 4703.651339] android_work: sent uevent USB_STATE=CONNECTED [ 4703.661806] init: Control message: Processed ctl.stop for 'adbd' from pid: 561 (system_server) [ 4703.673469] init: processing action (init.svc.adbd=stopped) from (/system/etc/init/hw/init.usb.configfs.rc:14) [ 4703.676451] Unable to handle kernel read from unreadable memory at virtual address 0000000000000090 [ 4703.676454] Mem abort info: [ 4703.676458] ESR = 0x96000004 [ 4703.676461] EC = 0x25: DABT (current EL), IL = 32 bits [ 4703.676464] SET = 0, FnV = 0 [ 4703.676466] EA = 0, S1PTW = 0 [ 4703.676468] Data abort info: [ 4703.676471] ISV = 0, ISS = 0x00000004 [ 4703.676473] CM = 0, WnR = 0 [ 4703.676478] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004a867000 [ 4703.676481] [0000000000000090] pgd=0000000000000000, p4d=0000000000000000 [ 4703.676503] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 4703.758297] Modules linked in: synaptics_dsx_i2c moal(O) mlan(O) [ 4703.764327] CPU: 0 PID: 235 Comm: lmkd Tainted: G W O 5.10.9-00001-g3f5fd8487c38-dirty #63 [ 4703.773720] Hardware name: NXP i.MX8MNano EVK board (DT) [ 4703.779033] pstate: 60400085 (nZCv daIf +PAN -UAO -TCO BTYPE=--) [ 4703.785046] pc : _raw_write_unlock_bh+0xc0/0x2c8 [ 4703.789667] lr : android_setup+0x4c/0x168 [ 4703.793676] sp : ffff80001256bd80 [ 4703.796989] x29: ffff80001256bd80 x28: 00000000000000a8 [ 4703.802304] x27: ffff800012470000 x26: ffff80006d923000 [ 4703.807616] x25: ffff800012471000 x24: ffff00000b091140 [ 4703.812929] x23: ffff0000077dbd38 x22: ffff0000077da490 [ 4703.818242] x21: ffff80001256be30 x20: 0000000000000000 [ 4703.823554] x19: 0000000000000080 x18: ffff800012561048 [ 4703.828867] x17: 0000000000000000 x16: 0000000000000039 [ 4703.834180] x15: ffff8000106ad258 x14: ffff80001194c277 [ 4703.839493] x13: 0000000000003934 x12: 0000000000000000 [ 4703.844805] x11: 0000000000000000 x10: 0000000000000001 [ 4703.850117] x9 : 0000000000000000 x8 : 0000000000000090 [ 4703.855429] x7 : 6f72646e61203a70 x6 : ffff8000124f2450 [ 4703.860742] x5 : ffffffffffffffff x4 : 0000000000000009 [ 4703.866054] x3 : ffff8000108a290c x2 : ffff00007fb3a9c8 [ 4703.871367] x1 : 0000000000000000 x0 : 0000000000000090 [ 4703.876681] Call trace: [ 4703.879129] _raw_write_unlock_bh+0xc0/0x2c8 [ 4703.883397] android_setup+0x4c/0x168 [ 4703.887059] udc_irq+0x824/0xa9c [ 4703.890287] ci_irq+0x124/0x148 [ 4703.893429] __handle_irq_event_percpu+0x84/0x268 [ 4703.898131] handle_irq_event+0x64/0x14c [ 4703.902054] handle_fasteoi_irq+0x110/0x210 [ 4703.906236] __handle_domain_irq+0x8c/0xd4 [ 4703.910332] gic_handle_irq+0x6c/0x124 [ 4703.914081] el1_irq+0xdc/0x1c0 [ 4703.917221] _raw_spin_unlock_irq+0x20/0x54 [ 4703.921405] finish_task_switch+0x84/0x224 [ 4703.925502] __schedule+0x4a4/0x734 [ 4703.928990] schedule+0xa0/0xe8 [ 4703.932132] do_notify_resume+0x150/0x184 [ 4703.936140] work_pending+0xc/0x40c [ 4703.939633] Code: d5384613 521b0a69 d5184609 f9800111 (885ffd01) [ 4703.945732] ---[ end trace ba5c1875ae49d53c ]--- [ 4703.950350] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 4703.957223] SMP: stopping secondary CPUs [ 4703.961151] Kernel Offset: disabled [ 4703.964638] CPU features: 0x0240002,2000200c [ 4703.968905] Memory Limit: none [ 4703.971963] Rebooting in 5 seconds.. Tested-by: faqiang.zhu Signed-off-by: Li Jun Link: https://lore.kernel.org/r/1620989984-7653-1-git-send-email-jun.li@nxp.com Signed-off-by: Peter Chen Signed-off-by: Sasha Levin --- drivers/usb/chipidea/udc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c index 60ea932afe2b8..5f35cdd2cf1dd 100644 --- a/drivers/usb/chipidea/udc.c +++ b/drivers/usb/chipidea/udc.c @@ -2055,6 +2055,7 @@ static int udc_start(struct ci_hdrc *ci) ci->gadget.name = ci->platdata->name; ci->gadget.otg_caps = otg_caps; ci->gadget.sg_supported = 1; + ci->gadget.irq = ci->irq; if (ci->platdata->flags & CI_HDRC_REQUIRES_ALIGNED_DMA) ci->gadget.quirk_avoids_skb_reserve = 1; From 143fc7220961220eecc04669e5909af8847bf8c8 Mon Sep 17 00:00:00 2001 From: Zheyu Ma Date: Sun, 16 May 2021 07:11:40 +0000 Subject: [PATCH 154/269] isdn: mISDN: netjet: Fix crash in nj_probe: [ Upstream commit 9f6f852550d0e1b7735651228116ae9d300f69b3 ] 'nj_setup' in netjet.c might fail with -EIO and in this case 'card->irq' is initialized and is bigger than zero. A subsequent call to 'nj_release' will free the irq that has not been requested. Fix this bug by deleting the previous assignment to 'card->irq' and just keep the assignment before 'request_irq'. The KASAN's log reveals it: [ 3.354615 ] WARNING: CPU: 0 PID: 1 at kernel/irq/manage.c:1826 free_irq+0x100/0x480 [ 3.355112 ] Modules linked in: [ 3.355310 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00144-g25a1298726e #13 [ 3.355816 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.356552 ] RIP: 0010:free_irq+0x100/0x480 [ 3.356820 ] Code: 6e 08 74 6f 4d 89 f4 e8 5e ac 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 4f ac 09 00 8b 75 c8 48 c7 c7 78 c1 2e 85 e8 e0 cf f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 72 33 0b 03 48 8b 43 40 4c 8b a0 80 [ 3.358012 ] RSP: 0000:ffffc90000017b48 EFLAGS: 00010082 [ 3.358357 ] RAX: 0000000000000000 RBX: ffff888104dc8000 RCX: 0000000000000000 [ 3.358814 ] RDX: ffff8881003c8000 RSI: ffffffff8124a9e6 RDI: 00000000ffffffff [ 3.359272 ] RBP: ffffc90000017b88 R08: 0000000000000000 R09: 0000000000000000 [ 3.359732 ] R10: ffffc900000179f0 R11: 0000000000001d04 R12: 0000000000000000 [ 3.360195 ] R13: ffff888107dc6000 R14: ffff888107dc6928 R15: ffff888104dc80a8 [ 3.360652 ] FS: 0000000000000000(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000 [ 3.361170 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.361538 ] CR2: 0000000000000000 CR3: 000000000582e000 CR4: 00000000000006f0 [ 3.362003 ] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.362175 ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.362175 ] Call Trace: [ 3.362175 ] nj_release+0x51/0x1e0 [ 3.362175 ] nj_probe+0x450/0x950 [ 3.362175 ] ? pci_device_remove+0x110/0x110 [ 3.362175 ] local_pci_probe+0x45/0xa0 [ 3.362175 ] pci_device_probe+0x12b/0x1d0 [ 3.362175 ] really_probe+0x2a9/0x610 [ 3.362175 ] driver_probe_device+0x90/0x1d0 [ 3.362175 ] ? mutex_lock_nested+0x1b/0x20 [ 3.362175 ] device_driver_attach+0x68/0x70 [ 3.362175 ] __driver_attach+0x124/0x1b0 [ 3.362175 ] ? device_driver_attach+0x70/0x70 [ 3.362175 ] bus_for_each_dev+0xbb/0x110 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] driver_attach+0x27/0x30 [ 3.362175 ] bus_add_driver+0x1eb/0x2a0 [ 3.362175 ] driver_register+0xa9/0x180 [ 3.362175 ] __pci_register_driver+0x82/0x90 [ 3.362175 ] ? w6692_init+0x38/0x38 [ 3.362175 ] nj_init+0x36/0x38 [ 3.362175 ] do_one_initcall+0x7f/0x3d0 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.362175 ] kernel_init_freeable+0x2aa/0x301 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] kernel_init+0x18/0x190 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ret_from_fork+0x1f/0x30 [ 3.362175 ] Kernel panic - not syncing: panic_on_warn set ... [ 3.362175 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00144-g25a1298726e #13 [ 3.362175 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.362175 ] Call Trace: [ 3.362175 ] dump_stack+0xba/0xf5 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] panic+0x15a/0x3f2 [ 3.362175 ] ? __warn+0xf2/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] __warn+0x108/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] report_bug+0x119/0x1c0 [ 3.362175 ] handle_bug+0x3b/0x80 [ 3.362175 ] exc_invalid_op+0x18/0x70 [ 3.362175 ] asm_exc_invalid_op+0x12/0x20 [ 3.362175 ] RIP: 0010:free_irq+0x100/0x480 [ 3.362175 ] Code: 6e 08 74 6f 4d 89 f4 e8 5e ac 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 4f ac 09 00 8b 75 c8 48 c7 c7 78 c1 2e 85 e8 e0 cf f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 72 33 0b 03 48 8b 43 40 4c 8b a0 80 [ 3.362175 ] RSP: 0000:ffffc90000017b48 EFLAGS: 00010082 [ 3.362175 ] RAX: 0000000000000000 RBX: ffff888104dc8000 RCX: 0000000000000000 [ 3.362175 ] RDX: ffff8881003c8000 RSI: ffffffff8124a9e6 RDI: 00000000ffffffff [ 3.362175 ] RBP: ffffc90000017b88 R08: 0000000000000000 R09: 0000000000000000 [ 3.362175 ] R10: ffffc900000179f0 R11: 0000000000001d04 R12: 0000000000000000 [ 3.362175 ] R13: ffff888107dc6000 R14: ffff888107dc6928 R15: ffff888104dc80a8 [ 3.362175 ] ? vprintk+0x76/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] nj_release+0x51/0x1e0 [ 3.362175 ] nj_probe+0x450/0x950 [ 3.362175 ] ? pci_device_remove+0x110/0x110 [ 3.362175 ] local_pci_probe+0x45/0xa0 [ 3.362175 ] pci_device_probe+0x12b/0x1d0 [ 3.362175 ] really_probe+0x2a9/0x610 [ 3.362175 ] driver_probe_device+0x90/0x1d0 [ 3.362175 ] ? mutex_lock_nested+0x1b/0x20 [ 3.362175 ] device_driver_attach+0x68/0x70 [ 3.362175 ] __driver_attach+0x124/0x1b0 [ 3.362175 ] ? device_driver_attach+0x70/0x70 [ 3.362175 ] bus_for_each_dev+0xbb/0x110 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] driver_attach+0x27/0x30 [ 3.362175 ] bus_add_driver+0x1eb/0x2a0 [ 3.362175 ] driver_register+0xa9/0x180 [ 3.362175 ] __pci_register_driver+0x82/0x90 [ 3.362175 ] ? w6692_init+0x38/0x38 [ 3.362175 ] nj_init+0x36/0x38 [ 3.362175 ] do_one_initcall+0x7f/0x3d0 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.362175 ] kernel_init_freeable+0x2aa/0x301 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] kernel_init+0x18/0x190 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ret_from_fork+0x1f/0x30 [ 3.362175 ] Dumping ftrace buffer: [ 3.362175 ] (ftrace buffer empty) [ 3.362175 ] Kernel Offset: disabled [ 3.362175 ] Rebooting in 1 seconds.. Reported-by: Zheyu Ma Signed-off-by: Zheyu Ma Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/isdn/hardware/mISDN/netjet.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/netjet.c b/drivers/isdn/hardware/mISDN/netjet.c index ee925b58bbcea..2a1ddd47a0968 100644 --- a/drivers/isdn/hardware/mISDN/netjet.c +++ b/drivers/isdn/hardware/mISDN/netjet.c @@ -1100,7 +1100,6 @@ nj_probe(struct pci_dev *pdev, const struct pci_device_id *ent) card->typ = NETJET_S_TJ300; card->base = pci_resource_start(pdev, 0); - card->irq = pdev->irq; pci_set_drvdata(pdev, card); err = setup_instance(card); if (err) From 42e49001380989f757f17045cf352dff9ae124b3 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Mon, 17 May 2021 16:13:35 +0200 Subject: [PATCH 155/269] bonding: init notify_work earlier to avoid uninitialized use [ Upstream commit 35d96e631860226d5dc4de0fad0a415362ec2457 ] If bond_kobj_init() or later kzalloc() in bond_alloc_slave() fail, then we call kobject_put() on the slave->kobj. This in turn calls the release function slave_kobj_release() which will always try to cancel_delayed_work_sync(&slave->notify_work), which shouldn't be done on an uninitialized work struct. Always initialize the work struct earlier to avoid problems here. Syzbot bisected this down to a completely pointless commit, some fault injection may have been at work here that caused the alloc failure in the first place, which may interact badly with bisect. Reported-by: syzbot+bfda097c12a00c8cae67@syzkaller.appspotmail.com Signed-off-by: Johannes Berg Acked-by: Jay Vosburgh Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 47afc5938c26b..345a3f61c723c 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1502,6 +1502,7 @@ static struct slave *bond_alloc_slave(struct bonding *bond, slave->bond = bond; slave->dev = slave_dev; + INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work); if (bond_kobj_init(slave)) return NULL; @@ -1514,7 +1515,6 @@ static struct slave *bond_alloc_slave(struct bonding *bond, return NULL; } } - INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work); return slave; } From 1d6d43d4805da9b3fa0f5841e8b1083c89868f35 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Mon, 17 May 2021 16:38:09 +0200 Subject: [PATCH 156/269] netlink: disable IRQs for netlink_lock_table() [ Upstream commit 1d482e666b8e74c7555dbdfbfb77205eeed3ff2d ] Syzbot reports that in mac80211 we have a potential deadlock between our "local->stop_queue_reasons_lock" (spinlock) and netlink's nl_table_lock (rwlock). This is because there's at least one situation in which we might try to send a netlink message with this spinlock held while it is also possible to take the spinlock from a hardirq context, resulting in the following deadlock scenario reported by lockdep: CPU0 CPU1 ---- ---- lock(nl_table_lock); local_irq_disable(); lock(&local->queue_stop_reason_lock); lock(nl_table_lock); lock(&local->queue_stop_reason_lock); This seems valid, we can take the queue_stop_reason_lock in any kind of context ("CPU0"), and call ieee80211_report_ack_skb() with the spinlock held and IRQs disabled ("CPU1") in some code path (ieee80211_do_stop() via ieee80211_free_txskb()). Short of disallowing netlink use in scenarios like these (which would be rather complex in mac80211's case due to the deep callchain), it seems the only fix for this is to disable IRQs while nl_table_lock is held to avoid hitting this scenario, this disallows the "CPU0" portion of the reported deadlock. Note that the writer side (netlink_table_grab()) already disables IRQs for this lock. Unfortunately though, this seems like a huge hammer, and maybe the whole netlink table locking should be reworked. Reported-by: syzbot+69ff9dff50dcfe14ddd4@syzkaller.appspotmail.com Signed-off-by: Johannes Berg Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/netlink/af_netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index daca50d6bb128..e527f5686e2bf 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -453,11 +453,13 @@ void netlink_table_ungrab(void) static inline void netlink_lock_table(void) { + unsigned long flags; + /* read_lock() synchronizes us to netlink_table_grab */ - read_lock(&nl_table_lock); + read_lock_irqsave(&nl_table_lock, flags); atomic_inc(&nl_table_users); - read_unlock(&nl_table_lock); + read_unlock_irqrestore(&nl_table_lock, flags); } static inline void From be23c4af3d8a1b986fe9b43b8966797653a76ca4 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Mon, 17 May 2021 12:04:13 +0300 Subject: [PATCH 157/269] net: mdiobus: get rid of a BUG_ON() [ Upstream commit 1dde47a66d4fb181830d6fa000e5ea86907b639e ] We spotted a bug recently during a review where a driver was unregistering a bus that wasn't registered, which would trigger this BUG_ON(). Let's handle that situation more gracefully, and just print a warning and return. Reported-by: Russell King (Oracle) Signed-off-by: Dan Carpenter Reviewed-by: Russell King (Oracle) Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/phy/mdio_bus.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 757e950fb745b..b848439fa837c 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -608,7 +608,8 @@ void mdiobus_unregister(struct mii_bus *bus) struct mdio_device *mdiodev; int i; - BUG_ON(bus->state != MDIOBUS_REGISTERED); + if (WARN_ON_ONCE(bus->state != MDIOBUS_REGISTERED)) + return; bus->state = MDIOBUS_UNREGISTERED; for (i = 0; i < PHY_MAX_ADDR; i++) { From 5ca472d40e2daebbfd926701902debcab0bbd15c Mon Sep 17 00:00:00 2001 From: Shakeel Butt Date: Wed, 12 May 2021 13:19:46 -0700 Subject: [PATCH 158/269] cgroup: disable controllers at parse time MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 45e1ba40837ac2f6f4d4716bddb8d44bd7e4a251 ] This patch effectively reverts the commit a3e72739b7a7 ("cgroup: fix too early usage of static_branch_disable()"). The commit 6041186a3258 ("init: initialize jump labels before command line option parsing") has moved the jump_label_init() before parse_args() which has made the commit a3e72739b7a7 unnecessary. On the other hand there are consequences of disabling the controllers later as there are subsystems doing the controller checks for different decisions. One such incident is reported [1] regarding the memory controller and its impact on memory reclaim code. [1] https://lore.kernel.org/linux-mm/921e53f3-4b13-aab8-4a9e-e83ff15371e4@nec.com Signed-off-by: Shakeel Butt Reported-by: NOMURA JUNICHI(野村 淳一) Signed-off-by: Tejun Heo Tested-by: Jun'ichi Nomura Signed-off-by: Sasha Levin --- kernel/cgroup/cgroup.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 5d1fdf7c3ec65..c8b811e039cc2 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -5665,8 +5665,6 @@ int __init cgroup_init_early(void) return 0; } -static u16 cgroup_disable_mask __initdata; - /** * cgroup_init - cgroup initialization * @@ -5725,12 +5723,8 @@ int __init cgroup_init(void) * disabled flag and cftype registration needs kmalloc, * both of which aren't available during early_init. */ - if (cgroup_disable_mask & (1 << ssid)) { - static_branch_disable(cgroup_subsys_enabled_key[ssid]); - printk(KERN_INFO "Disabling %s control group subsystem\n", - ss->name); + if (!cgroup_ssid_enabled(ssid)) continue; - } if (cgroup1_ssid_disabled(ssid)) printk(KERN_INFO "Disabling %s control group subsystem in v1 mounts\n", @@ -6245,7 +6239,10 @@ static int __init cgroup_disable(char *str) if (strcmp(token, ss->name) && strcmp(token, ss->legacy_name)) continue; - cgroup_disable_mask |= 1 << i; + + static_branch_disable(cgroup_subsys_enabled_key[i]); + pr_info("Disabling %s control group subsystem\n", + ss->name); } } return 1; From ad241cb1cffbe7f9d121ba728a381416e8540954 Mon Sep 17 00:00:00 2001 From: Sergey Senozhatsky Date: Thu, 20 May 2021 19:14:22 +0900 Subject: [PATCH 159/269] wq: handle VM suspension in stall detection [ Upstream commit 940d71c6462e8151c78f28e4919aa8882ff2054e ] If VCPU is suspended (VM suspend) in wq_watchdog_timer_fn() then once this VCPU resumes it will see the new jiffies value, while it may take a while before IRQ detects PVCLOCK_GUEST_STOPPED on this VCPU and updates all the watchdogs via pvclock_touch_watchdogs(). There is a small chance of misreported WQ stalls in the meantime, because new jiffies is time_after() old 'ts + thresh'. wq_watchdog_timer_fn() { for_each_pool(pool, pi) { if (time_after(jiffies, ts + thresh)) { pr_emerg("BUG: workqueue lockup - pool"); } } } Save jiffies at the beginning of this function and use that value for stall detection. If VM gets suspended then we continue using "old" jiffies value and old WQ touch timestamps. If IRQ at some point restarts the stall detection cycle (pvclock_touch_watchdogs()) then old jiffies will always be before new 'ts + thresh'. Signed-off-by: Sergey Senozhatsky Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin --- kernel/workqueue.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 1e2ca744dadbe..b23f7d1044be7 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -50,6 +50,7 @@ #include #include #include +#include #include "workqueue_internal.h" @@ -5758,6 +5759,7 @@ static void wq_watchdog_timer_fn(struct timer_list *unused) { unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ; bool lockup_detected = false; + unsigned long now = jiffies; struct worker_pool *pool; int pi; @@ -5772,6 +5774,12 @@ static void wq_watchdog_timer_fn(struct timer_list *unused) if (list_empty(&pool->worklist)) continue; + /* + * If a virtual machine is stopped by the host it can look to + * the watchdog like a stall. + */ + kvm_check_and_clear_guest_paused(); + /* get the latest of pool and touched timestamps */ pool_ts = READ_ONCE(pool->watchdog_ts); touched = READ_ONCE(wq_watchdog_touched); @@ -5790,12 +5798,12 @@ static void wq_watchdog_timer_fn(struct timer_list *unused) } /* did we stall? */ - if (time_after(jiffies, ts + thresh)) { + if (time_after(now, ts + thresh)) { lockup_detected = true; pr_emerg("BUG: workqueue lockup - pool"); pr_cont_pool_info(pool); pr_cont(" stuck for %us!\n", - jiffies_to_msecs(jiffies - pool_ts) / 1000); + jiffies_to_msecs(now - pool_ts) / 1000); } } From 4353eb42180dbc48fe7a8b69c6d2490d310537bf Mon Sep 17 00:00:00 2001 From: Zheyu Ma Date: Thu, 20 May 2021 12:32:36 +0000 Subject: [PATCH 160/269] net/qla3xxx: fix schedule while atomic in ql_sem_spinlock [ Upstream commit 13a6f3153922391e90036ba2267d34eed63196fc ] When calling the 'ql_sem_spinlock', the driver has already acquired the spin lock, so the driver should not call 'ssleep' in atomic context. This bug can be fixed by using 'mdelay' instead of 'ssleep'. The KASAN's log reveals it: [ 3.238124 ] BUG: scheduling while atomic: swapper/0/1/0x00000002 [ 3.238748 ] 2 locks held by swapper/0/1: [ 3.239151 ] #0: ffff88810177b240 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x41/0x60 [ 3.240026 ] #1: ffff888107c60e28 (&qdev->hw_lock){....}-{2:2}, at: ql3xxx_probe+0x2aa/0xea0 [ 3.240873 ] Modules linked in: [ 3.241187 ] irq event stamp: 460854 [ 3.241541 ] hardirqs last enabled at (460853): [] _raw_spin_unlock_irqrestore+0x4f/0x70 [ 3.242245 ] hardirqs last disabled at (460854): [] _raw_spin_lock_irqsave+0x2a/0x70 [ 3.242245 ] softirqs last enabled at (446076): [] __do_softirq+0x2e4/0x4b1 [ 3.242245 ] softirqs last disabled at (446069): [] irq_exit_rcu+0x100/0x110 [ 3.242245 ] Preemption disabled at: [ 3.242245 ] [] ql3xxx_probe+0x2aa/0xea0 [ 3.242245 ] Kernel panic - not syncing: scheduling while atomic [ 3.242245 ] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00145 -gee7dc339169-dirty #16 [ 3.242245 ] Call Trace: [ 3.242245 ] dump_stack+0xba/0xf5 [ 3.242245 ] ? ql3xxx_probe+0x1f0/0xea0 [ 3.242245 ] panic+0x15a/0x3f2 [ 3.242245 ] ? vprintk+0x76/0x150 [ 3.242245 ] ? ql3xxx_probe+0x2aa/0xea0 [ 3.242245 ] __schedule_bug+0xae/0xe0 [ 3.242245 ] __schedule+0x72e/0xa00 [ 3.242245 ] schedule+0x43/0xf0 [ 3.242245 ] schedule_timeout+0x28b/0x500 [ 3.242245 ] ? del_timer_sync+0xf0/0xf0 [ 3.242245 ] ? msleep+0x2f/0x70 [ 3.242245 ] msleep+0x59/0x70 [ 3.242245 ] ql3xxx_probe+0x307/0xea0 [ 3.242245 ] ? _raw_spin_unlock_irqrestore+0x3a/0x70 [ 3.242245 ] ? pci_device_remove+0x110/0x110 [ 3.242245 ] local_pci_probe+0x45/0xa0 [ 3.242245 ] pci_device_probe+0x12b/0x1d0 [ 3.242245 ] really_probe+0x2a9/0x610 [ 3.242245 ] driver_probe_device+0x90/0x1d0 [ 3.242245 ] ? mutex_lock_nested+0x1b/0x20 [ 3.242245 ] device_driver_attach+0x68/0x70 [ 3.242245 ] __driver_attach+0x124/0x1b0 [ 3.242245 ] ? device_driver_attach+0x70/0x70 [ 3.242245 ] bus_for_each_dev+0xbb/0x110 [ 3.242245 ] ? rdinit_setup+0x45/0x45 [ 3.242245 ] driver_attach+0x27/0x30 [ 3.242245 ] bus_add_driver+0x1eb/0x2a0 [ 3.242245 ] driver_register+0xa9/0x180 [ 3.242245 ] __pci_register_driver+0x82/0x90 [ 3.242245 ] ? yellowfin_init+0x25/0x25 [ 3.242245 ] ql3xxx_driver_init+0x23/0x25 [ 3.242245 ] do_one_initcall+0x7f/0x3d0 [ 3.242245 ] ? rdinit_setup+0x45/0x45 [ 3.242245 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.242245 ] kernel_init_freeable+0x2aa/0x301 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] kernel_init+0x18/0x190 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] ret_from_fork+0x1f/0x30 [ 3.242245 ] Dumping ftrace buffer: [ 3.242245 ] (ftrace buffer empty) [ 3.242245 ] Kernel Offset: disabled [ 3.242245 ] Rebooting in 1 seconds. Reported-by: Zheyu Ma Signed-off-by: Zheyu Ma Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qlogic/qla3xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c index 27740c027681b..a83b3d69a6565 100644 --- a/drivers/net/ethernet/qlogic/qla3xxx.c +++ b/drivers/net/ethernet/qlogic/qla3xxx.c @@ -114,7 +114,7 @@ static int ql_sem_spinlock(struct ql3_adapter *qdev, value = readl(&port_regs->CommonRegs.semaphoreReg); if ((value & (sem_mask >> 16)) == sem_bits) return 0; - ssleep(1); + mdelay(1000); } while (--seconds); return -1; } From e806df71eed223ab8ca099c61a41f3e9a576733f Mon Sep 17 00:00:00 2001 From: Rao Shoaib Date: Fri, 21 May 2021 11:08:06 -0700 Subject: [PATCH 161/269] RDS tcp loopback connection can hang [ Upstream commit aced3ce57cd37b5ca332bcacd370d01f5a8c5371 ] When TCP is used as transport and a program on the system connects to RDS port 16385, connection is accepted but denied per the rules of RDS. However, RDS connections object is left in the list. Next loopback connection will select that connection object as it is at the head of list. The connection attempt will hang as the connection object is set to connect over TCP which is not allowed The issue can be reproduced easily, use rds-ping to ping a local IP address. After that use any program like ncat to connect to the same IP address and port 16385. This will hang so ctrl-c out. Now try rds-ping, it will hang. To fix the issue this patch adds checks to disallow the connection object creation and destroys the connection object. Signed-off-by: Rao Shoaib Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/rds/connection.c | 23 +++++++++++++++++------ net/rds/tcp.c | 4 ++-- net/rds/tcp.h | 3 ++- net/rds/tcp_listen.c | 6 ++++++ 4 files changed, 27 insertions(+), 9 deletions(-) diff --git a/net/rds/connection.c b/net/rds/connection.c index f2fcab182095c..a3bc4b54d4910 100644 --- a/net/rds/connection.c +++ b/net/rds/connection.c @@ -240,12 +240,23 @@ static struct rds_connection *__rds_conn_create(struct net *net, if (loop_trans) { rds_trans_put(loop_trans); conn->c_loopback = 1; - if (is_outgoing && trans->t_prefer_loopback) { - /* "outgoing" connection - and the transport - * says it wants the connection handled by the - * loopback transport. This is what TCP does. - */ - trans = &rds_loop_transport; + if (trans->t_prefer_loopback) { + if (likely(is_outgoing)) { + /* "outgoing" connection to local address. + * Protocol says it wants the connection + * handled by the loopback transport. + * This is what TCP does. + */ + trans = &rds_loop_transport; + } else { + /* No transport currently in use + * should end up here, but if it + * does, reset/destroy the connection. + */ + kmem_cache_free(rds_conn_slab, conn); + conn = ERR_PTR(-EOPNOTSUPP); + goto out; + } } } diff --git a/net/rds/tcp.c b/net/rds/tcp.c index 43db0eca911fa..abf19c0e3ba0b 100644 --- a/net/rds/tcp.c +++ b/net/rds/tcp.c @@ -313,8 +313,8 @@ static void rds6_tcp_tc_info(struct socket *sock, unsigned int len, } #endif -static int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr, - __u32 scope_id) +int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr, + __u32 scope_id) { struct net_device *dev = NULL; #if IS_ENABLED(CONFIG_IPV6) diff --git a/net/rds/tcp.h b/net/rds/tcp.h index bad9cf49d5657..dc8d745d68575 100644 --- a/net/rds/tcp.h +++ b/net/rds/tcp.h @@ -59,7 +59,8 @@ u32 rds_tcp_snd_una(struct rds_tcp_connection *tc); u64 rds_tcp_map_seq(struct rds_tcp_connection *tc, u32 seq); extern struct rds_transport rds_tcp_transport; void rds_tcp_accept_work(struct sock *sk); - +int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr, + __u32 scope_id); /* tcp_connect.c */ int rds_tcp_conn_path_connect(struct rds_conn_path *cp); void rds_tcp_conn_path_shutdown(struct rds_conn_path *conn); diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c index 101cf14215a0b..09cadd556d1e1 100644 --- a/net/rds/tcp_listen.c +++ b/net/rds/tcp_listen.c @@ -167,6 +167,12 @@ int rds_tcp_accept_one(struct socket *sock) } #endif + if (!rds_tcp_laddr_check(sock_net(sock->sk), peer_addr, dev_if)) { + /* local address connection is only allowed via loopback */ + ret = -EOPNOTSUPP; + goto out; + } + conn = rds_conn_create(sock_net(sock->sk), my_addr, peer_addr, &rds_tcp_transport, 0, GFP_KERNEL, dev_if); From 8d717c9135a3340ae62d1699484850bfb4112b0c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=8D=C3=B1igo=20Huguet?= Date: Fri, 21 May 2021 16:38:35 +0200 Subject: [PATCH 162/269] net:sfc: fix non-freed irq in legacy irq mode MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 8f03eeb6e0a0a0b8d617ee0a4bce729e47130036 ] SFC driver can be configured via modparam to work using MSI-X, MSI or legacy IRQ interrupts. In the last one, the interrupt was not properly released on module remove. It was not freed because the flag irqs_hooked was not set during initialization in the case of using legacy IRQ. Example of (trimmed) trace during module remove without this fix: remove_proc_entry: removing non-empty directory 'irq/125', leaking at least '0000:3b:00.1' WARNING: CPU: 39 PID: 3658 at fs/proc/generic.c:715 remove_proc_entry+0x15c/0x170 ...trimmed... Call Trace: unregister_irq_proc+0xe3/0x100 free_desc+0x29/0x70 irq_free_descs+0x47/0x70 mp_unmap_irq+0x58/0x60 acpi_unregister_gsi_ioapic+0x2a/0x40 acpi_pci_irq_disable+0x78/0xb0 pci_disable_device+0xd1/0x100 efx_pci_remove+0xa1/0x1e0 [sfc] pci_device_remove+0x38/0xa0 __device_release_driver+0x177/0x230 driver_detach+0xcb/0x110 bus_remove_driver+0x58/0xd0 pci_unregister_driver+0x2a/0xb0 efx_exit_module+0x24/0xf40 [sfc] __do_sys_delete_module.constprop.0+0x171/0x280 ? exit_to_user_mode_prepare+0x83/0x1d0 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f9f9385800b ...trimmed... Signed-off-by: Íñigo Huguet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/sfc/nic.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c index d1e908846f5dd..22fbb0ae77fba 100644 --- a/drivers/net/ethernet/sfc/nic.c +++ b/drivers/net/ethernet/sfc/nic.c @@ -90,6 +90,7 @@ int efx_nic_init_interrupt(struct efx_nic *efx) efx->pci_dev->irq); goto fail1; } + efx->irqs_hooked = true; return 0; } From 30030c6ff3d447900255647b9b5fc1143313e7eb Mon Sep 17 00:00:00 2001 From: Javed Hasan Date: Tue, 18 May 2021 23:14:16 -0700 Subject: [PATCH 163/269] scsi: bnx2fc: Return failure if io_req is already in ABTS processing [ Upstream commit 122c81c563b0c1c6b15ff76a9159af5ee1f21563 ] Return failure from bnx2fc_eh_abort() if io_req is already in ABTS processing. Link: https://lore.kernel.org/r/20210519061416.19321-1-jhasan@marvell.com Reviewed-by: Himanshu Madhani Signed-off-by: Javed Hasan Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/bnx2fc/bnx2fc_io.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 1a0dc18d69155..ed300a279a387 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1220,6 +1220,7 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd) was a result from the ABTS request rather than the CLEANUP request */ set_bit(BNX2FC_FLAG_IO_CLEANUP, &io_req->req_flags); + rc = FAILED; goto done; } From eac1d159b7a33616666dc8e072ba34300e048834 Mon Sep 17 00:00:00 2001 From: Matt Wang Date: Wed, 19 May 2021 09:49:32 +0000 Subject: [PATCH 164/269] scsi: vmw_pvscsi: Set correct residual data length [ Upstream commit e662502b3a782d479e67736a5a1c169a703d853a ] Some commands (such as INQUIRY) may return less data than the initiator requested. To avoid conducting useless information, set the right residual count to make upper layer aware of this. Before (INQUIRY PAGE 0xB0 with 128B buffer): $ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00 SCSI Status: Good Received 128 bytes of data: 00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............ 10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@.... 20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... ..... 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ After: $ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00 SCSI Status: Good Received 64 bytes of data: 00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............ 10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@.... 20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... ..... 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [mkp: clarified description] Link: https://lore.kernel.org/r/03C41093-B62E-43A2-913E-CFC92F1C70C3@vmware.com Signed-off-by: Matt Wang Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/vmw_pvscsi.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/vmw_pvscsi.c b/drivers/scsi/vmw_pvscsi.c index 081f54ab7d86c..1421b1394d816 100644 --- a/drivers/scsi/vmw_pvscsi.c +++ b/drivers/scsi/vmw_pvscsi.c @@ -587,7 +587,13 @@ static void pvscsi_complete_request(struct pvscsi_adapter *adapter, case BTSTAT_SUCCESS: case BTSTAT_LINKED_COMMAND_COMPLETED: case BTSTAT_LINKED_COMMAND_COMPLETED_WITH_FLAG: - /* If everything went fine, let's move on.. */ + /* + * Commands like INQUIRY may transfer less data than + * requested by the initiator via bufflen. Set residual + * count to make upper layer aware of the actual amount + * of data returned. + */ + scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen); cmd->result = (DID_OK << 16); break; From 16763635285cb921c419a5a2dd9ddca2adcd2509 Mon Sep 17 00:00:00 2001 From: Yang Yingliang Date: Wed, 19 May 2021 21:05:19 +0800 Subject: [PATCH 165/269] scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq [ Upstream commit 7907a021e4bbfa29cccacd2ba2dade894d9a7d4c ] irqs allocated with devm_request_irq() should not be freed using free_irq(). Doing so causes a dangling pointer and a subsequent double free. Link: https://lore.kernel.org/r/20210519130519.2661938-1-yangyingliang@huawei.com Reported-by: Hulk Robot Acked-by: John Garry Signed-off-by: Yang Yingliang Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 19170c7ac336f..e9a82a390672c 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -3359,14 +3359,14 @@ hisi_sas_v3_destroy_irqs(struct pci_dev *pdev, struct hisi_hba *hisi_hba) { int i; - free_irq(pci_irq_vector(pdev, 1), hisi_hba); - free_irq(pci_irq_vector(pdev, 2), hisi_hba); - free_irq(pci_irq_vector(pdev, 11), hisi_hba); + devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 1), hisi_hba); + devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 2), hisi_hba); + devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 11), hisi_hba); for (i = 0; i < hisi_hba->cq_nvecs; i++) { struct hisi_sas_cq *cq = &hisi_hba->cq[i]; int nr = hisi_sas_intr_conv ? 16 : 16 + i; - free_irq(pci_irq_vector(pdev, nr), cq); + devm_free_irq(&pdev->dev, pci_irq_vector(pdev, nr), cq); } pci_free_irq_vectors(pdev); } From bbb48789b62aa3dfd6ca3d731105efe528d91455 Mon Sep 17 00:00:00 2001 From: Dmitry Bogdanov Date: Thu, 15 Apr 2021 23:35:54 +0300 Subject: [PATCH 166/269] scsi: target: qla2xxx: Wait for stop_phase1 at WWN removal [ Upstream commit 2ef7665dfd88830f15415ba007c7c9a46be7acd8 ] Target de-configuration panics at high CPU load because TPGT and WWPN can be removed on separate threads. TPGT removal requests a reset HBA on a separate thread and waits for reset complete (phase1). Due to high CPU load that HBA reset can be delayed for some time. WWPN removal does qlt_stop_phase2(). There it is believed that phase1 has already completed and thus tgt.tgt_ops is subsequently cleared. However, tgt.tgt_ops is needed to process incoming traffic and therefore this will cause one of the following panics: NIP qlt_reset+0x7c/0x220 [qla2xxx] LR qlt_reset+0x68/0x220 [qla2xxx] Call Trace: 0xc000003ffff63a78 (unreliable) qlt_handle_imm_notify+0x800/0x10c0 [qla2xxx] qlt_24xx_atio_pkt+0x208/0x590 [qla2xxx] qlt_24xx_process_atio_queue+0x33c/0x7a0 [qla2xxx] qla83xx_msix_atio_q+0x54/0x90 [qla2xxx] or NIP qlt_24xx_handle_abts+0xd0/0x2a0 [qla2xxx] LR qlt_24xx_handle_abts+0xb4/0x2a0 [qla2xxx] Call Trace: qlt_24xx_handle_abts+0x90/0x2a0 [qla2xxx] (unreliable) qlt_24xx_process_atio_queue+0x500/0x7a0 [qla2xxx] qla83xx_msix_atio_q+0x54/0x90 [qla2xxx] or NIP qlt_create_sess+0x90/0x4e0 [qla2xxx] LR qla24xx_do_nack_work+0xa8/0x180 [qla2xxx] Call Trace: 0xc0000000348fba30 (unreliable) qla24xx_do_nack_work+0xa8/0x180 [qla2xxx] qla2x00_do_work+0x674/0xbf0 [qla2xxx] qla2x00_iocb_work_fn The patch fixes the issue by serializing qlt_stop_phase1() and qlt_stop_phase2() functions to make WWPN removal wait for phase1 completion. Link: https://lore.kernel.org/r/20210415203554.27890-1-d.bogdanov@yadro.com Reviewed-by: Roman Bolshakov Signed-off-by: Dmitry Bogdanov Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/qla2xxx/qla_target.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index dcae8f071c355..8d4976725a75a 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -1559,10 +1559,12 @@ void qlt_stop_phase2(struct qla_tgt *tgt) return; } + mutex_lock(&tgt->ha->optrom_mutex); mutex_lock(&vha->vha_tgt.tgt_mutex); tgt->tgt_stop = 0; tgt->tgt_stopped = 1; mutex_unlock(&vha->vha_tgt.tgt_mutex); + mutex_unlock(&tgt->ha->optrom_mutex); ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00c, "Stop of tgt %p finished\n", tgt); From a385cbf31ed0deceb2e09c3950b605f94c9a4f0a Mon Sep 17 00:00:00 2001 From: Zong Li Date: Sat, 22 May 2021 17:16:11 +0800 Subject: [PATCH 167/269] net: macb: ensure the device is available before accessing GEMGXL control registers [ Upstream commit 5eff1461a6dec84f04fafa9128548bad51d96147 ] If runtime power menagement is enabled, the gigabit ethernet PLL would be disabled after macb_probe(). During this period of time, the system would hang up if we try to access GEMGXL control registers. We can't put runtime_pm_get/runtime_pm_put/ there due to the issue of sleep inside atomic section (7fa2955ff70ce453 ("sh_eth: Fix sleeping function called from invalid context"). Add netif_running checking to ensure the device is available before accessing GEMGXL device. Changed in v2: - Use netif_running instead of its own flag Signed-off-by: Zong Li Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/cadence/macb_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 390f45e49eaf7..1e8bf6b9834bb 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -2709,6 +2709,9 @@ static struct net_device_stats *gem_get_stats(struct macb *bp) struct gem_stats *hwstat = &bp->hw_stats.gem; struct net_device_stats *nstat = &bp->dev->stats; + if (!netif_running(bp->dev)) + return nstat; + gem_update_stats(bp); nstat->rx_errors = (hwstat->rx_frame_check_sequence_errors + From ee144b79802849626f8fd83ab5929e7e8118ff6b Mon Sep 17 00:00:00 2001 From: Saubhik Mukherjee Date: Mon, 24 May 2021 19:07:12 +0530 Subject: [PATCH 168/269] net: appletalk: cops: Fix data race in cops_probe1 [ Upstream commit a4dd4fc6105e54393d637450a11d4cddb5fabc4f ] In cops_probe1(), there is a write to dev->base_addr after requesting an interrupt line and registering the interrupt handler cops_interrupt(). The handler might be called in parallel to handle an interrupt. cops_interrupt() tries to read dev->base_addr leading to a potential data race. So write to dev->base_addr before calling request_irq(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Saubhik Mukherjee Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/appletalk/cops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/appletalk/cops.c b/drivers/net/appletalk/cops.c index ba8e70a8e3125..6b12ce822e51a 100644 --- a/drivers/net/appletalk/cops.c +++ b/drivers/net/appletalk/cops.c @@ -327,6 +327,8 @@ static int __init cops_probe1(struct net_device *dev, int ioaddr) break; } + dev->base_addr = ioaddr; + /* Reserve any actual interrupt. */ if (dev->irq) { retval = request_irq(dev->irq, cops_interrupt, 0, dev->name, dev); @@ -334,8 +336,6 @@ static int __init cops_probe1(struct net_device *dev, int ioaddr) goto err_out; } - dev->base_addr = ioaddr; - lp = netdev_priv(dev); spin_lock_init(&lp->lock); From f42afc0f29024f81a34175e0b24d746a6d24225b Mon Sep 17 00:00:00 2001 From: George McCollister Date: Mon, 24 May 2021 15:29:53 -0500 Subject: [PATCH 169/269] net: dsa: microchip: enable phy errata workaround on 9567 [ Upstream commit 8c42a49738f16af0061f9ae5c2f5a955f268d9e3 ] Also enable phy errata workaround on 9567 since has the same errata as the 9477 according to the manufacture's documentation. Signed-off-by: George McCollister Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/dsa/microchip/ksz9477.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c index abfd3802bb517..b3aa99eb6c2c5 100644 --- a/drivers/net/dsa/microchip/ksz9477.c +++ b/drivers/net/dsa/microchip/ksz9477.c @@ -1532,6 +1532,7 @@ static const struct ksz_chip_data ksz9477_switch_chips[] = { .num_statics = 16, .cpu_ports = 0x7F, /* can be configured as cpu port */ .port_cnt = 7, /* total physical port count */ + .phy_errata_9477 = true, }, }; From 37a079a6ae1432affa8e10c229eabc1923e7ccd0 Mon Sep 17 00:00:00 2001 From: Hannes Reinecke Date: Fri, 21 May 2021 10:23:46 +0200 Subject: [PATCH 170/269] nvme-fabrics: decode host pathing error for connect [ Upstream commit 4d9442bf263ac45d495bb7ecf75009e59c0622b2 ] Add an additional decoding for 'host pathing error' during connect. Signed-off-by: Hannes Reinecke Reviewed-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Reviewed-by: Himanshu Madhani Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/host/fabrics.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c index 8575724734e02..7015fba2e5125 100644 --- a/drivers/nvme/host/fabrics.c +++ b/drivers/nvme/host/fabrics.c @@ -336,6 +336,11 @@ static void nvmf_log_connect_error(struct nvme_ctrl *ctrl, cmd->connect.recfmt); break; + case NVME_SC_HOST_PATH_ERROR: + dev_err(ctrl->device, + "Connect command failed: host path error\n"); + break; + default: dev_err(ctrl->device, "Connect command failed, error wo/DNR bit: %d\n", From 7519ece673e300b0362572edbde7e030552705ec Mon Sep 17 00:00:00 2001 From: Tiezhu Yang Date: Sat, 15 May 2021 19:02:01 +0800 Subject: [PATCH 171/269] MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER [ Upstream commit 78cf0eb926cb1abeff2106bae67752e032fe5f3e ] When update the latest mainline kernel with the following three configs, the kernel hangs during startup: (1) CONFIG_FUNCTION_GRAPH_TRACER=y (2) CONFIG_PREEMPT_TRACER=y (3) CONFIG_FTRACE_STARTUP_TEST=y When update the latest mainline kernel with the above two configs (1) and (2), the kernel starts normally, but it still hangs when execute the following command: echo "function_graph" > /sys/kernel/debug/tracing/current_tracer Without CONFIG_PREEMPT_TRACER=y, the above two kinds of kernel hangs disappeared, so it seems that CONFIG_PREEMPT_TRACER has some influences with function_graph tracer at the first glance. I use ejtag to find out the epc address is related with preempt_enable() in the file arch/mips/lib/mips-atomic.c, because function tracing can trace the preempt_{enable,disable} calls that are traced, replace them with preempt_{enable,disable}_notrace to prevent function tracing from going into an infinite loop, and then it can fix the kernel hang issue. By the way, it seems that this commit is a complement and improvement of commit f93a1a00f2bd ("MIPS: Fix crash that occurs when function tracing is enabled"). Signed-off-by: Tiezhu Yang Cc: Steven Rostedt Signed-off-by: Thomas Bogendoerfer Signed-off-by: Sasha Levin --- arch/mips/lib/mips-atomic.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/mips/lib/mips-atomic.c b/arch/mips/lib/mips-atomic.c index de03838b343b8..a9b72eacfc0b3 100644 --- a/arch/mips/lib/mips-atomic.c +++ b/arch/mips/lib/mips-atomic.c @@ -37,7 +37,7 @@ */ notrace void arch_local_irq_disable(void) { - preempt_disable(); + preempt_disable_notrace(); __asm__ __volatile__( " .set push \n" @@ -53,7 +53,7 @@ notrace void arch_local_irq_disable(void) : /* no inputs */ : "memory"); - preempt_enable(); + preempt_enable_notrace(); } EXPORT_SYMBOL(arch_local_irq_disable); @@ -61,7 +61,7 @@ notrace unsigned long arch_local_irq_save(void) { unsigned long flags; - preempt_disable(); + preempt_disable_notrace(); __asm__ __volatile__( " .set push \n" @@ -78,7 +78,7 @@ notrace unsigned long arch_local_irq_save(void) : /* no inputs */ : "memory"); - preempt_enable(); + preempt_enable_notrace(); return flags; } @@ -88,7 +88,7 @@ notrace void arch_local_irq_restore(unsigned long flags) { unsigned long __tmp1; - preempt_disable(); + preempt_disable_notrace(); __asm__ __volatile__( " .set push \n" @@ -106,7 +106,7 @@ notrace void arch_local_irq_restore(unsigned long flags) : "0" (flags) : "memory"); - preempt_enable(); + preempt_enable_notrace(); } EXPORT_SYMBOL(arch_local_irq_restore); From 90547d5db50bcb2705709e420e0af51535109113 Mon Sep 17 00:00:00 2001 From: John Keeping Date: Wed, 12 May 2021 12:14:21 +0100 Subject: [PATCH 172/269] dm verity: fix require_signatures module_param permissions [ Upstream commit 0c1f3193b1cdd21e7182f97dc9bca7d284d18a15 ] The third parameter of module_param() is permissions for the sysfs node but it looks like it is being used as the initial value of the parameter here. In fact, false here equates to omitting the file from sysfs and does not affect the value of require_signatures. Making the parameter writable is not simple because going from false->true is fine but it should not be possible to remove the requirement to verify a signature. But it can be useful to inspect the value of this parameter from userspace, so change the permissions to make a read-only file in sysfs. Signed-off-by: John Keeping Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin --- drivers/md/dm-verity-verify-sig.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/dm-verity-verify-sig.c b/drivers/md/dm-verity-verify-sig.c index 614e43db93aa8..919154ae4cae6 100644 --- a/drivers/md/dm-verity-verify-sig.c +++ b/drivers/md/dm-verity-verify-sig.c @@ -15,7 +15,7 @@ #define DM_VERITY_VERIFY_ERR(s) DM_VERITY_ROOT_HASH_VERIFICATION " " s static bool require_signatures; -module_param(require_signatures, bool, false); +module_param(require_signatures, bool, 0444); MODULE_PARM_DESC(require_signatures, "Verify the roothash of dm-verity hash tree"); From b0308804b2e0edb0f8e88f29c5bf6875018fe65c Mon Sep 17 00:00:00 2001 From: Jiapeng Chong Date: Tue, 25 May 2021 19:00:12 +0800 Subject: [PATCH 173/269] bnx2x: Fix missing error code in bnx2x_iov_init_one() [ Upstream commit 65161c35554f7135e6656b3df1ce2c500ca0bdcf ] Eliminate the follow smatch warning: drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1227 bnx2x_iov_init_one() warn: missing error code 'err'. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c index 9c2f51f230351..9108b497b3c99 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c @@ -1224,8 +1224,10 @@ int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param, goto failed; /* SR-IOV capability was enabled but there are no VFs*/ - if (iov->total == 0) + if (iov->total == 0) { + err = -EINVAL; goto failed; + } iov->nr_virtfn = min_t(u16, iov->total, num_vfs_param); From 2538f06f946820752fd9b2dfe6c66fe55cbf99ab Mon Sep 17 00:00:00 2001 From: Sagi Grimberg Date: Fri, 21 May 2021 14:51:15 -0700 Subject: [PATCH 174/269] nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME [ Upstream commit 042a3eaad6daeabcfaf163aa44da8ea3cf8b5496 ] We need to select NVME_CORE. Signed-off-by: Sagi Grimberg Reviewed-by: Max Gurtovoy Reviewed-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/host/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig index a44d49d63968a..494675aeaaad7 100644 --- a/drivers/nvme/host/Kconfig +++ b/drivers/nvme/host/Kconfig @@ -71,7 +71,8 @@ config NVME_FC config NVME_TCP tristate "NVM Express over Fabrics TCP host driver" depends on INET - depends on BLK_DEV_NVME + depends on BLOCK + select NVME_CORE select NVME_FABRICS select CRYPTO select CRYPTO_CRC32C From 590f718a6473055d9468323802298b5bbddf382c Mon Sep 17 00:00:00 2001 From: Sagi Grimberg Date: Tue, 25 May 2021 08:49:05 -0700 Subject: [PATCH 175/269] nvmet: fix false keep-alive timeout when a controller is torn down [ Upstream commit aaeadd7075dc9e184bc7876e9dd7b3bada771df2 ] Controller teardown flow may take some time in case it has many I/O queues, and the host may not send us keep-alive during this period. Hence reset the traffic based keep-alive timer so we don't trigger a controller teardown as a result of a keep-alive expiration. Reported-by: Yi Zhang Signed-off-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Reviewed-by: Hannes Reinecke Tested-by: Yi Zhang Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/target/core.c | 15 +++++++++++---- drivers/nvme/target/nvmet.h | 2 +- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index 8b939e9db470c..9a8fa2e582d5b 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -379,10 +379,10 @@ static void nvmet_keep_alive_timer(struct work_struct *work) { struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work), struct nvmet_ctrl, ka_work); - bool cmd_seen = ctrl->cmd_seen; + bool reset_tbkas = ctrl->reset_tbkas; - ctrl->cmd_seen = false; - if (cmd_seen) { + ctrl->reset_tbkas = false; + if (reset_tbkas) { pr_debug("ctrl %d reschedule traffic based keep-alive timer\n", ctrl->cntlid); schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ); @@ -792,6 +792,13 @@ void nvmet_sq_destroy(struct nvmet_sq *sq) percpu_ref_exit(&sq->ref); if (ctrl) { + /* + * The teardown flow may take some time, and the host may not + * send us keep-alive during this period, hence reset the + * traffic based keep-alive timer so we don't trigger a + * controller teardown as a result of a keep-alive expiration. + */ + ctrl->reset_tbkas = true; nvmet_ctrl_put(ctrl); sq->ctrl = NULL; /* allows reusing the queue later */ } @@ -942,7 +949,7 @@ bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq, } if (sq->ctrl) - sq->ctrl->cmd_seen = true; + sq->ctrl->reset_tbkas = true; return true; diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h index ea96487b5424e..4bf6d21290c23 100644 --- a/drivers/nvme/target/nvmet.h +++ b/drivers/nvme/target/nvmet.h @@ -166,7 +166,7 @@ struct nvmet_ctrl { struct nvmet_subsys *subsys; struct nvmet_sq **sqs; - bool cmd_seen; + bool reset_tbkas; struct mutex lock; u64 cap; From a7c3c17867b396b40a2076bc9f6b9dd12670600c Mon Sep 17 00:00:00 2001 From: Chris Packham Date: Wed, 12 May 2021 09:20:50 +1200 Subject: [PATCH 176/269] powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers [ Upstream commit 7adc7b225cddcfd0f346d10144fd7a3d3d9f9ea7 ] The i2c controllers on the P2040/P2041 have an erratum where the documented scheme for i2c bus recovery will not work (A-004447). A different mechanism is needed which is documented in the P2040 Chip Errata Rev Q (latest available at the time of writing). Signed-off-by: Chris Packham Acked-by: Michael Ellerman Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi index 872e4485dc3f0..ddc018d42252f 100644 --- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi @@ -371,7 +371,23 @@ }; /include/ "qoriq-i2c-0.dtsi" + i2c@118000 { + fsl,i2c-erratum-a004447; + }; + + i2c@118100 { + fsl,i2c-erratum-a004447; + }; + /include/ "qoriq-i2c-1.dtsi" + i2c@119000 { + fsl,i2c-erratum-a004447; + }; + + i2c@119100 { + fsl,i2c-erratum-a004447; + }; + /include/ "qoriq-duart-0.dtsi" /include/ "qoriq-duart-1.dtsi" /include/ "qoriq-gpio-0.dtsi" From 3a5b982463f4eda2cd6368a974b6608a9517d50c Mon Sep 17 00:00:00 2001 From: Chris Packham Date: Wed, 12 May 2021 09:20:51 +1200 Subject: [PATCH 177/269] powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers [ Upstream commit 19ae697a1e4edf1d755b413e3aa38da65e2db23b ] The i2c controllers on the P1010 have an erratum where the documented scheme for i2c bus recovery will not work (A-004447). A different mechanism is needed which is documented in the P1010 Chip Errata Rev L. Signed-off-by: Chris Packham Acked-by: Michael Ellerman Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi index 1b4aafc1f6a27..9716a0484ecf0 100644 --- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi @@ -122,7 +122,15 @@ }; /include/ "pq3-i2c-0.dtsi" + i2c@3000 { + fsl,i2c-erratum-a004447; + }; + /include/ "pq3-i2c-1.dtsi" + i2c@3100 { + fsl,i2c-erratum-a004447; + }; + /include/ "pq3-duart-0.dtsi" /include/ "pq3-espi-0.dtsi" spi0: spi@7000 { From 0c4d4de2da4dc39168077da2b21dd590523f3975 Mon Sep 17 00:00:00 2001 From: Saravana Kannan Date: Wed, 5 May 2021 09:47:34 -0700 Subject: [PATCH 178/269] spi: Don't have controller clean up spi device before driver unbind [ Upstream commit 27e7db56cf3dffd302bd7ddfacb1d405cf671a2a ] When a spi device is unregistered and triggers a driver unbind, the driver might need to access the spi device. So, don't have the controller clean up the spi device before the driver is unbound. Clean up the spi device after the driver is unbound. Fixes: c7299fea6769 ("spi: Fix spi device unregister flow") Reported-by: Lukas Wunner Signed-off-by: Saravana Kannan Tested-by: Andy Shevchenko Link: https://lore.kernel.org/r/20210505164734.175546-1-saravanak@google.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 96560853b3a3c..0cf67de741e78 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -709,15 +709,15 @@ void spi_unregister_device(struct spi_device *spi) if (!spi) return; - spi_cleanup(spi); - if (spi->dev.of_node) { of_node_clear_flag(spi->dev.of_node, OF_POPULATED); of_node_put(spi->dev.of_node); } if (ACPI_COMPANION(&spi->dev)) acpi_device_clear_enumerated(ACPI_COMPANION(&spi->dev)); - device_unregister(&spi->dev); + device_del(&spi->dev); + spi_cleanup(spi); + put_device(&spi->dev); } EXPORT_SYMBOL_GPL(spi_unregister_device); From fa05ba61967ad051f5f2b7c4f39d6c56719c9900 Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Thu, 27 May 2021 23:10:56 +0200 Subject: [PATCH 179/269] spi: Cleanup on failure of initial setup [ Upstream commit 2ec6f20b33eb4f62ab90bdcd620436c883ec3af6 ] Commit c7299fea6769 ("spi: Fix spi device unregister flow") changed the SPI core's behavior if the ->setup() hook returns an error upon adding an spi_device: Before, the ->cleanup() hook was invoked to free any allocations that were made by ->setup(). With the commit, that's no longer the case, so the ->setup() hook is expected to free the allocations itself. I've identified 5 drivers which depend on the old behavior and am fixing them up hereinafter: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c spi-omap2-mcspi.c spi-pxa2xx.c Importantly, ->setup() is not only invoked on spi_device *addition*: It may subsequently be called to *change* SPI parameters. If changing these SPI parameters fails, freeing memory allocations would be wrong. That should only be done if the spi_device is finally destroyed. I am therefore using a bool "initial_setup" in 4 of the affected drivers to differentiate between the invocation on *adding* the spi_device and any subsequent invocations: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c spi-omap2-mcspi.c In spi-pxa2xx.c, it seems the ->setup() hook can only fail on spi_device addition, not any subsequent calls. It therefore doesn't need the bool. It's worth noting that 5 other drivers already perform a cleanup if the ->setup() hook fails. Before c7299fea6769, they caused a double-free if ->setup() failed on spi_device addition. Since the commit, they're fine. These drivers are: spi-mpc512x-psc.c spi-pl022.c spi-s3c64xx.c spi-st-ssc4.c spi-tegra114.c (spi-pxa2xx.c also already performs a cleanup, but only in one of several error paths.) Fixes: c7299fea6769 ("spi: Fix spi device unregister flow") Signed-off-by: Lukas Wunner Cc: Saravana Kannan Acked-by: Andy Shevchenko # pxa2xx Link: https://lore.kernel.org/r/f76a0599469f265b69c371538794101fa37b5536.1622149321.git.lukas@wunner.de Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi-bitbang.c | 18 ++++++++++++++---- drivers/spi/spi-fsl-spi.c | 4 ++++ drivers/spi/spi-omap-uwire.c | 9 ++++++++- drivers/spi/spi-omap2-mcspi.c | 33 ++++++++++++++++++++------------- drivers/spi/spi-pxa2xx.c | 9 ++++++++- 5 files changed, 54 insertions(+), 19 deletions(-) diff --git a/drivers/spi/spi-bitbang.c b/drivers/spi/spi-bitbang.c index 1a7352abd8786..3d8948a17095b 100644 --- a/drivers/spi/spi-bitbang.c +++ b/drivers/spi/spi-bitbang.c @@ -181,6 +181,8 @@ int spi_bitbang_setup(struct spi_device *spi) { struct spi_bitbang_cs *cs = spi->controller_state; struct spi_bitbang *bitbang; + bool initial_setup = false; + int retval; bitbang = spi_master_get_devdata(spi->master); @@ -189,22 +191,30 @@ int spi_bitbang_setup(struct spi_device *spi) if (!cs) return -ENOMEM; spi->controller_state = cs; + initial_setup = true; } /* per-word shift register access, in hardware or bitbanging */ cs->txrx_word = bitbang->txrx_word[spi->mode & (SPI_CPOL|SPI_CPHA)]; - if (!cs->txrx_word) - return -EINVAL; + if (!cs->txrx_word) { + retval = -EINVAL; + goto err_free; + } if (bitbang->setup_transfer) { - int retval = bitbang->setup_transfer(spi, NULL); + retval = bitbang->setup_transfer(spi, NULL); if (retval < 0) - return retval; + goto err_free; } dev_dbg(&spi->dev, "%s, %u nsec/bit\n", __func__, 2 * cs->nsecs); return 0; + +err_free: + if (initial_setup) + kfree(cs); + return retval; } EXPORT_SYMBOL_GPL(spi_bitbang_setup); diff --git a/drivers/spi/spi-fsl-spi.c b/drivers/spi/spi-fsl-spi.c index d0e5aa18b7bad..bdf94cc7be1af 100644 --- a/drivers/spi/spi-fsl-spi.c +++ b/drivers/spi/spi-fsl-spi.c @@ -440,6 +440,7 @@ static int fsl_spi_setup(struct spi_device *spi) { struct mpc8xxx_spi *mpc8xxx_spi; struct fsl_spi_reg __iomem *reg_base; + bool initial_setup = false; int retval; u32 hw_mode; struct spi_mpc8xxx_cs *cs = spi_get_ctldata(spi); @@ -452,6 +453,7 @@ static int fsl_spi_setup(struct spi_device *spi) if (!cs) return -ENOMEM; spi_set_ctldata(spi, cs); + initial_setup = true; } mpc8xxx_spi = spi_master_get_devdata(spi->master); @@ -475,6 +477,8 @@ static int fsl_spi_setup(struct spi_device *spi) retval = fsl_spi_setup_transfer(spi, NULL); if (retval < 0) { cs->hw_mode = hw_mode; /* Restore settings */ + if (initial_setup) + kfree(cs); return retval; } diff --git a/drivers/spi/spi-omap-uwire.c b/drivers/spi/spi-omap-uwire.c index 71402f71ddd85..df28c6664aba6 100644 --- a/drivers/spi/spi-omap-uwire.c +++ b/drivers/spi/spi-omap-uwire.c @@ -424,15 +424,22 @@ static int uwire_setup_transfer(struct spi_device *spi, struct spi_transfer *t) static int uwire_setup(struct spi_device *spi) { struct uwire_state *ust = spi->controller_state; + bool initial_setup = false; + int status; if (ust == NULL) { ust = kzalloc(sizeof(*ust), GFP_KERNEL); if (ust == NULL) return -ENOMEM; spi->controller_state = ust; + initial_setup = true; } - return uwire_setup_transfer(spi, NULL); + status = uwire_setup_transfer(spi, NULL); + if (status && initial_setup) + kfree(ust); + + return status; } static void uwire_cleanup(struct spi_device *spi) diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c index d4c9510af3931..3596bbe4b7760 100644 --- a/drivers/spi/spi-omap2-mcspi.c +++ b/drivers/spi/spi-omap2-mcspi.c @@ -1032,8 +1032,22 @@ static void omap2_mcspi_release_dma(struct spi_master *master) } } +static void omap2_mcspi_cleanup(struct spi_device *spi) +{ + struct omap2_mcspi_cs *cs; + + if (spi->controller_state) { + /* Unlink controller state from context save list */ + cs = spi->controller_state; + list_del(&cs->node); + + kfree(cs); + } +} + static int omap2_mcspi_setup(struct spi_device *spi) { + bool initial_setup = false; int ret; struct omap2_mcspi *mcspi = spi_master_get_devdata(spi->master); struct omap2_mcspi_regs *ctx = &mcspi->ctx; @@ -1051,35 +1065,28 @@ static int omap2_mcspi_setup(struct spi_device *spi) spi->controller_state = cs; /* Link this to context save list */ list_add_tail(&cs->node, &ctx->cs); + initial_setup = true; } ret = pm_runtime_get_sync(mcspi->dev); if (ret < 0) { pm_runtime_put_noidle(mcspi->dev); + if (initial_setup) + omap2_mcspi_cleanup(spi); return ret; } ret = omap2_mcspi_setup_transfer(spi, NULL); + if (ret && initial_setup) + omap2_mcspi_cleanup(spi); + pm_runtime_mark_last_busy(mcspi->dev); pm_runtime_put_autosuspend(mcspi->dev); return ret; } -static void omap2_mcspi_cleanup(struct spi_device *spi) -{ - struct omap2_mcspi_cs *cs; - - if (spi->controller_state) { - /* Unlink controller state from context save list */ - cs = spi->controller_state; - list_del(&cs->node); - - kfree(cs); - } -} - static irqreturn_t omap2_mcspi_irq_handler(int irq, void *data) { struct omap2_mcspi *mcspi = data; diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c index d6b534d38e5da..56a62095ec8c2 100644 --- a/drivers/spi/spi-pxa2xx.c +++ b/drivers/spi/spi-pxa2xx.c @@ -1254,6 +1254,8 @@ static int setup_cs(struct spi_device *spi, struct chip_data *chip, chip->gpio_cs_inverted = spi->mode & SPI_CS_HIGH; err = gpiod_direction_output(gpiod, !chip->gpio_cs_inverted); + if (err) + gpiod_put(chip->gpiod_cs); } return err; @@ -1267,6 +1269,7 @@ static int setup(struct spi_device *spi) struct driver_data *drv_data = spi_controller_get_devdata(spi->controller); uint tx_thres, tx_hi_thres, rx_thres; + int err; switch (drv_data->ssp_type) { case QUARK_X1000_SSP: @@ -1413,7 +1416,11 @@ static int setup(struct spi_device *spi) if (drv_data->ssp_type == CE4100_SSP) return 0; - return setup_cs(spi, chip, chip_info); + err = setup_cs(spi, chip, chip_info); + if (err) + kfree(chip); + + return err; } static void cleanup(struct spi_device *spi) From d78b76af9f61f384526137d45e53cea0a1020132 Mon Sep 17 00:00:00 2001 From: Chris Packham Date: Mon, 29 Mar 2021 14:52:03 +1300 Subject: [PATCH 180/269] i2c: mpc: Make use of i2c_recover_bus() [ Upstream commit 65171b2df15eb7545431d75c2729b5062da89b43 ] Move the existing calls of mpc_i2c_fixup() to a recovery function registered via bus_recovery_info. This makes it more obvious that recovery is supported and allows for a future where recovery is triggered by the i2c core. Signed-off-by: Chris Packham Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-mpc.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c index d94f05c8b8b79..6a0d55e9e8e32 100644 --- a/drivers/i2c/busses/i2c-mpc.c +++ b/drivers/i2c/busses/i2c-mpc.c @@ -586,7 +586,7 @@ static int mpc_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num) if ((status & (CSR_MCF | CSR_MBB | CSR_RXAK)) != 0) { writeb(status & ~CSR_MAL, i2c->base + MPC_I2C_SR); - mpc_i2c_fixup(i2c); + i2c_recover_bus(&i2c->adap); } return -EIO; } @@ -622,7 +622,7 @@ static int mpc_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num) if ((status & (CSR_MCF | CSR_MBB | CSR_RXAK)) != 0) { writeb(status & ~CSR_MAL, i2c->base + MPC_I2C_SR); - mpc_i2c_fixup(i2c); + i2c_recover_bus(&i2c->adap); } return -EIO; } @@ -637,6 +637,15 @@ static u32 mpc_functionality(struct i2c_adapter *adap) | I2C_FUNC_SMBUS_READ_BLOCK_DATA | I2C_FUNC_SMBUS_BLOCK_PROC_CALL; } +static int fsl_i2c_bus_recovery(struct i2c_adapter *adap) +{ + struct mpc_i2c *i2c = i2c_get_adapdata(adap); + + mpc_i2c_fixup(i2c); + + return 0; +} + static const struct i2c_algorithm mpc_algo = { .master_xfer = mpc_xfer, .functionality = mpc_functionality, @@ -648,6 +657,10 @@ static struct i2c_adapter mpc_ops = { .timeout = HZ, }; +static struct i2c_bus_recovery_info fsl_i2c_recovery_info = { + .recover_bus = fsl_i2c_bus_recovery, +}; + static const struct of_device_id mpc_i2c_of_match[]; static int fsl_i2c_probe(struct platform_device *op) { @@ -740,6 +753,7 @@ static int fsl_i2c_probe(struct platform_device *op) i2c_set_adapdata(&i2c->adap, i2c); i2c->adap.dev.parent = &op->dev; i2c->adap.dev.of_node = of_node_get(op->dev.of_node); + i2c->adap.bus_recovery_info = &fsl_i2c_recovery_info; result = i2c_add_adapter(&i2c->adap); if (result < 0) From fff6af6deae8af944f07eb27dba0236e1a2e95cb Mon Sep 17 00:00:00 2001 From: Chris Packham Date: Wed, 12 May 2021 09:20:52 +1200 Subject: [PATCH 181/269] i2c: mpc: implement erratum A-004447 workaround [ Upstream commit 8f0cdec8b5fd94135d643662506ee94ae9e98785 ] The P2040/P2041 has an erratum where the normal i2c recovery mechanism does not work. Implement the alternative recovery mechanism documented in the P2040 Chip Errata Rev Q. Signed-off-by: Chris Packham Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-mpc.c | 79 +++++++++++++++++++++++++++++++++++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c index 6a0d55e9e8e32..af349661fd769 100644 --- a/drivers/i2c/busses/i2c-mpc.c +++ b/drivers/i2c/busses/i2c-mpc.c @@ -23,6 +23,7 @@ #include #include +#include #include #include #include @@ -49,6 +50,7 @@ #define CCR_MTX 0x10 #define CCR_TXAK 0x08 #define CCR_RSTA 0x04 +#define CCR_RSVD 0x02 #define CSR_MCF 0x80 #define CSR_MAAS 0x40 @@ -70,6 +72,7 @@ struct mpc_i2c { u8 fdr, dfsrr; #endif struct clk *clk_per; + bool has_errata_A004447; }; struct mpc_i2c_divider { @@ -176,6 +179,75 @@ static int i2c_wait(struct mpc_i2c *i2c, unsigned timeout, int writing) return 0; } +static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask) +{ + void __iomem *addr = i2c->base + MPC_I2C_SR; + u8 val; + + return readb_poll_timeout(addr, val, val & mask, 0, 100); +} + +/* + * Workaround for Erratum A004447. From the P2040CE Rev Q + * + * 1. Set up the frequency divider and sampling rate. + * 2. I2CCR - a0h + * 3. Poll for I2CSR[MBB] to get set. + * 4. If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to + * step 5. If MAL is not set, then go to step 13. + * 5. I2CCR - 00h + * 6. I2CCR - 22h + * 7. I2CCR - a2h + * 8. Poll for I2CSR[MBB] to get set. + * 9. Issue read to I2CDR. + * 10. Poll for I2CSR[MIF] to be set. + * 11. I2CCR - 82h + * 12. Workaround complete. Skip the next steps. + * 13. Issue read to I2CDR. + * 14. Poll for I2CSR[MIF] to be set. + * 15. I2CCR - 80h + */ +static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c) +{ + int ret; + u32 val; + + writeccr(i2c, CCR_MEN | CCR_MSTA); + ret = i2c_mpc_wait_sr(i2c, CSR_MBB); + if (ret) { + dev_err(i2c->dev, "timeout waiting for CSR_MBB\n"); + return; + } + + val = readb(i2c->base + MPC_I2C_SR); + + if (val & CSR_MAL) { + writeccr(i2c, 0x00); + writeccr(i2c, CCR_MSTA | CCR_RSVD); + writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD); + ret = i2c_mpc_wait_sr(i2c, CSR_MBB); + if (ret) { + dev_err(i2c->dev, "timeout waiting for CSR_MBB\n"); + return; + } + val = readb(i2c->base + MPC_I2C_DR); + ret = i2c_mpc_wait_sr(i2c, CSR_MIF); + if (ret) { + dev_err(i2c->dev, "timeout waiting for CSR_MIF\n"); + return; + } + writeccr(i2c, CCR_MEN | CCR_RSVD); + } else { + val = readb(i2c->base + MPC_I2C_DR); + ret = i2c_mpc_wait_sr(i2c, CSR_MIF); + if (ret) { + dev_err(i2c->dev, "timeout waiting for CSR_MIF\n"); + return; + } + writeccr(i2c, CCR_MEN); + } +} + #if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x) static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = { {20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23}, @@ -641,7 +713,10 @@ static int fsl_i2c_bus_recovery(struct i2c_adapter *adap) { struct mpc_i2c *i2c = i2c_get_adapdata(adap); - mpc_i2c_fixup(i2c); + if (i2c->has_errata_A004447) + mpc_i2c_fixup_A004447(i2c); + else + mpc_i2c_fixup(i2c); return 0; } @@ -745,6 +820,8 @@ static int fsl_i2c_probe(struct platform_device *op) dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 1000000 / HZ); platform_set_drvdata(op, i2c); + if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447")) + i2c->has_errata_A004447 = true; i2c->adap = mpc_ops; of_address_to_resource(op->dev.of_node, 0, &res); From bd7d88b0874f82f7b29d1a53e574cedaf23166ba Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 10 Jun 2021 17:20:59 +0200 Subject: [PATCH 182/269] ALSA: seq: Fix race of snd_seq_timer_open() commit 83e197a8414c0ba545e7e3916ce05f836f349273 upstream. The timer instance per queue is exclusive, and snd_seq_timer_open() should have managed the concurrent accesses. It looks as if it's checking the already existing timer instance at the beginning, but it's not right, because there is no protection, hence any later concurrent call of snd_seq_timer_open() may override the timer instance easily. This may result in UAF, as the leftover timer instance can keep running while the queue itself gets closed, as spotted by syzkaller recently. For avoiding the race, add a proper check at the assignment of tmr->timeri again, and return -EBUSY if it's been already registered. Reported-by: syzbot+ddc1260a83ed1cbf6fb5@syzkaller.appspotmail.com Cc: Link: https://lore.kernel.org/r/000000000000dce34f05c42f110c@google.com Link: https://lore.kernel.org/r/20210610152059.24633-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/core/seq/seq_timer.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/sound/core/seq/seq_timer.c b/sound/core/seq/seq_timer.c index 1645e4142e302..9863be6fd43e1 100644 --- a/sound/core/seq/seq_timer.c +++ b/sound/core/seq/seq_timer.c @@ -297,8 +297,16 @@ int snd_seq_timer_open(struct snd_seq_queue *q) return err; } spin_lock_irq(&tmr->lock); - tmr->timeri = t; + if (tmr->timeri) + err = -EBUSY; + else + tmr->timeri = t; spin_unlock_irq(&tmr->lock); + if (err < 0) { + snd_timer_close(t); + snd_timer_instance_free(t); + return err; + } return 0; } From 98f842951f8aa222e8a8453e6dbce6c056e9984f Mon Sep 17 00:00:00 2001 From: Takashi Sakamoto Date: Sat, 5 Jun 2021 18:10:54 +0900 Subject: [PATCH 183/269] ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun() commit 9981b20a5e3694f4625ab5a1ddc98ce7503f6d12 upstream. In the workqueue to queue wake-up event, isochronous context is not processed, thus it's useless to check context for the workqueue to switch status of runtime for PCM substream to XRUN. On the other hand, in software IRQ context of 1394 OHCI, it's needed. This commit fixes the bug introduced when tasklet was replaced with workqueue. Cc: Fixes: 2b3d2987d800 ("ALSA: firewire: Replace tasklet with work") Signed-off-by: Takashi Sakamoto Link: https://lore.kernel.org/r/20210605091054.68866-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/firewire/amdtp-stream.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c index e0faa6601966c..5805c5de39fbf 100644 --- a/sound/firewire/amdtp-stream.c +++ b/sound/firewire/amdtp-stream.c @@ -804,7 +804,7 @@ static void generate_pkt_descs(struct amdtp_stream *s, struct pkt_desc *descs, static inline void cancel_stream(struct amdtp_stream *s) { s->packet_index = -1; - if (current_work() == &s->period_work) + if (in_interrupt()) amdtp_stream_pcm_abort(s); WRITE_ONCE(s->pcm_buffer_pointer, SNDRV_PCM_POS_XRUN); } From 6a81e47788fffcf046db4f3328e74cbe6dc8fbb5 Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Tue, 8 Jun 2021 10:46:00 +0800 Subject: [PATCH 184/269] ALSA: hda/realtek: headphone and mic don't work on an Acer laptop commit 57c9e21a49b1c196cda28f54de9a5d556ac93f20 upstream. There are 2 issues on this machine, the 1st one is mic's plug/unplug can't be detected, that is because the mic is set to manual detecting mode, need to apply ALC255_FIXUP_XIAOMI_HEADSET_MIC to set it to auto detecting mode. The other one is headphone's plug/unplug can't be detected by pulseaudio, that is because the pulseaudio will use ucm2/sof-hda-dsp on this machine, and the ucm2 only handle 'Headphone Jack', but on this machine the headphone's pincfg sets the location to Front, then the alsa mixer name is "Front Headphone Jack" instead of "Headphone Jack", so override the pincfg to change location to Left. BugLink: http://bugs.launchpad.net/bugs/1930188 Cc: Signed-off-by: Hui Wang Link: https://lore.kernel.org/r/20210608024600.6198-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index cc13a68197f3c..b0e0608befee5 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6560,6 +6560,7 @@ enum { ALC285_FIXUP_HP_SPECTRE_X360, ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP, ALC623_FIXUP_LENOVO_THINKSTATION_P340, + ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, }; static const struct hda_fixup alc269_fixups[] = { @@ -8132,6 +8133,15 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC283_FIXUP_HEADSET_MIC, }, + [ALC255_FIXUP_ACER_HEADPHONE_AND_MIC] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x21, 0x03211030 }, /* Change the Headphone location to Left */ + { } + }, + .chained = true, + .chain_id = ALC255_FIXUP_XIAOMI_HEADSET_MIC + }, }; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -8168,6 +8178,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1025, 0x132a, "Acer TravelMate B114-21", ALC233_FIXUP_ACER_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x1330, "Acer TravelMate X514-51T", ALC255_FIXUP_ACER_HEADSET_MIC), SND_PCI_QUIRK(0x1025, 0x1430, "Acer TravelMate B311R-31", ALC256_FIXUP_ACER_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1025, 0x1466, "Acer Aspire A515-56", ALC255_FIXUP_ACER_HEADPHONE_AND_MIC), SND_PCI_QUIRK(0x1028, 0x0470, "Dell M101z", ALC269_FIXUP_DELL_M101Z), SND_PCI_QUIRK(0x1028, 0x054b, "Dell XPS one 2710", ALC275_FIXUP_DELL_XPS), SND_PCI_QUIRK(0x1028, 0x05bd, "Dell Latitude E6440", ALC292_FIXUP_DELL_E7X), @@ -8722,6 +8733,7 @@ static const struct hda_model_fixup alc269_fixup_models[] = { {.id = ALC285_FIXUP_HP_SPECTRE_X360, .name = "alc285-hp-spectre-x360"}, {.id = ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP, .name = "alc287-ideapad-bass-spk-amp"}, {.id = ALC623_FIXUP_LENOVO_THINKSTATION_P340, .name = "alc623-lenovo-thinkstation-p340"}, + {.id = ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, .name = "alc255-acer-headphone-and-mic"}, {} }; #define ALC225_STANDARD_PINS \ From bd0fe358d182de23dd5be92750b5538becba68ef Mon Sep 17 00:00:00 2001 From: Jeremy Szu Date: Sat, 5 Jun 2021 16:25:36 +0800 Subject: [PATCH 185/269] ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2 commit 15d295b560e6dd45f839a53ae69e4f63b54eb32f upstream. The HP Elite Dragonfly G2 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu Cc: Link: https://lore.kernel.org/r/20210605082539.41797-1-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b0e0608befee5..97b271f216f0d 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8307,6 +8307,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8519, "HP Spectre x360 15-df0xxx", ALC285_FIXUP_HP_SPECTRE_X360), SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x86c7, "HP Envy AiO 32", ALC274_FIXUP_HP_ENVY_GPIO), + SND_PCI_QUIRK(0x103c, 0x8716, "HP Elite Dragonfly G2 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8724, "HP EliteBook 850 G7", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8729, "HP", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8730, "HP ProBook 445 G7", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), From 557306806777762f8faa57a1a43cd427aa0ebd93 Mon Sep 17 00:00:00 2001 From: Jeremy Szu Date: Sat, 5 Jun 2021 16:25:37 +0800 Subject: [PATCH 186/269] ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8 commit 61d3e87468fad82dc8e8cb6de7db563ada64b532 upstream. The HP EliteBook x360 1040 G8 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu Cc: Link: https://lore.kernel.org/r/20210605082539.41797-2-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 97b271f216f0d..0281988f9b30a 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8308,6 +8308,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED), SND_PCI_QUIRK(0x103c, 0x86c7, "HP Envy AiO 32", ALC274_FIXUP_HP_ENVY_GPIO), SND_PCI_QUIRK(0x103c, 0x8716, "HP Elite Dragonfly G2 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), + SND_PCI_QUIRK(0x103c, 0x8720, "HP EliteBook x360 1040 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8724, "HP EliteBook 850 G7", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8729, "HP", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8730, "HP ProBook 445 G7", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), From d62d55f3941b99a88384ce764f70bc5865d42c06 Mon Sep 17 00:00:00 2001 From: Jeremy Szu Date: Sat, 5 Jun 2021 16:25:38 +0800 Subject: [PATCH 187/269] ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8 commit dfb06401b4cdfc71e2fc3e19b877ab845cc9f7f7 upstream. The HP EliteBook 840 Aero G8 using ALC285 codec which using 0x04 to control mute LED and 0x01 to control micmute LED. In the other hand, there is no output from right channel of speaker. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu Cc: Link: https://lore.kernel.org/r/20210605082539.41797-3-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 0281988f9b30a..1e7b329f85b3b 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8327,6 +8327,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x87f5, "HP", ALC287_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x87f7, "HP Spectre x360 14", ALC245_FIXUP_HP_X360_AMP), SND_PCI_QUIRK(0x103c, 0x8846, "HP EliteBook 850 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x884b, "HP EliteBook 840 Aero G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x884c, "HP EliteBook 840 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x886d, "HP ZBook Fury 17.3 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), From 05e6b715947592c5103ebdf13182fcacf522a8ed Mon Sep 17 00:00:00 2001 From: Jeremy Szu Date: Tue, 8 Jun 2021 19:47:48 +0800 Subject: [PATCH 188/269] ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8 commit 600dd2a7e8b62170d177381cc1303861f48f9780 upstream. The HP ZBook Power G8 using ALC236 codec which using 0x02 to control mute LED and 0x01 to control micmute LED. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu Cc: Link: https://lore.kernel.org/r/20210608114750.32009-1-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 1e7b329f85b3b..e46e43dac6bfd 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8332,6 +8332,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x886d, "HP ZBook Fury 17.3 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8873, "HP ZBook Studio 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), + SND_PCI_QUIRK(0x103c, 0x888d, "HP ZBook Power 15.6 inch G8 Mobile Workstation PC", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8896, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_MUTE_LED), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300), From 82a8ffba54d31e97582051cb56ba1f988018681e Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Sat, 22 May 2021 19:49:50 +0200 Subject: [PATCH 189/269] spi: bcm2835: Fix out-of-bounds access with more than 4 slaves commit 13817d466eb8713a1ffd254f537402f091d48444 upstream. Commit 571e31fa60b3 ("spi: bcm2835: Cache CS register value for ->prepare_message()") limited the number of slaves to 3 at compile-time. The limitation was necessitated by a statically-sized array prepare_cs[] in the driver private data which contains a per-slave register value. The commit sought to enforce the limitation at run-time by setting the controller's num_chipselect to 3: Slaves with a higher chipselect are rejected by spi_add_device(). However the commit neglected that num_chipselect only limits the number of *native* chipselects. If GPIO chipselects are specified in the device tree for more than 3 slaves, num_chipselect is silently raised by of_spi_get_gpio_numbers() and the result are out-of-bounds accesses to the statically-sized array prepare_cs[]. As a bandaid fix which is backportable to stable, raise the number of allowed slaves to 24 (which "ought to be enough for anybody"), enforce the limitation on slave ->setup and revert num_chipselect to 3 (which is the number of native chipselects supported by the controller). An upcoming for-next commit will allow an arbitrary number of slaves. Fixes: 571e31fa60b3 ("spi: bcm2835: Cache CS register value for ->prepare_message()") Reported-by: Joe Burmeister Signed-off-by: Lukas Wunner Cc: stable@vger.kernel.org # v5.4+ Cc: Phil Elwell Link: https://lore.kernel.org/r/75854affc1923309fde05e47494263bde73e5592.1621703210.git.lukas@wunner.de Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/spi/spi-bcm2835.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c index 197485f2c2b22..29ee555a42f90 100644 --- a/drivers/spi/spi-bcm2835.c +++ b/drivers/spi/spi-bcm2835.c @@ -68,7 +68,7 @@ #define BCM2835_SPI_FIFO_SIZE 64 #define BCM2835_SPI_FIFO_SIZE_3_4 48 #define BCM2835_SPI_DMA_MIN_LENGTH 96 -#define BCM2835_SPI_NUM_CS 4 /* raise as necessary */ +#define BCM2835_SPI_NUM_CS 24 /* raise as necessary */ #define BCM2835_SPI_MODE_BITS (SPI_CPOL | SPI_CPHA | SPI_CS_HIGH \ | SPI_NO_CS | SPI_3WIRE) @@ -1195,6 +1195,12 @@ static int bcm2835_spi_setup(struct spi_device *spi) struct gpio_chip *chip; u32 cs; + if (spi->chip_select >= BCM2835_SPI_NUM_CS) { + dev_err(&spi->dev, "only %d chip-selects supported\n", + BCM2835_SPI_NUM_CS - 1); + return -EINVAL; + } + /* * Precalculate SPI slave's CS register value for ->prepare_message(): * The driver always uses software-controlled GPIO chip select, hence @@ -1288,7 +1294,7 @@ static int bcm2835_spi_probe(struct platform_device *pdev) ctlr->use_gpio_descriptors = true; ctlr->mode_bits = BCM2835_SPI_MODE_BITS; ctlr->bits_per_word_mask = SPI_BPW_MASK(8); - ctlr->num_chipselect = BCM2835_SPI_NUM_CS; + ctlr->num_chipselect = 3; ctlr->setup = bcm2835_spi_setup; ctlr->transfer_one = bcm2835_spi_transfer_one; ctlr->handle_err = bcm2835_spi_handle_err; From afd87792db355282c4608356b98bb2dd650a6885 Mon Sep 17 00:00:00 2001 From: Zhang Rui Date: Thu, 3 Jun 2021 10:34:14 +0800 Subject: [PATCH 190/269] Revert "ACPI: sleep: Put the FACS table after using it" commit f1ffa9d4cccc8fdf6c03fb1b3429154d22037988 upstream. Commit 95722237cb2a ("ACPI: sleep: Put the FACS table after using it") puts the FACS table during initialization. But the hardware signature bits in the FACS table need to be accessed, after every hibernation, to compare with the original hardware signature. So there is no reason to release the FACS table mapping after initialization. This reverts commit 95722237cb2ae4f7b73471058cdb19e8f4057c93. An alternative solution is to use acpi_gbl_FACS variable instead, which is mapped by the ACPICA core and never released. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212277 Reported-by: Stephan Hohe Signed-off-by: Zhang Rui Cc: 5.8+ # 5.8+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman --- drivers/acpi/sleep.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c index aff13bf4d9470..31c9d0c8ae11f 100644 --- a/drivers/acpi/sleep.c +++ b/drivers/acpi/sleep.c @@ -1290,10 +1290,8 @@ static void acpi_sleep_hibernate_setup(void) return; acpi_get_table(ACPI_SIG_FACS, 1, (struct acpi_table_header **)&facs); - if (facs) { + if (facs) s4_hardware_signature = facs->hardware_signature; - acpi_put_table((struct acpi_table_header *)facs); - } } #else /* !CONFIG_HIBERNATION */ static inline void acpi_sleep_hibernate_setup(void) {} From 491d52e0078860b33b6c14f0a7ac74ca1b603bd6 Mon Sep 17 00:00:00 2001 From: Desmond Cheong Zhi Xi Date: Tue, 8 Jun 2021 19:04:36 +0800 Subject: [PATCH 191/269] drm: Fix use-after-free read in drm_getunique() commit b436acd1cf7fac0ba987abd22955d98025c80c2b upstream. There is a time-of-check-to-time-of-use error in drm_getunique() due to retrieving file_priv->master prior to locking the device's master mutex. An example can be seen in the crash report of the use-after-free error found by Syzbot: https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803 In the report, the master pointer was used after being freed. This is because another process had acquired the device's master mutex in drm_setmaster_ioctl(), then overwrote fpriv->master in drm_new_set_master(). The old value of fpriv->master was subsequently freed before the mutex was unlocked. To fix this, we lock the device's master mutex before retrieving the pointer from from fpriv->master. This patch passes the Syzbot reproducer test. Reported-by: syzbot+c3a706cec1ea99e1c693@syzkaller.appspotmail.com Signed-off-by: Desmond Cheong Zhi Xi Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210608110436.239583-1-desmondcheongzx@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_ioctl.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 789ee65ac1f57..ae647be4a49fb 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -118,17 +118,18 @@ int drm_getunique(struct drm_device *dev, void *data, struct drm_file *file_priv) { struct drm_unique *u = data; - struct drm_master *master = file_priv->master; + struct drm_master *master; - mutex_lock(&master->dev->master_mutex); + mutex_lock(&dev->master_mutex); + master = file_priv->master; if (u->unique_len >= master->unique_len) { if (copy_to_user(u->unique, master->unique, master->unique_len)) { - mutex_unlock(&master->dev->master_mutex); + mutex_unlock(&dev->master_mutex); return -EFAULT; } } u->unique_len = master->unique_len; - mutex_unlock(&master->dev->master_mutex); + mutex_unlock(&dev->master_mutex); return 0; } From aa8591a58cbd2986090709e4202881f18e8ae30e Mon Sep 17 00:00:00 2001 From: Desmond Cheong Zhi Xi Date: Wed, 9 Jun 2021 17:21:19 +0800 Subject: [PATCH 192/269] drm: Lock pointer access in drm_master_release() commit c336a5ee984708db4826ef9e47d184e638e29717 upstream. This patch eliminates the following smatch warning: drivers/gpu/drm/drm_auth.c:320 drm_master_release() warn: unlocked access 'master' (line 318) expected lock '&dev->master_mutex' The 'file_priv->master' field should be protected by the mutex lock to '&dev->master_mutex'. This is because other processes can concurrently modify this field and free the current 'file_priv->master' pointer. This could result in a use-after-free error when 'master' is dereferenced in subsequent function calls to 'drm_legacy_lock_master_cleanup()' or to 'drm_lease_revoke()'. An example of a scenario that would produce this error can be seen from a similar bug in 'drm_getunique()' that was reported by Syzbot: https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803 In the Syzbot report, another process concurrently acquired the device's master mutex in 'drm_setmaster_ioctl()', then overwrote 'fpriv->master' in 'drm_new_set_master()'. The old value of 'fpriv->master' was subsequently freed before the mutex was unlocked. Reported-by: Dan Carpenter Signed-off-by: Desmond Cheong Zhi Xi Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210609092119.173590-1-desmondcheongzx@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_auth.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c index f2d46b7ac6f9f..232abbba36868 100644 --- a/drivers/gpu/drm/drm_auth.c +++ b/drivers/gpu/drm/drm_auth.c @@ -314,9 +314,10 @@ int drm_master_open(struct drm_file *file_priv) void drm_master_release(struct drm_file *file_priv) { struct drm_device *dev = file_priv->minor->dev; - struct drm_master *master = file_priv->master; + struct drm_master *master; mutex_lock(&dev->master_mutex); + master = file_priv->master; if (file_priv->magic) idr_remove(&file_priv->master->magic_map, file_priv->magic); From 14831b79560b3408e0ea5d4ba37a465a04b70a11 Mon Sep 17 00:00:00 2001 From: Kan Liang Date: Tue, 1 Jun 2021 06:09:03 -0700 Subject: [PATCH 193/269] perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server commit 848ff3768684701a4ce73a2ec0e5d438d4e2b0da upstream. Perf tool errors out with the latest event list for the Ice Lake server. event syntax error: 'unc_m2m_imc_reads.to_pmm' \___ value too big for format, maximum is 255 The same as the Snow Ridge server, the M2M uncore unit in the Ice Lake server has the unit mask extension field as well. Fixes: 2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support") Reported-by: Jin Yao Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1622552943-119174-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Greg Kroah-Hartman --- arch/x86/events/intel/uncore_snbep.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index 3112186a4f4b2..16159950fcf5b 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -5067,9 +5067,10 @@ static struct intel_uncore_type icx_uncore_m2m = { .perf_ctr = SNR_M2M_PCI_PMON_CTR0, .event_ctl = SNR_M2M_PCI_PMON_CTL0, .event_mask = SNBEP_PMON_RAW_EVENT_MASK, + .event_mask_ext = SNR_M2M_PCI_PMON_UMASK_EXT, .box_ctl = SNR_M2M_PCI_PMON_BOX_CTL, .ops = &snr_m2m_uncore_pci_ops, - .format_group = &skx_uncore_format_group, + .format_group = &snr_m2m_uncore_format_group, }; static struct attribute *icx_upi_uncore_formats_attr[] = { From 6b6ff4d1f349cb35a7c7d2057819af1b14f80437 Mon Sep 17 00:00:00 2001 From: Lai Jiangshan Date: Thu, 3 Jun 2021 13:24:55 +0800 Subject: [PATCH 194/269] KVM: X86: MMU: Use the correct inherited permissions to get shadow page commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream. When computing the access permissions of a shadow page, use the effective permissions of the walk up to that point, i.e. the logic AND of its parents' permissions. Two guest PxE entries that point at the same table gfn need to be shadowed with different shadow pages if their parents' permissions are different. KVM currently uses the effective permissions of the last non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have full ("uwx") permissions, and the effective permissions are recorded only in role.access and merged into the leaves, this can lead to incorrect reuse of a shadow page and eventually to a missing guest protection page fault. For example, here is a shared pagetable: pgd[] pud[] pmd[] virtual address pointers /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--) /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-) pgd-| (shared pmd[] as above) \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--) \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--) pud1 and pud2 point to the same pmd table, so: - ptr1 and ptr3 points to the same page. - ptr2 and ptr4 points to the same page. (pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries) - First, the guest reads from ptr1 first and KVM prepares a shadow page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1. "u--" comes from the effective permissions of pgd, pud1 and pmd1, which are stored in pt->access. "u--" is used also to get the pagetable for pud1, instead of "uw-". - Then the guest writes to ptr2 and KVM reuses pud1 which is present. The hypervisor set up a shadow page for ptr2 with pt->access is "uw-" even though the pud1 pmd (because of the incorrect argument to kvm_mmu_get_page in the previous step) has role.access="u--". - Then the guest reads from ptr3. The hypervisor reuses pud1's shadow pmd for pud2, because both use "u--" for their permissions. Thus, the shadow pmd already includes entries for both pmd1 and pmd2. - At last, the guest writes to ptr4. This causes no vmexit or pagefault, because pud1's shadow page structures included an "uw-" page even though its role.access was "u--". Any kind of shared pagetable might have the similar problem when in virtual machine without TDP enabled if the permissions are different from different ancestors. In order to fix the problem, we change pt->access to be an array, and any access in it will not include permissions ANDed from child ptes. The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/ Remember to test it with TDP disabled. The problem had existed long before the commit 41074d07c78b ("KVM: MMU: Fix inherited permissions for emulated guest pte updates"), and it is hard to find which is the culprit. So there is no fixes tag here. Signed-off-by: Lai Jiangshan Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com> Cc: stable@vger.kernel.org Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching") Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman --- Documentation/virt/kvm/mmu.rst | 4 ++-- arch/x86/kvm/mmu/paging_tmpl.h | 14 +++++++++----- 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst index 5bfe28b0728e8..20d85daed395e 100644 --- a/Documentation/virt/kvm/mmu.rst +++ b/Documentation/virt/kvm/mmu.rst @@ -171,8 +171,8 @@ Shadow pages contain the following information: shadow pages) so role.quadrant takes values in the range 0..3. Each quadrant maps 1GB virtual address space. role.access: - Inherited guest access permissions in the form uwx. Note execute - permission is positive, not negative. + Inherited guest access permissions from the parent ptes in the form uwx. + Note execute permission is positive, not negative. role.invalid: The page is invalid and should not be used. It is a root page that is currently pinned (by a cpu hardware register pointing to it); once it is diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 50e268eb8e1a9..00a0bfaed6e86 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -90,8 +90,8 @@ struct guest_walker { gpa_t pte_gpa[PT_MAX_FULL_LEVELS]; pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS]; bool pte_writable[PT_MAX_FULL_LEVELS]; - unsigned pt_access; - unsigned pte_access; + unsigned int pt_access[PT_MAX_FULL_LEVELS]; + unsigned int pte_access; gfn_t gfn; struct x86_exception fault; }; @@ -418,13 +418,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, } walker->ptes[walker->level - 1] = pte; + + /* Convert to ACC_*_MASK flags for struct guest_walker. */ + walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask); } while (!is_last_gpte(mmu, walker->level, pte)); pte_pkey = FNAME(gpte_pkeys)(vcpu, pte); accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0; /* Convert to ACC_*_MASK flags for struct guest_walker. */ - walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask); walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask); errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access); if (unlikely(errcode)) @@ -463,7 +465,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, } pgprintk("%s: pte %llx pte_access %x pt_access %x\n", - __func__, (u64)pte, walker->pte_access, walker->pt_access); + __func__, (u64)pte, walker->pte_access, + walker->pt_access[walker->level - 1]); return 1; error: @@ -635,7 +638,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr, bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled; struct kvm_mmu_page *sp = NULL; struct kvm_shadow_walk_iterator it; - unsigned direct_access, access = gw->pt_access; + unsigned int direct_access, access; int top_level, level, req_level, ret; gfn_t base_gfn = gw->gfn; @@ -667,6 +670,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr, sp = NULL; if (!is_shadow_present_pte(*it.sptep)) { table_gfn = gw->table_gfn[it.level - 2]; + access = gw->pt_access[it.level - 2]; sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1, false, access); } From 7af299b97734c7e7f465b42a2139ce4d77246975 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Tue, 8 Jun 2021 15:31:42 -0400 Subject: [PATCH 195/269] kvm: avoid speculation-based attacks from out-of-range memslot accesses commit da27a83fd6cc7780fea190e1f5c19e87019da65c upstream. KVM's mechanism for accessing guest memory translates a guest physical address (gpa) to a host virtual address using the right-shifted gpa (also known as gfn) and a struct kvm_memory_slot. The translation is performed in __gfn_to_hva_memslot using the following formula: hva = slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE It is expected that gfn falls within the boundaries of the guest's physical memory. However, a guest can access invalid physical addresses in such a way that the gfn is invalid. __gfn_to_hva_memslot is called from kvm_vcpu_gfn_to_hva_prot, which first retrieves a memslot through __gfn_to_memslot. While __gfn_to_memslot does check that the gfn falls within the boundaries of the guest's physical memory or not, a CPU can speculate the result of the check and continue execution speculatively using an illegal gfn. The speculation can result in calculating an out-of-bounds hva. If the resulting host virtual address is used to load another guest physical address, this is effectively a Spectre gadget consisting of two consecutive reads, the second of which is data dependent on the first. Right now it's not clear if there are any cases in which this is exploitable. One interesting case was reported by the original author of this patch, and involves visiting guest page tables on x86. Right now these are not vulnerable because the hva read goes through get_user(), which contains an LFENCE speculation barrier. However, there are patches in progress for x86 uaccess.h to mask kernel addresses instead of using LFENCE; once these land, a guest could use speculation to read from the VMM's ring 3 address space. Other architectures such as ARM already use the address masking method, and would be susceptible to this same kind of data-dependent access gadgets. Therefore, this patch proactively protects from these attacks by masking out-of-bounds gfns in __gfn_to_hva_memslot, which blocks speculation of invalid hvas. Sean Christopherson noted that this patch does not cover kvm_read_guest_offset_cached. This however is limited to a few bytes past the end of the cache, and therefore it is unlikely to be useful in the context of building a chain of data dependent accesses. Reported-by: Artemiy Margaritov Co-developed-by: Artemiy Margaritov Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman --- include/linux/kvm_host.h | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a2278b9ff57d2..ecab72456c100 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1104,7 +1104,15 @@ __gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn) static inline unsigned long __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn) { - return slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE; + /* + * The index was checked originally in search_memslots. To avoid + * that a malicious guest builds a Spectre gadget out of e.g. page + * table walks, do not let the processor speculate loads outside + * the guest's registered memslots. + */ + unsigned long offset = array_index_nospec(gfn - slot->base_gfn, + slot->npages); + return slot->userspace_addr + offset * PAGE_SIZE; } static inline int memslot_id(struct kvm *kvm, gfn_t gfn) From 8d5c0f6b7a784f5ff71fb463ecad58cf042cfccb Mon Sep 17 00:00:00 2001 From: Wenli Looi Date: Mon, 7 Jun 2021 23:46:20 -0700 Subject: [PATCH 196/269] staging: rtl8723bs: Fix uninitialized variables commit 43c85d770db80cb135f576f8fde6ff1a08e707a4 upstream. The sinfo.pertid and sinfo.generation variables are not initialized and it causes a crash when we use this as a wireless access point. [ 456.873025] ------------[ cut here ]------------ [ 456.878198] kernel BUG at mm/slub.c:3968! [ 456.882680] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [ snip ] [ 457.271004] Backtrace: [ 457.273733] [] (kfree) from [] (nl80211_send_station+0x954/0xfc4) [ 457.282481] r9:eccca0c0 r8:e8edfec0 r7:00000000 r6:00000011 r5:e80a9480 r4:e8edfe00 [ 457.291132] [] (nl80211_send_station) from [] (cfg80211_new_sta+0x90/0x1cc) [ 457.300850] r10:e80a9480 r9:e8edfe00 r8:ea678cca r7:00000a20 r6:00000000 r5:ec46d000 [ 457.309586] r4:ec46d9e0 [ 457.312433] [] (cfg80211_new_sta) from [] (rtw_cfg80211_indicate_sta_assoc+0x80/0x9c [r8723bs]) [ 457.324095] r10:00009930 r9:e85b9d80 r8:bf091050 r7:00000000 r6:00000000 r5:0000001c [ 457.332831] r4:c1606788 [ 457.335692] [] (rtw_cfg80211_indicate_sta_assoc [r8723bs]) from [] (rtw_stassoc_event_callback+0x1c8/0x1d4 [r8723bs]) [ 457.349489] r7:ea678cc0 r6:000000a1 r5:f1225f84 r4:f086b000 [ 457.355845] [] (rtw_stassoc_event_callback [r8723bs]) from [] (mlme_evt_hdl+0x8c/0xb4 [r8723bs]) [ 457.367601] r7:c1604900 r6:f086c4b8 r5:00000000 r4:f086c000 [ 457.373959] [] (mlme_evt_hdl [r8723bs]) from [] (rtw_cmd_thread+0x198/0x3d8 [r8723bs]) [ 457.384744] r5:f086e000 r4:f086c000 [ 457.388754] [] (rtw_cmd_thread [r8723bs]) from [] (kthread+0x170/0x174) [ 457.398083] r10:ed7a57e8 r9:bf0367a4 r8:f086b000 r7:e8ede000 r6:00000000 r5:e9975200 [ 457.406828] r4:e8369900 [ 457.409653] [] (kthread) from [] (ret_from_fork+0x14/0x2c) [ 457.417718] Exception stack(0xe8edffb0 to 0xe8edfff8) [ 457.423356] ffa0: 00000000 00000000 00000000 00000000 [ 457.432492] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 457.441618] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 457.449006] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c014a0a4 [ 457.457750] r4:e9975200 [ 457.460574] Code: 1a000003 e5953004 e3130001 1a000000 (e7f001f2) [ 457.467381] ---[ end trace 4acbc8c15e9e6aa7 ]--- Link: https://forum.armbian.com/topic/14727-wifi-ap-kernel-bug-in-kernel-5444/ Fixes: 8689c051a201 ("cfg80211: dynamically allocate per-tid stats for station info") Fixes: f5ea9120be2e ("nl80211: add generation number to all dumps") Signed-off-by: Wenli Looi Reviewed-by: Dan Carpenter Cc: stable Link: https://lore.kernel.org/r/20210608064620.74059-1-wlooi@ucalgary.ca Signed-off-by: Greg Kroah-Hartman --- drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c index ea3ae3d38337e..b7993e25764d5 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -2384,7 +2384,7 @@ void rtw_cfg80211_indicate_sta_assoc(struct adapter *padapter, u8 *pmgmt_frame, DBG_871X(FUNC_ADPT_FMT"\n", FUNC_ADPT_ARG(padapter)); { - struct station_info sinfo; + struct station_info sinfo = {}; u8 ie_offset; if (GetFrameSubType(pmgmt_frame) == WIFI_ASSOCREQ) ie_offset = _ASOCREQ_IE_OFFSET_; From bf240fee5b341fbc68615f04892af274e7abf8b3 Mon Sep 17 00:00:00 2001 From: Xiao Ni Date: Fri, 28 May 2021 14:16:38 +0800 Subject: [PATCH 197/269] async_xor: check src_offs is not NULL before updating it commit 9be148e408df7d361ec5afd6299b7736ff3928b0 upstream. When PAGE_SIZE is greater than 4kB, multiple stripes may share the same page. Thus, src_offs is added to async_xor_offs() with array of offsets. However, async_xor() passes NULL src_offs to async_xor_offs(). In such case, src_offs should not be updated. Add a check before the update. Fixes: ceaf2966ab08(async_xor: increase src_offs when dropping destination page) Cc: stable@vger.kernel.org # v5.10+ Reported-by: Oleksandr Shchirskyi Tested-by: Oleksandr Shchirskyi Signed-off-by: Xiao Ni Signed-off-by: Song Liu Signed-off-by: Greg Kroah-Hartman --- crypto/async_tx/async_xor.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c index 6cd7f7025df47..d8a91521144e0 100644 --- a/crypto/async_tx/async_xor.c +++ b/crypto/async_tx/async_xor.c @@ -233,7 +233,8 @@ async_xor_offs(struct page *dest, unsigned int offset, if (submit->flags & ASYNC_TX_XOR_DROP_DST) { src_cnt--; src_list++; - src_offs++; + if (src_offs) + src_offs++; } /* wait for any prerequisite operations */ From ca69dc891b2875e73cf55defaf06bad27c44573c Mon Sep 17 00:00:00 2001 From: Ritesh Harjani Date: Sun, 30 May 2021 20:24:05 +0530 Subject: [PATCH 198/269] btrfs: return value from btrfs_mark_extent_written() in case of error commit e7b2ec3d3d4ebeb4cff7ae45cf430182fa6a49fb upstream. We always return 0 even in case of an error in btrfs_mark_extent_written(). Fix it to return proper error value in case of a failure. All callers handle it. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Ritesh Harjani Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 7e87549c5edaf..ffa48ac98d1e5 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1088,7 +1088,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, int del_nr = 0; int del_slot = 0; int recow; - int ret; + int ret = 0; u64 ino = btrfs_ino(inode); path = btrfs_alloc_path(); @@ -1309,7 +1309,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, } out: btrfs_free_path(path); - return 0; + return ret; } /* From 31fe243a6376e81f69ffab3853c35a6c4f3be5fc Mon Sep 17 00:00:00 2001 From: Nikolay Borisov Date: Mon, 31 May 2021 12:26:01 +0300 Subject: [PATCH 199/269] btrfs: promote debugging asserts to full-fledged checks in validate_super commit aefd7f7065567a4666f42c0fc8cdb379d2e036bf upstream. Syzbot managed to trigger this assert while performing its fuzzing. Turns out it's better to have those asserts turned into full-fledged checks so that in case buggy btrfs images are mounted the users gets an error and mounting is stopped. Alternatively with CONFIG_BTRFS_ASSERT disabled such image would have been erroneously allowed to be mounted. Reported-by: syzbot+a6bf271c02e4fe66b4e4@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn Reviewed-by: Qu Wenruo Signed-off-by: Nikolay Borisov Reviewed-by: David Sterba [ add uuids to the messages ] Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/disk-io.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 56f3b9acd2154..e025cd8f3f071 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2467,6 +2467,24 @@ static int validate_super(struct btrfs_fs_info *fs_info, ret = -EINVAL; } + if (memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid, + BTRFS_FSID_SIZE)) { + btrfs_err(fs_info, + "superblock fsid doesn't match fsid of fs_devices: %pU != %pU", + fs_info->super_copy->fsid, fs_info->fs_devices->fsid); + ret = -EINVAL; + } + + if (btrfs_fs_incompat(fs_info, METADATA_UUID) && + memcmp(fs_info->fs_devices->metadata_uuid, + fs_info->super_copy->metadata_uuid, BTRFS_FSID_SIZE)) { + btrfs_err(fs_info, +"superblock metadata_uuid doesn't match metadata uuid of fs_devices: %pU != %pU", + fs_info->super_copy->metadata_uuid, + fs_info->fs_devices->metadata_uuid); + ret = -EINVAL; + } + if (memcmp(fs_info->fs_devices->metadata_uuid, sb->dev_item.fsid, BTRFS_FSID_SIZE) != 0) { btrfs_err(fs_info, @@ -2969,14 +2987,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device disk_super = fs_info->super_copy; - ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid, - BTRFS_FSID_SIZE)); - - if (btrfs_fs_incompat(fs_info, METADATA_UUID)) { - ASSERT(!memcmp(fs_info->fs_devices->metadata_uuid, - fs_info->super_copy->metadata_uuid, - BTRFS_FSID_SIZE)); - } features = btrfs_super_flags(disk_super); if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) { From 74d3b20b1b206a76f2cbccc5e09106adf6b5775c Mon Sep 17 00:00:00 2001 From: Alexander Kuznetsov Date: Wed, 9 Jun 2021 10:17:19 +0300 Subject: [PATCH 200/269] cgroup1: don't allow '\n' in renaming commit b7e24eb1caa5f8da20d405d262dba67943aedc42 upstream. cgroup_mkdir() have restriction on newline usage in names: $ mkdir $'/sys/fs/cgroup/cpu/test\ntest2' mkdir: cannot create directory '/sys/fs/cgroup/cpu/test\ntest2': Invalid argument But in cgroup1_rename() such check is missed. This allows us to make /proc//cgroup unparsable: $ mkdir /sys/fs/cgroup/cpu/test $ mv /sys/fs/cgroup/cpu/test $'/sys/fs/cgroup/cpu/test\ntest2' $ echo $$ > $'/sys/fs/cgroup/cpu/test\ntest2' $ cat /proc/self/cgroup 11:pids:/ 10:freezer:/ 9:hugetlb:/ 8:cpuset:/ 7:blkio:/user.slice 6:memory:/user.slice 5:net_cls,net_prio:/ 4:perf_event:/ 3:devices:/user.slice 2:cpu,cpuacct:/test test2 1:name=systemd:/ 0::/ Signed-off-by: Alexander Kuznetsov Reported-by: Andrey Krasichkov Acked-by: Dmitry Yakunin Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman --- kernel/cgroup/cgroup-v1.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index a5751784ad740..f6dddb3a8f4a2 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -820,6 +820,10 @@ static int cgroup1_rename(struct kernfs_node *kn, struct kernfs_node *new_parent struct cgroup *cgrp = kn->priv; int ret; + /* do not accept '\n' to prevent making /proc//cgroup unparsable */ + if (strchr(new_name_str, '\n')) + return -EINVAL; + if (kernfs_type(kn) != KERNFS_DIR) return -ENOTDIR; if (kn->parent != new_parent) From 97524384762c1fb9b3ded931498dd2047bd0de81 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Mon, 7 Jun 2021 21:39:08 -0400 Subject: [PATCH 201/269] ftrace: Do not blindly read the ip address in ftrace_bug() commit 6c14133d2d3f768e0a35128faac8aa6ed4815051 upstream. It was reported that a bug on arm64 caused a bad ip address to be used for updating into a nop in ftrace_init(), but the error path (rightfully) returned -EINVAL and not -EFAULT, as the bug caused more than one error to occur. But because -EINVAL was returned, the ftrace_bug() tried to report what was at the location of the ip address, and read it directly. This caused the machine to panic, as the ip was not pointing to a valid memory address. Instead, read the ip address with copy_from_kernel_nofault() to safely access the memory, and if it faults, report that the address faulted, otherwise report what was in that location. Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/ Cc: stable@vger.kernel.org Fixes: 05736a427f7e1 ("ftrace: warn on failure to disable mcount callers") Reported-by: Mark-PK Tsai Tested-by: Mark-PK Tsai Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman --- kernel/trace/ftrace.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index a6d15a3187d0e..30010614b9237 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1968,12 +1968,18 @@ static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops, static void print_ip_ins(const char *fmt, const unsigned char *p) { + char ins[MCOUNT_INSN_SIZE]; int i; + if (copy_from_kernel_nofault(ins, p, MCOUNT_INSN_SIZE)) { + printk(KERN_CONT "%s[FAULT] %px\n", fmt, p); + return; + } + printk(KERN_CONT "%s", fmt); for (i = 0; i < MCOUNT_INSN_SIZE; i++) - printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]); + printk(KERN_CONT "%s%02x", i ? ":" : "", ins[i]); } enum ftrace_bug_type ftrace_bug_type; From 67aca230caf346ddf608ee69469777cd52929493 Mon Sep 17 00:00:00 2001 From: Wolfram Sang Date: Wed, 2 Jun 2021 09:34:35 +0200 Subject: [PATCH 202/269] mmc: renesas_sdhi: abort tuning when timeout detected MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 2c9017d0b5d3fbf17e69577a42d9e610ca122810 upstream. We have to bring the eMMC from sending-data state back to transfer state once we detected a CRC error (timeout) during tuning. So, send a stop command via mmc_abort_tuning(). Fixes: 4f11997773b6 ("mmc: tmio: Add tuning support") Reported-by Yoshihiro Shimoda Signed-off-by: Wolfram Sang Reviewed-by: Niklas Söderlund Reviewed-by: Yoshihiro Shimoda Tested-by: Yoshihiro Shimoda Link: https://lore.kernel.org/r/20210602073435.5955-1-wsa+renesas@sang-engineering.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman --- drivers/mmc/host/renesas_sdhi_core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c index acb9c81a4e456..666c824715e19 100644 --- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -660,14 +660,19 @@ static int renesas_sdhi_execute_tuning(struct mmc_host *mmc, u32 opcode) /* Issue CMD19 twice for each tap */ for (i = 0; i < 2 * priv->tap_num; i++) { + int cmd_error; + /* Set sampling clock position */ sd_scc_write32(host, priv, SH_MOBILE_SDHI_SCC_TAPSET, i % priv->tap_num); - if (mmc_send_tuning(mmc, opcode, NULL) == 0) + if (mmc_send_tuning(mmc, opcode, &cmd_error) == 0) set_bit(i, priv->taps); if (sd_scc_read32(host, priv, SH_MOBILE_SDHI_SCC_SMPCMP) == 0) set_bit(i, priv->smpcmp); + + if (cmd_error) + mmc_abort_tuning(mmc, opcode); } ret = renesas_sdhi_select_tuning(host); From 1bf2c28ab2684a05c0c6b0be70682a44312b4785 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Fri, 4 Jun 2021 14:59:43 +0200 Subject: [PATCH 203/269] mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+ MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 6687cd72aa9112a454a4646986e0402dd1b07d0e upstream. R-Car M3-W ES3.0 is marketed as R-Car M3-W+ (R8A77961), and has its own compatible value "renesas,r8a77961". Hence using soc_device_match() with soc_id = "r8a7796" and revision = "ES3.*" does not actually match running on an R-Car M3-W+ SoC. Fix this by matching with soc_id = "r8a77961" instead. Fixes: a38c078fea0b1393 ("mmc: renesas_sdhi: Avoid bad TAP in HS400") Signed-off-by: Geert Uytterhoeven Reviewed-by: Wolfram Sang Reviewed-by: Niklas Söderlund Reviewed-by: Yoshihiro Shimoda Link: https://lore.kernel.org/r/ee8af5d631f5331139ffea714539030d97352e93.1622811525.git.geert+renesas@glider.be Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman --- drivers/mmc/host/renesas_sdhi_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c index 666c824715e19..addaaf2810e26 100644 --- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -902,7 +902,7 @@ static const struct soc_device_attribute sdhi_quirks_match[] = { { .soc_id = "r8a7795", .revision = "ES3.*", .data = &sdhi_quirks_bad_taps2367 }, { .soc_id = "r8a7796", .revision = "ES1.[012]", .data = &sdhi_quirks_4tap_nohs400 }, { .soc_id = "r8a7796", .revision = "ES1.*", .data = &sdhi_quirks_r8a7796_es13 }, - { .soc_id = "r8a7796", .revision = "ES3.*", .data = &sdhi_quirks_bad_taps1357 }, + { .soc_id = "r8a77961", .data = &sdhi_quirks_bad_taps1357 }, { .soc_id = "r8a77965", .data = &sdhi_quirks_r8a77965 }, { .soc_id = "r8a77980", .data = &sdhi_quirks_nohs400 }, { .soc_id = "r8a77990", .data = &sdhi_quirks_r8a77990 }, From 0f5a20b1fd9da3ac9f7c6edcad522712ca694d5c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Maciej=20=C5=BBenczykowski?= Date: Mon, 7 Jun 2021 17:53:44 -0700 Subject: [PATCH 204/269] USB: f_ncm: ncm_bitrate (speed) is unsigned MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 3370139745853f7826895293e8ac3aec1430508e upstream. [ 190.544755] configfs-gadget gadget: notify speed -44967296 This is because 4250000000 - 2**32 is -44967296. Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added") Cc: Brooke Basile Cc: Bryan O'Donoghue Cc: Felipe Balbi Cc: Lorenzo Colitti Cc: Yauheni Kaliuta Cc: Linux USB Mailing List Acked-By: Lorenzo Colitti Signed-off-by: Maciej Żenczykowski Cc: stable Link: https://lore.kernel.org/r/20210608005344.3762668-1-zenczykowski@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_ncm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index 019bea8e09cce..0d23c6c11a137 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -583,7 +583,7 @@ static void ncm_do_notify(struct f_ncm *ncm) data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget)); data[1] = data[0]; - DBG(cdev, "notify speed %d\n", ncm_bitrate(cdev->gadget)); + DBG(cdev, "notify speed %u\n", ncm_bitrate(cdev->gadget)); ncm->notify_state = NCM_NOTIFY_CONNECT; break; } From 0ff5f83ae147e63c297e0a5515c9c271b7448f6f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Maciej=20=C5=BBenczykowski?= Date: Tue, 8 Jun 2021 01:54:38 -0700 Subject: [PATCH 205/269] usb: f_ncm: only first packet of aggregate needs to start timer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 1958ff5ad2d4908b44a72bcf564dfe67c981e7fe upstream. The reasoning for this change is that if we already had a packet pending, then we also already had a pending timer, and as such there is no need to reschedule it. This also prevents packets getting delayed 60 ms worst case under a tiny packet every 290us transmit load, by keeping the timeout always relative to the first queued up packet. (300us delay * 16KB max aggregation / 80 byte packet =~ 60 ms) As such the first packet is now at most delayed by 300us. Under low transmit load, this will simply result in us sending a shorter aggregate, as originally intended. This patch has the benefit of greatly reducing (by ~10 factor with 1500 byte frames aggregated into 16 kiB) the number of (potentially pretty costly) updates to the hrtimer. Cc: Brooke Basile Cc: Bryan O'Donoghue Cc: Felipe Balbi Cc: Lorenzo Colitti Signed-off-by: Maciej Żenczykowski Link: https://lore.kernel.org/r/20210608085438.813960-1-zenczykowski@gmail.com Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_ncm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index 0d23c6c11a137..855127249f242 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1101,11 +1101,11 @@ static struct sk_buff *ncm_wrap_ntb(struct gether *port, ncm->ndp_dgram_count = 1; /* Note: we skip opts->next_ndp_index */ - } - /* Delay the timer. */ - hrtimer_start(&ncm->task_timer, TX_TIMEOUT_NSECS, - HRTIMER_MODE_REL_SOFT); + /* Start the timer. */ + hrtimer_start(&ncm->task_timer, TX_TIMEOUT_NSECS, + HRTIMER_MODE_REL_SOFT); + } /* Add the datagram position entries */ ntb_ndp = skb_put_zero(ncm->skb_tx_ndp, dgram_idx_len); From b452e8bb7c525fbecc69aef44b0721d2ece032bc Mon Sep 17 00:00:00 2001 From: Kyle Tso Date: Fri, 28 May 2021 16:16:13 +0800 Subject: [PATCH 206/269] usb: pd: Set PD_T_SINK_WAIT_CAP to 310ms commit 6490fa565534fa83593278267785a694fd378a2b upstream. Current timer PD_T_SINK_WAIT_CAP is set to 240ms which will violate the SinkWaitCapTimer (tTypeCSinkWaitCap 310 - 620 ms) defined in the PD Spec if the port is faster enough when running the state machine. Set it to the lower bound 310ms to ensure the timeout is in Spec. Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)") Cc: stable Reviewed-by: Guenter Roeck Signed-off-by: Kyle Tso Link: https://lore.kernel.org/r/20210528081613.730661-1-kyletso@google.com Signed-off-by: Greg Kroah-Hartman --- include/linux/usb/pd.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/usb/pd.h b/include/linux/usb/pd.h index 3a805e2ecbc99..433040ff840a3 100644 --- a/include/linux/usb/pd.h +++ b/include/linux/usb/pd.h @@ -459,7 +459,7 @@ static inline unsigned int rdo_max_power(u32 rdo) #define PD_T_RECEIVER_RESPONSE 15 /* 15ms max */ #define PD_T_SOURCE_ACTIVITY 45 #define PD_T_SINK_ACTIVITY 135 -#define PD_T_SINK_WAIT_CAP 240 +#define PD_T_SINK_WAIT_CAP 310 /* 310 - 620 ms */ #define PD_T_PS_TRANSITION 500 #define PD_T_SRC_TRANSITION 35 #define PD_T_DRP_SNK 40 From 750a0d75564293be3ed50f13ef7f38ab75106421 Mon Sep 17 00:00:00 2001 From: Neil Armstrong Date: Tue, 1 Jun 2021 10:48:30 +0200 Subject: [PATCH 207/269] usb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled commit 4d2aa178d2ad2fb156711113790dde13e9aa2376 upstream. When only PHY1 is used (for example on Odroid-HC4), the regmap init code uses the usb2 ports when doesn't initialize the PHY1 regmap entry. This fixes: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... pc : regmap_update_bits_base+0x40/0xa0 lr : dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8 ... Call trace: regmap_update_bits_base+0x40/0xa0 dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8 dwc3_meson_g12a_usb2_init+0x7c/0xc8 dwc3_meson_g12a_usb_init+0x28/0x48 dwc3_meson_g12a_probe+0x298/0x540 platform_probe+0x70/0xe0 really_probe+0xf0/0x4d8 driver_probe_device+0xfc/0x168 ... Fixes: 013af227f58a97 ("usb: dwc3: meson-g12a: handle the phy and glue registers separately") Reviewed-by: Martin Blumenstingl Signed-off-by: Neil Armstrong Cc: stable Link: https://lore.kernel.org/r/20210601084830.260196-1-narmstrong@baylibre.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/dwc3-meson-g12a.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/usb/dwc3/dwc3-meson-g12a.c b/drivers/usb/dwc3/dwc3-meson-g12a.c index bdf1f98dfad8c..9c944bc6d5554 100644 --- a/drivers/usb/dwc3/dwc3-meson-g12a.c +++ b/drivers/usb/dwc3/dwc3-meson-g12a.c @@ -651,7 +651,7 @@ static int dwc3_meson_g12a_setup_regmaps(struct dwc3_meson_g12a *priv, return PTR_ERR(priv->usb_glue_regmap); /* Create a regmap for each USB2 PHY control register set */ - for (i = 0; i < priv->usb2_ports; i++) { + for (i = 0; i < priv->drvdata->num_phys; i++) { struct regmap_config u2p_regmap_config = { .reg_bits = 8, .val_bits = 32, @@ -659,6 +659,9 @@ static int dwc3_meson_g12a_setup_regmaps(struct dwc3_meson_g12a *priv, .max_register = U2P_R1, }; + if (!strstr(priv->drvdata->phy_names[i], "usb2")) + continue; + u2p_regmap_config.name = devm_kasprintf(priv->dev, GFP_KERNEL, "u2p-%d", i); if (!u2p_regmap_config.name) From 2af93b437a61e80df55aa2d321e931d124a93ef7 Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Fri, 21 May 2021 18:55:50 +0200 Subject: [PATCH 208/269] usb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe commit 1d0d3d818eafe1963ec1eaf302175cd14938188e upstream. If an error occurs after a successful 'regulator_enable()' call, 'regulator_disable()' must be called. Fix the error handling path of the probe accordingly. The remove function doesn't need to be fixed, because the 'regulator_disable()' call is already hidden in 'dwc3_meson_g12a_suspend()' which is called via 'pm_runtime_set_suspended()' in the remove function. Fixes: c99993376f72 ("usb: dwc3: Add Amlogic G12A DWC3 glue") Reviewed-by: Martin Blumenstingl Acked-by: Neil Armstrong Signed-off-by: Christophe JAILLET Link: https://lore.kernel.org/r/79df054046224bbb0716a8c5c2082650290eec86.1621616013.git.christophe.jaillet@wanadoo.fr Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/dwc3-meson-g12a.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/usb/dwc3/dwc3-meson-g12a.c b/drivers/usb/dwc3/dwc3-meson-g12a.c index 9c944bc6d5554..ffe301d6ea359 100644 --- a/drivers/usb/dwc3/dwc3-meson-g12a.c +++ b/drivers/usb/dwc3/dwc3-meson-g12a.c @@ -775,13 +775,13 @@ static int dwc3_meson_g12a_probe(struct platform_device *pdev) ret = priv->drvdata->usb_init(priv); if (ret) - goto err_disable_clks; + goto err_disable_regulator; /* Init PHYs */ for (i = 0 ; i < PHY_COUNT ; ++i) { ret = phy_init(priv->phys[i]); if (ret) - goto err_disable_clks; + goto err_disable_regulator; } /* Set PHY Power */ @@ -819,6 +819,10 @@ static int dwc3_meson_g12a_probe(struct platform_device *pdev) for (i = 0 ; i < PHY_COUNT ; ++i) phy_exit(priv->phys[i]); +err_disable_regulator: + if (priv->vbus) + regulator_disable(priv->vbus); + err_disable_clks: clk_bulk_disable_unprepare(priv->drvdata->num_clks, priv->drvdata->clks); From 851dee5a5da56564a70290713aee665403bb0b24 Mon Sep 17 00:00:00 2001 From: Jack Pham Date: Fri, 28 May 2021 09:04:05 -0700 Subject: [PATCH 209/269] usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL commit 03715ea2e3dbbc56947137ce3b4ac18a726b2f87 upstream. There exists a possible scenario in which dwc3_gadget_init() can fail: during during host -> peripheral mode switch in dwc3_set_mode(), and a pending gadget driver fails to bind. Then, if the DRD undergoes another mode switch from peripheral->host the resulting dwc3_gadget_exit() will attempt to reference an invalid and dangling dwc->gadget pointer as well as call dma_free_coherent() on unmapped DMA pointers. The exact scenario can be reproduced as follows: - Start DWC3 in peripheral mode - Configure ConfigFS gadget with FunctionFS instance (or use g_ffs) - Run FunctionFS userspace application (open EPs, write descriptors, etc) - Bind gadget driver to DWC3's UDC - Switch DWC3 to host mode => dwc3_gadget_exit() is called. usb_del_gadget() will put the ConfigFS driver instance on the gadget_driver_pending_list - Stop FunctionFS application (closes the ep files) - Switch DWC3 to peripheral mode => dwc3_gadget_init() fails as usb_add_gadget() calls check_pending_gadget_drivers() and attempts to rebind the UDC to the ConfigFS gadget but fails with -19 (-ENODEV) because the FFS instance is not in FFS_ACTIVE state (userspace has not re-opened and written the descriptors yet, i.e. desc_ready!=0). - Switch DWC3 back to host mode => dwc3_gadget_exit() is called again, but this time dwc->gadget is invalid. Although it can be argued that userspace should take responsibility for ensuring that the FunctionFS application be ready prior to allowing the composite driver bind to the UDC, failure to do so should not result in a panic from the kernel driver. Fix this by setting dwc->gadget to NULL in the failure path of dwc3_gadget_init() and add a check to dwc3_gadget_exit() to bail out unless the gadget pointer is valid. Fixes: e81a7018d93a ("usb: dwc3: allocate gadget structure dynamically") Cc: Reviewed-by: Peter Chen Signed-off-by: Jack Pham Link: https://lore.kernel.org/r/20210528160405.17550-1-jackp@codeaurora.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/gadget.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index ead877e7c87f9..10c6d1711ef75 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -3936,6 +3936,7 @@ int dwc3_gadget_init(struct dwc3 *dwc) dwc3_gadget_free_endpoints(dwc); err4: usb_put_gadget(dwc->gadget); + dwc->gadget = NULL; err3: dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce, dwc->bounce_addr); @@ -3955,6 +3956,9 @@ int dwc3_gadget_init(struct dwc3 *dwc) void dwc3_gadget_exit(struct dwc3 *dwc) { + if (!dwc->gadget) + return; + usb_del_gadget(dwc->gadget); dwc3_gadget_free_endpoints(dwc); usb_put_gadget(dwc->gadget); From 470403639114895e2697c766fbe17be8d0e9b67a Mon Sep 17 00:00:00 2001 From: Marian-Cristian Rotariu Date: Tue, 8 Jun 2021 19:26:50 +0300 Subject: [PATCH 210/269] usb: dwc3: ep0: fix NULL pointer exception commit d00889080ab60051627dab1d85831cd9db750e2a upstream. There is no validation of the index from dwc3_wIndex_to_dep() and we might be referring a non-existing ep and trigger a NULL pointer exception. In certain configurations we might use fewer eps and the index might wrongly indicate a larger ep index than existing. By adding this validation from the patch we can actually report a wrong index back to the caller. In our usecase we are using a composite device on an older kernel, but upstream might use this fix also. Unfortunately, I cannot describe the hardware for others to reproduce the issue as it is a proprietary implementation. [ 82.958261] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a4 [ 82.966891] Mem abort info: [ 82.969663] ESR = 0x96000006 [ 82.972703] Exception class = DABT (current EL), IL = 32 bits [ 82.978603] SET = 0, FnV = 0 [ 82.981642] EA = 0, S1PTW = 0 [ 82.984765] Data abort info: [ 82.987631] ISV = 0, ISS = 0x00000006 [ 82.991449] CM = 0, WnR = 0 [ 82.994409] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c6210ccc [ 83.000999] [00000000000000a4] pgd=0000000053aa5003, pud=0000000053aa5003, pmd=0000000000000000 [ 83.009685] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 83.026433] Process irq/62-dwc3 (pid: 303, stack limit = 0x000000003985154c) [ 83.033470] CPU: 0 PID: 303 Comm: irq/62-dwc3 Not tainted 4.19.124 #1 [ 83.044836] pstate: 60000085 (nZCv daIf -PAN -UAO) [ 83.049628] pc : dwc3_ep0_handle_feature+0x414/0x43c [ 83.054558] lr : dwc3_ep0_interrupt+0x3b4/0xc94 ... [ 83.141788] Call trace: [ 83.144227] dwc3_ep0_handle_feature+0x414/0x43c [ 83.148823] dwc3_ep0_interrupt+0x3b4/0xc94 [ 83.181546] ---[ end trace aac6b5267d84c32f ]--- Signed-off-by: Marian-Cristian Rotariu Cc: stable Link: https://lore.kernel.org/r/20210608162650.58426-1-marian.c.rotariu@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/ep0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c index 8b668ef46f7f1..3cd2942643725 100644 --- a/drivers/usb/dwc3/ep0.c +++ b/drivers/usb/dwc3/ep0.c @@ -292,6 +292,9 @@ static struct dwc3_ep *dwc3_wIndex_to_dep(struct dwc3 *dwc, __le16 wIndex_le) epnum |= 1; dep = dwc->eps[epnum]; + if (dep == NULL) + return NULL; + if (dep->flags & DWC3_EP_ENABLED) return dep; From 199af8a06de291d9f345f26a4e5fe2bc98cce262 Mon Sep 17 00:00:00 2001 From: Thomas Petazzoni Date: Fri, 28 May 2021 16:04:46 +0200 Subject: [PATCH 211/269] usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling commit b65ba0c362be665192381cc59e3ac3ef6f0dd1e1 upstream. In commit 92af4fc6ec33 ("usb: musb: Fix suspend with devices connected for a64"), the logic to support the MUSB_QUIRK_B_DISCONNECT_99 quirk was modified to only conditionally schedule the musb->irq_work delayed work. This commit badly breaks ECM Gadget on AM335X. Indeed, with this commit, one can observe massive packet loss: $ ping 192.168.0.100 ... 15 packets transmitted, 3 received, 80% packet loss, time 14316ms Reverting this commit brings back a properly functioning ECM Gadget. An analysis of the commit seems to indicate that a mistake was made: the previous code was not falling through into the MUSB_QUIRK_B_INVALID_VBUS_91, but now it is, unless the condition is taken. Changing the logic to be as it was before the problematic commit *and* only conditionally scheduling musb->irq_work resolves the regression: $ ping 192.168.0.100 ... 64 packets transmitted, 64 received, 0% packet loss, time 64475ms Fixes: 92af4fc6ec33 ("usb: musb: Fix suspend with devices connected for a64") Cc: stable@vger.kernel.org Tested-by: Alexandre Belloni Tested-by: Drew Fustini Acked-by: Tony Lindgren Signed-off-by: Thomas Petazzoni Link: https://lore.kernel.org/r/20210528140446.278076-1-thomas.petazzoni@bootlin.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/musb/musb_core.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c index 8f09a387b7738..4c8f0112481f3 100644 --- a/drivers/usb/musb/musb_core.c +++ b/drivers/usb/musb/musb_core.c @@ -2009,9 +2009,8 @@ static void musb_pm_runtime_check_session(struct musb *musb) schedule_delayed_work(&musb->irq_work, msecs_to_jiffies(1000)); musb->quirk_retries--; - break; } - fallthrough; + break; case MUSB_QUIRK_B_INVALID_VBUS_91: if (musb->quirk_retries && !musb->flush_irq_work) { musb_dbg(musb, From 7cee4344cbb30f07281b920b4b7bd9bee2c21c9d Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Wed, 9 Jun 2021 20:22:02 +0300 Subject: [PATCH 212/269] usb: typec: wcove: Use LE to CPU conversion when accessing msg->header commit d5ab95da2a41567440097c277c5771ad13928dad upstream. As LKP noticed the Sparse is not happy about strict type handling: .../typec/tcpm/wcove.c:380:50: sparse: expected unsigned short [usertype] header .../typec/tcpm/wcove.c:380:50: sparse: got restricted __le16 const [usertype] header Fix this by switching to use pd_header_cnt_le() instead of pd_header_cnt() in the affected code. Fixes: ae8a2ca8a221 ("usb: typec: Group all TCPCI/TCPM code together") Fixes: 3c4fb9f16921 ("usb: typec: wcove: start using tcpm for USB PD support") Reported-by: kernel test robot Reviewed-by: Heikki Krogerus Reviewed-by: Guenter Roeck Signed-off-by: Andy Shevchenko Link: https://lore.kernel.org/r/20210609172202.83377-1-andriy.shevchenko@linux.intel.com Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/tcpm/wcove.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/tcpm/wcove.c b/drivers/usb/typec/tcpm/wcove.c index 9b745f432c910..7e9c279bf49df 100644 --- a/drivers/usb/typec/tcpm/wcove.c +++ b/drivers/usb/typec/tcpm/wcove.c @@ -377,7 +377,7 @@ static int wcove_pd_transmit(struct tcpc_dev *tcpc, const u8 *data = (void *)msg; int i; - for (i = 0; i < pd_header_cnt(msg->header) * 4 + 2; i++) { + for (i = 0; i < pd_header_cnt_le(msg->header) * 4 + 2; i++) { ret = regmap_write(wcove->regmap, USBC_TX_DATA + i, data[i]); if (ret) From 572de10087a9f1a3475bede9ca54326d41c9523c Mon Sep 17 00:00:00 2001 From: Mayank Rana Date: Wed, 9 Jun 2021 00:35:35 -0700 Subject: [PATCH 213/269] usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path commit f247f0a82a4f8c3bfed178d8fd9e069d1424ee4e upstream. If ucsi_init() fails for some reason (e.g. ucsi_register_port() fails or general communication failure to the PPM), particularly at any point after the GET_CAPABILITY command had been issued, this results in unwinding the initialization and returning an error. However the ucsi structure's ucsi_capability member retains its current value, including likely a non-zero num_connectors. And because ucsi_init() itself is done in a workqueue a UCSI interface driver will be unaware that it failed and may think the ucsi_register() call was completely successful. Later, if ucsi_unregister() is called, due to this stale ucsi->cap value it would try to access the items in the ucsi->connector array which might not be in a proper state or not even allocated at all and results in NULL or invalid pointer dereference. Fix this by clearing the ucsi->cap value to 0 during the error path of ucsi_init() in order to prevent a later ucsi_unregister() from entering the connector cleanup loop. Fixes: c1b0bc2dabfa ("usb: typec: Add support for UCSI interface") Cc: stable@vger.kernel.org Acked-by: Heikki Krogerus Signed-off-by: Mayank Rana Signed-off-by: Jack Pham Link: https://lore.kernel.org/r/20210609073535.5094-1-jackp@codeaurora.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/ucsi/ucsi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index b4615bb5daab8..310b5caeb05ae 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1118,6 +1118,7 @@ static int ucsi_init(struct ucsi *ucsi) } err_reset: + memset(&ucsi->cap, 0, sizeof(ucsi->cap)); ucsi_reset_ppm(ucsi); err: return ret; From 6900ef1b1095e2ffa6538895017a5408e4706e34 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Mon, 7 Jun 2021 23:50:05 +0300 Subject: [PATCH 214/269] usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe() commit 1a85b350a7741776a406005b943e3dec02c424ed upstream. device_get_next_child_node() bumps a reference counting of a returned variable. We have to balance it whenever we return to the caller. Fixes: 6701adfa9693 ("usb: typec: driver for Intel PMC mux control") Cc: Heikki Krogerus Reviewed-by: Heikki Krogerus Signed-off-by: Andy Shevchenko Cc: stable Link: https://lore.kernel.org/r/20210607205007.71458-1-andy.shevchenko@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/mux/intel_pmc_mux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/typec/mux/intel_pmc_mux.c b/drivers/usb/typec/mux/intel_pmc_mux.c index 17896bd87fc3f..f6eaa7e30d541 100644 --- a/drivers/usb/typec/mux/intel_pmc_mux.c +++ b/drivers/usb/typec/mux/intel_pmc_mux.c @@ -623,8 +623,10 @@ static int pmc_usb_probe(struct platform_device *pdev) break; ret = pmc_usb_register_port(pmc, i, fwnode); - if (ret) + if (ret) { + fwnode_handle_put(fwnode); goto err_remove_ports; + } } platform_set_drvdata(pdev, pmc); From 0b3bb7950e73e46e7a697f35f893d115c55760ec Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Mon, 7 Jun 2021 23:50:06 +0300 Subject: [PATCH 215/269] usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource() commit 843fabdd7623271330af07f1b7fbd7fabe33c8de upstream. devm_ioremap_resource() can return an error, add missed check for it. Fixes: 43d596e32276 ("usb: typec: intel_pmc_mux: Check the port status before connect") Reviewed-by: Heikki Krogerus Signed-off-by: Andy Shevchenko Cc: stable Link: https://lore.kernel.org/r/20210607205007.71458-2-andy.shevchenko@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/mux/intel_pmc_mux.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/usb/typec/mux/intel_pmc_mux.c b/drivers/usb/typec/mux/intel_pmc_mux.c index f6eaa7e30d541..acdef6fbb85e0 100644 --- a/drivers/usb/typec/mux/intel_pmc_mux.c +++ b/drivers/usb/typec/mux/intel_pmc_mux.c @@ -573,6 +573,11 @@ static int pmc_usb_probe_iom(struct pmc_usb *pmc) return -ENOMEM; } + if (IS_ERR(pmc->iom_base)) { + put_device(&adev->dev); + return PTR_ERR(pmc->iom_base); + } + pmc->iom_adev = adev; return 0; From 5cead896962d8b25dee8a8efc85b076572732b86 Mon Sep 17 00:00:00 2001 From: Wesley Cheng Date: Fri, 21 May 2021 17:44:21 -0700 Subject: [PATCH 216/269] usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind commit 6fc1db5e6211e30fbb1cee8d7925d79d4ed2ae14 upstream. During unbind, ffs_func_eps_disable() will be executed, resulting in completion callbacks for any pending USB requests. When using AIO, irrespective of the completion status, io_data work is queued to io_completion_wq to evaluate and handle the completed requests. Since work runs asynchronously to the unbind() routine, there can be a scenario where the work runs after the USB gadget has been fully removed, resulting in accessing of a resource which has been already freed. (i.e. usb_ep_free_request() accessing the USB ep structure) Explicitly drain the io_completion_wq, instead of relying on the destroy_workqueue() (in ffs_data_put()) to make sure no pending completion work items are running. Signed-off-by: Wesley Cheng Cc: stable Link: https://lore.kernel.org/r/1621644261-1236-1-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_fs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index ffe67d836b0ce..7df180b110afc 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -3566,6 +3566,9 @@ static void ffs_func_unbind(struct usb_configuration *c, ffs->func = NULL; } + /* Drain any pending AIO completions */ + drain_workqueue(ffs->io_completion_wq); + if (!--opts->refcnt) functionfs_unbind(ffs); From 1e2d41c17f437c0814a9624153c94e533055279e Mon Sep 17 00:00:00 2001 From: George McCollister Date: Thu, 3 Jun 2021 19:32:08 -0500 Subject: [PATCH 217/269] USB: serial: ftdi_sio: add NovaTech OrionMX product ID commit bc96c72df33ee81b24d87eab953c73f7bcc04f29 upstream. Add PID for the NovaTech OrionMX so it can be automatically detected. Signed-off-by: George McCollister Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/usb/serial/ftdi_sio.c | 1 + drivers/usb/serial/ftdi_sio_ids.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index 7c64b6ee5c194..1aef9b1e1c4eb 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -611,6 +611,7 @@ static const struct usb_device_id id_table_combined[] = { .driver_info = (kernel_ulong_t)&ftdi_jtag_quirk }, { USB_DEVICE(FTDI_VID, FTDI_NT_ORIONLX_PLUS_PID) }, { USB_DEVICE(FTDI_VID, FTDI_NT_ORION_IO_PID) }, + { USB_DEVICE(FTDI_VID, FTDI_NT_ORIONMX_PID) }, { USB_DEVICE(FTDI_VID, FTDI_SYNAPSE_SS200_PID) }, { USB_DEVICE(FTDI_VID, FTDI_CUSTOMWARE_MINIPLEX_PID) }, { USB_DEVICE(FTDI_VID, FTDI_CUSTOMWARE_MINIPLEX2_PID) }, diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h index d854e04a4286e..add602bebd820 100644 --- a/drivers/usb/serial/ftdi_sio_ids.h +++ b/drivers/usb/serial/ftdi_sio_ids.h @@ -581,6 +581,7 @@ #define FTDI_NT_ORIONLXM_PID 0x7c90 /* OrionLXm Substation Automation Platform */ #define FTDI_NT_ORIONLX_PLUS_PID 0x7c91 /* OrionLX+ Substation Automation Platform */ #define FTDI_NT_ORION_IO_PID 0x7c92 /* Orion I/O */ +#define FTDI_NT_ORIONMX_PID 0x7c93 /* OrionMX */ /* * Synapse Wireless product ids (FTDI_VID) From ef91a6bd946937c4db1e0a62b2c026811e3934a2 Mon Sep 17 00:00:00 2001 From: Alexandre GRIVEAUX Date: Sun, 23 May 2021 18:35:21 +0200 Subject: [PATCH 218/269] USB: serial: omninet: add device id for Zyxel Omni 56K Plus commit fc0b3dc9a11771c3919eaaaf9d649138b095aa0f upstream. Add device id for Zyxel Omni 56K Plus modem, this modem include: USB chip: NetChip NET2888 Main chip: 901041A F721501APGF Another modem using the same chips is the Zyxel Omni 56K DUO/NEO, could be added with the right USB ID. Signed-off-by: Alexandre GRIVEAUX Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/usb/serial/omninet.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/serial/omninet.c b/drivers/usb/serial/omninet.c index 5b6e982a9376b..ff02eff704162 100644 --- a/drivers/usb/serial/omninet.c +++ b/drivers/usb/serial/omninet.c @@ -26,6 +26,7 @@ #define ZYXEL_VENDOR_ID 0x0586 #define ZYXEL_OMNINET_ID 0x1000 +#define ZYXEL_OMNI_56K_PLUS_ID 0x1500 /* This one seems to be a re-branded ZyXEL device */ #define BT_IGNITIONPRO_ID 0x2000 @@ -40,6 +41,7 @@ static int omninet_port_remove(struct usb_serial_port *port); static const struct usb_device_id id_table[] = { { USB_DEVICE(ZYXEL_VENDOR_ID, ZYXEL_OMNINET_ID) }, + { USB_DEVICE(ZYXEL_VENDOR_ID, ZYXEL_OMNI_56K_PLUS_ID) }, { USB_DEVICE(ZYXEL_VENDOR_ID, BT_IGNITIONPRO_ID) }, { } /* Terminating entry */ }; From 4fa815beeaf04f46a92897898e9175d7cc6057bc Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Mon, 24 May 2021 11:17:05 +0200 Subject: [PATCH 219/269] USB: serial: quatech2: fix control-request directions commit eb8dbe80326c3d44c1e38ee4f40e0d8d3e06f2d0 upstream. The direction of the pipe argument must match the request-type direction bit or control requests may fail depending on the host-controller-driver implementation. Fix the three requests which erroneously used usb_rcvctrlpipe(). Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver") Cc: stable@vger.kernel.org # 3.5 Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/usb/serial/quatech2.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/usb/serial/quatech2.c b/drivers/usb/serial/quatech2.c index 872d1bc86ab43..a2c3c0944f996 100644 --- a/drivers/usb/serial/quatech2.c +++ b/drivers/usb/serial/quatech2.c @@ -416,7 +416,7 @@ static void qt2_close(struct usb_serial_port *port) /* flush the port transmit buffer */ i = usb_control_msg(serial->dev, - usb_rcvctrlpipe(serial->dev, 0), + usb_sndctrlpipe(serial->dev, 0), QT2_FLUSH_DEVICE, 0x40, 1, port_priv->device_port, NULL, 0, QT2_USB_TIMEOUT); @@ -426,7 +426,7 @@ static void qt2_close(struct usb_serial_port *port) /* flush the port receive buffer */ i = usb_control_msg(serial->dev, - usb_rcvctrlpipe(serial->dev, 0), + usb_sndctrlpipe(serial->dev, 0), QT2_FLUSH_DEVICE, 0x40, 0, port_priv->device_port, NULL, 0, QT2_USB_TIMEOUT); @@ -654,7 +654,7 @@ static int qt2_attach(struct usb_serial *serial) int status; /* power on unit */ - status = usb_control_msg(serial->dev, usb_rcvctrlpipe(serial->dev, 0), + status = usb_control_msg(serial->dev, usb_sndctrlpipe(serial->dev, 0), 0xc2, 0x40, 0x8000, 0, NULL, 0, QT2_USB_TIMEOUT); if (status < 0) { From 21bee94fb9e4ace33f5fef5645b8a134efa35043 Mon Sep 17 00:00:00 2001 From: Stefan Agner Date: Fri, 28 May 2021 22:39:31 +0200 Subject: [PATCH 220/269] USB: serial: cp210x: fix alternate function for CP2102N QFN20 commit 6f7ec77cc8b64ff5037c1945e4650c65c458037d upstream. The QFN20 part has a different GPIO/port function assignment. The configuration struct bit field ordered as TX/RX/RS485/WAKEUP/CLK which exactly matches GPIO0-3 for QFN24/28. However, QFN20 has a different GPIO to primary function assignment. Special case QFN20 to follow to properly detect which GPIOs are available. Signed-off-by: Stefan Agner Link: https://lore.kernel.org/r/51830b2b24118eb0f77c5c9ac64ffb2f519dbb1d.1622218300.git.stefan@agner.ch Fixes: c8acfe0aadbe ("USB: serial: cp210x: implement GPIO support for CP2102N") Cc: stable@vger.kernel.org # 4.19 Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/usb/serial/cp210x.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index b5f4e584f3c9e..28a728f883bc5 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -533,6 +533,12 @@ struct cp210x_single_port_config { #define CP210X_2NCONFIG_GPIO_RSTLATCH_IDX 587 #define CP210X_2NCONFIG_GPIO_CONTROL_IDX 600 +/* CP2102N QFN20 port configuration values */ +#define CP2102N_QFN20_GPIO2_TXLED_MODE BIT(2) +#define CP2102N_QFN20_GPIO3_RXLED_MODE BIT(3) +#define CP2102N_QFN20_GPIO1_RS485_MODE BIT(4) +#define CP2102N_QFN20_GPIO0_CLK_MODE BIT(6) + /* CP210X_VENDOR_SPECIFIC, CP210X_WRITE_LATCH call writes these 0x2 bytes. */ struct cp210x_gpio_write { u8 mask; @@ -1884,7 +1890,19 @@ static int cp2102n_gpioconf_init(struct usb_serial *serial) priv->gpio_pushpull = (gpio_pushpull >> 3) & 0x0f; /* 0 indicates GPIO mode, 1 is alternate function */ - priv->gpio_altfunc = (gpio_ctrl >> 2) & 0x0f; + if (priv->partnum == CP210X_PARTNUM_CP2102N_QFN20) { + /* QFN20 is special... */ + if (gpio_ctrl & CP2102N_QFN20_GPIO0_CLK_MODE) /* GPIO 0 */ + priv->gpio_altfunc |= BIT(0); + if (gpio_ctrl & CP2102N_QFN20_GPIO1_RS485_MODE) /* GPIO 1 */ + priv->gpio_altfunc |= BIT(1); + if (gpio_ctrl & CP2102N_QFN20_GPIO2_TXLED_MODE) /* GPIO 2 */ + priv->gpio_altfunc |= BIT(2); + if (gpio_ctrl & CP2102N_QFN20_GPIO3_RXLED_MODE) /* GPIO 3 */ + priv->gpio_altfunc |= BIT(3); + } else { + priv->gpio_altfunc = (gpio_ctrl >> 2) & 0x0f; + } if (priv->partnum == CP210X_PARTNUM_CP2102N_QFN28) { /* From 6bf8ff7d05204f0ec1842016280fedc00dd6b4d1 Mon Sep 17 00:00:00 2001 From: Linyu Yuan Date: Wed, 9 Jun 2021 07:35:47 +0800 Subject: [PATCH 221/269] usb: gadget: eem: fix wrong eem header operation commit 305f670846a31a261462577dd0b967c4fa796871 upstream. when skb_clone() or skb_copy_expand() fail, it should pull skb with lengh indicated by header, or not it will read network data and check it as header. Cc: Signed-off-by: Linyu Yuan Link: https://lore.kernel.org/r/20210608233547.3767-1-linyyuan@codeaurora.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_eem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/gadget/function/f_eem.c b/drivers/usb/gadget/function/f_eem.c index cfcc4e81fb776..28dd5f1fd1061 100644 --- a/drivers/usb/gadget/function/f_eem.c +++ b/drivers/usb/gadget/function/f_eem.c @@ -495,7 +495,7 @@ static int eem_unwrap(struct gether *port, skb2 = skb_clone(skb, GFP_ATOMIC); if (unlikely(!skb2)) { DBG(cdev, "unable to unframe EEM packet\n"); - continue; + goto next; } skb_trim(skb2, len - ETH_FCS_LEN); @@ -505,7 +505,7 @@ static int eem_unwrap(struct gether *port, GFP_ATOMIC); if (unlikely(!skb3)) { dev_kfree_skb_any(skb2); - continue; + goto next; } dev_kfree_skb_any(skb2); skb_queue_tail(list, skb3); From 4b289a0f3033f465b4fd51ba995251a7867a2aa2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Maciej=20=C5=BBenczykowski?= Date: Mon, 7 Jun 2021 21:41:41 -0700 Subject: [PATCH 222/269] usb: fix various gadgets null ptr deref on 10gbps cabling. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 90c4d05780d47e14a50e11a7f17373104cd47d25 upstream. This avoids a null pointer dereference in f_{ecm,eem,hid,loopback,printer,rndis,serial,sourcesink,subset,tcm} by simply reusing the 5gbps config for 10gbps. Fixes: eaef50c76057 ("usb: gadget: Update usb_assign_descriptors for SuperSpeedPlus") Cc: Christophe JAILLET Cc: Felipe Balbi Cc: Gustavo A. R. Silva Cc: Lorenzo Colitti Cc: Martin K. Petersen Cc: Michael R Sweet Cc: Mike Christie Cc: Pawel Laszczak Cc: Peter Chen Cc: Sudhakar Panneerselvam Cc: Wei Ming Chen Cc: Will McVicker Cc: Zqiang Reviewed-By: Lorenzo Colitti Cc: stable Signed-off-by: Maciej Żenczykowski Link: https://lore.kernel.org/r/20210608044141.3898496-1-zenczykowski@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/function/f_ecm.c | 2 +- drivers/usb/gadget/function/f_eem.c | 2 +- drivers/usb/gadget/function/f_hid.c | 3 ++- drivers/usb/gadget/function/f_loopback.c | 2 +- drivers/usb/gadget/function/f_printer.c | 3 ++- drivers/usb/gadget/function/f_rndis.c | 2 +- drivers/usb/gadget/function/f_serial.c | 2 +- drivers/usb/gadget/function/f_sourcesink.c | 3 ++- drivers/usb/gadget/function/f_subset.c | 2 +- drivers/usb/gadget/function/f_tcm.c | 3 ++- 10 files changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/usb/gadget/function/f_ecm.c b/drivers/usb/gadget/function/f_ecm.c index 7f5cf488b2b1e..ffe2486fce71c 100644 --- a/drivers/usb/gadget/function/f_ecm.c +++ b/drivers/usb/gadget/function/f_ecm.c @@ -791,7 +791,7 @@ ecm_bind(struct usb_configuration *c, struct usb_function *f) fs_ecm_notify_desc.bEndpointAddress; status = usb_assign_descriptors(f, ecm_fs_function, ecm_hs_function, - ecm_ss_function, NULL); + ecm_ss_function, ecm_ss_function); if (status) goto fail; diff --git a/drivers/usb/gadget/function/f_eem.c b/drivers/usb/gadget/function/f_eem.c index 28dd5f1fd1061..2cd9942707b46 100644 --- a/drivers/usb/gadget/function/f_eem.c +++ b/drivers/usb/gadget/function/f_eem.c @@ -302,7 +302,7 @@ static int eem_bind(struct usb_configuration *c, struct usb_function *f) eem_ss_out_desc.bEndpointAddress = eem_fs_out_desc.bEndpointAddress; status = usb_assign_descriptors(f, eem_fs_function, eem_hs_function, - eem_ss_function, NULL); + eem_ss_function, eem_ss_function); if (status) goto fail; diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c index 1125f4715830d..e556993081170 100644 --- a/drivers/usb/gadget/function/f_hid.c +++ b/drivers/usb/gadget/function/f_hid.c @@ -802,7 +802,8 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f) hidg_fs_out_ep_desc.bEndpointAddress; status = usb_assign_descriptors(f, hidg_fs_descriptors, - hidg_hs_descriptors, hidg_ss_descriptors, NULL); + hidg_hs_descriptors, hidg_ss_descriptors, + hidg_ss_descriptors); if (status) goto fail; diff --git a/drivers/usb/gadget/function/f_loopback.c b/drivers/usb/gadget/function/f_loopback.c index 1803646b36780..90215a81c178b 100644 --- a/drivers/usb/gadget/function/f_loopback.c +++ b/drivers/usb/gadget/function/f_loopback.c @@ -207,7 +207,7 @@ static int loopback_bind(struct usb_configuration *c, struct usb_function *f) ss_loop_sink_desc.bEndpointAddress = fs_loop_sink_desc.bEndpointAddress; ret = usb_assign_descriptors(f, fs_loopback_descs, hs_loopback_descs, - ss_loopback_descs, NULL); + ss_loopback_descs, ss_loopback_descs); if (ret) return ret; diff --git a/drivers/usb/gadget/function/f_printer.c b/drivers/usb/gadget/function/f_printer.c index 2f1eb2e81d306..236ecc9689985 100644 --- a/drivers/usb/gadget/function/f_printer.c +++ b/drivers/usb/gadget/function/f_printer.c @@ -1099,7 +1099,8 @@ static int printer_func_bind(struct usb_configuration *c, ss_ep_out_desc.bEndpointAddress = fs_ep_out_desc.bEndpointAddress; ret = usb_assign_descriptors(f, fs_printer_function, - hs_printer_function, ss_printer_function, NULL); + hs_printer_function, ss_printer_function, + ss_printer_function); if (ret) return ret; diff --git a/drivers/usb/gadget/function/f_rndis.c b/drivers/usb/gadget/function/f_rndis.c index 0739b05a0ef7b..ee95e8f5f9d48 100644 --- a/drivers/usb/gadget/function/f_rndis.c +++ b/drivers/usb/gadget/function/f_rndis.c @@ -789,7 +789,7 @@ rndis_bind(struct usb_configuration *c, struct usb_function *f) ss_notify_desc.bEndpointAddress = fs_notify_desc.bEndpointAddress; status = usb_assign_descriptors(f, eth_fs_function, eth_hs_function, - eth_ss_function, NULL); + eth_ss_function, eth_ss_function); if (status) goto fail; diff --git a/drivers/usb/gadget/function/f_serial.c b/drivers/usb/gadget/function/f_serial.c index e627138463504..1ed8ff0ac2d31 100644 --- a/drivers/usb/gadget/function/f_serial.c +++ b/drivers/usb/gadget/function/f_serial.c @@ -233,7 +233,7 @@ static int gser_bind(struct usb_configuration *c, struct usb_function *f) gser_ss_out_desc.bEndpointAddress = gser_fs_out_desc.bEndpointAddress; status = usb_assign_descriptors(f, gser_fs_function, gser_hs_function, - gser_ss_function, NULL); + gser_ss_function, gser_ss_function); if (status) goto fail; dev_dbg(&cdev->gadget->dev, "generic ttyGS%d: %s speed IN/%s OUT/%s\n", diff --git a/drivers/usb/gadget/function/f_sourcesink.c b/drivers/usb/gadget/function/f_sourcesink.c index ed68a4860b7d8..282737e4609ce 100644 --- a/drivers/usb/gadget/function/f_sourcesink.c +++ b/drivers/usb/gadget/function/f_sourcesink.c @@ -431,7 +431,8 @@ sourcesink_bind(struct usb_configuration *c, struct usb_function *f) ss_iso_sink_desc.bEndpointAddress = fs_iso_sink_desc.bEndpointAddress; ret = usb_assign_descriptors(f, fs_source_sink_descs, - hs_source_sink_descs, ss_source_sink_descs, NULL); + hs_source_sink_descs, ss_source_sink_descs, + ss_source_sink_descs); if (ret) return ret; diff --git a/drivers/usb/gadget/function/f_subset.c b/drivers/usb/gadget/function/f_subset.c index 4d945254905d9..51c1cae162d9b 100644 --- a/drivers/usb/gadget/function/f_subset.c +++ b/drivers/usb/gadget/function/f_subset.c @@ -358,7 +358,7 @@ geth_bind(struct usb_configuration *c, struct usb_function *f) fs_subset_out_desc.bEndpointAddress; status = usb_assign_descriptors(f, fs_eth_function, hs_eth_function, - ss_eth_function, NULL); + ss_eth_function, ss_eth_function); if (status) goto fail; diff --git a/drivers/usb/gadget/function/f_tcm.c b/drivers/usb/gadget/function/f_tcm.c index 410fa89eae8f6..5a2e9ce2bc352 100644 --- a/drivers/usb/gadget/function/f_tcm.c +++ b/drivers/usb/gadget/function/f_tcm.c @@ -2061,7 +2061,8 @@ static int tcm_bind(struct usb_configuration *c, struct usb_function *f) uasp_fs_cmd_desc.bEndpointAddress = uasp_ss_cmd_desc.bEndpointAddress; ret = usb_assign_descriptors(f, uasp_fs_function_desc, - uasp_hs_function_desc, uasp_ss_function_desc, NULL); + uasp_hs_function_desc, uasp_ss_function_desc, + uasp_ss_function_desc); if (ret) goto ep_fail; From b972eff874637402ddc4a7dd11fb22538a0b6d28 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Maciej=20=C5=BBenczykowski?= Date: Tue, 8 Jun 2021 19:44:59 -0700 Subject: [PATCH 223/269] usb: fix various gadget panics on 10gbps cabling MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 032e288097a553db5653af552dd8035cd2a0ba96 upstream. usb_assign_descriptors() is called with 5 parameters, the last 4 of which are the usb_descriptor_header for: full-speed (USB1.1 - 12Mbps [including USB1.0 low-speed @ 1.5Mbps), high-speed (USB2.0 - 480Mbps), super-speed (USB3.0 - 5Gbps), super-speed-plus (USB3.1 - 10Gbps). The differences between full/high/super-speed descriptors are usually substantial (due to changes in the maximum usb block size from 64 to 512 to 1024 bytes and other differences in the specs), while the difference between 5 and 10Gbps descriptors may be as little as nothing (in many cases the same tuning is simply good enough). However if a gadget driver calls usb_assign_descriptors() with a NULL descriptor for super-speed-plus and is then used on a max 10gbps configuration, the kernel will crash with a null pointer dereference, when a 10gbps capable device port + cable + host port combination shows up. (This wouldn't happen if the gadget max-speed was set to 5gbps, but it of course defaults to the maximum, and there's no real reason to artificially limit it) The fix is to simply use the 5gbps descriptor as the 10gbps descriptor, if a 10gbps descriptor wasn't provided. Obviously this won't fix the problem if the 5gbps descriptor is also NULL, but such cases can't be so trivially solved (and any such gadgets are unlikely to be used with USB3 ports any way). Cc: Felipe Balbi Cc: Greg Kroah-Hartman Signed-off-by: Maciej Żenczykowski Cc: stable Link: https://lore.kernel.org/r/20210609024459.1126080-1-zenczykowski@gmail.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/gadget/config.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/usb/gadget/config.c b/drivers/usb/gadget/config.c index 8bb25773b61e9..05507606b2b42 100644 --- a/drivers/usb/gadget/config.c +++ b/drivers/usb/gadget/config.c @@ -164,6 +164,14 @@ int usb_assign_descriptors(struct usb_function *f, { struct usb_gadget *g = f->config->cdev->gadget; + /* super-speed-plus descriptor falls back to super-speed one, + * if such a descriptor was provided, thus avoiding a NULL + * pointer dereference if a 5gbps capable gadget is used with + * a 10gbps capable config (device port + cable + host port) + */ + if (!ssp) + ssp = ss; + if (fs) { f->fs_descriptors = usb_copy_descriptors(fs); if (!f->fs_descriptors) From 18eaf0de50eadeeb395b83310b259b21ad8ed0a6 Mon Sep 17 00:00:00 2001 From: Li Jun Date: Wed, 2 Jun 2021 17:57:07 +0800 Subject: [PATCH 224/269] usb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port commit 3a13ff7ef4349d70d1d18378d661117dd5af8efe upstream. A pending hrtimer may expire after the kthread_worker of tcpm port is destroyed, see below kernel dump when do module unload, fix it by cancel the 2 hrtimers. [ 111.517018] Unable to handle kernel paging request at virtual address ffff8000118cb880 [ 111.518786] blk_update_request: I/O error, dev sda, sector 60061185 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 111.526594] Mem abort info: [ 111.526597] ESR = 0x96000047 [ 111.526600] EC = 0x25: DABT (current EL), IL = 32 bits [ 111.526604] SET = 0, FnV = 0 [ 111.526607] EA = 0, S1PTW = 0 [ 111.526610] Data abort info: [ 111.526612] ISV = 0, ISS = 0x00000047 [ 111.526615] CM = 0, WnR = 1 [ 111.526619] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041d75000 [ 111.526623] [ffff8000118cb880] pgd=10000001bffff003, p4d=10000001bffff003, pud=10000001bfffe003, pmd=10000001bfffa003, pte=0000000000000000 [ 111.526642] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 111.526647] Modules linked in: dwc3_imx8mp dwc3 phy_fsl_imx8mq_usb [last unloaded: tcpci] [ 111.526663] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc4-00927-gebbe9dbd802c-dirty #36 [ 111.526670] Hardware name: NXP i.MX8MPlus EVK board (DT) [ 111.526674] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO BTYPE=--) [ 111.526681] pc : queued_spin_lock_slowpath+0x1a0/0x390 [ 111.526695] lr : _raw_spin_lock_irqsave+0x88/0xb4 [ 111.526703] sp : ffff800010003e20 [ 111.526706] x29: ffff800010003e20 x28: ffff00017f380180 [ 111.537156] buffer_io_error: 6 callbacks suppressed [ 111.537162] Buffer I/O error on dev sda1, logical block 60040704, async page read [ 111.539932] x27: ffff00017f3801c0 [ 111.539938] x26: ffff800010ba2490 x25: 0000000000000000 x24: 0000000000000001 [ 111.543025] blk_update_request: I/O error, dev sda, sector 60061186 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0 [ 111.548304] [ 111.548306] x23: 00000000000000c0 x22: ffff0000c2a9f184 x21: ffff00017f380180 [ 111.551374] Buffer I/O error on dev sda1, logical block 60040705, async page read [ 111.554499] [ 111.554503] x20: ffff0000c5f14210 x19: 00000000000000c0 x18: 0000000000000000 [ 111.557391] Buffer I/O error on dev sda1, logical block 60040706, async page read [ 111.561218] [ 111.561222] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 111.564205] Buffer I/O error on dev sda1, logical block 60040707, async page read [ 111.570887] x14: 00000000000000f5 x13: 0000000000000001 x12: 0000000000000040 [ 111.570902] x11: ffff0000c05ac6d8 [ 111.583420] Buffer I/O error on dev sda1, logical block 60040708, async page read [ 111.588978] x10: 0000000000000000 x9 : 0000000000040000 [ 111.588988] x8 : 0000000000000000 [ 111.597173] Buffer I/O error on dev sda1, logical block 60040709, async page read [ 111.605766] x7 : ffff00017f384880 x6 : ffff8000118cb880 [ 111.605777] x5 : ffff00017f384880 [ 111.611094] Buffer I/O error on dev sda1, logical block 60040710, async page read [ 111.617086] x4 : 0000000000000000 x3 : ffff0000c2a9f184 [ 111.617096] x2 : ffff8000118cb880 [ 111.622242] Buffer I/O error on dev sda1, logical block 60040711, async page read [ 111.626927] x1 : ffff8000118cb880 x0 : ffff00017f384888 [ 111.626938] Call trace: [ 111.626942] queued_spin_lock_slowpath+0x1a0/0x390 [ 111.795809] kthread_queue_work+0x30/0xc0 [ 111.799828] state_machine_timer_handler+0x20/0x30 [ 111.804624] __hrtimer_run_queues+0x140/0x1e0 [ 111.808990] hrtimer_interrupt+0xec/0x2c0 [ 111.813004] arch_timer_handler_phys+0x38/0x50 [ 111.817456] handle_percpu_devid_irq+0x88/0x150 [ 111.821991] __handle_domain_irq+0x80/0xe0 [ 111.826093] gic_handle_irq+0xc0/0x140 [ 111.829848] el1_irq+0xbc/0x154 [ 111.832991] arch_cpu_idle+0x1c/0x2c [ 111.836572] default_idle_call+0x24/0x6c [ 111.840497] do_idle+0x238/0x2ac [ 111.843729] cpu_startup_entry+0x2c/0x70 [ 111.847657] rest_init+0xdc/0xec [ 111.850890] arch_call_rest_init+0x14/0x20 [ 111.854988] start_kernel+0x508/0x540 [ 111.858659] Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827) [ 111.864760] ---[ end trace 308b9a4a3dcb73ac ]--- [ 111.869381] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 111.876258] SMP: stopping secondary CPUs [ 111.880185] Kernel Offset: disabled [ 111.883673] CPU features: 0x00001001,20000846 [ 111.888031] Memory Limit: none [ 111.891090] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- Fixes: 3ed8e1c2ac99 ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events") Cc: stable Reviewed-by: Guenter Roeck Signed-off-by: Li Jun Link: https://lore.kernel.org/r/1622627829-11070-1-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/tcpm/tcpm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index bdbd346dc59ff..4395220a97dc0 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -5187,6 +5187,9 @@ void tcpm_unregister_port(struct tcpm_port *port) { int i; + hrtimer_cancel(&port->vdm_state_machine_timer); + hrtimer_cancel(&port->state_machine_timer); + tcpm_reset_port(port); for (i = 0; i < ARRAY_SIZE(port->port_altmode); i++) typec_unregister_altmode(port->port_altmode[i]); From 7dcdfa28e1fef6641fb3edfb4581580179a9f7ed Mon Sep 17 00:00:00 2001 From: Li Jun Date: Wed, 2 Jun 2021 17:57:08 +0800 Subject: [PATCH 225/269] usb: typec: tcpm: cancel frs hrtimer when unregister tcpm port commit 7ade4805e296c8d1e40c842395bbe478c7210555 upstream. Like the state_machine_timer, we should also cancel possible pending frs hrtimer when unregister tcpm port. Fixes: 8dc4bd073663 ("usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)") Cc: stable Reviewed-by: Guenter Roeck Signed-off-by: Li Jun Link: https://lore.kernel.org/r/1622627829-11070-2-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/tcpm/tcpm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c index 4395220a97dc0..61929d37d7fc4 100644 --- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -5187,6 +5187,7 @@ void tcpm_unregister_port(struct tcpm_port *port) { int i; + hrtimer_cancel(&port->enable_frs_timer); hrtimer_cancel(&port->vdm_state_machine_timer); hrtimer_cancel(&port->state_machine_timer); From 06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66 Mon Sep 17 00:00:00 2001 From: Dmitry Baryshkov Date: Thu, 20 May 2021 01:12:23 +0300 Subject: [PATCH 226/269] regulator: core: resolve supply for boot-on/always-on regulators commit 98e48cd9283dbac0e1445ee780889f10b3d1db6a upstream. For the boot-on/always-on regulators the set_machine_constrainst() is called before resolving rdev->supply. Thus the code would try to enable rdev before enabling supplying regulator. Enforce resolving supply regulator before enabling rdev. Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator") Signed-off-by: Dmitry Baryshkov Link: https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/core.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 7b3de8b0b1caf..043b5f63b94a1 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1422,6 +1422,12 @@ static int set_machine_constraints(struct regulator_dev *rdev) * and we have control then make sure it is enabled. */ if (rdev->constraints->always_on || rdev->constraints->boot_on) { + /* If we want to enable this regulator, make sure that we know + * the supplying regulator. + */ + if (rdev->supply_name && !rdev->supply) + return -EPROBE_DEFER; + if (rdev->supply) { ret = regulator_enable(rdev->supply); if (ret < 0) { From e3a502abf506e5c054b61a7c524caaa48e8e8537 Mon Sep 17 00:00:00 2001 From: Dmitry Osipenko Date: Mon, 24 May 2021 01:42:42 +0300 Subject: [PATCH 227/269] regulator: max77620: Use device_set_of_node_from_dev() commit 6f55c5dd1118b3076d11d9cb17f5c5f4bc3a1162 upstream. The MAX77620 driver fails to re-probe on deferred probe because driver core tries to claim resources that are already claimed by the PINCTRL device. Use device_set_of_node_from_dev() helper which marks OF node as reused, skipping erroneous execution of pinctrl_bind_pins() for the PMIC device on the re-probe. Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator") Signed-off-by: Dmitry Osipenko Link: https://lore.kernel.org/r/20210523224243.13219-2-digetx@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/max77620-regulator.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/regulator/max77620-regulator.c b/drivers/regulator/max77620-regulator.c index 8d9731e4052bf..5c439c850d090 100644 --- a/drivers/regulator/max77620-regulator.c +++ b/drivers/regulator/max77620-regulator.c @@ -814,6 +814,13 @@ static int max77620_regulator_probe(struct platform_device *pdev) config.dev = dev; config.driver_data = pmic; + /* + * Set of_node_reuse flag to prevent driver core from attempting to + * claim any pinmux resources already claimed by the parent device. + * Otherwise PMIC driver will fail to re-probe. + */ + device_set_of_node_from_dev(&pdev->dev, pdev->dev.parent); + for (id = 0; id < MAX77620_NUM_REGS; id++) { struct regulator_dev *rdev; struct regulator_desc *rdesc; From c365ff97617cbea8b31aa340507c7dc4395da89a Mon Sep 17 00:00:00 2001 From: Matti Vaittinen Date: Wed, 2 Jun 2021 08:45:58 +0300 Subject: [PATCH 228/269] regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837 commit bc537e65b09a05923f98a31920d1ab170e648dba upstream. Changing the BD71837 voltages for other regulators except the first 4 BUCKs should be forbidden when the regulator is enabled. There may be out-of-spec voltage spikes if the voltage of these "non DVS" bucks is changed when enabled. This restriction was accidentally removed when the LDO voltage change was allowed for BD71847. (It was not noticed that the BD71837 BUCK7 used same voltage setting function as LDOs). Additionally this bug causes incorrect voltage monitoring register access. The voltage change function accidentally used for bd71837 BUCK7 is intended to only handle LDO voltage changes. A BD71847 LDO specific voltage monitoring disabling code gets executed on BD71837 and register offsets are wrongly calculated as regulator is assumed to be an LDO. Prevent the BD71837 BUCK7 voltage change when BUCK7 is enabled by using the correct voltage setting operation. Fixes: 9bcbabafa19b ("regulator: bd718x7: remove voltage change restriction from BD71847 LDOs") Signed-off-by: Matti Vaittinen Link: https://lore.kernel.org/r/bd8c00931421fafa57e3fdf46557a83075b7cc17.1622610103.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/bd718x7-regulator.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/regulator/bd718x7-regulator.c b/drivers/regulator/bd718x7-regulator.c index 3333b8905f1b7..2c097ee6cb021 100644 --- a/drivers/regulator/bd718x7-regulator.c +++ b/drivers/regulator/bd718x7-regulator.c @@ -364,7 +364,7 @@ BD718XX_OPS(bd71837_buck_regulator_ops, regulator_list_voltage_linear_range, NULL); BD718XX_OPS(bd71837_buck_regulator_nolinear_ops, regulator_list_voltage_table, - regulator_map_voltage_ascend, bd718xx_set_voltage_sel_restricted, + regulator_map_voltage_ascend, bd71837_set_voltage_sel_restricted, regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel, NULL); /* From 5a5f5cfb5f0996d65eae3cc034513d90f4be6783 Mon Sep 17 00:00:00 2001 From: Axel Lin Date: Mon, 17 May 2021 18:53:24 +0800 Subject: [PATCH 229/269] regulator: fan53880: Fix missing n_voltages setting commit 34991ee96fd8477479dd15adadceb6b28b30d9b0 upstream. Fixes: e6dea51e2d41 ("regulator: fan53880: Add initial support") Signed-off-by: Axel Lin Acked-by: Christoph Fritz Link: https://lore.kernel.org/r/20210517105325.1227393-1-axel.lin@ingics.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/fan53880.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/regulator/fan53880.c b/drivers/regulator/fan53880.c index e83eb4fb1876a..1684faf82ed25 100644 --- a/drivers/regulator/fan53880.c +++ b/drivers/regulator/fan53880.c @@ -51,6 +51,7 @@ static const struct regulator_ops fan53880_ops = { REGULATOR_LINEAR_RANGE(800000, 0xf, 0x73, 25000), \ }, \ .n_linear_ranges = 2, \ + .n_voltages = 0x74, \ .vsel_reg = FAN53880_LDO ## _num ## VOUT, \ .vsel_mask = 0x7f, \ .enable_reg = FAN53880_ENABLE, \ @@ -76,6 +77,7 @@ static const struct regulator_desc fan53880_regulators[] = { REGULATOR_LINEAR_RANGE(600000, 0x1f, 0xf7, 12500), }, .n_linear_ranges = 2, + .n_voltages = 0xf8, .vsel_reg = FAN53880_BUCKVOUT, .vsel_mask = 0x7f, .enable_reg = FAN53880_ENABLE, @@ -95,6 +97,7 @@ static const struct regulator_desc fan53880_regulators[] = { REGULATOR_LINEAR_RANGE(3000000, 0x4, 0x70, 25000), }, .n_linear_ranges = 2, + .n_voltages = 0x71, .vsel_reg = FAN53880_BOOSTVOUT, .vsel_mask = 0x7f, .enable_reg = FAN53880_ENABLE_BOOST, From 4579f65176792a54e842d28605330ef0f4916df2 Mon Sep 17 00:00:00 2001 From: Axel Lin Date: Sun, 23 May 2021 15:10:45 +0800 Subject: [PATCH 230/269] regulator: bd71828: Fix .n_voltages settings commit 4c668630bf8ea90a041fc69c9984486e0f56682d upstream. Current .n_voltages settings do not cover the latest 2 valid selectors, so it fails to set voltage for the hightest voltage support. The latest linear range has step_uV = 0, so it does not matter if we count the .n_voltages to maximum selector + 1 or the first selector of latest linear range + 1. To simplify calculating the n_voltages, let's just set the .n_voltages to maximum selector + 1. Fixes: 522498f8cb8c ("regulator: bd71828: Basic support for ROHM bd71828 PMIC regulators") Signed-off-by: Axel Lin Reviewed-by: Matti Vaittinen Link: https://lore.kernel.org/r/20210523071045.2168904-2-axel.lin@ingics.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- include/linux/mfd/rohm-bd71828.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/include/linux/mfd/rohm-bd71828.h b/include/linux/mfd/rohm-bd71828.h index 017a4c01cb315..61f0974c33d72 100644 --- a/include/linux/mfd/rohm-bd71828.h +++ b/include/linux/mfd/rohm-bd71828.h @@ -26,11 +26,11 @@ enum { BD71828_REGULATOR_AMOUNT, }; -#define BD71828_BUCK1267_VOLTS 0xEF -#define BD71828_BUCK3_VOLTS 0x10 -#define BD71828_BUCK4_VOLTS 0x20 -#define BD71828_BUCK5_VOLTS 0x10 -#define BD71828_LDO_VOLTS 0x32 +#define BD71828_BUCK1267_VOLTS 0x100 +#define BD71828_BUCK3_VOLTS 0x20 +#define BD71828_BUCK4_VOLTS 0x40 +#define BD71828_BUCK5_VOLTS 0x20 +#define BD71828_LDO_VOLTS 0x40 /* LDO6 is fixed 1.8V voltage */ #define BD71828_LDO_6_VOLTAGE 1800000 From aafc51fddfa87178eb599e68526620ac0c02dc5a Mon Sep 17 00:00:00 2001 From: Axel Lin Date: Sun, 30 May 2021 20:41:00 +0800 Subject: [PATCH 231/269] regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks commit 86ab21cc39e6b99b7065ab9008c90bec5dec535a upstream. Current code does not set .curr_table and .n_linear_ranges settings, so it cannot use the regulator_get/set_current_limit_regmap helpers. If we setup the curr_table, it will has 200 entries. Implement customized .set_current_limit/.get_current_limit callbacks instead. Fixes: b8c054a5eaf0 ("regulator: rtmv20: Adds support for Richtek RTMV20 load switch regulator") Signed-off-by: Axel Lin Reviewed-by: ChiYuan Huang Link: https://lore.kernel.org/r/20210530124101.477727-1-axel.lin@ingics.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- drivers/regulator/rtmv20-regulator.c | 42 ++++++++++++++++++++++++++-- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/drivers/regulator/rtmv20-regulator.c b/drivers/regulator/rtmv20-regulator.c index 852fb2596ffda..5adc552dffd58 100644 --- a/drivers/regulator/rtmv20-regulator.c +++ b/drivers/regulator/rtmv20-regulator.c @@ -103,9 +103,47 @@ static int rtmv20_lsw_disable(struct regulator_dev *rdev) return 0; } +static int rtmv20_lsw_set_current_limit(struct regulator_dev *rdev, int min_uA, + int max_uA) +{ + int sel; + + if (min_uA > RTMV20_LSW_MAXUA || max_uA < RTMV20_LSW_MINUA) + return -EINVAL; + + if (max_uA > RTMV20_LSW_MAXUA) + max_uA = RTMV20_LSW_MAXUA; + + sel = (max_uA - RTMV20_LSW_MINUA) / RTMV20_LSW_STEPUA; + + /* Ensure the selected setting is still in range */ + if ((sel * RTMV20_LSW_STEPUA + RTMV20_LSW_MINUA) < min_uA) + return -EINVAL; + + sel <<= ffs(rdev->desc->csel_mask) - 1; + + return regmap_update_bits(rdev->regmap, rdev->desc->csel_reg, + rdev->desc->csel_mask, sel); +} + +static int rtmv20_lsw_get_current_limit(struct regulator_dev *rdev) +{ + unsigned int val; + int ret; + + ret = regmap_read(rdev->regmap, rdev->desc->csel_reg, &val); + if (ret) + return ret; + + val &= rdev->desc->csel_mask; + val >>= ffs(rdev->desc->csel_mask) - 1; + + return val * RTMV20_LSW_STEPUA + RTMV20_LSW_MINUA; +} + static const struct regulator_ops rtmv20_regulator_ops = { - .set_current_limit = regulator_set_current_limit_regmap, - .get_current_limit = regulator_get_current_limit_regmap, + .set_current_limit = rtmv20_lsw_set_current_limit, + .get_current_limit = rtmv20_lsw_get_current_limit, .enable = rtmv20_lsw_enable, .disable = rtmv20_lsw_disable, .is_enabled = regulator_is_enabled_regmap, From cc440da4aad9dc9f2a9a9ebd9ddf3c23f36ff2fc Mon Sep 17 00:00:00 2001 From: Chen Li Date: Tue, 27 Apr 2021 15:17:45 +0800 Subject: [PATCH 232/269] phy: usb: Fix misuse of IS_ENABLED commit 7c2fc79250cafa1a29befeb60163028ec4720814 upstream. While IS_ENABLED() is perfectly fine for CONFIG_* symbols, it is not for other symbols such as __BIG_ENDIAN that is provided directly by the compiler. Switch to use CONFIG_CPU_BIG_ENDIAN instead of __BIG_ENDIAN. Signed-off-by: Chen Li Reviewed-by: Al Cooper Acked-by: Florian Fainelli Fixes: 94583a41047e ("phy: usb: Restructure in preparation for adding 7216 USB support") Link: https://lore.kernel.org/r/87czuggpra.wl-chenli@uniontech.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman --- drivers/phy/broadcom/phy-brcm-usb-init.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/phy/broadcom/phy-brcm-usb-init.h b/drivers/phy/broadcom/phy-brcm-usb-init.h index 899b9eb43fad6..a39f30fa2e991 100644 --- a/drivers/phy/broadcom/phy-brcm-usb-init.h +++ b/drivers/phy/broadcom/phy-brcm-usb-init.h @@ -78,7 +78,7 @@ static inline u32 brcm_usb_readl(void __iomem *addr) * Other architectures (e.g., ARM) either do not support big endian, or * else leave I/O in little endian mode. */ - if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(__BIG_ENDIAN)) + if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) return __raw_readl(addr); else return readl_relaxed(addr); @@ -87,7 +87,7 @@ static inline u32 brcm_usb_readl(void __iomem *addr) static inline void brcm_usb_writel(u32 val, void __iomem *addr) { /* See brcmnand_readl() comments */ - if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(__BIG_ENDIAN)) + if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) __raw_writel(val, addr); else writel_relaxed(val, addr); From 9e0677c2e39052ac20efae4474bb20614d9a88c9 Mon Sep 17 00:00:00 2001 From: Wesley Cheng Date: Thu, 20 May 2021 21:23:57 -0700 Subject: [PATCH 233/269] usb: dwc3: gadget: Disable gadget IRQ during pullup disable commit 8212937305f84ef73ea81036dafb80c557583d4b upstream. Current sequence utilizes dwc3_gadget_disable_irq() alongside synchronize_irq() to ensure that no further DWC3 events are generated. However, the dwc3_gadget_disable_irq() API only disables device specific events. Endpoint events can still be generated. Briefly disable the interrupt line, so that the cleanup code can run to prevent device and endpoint events. (i.e. __dwc3_gadget_stop() and dwc3_stop_active_transfers() respectively) Without doing so, it can lead to both the interrupt handler and the pullup disable routine both writing to the GEVNTCOUNT register, which will cause an incorrect count being read from future interrupts. Fixes: ae7e86108b12 ("usb: dwc3: Stop active transfers before halting the controller") Signed-off-by: Wesley Cheng Link: https://lore.kernel.org/r/1621571037-1424-1-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/dwc3/gadget.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 10c6d1711ef75..8bccdd7b0ca2e 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -2143,13 +2143,10 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on) } /* - * Synchronize any pending event handling before executing the controller - * halt routine. + * Synchronize and disable any further event handling while controller + * is being enabled/disabled. */ - if (!is_on) { - dwc3_gadget_disable_irq(dwc); - synchronize_irq(dwc->irq_gadget); - } + disable_irq(dwc->irq_gadget); spin_lock_irqsave(&dwc->lock, flags); @@ -2187,6 +2184,8 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on) ret = dwc3_gadget_run_stop(dwc, is_on, false); spin_unlock_irqrestore(&dwc->lock, flags); + enable_irq(dwc->irq_gadget); + pm_runtime_put(dwc->dev); return ret; From d6888929802921a3e9b38164b3ee7a464e16f820 Mon Sep 17 00:00:00 2001 From: Bjorn Andersson Date: Wed, 9 Jun 2021 17:21:32 -0700 Subject: [PATCH 234/269] usb: typec: mux: Fix copy-paste mistake in typec_mux_match commit 142d0b24c1b17139f1aaaacae7542a38aa85640f upstream. Fix the copy-paste mistake in the return path of typec_mux_match(), where dev is considered a member of struct typec_switch rather than struct typec_mux. The two structs are identical in regards to having the struct device as the first entry, so this provides no functional change. Fixes: 3370db35193b ("usb: typec: Registering real device entries for the muxes") Reviewed-by: Heikki Krogerus Signed-off-by: Bjorn Andersson Link: https://lore.kernel.org/r/20210610002132.3088083-1-bjorn.andersson@linaro.org Signed-off-by: Greg Kroah-Hartman --- drivers/usb/typec/mux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/mux.c b/drivers/usb/typec/mux.c index 42acdc8b684fe..b9035c3407b56 100644 --- a/drivers/usb/typec/mux.c +++ b/drivers/usb/typec/mux.c @@ -239,7 +239,7 @@ static void *typec_mux_match(struct fwnode_handle *fwnode, const char *id, dev = class_find_device(&typec_mux_class, NULL, fwnode, mux_fwnode_match); - return dev ? to_typec_switch(dev) : ERR_PTR(-EPROBE_DEFER); + return dev ? to_typec_mux(dev) : ERR_PTR(-EPROBE_DEFER); } /** From 5a61f69da3b8d735b01dddee72fee4671510d907 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Tue, 8 Jun 2021 23:33:18 +0200 Subject: [PATCH 235/269] drm/mcde: Fix off by 10^3 in calculation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit c8a570443943304cac2e4186dbce6989b6c2b8b5 upstream. The calclulation of how many bytes we stuff into the DSI pipeline for video mode panels is off by three orders of magnitude because we did not account for the fact that the DRM mode clock is in kilohertz rather than hertz. This used to be: drm_mode_vrefresh(mode) * mode->htotal * mode->vtotal which would become for example for s6e63m0: 60 x 514 x 831 = 25628040 Hz, but mode->clock is 25628 as it is in kHz. This affects only the Samsung GT-I8190 "Golden" phone right now since it is the only MCDE device with a video mode display. Curiously some specimen work with this code and wild settings in the EOL and empty packets at the end of the display, but I have noticed an eeire flicker until now. Others were not so lucky and got black screens. Cc: Ville Syrjälä Reported-by: Stephan Gerhold Fixes: 920dd1b1425b ("drm/mcde: Use mode->clock instead of reverse calculating it from the vrefresh") Signed-off-by: Linus Walleij Tested-by: Stephan Gerhold Reviewed-by: Stephan Gerhold Link: https://patchwork.freedesktop.org/patch/msgid/20210608213318.3897858-1-linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/mcde/mcde_dsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/mcde/mcde_dsi.c b/drivers/gpu/drm/mcde/mcde_dsi.c index b3fd3501c4127..5275b2723293b 100644 --- a/drivers/gpu/drm/mcde/mcde_dsi.c +++ b/drivers/gpu/drm/mcde/mcde_dsi.c @@ -577,7 +577,7 @@ static void mcde_dsi_setup_video_mode(struct mcde_dsi *d, * porches and sync. */ /* (ps/s) / (pixels/s) = ps/pixels */ - pclk = DIV_ROUND_UP_ULL(1000000000000, mode->clock); + pclk = DIV_ROUND_UP_ULL(1000000000000, (mode->clock * 1000)); dev_dbg(d->dev, "picoseconds between two pixels: %llu\n", pclk); From 5b7dc8329d7a51bab34452d132230fd76330a22b Mon Sep 17 00:00:00 2001 From: Jonathan Marek Date: Thu, 13 May 2021 13:13:58 -0400 Subject: [PATCH 236/269] drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650 commit b4387eaf3821a4c4241ac3a556e13244eb1fdaa5 upstream. Value was shifted in the wrong direction, resulting in the field always being zero, which is incorrect for A650. Fixes: d0bac4e9cd66 ("drm/msm/a6xx: set ubwc config for A640 and A650") Signed-off-by: Jonathan Marek Reviewed-by: Akhil P Oommen Link: https://lore.kernel.org/r/20210513171431.18632-4-jonathan@marek.ca Signed-off-by: Rob Clark Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 722c2fe3bfd56..7f7c29c67df2c 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -486,7 +486,7 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu) rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1); gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1); gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, - uavflagprd_inv >> 4 | lower_bit << 1); + uavflagprd_inv << 4 | lower_bit << 1); gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21); } From 0bc79f4b7ac60f9fd4dfd1c65ad17e5a2369e2ca Mon Sep 17 00:00:00 2001 From: Jonathan Marek Date: Thu, 13 May 2021 13:13:59 -0400 Subject: [PATCH 237/269] drm/msm/a6xx: update/fix CP_PROTECT initialization commit 408434036958699a7f50ddec984f7ba33e11a8f5 upstream. Update CP_PROTECT register programming based on downstream. A6XX_PROTECT_RW is renamed to A6XX_PROTECT_NORDWR to make things aligned and also be more clear about what it does. Note that this required switching to use the CP_ALWAYS_ON_COUNTER as the GMU counter is not accessible from the cmdstream. Which also means using the CPU counter for the msm_gpu_submit_flush() tracepoint (as catapult depends on being able to compare this to the start/end values captured in cmdstream). This may need to be revisited when IFPC is enabled. Also, compared to downstream, this opens up CP_PERFCTR_CP_SEL as the userspace performance tooling (fdperf and pps-producer) expect to be able to configure the CP counters. Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support") Signed-off-by: Jonathan Marek Reviewed-by: Akhil P Oommen Link: https://lore.kernel.org/r/20210513171431.18632-5-jonathan@marek.ca [switch to CP_ALWAYS_ON_COUNTER, open up CP_PERFCNTR_CP_SEL, and spiff up commit msg] Signed-off-by: Rob Clark Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 151 +++++++++++++++++++------- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- 2 files changed, 113 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 7f7c29c67df2c..935f1a4e86550 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -154,7 +154,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) * GPU registers so we need to add 0x1a800 to the register value on A630 * to get the right value from PM4. */ - get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800, + get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO, rbmemptr_stats(ring, index, alwayson_start)); /* Invalidate CCU depth and color */ @@ -184,7 +184,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO, rbmemptr_stats(ring, index, cpcycles_end)); - get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800, + get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO, rbmemptr_stats(ring, index, alwayson_end)); /* Write the fence to the scratch register */ @@ -203,8 +203,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) OUT_RING(ring, submit->seqno); trace_msm_gpu_submit_flush(submit, - gmu_read64(&a6xx_gpu->gmu, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L, - REG_A6XX_GMU_ALWAYS_ON_COUNTER_H)); + gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO, + REG_A6XX_CP_ALWAYS_ON_COUNTER_HI)); a6xx_flush(gpu, ring); } @@ -459,6 +459,113 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state) gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0); } +/* For a615, a616, a618, A619, a630, a640 and a680 */ +static const u32 a6xx_protect[] = { + A6XX_PROTECT_RDONLY(0x00000, 0x04ff), + A6XX_PROTECT_RDONLY(0x00501, 0x0005), + A6XX_PROTECT_RDONLY(0x0050b, 0x02f4), + A6XX_PROTECT_NORDWR(0x0050e, 0x0000), + A6XX_PROTECT_NORDWR(0x00510, 0x0000), + A6XX_PROTECT_NORDWR(0x00534, 0x0000), + A6XX_PROTECT_NORDWR(0x00800, 0x0082), + A6XX_PROTECT_NORDWR(0x008a0, 0x0008), + A6XX_PROTECT_NORDWR(0x008ab, 0x0024), + A6XX_PROTECT_RDONLY(0x008de, 0x00ae), + A6XX_PROTECT_NORDWR(0x00900, 0x004d), + A6XX_PROTECT_NORDWR(0x0098d, 0x0272), + A6XX_PROTECT_NORDWR(0x00e00, 0x0001), + A6XX_PROTECT_NORDWR(0x00e03, 0x000c), + A6XX_PROTECT_NORDWR(0x03c00, 0x00c3), + A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff), + A6XX_PROTECT_NORDWR(0x08630, 0x01cf), + A6XX_PROTECT_NORDWR(0x08e00, 0x0000), + A6XX_PROTECT_NORDWR(0x08e08, 0x0000), + A6XX_PROTECT_NORDWR(0x08e50, 0x001f), + A6XX_PROTECT_NORDWR(0x09624, 0x01db), + A6XX_PROTECT_NORDWR(0x09e70, 0x0001), + A6XX_PROTECT_NORDWR(0x09e78, 0x0187), + A6XX_PROTECT_NORDWR(0x0a630, 0x01cf), + A6XX_PROTECT_NORDWR(0x0ae02, 0x0000), + A6XX_PROTECT_NORDWR(0x0ae50, 0x032f), + A6XX_PROTECT_NORDWR(0x0b604, 0x0000), + A6XX_PROTECT_NORDWR(0x0be02, 0x0001), + A6XX_PROTECT_NORDWR(0x0be20, 0x17df), + A6XX_PROTECT_NORDWR(0x0f000, 0x0bff), + A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff), + A6XX_PROTECT_NORDWR(0x11c00, 0x0000), /* note: infinite range */ +}; + +/* These are for a620 and a650 */ +static const u32 a650_protect[] = { + A6XX_PROTECT_RDONLY(0x00000, 0x04ff), + A6XX_PROTECT_RDONLY(0x00501, 0x0005), + A6XX_PROTECT_RDONLY(0x0050b, 0x02f4), + A6XX_PROTECT_NORDWR(0x0050e, 0x0000), + A6XX_PROTECT_NORDWR(0x00510, 0x0000), + A6XX_PROTECT_NORDWR(0x00534, 0x0000), + A6XX_PROTECT_NORDWR(0x00800, 0x0082), + A6XX_PROTECT_NORDWR(0x008a0, 0x0008), + A6XX_PROTECT_NORDWR(0x008ab, 0x0024), + A6XX_PROTECT_RDONLY(0x008de, 0x00ae), + A6XX_PROTECT_NORDWR(0x00900, 0x004d), + A6XX_PROTECT_NORDWR(0x0098d, 0x0272), + A6XX_PROTECT_NORDWR(0x00e00, 0x0001), + A6XX_PROTECT_NORDWR(0x00e03, 0x000c), + A6XX_PROTECT_NORDWR(0x03c00, 0x00c3), + A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff), + A6XX_PROTECT_NORDWR(0x08630, 0x01cf), + A6XX_PROTECT_NORDWR(0x08e00, 0x0000), + A6XX_PROTECT_NORDWR(0x08e08, 0x0000), + A6XX_PROTECT_NORDWR(0x08e50, 0x001f), + A6XX_PROTECT_NORDWR(0x08e80, 0x027f), + A6XX_PROTECT_NORDWR(0x09624, 0x01db), + A6XX_PROTECT_NORDWR(0x09e60, 0x0011), + A6XX_PROTECT_NORDWR(0x09e78, 0x0187), + A6XX_PROTECT_NORDWR(0x0a630, 0x01cf), + A6XX_PROTECT_NORDWR(0x0ae02, 0x0000), + A6XX_PROTECT_NORDWR(0x0ae50, 0x032f), + A6XX_PROTECT_NORDWR(0x0b604, 0x0000), + A6XX_PROTECT_NORDWR(0x0b608, 0x0007), + A6XX_PROTECT_NORDWR(0x0be02, 0x0001), + A6XX_PROTECT_NORDWR(0x0be20, 0x17df), + A6XX_PROTECT_NORDWR(0x0f000, 0x0bff), + A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff), + A6XX_PROTECT_NORDWR(0x18400, 0x1fff), + A6XX_PROTECT_NORDWR(0x1a800, 0x1fff), + A6XX_PROTECT_NORDWR(0x1f400, 0x0443), + A6XX_PROTECT_RDONLY(0x1f844, 0x007b), + A6XX_PROTECT_NORDWR(0x1f887, 0x001b), + A6XX_PROTECT_NORDWR(0x1f8c0, 0x0000), /* note: infinite range */ +}; + +static void a6xx_set_cp_protect(struct msm_gpu *gpu) +{ + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + const u32 *regs = a6xx_protect; + unsigned i, count = ARRAY_SIZE(a6xx_protect), count_max = 32; + + BUILD_BUG_ON(ARRAY_SIZE(a6xx_protect) > 32); + BUILD_BUG_ON(ARRAY_SIZE(a650_protect) > 48); + + if (adreno_is_a650(adreno_gpu)) { + regs = a650_protect; + count = ARRAY_SIZE(a650_protect); + count_max = 48; + } + + /* + * Enable access protection to privileged registers, fault on an access + * protect violation and select the last span to protect from the start + * address all the way to the end of the register address space + */ + gpu_write(gpu, REG_A6XX_CP_PROTECT_CNTL, BIT(0) | BIT(1) | BIT(3)); + + for (i = 0; i < count - 1; i++) + gpu_write(gpu, REG_A6XX_CP_PROTECT(i), regs[i]); + /* last CP_PROTECT to have "infinite" length on the last entry */ + gpu_write(gpu, REG_A6XX_CP_PROTECT(count_max - 1), regs[i]); +} + static void a6xx_set_ubwc_config(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -722,41 +829,7 @@ static int a6xx_hw_init(struct msm_gpu *gpu) } /* Protect registers from the CP */ - gpu_write(gpu, REG_A6XX_CP_PROTECT_CNTL, 0x00000003); - - gpu_write(gpu, REG_A6XX_CP_PROTECT(0), - A6XX_PROTECT_RDONLY(0x600, 0x51)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(1), A6XX_PROTECT_RW(0xae50, 0x2)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(2), A6XX_PROTECT_RW(0x9624, 0x13)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(3), A6XX_PROTECT_RW(0x8630, 0x8)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(4), A6XX_PROTECT_RW(0x9e70, 0x1)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(5), A6XX_PROTECT_RW(0x9e78, 0x187)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(6), A6XX_PROTECT_RW(0xf000, 0x810)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(7), - A6XX_PROTECT_RDONLY(0xfc00, 0x3)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(8), A6XX_PROTECT_RW(0x50e, 0x0)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(9), A6XX_PROTECT_RDONLY(0x50f, 0x0)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(10), A6XX_PROTECT_RW(0x510, 0x0)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(11), - A6XX_PROTECT_RDONLY(0x0, 0x4f9)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(12), - A6XX_PROTECT_RDONLY(0x501, 0xa)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(13), - A6XX_PROTECT_RDONLY(0x511, 0x44)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(14), A6XX_PROTECT_RW(0xe00, 0xe)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(15), A6XX_PROTECT_RW(0x8e00, 0x0)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(16), A6XX_PROTECT_RW(0x8e50, 0xf)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(17), A6XX_PROTECT_RW(0xbe02, 0x0)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(18), - A6XX_PROTECT_RW(0xbe20, 0x11f3)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(19), A6XX_PROTECT_RW(0x800, 0x82)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(20), A6XX_PROTECT_RW(0x8a0, 0x8)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(21), A6XX_PROTECT_RW(0x8ab, 0x19)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(22), A6XX_PROTECT_RW(0x900, 0x4d)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(23), A6XX_PROTECT_RW(0x98d, 0x76)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(24), - A6XX_PROTECT_RDONLY(0x980, 0x4)); - gpu_write(gpu, REG_A6XX_CP_PROTECT(25), A6XX_PROTECT_RW(0xa630, 0x0)); + a6xx_set_cp_protect(gpu); /* Enable expanded apriv for targets that support it */ if (gpu->hw_apriv) { diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 3eeebf6a754b9..69765a722cae6 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -37,7 +37,7 @@ struct a6xx_gpu { * REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len * registers starting at _reg. */ -#define A6XX_PROTECT_RW(_reg, _len) \ +#define A6XX_PROTECT_NORDWR(_reg, _len) \ ((1 << 31) | \ (((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF)) From fd681a8c7ac8f649a0718f6cbf2fe75d0587c9a2 Mon Sep 17 00:00:00 2001 From: Jonathan Marek Date: Thu, 13 May 2021 13:14:00 -0400 Subject: [PATCH 238/269] drm/msm/a6xx: avoid shadow NULL reference in failure path commit ce86c239e4d218ae6040bec18e6d19a58edb8b7c upstream. If a6xx_hw_init() fails before creating the shadow_bo, the a6xx_pm_suspend code referencing it will crash. Change the condition to one that avoids this problem (note: creation of shadow_bo is behind this same condition) Fixes: e8b0b994c3a5 ("drm/msm/a6xx: Clear shadow on suspend") Signed-off-by: Jonathan Marek Reviewed-by: Akhil P Oommen Link: https://lore.kernel.org/r/20210513171431.18632-6-jonathan@marek.ca Signed-off-by: Rob Clark Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 935f1a4e86550..2dcbe02846cd9 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1128,7 +1128,7 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu) if (ret) return ret; - if (adreno_gpu->base.hw_apriv || a6xx_gpu->has_whereami) + if (a6xx_gpu->shadow_bo) for (i = 0; i < gpu->nr_rings; i++) a6xx_gpu->shadow[i] = 0; From 67cf4e447b5e5e9e94996cb6812ae2828e0e0e27 Mon Sep 17 00:00:00 2001 From: Kamal Heib Date: Tue, 25 May 2021 18:01:34 +0300 Subject: [PATCH 239/269] RDMA/ipoib: Fix warning caused by destroying non-initial netns commit a3e74fb9247cd530dca246699d5eb5a691884d32 upstream. After the commit 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces"), if the IPoIB device is moved to non-initial netns, destroying that netns lets the device vanish instead of moving it back to the initial netns, This is happening because default_device_exit() skips the interfaces due to having rtnl_link_ops set. Steps to reporoduce: ip netns add foo ip link set mlx5_ib0 netns foo ip netns delete foo WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50 Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d fuse CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S W 5.13.0-rc1+ #1 Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016 Workqueue: netns cleanup_net RIP: 0010:netdev_exit+0x3f/0x50 Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48 8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206 RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00 RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00 R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620 R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20 FS: 0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ops_exit_list.isra.9+0x36/0x70 cleanup_net+0x234/0x390 process_one_work+0x1cb/0x360 ? process_one_work+0x360/0x360 worker_thread+0x30/0x370 ? process_one_work+0x360/0x360 kthread+0x116/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 To avoid the above warning and later on the kernel panic that could happen on shutdown due to a NULL pointer dereference, make sure to set the netns_refund flag that was introduced by commit 3a5ca857079e ("can: dev: Move device back to init netns on owning netns delete") to properly restore the IPoIB interfaces to the initial netns. Fixes: 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces") Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib Reviewed-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman --- drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c index d5a90a66b45cf..5b05cf3837da1 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c @@ -163,6 +163,7 @@ static size_t ipoib_get_size(const struct net_device *dev) static struct rtnl_link_ops ipoib_link_ops __read_mostly = { .kind = "ipoib", + .netns_refund = true, .maxtype = IFLA_IPOIB_MAX, .policy = ipoib_policy, .priv_size = sizeof(struct ipoib_dev_priv), From cb1aa1da04882d1860f733e24aeebdbbc85724d7 Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Thu, 3 Jun 2021 16:19:39 +0300 Subject: [PATCH 240/269] RDMA/mlx4: Do not map the core_clock page to user space unless enabled commit 404e5a12691fe797486475fe28cc0b80cb8bef2c upstream. Currently when mlx4 maps the hca_core_clock page to the user space there are read-modifiable registers, one of which is semaphore, on this page as well as the clock counter. If user reads the wrong offset, it can modify the semaphore and hang the device. Do not map the hca_core_clock page to the user space unless the device has been put in a backwards compatibility mode to support this feature. After this patch, mlx4 core_clock won't be mapped to user space on the majority of existing devices and the uverbs device time feature in ibv_query_rt_values_ex() will be disabled. Fixes: 52033cfb5aab ("IB/mlx4: Add mmap call to map the hardware clock") Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com Signed-off-by: Shay Drory Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman --- drivers/infiniband/hw/mlx4/main.c | 5 +---- drivers/net/ethernet/mellanox/mlx4/fw.c | 3 +++ drivers/net/ethernet/mellanox/mlx4/fw.h | 1 + drivers/net/ethernet/mellanox/mlx4/main.c | 6 ++++++ include/linux/mlx4/device.h | 1 + 5 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index cd0fba6b09642..7b11aff8a5ea7 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -580,12 +580,9 @@ static int mlx4_ib_query_device(struct ib_device *ibdev, props->cq_caps.max_cq_moderation_count = MLX4_MAX_CQ_COUNT; props->cq_caps.max_cq_moderation_period = MLX4_MAX_CQ_PERIOD; - if (!mlx4_is_slave(dev->dev)) - err = mlx4_get_internal_clock_params(dev->dev, &clock_params); - if (uhw->outlen >= resp.response_length + sizeof(resp.hca_core_clock_offset)) { resp.response_length += sizeof(resp.hca_core_clock_offset); - if (!err && !mlx4_is_slave(dev->dev)) { + if (!mlx4_get_internal_clock_params(dev->dev, &clock_params)) { resp.comp_mask |= MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET; resp.hca_core_clock_offset = clock_params.offset % PAGE_SIZE; } diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c index f6cfec81ccc3b..dc4ac1a2b6b67 100644 --- a/drivers/net/ethernet/mellanox/mlx4/fw.c +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c @@ -823,6 +823,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) #define QUERY_DEV_CAP_MAD_DEMUX_OFFSET 0xb0 #define QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_BASE_OFFSET 0xa8 #define QUERY_DEV_CAP_DMFS_HIGH_RATE_QPN_RANGE_OFFSET 0xac +#define QUERY_DEV_CAP_MAP_CLOCK_TO_USER 0xc1 #define QUERY_DEV_CAP_QP_RATE_LIMIT_NUM_OFFSET 0xcc #define QUERY_DEV_CAP_QP_RATE_LIMIT_MAX_OFFSET 0xd0 #define QUERY_DEV_CAP_QP_RATE_LIMIT_MIN_OFFSET 0xd2 @@ -841,6 +842,8 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) if (mlx4_is_mfunc(dev)) disable_unsupported_roce_caps(outbox); + MLX4_GET(field, outbox, QUERY_DEV_CAP_MAP_CLOCK_TO_USER); + dev_cap->map_clock_to_user = field & 0x80; MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_QP_OFFSET); dev_cap->reserved_qps = 1 << (field & 0xf); MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_QP_OFFSET); diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.h b/drivers/net/ethernet/mellanox/mlx4/fw.h index 8f020f26ebf5f..cf64e54eecb05 100644 --- a/drivers/net/ethernet/mellanox/mlx4/fw.h +++ b/drivers/net/ethernet/mellanox/mlx4/fw.h @@ -131,6 +131,7 @@ struct mlx4_dev_cap { u32 health_buffer_addrs; struct mlx4_port_cap port_cap[MLX4_MAX_PORTS + 1]; bool wol_port[MLX4_MAX_PORTS + 1]; + bool map_clock_to_user; }; struct mlx4_func_cap { diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c index c326b434734e1..00c84656b2e7e 100644 --- a/drivers/net/ethernet/mellanox/mlx4/main.c +++ b/drivers/net/ethernet/mellanox/mlx4/main.c @@ -498,6 +498,7 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) } } + dev->caps.map_clock_to_user = dev_cap->map_clock_to_user; dev->caps.uar_page_size = PAGE_SIZE; dev->caps.num_uars = dev_cap->uar_size / PAGE_SIZE; dev->caps.local_ca_ack_delay = dev_cap->local_ca_ack_delay; @@ -1948,6 +1949,11 @@ int mlx4_get_internal_clock_params(struct mlx4_dev *dev, if (mlx4_is_slave(dev)) return -EOPNOTSUPP; + if (!dev->caps.map_clock_to_user) { + mlx4_dbg(dev, "Map clock to user is not supported.\n"); + return -EOPNOTSUPP; + } + if (!params) return -EINVAL; diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index 06e066e04a4bb..eb8169c03d899 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -631,6 +631,7 @@ struct mlx4_caps { bool wol_port[MLX4_MAX_PORTS + 1]; struct mlx4_rate_limit_caps rl_caps; u32 health_buffer_addrs; + bool map_clock_to_user; }; struct mlx4_buf_list { From c25ec6386a81d6d6bebd1dae320f6f7bee0167e6 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Wed, 30 Dec 2020 16:54:56 +0100 Subject: [PATCH 241/269] ARM: cpuidle: Avoid orphan section warning commit d94b93a9101573eb75b819dee94b1417acff631b upstream. Since commit 83109d5d5fba ("x86/build: Warn on orphan section placement"), we get a warning for objects in orphan sections. The cpuidle implementation for OMAP causes this when CONFIG_CPU_IDLE is disabled: arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table' arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table' arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table' Change the definition of CPUIDLE_METHOD_OF_DECLARE() to silently drop the table and all code referenced from it when CONFIG_CPU_IDLE is disabled. Fixes: 06ee7a950b6a ("ARM: OMAP2+: pm33xx-core: Add cpuidle_ops for am335x/am437x") Signed-off-by: Arnd Bergmann Reviewed-by: Miguel Ojeda Reviewed-by: Nick Desaulniers Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20201230155506.1085689-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman --- arch/arm/include/asm/cpuidle.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/cpuidle.h b/arch/arm/include/asm/cpuidle.h index 0d67ed682e077..bc4ffa7ca04c7 100644 --- a/arch/arm/include/asm/cpuidle.h +++ b/arch/arm/include/asm/cpuidle.h @@ -7,9 +7,11 @@ #ifdef CONFIG_CPU_IDLE extern int arm_cpuidle_simple_enter(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index); +#define __cpuidle_method_section __used __section("__cpuidle_method_of_table") #else static inline int arm_cpuidle_simple_enter(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { return -ENODEV; } +#define __cpuidle_method_section __maybe_unused /* drop silently */ #endif /* Common ARM WFI state */ @@ -42,8 +44,7 @@ struct of_cpuidle_method { #define CPUIDLE_METHOD_OF_DECLARE(name, _method, _ops) \ static const struct of_cpuidle_method __cpuidle_method_of_table_##name \ - __used __section("__cpuidle_method_of_table") \ - = { .method = _method, .ops = _ops } + __cpuidle_method_section = { .method = _method, .ops = _ops } extern int arm_cpuidle_suspend(int index); From 16ccdcdfe668896587b3d4cb2fd6dd512b308dea Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Wed, 5 May 2021 17:14:11 -0700 Subject: [PATCH 242/269] vmlinux.lds.h: Avoid orphan section with !SMP commit d4c6399900364facd84c9e35ce1540b6046c345f upstream. With x86_64_defconfig and the following configs, there is an orphan section warning: CONFIG_SMP=n CONFIG_AMD_MEM_ENCRYPT=y CONFIG_HYPERVISOR_GUEST=y CONFIG_KVM=y CONFIG_PARAVIRT=y ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/cpu/vmware.o' being placed in section `.data..decrypted' ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/kvm.o' being placed in section `.data..decrypted' These sections are created with DEFINE_PER_CPU_DECRYPTED, which ultimately turns into __PCPU_ATTRS, which in turn has a section attribute with a value of PER_CPU_BASE_SECTION + the section name. When CONFIG_SMP is not set, the base section is .data and that is not currently handled in any linker script. Add .data..decrypted to PERCPU_DECRYPTED_SECTION, which is included in PERCPU_INPUT -> PERCPU_SECTION, which is include in the x86 linker script when either CONFIG_X86_64 or CONFIG_SMP is unset, taking care of the warning. Fixes: ac26963a1175 ("percpu: Introduce DEFINE_PER_CPU_DECRYPTED") Link: https://github.com/ClangBuiltLinux/linux/issues/1360 Reported-by: kernel test robot Signed-off-by: Nathan Chancellor Tested-by: Nick Desaulniers # build Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20210506001410.1026691-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman --- include/asm-generic/vmlinux.lds.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index d7efbc5490e8c..18468b46c4506 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -971,6 +971,7 @@ #ifdef CONFIG_AMD_MEM_ENCRYPT #define PERCPU_DECRYPTED_SECTION \ . = ALIGN(PAGE_SIZE); \ + *(.data..decrypted) \ *(.data..percpu..decrypted) \ . = ALIGN(PAGE_SIZE); #else From c9cb5837e92ee3052e0e46e3cd1eb1f7a903411d Mon Sep 17 00:00:00 2001 From: Zhen Lei Date: Sat, 8 May 2021 11:42:16 +0800 Subject: [PATCH 243/269] tools/bootconfig: Fix error return code in apply_xbc() commit e8ba0b2b64126381643bb50df3556b139a60545a upstream. Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Link: https://lkml.kernel.org/r/20210508034216.2277-1-thunder.leizhen@huawei.com Fixes: a995e6bc0524 ("tools/bootconfig: Fix to check the write failure correctly") Reported-by: Hulk Robot Acked-by: Masami Hiramatsu Signed-off-by: Zhen Lei Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman --- tools/bootconfig/main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/bootconfig/main.c b/tools/bootconfig/main.c index 7362bef1a3683..6cd6080cac04c 100644 --- a/tools/bootconfig/main.c +++ b/tools/bootconfig/main.c @@ -399,6 +399,7 @@ static int apply_xbc(const char *path, const char *xbc_path) } /* TODO: Ensure the @path is initramfs/initrd image */ if (fstat(fd, &stat) < 0) { + ret = -errno; pr_err("Failed to get the size of %s\n", path); goto out; } From d83075c25a28e6e8f02e531c8dd4b4e0b58c3ef5 Mon Sep 17 00:00:00 2001 From: Wang Wensheng Date: Mon, 17 May 2021 01:57:49 +0000 Subject: [PATCH 244/269] phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe() commit 6411e386db0a477217607015e7d2910d02f75426 upstream. Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: a43f72ae136a ("phy: cadence: Sierra: Change MAX_LANES of Sierra to 16") Reported-by: Hulk Robot Signed-off-by: Wang Wensheng Link: https://lore.kernel.org/r/20210517015749.127799-1-wangwensheng4@huawei.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman --- drivers/phy/cadence/phy-cadence-sierra.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/phy/cadence/phy-cadence-sierra.c b/drivers/phy/cadence/phy-cadence-sierra.c index aaa0bbe473f76..7d990613ce837 100644 --- a/drivers/phy/cadence/phy-cadence-sierra.c +++ b/drivers/phy/cadence/phy-cadence-sierra.c @@ -614,6 +614,7 @@ static int cdns_sierra_phy_probe(struct platform_device *pdev) sp->nsubnodes = node; if (sp->num_lanes > SIERRA_MAX_LANES) { + ret = -EINVAL; dev_err(dev, "Invalid lane configuration\n"); goto put_child2; } From 0e2c9aeb00289f279b8181fbd4c20765127d8943 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Mon, 24 May 2021 10:49:41 +0800 Subject: [PATCH 245/269] ASoC: core: Fix Null-point-dereference in fmt_single_name() commit 41daf6ba594d55f201c50280ebcd430590441da1 upstream. Check the return value of devm_kstrdup() in case of Null-point-dereference. Fixes: 45dd9943fce0 ("ASoC: core: remove artificial component and DAI name constraint") Cc: Dmitry Baryshkov Reported-by: Hulk Robot Signed-off-by: Kefeng Wang Link: https://lore.kernel.org/r/20210524024941.159952-1-wangkefeng.wang@huawei.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- sound/soc/soc-core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index bf65cba232e67..b22674e3a89c9 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -2231,6 +2231,8 @@ static char *fmt_single_name(struct device *dev, int *id) return NULL; name = devm_kstrdup(dev, devname, GFP_KERNEL); + if (!name) + return NULL; /* are we a "%s.%d" name (platform and SPI components) */ found = strstr(name, dev->driver->name); From 62d891861f83ac12e1b00b304211faf3d1e24857 Mon Sep 17 00:00:00 2001 From: Jerome Brunet Date: Mon, 24 May 2021 11:34:48 +0200 Subject: [PATCH 246/269] ASoC: meson: gx-card: fix sound-dai dt schema commit d031d99b02eaf7363c33f5b27b38086cc8104082 upstream. There is a fair amount of warnings when running 'make dtbs_check' with amlogic,gx-sound-card.yaml. Ex: arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:1: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:2: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0: [66, 0, 0] is too long The reason is that the sound-dai phandle provided has cells, and in such case the schema should use 'phandle-array' instead of 'phandle'. Fixes: fd00366b8e41 ("ASoC: meson: gx: add sound card dt-binding documentation") Signed-off-by: Jerome Brunet Link: https://lore.kernel.org/r/20210524093448.357140-1-jbrunet@baylibre.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman --- .../devicetree/bindings/sound/amlogic,gx-sound-card.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/sound/amlogic,gx-sound-card.yaml b/Documentation/devicetree/bindings/sound/amlogic,gx-sound-card.yaml index db61f0731a203..2e35aeaa8781d 100644 --- a/Documentation/devicetree/bindings/sound/amlogic,gx-sound-card.yaml +++ b/Documentation/devicetree/bindings/sound/amlogic,gx-sound-card.yaml @@ -57,7 +57,7 @@ patternProperties: rate sound-dai: - $ref: /schemas/types.yaml#/definitions/phandle + $ref: /schemas/types.yaml#/definitions/phandle-array description: phandle of the CPU DAI patternProperties: @@ -71,7 +71,7 @@ patternProperties: properties: sound-dai: - $ref: /schemas/types.yaml#/definitions/phandle + $ref: /schemas/types.yaml#/definitions/phandle-array description: phandle of the codec DAI required: From 56a388a9cc1e246a4447a575a8ba29984758bb50 Mon Sep 17 00:00:00 2001 From: Yang Li Date: Tue, 25 May 2021 18:50:32 +0800 Subject: [PATCH 247/269] phy: ti: Fix an error code in wiz_probe() commit b8203ec7f58ae925e10fadd3d136073ae7503a6e upstream. When the code execute this if statement, the value of ret is 0. However, we can see from the dev_err() log that the value of ret should be -EINVAL. Clean up smatch warning: drivers/phy/ti/phy-j721e-wiz.c:1216 wiz_probe() warn: missing error code 'ret' Reported-by: Abaci Robot Fixes: c9f9eba06629 ("phy: ti: j721e-wiz: Manage typec-gpio-dir") Signed-off-by: Yang Li Link: https://lore.kernel.org/r/1621939832-65535-1-git-send-email-yang.lee@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman --- drivers/phy/ti/phy-j721e-wiz.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/phy/ti/phy-j721e-wiz.c b/drivers/phy/ti/phy-j721e-wiz.c index e28e25f98708c..dceac77148721 100644 --- a/drivers/phy/ti/phy-j721e-wiz.c +++ b/drivers/phy/ti/phy-j721e-wiz.c @@ -894,6 +894,7 @@ static int wiz_probe(struct platform_device *pdev) if (wiz->typec_dir_delay < WIZ_TYPEC_DIR_DEBOUNCE_MIN || wiz->typec_dir_delay > WIZ_TYPEC_DIR_DEBOUNCE_MAX) { + ret = -EINVAL; dev_err(dev, "Invalid typec-dir-debounce property\n"); goto err_addr_to_resource; } From e0b518a2eb44d8a74c19e50f79a8ed393e96d634 Mon Sep 17 00:00:00 2001 From: Srinivas Kandagatla Date: Tue, 25 May 2021 17:55:39 +0100 Subject: [PATCH 248/269] gpio: wcd934x: Fix shift-out-of-bounds error commit dbec64b11c65d74f31427e2b9d5746fbf17bf840 upstream. bit-mask for pins 0 to 4 is BIT(0) to BIT(4) however we ended up with BIT(n - 1) which is not right, and this was caught by below usban check UBSAN: shift-out-of-bounds in drivers/gpio/gpio-wcd934x.c:34:14 Fixes: 59c324683400 ("gpio: wcd934x: Add support to wcd934x gpio controller") Signed-off-by: Srinivas Kandagatla Reviewed-by: Andy Shevchenko Reviewed-by: Bjorn Andersson Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman --- drivers/gpio/gpio-wcd934x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-wcd934x.c b/drivers/gpio/gpio-wcd934x.c index 1cbce59908558..97e6caedf1f33 100644 --- a/drivers/gpio/gpio-wcd934x.c +++ b/drivers/gpio/gpio-wcd934x.c @@ -7,7 +7,7 @@ #include #include -#define WCD_PIN_MASK(p) BIT(p - 1) +#define WCD_PIN_MASK(p) BIT(p) #define WCD_REG_DIR_CTL_OFFSET 0x42 #define WCD_REG_VAL_CTL_OFFSET 0x43 #define WCD934X_NPINS 5 From c64a3be39fa9fe473dd85cf1e6a4a7b27c0d935f Mon Sep 17 00:00:00 2001 From: Marco Elver Date: Thu, 27 May 2021 12:47:11 +0200 Subject: [PATCH 249/269] perf: Fix data race between pin_count increment/decrement commit 6c605f8371159432ec61cbb1488dcf7ad24ad19a upstream. KCSAN reports a data race between increment and decrement of pin_count: write to 0xffff888237c2d4e0 of 4 bytes by task 15740 on cpu 1: find_get_context kernel/events/core.c:4617 __do_sys_perf_event_open kernel/events/core.c:12097 [inline] __se_sys_perf_event_open kernel/events/core.c:11933 ... read to 0xffff888237c2d4e0 of 4 bytes by task 15743 on cpu 0: perf_unpin_context kernel/events/core.c:1525 [inline] __do_sys_perf_event_open kernel/events/core.c:12328 [inline] __se_sys_perf_event_open kernel/events/core.c:11933 ... Because neither read-modify-write here is atomic, this can lead to one of the operations being lost, resulting in an inconsistent pin_count. Fix it by adding the missing locking in the CPU-event case. Fixes: fe4b04fa31a6 ("perf: Cure task_oncpu_function_call() races") Reported-by: syzbot+142c9018f5962db69c7e@syzkaller.appspotmail.com Signed-off-by: Marco Elver Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20210527104711.2671610-1-elver@google.com Signed-off-by: Greg Kroah-Hartman --- kernel/events/core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/events/core.c b/kernel/events/core.c index 45fa7167cee2d..7e0fdc19043e4 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4547,7 +4547,9 @@ find_get_context(struct pmu *pmu, struct task_struct *task, cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu); ctx = &cpuctx->ctx; get_ctx(ctx); + raw_spin_lock_irqsave(&ctx->lock, flags); ++ctx->pin_count; + raw_spin_unlock_irqrestore(&ctx->lock, flags); return ctx; } From 4c37b062edae8ad3e1f279ecc084f254bc8161ae Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Thu, 27 May 2021 14:29:15 +0200 Subject: [PATCH 250/269] sched/fair: Keep load_avg and load_sum synced commit 7c7ad626d9a0ff0a36c1e2a3cfbbc6a13828d5eb upstream. when removing a cfs_rq from the list we only check _sum value so we must ensure that _avg and _sum stay synced so load_sum can't be null whereas load_avg is not after propagating load in the cgroup hierarchy. Use load_avg to compute load_sum similarly to what is done for util_sum and runnable_sum. Fixes: 0e2d2aaaae52 ("sched/fair: Rewrite PELT migration propagation") Reported-by: Odin Ugedal Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Odin Ugedal Link: https://lkml.kernel.org/r/20210527122916.27683-2-vincent.guittot@linaro.org Signed-off-by: Greg Kroah-Hartman --- kernel/sched/fair.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 1ad0e52487f6b..f07a581ded953 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3501,10 +3501,9 @@ update_tg_cfs_runnable(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cf static inline void update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq *gcfs_rq) { - long delta_avg, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum; + long delta, running_sum, runnable_sum = gcfs_rq->prop_runnable_sum; unsigned long load_avg; u64 load_sum = 0; - s64 delta_sum; u32 divider; if (!runnable_sum) @@ -3551,13 +3550,13 @@ update_tg_cfs_load(struct cfs_rq *cfs_rq, struct sched_entity *se, struct cfs_rq load_sum = (s64)se_weight(se) * runnable_sum; load_avg = div_s64(load_sum, divider); - delta_sum = load_sum - (s64)se_weight(se) * se->avg.load_sum; - delta_avg = load_avg - se->avg.load_avg; + delta = load_avg - se->avg.load_avg; se->avg.load_sum = runnable_sum; se->avg.load_avg = load_avg; - add_positive(&cfs_rq->avg.load_avg, delta_avg); - add_positive(&cfs_rq->avg.load_sum, delta_sum); + + add_positive(&cfs_rq->avg.load_avg, delta); + cfs_rq->avg.load_sum = cfs_rq->avg.load_avg * divider; } static inline void add_tg_cfs_propagate(struct cfs_rq *cfs_rq, long runnable_sum) From 32e22db8b25ea165bd9e446c7f92b089c8568eaf Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Thu, 27 May 2021 14:29:16 +0200 Subject: [PATCH 251/269] sched/fair: Make sure to update tg contrib for blocked load commit 02da26ad5ed6ea8680e5d01f20661439611ed776 upstream. During the update of fair blocked load (__update_blocked_fair()), we update the contribution of the cfs in tg->load_avg if cfs_rq's pelt has decayed. Nevertheless, the pelt values of a cfs_rq could have been recently updated while propagating the change of a child. In this case, cfs_rq's pelt will not decayed because it has already been updated and we don't update tg->load_avg. __update_blocked_fair ... for_each_leaf_cfs_rq_safe: child cfs_rq update cfs_rq_load_avg() for child cfs_rq ... update_load_avg(cfs_rq_of(se), se, 0) ... update cfs_rq_load_avg() for parent cfs_rq -propagation of child's load makes parent cfs_rq->load_sum becoming null -UPDATE_TG is not set so it doesn't update parent cfs_rq->tg_load_avg_contrib .. for_each_leaf_cfs_rq_safe: parent cfs_rq update cfs_rq_load_avg() for parent cfs_rq - nothing to do because parent cfs_rq has already been updated recently so cfs_rq->tg_load_avg_contrib is not updated ... parent cfs_rq is decayed list_del_leaf_cfs_rq parent cfs_rq - but it still contibutes to tg->load_avg we must set UPDATE_TG flags when propagting pending load to the parent Fixes: 039ae8bcf7a5 ("sched/fair: Fix O(nr_cgroups) in the load balancing path") Reported-by: Odin Ugedal Signed-off-by: Vincent Guittot Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Odin Ugedal Link: https://lkml.kernel.org/r/20210527122916.27683-3-vincent.guittot@linaro.org Signed-off-by: Greg Kroah-Hartman --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f07a581ded953..838bd8b306e6a 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7960,7 +7960,7 @@ static bool __update_blocked_fair(struct rq *rq, bool *done) /* Propagate pending load changes to the parent, if any: */ se = cfs_rq->tg->se[cpu]; if (se && !skip_blocked_update(se)) - update_load_avg(cfs_rq_of(se), se, 0); + update_load_avg(cfs_rq_of(se), se, UPDATE_TG); /* * There can be a lot of idle CPU cgroups. Don't let fully From 190a7f908993cc7f5e8bbe67aa88eeb0bdbbbaa5 Mon Sep 17 00:00:00 2001 From: Dietmar Eggemann Date: Wed, 2 Jun 2021 16:58:08 +0200 Subject: [PATCH 252/269] sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling commit 68d7a190682aa4eb02db477328088ebad15acc83 upstream. The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent unnecessary util_est updates uses the LSB of util_est.enqueued. It is exposed via _task_util_est() (and task_util_est()). Commit 92a801e5d5b7 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages") mentions that the LSB is lost for util_est resolution but find_energy_efficient_cpu() checks if task_util_est() returns 0 to return prev_cpu early. _task_util_est() returns the max value of util_est.ewma and util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED. So task_util_est() returning the max of task_util() and _task_util_est() will never return 0 under the default SCHED_FEAT(UTIL_EST, true). To fix this use the MSB of util_est.enqueued instead and keep the flag util_est internal, i.e. don't export it via _task_util_est(). The maximal possible util_avg value for a task is 1024 so the MSB of 'unsigned int util_est.enqueued' isn't used to store a util value. As a caveat the code behind the util_est_se trace point has to filter UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should be easy to do. This also fixes an issue report by Xuewen Yan that util_est_update() only used UTIL_AVG_UNCHANGED for the subtrahend of the equation: last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED) Fixes: b89997aa88f0b sched/pelt: Fix task util_est update filtering Signed-off-by: Dietmar Eggemann Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Xuewen Yan Reviewed-by: Vincent Donnefort Reviewed-by: Vincent Guittot Link: https://lore.kernel.org/r/20210602145808.1562603-1-dietmar.eggemann@arm.com Signed-off-by: Greg Kroah-Hartman --- include/linux/sched.h | 8 ++++++++ kernel/sched/debug.c | 3 ++- kernel/sched/fair.c | 5 +++-- kernel/sched/pelt.h | 11 +---------- 4 files changed, 14 insertions(+), 13 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 76cd21fa55016..2660ee4b08adf 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -348,11 +348,19 @@ struct load_weight { * Only for tasks we track a moving average of the past instantaneous * estimated utilization. This allows to absorb sporadic drops in utilization * of an otherwise almost periodic task. + * + * The UTIL_AVG_UNCHANGED flag is used to synchronize util_est with util_avg + * updates. When a task is dequeued, its util_est should not be updated if its + * util_avg has not been updated in the meantime. + * This information is mapped into the MSB bit of util_est.enqueued at dequeue + * time. Since max value of util_est.enqueued for a task is 1024 (PELT util_avg + * for a task) it is safe to use MSB. */ struct util_est { unsigned int enqueued; unsigned int ewma; #define UTIL_EST_WEIGHT_SHIFT 2 +#define UTIL_AVG_UNCHANGED 0x80000000 } __attribute__((__aligned__(sizeof(u64)))); /* diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 6264584b51c25..70a5782724363 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -888,6 +888,7 @@ __initcall(init_sched_debug_procfs); #define __PS(S, F) SEQ_printf(m, "%-45s:%21Ld\n", S, (long long)(F)) #define __P(F) __PS(#F, F) #define P(F) __PS(#F, p->F) +#define PM(F, M) __PS(#F, p->F & (M)) #define __PSN(S, F) SEQ_printf(m, "%-45s:%14Ld.%06ld\n", S, SPLIT_NS((long long)(F))) #define __PN(F) __PSN(#F, F) #define PN(F) __PSN(#F, p->F) @@ -1014,7 +1015,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns, P(se.avg.util_avg); P(se.avg.last_update_time); P(se.avg.util_est.ewma); - P(se.avg.util_est.enqueued); + PM(se.avg.util_est.enqueued, ~UTIL_AVG_UNCHANGED); #endif #ifdef CONFIG_UCLAMP_TASK __PS("uclamp.min", p->uclamp_req[UCLAMP_MIN].value); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 838bd8b306e6a..ff8a172a69ca9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3903,7 +3903,7 @@ static inline unsigned long _task_util_est(struct task_struct *p) { struct util_est ue = READ_ONCE(p->se.avg.util_est); - return (max(ue.ewma, ue.enqueued) | UTIL_AVG_UNCHANGED); + return max(ue.ewma, (ue.enqueued & ~UTIL_AVG_UNCHANGED)); } static inline unsigned long task_util_est(struct task_struct *p) @@ -4003,7 +4003,7 @@ static inline void util_est_update(struct cfs_rq *cfs_rq, * Reset EWMA on utilization increases, the moving average is used only * to smooth utilization decreases. */ - ue.enqueued = (task_util(p) | UTIL_AVG_UNCHANGED); + ue.enqueued = task_util(p); if (sched_feat(UTIL_EST_FASTUP)) { if (ue.ewma < ue.enqueued) { ue.ewma = ue.enqueued; @@ -4052,6 +4052,7 @@ static inline void util_est_update(struct cfs_rq *cfs_rq, ue.ewma += last_ewma_diff; ue.ewma >>= UTIL_EST_WEIGHT_SHIFT; done: + ue.enqueued |= UTIL_AVG_UNCHANGED; WRITE_ONCE(p->se.avg.util_est, ue); trace_sched_util_est_se_tp(&p->se); diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h index 795e43e02afc6..0b9aeebb9c325 100644 --- a/kernel/sched/pelt.h +++ b/kernel/sched/pelt.h @@ -42,15 +42,6 @@ static inline u32 get_pelt_divider(struct sched_avg *avg) return LOAD_AVG_MAX - 1024 + avg->period_contrib; } -/* - * When a task is dequeued, its estimated utilization should not be update if - * its util_avg has not been updated at least once. - * This flag is used to synchronize util_avg updates with util_est updates. - * We map this information into the LSB bit of the utilization saved at - * dequeue time (i.e. util_est.dequeued). - */ -#define UTIL_AVG_UNCHANGED 0x1 - static inline void cfs_se_util_change(struct sched_avg *avg) { unsigned int enqueued; @@ -58,7 +49,7 @@ static inline void cfs_se_util_change(struct sched_avg *avg) if (!sched_feat(UTIL_EST)) return; - /* Avoid store if the flag has been already set */ + /* Avoid store if the flag has been already reset */ enqueued = avg->util_est.enqueued; if (!(enqueued & UTIL_AVG_UNCHANGED)) return; From 4921feb0e5be99873e1b4246672563aacd34e8cf Mon Sep 17 00:00:00 2001 From: CodyYao-oc Date: Mon, 7 Jun 2021 10:53:35 +0800 Subject: [PATCH 253/269] x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs commit a8383dfb2138742a1bb77b481ada047aededa2ba upstream. The following commit: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.") Got the old-style NMI watchdog logic wrong and broke it for basically every Intel CPU where it was active. Which is only truly old CPUs, so few people noticed. On CPUs with perf events support we turn off the old-style NMI watchdog, so it was pretty pointless to add the logic for X86_VENDOR_ZHAOXIN to begin with ... :-/ Anyway, the fix is to restore the old logic and add a 'break'. [ mingo: Wrote a new changelog. ] Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU.") Signed-off-by: CodyYao-oc Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20210607025335.9643-1-CodyYao-oc@zhaoxin.com Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/cpu/perfctr-watchdog.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/perfctr-watchdog.c b/arch/x86/kernel/cpu/perfctr-watchdog.c index a5ee607a3b893..a548d9104604f 100644 --- a/arch/x86/kernel/cpu/perfctr-watchdog.c +++ b/arch/x86/kernel/cpu/perfctr-watchdog.c @@ -63,7 +63,7 @@ static inline unsigned int nmi_perfctr_msr_to_bit(unsigned int msr) case 15: return msr - MSR_P4_BPU_PERFCTR0; } - fallthrough; + break; case X86_VENDOR_ZHAOXIN: case X86_VENDOR_CENTAUR: return msr - MSR_ARCH_PERFMON_PERFCTR0; @@ -96,7 +96,7 @@ static inline unsigned int nmi_evntsel_msr_to_bit(unsigned int msr) case 15: return msr - MSR_P4_BSU_ESCR0; } - fallthrough; + break; case X86_VENDOR_ZHAOXIN: case X86_VENDOR_CENTAUR: return msr - MSR_ARCH_PERFMON_EVENTSEL0; From d046f724bbd725a24007b7e52b2d675249870888 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Mon, 7 Jun 2021 10:57:48 -0700 Subject: [PATCH 254/269] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message commit f31500b0d437a2464ca5972d8f5439e156b74960 upstream. Use the __string() machinery provided by the tracing subystem to make a copy of the string literals consumed by the "nested VM-Enter failed" tracepoint. A complete copy is necessary to ensure that the tracepoint can't outlive the data/memory it consumes and deference stale memory. Because the tracepoint itself is defined by kvm, if kvm-intel and/or kvm-amd are built as modules, the memory holding the string literals defined by the vendor modules will be freed when the module is unloaded, whereas the tracepoint and its data in the ring buffer will live until kvm is unloaded (or "indefinitely" if kvm is built-in). This bug has existed since the tracepoint was added, but was recently exposed by a new check in tracing to detect exactly this type of bug. fmt: '%s%s ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: ' WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0 CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:trace_check_vprintf+0x3be/0x3e0 Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20 RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027 RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8 RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8 R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4 R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000 FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0 Call Trace: trace_event_printf+0x5e/0x80 trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm] print_trace_line+0x1dd/0x4e0 s_show+0x45/0x150 seq_read_iter+0x2d5/0x4c0 seq_read+0x106/0x150 vfs_read+0x98/0x180 ksys_read+0x5f/0xe0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Steven Rostedt Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W") Signed-off-by: Sean Christopherson Reviewed-by: Steven Rostedt (VMware) Message-Id: <20210607175748.674002-1-seanjc@google.com> Signed-off-by: Greg Kroah-Hartman --- arch/x86/kvm/trace.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h index aef960f90f26e..a2835d784f4be 100644 --- a/arch/x86/kvm/trace.h +++ b/arch/x86/kvm/trace.h @@ -1514,16 +1514,16 @@ TRACE_EVENT(kvm_nested_vmenter_failed, TP_ARGS(msg, err), TP_STRUCT__entry( - __field(const char *, msg) + __string(msg, msg) __field(u32, err) ), TP_fast_assign( - __entry->msg = msg; + __assign_str(msg, msg); __entry->err = err; ), - TP_printk("%s%s", __entry->msg, !__entry->err ? "" : + TP_printk("%s%s", __get_str(msg), !__entry->err ? "" : __print_symbolic(__entry->err, VMX_VMENTER_INSTRUCTION_ERRORS)) ); From 91f7fdc4cc10542ca1045c06aad23365f0d067e0 Mon Sep 17 00:00:00 2001 From: Alaa Hleihel Date: Thu, 10 Jun 2021 10:34:27 +0300 Subject: [PATCH 255/269] IB/mlx5: Fix initializing CQ fragments buffer commit 2ba0aa2feebda680ecfc3c552e867cf4d1b05a3a upstream. The function init_cq_frag_buf() can be called to initialize the current CQ fragments buffer cq->buf, or the temporary cq->resize_buf that is filled during CQ resize operation. However, the offending commit started to use function get_cqe() for getting the CQEs, the issue with this change is that get_cqe() always returns CQEs from cq->buf, which leads us to initialize the wrong buffer, and in case of enlarging the CQ we try to access elements beyond the size of the current cq->buf and eventually hit a kernel panic. [exception RIP: init_cq_frag_buf+103] [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib] [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core] [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt] [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt] [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt] [ffff9f799ddcbec8] kthread at ffffffffa66c5da1 [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that takes the correct source buffer as a parameter. Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)") Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com Signed-off-by: Alaa Hleihel Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman --- drivers/infiniband/hw/mlx5/cq.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index fb62f1d04afa7..372adb7ceb74e 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -838,15 +838,14 @@ static void destroy_cq_user(struct mlx5_ib_cq *cq, struct ib_udata *udata) ib_umem_release(cq->buf.umem); } -static void init_cq_frag_buf(struct mlx5_ib_cq *cq, - struct mlx5_ib_cq_buf *buf) +static void init_cq_frag_buf(struct mlx5_ib_cq_buf *buf) { int i; void *cqe; struct mlx5_cqe64 *cqe64; for (i = 0; i < buf->nent; i++) { - cqe = get_cqe(cq, i); + cqe = mlx5_frag_buf_get_wqe(&buf->fbc, i); cqe64 = buf->cqe_size == 64 ? cqe : cqe + 64; cqe64->op_own = MLX5_CQE_INVALID << 4; } @@ -872,7 +871,7 @@ static int create_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq, if (err) goto err_db; - init_cq_frag_buf(cq, &cq->buf); + init_cq_frag_buf(&cq->buf); *inlen = MLX5_ST_SZ_BYTES(create_cq_in) + MLX5_FLD_SZ_BYTES(create_cq_in, pas[0]) * @@ -1177,7 +1176,7 @@ static int resize_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq, if (err) goto ex; - init_cq_frag_buf(cq, cq->resize_buf); + init_cq_frag_buf(cq->resize_buf); return 0; From 279ad78a00f8b9c5ff24171a59297187a3bd44b7 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Thu, 3 Jun 2021 15:37:53 +0300 Subject: [PATCH 256/269] NFS: Fix a potential NULL dereference in nfs_get_client() [ Upstream commit 09226e8303beeec10f2ff844d2e46d1371dc58e0 ] None of the callers are expecting NULL returns from nfs_get_client() so this code will lead to an Oops. It's better to return an error pointer. I expect that this is dead code so hopefully no one is affected. Fixes: 31434f496abb ("nfs: check hostname in nfs_get_client") Signed-off-by: Dan Carpenter Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- fs/nfs/client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 4b8cc93913f74..723d425796cca 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -406,7 +406,7 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) if (cl_init->hostname == NULL) { WARN_ON(1); - return NULL; + return ERR_PTR(-EINVAL); } /* see if the client already exists */ From b4651cea43afd4fa3f0091bfed5ab4b5030d0425 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 1 Jun 2021 11:10:05 -0400 Subject: [PATCH 257/269] NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode() [ Upstream commit dfe1fe75e00e4c724ede7b9e593f6f680e446c5f ] If the inode is being evicted, but has to return a delegation first, then it can cause a deadlock in the corner case where the server reboots before the delegreturn completes, but while the call to iget5_locked() in nfs4_opendata_get_inode() is waiting for the inode free to complete. Since the open call still holds a session slot, the reboot recovery cannot proceed. In order to break the logjam, we can turn the delegation return into a privileged operation for the case where we're evicting the inode. We know that in that case, there can be no other state recovery operation that conflicts. Reported-by: zhangxiaoxu (A) Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode") Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- fs/nfs/nfs4_fs.h | 1 + fs/nfs/nfs4proc.c | 12 +++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 065cb04222a1b..543d916f79abb 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -205,6 +205,7 @@ struct nfs4_exception { struct inode *inode; nfs4_stateid *stateid; long timeout; + unsigned char task_is_privileged : 1; unsigned char delay : 1, recovering : 1, retry : 1; diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index c92d6ff0fceab..959c2aa2180de 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -592,6 +592,8 @@ int nfs4_handle_exception(struct nfs_server *server, int errorcode, struct nfs4_ goto out_retry; } if (exception->recovering) { + if (exception->task_is_privileged) + return -EDEADLOCK; ret = nfs4_wait_clnt_recover(clp); if (test_bit(NFS_MIG_FAILED, &server->mig_status)) return -EIO; @@ -617,6 +619,8 @@ nfs4_async_handle_exception(struct rpc_task *task, struct nfs_server *server, goto out_retry; } if (exception->recovering) { + if (exception->task_is_privileged) + return -EDEADLOCK; rpc_sleep_on(&clp->cl_rpcwaitq, task, NULL); if (test_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) == 0) rpc_wake_up_queued_task(&clp->cl_rpcwaitq, task); @@ -6383,6 +6387,7 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata) struct nfs4_exception exception = { .inode = data->inode, .stateid = &data->stateid, + .task_is_privileged = data->args.seq_args.sa_privileged, }; if (!nfs4_sequence_done(task, &data->res.seq_res)) @@ -6506,7 +6511,6 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred, data = kzalloc(sizeof(*data), GFP_NOFS); if (data == NULL) return -ENOMEM; - nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1, 0); nfs4_state_protect(server->nfs_client, NFS_SP4_MACH_CRED_CLEANUP, @@ -6537,6 +6541,12 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred, } } + if (!data->inode) + nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1, + 1); + else + nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1, + 0); task_setup_data.callback_data = data; msg.rpc_argp = &data->args; msg.rpc_resp = &data->res; From 351075bcfea996ce15b546c2cb80986736b917f4 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Sat, 5 Jun 2021 13:29:57 +0800 Subject: [PATCH 258/269] perf session: Correct buffer copying when peeking events [ Upstream commit 197eecb6ecae0b04bd694432f640ff75597fed9c ] When peeking an event, it has a short path and a long path. The short path uses the session pointer "one_mmap_addr" to directly fetch the event; and the long path needs to read out the event header and the following event data from file and fill into the buffer pointer passed through the argument "buf". The issue is in the long path that it copies the event header and event data into the same destination address which pointer "buf", this means the event header is overwritten. We are just lucky to run into the short path in most cases, so we don't hit the issue in the long path. This patch adds the offset "hdr_sz" to the pointer "buf" when copying the event data, so that it can reserve the event header which can be used properly by its caller. Fixes: 5a52f33adf02 ("perf session: Add perf_session__peek_event()") Signed-off-by: Leo Yan Acked-by: Adrian Hunter Acked-by: Jiri Olsa Cc: Alexander Shishkin Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin --- tools/perf/util/session.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 63b619084b34a..9dddec19a494e 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1699,6 +1699,7 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset, if (event->header.size < hdr_sz || event->header.size > buf_sz) return -1; + buf += hdr_sz; rest = event->header.size - hdr_sz; if (readn(fd, buf, rest) != (ssize_t)rest) From 9064c9d544b906a63e359db5f908594bc07580e6 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 9 Jun 2021 01:49:13 -0400 Subject: [PATCH 259/269] kvm: fix previous commit for 32-bit builds commit 4422829e8053068e0225e4d0ef42dc41ea7c9ef5 upstream. array_index_nospec does not work for uint64_t on 32-bit builds. However, the size of a memory slot must be less than 20 bits wide on those system, since the memory slot must fit in the user address space. So just store it in an unsigned long. Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman --- include/linux/kvm_host.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index ecab72456c100..c66c702a4f079 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1110,8 +1110,8 @@ __gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn) * table walks, do not let the processor speculate loads outside * the guest's registered memslots. */ - unsigned long offset = array_index_nospec(gfn - slot->base_gfn, - slot->npages); + unsigned long offset = gfn - slot->base_gfn; + offset = array_index_nospec(offset, slot->npages); return slot->userspace_addr + offset * PAGE_SIZE; } From c3b6cf64dfe4ef96e7341508d50d6998da7062c7 Mon Sep 17 00:00:00 2001 From: Anna Schumaker Date: Wed, 2 Jun 2021 14:31:20 -0400 Subject: [PATCH 260/269] NFS: Fix use-after-free in nfs4_init_client() commit 476bdb04c501fc64bf3b8464ffddefc8dbe01577 upstream. KASAN reports a use-after-free when attempting to mount two different exports through two different NICs that belong to the same server. Olga was able to hit this with kernels starting somewhere between 5.7 and 5.10, but I traced the patch that introduced the clear_bit() call to 4.13. So something must have changed in the refcounting of the clp pointer to make this call to nfs_put_client() the very last one. Fixes: 8dcbec6d20 ("NFSv41: Handle EXCHID4_FLAG_CONFIRMED_R during NFSv4.1 migration") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Anna Schumaker Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman --- fs/nfs/nfs4client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index be7915c861cef..7491323a58207 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -435,8 +435,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp, */ nfs_mark_client_ready(clp, -EPERM); } - nfs_put_client(clp); clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags); + nfs_put_client(clp); return old; error: From d973bd0d6e7f9b4ea976cc619e8d6e0d235b9056 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 1 Jun 2021 11:35:56 -0400 Subject: [PATCH 261/269] NFSv4: Fix second deadlock in nfs4_evict_inode() commit c3aba897c6e67fa464ec02b1f17911577d619713 upstream. If the inode is being evicted but has to return a layout first, then that too can cause a deadlock in the corner case where the server reboots. Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman --- fs/nfs/nfs4proc.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 959c2aa2180de..ca8bfb8bd52be 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -9629,15 +9629,20 @@ int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool sync) &task_setup_data.rpc_client, &msg); dprintk("--> %s\n", __func__); + lrp->inode = nfs_igrab_and_active(lrp->args.inode); if (!sync) { - lrp->inode = nfs_igrab_and_active(lrp->args.inode); if (!lrp->inode) { nfs4_layoutreturn_release(lrp); return -EAGAIN; } task_setup_data.flags |= RPC_TASK_ASYNC; } - nfs4_init_sequence(&lrp->args.seq_args, &lrp->res.seq_res, 1, 0); + if (!lrp->inode) + nfs4_init_sequence(&lrp->args.seq_args, &lrp->res.seq_res, 1, + 1); + else + nfs4_init_sequence(&lrp->args.seq_args, &lrp->res.seq_res, 1, + 0); task = rpc_run_task(&task_setup_data); if (IS_ERR(task)) return PTR_ERR(task); From 6e13b9bc66f0e34238aa7b9486a0575177fb7955 Mon Sep 17 00:00:00 2001 From: Dai Ngo Date: Wed, 19 May 2021 17:15:10 -0400 Subject: [PATCH 262/269] NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit f8849e206ef52b584cd9227255f4724f0cc900bb upstream. Currently if __nfs4_proc_set_acl fails with NFS4ERR_BADOWNER it re-enables the idmapper by clearing NFS_CAP_UIDGID_NOMAP before retrying again. The NFS_CAP_UIDGID_NOMAP remains cleared even if the retry fails. This causes problem for subsequent setattr requests for v4 server that does not have idmapping configured. This patch modifies nfs4_proc_set_acl to detect NFS4ERR_BADOWNER and NFS4ERR_BADNAME and skips the retry, since the kernel isn't involved in encoding the ACEs, and return -EINVAL. Steps to reproduce the problem: # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt # touch /tmp/mnt/file1 # chown 99 /tmp/mnt/file1 # nfs4_setfacl -a A::unknown.user@xyz.com:wrtncy /tmp/mnt/file1 Failed setxattr operation: Invalid argument # chown 99 /tmp/mnt/file1 chown: changing ownership of ‘/tmp/mnt/file1’: Invalid argument # umount /tmp/mnt # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt # chown 99 /tmp/mnt/file1 # v2: detect NFS4ERR_BADOWNER and NFS4ERR_BADNAME and skip retry in nfs4_proc_set_acl. Signed-off-by: Dai Ngo Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman --- fs/nfs/nfs4proc.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index ca8bfb8bd52be..5365000e83bd6 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5946,6 +5946,14 @@ static int nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t buflen do { err = __nfs4_proc_set_acl(inode, buf, buflen); trace_nfs4_set_acl(inode, err); + if (err == -NFS4ERR_BADOWNER || err == -NFS4ERR_BADNAME) { + /* + * no need to retry since the kernel + * isn't involved in encoding the ACEs. + */ + err = -EINVAL; + break; + } err = nfs4_handle_exception(NFS_SERVER(inode), err, &exception); } while (exception.retry); From 7a696ce1d5d16a33a6cd6400bbcc0339b2460e11 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 2 Jun 2021 21:30:26 +0800 Subject: [PATCH 263/269] scsi: core: Fix error handling of scsi_host_alloc() commit 66a834d092930cf41d809c0e989b13cd6f9ca006 upstream. After device is initialized via device_initialize(), or its name is set via dev_set_name(), the device has to be freed via put_device(). Otherwise device name will be leaked because it is allocated dynamically in dev_set_name(). Fix the leak by replacing kfree() with put_device(). Since scsi_host_dev_release() properly handles IDA and kthread removal, remove special-casing these from the error handling as well. Link: https://lore.kernel.org/r/20210602133029.2864069-2-ming.lei@redhat.com Cc: Bart Van Assche Cc: John Garry Cc: Hannes Reinecke Tested-by: John Garry Reviewed-by: Bart Van Assche Reviewed-by: John Garry Reviewed-by: Hannes Reinecke Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman --- drivers/scsi/hosts.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 2f162603876f9..d835a7b236142 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -392,8 +392,10 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) mutex_init(&shost->scan_mutex); index = ida_simple_get(&host_index_ida, 0, 0, GFP_KERNEL); - if (index < 0) - goto fail_kfree; + if (index < 0) { + kfree(shost); + return NULL; + } shost->host_no = index; shost->dma_channel = 0xff; @@ -486,7 +488,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) shost_printk(KERN_WARNING, shost, "error handler thread failed to spawn, error = %ld\n", PTR_ERR(shost->ehandler)); - goto fail_index_remove; + goto fail; } shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d", @@ -495,17 +497,18 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize) if (!shost->tmf_work_q) { shost_printk(KERN_WARNING, shost, "failed to create tmf workq\n"); - goto fail_kthread; + goto fail; } scsi_proc_hostdir_add(shost->hostt); return shost; + fail: + /* + * Host state is still SHOST_CREATED and that is enough to release + * ->shost_gendev. scsi_host_dev_release() will free + * dev_name(&shost->shost_dev). + */ + put_device(&shost->shost_gendev); - fail_kthread: - kthread_stop(shost->ehandler); - fail_index_remove: - ida_simple_remove(&host_index_ida, shost->host_no); - fail_kfree: - kfree(shost); return NULL; } EXPORT_SYMBOL(scsi_host_alloc); From 146446a43b3dbaa3a58364ef99fd606b3f324832 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 2 Jun 2021 21:30:27 +0800 Subject: [PATCH 264/269] scsi: core: Fix failure handling of scsi_add_host_with_dma() commit 3719f4ff047e20062b8314c23ec3cab84d74c908 upstream. When scsi_add_host_with_dma() returns failure, the caller will call scsi_host_put(shost) to release everything allocated for this host instance. Consequently we can't also free allocated stuff in scsi_add_host_with_dma(), otherwise we will end up with a double free. Strictly speaking, host resource allocations should have been done in scsi_host_alloc(). However, the allocations may need information which is not yet provided by the driver when that function is called. So leave the allocations where they are but rely on host device's release handler to free resources. Link: https://lore.kernel.org/r/20210602133029.2864069-3-ming.lei@redhat.com Cc: Bart Van Assche Cc: John Garry Cc: Hannes Reinecke Tested-by: John Garry Reviewed-by: Bart Van Assche Reviewed-by: John Garry Reviewed-by: Hannes Reinecke Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman --- drivers/scsi/hosts.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index d835a7b236142..48ec9c35daa4a 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -278,23 +278,22 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, if (!shost->work_q) { error = -EINVAL; - goto out_free_shost_data; + goto out_del_dev; } } error = scsi_sysfs_add_host(shost); if (error) - goto out_destroy_host; + goto out_del_dev; scsi_proc_host_add(shost); scsi_autopm_put_host(shost); return error; - out_destroy_host: - if (shost->work_q) - destroy_workqueue(shost->work_q); - out_free_shost_data: - kfree(shost->shost_data); + /* + * Any host allocation in this function will be freed in + * scsi_host_dev_release(). + */ out_del_dev: device_del(&shost->shost_dev); out_del_gendev: @@ -304,7 +303,6 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, pm_runtime_disable(&shost->shost_gendev); pm_runtime_set_suspended(&shost->shost_gendev); pm_runtime_put_noidle(&shost->shost_gendev); - scsi_mq_destroy_tags(shost); fail: return error; } From 0a31d1237aafc305ade0f886977450f6021b2d0c Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 2 Jun 2021 21:30:28 +0800 Subject: [PATCH 265/269] scsi: core: Put .shost_dev in failure path if host state changes to RUNNING commit 11714026c02d613c30a149c3f4c4a15047744529 upstream. scsi_host_dev_release() only frees dev_name when host state is SHOST_CREATED. After host state has changed to SHOST_RUNNING, scsi_host_dev_release() no longer cleans up. Fix this by doing a put_device(&shost->shost_dev) in the failure path when host state is SHOST_RUNNING. Move get_device(&shost->shost_gendev) before device_add(&shost->shost_dev) so that scsi_host_cls_release() can do a put on this reference. Link: https://lore.kernel.org/r/20210602133029.2864069-4-ming.lei@redhat.com Cc: Bart Van Assche Cc: Hannes Reinecke Reported-by: John Garry Tested-by: John Garry Reviewed-by: John Garry Reviewed-by: Hannes Reinecke Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman --- drivers/scsi/hosts.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 48ec9c35daa4a..a64d0c6f1c4a4 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -254,12 +254,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, device_enable_async_suspend(&shost->shost_dev); + get_device(&shost->shost_gendev); error = device_add(&shost->shost_dev); if (error) goto out_del_gendev; - get_device(&shost->shost_gendev); - if (shost->transportt->host_size) { shost->shost_data = kzalloc(shost->transportt->host_size, GFP_KERNEL); @@ -297,6 +296,11 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev, out_del_dev: device_del(&shost->shost_dev); out_del_gendev: + /* + * Host state is SHOST_RUNNING so we have to explicitly release + * ->shost_dev. + */ + put_device(&shost->shost_dev); device_del(&shost->shost_gendev); out_disable_runtime_pm: device_disable_async_suspend(&shost->shost_gendev); From 5b537408f2733d510060e72596befa44c3435cb6 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 2 Jun 2021 21:30:29 +0800 Subject: [PATCH 266/269] scsi: core: Only put parent device if host state differs from SHOST_CREATED commit 1e0d4e6225996f05271de1ebcb1a7c9381af0b27 upstream. get_device(shost->shost_gendev.parent) is called after host state has switched to SHOST_RUNNING. scsi_host_dev_release() shouldn't release the parent device if host state is still SHOST_CREATED. Link: https://lore.kernel.org/r/20210602133029.2864069-5-ming.lei@redhat.com Cc: Bart Van Assche Cc: John Garry Cc: Hannes Reinecke Tested-by: John Garry Reviewed-by: John Garry Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman --- drivers/scsi/hosts.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index a64d0c6f1c4a4..b93dd8ef4ac82 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -347,7 +347,7 @@ static void scsi_host_dev_release(struct device *dev) ida_simple_remove(&host_index_ida, shost->host_no); - if (parent) + if (shost->shost_state != SHOST_CREATED) put_device(parent); kfree(shost); } From 43c32c22254b9328d7abb1c2b0f689dc67838e60 Mon Sep 17 00:00:00 2001 From: Liangyan Date: Mon, 7 Jun 2021 20:57:34 +0800 Subject: [PATCH 267/269] tracing: Correct the length check which causes memory corruption commit 3e08a9f9760f4a70d633c328a76408e62d6f80a3 upstream. We've suffered from severe kernel crashes due to memory corruption on our production environment, like, Call Trace: [1640542.554277] general protection fault: 0000 [#1] SMP PTI [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190 [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286 [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300 [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000 [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976 [1640542.563128] FS: 00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000 [1640542.563937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0 [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [1640542.566742] Call Trace: [1640542.567009] anon_vma_clone+0x5d/0x170 [1640542.567417] __split_vma+0x91/0x1a0 [1640542.567777] do_munmap+0x2c6/0x320 [1640542.568128] vm_munmap+0x54/0x70 [1640542.569990] __x64_sys_munmap+0x22/0x30 [1640542.572005] do_syscall_64+0x5b/0x1b0 [1640542.573724] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1640542.575642] RIP: 0033:0x7f45d6e61e27 James Wang has reproduced it stably on the latest 4.19 LTS. After some debugging, we finally proved that it's due to ftrace buffer out-of-bound access using a debug tool as follows: [ 86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000 [ 86.780806] no_context+0xdf/0x3c0 [ 86.784327] __do_page_fault+0x252/0x470 [ 86.788367] do_page_fault+0x32/0x140 [ 86.792145] page_fault+0x1e/0x30 [ 86.795576] strncpy_from_unsafe+0x66/0xb0 [ 86.799789] fetch_memory_string+0x25/0x40 [ 86.804002] fetch_deref_string+0x51/0x60 [ 86.808134] kprobe_trace_func+0x32d/0x3a0 [ 86.812347] kprobe_dispatcher+0x45/0x50 [ 86.816385] kprobe_ftrace_handler+0x90/0xf0 [ 86.820779] ftrace_ops_assist_func+0xa1/0x140 [ 86.825340] 0xffffffffc00750bf [ 86.828603] do_sys_open+0x5/0x1f0 [ 86.832124] do_syscall_64+0x5b/0x1b0 [ 86.835900] entry_SYSCALL_64_after_hwframe+0x44/0xa9 commit b220c049d519 ("tracing: Check length before giving out the filter buffer") adds length check to protect trace data overflow introduced in 0fc1b09ff1ff, seems that this fix can't prevent overflow entirely, the length check should also take the sizeof entry->array[0] into account, since this array[0] is filled the length of trace data and occupy addtional space and risk overflow. Link: https://lkml.kernel.org/r/20210607125734.1770447-1-liangyan.peng@linux.alibaba.com Cc: stable@vger.kernel.org Cc: Ingo Molnar Cc: Xunlei Pang Cc: Greg Kroah-Hartman Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer") Reviewed-by: Xunlei Pang Reviewed-by: yinbinbin Reviewed-by: Wetp Zhang Tested-by: James Wang Signed-off-by: Liangyan Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 321f7f7a29b4b..b2c141eaca020 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2734,7 +2734,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb, (entry = this_cpu_read(trace_buffered_event))) { /* Try to use the per cpu buffer first */ val = this_cpu_inc_return(trace_buffered_event_cnt); - if ((len < (PAGE_SIZE - sizeof(*entry))) && val == 1) { + if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) { trace_event_setup(entry, type, flags, pc); entry->array[0] = len; return entry; From ef9a0d224bafc0f4f8f85d0eb69fc59a6fbd1318 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Tue, 15 Jun 2021 09:26:19 -0700 Subject: [PATCH 268/269] proc: only require mm_struct for writing commit 94f0b2d4a1d0c52035aef425da5e022bd2cb1c71 upstream. Commit 591a22c14d3f ("proc: Track /proc/$pid/attr/ opener mm_struct") we started using __mem_open() to track the mm_struct at open-time, so that we could then check it for writes. But that also ended up making the permission checks at open time much stricter - and not just for writes, but for reads too. And that in turn caused a regression for at least Fedora 29, where NIC interfaces fail to start when using NetworkManager. Since only the write side wanted the mm_struct test, ignore any failures by __mem_open() at open time, leaving reads unaffected. The write() time verification of the mm_struct pointer will then catch the failure case because a NULL pointer will not match a valid 'current->mm'. Link: https://lore.kernel.org/netdev/YMjTlp2FSJYvoyFa@unreal/ Fixes: 591a22c14d3f ("proc: Track /proc/$pid/attr/ opener mm_struct") Reported-and-tested-by: Leon Romanovsky Cc: Kees Cook Cc: Christian Brauner Cc: Andrea Righi Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- fs/proc/base.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 511d98d09754a..df9b17dd92cb3 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -2677,7 +2677,9 @@ static int proc_pident_readdir(struct file *file, struct dir_context *ctx, #ifdef CONFIG_SECURITY static int proc_pid_attr_open(struct inode *inode, struct file *file) { - return __mem_open(inode, file, PTRACE_MODE_READ_FSCREDS); + file->private_data = NULL; + __mem_open(inode, file, PTRACE_MODE_READ_FSCREDS); + return 0; } static ssize_t proc_pid_attr_read(struct file * file, char __user * buf, From f2b1fc360fa1283b61359f232c6965f4170260f0 Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Wed, 16 Jun 2021 12:01:47 +0200 Subject: [PATCH 269/269] Linux 5.10.44 Link: https://lore.kernel.org/r/20210614102652.964395392@linuxfoundation.org Tested-by: Fox Chen Tested-by: Jon Hunter Tested-by: Florian Fainelli Tested-by: Pavel Machek (CIP) Tested-by: Hulk Robot Link: https://lore.kernel.org/r/20210614161424.091266895@linuxfoundation.org Tested-by: Jon Hunter Tested-by: Fox Chen Tested-by: Pavel Machek (CIP) Tested-by: Florian Fainelli Tested-by: Salvatore Bonaccorso Tested-by: Shuah Khan Tested-by: Jason Self Tested-by: Hulk Robot Tested-by: Sudip Mukherjee Tested-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index ec9ee8032a985..ae33e048eb8db 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 5 PATCHLEVEL = 10 -SUBLEVEL = 43 +SUBLEVEL = 44 EXTRAVERSION = NAME = Dare mighty things