iproute2: ADD_ADDR: ports support #166

matttbe · 2021-02-18T14:46:20Z

The kernel support announcing ADD_ADDR on a different port and even receiving them (+ receiving MP_JOIN on a different port) thanks to the work done on #54.

But from the userspace, it is not easy to do that without a support in iproute2.

The idea would be to do something like:

ip mptcp endpoint add 1.2.3.4 port 1234

The text was updated successfully, but these errors were encountered:

matttbe · 2021-02-20T11:01:04Z

Patch has been already sent to netdev: https://patchwork.kernel.org/project/netdevbpf/patch/868cfad6ab2fbd7f4b2ac6522b3ec62fce858fed.1613767061.git.pabeni@redhat.com/

matttbe · 2021-03-04T17:19:46Z

Patch has been already sent to netdev: https://patchwork.kernel.org/project/netdevbpf/patch/868cfad6ab2fbd7f4b2ac6522b3ec62fce858fed.1613767061.git.pabeni@redhat.com/

Patch has been merged! We can close this ticket.

The IPA BCM resource ("IP0") on sc7180 was moved to the clk-rpmh driver in commit bcd63d2 ("clk: qcom: rpmh: Add IPA clock for SC7180") and modeled as a clk, but this interconnect driver still had it modeled as an interconnect. This was mostly OK because nobody used the interconnect definition, until the interconnect framework started dropping bandwidth requests on interconnects that aren't used via the sync_state callback in commit 7d3b0b0 ("interconnect: qcom: Use icc_sync_state"). Once that patch was applied the IP0 resource was going to be controlled from two places, the clk framework and the interconnect framework. Even then, things were probably going to be OK, because commit b95b668 ("interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate") was needed to actually drop bandwidth requests on unused interconnects, of which the IPA was one of the interconnect that wasn't getting dropped to zero. Combining the three commits together leads to bad behavior where the interconnect framework is disabling the IP0 resource because it has no users while the clk framework thinks the IP0 resource is on because the only user, the IPA driver, has turned it on via clk_prepare_enable(). Depending on when sync_state is called, we can get into a situation like below: IPA driver probes IPA driver gets notified modem started runtime PM get() IPA clk enabled -> IP0 resource is ON sync_state runs interconnect zeroes out the IP0 resource -> IP0 resource is off IPA driver tries to access a register and blows up The crash is an unclocked access that manifest as an SError. SError Interrupt on CPU0, code 0xbe000011 -- SError CPU: 0 PID: 3595 Comm: mmdata_mgr Not tainted 5.17.1+ #166 Hardware name: Google Lazor (rev1 - 2) with LTE (DT) pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mutex_lock+0x4c/0x80 lr : mutex_lock+0x30/0x80 sp : ffffffc00da9b9c0 x29: ffffffc00da9b9c0 x28: 0000000000000000 x27: 0000000000000000 x26: ffffffc00da9bc90 x25: ffffff80c2024010 x24: ffffff80c2024000 x23: ffffff8083100000 x22: ffffff80831000d0 x21: ffffff80831000a8 x20: ffffff80831000a8 x19: ffffff8083100070 x18: 00000000ffff0a00 x17: 000000002f7254f1 x16: 0000000000000100 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 000000000001f0b8 x10: ffffffc00931f0b8 x9 : 0000000000000000 x8 : 0000000000000000 x7 : fefefefefeff2f60 x6 : 0000808080808080 x5 : 0000000000000000 x4 : 8080808080800000 x3 : ffffff80d2d4ee28 x2 : ffffff808c1d6e40 x1 : 0000000000000000 x0 : ffffff8083100070 Kernel panic - not syncing: Asynchronous SError Interrupt CPU: 0 PID: 3595 Comm: mmdata_mgr Not tainted 5.17.1+ #166 Hardware name: Google Lazor (rev1 - 2) with LTE (DT) Call trace: dump_backtrace+0xf4/0x114 show_stack+0x24/0x30 dump_stack_lvl+0x64/0x7c dump_stack+0x18/0x38 panic+0x150/0x38c nmi_panic+0x88/0xa0 arm64_serror_panic+0x74/0x80 do_serror+0x0/0x80 do_serror+0x58/0x80 el1h_64_error_handler+0x34/0x4c el1h_64_error+0x78/0x7c mutex_lock+0x4c/0x80 __gsi_channel_start+0x50/0x17c gsi_channel_start+0x54/0x90 ipa_endpoint_enable_one+0x34/0xc0 ipa_open+0x4c/0x120 Remove all IP0 resource management from the interconnect driver so that clk-rpmh is the sole owner. This fixes the issue by preventing the interconnect driver from overwriting the IP0 resource data that the clk-rpmh driver wrote. Cc: Alex Elder <elder@linaro.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Taniya Das <quic_tdas@quicinc.com> Cc: Mike Tipton <quic_mdtipton@quicinc.com> Fixes: b95b668 ("interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate") Fixes: bcd63d2 ("clk: qcom: rpmh: Add IPA clock for SC7180") Fixes: 7d3b0b0 ("interconnect: qcom: Use icc_sync_state") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Tested-by: Alex Elder <elder@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220412220033.1273607-2-swboyd@chromium.org Signed-off-by: Georgi Djakov <djakov@kernel.org>

gpio_keys module can either accept gpios or interrupts. The module initializes delayed work in case of gpios only and is only used if debounce timer is not used, so make sure cancel_delayed_work_sync() is called only when its gpio-backed and debounce_use_hrtimer is false. This fixes the issue seen below when the gpio_keys module is unloaded and an interrupt pin is used instead of GPIO: [ 360.297569] ------------[ cut here ]------------ [ 360.302303] WARNING: CPU: 0 PID: 237 at kernel/workqueue.c:3066 __flush_work+0x414/0x470 [ 360.310531] Modules linked in: gpio_keys(-) [ 360.314797] CPU: 0 PID: 237 Comm: rmmod Not tainted 5.18.0-rc5-arm64-renesas-00116-g73636105874d-dirty #166 [ 360.324662] Hardware name: Renesas SMARC EVK based on r9a07g054l2 (DT) [ 360.331270] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 360.338318] pc : __flush_work+0x414/0x470 [ 360.342385] lr : __cancel_work_timer+0x140/0x1b0 [ 360.347065] sp : ffff80000a7fba00 [ 360.350423] x29: ffff80000a7fba00 x28: ffff000012b9c5c0 x27: 0000000000000000 [ 360.357664] x26: ffff80000a7fbb80 x25: ffff80000954d0a8 x24: 0000000000000001 [ 360.364904] x23: ffff800009757000 x22: 0000000000000000 x21: ffff80000919b000 [ 360.372143] x20: ffff00000f5974e0 x19: ffff00000f5974e0 x18: ffff8000097fcf48 [ 360.379382] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000053f40 [ 360.386622] x14: ffff800009850e88 x13: 0000000000000002 x12: 000000000000a60c [ 360.393861] x11: 000000000000a610 x10: 0000000000000000 x9 : 0000000000000008 [ 360.401100] x8 : 0101010101010101 x7 : 00000000a473c394 x6 : 0080808080808080 [ 360.408339] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff80000919b458 [ 360.415578] x2 : ffff8000097577f0 x1 : 0000000000000001 x0 : 0000000000000000 [ 360.422818] Call trace: [ 360.425299] __flush_work+0x414/0x470 [ 360.429012] __cancel_work_timer+0x140/0x1b0 [ 360.433340] cancel_delayed_work_sync+0x10/0x18 [ 360.437931] gpio_keys_quiesce_key+0x28/0x58 [gpio_keys] [ 360.443327] devm_action_release+0x10/0x18 [ 360.447481] release_nodes+0x8c/0x1a0 [ 360.451194] devres_release_all+0x90/0x100 [ 360.455346] device_unbind_cleanup+0x14/0x60 [ 360.459677] device_release_driver_internal+0xe8/0x168 [ 360.464883] driver_detach+0x4c/0x90 [ 360.468509] bus_remove_driver+0x54/0xb0 [ 360.472485] driver_unregister+0x2c/0x58 [ 360.476462] platform_driver_unregister+0x10/0x18 [ 360.481230] gpio_keys_exit+0x14/0x828 [gpio_keys] [ 360.486088] __arm64_sys_delete_module+0x1e0/0x270 [ 360.490945] invoke_syscall+0x40/0xf8 [ 360.494661] el0_svc_common.constprop.3+0xf0/0x110 [ 360.499515] do_el0_svc+0x20/0x78 [ 360.502877] el0_svc+0x48/0xf8 [ 360.505977] el0t_64_sync_handler+0x88/0xb0 [ 360.510216] el0t_64_sync+0x148/0x14c [ 360.513930] irq event stamp: 4306 [ 360.517288] hardirqs last enabled at (4305): [<ffff8000080b0300>] __cancel_work_timer+0x130/0x1b0 [ 360.526359] hardirqs last disabled at (4306): [<ffff800008d194fc>] el1_dbg+0x24/0x88 [ 360.534204] softirqs last enabled at (4278): [<ffff8000080104a0>] _stext+0x4a0/0x5e0 [ 360.542133] softirqs last disabled at (4267): [<ffff8000080932ac>] irq_exit_rcu+0x18c/0x1b0 [ 360.550591] ---[ end trace 0000000000000000 ]--- Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20220524135822.14764-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

When running with return thunks enabled under 32-bit EFI, the system crashes with: kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle page fault for address: 000000005bc02900 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0011) - permissions violation PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063 Oops: 0011 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:0x5bc02900 Code: Unable to access opcode bytes at RIP 0x5bc028d6. RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048 RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28 R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710 R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033 CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0 Call Trace: ? efi_set_virtual_address_map+0x9c/0x175 efi_enter_virtual_mode+0x4a6/0x53e start_kernel+0x67c/0x71e x86_64_start_reservations+0x24/0x2a x86_64_start_kernel+0xe9/0xf4 secondary_startup_64_no_verify+0xe5/0xeb That's because it cannot jump to the return thunk from the 32-bit code. Using a naked RET and marking it as safe allows the system to proceed booting. Fixes: aa3d480 ("x86: Use return-thunk in asm code") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: <stable@vger.kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

If a relocatable kernel is loaded at an address that is not 2MB aligned and told not to relocate to zero, the kernel can crash due to mark_rodata_ro() incorrectly changing some read-write data to read-only. Scenarios where the misalignment can occur are when the kernel is loaded by kdump or using the RELOCATABLE_TEST config option. Example crash with the kernel loaded at 5MB: Run /sbin/init as init process BUG: Unable to handle kernel data access on write at 0xc000000000452000 Faulting instruction address: 0xc0000000005b6730 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries NIP: c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380 REGS: c000000004503250 TRAP: 0300 Not tainted (6.2.0-rc1-00011-g349188be4841) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44288480 XER: 00000000 CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0 ... NIP memset+0x68/0x104 LR zero_user_segments.constprop.0+0xa8/0xf0 Call Trace: ext4_mpage_readpages+0x7f8/0x830 ext4_readahead+0x48/0x60 read_pages+0xb8/0x380 page_cache_ra_unbounded+0x19c/0x250 filemap_fault+0x58c/0xae0 __do_fault+0x60/0x100 __handle_mm_fault+0x1230/0x1a40 handle_mm_fault+0x120/0x300 ___do_page_fault+0x20c/0xa80 do_page_fault+0x30/0xc0 data_access_common_virt+0x210/0x220 This happens because mark_rodata_ro() tries to change permissions on the range _stext..__end_rodata, but _stext sits in the middle of the 2MB page from 4MB to 6MB: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec) The logic that changes the permissions assumes the linear mapping was split correctly at boot, so it marks the entire 2MB page read-only. That leads to the write fault above. To fix it, the boot time mapping logic needs to consider that if the kernel is running at a non-zero address then _stext is a boundary where it must split the mapping. That leads to the mapping being split correctly, allowing the rodata permission change to take happen correctly, with no spillover: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec) radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec) If the kernel is loaded at a 2MB aligned address, the mapping continues to use 2MB pages as before: radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec) radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages Fixes: c55d7b5 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au

@2

Recent additions in BPF like cpu v4 instructions, test_bpf module exhibits the following failures: test_bpf: #82 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #83 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #84 ALU64_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #85 ALU64_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #86 ALU64_MOVSX | BPF_W jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times) test_bpf: #165 ALU_SDIV_X: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #166 ALU_SDIV_K: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times) test_bpf: #169 ALU_SMOD_X: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #170 ALU_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #172 ALU64_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times) test_bpf: #313 BSWAP 16: 0x0123456789abcdef -> 0xefcd eBPF filter opcode 00d7 (@2) unsupported jited:0 301 PASS test_bpf: #314 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 555 PASS test_bpf: #315 BSWAP 64: 0x0123456789abcdef -> 0x67452301 eBPF filter opcode 00d7 (@2) unsupported jited:0 268 PASS test_bpf: #316 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 eBPF filter opcode 00d7 (@2) unsupported jited:0 269 PASS test_bpf: #317 BSWAP 16: 0xfedcba9876543210 -> 0x1032 eBPF filter opcode 00d7 (@2) unsupported jited:0 460 PASS test_bpf: #318 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 320 PASS test_bpf: #319 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe eBPF filter opcode 00d7 (@2) unsupported jited:0 222 PASS test_bpf: #320 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 eBPF filter opcode 00d7 (@2) unsupported jited:0 273 PASS test_bpf: #344 BPF_LDX_MEMSX | BPF_B eBPF filter opcode 0091 (@5) unsupported jited:0 432 PASS test_bpf: #345 BPF_LDX_MEMSX | BPF_H eBPF filter opcode 0089 (@5) unsupported jited:0 381 PASS test_bpf: #346 BPF_LDX_MEMSX | BPF_W eBPF filter opcode 0081 (@5) unsupported jited:0 505 PASS test_bpf: #490 JMP32_JA: Unconditional jump: if (true) return 1 eBPF filter opcode 0006 (@1) unsupported jited:0 261 PASS test_bpf: Summary: 1040 PASSED, 10 FAILED, [924/1038 JIT'ed] Fix them by adding missing processing. Fixes: daabb2b ("bpf/tests: add tests for cpuv4 instructions") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/91de862dda99d170697eb79ffb478678af7e0b27.1709652689.git.christophe.leroy@csgroup.eu

matttbe added enhancement iproute2 labels Feb 18, 2021

matttbe added this to Needs triage in MPTCP Future via automation Feb 18, 2021

matttbe mentioned this issue Feb 18, 2021

ADD_ADDR: ports support #54

Closed

5 tasks

matttbe assigned pabeni Feb 20, 2021

matttbe removed this from Needs triage in MPTCP Future Feb 20, 2021

matttbe added this to To do in MPTCP Next (5.12) via automation Feb 20, 2021

matttbe moved this from To do to Done in MPTCP Next (5.12) Feb 25, 2021

matttbe closed this as completed Mar 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

iproute2: ADD_ADDR: ports support #166

iproute2: ADD_ADDR: ports support #166

matttbe commented Feb 18, 2021

matttbe commented Feb 20, 2021

matttbe commented Mar 4, 2021

iproute2: ADD_ADDR: ports support #166

iproute2: ADD_ADDR: ports support #166

Comments

matttbe commented Feb 18, 2021

matttbe commented Feb 20, 2021

matttbe commented Mar 4, 2021