Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iproute2: ADD_ADDR: ports support #166

Closed
matttbe opened this issue Feb 18, 2021 · 2 comments
Closed

iproute2: ADD_ADDR: ports support #166

matttbe opened this issue Feb 18, 2021 · 2 comments

Comments

@matttbe
Copy link
Member

matttbe commented Feb 18, 2021

The kernel support announcing ADD_ADDR on a different port and even receiving them (+ receiving MP_JOIN on a different port) thanks to the work done on #54.

But from the userspace, it is not easy to do that without a support in iproute2.

The idea would be to do something like:

ip mptcp endpoint add 1.2.3.4 port 1234
@matttbe matttbe added this to Needs triage in MPTCP Future via automation Feb 18, 2021
@matttbe matttbe mentioned this issue Feb 18, 2021
5 tasks
@matttbe matttbe removed this from Needs triage in MPTCP Future Feb 20, 2021
@matttbe matttbe added this to To do in MPTCP Next (5.12) via automation Feb 20, 2021
@matttbe
Copy link
Member Author

matttbe commented Feb 20, 2021

@matttbe matttbe moved this from To do to Done in MPTCP Next (5.12) Feb 25, 2021
@matttbe
Copy link
Member Author

matttbe commented Mar 4, 2021

Patch has been already sent to netdev: https://patchwork.kernel.org/project/netdevbpf/patch/868cfad6ab2fbd7f4b2ac6522b3ec62fce858fed.1613767061.git.pabeni@redhat.com/

Patch has been merged! We can close this ticket.

@matttbe matttbe closed this as completed Mar 4, 2021
jenkins-tessares pushed a commit that referenced this issue May 6, 2022
The IPA BCM resource ("IP0") on sc7180 was moved to the clk-rpmh driver
in commit bcd63d2 ("clk: qcom: rpmh: Add IPA clock for SC7180") and
modeled as a clk, but this interconnect driver still had it modeled as
an interconnect. This was mostly OK because nobody used the interconnect
definition, until the interconnect framework started dropping bandwidth
requests on interconnects that aren't used via the sync_state callback
in commit 7d3b0b0 ("interconnect: qcom: Use icc_sync_state"). Once
that patch was applied the IP0 resource was going to be controlled from
two places, the clk framework and the interconnect framework.

Even then, things were probably going to be OK, because commit
b95b668 ("interconnect: qcom: icc-rpmh: Add BCMs to commit list in
pre_aggregate") was needed to actually drop bandwidth requests on unused
interconnects, of which the IPA was one of the interconnect that wasn't
getting dropped to zero. Combining the three commits together leads to
bad behavior where the interconnect framework is disabling the IP0
resource because it has no users while the clk framework thinks the IP0
resource is on because the only user, the IPA driver, has turned it on
via clk_prepare_enable(). Depending on when sync_state is called, we can
get into a situation like below:

  IPA driver probes
  IPA driver gets notified modem started
   runtime PM get()
    IPA clk enabled -> IP0 resource is ON
  sync_state runs
   interconnect zeroes out the IP0 resource -> IP0 resource is off
  IPA driver tries to access a register and blows up

The crash is an unclocked access that manifest as an SError.

 SError Interrupt on CPU0, code 0xbe000011 -- SError
 CPU: 0 PID: 3595 Comm: mmdata_mgr Not tainted 5.17.1+ #166
 Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
 pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : mutex_lock+0x4c/0x80
 lr : mutex_lock+0x30/0x80
 sp : ffffffc00da9b9c0
 x29: ffffffc00da9b9c0 x28: 0000000000000000 x27: 0000000000000000
 x26: ffffffc00da9bc90 x25: ffffff80c2024010 x24: ffffff80c2024000
 x23: ffffff8083100000 x22: ffffff80831000d0 x21: ffffff80831000a8
 x20: ffffff80831000a8 x19: ffffff8083100070 x18: 00000000ffff0a00
 x17: 000000002f7254f1 x16: 0000000000000100 x15: 0000000000000000
 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
 x11: 000000000001f0b8 x10: ffffffc00931f0b8 x9 : 0000000000000000
 x8 : 0000000000000000 x7 : fefefefefeff2f60 x6 : 0000808080808080
 x5 : 0000000000000000 x4 : 8080808080800000 x3 : ffffff80d2d4ee28
 x2 : ffffff808c1d6e40 x1 : 0000000000000000 x0 : ffffff8083100070
 Kernel panic - not syncing: Asynchronous SError Interrupt
 CPU: 0 PID: 3595 Comm: mmdata_mgr Not tainted 5.17.1+ #166
 Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
 Call trace:
  dump_backtrace+0xf4/0x114
  show_stack+0x24/0x30
  dump_stack_lvl+0x64/0x7c
  dump_stack+0x18/0x38
  panic+0x150/0x38c
  nmi_panic+0x88/0xa0
  arm64_serror_panic+0x74/0x80
  do_serror+0x0/0x80
  do_serror+0x58/0x80
  el1h_64_error_handler+0x34/0x4c
  el1h_64_error+0x78/0x7c
  mutex_lock+0x4c/0x80
  __gsi_channel_start+0x50/0x17c
  gsi_channel_start+0x54/0x90
  ipa_endpoint_enable_one+0x34/0xc0
  ipa_open+0x4c/0x120

Remove all IP0 resource management from the interconnect driver so that
clk-rpmh is the sole owner. This fixes the issue by preventing the
interconnect driver from overwriting the IP0 resource data that the
clk-rpmh driver wrote.

Cc: Alex Elder <elder@linaro.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Taniya Das <quic_tdas@quicinc.com>
Cc: Mike Tipton <quic_mdtipton@quicinc.com>
Fixes: b95b668 ("interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate")
Fixes: bcd63d2 ("clk: qcom: rpmh: Add IPA clock for SC7180")
Fixes: 7d3b0b0 ("interconnect: qcom: Use icc_sync_state")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Alex Elder <elder@linaro.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220412220033.1273607-2-swboyd@chromium.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
jenkins-tessares pushed a commit that referenced this issue Jun 3, 2022
gpio_keys module can either accept gpios or interrupts. The module
initializes delayed work in case of gpios only and is only used if
debounce timer is not used, so make sure cancel_delayed_work_sync()
is called only when its gpio-backed and debounce_use_hrtimer is false.

This fixes the issue seen below when the gpio_keys module is unloaded and
an interrupt pin is used instead of GPIO:

[  360.297569] ------------[ cut here ]------------
[  360.302303] WARNING: CPU: 0 PID: 237 at kernel/workqueue.c:3066 __flush_work+0x414/0x470
[  360.310531] Modules linked in: gpio_keys(-)
[  360.314797] CPU: 0 PID: 237 Comm: rmmod Not tainted 5.18.0-rc5-arm64-renesas-00116-g73636105874d-dirty #166
[  360.324662] Hardware name: Renesas SMARC EVK based on r9a07g054l2 (DT)
[  360.331270] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  360.338318] pc : __flush_work+0x414/0x470
[  360.342385] lr : __cancel_work_timer+0x140/0x1b0
[  360.347065] sp : ffff80000a7fba00
[  360.350423] x29: ffff80000a7fba00 x28: ffff000012b9c5c0 x27: 0000000000000000
[  360.357664] x26: ffff80000a7fbb80 x25: ffff80000954d0a8 x24: 0000000000000001
[  360.364904] x23: ffff800009757000 x22: 0000000000000000 x21: ffff80000919b000
[  360.372143] x20: ffff00000f5974e0 x19: ffff00000f5974e0 x18: ffff8000097fcf48
[  360.379382] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000053f40
[  360.386622] x14: ffff800009850e88 x13: 0000000000000002 x12: 000000000000a60c
[  360.393861] x11: 000000000000a610 x10: 0000000000000000 x9 : 0000000000000008
[  360.401100] x8 : 0101010101010101 x7 : 00000000a473c394 x6 : 0080808080808080
[  360.408339] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff80000919b458
[  360.415578] x2 : ffff8000097577f0 x1 : 0000000000000001 x0 : 0000000000000000
[  360.422818] Call trace:
[  360.425299]  __flush_work+0x414/0x470
[  360.429012]  __cancel_work_timer+0x140/0x1b0
[  360.433340]  cancel_delayed_work_sync+0x10/0x18
[  360.437931]  gpio_keys_quiesce_key+0x28/0x58 [gpio_keys]
[  360.443327]  devm_action_release+0x10/0x18
[  360.447481]  release_nodes+0x8c/0x1a0
[  360.451194]  devres_release_all+0x90/0x100
[  360.455346]  device_unbind_cleanup+0x14/0x60
[  360.459677]  device_release_driver_internal+0xe8/0x168
[  360.464883]  driver_detach+0x4c/0x90
[  360.468509]  bus_remove_driver+0x54/0xb0
[  360.472485]  driver_unregister+0x2c/0x58
[  360.476462]  platform_driver_unregister+0x10/0x18
[  360.481230]  gpio_keys_exit+0x14/0x828 [gpio_keys]
[  360.486088]  __arm64_sys_delete_module+0x1e0/0x270
[  360.490945]  invoke_syscall+0x40/0xf8
[  360.494661]  el0_svc_common.constprop.3+0xf0/0x110
[  360.499515]  do_el0_svc+0x20/0x78
[  360.502877]  el0_svc+0x48/0xf8
[  360.505977]  el0t_64_sync_handler+0x88/0xb0
[  360.510216]  el0t_64_sync+0x148/0x14c
[  360.513930] irq event stamp: 4306
[  360.517288] hardirqs last  enabled at (4305): [<ffff8000080b0300>] __cancel_work_timer+0x130/0x1b0
[  360.526359] hardirqs last disabled at (4306): [<ffff800008d194fc>] el1_dbg+0x24/0x88
[  360.534204] softirqs last  enabled at (4278): [<ffff8000080104a0>] _stext+0x4a0/0x5e0
[  360.542133] softirqs last disabled at (4267): [<ffff8000080932ac>] irq_exit_rcu+0x18c/0x1b0
[  360.550591] ---[ end trace 0000000000000000 ]---

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20220524135822.14764-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
jenkins-tessares pushed a commit that referenced this issue Jul 22, 2022
When running with return thunks enabled under 32-bit EFI, the system
crashes with:

  kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
  BUG: unable to handle page fault for address: 000000005bc02900
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0011) - permissions violation
  PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063
  Oops: 0011 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:0x5bc02900
  Code: Unable to access opcode bytes at RIP 0x5bc028d6.
  RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048
  RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b
  RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28
  R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710
  R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001
  FS:  0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050033
  CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0
  Call Trace:
   ? efi_set_virtual_address_map+0x9c/0x175
   efi_enter_virtual_mode+0x4a6/0x53e
   start_kernel+0x67c/0x71e
   x86_64_start_reservations+0x24/0x2a
   x86_64_start_kernel+0xe9/0xf4
   secondary_startup_64_no_verify+0xe5/0xeb

That's because it cannot jump to the return thunk from the 32-bit code.

Using a naked RET and marking it as safe allows the system to proceed
booting.

Fixes: aa3d480 ("x86: Use return-thunk in asm code")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: <stable@vger.kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jenkins-tessares pushed a commit that referenced this issue Feb 10, 2023
If a relocatable kernel is loaded at an address that is not 2MB aligned
and told not to relocate to zero, the kernel can crash due to
mark_rodata_ro() incorrectly changing some read-write data to read-only.

Scenarios where the misalignment can occur are when the kernel is
loaded by kdump or using the RELOCATABLE_TEST config option.

Example crash with the kernel loaded at 5MB:

  Run /sbin/init as init process
  BUG: Unable to handle kernel data access on write at 0xc000000000452000
  Faulting instruction address: 0xc0000000005b6730
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
  ...
  NIP memset+0x68/0x104
  LR  zero_user_segments.constprop.0+0xa8/0xf0
  Call Trace:
    ext4_mpage_readpages+0x7f8/0x830
    ext4_readahead+0x48/0x60
    read_pages+0xb8/0x380
    page_cache_ra_unbounded+0x19c/0x250
    filemap_fault+0x58c/0xae0
    __do_fault+0x60/0x100
    __handle_mm_fault+0x1230/0x1a40
    handle_mm_fault+0x120/0x300
    ___do_page_fault+0x20c/0xa80
    do_page_fault+0x30/0xc0
    data_access_common_virt+0x210/0x220

This happens because mark_rodata_ro() tries to change permissions on the
range _stext..__end_rodata, but _stext sits in the middle of the 2MB
page from 4MB to 6MB:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)

The logic that changes the permissions assumes the linear mapping was
split correctly at boot, so it marks the entire 2MB page read-only. That
leads to the write fault above.

To fix it, the boot time mapping logic needs to consider that if the
kernel is running at a non-zero address then _stext is a boundary where
it must split the mapping.

That leads to the mapping being split correctly, allowing the rodata
permission change to take happen correctly, with no spillover:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)

If the kernel is loaded at a 2MB aligned address, the mapping continues
to use 2MB pages as before:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages

Fixes: c55d7b5 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au
matttbe pushed a commit that referenced this issue May 20, 2024
Recent additions in BPF like cpu v4 instructions, test_bpf module
exhibits the following failures:

  test_bpf: #82 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #83 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #84 ALU64_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #85 ALU64_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #86 ALU64_MOVSX | BPF_W jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)

  test_bpf: #165 ALU_SDIV_X: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times)
  test_bpf: #166 ALU_SDIV_K: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times)

  test_bpf: #169 ALU_SMOD_X: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)
  test_bpf: #170 ALU_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)

  test_bpf: #172 ALU64_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)

  test_bpf: #313 BSWAP 16: 0x0123456789abcdef -> 0xefcd
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 301 PASS
  test_bpf: #314 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 555 PASS
  test_bpf: #315 BSWAP 64: 0x0123456789abcdef -> 0x67452301
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 268 PASS
  test_bpf: #316 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 269 PASS
  test_bpf: #317 BSWAP 16: 0xfedcba9876543210 -> 0x1032
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 460 PASS
  test_bpf: #318 BSWAP 32: 0xfedcba9876543210 -> 0x10325476
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 320 PASS
  test_bpf: #319 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 222 PASS
  test_bpf: #320 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 273 PASS

  test_bpf: #344 BPF_LDX_MEMSX | BPF_B
  eBPF filter opcode 0091 (@5) unsupported
  jited:0 432 PASS
  test_bpf: #345 BPF_LDX_MEMSX | BPF_H
  eBPF filter opcode 0089 (@5) unsupported
  jited:0 381 PASS
  test_bpf: #346 BPF_LDX_MEMSX | BPF_W
  eBPF filter opcode 0081 (@5) unsupported
  jited:0 505 PASS

  test_bpf: #490 JMP32_JA: Unconditional jump: if (true) return 1
  eBPF filter opcode 0006 (@1) unsupported
  jited:0 261 PASS

  test_bpf: Summary: 1040 PASSED, 10 FAILED, [924/1038 JIT'ed]

Fix them by adding missing processing.

Fixes: daabb2b ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/91de862dda99d170697eb79ffb478678af7e0b27.1709652689.git.christophe.leroy@csgroup.eu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

2 participants