Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syzkaller: memory leak in subflow_create_ctx #356

Closed
cpaasch opened this issue Feb 14, 2023 · 2 comments
Closed

syzkaller: memory leak in subflow_create_ctx #356

cpaasch opened this issue Feb 14, 2023 · 2 comments

Comments

@cpaasch
Copy link
Member

cpaasch commented Feb 14, 2023

Head: 89ce2ca (02/28)
3038497 (02/27)
0150d5b

Trace:

BUG: memory leak
unreferenced object 0xffff88812df5a700 (size 256):
  comm "syz-executor.3", pid 2701, jiffies 4297168679 (age 63.966s)
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de  ........".......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000109c3810>] kmalloc_trace+0x21/0x90 mm/slab_common.c:1062
    [<00000000a01124a9>] kmalloc include/linux/slab.h:580 [inline]
    [<00000000a01124a9>] kzalloc include/linux/slab.h:720 [inline]
    [<00000000a01124a9>] subflow_create_ctx+0x5b/0xf0 net/mptcp/subflow.c:1752
    [<00000000a180cdc9>] subflow_ulp_init+0x3a/0x140 net/mptcp/subflow.c:1832
    [<00000000cf1a8bf5>] __tcp_set_ulp net/ipv4/tcp_ulp.c:146 [inline]
    [<00000000cf1a8bf5>] tcp_set_ulp+0x142/0x230 net/ipv4/tcp_ulp.c:167
    [<00000000382f383a>] mptcp_subflow_create_socket+0x171/0x300 net/mptcp/subflow.c:1715
    [<000000002fe382d9>] __mptcp_socket_create net/mptcp/protocol.c:89 [inline]
    [<000000002fe382d9>] __mptcp_nmpc_socket.part.0+0x83/0x1e0 net/mptcp/protocol.c:122
    [<00000000fa3a0e61>] __mptcp_nmpc_socket net/mptcp/protocol.c:115 [inline]
    [<00000000fa3a0e61>] mptcp_bind+0x55/0xe0 net/mptcp/protocol.c:3714
    [<00000000e3b192be>] __sys_bind+0x89/0x100 net/socket.c:1801
    [<00000000f1ffd90b>] __do_sys_bind net/socket.c:1812 [inline]
    [<00000000f1ffd90b>] __se_sys_bind net/socket.c:1810 [inline]
    [<00000000f1ffd90b>] __x64_sys_bind+0x18/0x20 net/socket.c:1810
    [<000000000703658a>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<000000000703658a>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:80
    [<00000000d12f04a9>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

No reproducers yet.

Kconfig: Kconfig_k8_kmemleak.txt

@matttbe matttbe added this to Needs triage in MPTCP Bugs via automation Feb 15, 2023
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Mar 1, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#356
Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
matttbe pushed a commit that referenced this issue Mar 6, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
@matttbe matttbe closed this as completed in 7161321 Mar 6, 2023
matttbe pushed a commit that referenced this issue Mar 6, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
matttbe pushed a commit that referenced this issue Mar 7, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
matttbe pushed a commit that referenced this issue Mar 7, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
@matttbe
Copy link
Member

matttbe commented Mar 7, 2023

@cpaasch this issue has been fixed we think but hard to confirm without reproducer.

matttbe pushed a commit that referenced this issue Mar 7, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
jenkins-tessares pushed a commit that referenced this issue Mar 8, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
jenkins-tessares pushed a commit that referenced this issue Mar 8, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
jenkins-tessares pushed a commit that referenced this issue Mar 9, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
jenkins-tessares pushed a commit that referenced this issue Mar 9, 2023
When subflow_syn_recv_sock() loses the inet hash insert race, the
newly created children will be released by inet_csk_complete_hashdance().

In that scenario, without any further hint, the ulp release callback
will keep the ulp context alive, expecting that the msk socket will
later free it.

Anyway the dying child is not linked to any msk socket, and the context
will be leaked, as reported by Christoph.

Address the issue explicitly releasing the context in the critical
scenario.

Fixes: cec37a6 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: #356
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
matttbe added a commit that referenced this issue Mar 9, 2023
This reverts commit ccd4c33.

According to Paolo, this one is causing issue #356 and indirectly #370.

Closes: #356
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
@matttbe
Copy link
Member

matttbe commented Mar 9, 2023

FYI, from IRC, Paolo pointed out that this issue was caused by "mptcp: drop legacy code", requiring "mptcp: fix possible memory leak on syn_recv race" to fix this issue but then causing #370.

It has now been reverted:

  • 4ef602b: Revert "mptcp: drop legacy code"

matttbe pushed a commit that referenced this issue Dec 1, 2023
With latest upstream llvm18, the following test cases failed:

  $ ./test_progs -j
  #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
  #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
  #13      bpf_cookie:FAIL
  #77      fentry_fexit:FAIL
  #78/1    fentry_test/fentry:FAIL
  #78      fentry_test:FAIL
  #82/1    fexit_test/fexit:FAIL
  #82      fexit_test:FAIL
  #112/1   kprobe_multi_test/skel_api:FAIL
  #112/2   kprobe_multi_test/link_api_addrs:FAIL
  [...]
  #112     kprobe_multi_test:FAIL
  #356/17  test_global_funcs/global_func17:FAIL
  #356     test_global_funcs:FAIL

Further analysis shows llvm upstream patch [1] is responsible for the above
failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c,
without [1], the asm code is:

  0000000000000400 <bpf_fentry_test7>:
     400: f3 0f 1e fa                   endbr64
     404: e8 00 00 00 00                callq   0x409 <bpf_fentry_test7+0x9>
     409: 48 89 f8                      movq    %rdi, %rax
     40c: c3                            retq
     40d: 0f 1f 00                      nopl    (%rax)

... and with [1], the asm code is:

  0000000000005d20 <bpf_fentry_test7.specialized.1>:
    5d20: e8 00 00 00 00                callq   0x5d25 <bpf_fentry_test7.specialized.1+0x5>
    5d25: c3                            retq

... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7>
and this caused test failures for #13/#77 etc. except #356.

For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog
looks like:

  0000000000000000 <global_func17>:
       0:       b4 00 00 00 2a 00 00 00 w0 = 0x2a
       1:       95 00 00 00 00 00 00 00 exit

... which passed verification while the test itself expects a verification
failure.

Let us add 'barrier_var' style asm code in both places to prevent function
specialization which caused selftests failure.

  [1] llvm/llvm-project#72903

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231127050342.1945270-1-yonghong.song@linux.dev
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
MPTCP Bugs
  
Needs triage
Development

No branches or pull requests

2 participants