Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UML #21

Closed
wants to merge 1 commit into from
Closed

UML #21

wants to merge 1 commit into from

Conversation

asamy
Copy link
Contributor

@asamy asamy commented Aug 24, 2012

No description provided.

mturquette pushed a commit to mturquette/linux that referenced this pull request Aug 25, 2012
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
bootc pushed a commit to bootc/linux that referenced this pull request Aug 25, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
@asamy asamy closed this Aug 25, 2012
liubogithub pushed a commit to liubogithub/btrfs-work that referenced this pull request Aug 29, 2012
When hot-adding a CPU, the system outputs following messages
since node_to_cpumask_map[2] was not allocated memory.

Booting Node 2 Processor 32 APIC 0xc0
node_to_cpumask_map[2] NULL
Pid: 0, comm: swapper/32 Tainted: G       A     3.3.5-acd torvalds#21
Call Trace:
 [<ffffffff81048845>] debug_cpumask_set_cpu+0x155/0x160
 [<ffffffff8105e28a>] ? add_timer_on+0xaa/0x120
 [<ffffffff8150665f>] numa_add_cpu+0x1e/0x22
 [<ffffffff815020bb>] identify_cpu+0x1df/0x1e4
 [<ffffffff815020d6>] identify_econdary_cpu+0x16/0x1d
 [<ffffffff81504614>] smp_store_cpu_info+0x3c/0x3e
 [<ffffffff81505263>] smp_callin+0x139/0x1be
 [<ffffffff815052fb>] start_secondary+0x13/0xeb

The reason is that the bit of node 2 was not set at
numa_nodes_parsed. numa_nodes_parsed is set by only
acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init. Thus even if hot-added memory
which is same PXM as hot-added CPU is written in ACPI SRAT
Table, if the hot-added CPU is not written in ACPI SRAT table,
numa_nodes_parsed is not set.

But according to ACPI Spec Rev 5.0, it says about ACPI SRAT
table as follows: This optional table provides information that
allows OSPM to associate processors and memory ranges, including
ranges of memory provided by hot-added memory devices, with
system localities / proximity domains and clock domains.

It means that ACPI SRAT table only provides information for CPUs
present at boot time and for memory including hot-added memory.
So hot-added memory is written in ACPI SRAT table, but hot-added
CPU is not written in it. Thus numa_nodes_parsed should be set
by not only acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init but also
acpi_numa_memory_affinity_init for the case.

Additionally, if system has cpuless memory node,
acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init cannot set numa_nodes_parseds
since these functions cannot find cpu description for the node.
In this case, numa_nodes_parsed needs to be set by
acpi_numa_memory_affinity_init.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: liuj97@gmail.com
Cc: kosaki.motohiro@gmail.com
Link: http://lkml.kernel.org/r/4FCC2098.4030007@jp.fujitsu.com
[ merged it ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
liubogithub pushed a commit to liubogithub/btrfs-work that referenced this pull request Aug 29, 2012
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU torvalds#6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  torvalds#6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  torvalds#7 torvalds#8 torvalds#9 torvalds#10 torvalds#11 Ok.
[    0.807108] Booting Node   3, Processors  torvalds#12 torvalds#13 torvalds#14 torvalds#15 torvalds#16 torvalds#17 Ok.
[    0.897587] Booting Node   2, Processors  torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
jbrandeb pushed a commit to jbrandeb/linux that referenced this pull request Aug 29, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shr-project pushed a commit to shr-distribution/linux that referenced this pull request Aug 30, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
RobertCNelson pushed a commit to RobertCNelson/linux that referenced this pull request Aug 30, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton at redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust at netapp.com>
Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
Quarx2k pushed a commit to Quarx2k/linux-allwinner that referenced this pull request Sep 9, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Sep 11, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
baerwolf pushed a commit to baerwolf/linux-stephan that referenced this pull request Sep 12, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
RobertCNelson pushed a commit to RobertCNelson/linux that referenced this pull request Sep 12, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
hno pushed a commit to hno/linux that referenced this pull request Sep 12, 2012
Fixes issue torvalds#21 on amery/linux-allwinner
hno pushed a commit to hno/linux that referenced this pull request Sep 12, 2012
Fix build using O= (issue torvalds#21) and inline build on CM9
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 2, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 2, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 4, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 4, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
noamc referenced this pull request in Mellanox/linux Oct 16, 2012
…rq() workaround for clockevent Timer

request_irq() for TIMER0 failing on CPU1

Ideally we want to use the request_percpu_irq( ) / enable_percpu_irq()
calls from GENERIC_IRQ framework, however that seems to be faltering
even on the boot cpu at the time of first interrupt.

Until that is resolved (with Thomas G), we need to pretend that
TIMER0 is IRQF_SHARED. This also requires yet another hack of explicitly
unmasking the IRQ on that CPU.

Query sent to Thomas Gleixner

======================>8====================================
In a SMP setup, each ARC700 CPU has a in-core TIMER, hooked up to
private IRQ 3 of respective CPU and would serve as the local
clock_event_device.

request_irq( ) for my first CPU which succeeds, looks roughly as
follows:

	void __cpuinit arc_clockevent_init(void)
	{
	    int rc;
	    unsigned int cpu = smp_processor_id();
	    struct clock_event_device *evt = &per_cpu(arc_clockevent_device,
							cpu);
	....
	    rc = request_irq(TIMER0_INT, timer_irq_handler,
        	    IRQF_TIMER | IRQF_DISABLED | IRQF_PERCPU,
	            "Timer0 (clock-evt-dev)", evt);
	....

The exact same call, when done from 2nd CPU fails, as it wants to see
IRQF_SHARED which is semantically not correct, since IRQ is not really
shared, it is a private instance (albeit same value), per cpu.

I figured that the right APIs for our case is the pair:
(request|enable)_percpu_irq to be called for both CPUs, with a prior one
time call to irq_set_percpu_devid().  Is that correct?

Assuming it is, the trouble now is that, even on the first CPU,
handle_level_irq( ) is bailing out w/o calling handle_irq_event()
because irqd_irq_disabled( ) is true. This in turn happens because,
irq_set_percpu_devid(), our much needed init routine, sets IRQ_NOAUTOEN
causing __setup_irq( ) to skip calling irq_startup() => irq_enable()
which would have cleared IRQD_IRQ_DISABLED.

While enable_percpu_irq( ), could have fixed this, it only seems to be
unmasking IRQ at device level, it is not clearing the above flag.

I tried calling enable_irq( ) right after, but that doesn't seem to help
either.
What API am I missing here, to enable the irqd machinery, or am I seeing
a bug where enable_percpu_irq( ) call-chain should somehow be doing it.

======================>8====================================

This needs to be reverted and replaced with right calls once ThomasG
responds to my query.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
noamc referenced this pull request in Mellanox/linux Oct 16, 2012
…equest_irq() workaround for clockevent Timer"

This reverts commit 2985184.

Next commit uses the correct APIs, so we no longer need this hack
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 17, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 17, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
hknkkn pushed a commit to hknkkn/linux-dynticks that referenced this pull request Oct 29, 2012
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 torvalds#31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 torvalds#6 torvalds#7 Ok.
Booting Node   1, Processors  torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 Ok.
Booting Node   0, Processors  torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok.
Booting Node   1, Processors  torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 31, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 31, 2012
commit a3f83ab upstream.

At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Oct 31, 2012
request_irq() for TIMER0 failing on CPU1

Ideally we want to use the request_percpu_irq( ) / enable_percpu_irq()
calls from GENERIC_IRQ framework, however that seems to be faltering
even on the boot cpu at the time of first interrupt.

Until that is resolved (with Thomas G), we need to pretend that
TIMER0 is IRQF_SHARED. This also requires yet another hack of explicitly
unmasking the IRQ on that CPU.

Query sent to Thomas Gleixner

======================>8====================================
In a SMP setup, each ARC700 CPU has a in-core TIMER, hooked up to
private IRQ 3 of respective CPU and would serve as the local
clock_event_device.

request_irq( ) for my first CPU which succeeds, looks roughly as
follows:

	void __cpuinit arc_clockevent_init(void)
	{
	    int rc;
	    unsigned int cpu = smp_processor_id();
	    struct clock_event_device *evt = &per_cpu(arc_clockevent_device,
							cpu);
	....
	    rc = request_irq(TIMER0_INT, timer_irq_handler,
        	    IRQF_TIMER | IRQF_DISABLED | IRQF_PERCPU,
	            "Timer0 (clock-evt-dev)", evt);
	....

The exact same call, when done from 2nd CPU fails, as it wants to see
IRQF_SHARED which is semantically not correct, since IRQ is not really
shared, it is a private instance (albeit same value), per cpu.

I figured that the right APIs for our case is the pair:
(request|enable)_percpu_irq to be called for both CPUs, with a prior one
time call to irq_set_percpu_devid().  Is that correct?

Assuming it is, the trouble now is that, even on the first CPU,
handle_level_irq( ) is bailing out w/o calling handle_irq_event()
because irqd_irq_disabled( ) is true. This in turn happens because,
irq_set_percpu_devid(), our much needed init routine, sets IRQ_NOAUTOEN
causing __setup_irq( ) to skip calling irq_startup() => irq_enable()
which would have cleared IRQD_IRQ_DISABLED.

While enable_percpu_irq( ), could have fixed this, it only seems to be
unmasking IRQ at device level, it is not clearing the above flag.

I tried calling enable_irq( ) right after, but that doesn't seem to help
either.
What API am I missing here, to enable the irqd machinery, or am I seeing
a bug where enable_percpu_irq( ) call-chain should somehow be doing it.

======================>8====================================

This needs to be reverted and replaced with right calls once ThomasG
responds to my query.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Oct 31, 2012
…nt Timer"

This reverts commit 2985184.

Next commit uses the correct APIs, so we no longer need this hack

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
jadonk pushed a commit to jadonk/linux that referenced this pull request Nov 13, 2012
At a boot time I observed following bug:

 BUG: unable to handle kernel paging request at ffff8800a4244000
 IP: [<ffffffff81275b5b>] memcpy+0xb/0x120
 PGD 1816063 PUD 1fe7d067 PMD 1ff9f067 PTE 80000000a4244160
 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
 CPU 0
 Modules linked in: btusb bluetooth brcmsmac brcmutil crc8 cordic b43 radeon(+)
  mac80211 cfg80211 ttm ohci_hcd drm_kms_helper rfkill drm ssb agpgart mmc_core
  sp5100_tco video battery ac thermal processor rtc_cmos thermal_sys snd_hda_codec_hdmi
  joydev snd_hda_codec_conexant button bcma pcmcia snd_hda_intel snd_hda_codec
  snd_hwdep snd_pcm shpchp pcmcia_core k8temp snd_timer atl1c snd psmouse hwmon
  i2c_piix4 i2c_algo_bit soundcore evdev i2c_core ehci_hcd sg serio_raw snd_page_alloc
  loop btrfs

 Pid: 1008, comm: modprobe Not tainted 3.3.0-rc1 torvalds#21 LENOVO 20046                           /AMD CRB
 RIP: 0010:[<ffffffff81275b5b>]  [<ffffffff81275b5b>] memcpy+0xb/0x120
 RSP: 0018:ffff8800aa72db00  EFLAGS: 00010246
 RAX: ffff8800a4150000 RBX: 0000000000001000 RCX: 0000000000000087
 RDX: 0000000000000000 RSI: ffff8800a4244000 RDI: ffff8800a4150bc8
 RBP: ffff8800aa72db78 R08: 0000000000000010 R09: ffffffff8174bbec
 R10: ffffffff812ee010 R11: 0000000000000001 R12: 0000000000001000
 R13: 0000000000010000 R14: ffff8800a4140000 R15: ffff8800aaba1800
 FS:  00007ff9a3bd4720(0000) GS:ffff8800afa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffff8800a4244000 CR3: 00000000a9c18000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 1008, threadinfo ffff8800aa72c000, task ffff8800aa0e4000)
 Stack:
  ffffffffa04e7c7b 0000000000000001 0000000000010000 ffff8800aa72db28
  ffffffff00000001 0000000000001000 ffffffff8113cbef 0000000000000020
  ffff8800a4243420 ffff880000000002 ffff8800aa72db08 ffff8800a9d42000
 Call Trace:
  [<ffffffffa04e7c7b>] ? radeon_atrm_get_bios_chunk+0x8b/0xd0 [radeon]
  [<ffffffff8113cbef>] ? kmalloc_order_trace+0x3f/0xb0
  [<ffffffffa04a9298>] radeon_get_bios+0x68/0x2f0 [radeon]
  [<ffffffffa04c7a30>] rv770_init+0x40/0x280 [radeon]
  [<ffffffffa047d740>] radeon_device_init+0x560/0x600 [radeon]
  [<ffffffffa047ef4f>] radeon_driver_load_kms+0xaf/0x170 [radeon]
  [<ffffffffa043cdde>] drm_get_pci_dev+0x18e/0x2c0 [drm]
  [<ffffffffa04e7e95>] radeon_pci_probe+0xad/0xb5 [radeon]
  [<ffffffff81296c5f>] local_pci_probe+0x5f/0xd0
  [<ffffffff81297418>] pci_device_probe+0x88/0xb0
  [<ffffffff813417aa>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff813418d8>] really_probe+0x68/0x180
  [<ffffffff81341be5>] driver_probe_device+0x45/0x70
  [<ffffffff81341cb3>] __driver_attach+0xa3/0xb0
  [<ffffffff81341c10>] ? driver_probe_device+0x70/0x70
  [<ffffffff813400ce>] bus_for_each_dev+0x5e/0x90
  [<ffffffff8134172e>] driver_attach+0x1e/0x20
  [<ffffffff81341298>] bus_add_driver+0xc8/0x280
  [<ffffffff813422c6>] driver_register+0x76/0x140
  [<ffffffff812976d6>] __pci_register_driver+0x66/0xe0
  [<ffffffffa043d021>] drm_pci_init+0x111/0x120 [drm]
  [<ffffffff8133c67a>] ? vga_switcheroo_register_handler+0x3a/0x60
  [<ffffffffa0229000>] ? 0xffffffffa0228fff
  [<ffffffffa02290ec>] radeon_init+0xec/0xee [radeon]
  [<ffffffff810002f2>] do_one_initcall+0x42/0x180
  [<ffffffff8109d8d2>] sys_init_module+0x92/0x1e0
  [<ffffffff815407a9>] system_call_fastpath+0x16/0x1b
 Code: 58 2a 43 50 88 43 4e 48 83 c4 08 5b c9 c3 66 90 e8 cb fd ff ff eb
  e6 90 90 90 90 90 90 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48
  a5 89 d1 f3 a4 c3 20 48 83 ea 20 4c 8b 06 4c 8b 4e 08 4c
 RIP  [<ffffffff81275b5b>] memcpy+0xb/0x120
  RSP <ffff8800aa72db00>
 CR2: ffff8800a4244000
 ---[ end trace fcffa1599cf56382 ]---

Call to acpi_evaluate_object() not always returns 4096 bytes chunks,
on my system it can return 2048 bytes chunk, so pass the length of
retrieved chunk to memcpy(), not the length of the recieving buffer.

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
jadonk pushed a commit to jadonk/linux that referenced this pull request Nov 13, 2012
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 torvalds#6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 torvalds#7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 torvalds#9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
torvalds#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
torvalds#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
torvalds#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
torvalds#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
torvalds#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
torvalds#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
torvalds#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
torvalds#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
torvalds#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
torvalds#19 [ffff88021f007e68] worker_thread at ffffffff81060513
torvalds#20 [ffff88021f007ee8] kthread at ffffffff810648b6
torvalds#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 12, 2024
If function-like macros do not utilize a parameter, it might result in a
build warning.  In our coding style guidelines, we advocate for utilizing
static inline functions to replace such macros.  This patch verifies
compliance with the new rule.

For a macro such as the one below,

 #define test(a) do { } while (0)

The test result is as follows.

 WARNING: Argument 'a' is not used in function-like macro
 torvalds#21: FILE: mm/init-mm.c:20:
 +#define test(a) do { } while (0)

 total: 0 errors, 1 warnings, 8 lines checked

Link: https://lkml.kernel.org/r/20240507032757.146386-3-21cnbao@gmail.com
Signed-off-by: Xining Xu <mac.xxn@outlook.com>
Tested-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Joe Perches <joe@perches.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Huacai Chen <chenhuacai@loongson.cn>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mark Brown <broonie@kernel.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Charlemagne Lasse <charlemagnelasse@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request May 15, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 15, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 16, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing that referenced this pull request May 17, 2024
… non head_frag

The crashed kernel version is 5.16.20, and I have not test this patch
because I dont find a way to reproduce it, and the mailine may be
has the same problem.

When using bpf based NAT, hits a kernel BUG_ON at function skb_segment(),
BUG_ON(skb_headlen(list_skb) > len). The bpf calls the bpf_skb_adjust_room
to decrease the gso_size, and then call bpf_redirect send packet out.

call stack:
...
   [exception RIP: skb_segment+3016]
    RIP: ffffffffb97df2a8  RSP: ffffa3f2cce08728  RFLAGS: 00010293
    RAX: 000000000000007d  RBX: 00000000fffff7b3  RCX: 0000000000000011
    RDX: 0000000000000000  RSI: ffff895ea32c76c0  RDI: 00000000000008c1
    RBP: ffffa3f2cce087f8   R8: 000000000000088f   R9: 0000000000000011
    R10: 000000000000090c  R11: ffff895e47e68000  R12: ffff895eb2022f00
    R13: 000000000000004b  R14: ffff895ecdaf2000  R15: ffff895eb2023f00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 torvalds#9 [ffffa3f2cce08720] skb_segment at ffffffffb97ded63
torvalds#10 [ffffa3f2cce08800] tcp_gso_segment at ffffffffb98d0320
torvalds#11 [ffffa3f2cce08860] tcp4_gso_segment at ffffffffb98d07a3
torvalds#12 [ffffa3f2cce08880] inet_gso_segment at ffffffffb98e6de0
torvalds#13 [ffffa3f2cce088e0] skb_mac_gso_segment at ffffffffb97f3741
torvalds#14 [ffffa3f2cce08918] skb_udp_tunnel_segment at ffffffffb98daa59
torvalds#15 [ffffa3f2cce08980] udp4_ufo_fragment at ffffffffb98db471
torvalds#16 [ffffa3f2cce089b0] inet_gso_segment at ffffffffb98e6de0
torvalds#17 [ffffa3f2cce08a10] skb_mac_gso_segment at ffffffffb97f3741
torvalds#18 [ffffa3f2cce08a48] __skb_gso_segment at ffffffffb97f388e
torvalds#19 [ffffa3f2cce08a78] validate_xmit_skb at ffffffffb97f3d6e
torvalds#20 [ffffa3f2cce08ab8] __dev_queue_xmit at ffffffffb97f4614
torvalds#21 [ffffa3f2cce08b50] dev_queue_xmit at ffffffffb97f5030
torvalds#22 [ffffa3f2cce08b60] __bpf_redirect at ffffffffb98199a8
torvalds#23 [ffffa3f2cce08b88] skb_do_redirect at ffffffffb98205cd
...

The skb has the following properties:
    doffset = 66
    list_skb = skb_shinfo(skb)->frag_list
    list_skb->head_frag = true
    skb->len = 2441 && skb->data_len = 2250
    skb_shinfo(skb)->nr_frags = 17
    skb_shinfo(skb)->gso_size = 75
    skb_shinfo(skb)->frags[0...16].bv_len = 125
    list_skb->len = 125
    list_skb->data_len = 0

3962 struct sk_buff *skb_segment(struct sk_buff *head_skb,
3963                             netdev_features_t features)
3964 {
3965         struct sk_buff *segs = NULL;
3966         struct sk_buff *tail = NULL;
...
4181                 while (pos < offset + len) {
4182                         if (i >= nfrags) {
4183                                 i = 0;
4184                                 nfrags = skb_shinfo(list_skb)->nr_frags;
4185                                 frag = skb_shinfo(list_skb)->frags;
4186                                 frag_skb = list_skb;

After segment the head_skb's last frag, the (pos == offset+len), so break the
while at line 4181, run into this BUG_ON(), not segment the head_frag frag_list
skb.

Since commit 13acc94(net: permit skb_segment on head_frag frag_list skb),
it is allowed to segment the head_frag frag_list skb.

In commit 3dcbdb1 (net: gso: Fix skb_segment splat when splitting gso_size
mangled skb having linear-headed frag_list), it is cleared the NETIF_F_SG if it
has non head_frag skb. It is not cleared the NETIF_F_SG only with one head_frag
frag_list skb.

Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
Kaz205 pushed a commit to Kaz205/linux that referenced this pull request Jun 3, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 3, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 3, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kaz205 pushed a commit to Kaz205/linux that referenced this pull request Jun 5, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 6, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 6, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 9, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 9, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
hdeller pushed a commit to hdeller/linux that referenced this pull request Jun 12, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Jun 12, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Jun 12, 2024
[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561
    torvalds#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 17, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 18, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 20, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 20, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 22, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 25, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 25, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  torvalds#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  torvalds#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  torvalds#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  torvalds#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
torvalds#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
torvalds#11 dio_complete at ffffffff8c2b9fa7
torvalds#12 do_blockdev_direct_IO at ffffffff8c2bc09f
torvalds#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
torvalds#14 generic_file_direct_write at ffffffff8c1dcf14
torvalds#15 __generic_file_write_iter at ffffffff8c1dd07b
torvalds#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
torvalds#17 aio_write at ffffffff8c2cc72e
torvalds#18 kmem_cache_alloc at ffffffff8c248dde
torvalds#19 do_io_submit at ffffffff8c2ccada
torvalds#20 do_syscall_64 at ffffffff8c004984
torvalds#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant