Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear license on USB code #33

Closed
hno opened this issue Jun 4, 2012 · 1 comment
Closed

Unclear license on USB code #33

hno opened this issue Jun 4, 2012 · 1 comment
Assignees

Comments

@hno
Copy link
Member

hno commented Jun 4, 2012

drivers/usb/sun[345]i_usb/* is missing GPL license headers.

@ghost ghost assigned hipboi Jun 4, 2012
@hipboi
Copy link

hipboi commented Jun 5, 2012

fixed

@hipboi hipboi closed this as completed Jun 5, 2012
amery pushed a commit that referenced this issue Oct 13, 2012
…ck ordering

So after kicking around commit 547ac4c around a bit more, a tcm_qla2xxx LUN
unlink OP has generated the following warning:

[   50.386625] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000.
[   70.572988] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged...
[  126.527531] ------------[ cut here ]------------
[  126.532677] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x41/0x8c()
[  126.540433] Hardware name: S5520HC
[  126.544248] Modules linked in: tcm_vhost ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx tcm_loop tcm_fc libfc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh loop i2c_i801 kvm_intel kvm crc32c_intel i2c_core microcode joydev button iomemory_vsl(O) pcspkr ext3 jbd uhci_hcd lpfc ata_piix libata ehci_hcd qla2xxx mlx4_core scsi_transport_fc scsi_tgt igb [last unloaded: scsi_wait_scan]
[  126.595567] Pid: 3283, comm: unlink Tainted: G           O 3.5.0-rc2+ #33
[  126.603128] Call Trace:
[  126.605853]  [<ffffffff81026b91>] ? warn_slowpath_common+0x78/0x8c
[  126.612737]  [<ffffffff8102c342>] ? local_bh_enable_ip+0x41/0x8c
[  126.619433]  [<ffffffffa03582a2>] ? core_disable_device_list_for_node+0x70/0xe3 [target_core_mod]
[  126.629323]  [<ffffffffa035849f>] ? core_clear_lun_from_tpg+0x88/0xeb [target_core_mod]
[  126.638244]  [<ffffffffa0362ec1>] ? core_tpg_post_dellun+0x17/0x48 [target_core_mod]
[  126.646873]  [<ffffffffa03575ee>] ? core_dev_del_lun+0x26/0x8c [target_core_mod]
[  126.655114]  [<ffffffff810bcbd1>] ? dput+0x27/0x154
[  126.660549]  [<ffffffffa0359aa0>] ? target_fabric_port_unlink+0x3b/0x41 [target_core_mod]
[  126.669661]  [<ffffffffa034a698>] ? configfs_unlink+0xfc/0x14a [configfs]
[  126.677224]  [<ffffffff810b5979>] ? vfs_unlink+0x58/0xb7
[  126.683141]  [<ffffffff810b6ef3>] ? do_unlinkat+0xbb/0x142
[  126.689253]  [<ffffffff81330c75>] ? page_fault+0x25/0x30
[  126.695170]  [<ffffffff81335df9>] ? system_call_fastpath+0x16/0x1b
[  126.702053] ---[ end trace 2f8e5b0a9ec797ef ]---
[  126.756336] qla2xxx [0000:07:00.0]-00af:0: Performing ISP error recovery - ha=ffff880263774000.
[  146.942414] qla2xxx [0000:07:00.0]-8038:0: Cable is unplugged...

So this warning triggered because device_list disable logic is now
holding nacl->device_list_lock w/ spin_lock_irqsave before obtaining
port->sep_alua_lock with only spin_lock_bh..

The original disable logic obtains *deve ahead of dropping the entry
from deve->alua_port_list and then obtains ->device_list_lock to do the
remaining work.  Also, I'm pretty sure this particular warning is being
generated by a demo-mode session in tcm_qla2xxx, and not by explicit
NodeACL MappedLUNs.  The Initiator MappedLUNs are already protected by a
seperate configfs symlink reference back se_lun->lun_group, and the
demo-mode se_node_acl (and associated ->device_list[]) is released
during se_portal_group->tpg_group shutdown.

The following patch drops the extra functional change to disable logic
in commit 547ac4c

Cc: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
amery pushed a commit that referenced this issue Nov 12, 2013
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Nov 12, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Mar 9, 2014
Currently, at module removal, one gets the following warnings:
------------[ cut here ]------------
WARNING: at drivers/clk/clk.c:780 clk_disable+0x18/0x24()
Modules linked in: spi_imx(-) [last unloaded: ev76c560]
CPU: 1 PID: 16337 Comm: rmmod Tainted: G        W    3.10.17-80548-g90191eb-dirty #33
[<80013b4c>] (unwind_backtrace+0x0/0xf8) from [<800115dc>] (show_stack+0x10/0x14)
[<800115dc>] (show_stack+0x10/0x14) from [<800257b8>] (warn_slowpath_common+0x4c/0x68)
[<800257b8>] (warn_slowpath_common+0x4c/0x68) from [<800257f0>] (warn_slowpath_null+0x1c/0x24)
[<800257f0>] (warn_slowpath_null+0x1c/0x24) from [<803f60ec>] (clk_disable+0x18/0x24)
[<803f60ec>] (clk_disable+0x18/0x24) from [<7f02c9cc>] (spi_imx_remove+0x54/0x9c [spi_imx])
[<7f02c9cc>] (spi_imx_remove+0x54/0x9c [spi_imx]) from [<8025868c>] (platform_drv_remove+0x18/0x1c)
[<8025868c>] (platform_drv_remove+0x18/0x1c) from [<80256f60>] (__device_release_driver+0x70/0xcc)
[<80256f60>] (__device_release_driver+0x70/0xcc) from [<80257770>] (driver_detach+0xcc/0xd0)
[<80257770>] (driver_detach+0xcc/0xd0) from [<80256d90>] (bus_remove_driver+0x7c/0xc0)
[<80256d90>] (bus_remove_driver+0x7c/0xc0) from [<80068668>] (SyS_delete_module+0x144/0x1f8)
[<80068668>] (SyS_delete_module+0x144/0x1f8) from [<8000e080>] (ret_fast_syscall+0x0/0x30)
---[ end trace 1f5df9ad54996300 ]---
------------[ cut here ]------------
WARNING: at drivers/clk/clk.c:780 clk_disable+0x18/0x24()
Modules linked in: spi_imx(-) [last unloaded: ev76c560]
CPU: 1 PID: 16337 Comm: rmmod Tainted: G        W    3.10.17-80548-g90191eb-dirty #33
[<80013b4c>] (unwind_backtrace+0x0/0xf8) from [<800115dc>] (show_stack+0x10/0x14)
[<800115dc>] (show_stack+0x10/0x14) from [<800257b8>] (warn_slowpath_common+0x4c/0x68)
[<800257b8>] (warn_slowpath_common+0x4c/0x68) from [<800257f0>] (warn_slowpath_null+0x1c/0x24)
[<800257f0>] (warn_slowpath_null+0x1c/0x24) from [<803f60ec>] (clk_disable+0x18/0x24)
[<803f60ec>] (clk_disable+0x18/0x24) from [<7f02c9e8>] (spi_imx_remove+0x70/0x9c [spi_imx])
[<7f02c9e8>] (spi_imx_remove+0x70/0x9c [spi_imx]) from [<8025868c>] (platform_drv_remove+0x18/0x1c)
[<8025868c>] (platform_drv_remove+0x18/0x1c) from [<80256f60>] (__device_release_driver+0x70/0xcc)
[<80256f60>] (__device_release_driver+0x70/0xcc) from [<80257770>] (driver_detach+0xcc/0xd0)
[<80257770>] (driver_detach+0xcc/0xd0) from [<80256d90>] (bus_remove_driver+0x7c/0xc0)
[<80256d90>] (bus_remove_driver+0x7c/0xc0) from [<80068668>] (SyS_delete_module+0x144/0x1f8)
[<80068668>] (SyS_delete_module+0x144/0x1f8) from [<8000e080>] (ret_fast_syscall+0x0/0x30)
---[ end trace 1f5df9ad54996301 ]---

Since commit 9e556dc, "spi: spi-imx: only
enable the clocks when we start to transfer a message", clocks are always
disabled except when transmitting messages.  There is thus no need to
disable them at module removal.

Fixes: 9e556dc (spi: spi-imx: only enable the clocks when we start to transfer a message)
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Acked-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
amery pushed a commit that referenced this issue Apr 3, 2014
In function talitos_probe(), it will jump to err_out when getting an
error in talitos_probe_irq(). Then the uninitialized list head
priv->alg_list will be used in function talitos_remove(). In this case
we would get a call trace like the following. So move up the
initialization of priv->alg_list.

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc0459ff4
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=8 P1020 RDB
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W    3.13.0-08789-g54c0a4b46150 #33
  task: cf050000 ti: cf04c000 task.ti: cf04c000
  NIP: c0459ff4 LR: c0459fd4 CTR: c02f2438
  REGS: cf04dcb0 TRAP: 0300   Tainted: G        W     (3.13.0-08789-g54c0a4b46150)
  MSR: 00029000 <CE,EE,ME>  CR: 82000028  XER: 20000000
  DEAR: 00000000 ESR: 00000000
  GPR00: c045ac28 cf04dd60 cf050000 cf2579c0 00021000 00000000 c02f35b0 0000014e
  GPR08: c07e702c cf104300 c07e702c 0000014e 22000024 00000000 c0002a3c 00000000
  GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c082e4e0 000000df
  GPR24: 00000000 00100100 00200200 cf257a2c cf0efe10 cf2579c0 cf0efe10 00000000
  NIP [c0459ff4] talitos_remove+0x3c/0x1c8
  LR [c0459fd4] talitos_remove+0x1c/0x1c8
  Call Trace:
  [cf04dd60] [c07485d8] __func__.13331+0x1241c8/0x1391c0 (unreliable)
  [cf04dd90] [c045ac28] talitos_probe+0x244/0x998
  [cf04dde0] [c0306a74] platform_drv_probe+0x28/0x68
  [cf04ddf0] [c0304d38] really_probe+0x78/0x250
  [cf04de10] [c030505c] __driver_attach+0xc8/0xcc
  [cf04de30] [c0302e98] bus_for_each_dev+0x6c/0xb8
  [cf04de60] [c03043cc] bus_add_driver+0x168/0x220
  [cf04de80] [c0305798] driver_register+0x88/0x130
  [cf04de90] [c0002458] do_one_initcall+0x14c/0x198
  [cf04df00] [c079f904] kernel_init_freeable+0x138/0x1d4
  [cf04df30] [c0002a50] kernel_init+0x14/0x124
  [cf04df40] [c000ec40] ret_from_kernel_thread+0x5c/0x64

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
amery pushed a commit that referenced this issue Aug 6, 2014
Dave reported following splat, caused by improper use of
IP_INC_STATS_BH() in process context.

BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
 ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
 0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
 ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
Call Trace:
 [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
 [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
 [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
 [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
 [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
 [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
 [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
 [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
 [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
 [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
 [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
 [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
 [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
 [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
 [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
 [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
 [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
 [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
 [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
 [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
 [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
 [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
 [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
 [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2

This is a followup of commits f1d8cba ("inet: fix possible
seqlock deadlocks") and 7f88c6b ("ipv6: fix possible seqlock
deadlock in ip6_finish_output2")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
amery pushed a commit that referenced this issue Aug 24, 2014
[ Upstream commit 757efd3 ]

Dave reported following splat, caused by improper use of
IP_INC_STATS_BH() in process context.

BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
 ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
 0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
 ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
Call Trace:
 [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
 [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
 [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
 [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
 [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
 [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
 [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
 [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
 [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
 [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
 [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
 [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
 [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
 [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
 [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
 [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
 [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
 [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
 [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
 [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
 [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
 [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
 [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
 [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2

This is a followup of commits f1d8cba ("inet: fix possible
seqlock deadlocks") and 7f88c6b ("ipv6: fix possible seqlock
deadlock in ip6_finish_output2")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amery pushed a commit that referenced this issue Oct 13, 2014
I'm getting the spew below when booting with Haswell (Xeon
E5-2699 v3) CPUs and the "Cluster-on-Die" (CoD) feature enabled
in the BIOS.  It seems similar to the issue that some folks from
AMD ran in to on their systems and addressed in this commit:

  161270f ("x86/smp: Fix topology checks on AMD MCM CPUs")

Both these Intel and AMD systems break an assumption which is
being enforced by topology_sane(): a socket may not contain more
than one NUMA node.

AMD special-cased their system by looking for a cpuid flag.  The
Intel mode is dependent on BIOS options and I do not know of a
way which it is enumerated other than the tables being parsed
during the CPU bringup process.  In other words, we have to trust
the ACPI tables <shudder>.

This detects the situation where a NUMA node occurs at a place in
the middle of the "CPU" sched domains.  It replaces the default
topology with one that relies on the NUMA information from the
firmware (SRAT table) for all levels of sched domains above the
hyperthreads.

This also fixes a sysfs bug.  We used to freak out when we saw
the "mc" group cross a node boundary, so we stopped building the
MC group.  MC gets exported as the 'core_siblings_list' in
/sys/devices/system/cpu/cpu*/topology/ and this caused CPUs with
the same 'physical_package_id' to not be listed together in
'core_siblings_list'.  This violates a statement from
Documentation/ABI/testing/sysfs-devices-system-cpu:

	core_siblings: internal kernel map of cpu#'s hardware threads
	within the same physical_package_id.

	core_siblings_list: human-readable list of the logical CPU
	numbers within the same physical_package_id as cpu#.

The sysfs effects here cause an issue with the hwloc tool where
it gets confused and thinks there are more sockets than are
physically present.

Before this patch, there are two packages:

# cd /sys/devices/system/cpu/
# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1

But 4 _sets_ of core siblings:

# cat cpu*/topology/core_siblings_list | sort | uniq -c
      9 0-8
      9 18-26
      9 27-35
      9 9-17

After this set, there are only 2 sets of core siblings, which
is what we expect for a 2-socket system.

# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1
# cat cpu*/topology/core_siblings_list | sort | uniq -c
     18 0-17
     18 18-35

Example spew:
...
	NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
	 #2  #3  #4  #5  #6  #7  #8
	.... node  #1, CPUs:    #9
	------------[ cut here ]------------
	WARNING: CPU: 9 PID: 0 at /home/ak/hle/linux-hle-2.6/arch/x86/kernel/smpboot.c:306 topology_sane.isra.2+0x74/0x90()
	sched: CPU #9's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
	Modules linked in:
	CPU: 9 PID: 0 Comm: swapper/9 Not tainted 3.17.0-rc1-00293-g8e01c4d-dirty torvalds#631
	Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
	0000000000000009 ffff88046ddabe00 ffffffff8172e485 ffff88046ddabe48
	ffff88046ddabe38 ffffffff8109691d 000000000000b001 0000000000000009
	ffff88086fc12580 000000000000b020 0000000000000009 ffff88046ddabe98
	Call Trace:
	[<ffffffff8172e485>] dump_stack+0x45/0x56
	[<ffffffff8109691d>] warn_slowpath_common+0x7d/0xa0
	[<ffffffff8109698c>] warn_slowpath_fmt+0x4c/0x50
	[<ffffffff81074f94>] topology_sane.isra.2+0x74/0x90
	[<ffffffff8107530e>] set_cpu_sibling_map+0x31e/0x4f0
	[<ffffffff8107568d>] start_secondary+0x1ad/0x240
	---[ end trace 3fe5f587a9fcde61 ]---
	#10 #11 #12 #13 #14 #15 #16 #17
	.... node  #2, CPUs:   #18 #19 #20 #21 #22 #23 #24 #25 #26
	.... node  #3, CPUs:   #27 #28 #29 #30 #31 #32 #33 #34 #35

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
[ Added LLC domain and s/match_mc/match_die/ ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: brice.goglin@gmail.com
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20140918193334.C065EBCE@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
vkomenda pushed a commit to vkomenda/linux-sunxi that referenced this issue Oct 15, 2014
amery pushed a commit that referenced this issue Oct 29, 2014
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
vkomenda pushed a commit to vkomenda/linux-sunxi that referenced this issue Jan 23, 2015
vkomenda pushed a commit to vkomenda/linux-sunxi that referenced this issue Jan 26, 2015
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Jul 28, 2015
While proving lock, the following warning happens
and it is fixed after initializing lock in the setup
function

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.27-02861-g39df285-dirty linux-sunxi#33
[<c00154ac>] (unwind_backtrace+0x0/0xf4) from [<c0011b50>] (show_stack+0x10/0x14)
[<c0011b50>] (show_stack+0x10/0x14) from [<c00689ac>] (__lock_acquire+0x900/0xb14)
[<c00689ac>] (__lock_acquire+0x900/0xb14) from [<c0069394>] (lock_acquire+0x68/0x7c)
[<c0069394>] (lock_acquire+0x68/0x7c) from [<c04958f8>] (_raw_spin_lock_irqsave+0x48/0x5c)
[<c04958f8>] (_raw_spin_lock_irqsave+0x48/0x5c) from [<c0381e6c>] (clk_gate_endisable+0x28/0x88)
[<c0381e6c>] (clk_gate_endisable+0x28/0x88) from [<c0381ee0>] (clk_gate_enable+0xc/0x14)
[<c0381ee0>] (clk_gate_enable+0xc/0x14) from [<c0386c68>] (flexgen_enable+0x28/0x40)
[<c0386c68>] (flexgen_enable+0x28/0x40) from [<c037f260>] (__clk_enable+0x5c/0x9c)
[<c037f260>] (__clk_enable+0x5c/0x9c) from [<c037f558>] (clk_enable+0x18/0x2c)
[<c037f558>] (clk_enable+0x18/0x2c) from [<c064a1dc>] (st_lpc_of_register+0xc0/0x248)
[<c064a1dc>] (st_lpc_of_register+0xc0/0x248) from [<c0649e44>] (clocksource_of_init+0x34/0x58)
[<c0649e44>] (clocksource_of_init+0x34/0x58) from [<c0637ddc>] (sti_timer_init+0x10/0x18)
[<c0637ddc>] (sti_timer_init+0x10/0x18) from [<c06343f8>] (time_init+0x20/0x30)
[<c06343f8>] (time_init+0x20/0x30) from [<c0632984>] (start_kernel+0x20c/0x2e8)
[<c0632984>] (start_kernel+0x20c/0x2e8) from [<40008074>] (0x40008074)

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Fixes: b116517 ("clk: st: STiH407: Support for Flexgen Clocks")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
hramrach pushed a commit to hramrach/linux-sunxi that referenced this issue Aug 11, 2015
Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  linux-sunxi#1 schedule at ffffffff815ab76e
  linux-sunxi#2 schedule_timeout at ffffffff815ae5e5
  linux-sunxi#3 io_schedule_timeout at ffffffff815aad6a
  linux-sunxi#4 bit_wait_io at ffffffff815abfc6
  linux-sunxi#5 __wait_on_bit at ffffffff815abda5
  linux-sunxi#6 wait_on_page_bit at ffffffff8111fd4f
  linux-sunxi#7 shrink_page_list at ffffffff81135445
  linux-sunxi#8 shrink_inactive_list at ffffffff81135845
  linux-sunxi#9 shrink_lruvec at ffffffff81135ead
 linux-sunxi#10 shrink_zone at ffffffff811360c3
 linux-sunxi#11 shrink_zones at ffffffff81136eff
 linux-sunxi#12 do_try_to_free_pages at ffffffff8113712f
 linux-sunxi#13 try_to_free_mem_cgroup_pages at ffffffff811372be
 linux-sunxi#14 try_charge at ffffffff81189423
 linux-sunxi#15 mem_cgroup_try_charge at ffffffff8118c6f5
 linux-sunxi#16 __add_to_page_cache_locked at ffffffff8112137d
 linux-sunxi#17 add_to_page_cache_lru at ffffffff81121618
 linux-sunxi#18 pagecache_get_page at ffffffff8112170b
 linux-sunxi#19 grow_dev_page at ffffffff811c8297
 linux-sunxi#20 __getblk_slow at ffffffff811c91d6
 linux-sunxi#21 __getblk_gfp at ffffffff811c92c1
 linux-sunxi#22 ext4_ext_grow_indepth at ffffffff8124565c
 linux-sunxi#23 ext4_ext_create_new_leaf at ffffffff81246ca8
 linux-sunxi#24 ext4_ext_insert_extent at ffffffff81246f09
 linux-sunxi#25 ext4_ext_map_blocks at ffffffff8124a848
 linux-sunxi#26 ext4_map_blocks at ffffffff8121a5b7
 linux-sunxi#27 mpage_map_one_extent at ffffffff8121b1fa
 linux-sunxi#28 mpage_map_and_submit_extent at ffffffff8121f07b
 linux-sunxi#29 ext4_writepages at ffffffff8121f6d5
 linux-sunxi#30 do_writepages at ffffffff8112c490
 linux-sunxi#31 __filemap_fdatawrite_range at ffffffff81120199
 linux-sunxi#32 filemap_flush at ffffffff8112041c
 linux-sunxi#33 ext4_alloc_da_blocks at ffffffff81219da1
 linux-sunxi#34 ext4_rename at ffffffff81229b91
 linux-sunxi#35 ext4_rename2 at ffffffff81229e32
 linux-sunxi#36 vfs_rename at ffffffff811a08a5
 linux-sunxi#37 SYSC_renameat2 at ffffffff811a3ffc
 linux-sunxi#38 sys_renameat2 at ffffffff811a408e
 linux-sunxi#39 sys_rename at ffffffff8119e51e
 linux-sunxi#40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: stable@vger.kernel.org # 3.9+
[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wens pushed a commit that referenced this issue Oct 12, 2015
When freqdomain_cpus attribute is read from an offlined cpu, it will
cause crash. This change prevents calling cpufreq_show_cpus when
policy driver_data is NULL.

Crash info:

[  170.814949] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[  170.814990] IP: [<ffffffff813b2490>] _find_next_bit.part.0+0x10/0x70
[  170.815021] PGD 227d30067 PUD 229e56067 PMD 0
[  170.815043] Oops: 0000 [#2] SMP
[  170.816022] CPU: 3 PID: 3121 Comm: cat Tainted: G      D    OE   4.3.0-rc3+ #33
...
...
[  170.816657] Call Trace:
[  170.816672]  [<ffffffff813b2505>] ? find_next_bit+0x15/0x20
[  170.816696]  [<ffffffff8160e47c>] cpufreq_show_cpus+0x5c/0xd0
[  170.816722]  [<ffffffffa031a409>] show_freqdomain_cpus+0x19/0x20 [acpi_cpufreq]
[  170.816749]  [<ffffffff8160e65b>] show+0x3b/0x60
[  170.816769]  [<ffffffff8129b31c>] sysfs_kf_seq_show+0xbc/0x130
[  170.816793]  [<ffffffff81299be3>] kernfs_seq_show+0x23/0x30
[  170.816816]  [<ffffffff81240f2c>] seq_read+0xec/0x390
[  170.816837]  [<ffffffff8129a64a>] kernfs_fop_read+0x10a/0x160
[  170.816861]  [<ffffffff8121d9b7>] __vfs_read+0x37/0x100
[  170.816883]  [<ffffffff813217c0>] ? security_file_permission+0xa0/0xc0
[  170.816909]  [<ffffffff8121e2e3>] vfs_read+0x83/0x130
[  170.816930]  [<ffffffff8121f035>] SyS_read+0x55/0xc0
...
...
[  170.817185] ---[ end trace bc6eadf82b2b965a ]---

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
amery pushed a commit that referenced this issue Feb 25, 2016
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put.
2. "dm: remove request-based logic from make_request_fn wrapper" necessitates
    calling dm_setup_md_queue or else the request_queue's make_request_fn
    pointer ends being unset.

[    7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[    7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.15-02273-gb057d16-dirty #33
[    7.726559] Hardware name: HiKey Development Board (DT)
[    7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000
[    7.739257] PC is at 0x0
[    7.741787] LR is at generic_make_request+0x8c/0x108
....
[    9.082931] Call trace:
[    9.085372] [<          (null)>]           (null)
[    9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0
[    9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0
[    9.096165] cfg80211: Calling CRDA to update world regulatory domain
[    9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114
[    9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64
[    9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0
[    9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c
[    9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194
[    9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134
[    9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc
[    9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8
[    9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c
[    9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c
[    9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8
[    9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260
[    9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118
[    9.176589] Code: bad PC value
[    9.179807] ---[ end trace 75e1bc52ba364d13 ]---

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0
amery pushed a commit that referenced this issue Feb 26, 2016
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put.
2. "dm: remove request-based logic from make_request_fn wrapper" necessitates
    calling dm_setup_md_queue or else the request_queue's make_request_fn
    pointer ends being unset.

[    7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[    7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.15-02273-gb057d16-dirty #33
[    7.726559] Hardware name: HiKey Development Board (DT)
[    7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000
[    7.739257] PC is at 0x0
[    7.741787] LR is at generic_make_request+0x8c/0x108
....
[    9.082931] Call trace:
[    9.085372] [<          (null)>]           (null)
[    9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0
[    9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0
[    9.096165] cfg80211: Calling CRDA to update world regulatory domain
[    9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114
[    9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64
[    9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0
[    9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c
[    9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194
[    9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134
[    9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc
[    9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8
[    9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c
[    9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c
[    9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8
[    9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260
[    9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118
[    9.176589] Code: bad PC value
[    9.179807] ---[ end trace 75e1bc52ba364d13 ]---

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0
amery pushed a commit that referenced this issue Apr 1, 2017
[ Upstream commit ecf5fc6 ]

Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  #3 io_schedule_timeout at ffffffff815aad6a
  #4 bit_wait_io at ffffffff815abfc6
  #5 __wait_on_bit at ffffffff815abda5
  #6 wait_on_page_bit at ffffffff8111fd4f
  #7 shrink_page_list at ffffffff81135445
  #8 shrink_inactive_list at ffffffff81135845
  #9 shrink_lruvec at ffffffff81135ead
 #10 shrink_zone at ffffffff811360c3
 #11 shrink_zones at ffffffff81136eff
 #12 do_try_to_free_pages at ffffffff8113712f
 #13 try_to_free_mem_cgroup_pages at ffffffff811372be
 #14 try_charge at ffffffff81189423
 #15 mem_cgroup_try_charge at ffffffff8118c6f5
 #16 __add_to_page_cache_locked at ffffffff8112137d
 #17 add_to_page_cache_lru at ffffffff81121618
 #18 pagecache_get_page at ffffffff8112170b
 #19 grow_dev_page at ffffffff811c8297
 #20 __getblk_slow at ffffffff811c91d6
 #21 __getblk_gfp at ffffffff811c92c1
 #22 ext4_ext_grow_indepth at ffffffff8124565c
 #23 ext4_ext_create_new_leaf at ffffffff81246ca8
 #24 ext4_ext_insert_extent at ffffffff81246f09
 #25 ext4_ext_map_blocks at ffffffff8124a848
 #26 ext4_map_blocks at ffffffff8121a5b7
 #27 mpage_map_one_extent at ffffffff8121b1fa
 #28 mpage_map_and_submit_extent at ffffffff8121f07b
 #29 ext4_writepages at ffffffff8121f6d5
 #30 do_writepages at ffffffff8112c490
 #31 __filemap_fdatawrite_range at ffffffff81120199
 #32 filemap_flush at ffffffff8112041c
 #33 ext4_alloc_da_blocks at ffffffff81219da1
 #34 ext4_rename at ffffffff81229b91
 #35 ext4_rename2 at ffffffff81229e32
 #36 vfs_rename at ffffffff811a08a5
 #37 SYSC_renameat2 at ffffffff811a3ffc
 #38 sys_renameat2 at ffffffff811a408e
 #39 sys_rename at ffffffff8119e51e
 #40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: stable@vger.kernel.org # 3.9+
[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
wens pushed a commit that referenced this issue Aug 4, 2017
The commit 304419d ("mmc: core: Allocate per-request data using the
block layer core") refactored mechanism of queue handling caused
mmc_init_request() can be called just after mmc_cleanup_queue() caused null
pointer dereference.

Another commit bbdc74d ("mmc: block: Prevent new req entering queue
after its cleanup") tried to fix the problem. However it actually miss one
corner case.

We could still reproduce the issue mentioned with these steps:
(1) insert a SD card and mount it
(2) hotplug it, so it will leave md->usage still be counted
(3) reboot the system which will sync data and umount the card

[Unable to handle kernel NULL pointer dereference at virtual address
00000000
[user pgtable: 4k pages, 48-bit VAs, pgd = ffff80007bab3000
[[0000000000000000] *pgd=000000007a828003, *pud=0000000078dce003,
*pmd=000000007aab6003, *pte=0000000000000000
[Internal error: Oops: 96000007 [#1] PREEMPT SMP
[Modules linked in:
[CPU: 3 PID: 3507 Comm: umount Tainted: G        W
4.13.0-rc1-next-20170720-00012-g9d9bf45 #33
[Hardware name: Firefly-RK3399 Board (DT)
[task: ffff80007a1de200 task.stack: ffff80007a01c000
[PC is at mmc_init_request+0x14/0xc4
[LR is at alloc_request_size+0x4c/0x74
[pc : [<ffff0000087d7150>] lr : [<ffff000008378fe0>] pstate: 600001c5
[sp : ffff80007a01f8f0

....

[[<ffff0000087d7150>] mmc_init_request+0x14/0xc4
[[<ffff000008378fe0>] alloc_request_size+0x4c/0x74
[[<ffff00000817ac28>] mempool_create_node+0xb8/0x17c
[[<ffff00000837aadc>] blk_init_rl+0x9c/0x120
[[<ffff000008396580>] blkg_alloc+0x110/0x234
[[<ffff000008396ac8>] blkg_create+0x424/0x468
[[<ffff00000839877c>] blkg_lookup_create+0xd8/0x14c
[[<ffff0000083796bc>] generic_make_request_checks+0x368/0x3b0
[[<ffff00000837b050>] generic_make_request+0x1c/0x240

So mmc_blk_put wouldn't calling blk_cleanup_queue which actually the
QUEUE_FLAG_DYING and QUEUE_FLAG_BYPASS should stay. Block core expect
blk_queue_bypass_{start, end} internally to bypass/drain the queue before
actually dying the queue, so it didn't expose API to set the queue bypass.
I think we should set QUEUE_FLAG_BYPASS whenever queue is removed, although
the md->usage is still counted, as no dispatch queue could be found then.

Fixes: 304419d ("mmc: core: Allocate per-request data using the block layer core")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
amery pushed a commit that referenced this issue Nov 1, 2017
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :

WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x41c kernel/panic.c:181
 __warn+0x1c4/0x1e0 kernel/panic.c:544
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
 do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286
RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000
RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4
RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000
R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9
R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc
 __in6_ifa_put include/net/addrconf.h:369 [inline]
 ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
 addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
 addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
 call_netdevice_notifiers net/core/dev.c:1715 [inline]
 __dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
 dev_change_flags+0xf5/0x140 net/core/dev.c:6879
 do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
 rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
 rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
 netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
 netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
 netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
 __sys_sendmsg+0xe5/0x210 net/socket.c:2083
 SYSC_sendmsg net/socket.c:2094 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2090
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7fa9174d3320
RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320
RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016
RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0
R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8

Fixes: f1705ec ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amery pushed a commit that referenced this issue Feb 24, 2018
do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).

Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.

Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().

This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.

Reproducer:

cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent

Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000

Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4

Fixes: 20380bb (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
amery pushed a commit that referenced this issue May 11, 2018
syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 #1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 #32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 #47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 #47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Benc <jbenc@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 21, 2018
[ Upstream commit af50e4b ]

syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 jwrdegoede#1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 linux-sunxi#32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ linux-sunxi#26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Benc <jbenc@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 22, 2018
[ Upstream commit af50e4b ]

syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of
reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by syz-executor0/10189:
 #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
 jwrdegoede#1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 jwrdegoede#6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 jwrdegoede#6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
 linux-sunxi#32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
 linux-sunxi#47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
 linux-sunxi#47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ linux-sunxi#26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
 rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
 rcu_read_lock include/linux/rcupdate.h:632 [inline]
 skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
 skb_gso_segment include/linux/netdevice.h:4025 [inline]
 validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
 qdisc_restart net/sched/sch_generic.c:399 [inline]
 __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
 __dev_xmit_skb net/core/dev.c:3243 [inline]
 __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
 packet_snd net/packet/af_packet.c:2951 [inline]
 packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:639
 __sys_sendto+0x3d7/0x670 net/socket.c:1789
 __do_sys_sendto net/socket.c:1801 [inline]
 __se_sys_sendto net/socket.c:1797 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: c411ed8 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Benc <jbenc@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 30, 2018
[ Upstream commit 9f41631 ]

do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).

Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.

Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().

This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.

Reproducer:

cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent

Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [jwrdegoede#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ linux-sunxi#33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000

Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4

Fixes: 20380bb (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amery pushed a commit that referenced this issue Jul 19, 2018
…ol-handlers'

Vladimir Zapolskiy says:

====================
ravb/sh_eth: fix sleep in atomic by reusing shared ethtool handlers

For ages trivial changes to RAVB and SuperH ethernet links by means of
standard 'ethtool' trigger a 'sleeping function called from invalid
context' bug, to visualize it on r8a7795 ULCB:

  % ethtool -r eth0
  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
  in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool
  INFO: lockdep is turned off.
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>]           (null)
  hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
  softirqs last  enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
  softirqs last disabled at (0): [<0000000000000000>]           (null)
  CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33
  Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT)
  Call trace:
   dump_backtrace+0x0/0x198
   show_stack+0x24/0x30
   dump_stack+0xb8/0xf4
   ___might_sleep+0x1c8/0x1f8
   __might_sleep+0x58/0x90
   __mutex_lock+0x50/0x890
   mutex_lock_nested+0x3c/0x50
   phy_start_aneg_priv+0x38/0x180
   phy_start_aneg+0x24/0x30
   ravb_nway_reset+0x3c/0x68
   dev_ethtool+0x3dc/0x2338
   dev_ioctl+0x19c/0x490
   sock_do_ioctl+0xe0/0x238
   sock_ioctl+0x254/0x460
   do_vfs_ioctl+0xb0/0x918
   ksys_ioctl+0x50/0x80
   sys_ioctl+0x34/0x48
   __sys_trace_return+0x0/0x4

The root cause is that an attempt to modify ECMR and GECMR registers
only when RX/TX function is disabled was too overcomplicated in its
original implementation, also processing of an optional Link Change
interrupt added even more complexity, as a result the implementation
was error prone.

The new locking scheme is confirmed to be correct by dumping driver
specific and generic PHY framework function calls with aid of ftrace
while running more or less advanced tests.

Please note that sh_eth patches from the series were built-tested only.

On purpose I do not add Fixes tags, the reused PHY handlers were added
way later than the fixed problems were firstly found in the drivers.

Changes from v1 to v2:
* the original patches are split to bugfixes and enhancements only,
  both v1 and v2 series are absolutely equal in total, thus I omit
  description of changes in individual patches,
* the latter implies that there should be no strict need for retesting,
  but because formally two series are different, I have to drop the tags
  given by Geert and Andrew, please send your tags again.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
amery pushed a commit that referenced this issue Oct 26, 2018
Fix possible recursive lock warning. Its a false warning as the locks are
part of two differnt HW Queue data structure - cmdq and creq. Debug kernel
is throwing the following warning and stack trace.

[  783.914967] ============================================
[  783.914970] WARNING: possible recursive locking detected
[  783.914973] 4.19.0-rc2+ #33 Not tainted
[  783.914976] --------------------------------------------
[  783.914979] swapper/2/0 is trying to acquire lock:
[  783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.914999]
but task is already holding lock:
[  783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915013]
other info that might help us debug this:
[  783.915016]  Possible unsafe locking scenario:

[  783.915019]        CPU0
[  783.915021]        ----
[  783.915034]   lock(&(&hwq->lock)->rlock);
[  783.915035]   lock(&(&hwq->lock)->rlock);
[  783.915037]
 *** DEADLOCK ***

[  783.915038]  May be due to missing lock nesting notation

[  783.915039] 1 lock held by swapper/2/0:
[  783.915040]  #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915044]
stack backtrace:
[  783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ #33
[  783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[  783.915048] Call Trace:
[  783.915049]  <IRQ>
[  783.915054]  dump_stack+0x90/0xe3
[  783.915058]  __lock_acquire+0x106c/0x1080
[  783.915061]  ? sched_clock+0x5/0x10
[  783.915063]  lock_acquire+0xbd/0x1a0
[  783.915065]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915069]  _raw_spin_lock_irqsave+0x4a/0x90
[  783.915071]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915073]  bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915078]  tasklet_action_common.isra.17+0x197/0x1b0
[  783.915081]  __do_softirq+0xcb/0x3a6
[  783.915084]  irq_exit+0xe9/0x100
[  783.915085]  do_IRQ+0x6a/0x120
[  783.915087]  common_interrupt+0xf/0xf
[  783.915088]  </IRQ>

Use nested notation for the spin_lock to avoid this warning.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
amery pushed a commit that referenced this issue Nov 9, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Nov 13, 2018
[ Upstream commit d455f29 ]

Fix possible recursive lock warning. Its a false warning as the locks are
part of two differnt HW Queue data structure - cmdq and creq. Debug kernel
is throwing the following warning and stack trace.

[  783.914967] ============================================
[  783.914970] WARNING: possible recursive locking detected
[  783.914973] 4.19.0-rc2+ linux-sunxi#33 Not tainted
[  783.914976] --------------------------------------------
[  783.914979] swapper/2/0 is trying to acquire lock:
[  783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.914999]
but task is already holding lock:
[  783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915013]
other info that might help us debug this:
[  783.915016]  Possible unsafe locking scenario:

[  783.915019]        CPU0
[  783.915021]        ----
[  783.915034]   lock(&(&hwq->lock)->rlock);
[  783.915035]   lock(&(&hwq->lock)->rlock);
[  783.915037]
 *** DEADLOCK ***

[  783.915038]  May be due to missing lock nesting notation

[  783.915039] 1 lock held by swapper/2/0:
[  783.915040]  #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915044]
stack backtrace:
[  783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ linux-sunxi#33
[  783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[  783.915048] Call Trace:
[  783.915049]  <IRQ>
[  783.915054]  dump_stack+0x90/0xe3
[  783.915058]  __lock_acquire+0x106c/0x1080
[  783.915061]  ? sched_clock+0x5/0x10
[  783.915063]  lock_acquire+0xbd/0x1a0
[  783.915065]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915069]  _raw_spin_lock_irqsave+0x4a/0x90
[  783.915071]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915073]  bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915078]  tasklet_action_common.isra.17+0x197/0x1b0
[  783.915081]  __do_softirq+0xcb/0x3a6
[  783.915084]  irq_exit+0xe9/0x100
[  783.915085]  do_IRQ+0x6a/0x120
[  783.915087]  common_interrupt+0xf/0xf
[  783.915088]  </IRQ>

Use nested notation for the spin_lock to avoid this warning.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wens pushed a commit that referenced this issue Feb 3, 2019
This patch prevents division by zero htotal.

In a follow-up mail Tina writes:

> > How did you manage to get here with htotal == 0? This needs backtraces (or if
> > this is just about static checkers, a mention of that).
> > -Daniel
>
> In GVT-g, we are trying to enable a virtual display w/o setting timings for a pipe
> (a.k.a htotal=0), then we met the following kernel panic:
>
> [   32.832048] divide error: 0000 [#1] SMP PTI
> [   32.833614] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc4-sriov+ #33
> [   32.834438] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.1-0-g8891697-dirty-20180511_165818-tinazhang-linux-1 04/01/2014
> [   32.835901] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.836004] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.836004] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.836004] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.836004] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.836004] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.836004] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.836004] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.836004] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.836004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.836004] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.836004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.836004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.836004] Call Trace:
> [   32.836004]  intel_mode_from_pipe_config+0x72/0x90
> [   32.836004]  intel_modeset_setup_hw_state+0x569/0xf90
> [   32.836004]  intel_modeset_init+0x905/0x1db0
> [   32.836004]  i915_driver_load+0xb8c/0x1120
> [   32.836004]  i915_pci_probe+0x4d/0xb0
> [   32.836004]  local_pci_probe+0x44/0xa0
> [   32.836004]  ? pci_assign_irq+0x27/0x130
> [   32.836004]  pci_device_probe+0x102/0x1c0
> [   32.836004]  driver_probe_device+0x2b8/0x480
> [   32.836004]  __driver_attach+0x109/0x110
> [   32.836004]  ? driver_probe_device+0x480/0x480
> [   32.836004]  bus_for_each_dev+0x67/0xc0
> [   32.836004]  ? klist_add_tail+0x3b/0x70
> [   32.836004]  bus_add_driver+0x1e8/0x260
> [   32.836004]  driver_register+0x5b/0xe0
> [   32.836004]  ? mipi_dsi_bus_init+0x11/0x11
> [   32.836004]  do_one_initcall+0x4d/0x1eb
> [   32.836004]  kernel_init_freeable+0x197/0x237
> [   32.836004]  ? rest_init+0xd0/0xd0
> [   32.836004]  kernel_init+0xa/0x110
> [   32.836004]  ret_from_fork+0x35/0x40
> [   32.836004] Modules linked in:
> [   32.859183] ---[ end trace 525608b0ed0e8665 ]---
> [   32.859722] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.860287] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.862680] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.863309] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.864182] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.865206] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.866359] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.867213] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.868075] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.868983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.869659] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.870599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.871598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.872549] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Since drm_mode_hsync() has the logic to check mode->htotal, I just extend it to cover the case htotal==0.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
[danvet: Add additional explanations + cc: stable.]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1548228539-3061-1-git-send-email-tina.zhang@intel.com
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Feb 15, 2019
commit a2fcd5c upstream.

This patch prevents division by zero htotal.

In a follow-up mail Tina writes:

> > How did you manage to get here with htotal == 0? This needs backtraces (or if
> > this is just about static checkers, a mention of that).
> > -Daniel
>
> In GVT-g, we are trying to enable a virtual display w/o setting timings for a pipe
> (a.k.a htotal=0), then we met the following kernel panic:
>
> [   32.832048] divide error: 0000 [jwrdegoede#1] SMP PTI
> [   32.833614] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc4-sriov+ linux-sunxi#33
> [   32.834438] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.1-0-g8891697-dirty-20180511_165818-tinazhang-linux-1 04/01/2014
> [   32.835901] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.836004] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.836004] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.836004] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.836004] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.836004] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.836004] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.836004] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.836004] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.836004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.836004] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.836004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.836004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.836004] Call Trace:
> [   32.836004]  intel_mode_from_pipe_config+0x72/0x90
> [   32.836004]  intel_modeset_setup_hw_state+0x569/0xf90
> [   32.836004]  intel_modeset_init+0x905/0x1db0
> [   32.836004]  i915_driver_load+0xb8c/0x1120
> [   32.836004]  i915_pci_probe+0x4d/0xb0
> [   32.836004]  local_pci_probe+0x44/0xa0
> [   32.836004]  ? pci_assign_irq+0x27/0x130
> [   32.836004]  pci_device_probe+0x102/0x1c0
> [   32.836004]  driver_probe_device+0x2b8/0x480
> [   32.836004]  __driver_attach+0x109/0x110
> [   32.836004]  ? driver_probe_device+0x480/0x480
> [   32.836004]  bus_for_each_dev+0x67/0xc0
> [   32.836004]  ? klist_add_tail+0x3b/0x70
> [   32.836004]  bus_add_driver+0x1e8/0x260
> [   32.836004]  driver_register+0x5b/0xe0
> [   32.836004]  ? mipi_dsi_bus_init+0x11/0x11
> [   32.836004]  do_one_initcall+0x4d/0x1eb
> [   32.836004]  kernel_init_freeable+0x197/0x237
> [   32.836004]  ? rest_init+0xd0/0xd0
> [   32.836004]  kernel_init+0xa/0x110
> [   32.836004]  ret_from_fork+0x35/0x40
> [   32.836004] Modules linked in:
> [   32.859183] ---[ end trace 525608b0ed0e8665 ]---
> [   32.859722] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.860287] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.862680] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.863309] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.864182] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.865206] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.866359] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.867213] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.868075] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.868983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.869659] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.870599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.871598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.872549] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Since drm_mode_hsync() has the logic to check mode->htotal, I just extend it to cover the case htotal==0.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
[danvet: Add additional explanations + cc: stable.]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1548228539-3061-1-git-send-email-tina.zhang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Feb 23, 2019
commit a2fcd5c upstream.

This patch prevents division by zero htotal.

In a follow-up mail Tina writes:

> > How did you manage to get here with htotal == 0? This needs backtraces (or if
> > this is just about static checkers, a mention of that).
> > -Daniel
>
> In GVT-g, we are trying to enable a virtual display w/o setting timings for a pipe
> (a.k.a htotal=0), then we met the following kernel panic:
>
> [   32.832048] divide error: 0000 [jwrdegoede#1] SMP PTI
> [   32.833614] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc4-sriov+ linux-sunxi#33
> [   32.834438] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.1-0-g8891697-dirty-20180511_165818-tinazhang-linux-1 04/01/2014
> [   32.835901] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.836004] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.836004] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.836004] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.836004] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.836004] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.836004] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.836004] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.836004] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.836004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.836004] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.836004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.836004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.836004] Call Trace:
> [   32.836004]  intel_mode_from_pipe_config+0x72/0x90
> [   32.836004]  intel_modeset_setup_hw_state+0x569/0xf90
> [   32.836004]  intel_modeset_init+0x905/0x1db0
> [   32.836004]  i915_driver_load+0xb8c/0x1120
> [   32.836004]  i915_pci_probe+0x4d/0xb0
> [   32.836004]  local_pci_probe+0x44/0xa0
> [   32.836004]  ? pci_assign_irq+0x27/0x130
> [   32.836004]  pci_device_probe+0x102/0x1c0
> [   32.836004]  driver_probe_device+0x2b8/0x480
> [   32.836004]  __driver_attach+0x109/0x110
> [   32.836004]  ? driver_probe_device+0x480/0x480
> [   32.836004]  bus_for_each_dev+0x67/0xc0
> [   32.836004]  ? klist_add_tail+0x3b/0x70
> [   32.836004]  bus_add_driver+0x1e8/0x260
> [   32.836004]  driver_register+0x5b/0xe0
> [   32.836004]  ? mipi_dsi_bus_init+0x11/0x11
> [   32.836004]  do_one_initcall+0x4d/0x1eb
> [   32.836004]  kernel_init_freeable+0x197/0x237
> [   32.836004]  ? rest_init+0xd0/0xd0
> [   32.836004]  kernel_init+0xa/0x110
> [   32.836004]  ret_from_fork+0x35/0x40
> [   32.836004] Modules linked in:
> [   32.859183] ---[ end trace 525608b0ed0e8665 ]---
> [   32.859722] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.860287] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.862680] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.863309] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.864182] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.865206] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.866359] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.867213] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.868075] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.868983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.869659] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.870599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.871598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.872549] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Since drm_mode_hsync() has the logic to check mode->htotal, I just extend it to cover the case htotal==0.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
[danvet: Add additional explanations + cc: stable.]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1548228539-3061-1-git-send-email-tina.zhang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Feb 23, 2019
commit a2fcd5c upstream.

This patch prevents division by zero htotal.

In a follow-up mail Tina writes:

> > How did you manage to get here with htotal == 0? This needs backtraces (or if
> > this is just about static checkers, a mention of that).
> > -Daniel
>
> In GVT-g, we are trying to enable a virtual display w/o setting timings for a pipe
> (a.k.a htotal=0), then we met the following kernel panic:
>
> [   32.832048] divide error: 0000 [jwrdegoede#1] SMP PTI
> [   32.833614] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc4-sriov+ linux-sunxi#33
> [   32.834438] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.10.1-0-g8891697-dirty-20180511_165818-tinazhang-linux-1 04/01/2014
> [   32.835901] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.836004] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.836004] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.836004] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.836004] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.836004] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.836004] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.836004] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.836004] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.836004] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.836004] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.836004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.836004] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.836004] Call Trace:
> [   32.836004]  intel_mode_from_pipe_config+0x72/0x90
> [   32.836004]  intel_modeset_setup_hw_state+0x569/0xf90
> [   32.836004]  intel_modeset_init+0x905/0x1db0
> [   32.836004]  i915_driver_load+0xb8c/0x1120
> [   32.836004]  i915_pci_probe+0x4d/0xb0
> [   32.836004]  local_pci_probe+0x44/0xa0
> [   32.836004]  ? pci_assign_irq+0x27/0x130
> [   32.836004]  pci_device_probe+0x102/0x1c0
> [   32.836004]  driver_probe_device+0x2b8/0x480
> [   32.836004]  __driver_attach+0x109/0x110
> [   32.836004]  ? driver_probe_device+0x480/0x480
> [   32.836004]  bus_for_each_dev+0x67/0xc0
> [   32.836004]  ? klist_add_tail+0x3b/0x70
> [   32.836004]  bus_add_driver+0x1e8/0x260
> [   32.836004]  driver_register+0x5b/0xe0
> [   32.836004]  ? mipi_dsi_bus_init+0x11/0x11
> [   32.836004]  do_one_initcall+0x4d/0x1eb
> [   32.836004]  kernel_init_freeable+0x197/0x237
> [   32.836004]  ? rest_init+0xd0/0xd0
> [   32.836004]  kernel_init+0xa/0x110
> [   32.836004]  ret_from_fork+0x35/0x40
> [   32.836004] Modules linked in:
> [   32.859183] ---[ end trace 525608b0ed0e8665 ]---
> [   32.859722] RIP: 0010:drm_mode_hsync+0x1e/0x40
> [   32.860287] Code: 31 c0 c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 8b 87 d8 00 00 00 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9 b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 f3 c3 66
> [   32.862680] RSP: 0000:ffffc900000ebb90 EFLAGS: 00010206
> [   32.863309] RAX: 0000000000000000 RBX: ffff88001c67c8a0 RCX: 0000000000000000
> [   32.864182] RDX: 0000000000000000 RSI: ffff88001c67c000 RDI: ffff88001c67c8a0
> [   32.865206] RBP: ffff88001c7d03a0 R08: ffff88001c67c8a0 R09: ffff88001c7d0330
> [   32.866359] R10: ffffffff822c3a98 R11: 0000000000000001 R12: ffff88001c67c000
> [   32.867213] R13: ffff88001c7d0370 R14: ffffffff8207eb78 R15: ffff88001c67c800
> [   32.868075] FS:  0000000000000000(0000) GS:ffff88001da00000(0000) knlGS:0000000000000000
> [   32.868983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   32.869659] CR2: 0000000000000000 CR3: 000000000220a000 CR4: 00000000000006f0
> [   32.870599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   32.871598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   32.872549] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> Since drm_mode_hsync() has the logic to check mode->htotal, I just extend it to cover the case htotal==0.

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
[danvet: Add additional explanations + cc: stable.]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1548228539-3061-1-git-send-email-tina.zhang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 16, 2019
[ Upstream commit 68be930 ]

BUG: unable to handle kernel paging request at ffffffffa01c5430
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
Oops: 0000 [jwrdegoede#1
CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ linux-sunxi#33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:raw_notifier_chain_register+0x16/0x40
Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
Call Trace:
 register_netdevice_notifier+0x43/0x250
 ? 0xffffffffa01e0000
 dsa_slave_register_notifier+0x13/0x70 [dsa_core
 ? 0xffffffffa01e0000
 dsa_init_module+0x2e/0x1000 [dsa_core
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cleanup allocated resourses if there are errors,
otherwise it will trgger memleak.

Fixes: c9eb3e0 ("net: dsa: Add support for learning FDB through notification")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 22, 2019
[ Upstream commit 68be930 ]

BUG: unable to handle kernel paging request at ffffffffa01c5430
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
Oops: 0000 [jwrdegoede#1
CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ linux-sunxi#33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:raw_notifier_chain_register+0x16/0x40
Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
Call Trace:
 register_netdevice_notifier+0x43/0x250
 ? 0xffffffffa01e0000
 dsa_slave_register_notifier+0x13/0x70 [dsa_core
 ? 0xffffffffa01e0000
 dsa_init_module+0x2e/0x1000 [dsa_core
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cleanup allocated resourses if there are errors,
otherwise it will trgger memleak.

Fixes: c9eb3e0 ("net: dsa: Add support for learning FDB through notification")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 31, 2019
[ Upstream commit a314777 ]

BUG: unable to handle kernel paging request at ffffffffa016a270
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0
Oops: 0000 [jwrdegoede#1
CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ linux-sunxi#33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:atomic_notifier_chain_register+0x24/0x60
Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086
RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278
RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240
RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260
R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78
FS:  00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0
Call Trace:
 register_inet6addr_notifier+0x13/0x20
 cxgb4_init_module+0x6c/0x1000 [cxgb4
 ? 0xffffffffa01d7000
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If pci_register_driver fails, register inet6addr_notifier is
pointless. This patch fix the error path in cxgb4_init_module.

Fixes: b5a02f5 ("cxgb4 : Update ipv6 address handling api")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue May 31, 2019
[ Upstream commit a314777 ]

BUG: unable to handle kernel paging request at ffffffffa016a270
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0
Oops: 0000 [jwrdegoede#1
CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ linux-sunxi#33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:atomic_notifier_chain_register+0x24/0x60
Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086
RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278
RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240
RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260
R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78
FS:  00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0
Call Trace:
 register_inet6addr_notifier+0x13/0x20
 cxgb4_init_module+0x6c/0x1000 [cxgb4
 ? 0xffffffffa01d7000
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If pci_register_driver fails, register inet6addr_notifier is
pointless. This patch fix the error path in cxgb4_init_module.

Fixes: b5a02f5 ("cxgb4 : Update ipv6 address handling api")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jun 1, 2019
[ Upstream commit a314777 ]

BUG: unable to handle kernel paging request at ffffffffa016a270
PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0
Oops: 0000 [jwrdegoede#1
CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ linux-sunxi#33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:atomic_notifier_chain_register+0x24/0x60
Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086
RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278
RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240
RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260
R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78
FS:  00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0
Call Trace:
 register_inet6addr_notifier+0x13/0x20
 cxgb4_init_module+0x6c/0x1000 [cxgb4
 ? 0xffffffffa01d7000
 do_one_initcall+0x6c/0x3cc
 ? do_init_module+0x22/0x1f1
 ? rcu_read_lock_sched_held+0x97/0xb0
 ? kmem_cache_alloc_trace+0x325/0x3b0
 do_init_module+0x5b/0x1f1
 load_module+0x1db1/0x2690
 ? m_show+0x1d0/0x1d0
 __do_sys_finit_module+0xc5/0xd0
 __x64_sys_finit_module+0x15/0x20
 do_syscall_64+0x6b/0x1d0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If pci_register_driver fails, register inet6addr_notifier is
pointless. This patch fix the error path in cxgb4_init_module.

Fixes: b5a02f5 ("cxgb4 : Update ipv6 address handling api")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Jul 13, 2020
The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart. This can result in OOM
killer activity even with free swap when the memory is fragmented
enough:

kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0
kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic linux-sunxi#33-Ubuntu
kernel: Hardware name: IBM 8562 T02 Z06 (LPAR)
kernel: Call Trace:
kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0)
kernel:  [<00000001f8d3437a>] dump_stack+0x8a/0xc0
kernel:  [<00000001f8687032>] dump_header+0x62/0x258
kernel:  [<00000001f8686122>] oom_kill_process+0x172/0x180
kernel:  [<00000001f8686abe>] out_of_memory+0xee/0x580
kernel:  [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90
kernel:  [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320
kernel:  [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0
kernel:  [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0
kernel:  [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0
kernel:  [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0
kernel:  [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760
kernel:  [<00000001f875df66>] do_vfs_ioctl+0x376/0x690
kernel:  [<00000001f875e304>] ksys_ioctl+0x84/0xb0
kernel:  [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40
kernel:  [<00000001f8d55424>] system_call+0xd8/0x2c8

As far as I can tell s390x does not use the iopins as we bail our for
anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is
only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to
reduce the memory footprint.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jul 16, 2020
[ Upstream commit 7749112 ]

The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart. This can result in OOM
killer activity even with free swap when the memory is fragmented
enough:

kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0
kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic linux-sunxi#33-Ubuntu
kernel: Hardware name: IBM 8562 T02 Z06 (LPAR)
kernel: Call Trace:
kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0)
kernel:  [<00000001f8d3437a>] dump_stack+0x8a/0xc0
kernel:  [<00000001f8687032>] dump_header+0x62/0x258
kernel:  [<00000001f8686122>] oom_kill_process+0x172/0x180
kernel:  [<00000001f8686abe>] out_of_memory+0xee/0x580
kernel:  [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90
kernel:  [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320
kernel:  [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0
kernel:  [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0
kernel:  [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0
kernel:  [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0
kernel:  [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760
kernel:  [<00000001f875df66>] do_vfs_ioctl+0x376/0x690
kernel:  [<00000001f875e304>] ksys_ioctl+0x84/0xb0
kernel:  [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40
kernel:  [<00000001f8d55424>] system_call+0xd8/0x2c8

As far as I can tell s390x does not use the iopins as we bail our for
anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is
only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to
reduce the memory footprint.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Sep 8, 2020
When multiple adapters are present in the system, pci hot-removing second
adapter leads to the following warning as both the adapters registered
thermal zone device with same thermal zone name/type.
Therefore, use unique thermal zone name during thermal zone device
initialization. Also mark thermal zone dev NULL once unregistered.

[  414.370143] ------------[ cut here ]------------
[  414.370944] sysfs group 'power' not found for kobject 'hwmon0'
[  414.371747] WARNING: CPU: 9 PID: 2661 at fs/sysfs/group.c:281
 sysfs_remove_group+0x76/0x80
[  414.382550] CPU: 9 PID: 2661 Comm: bash Not tainted 5.8.0-rc6+ linux-sunxi#33
[  414.383593] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0a 06/23/2016
[  414.384669] RIP: 0010:sysfs_remove_group+0x76/0x80
[  414.385738] Code: 48 89 df 5b 5d 41 5c e9 d8 b5 ff ff 48 89 df e8 60 b0 ff ff
 eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 90 ae 13 bb e8 6a 27 d0 ff <0f> 0b 5b 5d
 41 5c c3 0f 1f 00 0f 1f 44 00 00 48 85 f6 74 31 41 54
[  414.388404] RSP: 0018:ffffa22bc080fcb0 EFLAGS: 00010286
[  414.389638] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  414.390829] RDX: 0000000000000001 RSI: ffff8ee2de3e9510 RDI: ffff8ee2de3e9510
[  414.392064] RBP: ffffffffbaef2ee0 R08: 0000000000000000 R09: 0000000000000000
[  414.393224] R10: 0000000000000000 R11: 000000002b30006c R12: ffff8ee260720008
[  414.394388] R13: ffff8ee25e0a40e8 R14: ffffa22bc080ff08 R15: ffff8ee2c3be5020
[  414.395661] FS:  00007fd2a7171740(0000) GS:ffff8ee2de200000(0000)
 knlGS:0000000000000000
[  414.396825] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  414.398011] CR2: 00007f178ffe5020 CR3: 000000084c5cc003 CR4: 00000000003606e0
[  414.399172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  414.400352] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  414.401473] Call Trace:
[  414.402685]  device_del+0x89/0x400
[  414.403819]  device_unregister+0x16/0x60
[  414.405024]  hwmon_device_unregister+0x44/0xa0
[  414.406112]  thermal_remove_hwmon_sysfs+0x196/0x200
[  414.407256]  thermal_zone_device_unregister+0x1b5/0x1f0
[  414.408415]  cxgb4_thermal_remove+0x3c/0x4f [cxgb4]
[  414.409668]  remove_one+0x212/0x290 [cxgb4]
[  414.410875]  pci_device_remove+0x36/0xb0
[  414.412004]  device_release_driver_internal+0xe2/0x1c0
[  414.413276]  pci_stop_bus_device+0x64/0x90
[  414.414433]  pci_stop_and_remove_bus_device_locked+0x16/0x30
[  414.415609]  remove_store+0x75/0x90
[  414.416790]  kernfs_fop_write+0x114/0x1b0
[  414.417930]  vfs_write+0xcf/0x210
[  414.419059]  ksys_write+0xa7/0xe0
[  414.420120]  do_syscall_64+0x4c/0xa0
[  414.421278]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  414.422335] RIP: 0033:0x7fd2a686afd0
[  414.423396] Code: Bad RIP value.
[  414.424549] RSP: 002b:00007fffc1446148 EFLAGS: 00000246 ORIG_RAX:
 0000000000000001
[  414.425638] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd2a686afd0
[  414.426830] RDX: 0000000000000002 RSI: 00007fd2a7196000 RDI: 0000000000000001
[  414.427927] RBP: 00007fd2a7196000 R08: 000000000000000a R09: 00007fd2a7171740
[  414.428923] R10: 00007fd2a7171740 R11: 0000000000000246 R12: 00007fd2a6b43400
[  414.430082] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
[  414.431027] irq event stamp: 76300
[  414.435678] ---[ end trace 13865acb4d5ab00f ]---

Fixes: b187191 ("cxgb4: Add thermal zone support")
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Apr 18, 2022
The kmemleak_*_phys() apis do not check the address for lowmem's min
boundary, while the caller may pass an address below lowmem, which will
trigger an oops:

  # echo scan > /sys/kernel/debug/kmemleak
  Unable to handle kernel paging request at virtual address ff5fffffffe00000
  Oops [#1]
  Modules linked in:
  CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 linux-sunxi#33
  Hardware name: riscv-virtio,qemu (DT)
  epc : scan_block+0x74/0x15c
   ra : scan_block+0x72/0x15c
  epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
   gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
   t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
   s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
   a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
   a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
   s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
   s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
   s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
   s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
   t5 : 0000000000000001 t6 : 0000000000000000
  status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
    scan_gray_list+0x12e/0x1a6
    kmemleak_scan+0x2aa/0x57e
    kmemleak_write+0x32a/0x40c
    full_proxy_write+0x56/0x82
    vfs_write+0xa6/0x2a6
    ksys_write+0x6c/0xe2
    sys_write+0x22/0x2a
    ret_from_syscall+0x0/0x2

The callers may not quite know the actual address they pass(e.g. from
devicetree).  So the kmemleak_*_phys() apis should guarantee the address
they finally use is in lowmem range, so check the address for lowmem's
min boundary.

Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Apr 20, 2022
commit 23c2d49 upstream.

The kmemleak_*_phys() apis do not check the address for lowmem's min
boundary, while the caller may pass an address below lowmem, which will
trigger an oops:

  # echo scan > /sys/kernel/debug/kmemleak
  Unable to handle kernel paging request at virtual address ff5fffffffe00000
  Oops [jwrdegoede#1]
  Modules linked in:
  CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 linux-sunxi#33
  Hardware name: riscv-virtio,qemu (DT)
  epc : scan_block+0x74/0x15c
   ra : scan_block+0x72/0x15c
  epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
   gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
   t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
   s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
   a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
   a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
   s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
   s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
   s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
   s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
   t5 : 0000000000000001 t6 : 0000000000000000
  status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
    scan_gray_list+0x12e/0x1a6
    kmemleak_scan+0x2aa/0x57e
    kmemleak_write+0x32a/0x40c
    full_proxy_write+0x56/0x82
    vfs_write+0xa6/0x2a6
    ksys_write+0x6c/0xe2
    sys_write+0x22/0x2a
    ret_from_syscall+0x0/0x2

The callers may not quite know the actual address they pass(e.g. from
devicetree).  So the kmemleak_*_phys() apis should guarantee the address
they finally use is in lowmem range, so check the address for lowmem's
min boundary.

Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Apr 25, 2022
commit 23c2d49 upstream.

The kmemleak_*_phys() apis do not check the address for lowmem's min
boundary, while the caller may pass an address below lowmem, which will
trigger an oops:

  # echo scan > /sys/kernel/debug/kmemleak
  Unable to handle kernel paging request at virtual address ff5fffffffe00000
  Oops [jwrdegoede#1]
  Modules linked in:
  CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 linux-sunxi#33
  Hardware name: riscv-virtio,qemu (DT)
  epc : scan_block+0x74/0x15c
   ra : scan_block+0x72/0x15c
  epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
   gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
   t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
   s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
   a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
   a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
   s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
   s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
   s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
   s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
   t5 : 0000000000000001 t6 : 0000000000000000
  status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
    scan_gray_list+0x12e/0x1a6
    kmemleak_scan+0x2aa/0x57e
    kmemleak_write+0x32a/0x40c
    full_proxy_write+0x56/0x82
    vfs_write+0xa6/0x2a6
    ksys_write+0x6c/0xe2
    sys_write+0x22/0x2a
    ret_from_syscall+0x0/0x2

The callers may not quite know the actual address they pass(e.g. from
devicetree).  So the kmemleak_*_phys() apis should guarantee the address
they finally use is in lowmem range, so check the address for lowmem's
min boundary.

Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jwrdegoede added a commit to jwrdegoede/linux-sunxi that referenced this issue May 12, 2022
Lockdep complains about the smu->message_lock mutex being used before
it is initialized through the following call path:

amdgpu_device_init()
 amdgpu_dpm_mode2_reset()
  smu_mode2_reset()
   smu_v12_0_mode2_reset()
    smu_cmn_send_smc_msg_with_param()

Move the mutex_init() call to smu_early_init() to fix the mutex being
used before it is initialized.

This fixes the following lockdep splat:

[    3.867331] ------------[ cut here ]------------
[    3.867335] fbcon: Taking over console
[    3.867338] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[    3.867340] WARNING: CPU: 14 PID: 491 at kernel/locking/mutex.c:579 __mutex_lock+0x44c/0x830
[    3.867349] Modules linked in: amdgpu(+) crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel hid_lg_g15 iommu_v2 sp5100_tco nvme gpu_sched drm_dp_helper nvme_core ccp wmi video hid_logitech_dj ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse i2c_dev
[    3.867363] CPU: 14 PID: 491 Comm: systemd-udevd Tainted: G          I       5.18.0-rc5+ linux-sunxi#33
[    3.867366] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
[    3.867369] RIP: 0010:__mutex_lock+0x44c/0x830
[    3.867372] Code: ff 85 c0 0f 84 33 fc ff ff 8b 0d b7 50 25 01 85 c9 0f 85 25 fc ff ff 48 c7 c6 fb 41 82 99 48 c7 c7 6b 63 80 99 e8 88 2a f8 ff <0f> 0b e9 0b fc ff ff f6 83 b9 0c 00 00 01 0f 85 64 ff ff ff 4c 89
[    3.867377] RSP: 0018:ffffaef8c0fc79f0 EFLAGS: 00010286
[    3.867380] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000027
[    3.867382] RDX: ffff9ccc0dda0928 RSI: 0000000000000001 RDI: ffff9ccc0dda0920
[    3.867384] RBP: ffffaef8c0fc7a80 R08: 0000000000000000 R09: ffffaef8c0fc7820
[    3.867386] R10: 0000000000000003 R11: ffff9ccc2a2fffe8 R12: 0000000000000002
[    3.867388] R13: ffff9cc990808058 R14: 0000000000000000 R15: ffff9cc98bfc0000
[    3.867390] FS:  00007fc4d830f580(0000) GS:ffff9ccc0dd80000(0000) knlGS:0000000000000000
[    3.867394] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.867396] CR2: 0000560a77031410 CR3: 000000010f522000 CR4: 0000000000750ee0
[    3.867398] PKRU: 55555554
[    3.867399] Call Trace:
[    3.867401]  <TASK>
[    3.867403]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867533]  ? __mutex_lock+0x90/0x830
[    3.867535]  ? amdgpu_dpm_mode2_reset+0x37/0x60 [amdgpu]
[    3.867653]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867758]  smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867857]  smu_mode2_reset+0x2b/0x50 [amdgpu]
[    3.867953]  amdgpu_dpm_mode2_reset+0x46/0x60 [amdgpu]
[    3.868096]  amdgpu_device_init.cold+0x1069/0x1e78 [amdgpu]
[    3.868219]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    3.868222]  ? pci_conf1_read+0x9b/0xf0
[    3.868226]  amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
[    3.868314]  amdgpu_pci_probe+0x1a9/0x3c0 [amdgpu]
[    3.868398]  local_pci_probe+0x41/0x80
[    3.868401]  pci_device_probe+0xab/0x200
[    3.868404]  really_probe+0x1a1/0x370
[    3.868407]  __driver_probe_device+0xfc/0x170
[    3.868410]  driver_probe_device+0x1f/0x90
[    3.868412]  __driver_attach+0xbf/0x1a0
[    3.868414]  ? __device_attach_driver+0xe0/0xe0
[    3.868416]  bus_for_each_dev+0x65/0x90
[    3.868419]  bus_add_driver+0x151/0x1f0
[    3.868421]  driver_register+0x89/0xd0
[    3.868423]  ? 0xffffffffc0bd4000
[    3.868425]  do_one_initcall+0x5d/0x300
[    3.868428]  ? do_init_module+0x22/0x240
[    3.868431]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868434]  ? trace_kmalloc+0x30/0xe0
[    3.868437]  ? kmem_cache_alloc_trace+0x1e6/0x3a0
[    3.868440]  do_init_module+0x4a/0x240
[    3.868442]  __do_sys_finit_module+0x93/0xf0
[    3.868446]  do_syscall_64+0x5b/0x80
[    3.868449]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868451]  ? lockdep_hardirqs_on_prepare+0xd9/0x180
[    3.868454]  ? do_syscall_64+0x67/0x80
[    3.868456]  ? do_syscall_64+0x67/0x80
[    3.868458]  ? do_syscall_64+0x67/0x80
[    3.868460]  ? do_syscall_64+0x67/0x80
[    3.868462]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    3.868465] RIP: 0033:0x7fc4d8ec1ced
[    3.868467] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fb 70 0e 00 f7 d8 64 89 01 48
[    3.868472] RSP: 002b:00007fff687ae6b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.868475] RAX: ffffffffffffffda RBX: 0000560a76fbca60 RCX: 00007fc4d8ec1ced
[    3.868477] RDX: 0000000000000000 RSI: 00007fc4d902343c RDI: 0000000000000011
[    3.868479] RBP: 00007fc4d902343c R08: 0000000000000000 R09: 0000560a76fb59c0
[    3.868481] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000020000
[    3.868484] R13: 0000560a76f8bfd0 R14: 0000000000000000 R15: 0000560a76fc2d10
[    3.868487]  </TASK>
[    3.868489] irq event stamp: 120617
[    3.868490] hardirqs last  enabled at (120617): [<ffffffff9817169e>] __up_console_sem+0x5e/0x70
[    3.868494] hardirqs last disabled at (120616): [<ffffffff98171683>] __up_console_sem+0x43/0x70
[    3.868497] softirqs last  enabled at (119684): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868501] softirqs last disabled at (119679): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868504] ---[ end trace 0000000000000000 ]---

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
jwrdegoede added a commit to jwrdegoede/linux-sunxi that referenced this issue May 13, 2022
Lockdep complains about the smu->message_lock mutex being used before
it is initialized through the following call path:

amdgpu_device_init()
 amdgpu_dpm_mode2_reset()
  smu_mode2_reset()
   smu_v12_0_mode2_reset()
    smu_cmn_send_smc_msg_with_param()

Move the mutex_init() call to smu_early_init() to fix the mutex being
used before it is initialized.

This fixes the following lockdep splat:

[    3.867331] ------------[ cut here ]------------
[    3.867335] fbcon: Taking over console
[    3.867338] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[    3.867340] WARNING: CPU: 14 PID: 491 at kernel/locking/mutex.c:579 __mutex_lock+0x44c/0x830
[    3.867349] Modules linked in: amdgpu(+) crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel hid_lg_g15 iommu_v2 sp5100_tco nvme gpu_sched drm_dp_helper nvme_core ccp wmi video hid_logitech_dj ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse i2c_dev
[    3.867363] CPU: 14 PID: 491 Comm: systemd-udevd Tainted: G          I       5.18.0-rc5+ linux-sunxi#33
[    3.867366] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
[    3.867369] RIP: 0010:__mutex_lock+0x44c/0x830
[    3.867372] Code: ff 85 c0 0f 84 33 fc ff ff 8b 0d b7 50 25 01 85 c9 0f 85 25 fc ff ff 48 c7 c6 fb 41 82 99 48 c7 c7 6b 63 80 99 e8 88 2a f8 ff <0f> 0b e9 0b fc ff ff f6 83 b9 0c 00 00 01 0f 85 64 ff ff ff 4c 89
[    3.867377] RSP: 0018:ffffaef8c0fc79f0 EFLAGS: 00010286
[    3.867380] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000027
[    3.867382] RDX: ffff9ccc0dda0928 RSI: 0000000000000001 RDI: ffff9ccc0dda0920
[    3.867384] RBP: ffffaef8c0fc7a80 R08: 0000000000000000 R09: ffffaef8c0fc7820
[    3.867386] R10: 0000000000000003 R11: ffff9ccc2a2fffe8 R12: 0000000000000002
[    3.867388] R13: ffff9cc990808058 R14: 0000000000000000 R15: ffff9cc98bfc0000
[    3.867390] FS:  00007fc4d830f580(0000) GS:ffff9ccc0dd80000(0000) knlGS:0000000000000000
[    3.867394] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.867396] CR2: 0000560a77031410 CR3: 000000010f522000 CR4: 0000000000750ee0
[    3.867398] PKRU: 55555554
[    3.867399] Call Trace:
[    3.867401]  <TASK>
[    3.867403]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867533]  ? __mutex_lock+0x90/0x830
[    3.867535]  ? amdgpu_dpm_mode2_reset+0x37/0x60 [amdgpu]
[    3.867653]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867758]  smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867857]  smu_mode2_reset+0x2b/0x50 [amdgpu]
[    3.867953]  amdgpu_dpm_mode2_reset+0x46/0x60 [amdgpu]
[    3.868096]  amdgpu_device_init.cold+0x1069/0x1e78 [amdgpu]
[    3.868219]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    3.868222]  ? pci_conf1_read+0x9b/0xf0
[    3.868226]  amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
[    3.868314]  amdgpu_pci_probe+0x1a9/0x3c0 [amdgpu]
[    3.868398]  local_pci_probe+0x41/0x80
[    3.868401]  pci_device_probe+0xab/0x200
[    3.868404]  really_probe+0x1a1/0x370
[    3.868407]  __driver_probe_device+0xfc/0x170
[    3.868410]  driver_probe_device+0x1f/0x90
[    3.868412]  __driver_attach+0xbf/0x1a0
[    3.868414]  ? __device_attach_driver+0xe0/0xe0
[    3.868416]  bus_for_each_dev+0x65/0x90
[    3.868419]  bus_add_driver+0x151/0x1f0
[    3.868421]  driver_register+0x89/0xd0
[    3.868423]  ? 0xffffffffc0bd4000
[    3.868425]  do_one_initcall+0x5d/0x300
[    3.868428]  ? do_init_module+0x22/0x240
[    3.868431]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868434]  ? trace_kmalloc+0x30/0xe0
[    3.868437]  ? kmem_cache_alloc_trace+0x1e6/0x3a0
[    3.868440]  do_init_module+0x4a/0x240
[    3.868442]  __do_sys_finit_module+0x93/0xf0
[    3.868446]  do_syscall_64+0x5b/0x80
[    3.868449]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868451]  ? lockdep_hardirqs_on_prepare+0xd9/0x180
[    3.868454]  ? do_syscall_64+0x67/0x80
[    3.868456]  ? do_syscall_64+0x67/0x80
[    3.868458]  ? do_syscall_64+0x67/0x80
[    3.868460]  ? do_syscall_64+0x67/0x80
[    3.868462]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    3.868465] RIP: 0033:0x7fc4d8ec1ced
[    3.868467] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fb 70 0e 00 f7 d8 64 89 01 48
[    3.868472] RSP: 002b:00007fff687ae6b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.868475] RAX: ffffffffffffffda RBX: 0000560a76fbca60 RCX: 00007fc4d8ec1ced
[    3.868477] RDX: 0000000000000000 RSI: 00007fc4d902343c RDI: 0000000000000011
[    3.868479] RBP: 00007fc4d902343c R08: 0000000000000000 R09: 0000560a76fb59c0
[    3.868481] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000020000
[    3.868484] R13: 0000560a76f8bfd0 R14: 0000000000000000 R15: 0000560a76fc2d10
[    3.868487]  </TASK>
[    3.868489] irq event stamp: 120617
[    3.868490] hardirqs last  enabled at (120617): [<ffffffff9817169e>] __up_console_sem+0x5e/0x70
[    3.868494] hardirqs last disabled at (120616): [<ffffffff98171683>] __up_console_sem+0x43/0x70
[    3.868497] softirqs last  enabled at (119684): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868501] softirqs last disabled at (119679): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868504] ---[ end trace 0000000000000000 ]---

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
jwrdegoede added a commit to jwrdegoede/linux-sunxi that referenced this issue Jun 5, 2022
Lockdep complains about the smu->message_lock mutex being used before
it is initialized through the following call path:

amdgpu_device_init()
 amdgpu_dpm_mode2_reset()
  smu_mode2_reset()
   smu_v12_0_mode2_reset()
    smu_cmn_send_smc_msg_with_param()

Move the mutex_init() call to smu_early_init() to fix the mutex being
used before it is initialized.

This fixes the following lockdep splat:

[    3.867331] ------------[ cut here ]------------
[    3.867335] fbcon: Taking over console
[    3.867338] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[    3.867340] WARNING: CPU: 14 PID: 491 at kernel/locking/mutex.c:579 __mutex_lock+0x44c/0x830
[    3.867349] Modules linked in: amdgpu(+) crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel hid_lg_g15 iommu_v2 sp5100_tco nvme gpu_sched drm_dp_helper nvme_core ccp wmi video hid_logitech_dj ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse i2c_dev
[    3.867363] CPU: 14 PID: 491 Comm: systemd-udevd Tainted: G          I       5.18.0-rc5+ linux-sunxi#33
[    3.867366] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
[    3.867369] RIP: 0010:__mutex_lock+0x44c/0x830
[    3.867372] Code: ff 85 c0 0f 84 33 fc ff ff 8b 0d b7 50 25 01 85 c9 0f 85 25 fc ff ff 48 c7 c6 fb 41 82 99 48 c7 c7 6b 63 80 99 e8 88 2a f8 ff <0f> 0b e9 0b fc ff ff f6 83 b9 0c 00 00 01 0f 85 64 ff ff ff 4c 89
[    3.867377] RSP: 0018:ffffaef8c0fc79f0 EFLAGS: 00010286
[    3.867380] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000027
[    3.867382] RDX: ffff9ccc0dda0928 RSI: 0000000000000001 RDI: ffff9ccc0dda0920
[    3.867384] RBP: ffffaef8c0fc7a80 R08: 0000000000000000 R09: ffffaef8c0fc7820
[    3.867386] R10: 0000000000000003 R11: ffff9ccc2a2fffe8 R12: 0000000000000002
[    3.867388] R13: ffff9cc990808058 R14: 0000000000000000 R15: ffff9cc98bfc0000
[    3.867390] FS:  00007fc4d830f580(0000) GS:ffff9ccc0dd80000(0000) knlGS:0000000000000000
[    3.867394] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.867396] CR2: 0000560a77031410 CR3: 000000010f522000 CR4: 0000000000750ee0
[    3.867398] PKRU: 55555554
[    3.867399] Call Trace:
[    3.867401]  <TASK>
[    3.867403]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867533]  ? __mutex_lock+0x90/0x830
[    3.867535]  ? amdgpu_dpm_mode2_reset+0x37/0x60 [amdgpu]
[    3.867653]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867758]  smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867857]  smu_mode2_reset+0x2b/0x50 [amdgpu]
[    3.867953]  amdgpu_dpm_mode2_reset+0x46/0x60 [amdgpu]
[    3.868096]  amdgpu_device_init.cold+0x1069/0x1e78 [amdgpu]
[    3.868219]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    3.868222]  ? pci_conf1_read+0x9b/0xf0
[    3.868226]  amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
[    3.868314]  amdgpu_pci_probe+0x1a9/0x3c0 [amdgpu]
[    3.868398]  local_pci_probe+0x41/0x80
[    3.868401]  pci_device_probe+0xab/0x200
[    3.868404]  really_probe+0x1a1/0x370
[    3.868407]  __driver_probe_device+0xfc/0x170
[    3.868410]  driver_probe_device+0x1f/0x90
[    3.868412]  __driver_attach+0xbf/0x1a0
[    3.868414]  ? __device_attach_driver+0xe0/0xe0
[    3.868416]  bus_for_each_dev+0x65/0x90
[    3.868419]  bus_add_driver+0x151/0x1f0
[    3.868421]  driver_register+0x89/0xd0
[    3.868423]  ? 0xffffffffc0bd4000
[    3.868425]  do_one_initcall+0x5d/0x300
[    3.868428]  ? do_init_module+0x22/0x240
[    3.868431]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868434]  ? trace_kmalloc+0x30/0xe0
[    3.868437]  ? kmem_cache_alloc_trace+0x1e6/0x3a0
[    3.868440]  do_init_module+0x4a/0x240
[    3.868442]  __do_sys_finit_module+0x93/0xf0
[    3.868446]  do_syscall_64+0x5b/0x80
[    3.868449]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868451]  ? lockdep_hardirqs_on_prepare+0xd9/0x180
[    3.868454]  ? do_syscall_64+0x67/0x80
[    3.868456]  ? do_syscall_64+0x67/0x80
[    3.868458]  ? do_syscall_64+0x67/0x80
[    3.868460]  ? do_syscall_64+0x67/0x80
[    3.868462]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    3.868465] RIP: 0033:0x7fc4d8ec1ced
[    3.868467] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fb 70 0e 00 f7 d8 64 89 01 48
[    3.868472] RSP: 002b:00007fff687ae6b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.868475] RAX: ffffffffffffffda RBX: 0000560a76fbca60 RCX: 00007fc4d8ec1ced
[    3.868477] RDX: 0000000000000000 RSI: 00007fc4d902343c RDI: 0000000000000011
[    3.868479] RBP: 00007fc4d902343c R08: 0000000000000000 R09: 0000560a76fb59c0
[    3.868481] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000020000
[    3.868484] R13: 0000560a76f8bfd0 R14: 0000000000000000 R15: 0000560a76fc2d10
[    3.868487]  </TASK>
[    3.868489] irq event stamp: 120617
[    3.868490] hardirqs last  enabled at (120617): [<ffffffff9817169e>] __up_console_sem+0x5e/0x70
[    3.868494] hardirqs last disabled at (120616): [<ffffffff98171683>] __up_console_sem+0x43/0x70
[    3.868497] softirqs last  enabled at (119684): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868501] softirqs last disabled at (119679): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868504] ---[ end trace 0000000000000000 ]---

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jun 10, 2022
[ Upstream commit 4b9caaa ]

Lockdep complains about the smu->message_lock mutex being used before
it is initialized through the following call path:

amdgpu_device_init()
 amdgpu_dpm_mode2_reset()
  smu_mode2_reset()
   smu_v12_0_mode2_reset()
    smu_cmn_send_smc_msg_with_param()

Move the mutex_init() call to smu_early_init() to fix the mutex being
used before it is initialized.

This fixes the following lockdep splat:

[    3.867331] ------------[ cut here ]------------
[    3.867335] fbcon: Taking over console
[    3.867338] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[    3.867340] WARNING: CPU: 14 PID: 491 at kernel/locking/mutex.c:579 __mutex_lock+0x44c/0x830
[    3.867349] Modules linked in: amdgpu(+) crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel hid_lg_g15 iommu_v2 sp5100_tco nvme gpu_sched drm_dp_helper nvme_core ccp wmi video hid_logitech_dj ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse i2c_dev
[    3.867363] CPU: 14 PID: 491 Comm: systemd-udevd Tainted: G          I       5.18.0-rc5+ linux-sunxi#33
[    3.867366] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
[    3.867369] RIP: 0010:__mutex_lock+0x44c/0x830
[    3.867372] Code: ff 85 c0 0f 84 33 fc ff ff 8b 0d b7 50 25 01 85 c9 0f 85 25 fc ff ff 48 c7 c6 fb 41 82 99 48 c7 c7 6b 63 80 99 e8 88 2a f8 ff <0f> 0b e9 0b fc ff ff f6 83 b9 0c 00 00 01 0f 85 64 ff ff ff 4c 89
[    3.867377] RSP: 0018:ffffaef8c0fc79f0 EFLAGS: 00010286
[    3.867380] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000027
[    3.867382] RDX: ffff9ccc0dda0928 RSI: 0000000000000001 RDI: ffff9ccc0dda0920
[    3.867384] RBP: ffffaef8c0fc7a80 R08: 0000000000000000 R09: ffffaef8c0fc7820
[    3.867386] R10: 0000000000000003 R11: ffff9ccc2a2fffe8 R12: 0000000000000002
[    3.867388] R13: ffff9cc990808058 R14: 0000000000000000 R15: ffff9cc98bfc0000
[    3.867390] FS:  00007fc4d830f580(0000) GS:ffff9ccc0dd80000(0000) knlGS:0000000000000000
[    3.867394] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.867396] CR2: 0000560a77031410 CR3: 000000010f522000 CR4: 0000000000750ee0
[    3.867398] PKRU: 55555554
[    3.867399] Call Trace:
[    3.867401]  <TASK>
[    3.867403]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867533]  ? __mutex_lock+0x90/0x830
[    3.867535]  ? amdgpu_dpm_mode2_reset+0x37/0x60 [amdgpu]
[    3.867653]  ? smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867758]  smu_cmn_send_smc_msg_with_param+0x98/0x240 [amdgpu]
[    3.867857]  smu_mode2_reset+0x2b/0x50 [amdgpu]
[    3.867953]  amdgpu_dpm_mode2_reset+0x46/0x60 [amdgpu]
[    3.868096]  amdgpu_device_init.cold+0x1069/0x1e78 [amdgpu]
[    3.868219]  ? _raw_spin_unlock_irqrestore+0x30/0x50
[    3.868222]  ? pci_conf1_read+0x9b/0xf0
[    3.868226]  amdgpu_driver_load_kms+0x15/0x110 [amdgpu]
[    3.868314]  amdgpu_pci_probe+0x1a9/0x3c0 [amdgpu]
[    3.868398]  local_pci_probe+0x41/0x80
[    3.868401]  pci_device_probe+0xab/0x200
[    3.868404]  really_probe+0x1a1/0x370
[    3.868407]  __driver_probe_device+0xfc/0x170
[    3.868410]  driver_probe_device+0x1f/0x90
[    3.868412]  __driver_attach+0xbf/0x1a0
[    3.868414]  ? __device_attach_driver+0xe0/0xe0
[    3.868416]  bus_for_each_dev+0x65/0x90
[    3.868419]  bus_add_driver+0x151/0x1f0
[    3.868421]  driver_register+0x89/0xd0
[    3.868423]  ? 0xffffffffc0bd4000
[    3.868425]  do_one_initcall+0x5d/0x300
[    3.868428]  ? do_init_module+0x22/0x240
[    3.868431]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868434]  ? trace_kmalloc+0x30/0xe0
[    3.868437]  ? kmem_cache_alloc_trace+0x1e6/0x3a0
[    3.868440]  do_init_module+0x4a/0x240
[    3.868442]  __do_sys_finit_module+0x93/0xf0
[    3.868446]  do_syscall_64+0x5b/0x80
[    3.868449]  ? rcu_read_lock_sched_held+0x3c/0x70
[    3.868451]  ? lockdep_hardirqs_on_prepare+0xd9/0x180
[    3.868454]  ? do_syscall_64+0x67/0x80
[    3.868456]  ? do_syscall_64+0x67/0x80
[    3.868458]  ? do_syscall_64+0x67/0x80
[    3.868460]  ? do_syscall_64+0x67/0x80
[    3.868462]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    3.868465] RIP: 0033:0x7fc4d8ec1ced
[    3.868467] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fb 70 0e 00 f7 d8 64 89 01 48
[    3.868472] RSP: 002b:00007fff687ae6b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.868475] RAX: ffffffffffffffda RBX: 0000560a76fbca60 RCX: 00007fc4d8ec1ced
[    3.868477] RDX: 0000000000000000 RSI: 00007fc4d902343c RDI: 0000000000000011
[    3.868479] RBP: 00007fc4d902343c R08: 0000000000000000 R09: 0000560a76fb59c0
[    3.868481] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000020000
[    3.868484] R13: 0000560a76f8bfd0 R14: 0000000000000000 R15: 0000560a76fc2d10
[    3.868487]  </TASK>
[    3.868489] irq event stamp: 120617
[    3.868490] hardirqs last  enabled at (120617): [<ffffffff9817169e>] __up_console_sem+0x5e/0x70
[    3.868494] hardirqs last disabled at (120616): [<ffffffff98171683>] __up_console_sem+0x43/0x70
[    3.868497] softirqs last  enabled at (119684): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868501] softirqs last disabled at (119679): [<ffffffff980ee83a>] __irq_exit_rcu+0xca/0x100
[    3.868504] ---[ end trace 0000000000000000 ]---

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Oct 26, 2022
…ller

[ Upstream commit 9d2ce78 ]

Naveen reported recursive locking of direct_mutex with sample
ftrace-direct-modify.ko:

[   74.762406] WARNING: possible recursive locking detected
[   74.762887] 6.0.0-rc6+ linux-sunxi#33 Not tainted
[   74.763216] --------------------------------------------
[   74.763672] event-sample-fn/1084 is trying to acquire lock:
[   74.764152] ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
    register_ftrace_function+0x1f/0x180
[   74.764922]
[   74.764922] but task is already holding lock:
[   74.765421] ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
    modify_ftrace_direct+0x34/0x1f0
[   74.766142]
[   74.766142] other info that might help us debug this:
[   74.766701]  Possible unsafe locking scenario:
[   74.766701]
[   74.767216]        CPU0
[   74.767437]        ----
[   74.767656]   lock(direct_mutex);
[   74.767952]   lock(direct_mutex);
[   74.768245]
[   74.768245]  *** DEADLOCK ***
[   74.768245]
[   74.768750]  May be due to missing lock nesting notation
[   74.768750]
[   74.769332] 1 lock held by event-sample-fn/1084:
[   74.769731]  #0: ffffffff86c9d6b0 (direct_mutex){+.+.}-{3:3}, at: \
    modify_ftrace_direct+0x34/0x1f0
[   74.770496]
[   74.770496] stack backtrace:
[   74.770884] CPU: 4 PID: 1084 Comm: event-sample-fn Not tainted ...
[   74.771498] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
[   74.772474] Call Trace:
[   74.772696]  <TASK>
[   74.772896]  dump_stack_lvl+0x44/0x5b
[   74.773223]  __lock_acquire.cold.74+0xac/0x2b7
[   74.773616]  lock_acquire+0xd2/0x310
[   74.773936]  ? register_ftrace_function+0x1f/0x180
[   74.774357]  ? lock_is_held_type+0xd8/0x130
[   74.774744]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
[   74.775213]  __mutex_lock+0x99/0x1010
[   74.775536]  ? register_ftrace_function+0x1f/0x180
[   74.775954]  ? slab_free_freelist_hook.isra.43+0x115/0x160
[   74.776424]  ? ftrace_set_hash+0x195/0x220
[   74.776779]  ? register_ftrace_function+0x1f/0x180
[   74.777194]  ? kfree+0x3e1/0x440
[   74.777482]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
[   74.777941]  ? __schedule+0xb40/0xb40
[   74.778258]  ? register_ftrace_function+0x1f/0x180
[   74.778672]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
[   74.779128]  register_ftrace_function+0x1f/0x180
[   74.779527]  ? ftrace_set_filter_ip+0x33/0x70
[   74.779910]  ? __schedule+0xb40/0xb40
[   74.780231]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
[   74.780678]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
[   74.781147]  ftrace_modify_direct_caller+0x5b/0x90
[   74.781563]  ? 0xffffffffa0201000
[   74.781859]  ? my_tramp1+0xf/0xf [ftrace_direct_modify]
[   74.782309]  modify_ftrace_direct+0x1b2/0x1f0
[   74.782690]  ? __schedule+0xb40/0xb40
[   74.783014]  ? simple_thread+0x2a/0xb0 [ftrace_direct_modify]
[   74.783508]  ? __schedule+0xb40/0xb40
[   74.783832]  ? my_tramp2+0x11/0x11 [ftrace_direct_modify]
[   74.784294]  simple_thread+0x76/0xb0 [ftrace_direct_modify]
[   74.784766]  kthread+0xf5/0x120
[   74.785052]  ? kthread_complete_and_exit+0x20/0x20
[   74.785464]  ret_from_fork+0x22/0x30
[   74.785781]  </TASK>

Fix this by using register_ftrace_function_nolock in
ftrace_modify_direct_caller.

Link: https://lkml.kernel.org/r/20220927004146.1215303-1-song@kernel.org

Fixes: 53cd885 ("ftrace: Allow IPMODIFY and DIRECT ops on the same function")
Reported-and-tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Dec 17, 2022
By keep sending L2CAP_CONF_REQ packets, chan->num_conf_rsp increases
multiple times and eventually it will wrap around the maximum number
(i.e., 255).
This patch prevents this by adding a boundary check with
L2CAP_MAX_CONF_RSP

Btmon log:
Bluetooth monitor ver 5.64
= Note: Linux version 6.1.0-rc2 (x86_64)                               0.264594
= Note: Bluetooth subsystem version 2.22                               0.264636
@ MGMT Open: btmon (privileged) version 1.22                  {0x0001} 0.272191
= New Index: 00:00:00:00:00:00 (Primary,Virtual,hci0)          [hci0] 13.877604
@ RAW Open: 9496 (privileged) version 2.22                   {0x0002} 13.890741
= Open Index: 00:00:00:00:00:00                                [hci0] 13.900426
(...)
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             linux-sunxi#32 [hci0] 14.273106
        invalid packet size (12 != 1033)
        08 00 01 00 02 01 04 00 01 10 ff ff              ............
> ACL Data RX: Handle 200 flags 0x00 dlen 1547             linux-sunxi#33 [hci0] 14.273561
        invalid packet size (14 != 1547)
        0a 00 01 00 04 01 06 00 40 00 00 00 00 00        ........@.....
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             linux-sunxi#34 [hci0] 14.274390
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 00 00 00 04  ........@.......
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             linux-sunxi#35 [hci0] 14.274932
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 07 00 03 00  ........@.......
= bluetoothd: Bluetooth daemon 5.43                                   14.401828
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             linux-sunxi#36 [hci0] 14.275753
        invalid packet size (12 != 1033)
        08 00 01 00 04 01 04 00 40 00 00 00              ........@...

Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Dec 31, 2022
[ Upstream commit bcd7026 ]

By keep sending L2CAP_CONF_REQ packets, chan->num_conf_rsp increases
multiple times and eventually it will wrap around the maximum number
(i.e., 255).
This patch prevents this by adding a boundary check with
L2CAP_MAX_CONF_RSP

Btmon log:
Bluetooth monitor ver 5.64
= Note: Linux version 6.1.0-rc2 (x86_64)                               0.264594
= Note: Bluetooth subsystem version 2.22                               0.264636
@ MGMT Open: btmon (privileged) version 1.22                  {0x0001} 0.272191
= New Index: 00:00:00:00:00:00 (Primary,Virtual,hci0)          [hci0] 13.877604
@ RAW Open: 9496 (privileged) version 2.22                   {0x0002} 13.890741
= Open Index: 00:00:00:00:00:00                                [hci0] 13.900426
(...)
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             linux-sunxi#32 [hci0] 14.273106
        invalid packet size (12 != 1033)
        08 00 01 00 02 01 04 00 01 10 ff ff              ............
> ACL Data RX: Handle 200 flags 0x00 dlen 1547             linux-sunxi#33 [hci0] 14.273561
        invalid packet size (14 != 1547)
        0a 00 01 00 04 01 06 00 40 00 00 00 00 00        ........@.....
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             linux-sunxi#34 [hci0] 14.274390
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 00 00 00 04  ........@.......
> ACL Data RX: Handle 200 flags 0x00 dlen 2061             linux-sunxi#35 [hci0] 14.274932
        invalid packet size (16 != 2061)
        0c 00 01 00 04 01 08 00 40 00 00 00 07 00 03 00  ........@.......
= bluetoothd: Bluetooth daemon 5.43                                   14.401828
> ACL Data RX: Handle 200 flags 0x00 dlen 1033             linux-sunxi#36 [hci0] 14.275753
        invalid packet size (12 != 1033)
        08 00 01 00 04 01 04 00 40 00 00 00              ........@...

Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants