Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem after commit: sunxi-disp: Add warn prints, fix cache flush... 0165c7d303d8 #72

Closed
Quarx2k opened this issue Sep 7, 2012 · 2 comments

Comments

@Quarx2k
Copy link

Quarx2k commented Sep 7, 2012

amery@0165c7d
This commit made slow updating screen on my ainol aurora.. (like a slow-mode in pc games)
Display has 60Hz refresh rate.

@techn
Copy link

techn commented Sep 7, 2012

techn@a418ab0

@amery amery closed this as completed in 2db5a48 Sep 7, 2012
@amery
Copy link
Member

amery commented Sep 7, 2012

I really wonder what were they trying to fix with that change....

amery added a commit that referenced this issue Oct 3, 2012
introduced by 0165c7d from a bulk update of
the disp driver coming from behind the wall makes the rendering unsuably slow.

closes #72
amery pushed a commit that referenced this issue Nov 12, 2013
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
amery pushed a commit that referenced this issue Aug 7, 2014
pci_get_slot() is called with hold of PCI bus semaphore and it's not
safe to be called in interrupt context. However, we possibly checks
EEH error and calls the function in interrupt context. To avoid using
pci_get_slot(), we turn into device tree for fetching location code.
Otherwise, we might run into WARN_ON() as following messages indicate:

 WARNING: at drivers/pci/search.c:223
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc3+ #72
 task: c000000001367af0 ti: c000000001444000 task.ti: c000000001444000
 NIP: c000000000497b70 LR: c000000000037530 CTR: 000000003003d114
 REGS: c000000001446fa0 TRAP: 0700   Not tainted  (3.16.0-rc3+)
 MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 48002422  XER: 20000000
 CFAR: c00000000003752c SOFTE: 0
   :
 NIP [c000000000497b70] .pci_get_slot+0x40/0x110
 LR [c000000000037530] .eeh_pe_loc_get+0x150/0x190
 Call Trace:
   .of_get_property+0x30/0x60 (unreliable)
   .eeh_pe_loc_get+0x150/0x190
   .eeh_dev_check_failure+0x1b4/0x550
   .eeh_check_failure+0x90/0xf0
   .lpfc_sli_check_eratt+0x504/0x7c0 [lpfc]
   .lpfc_poll_eratt+0x64/0x100 [lpfc]
   .call_timer_fn+0x64/0x190
   .run_timer_softirq+0x2cc/0x3e0

Cc: stable@vger.kernel.org
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
amery pushed a commit that referenced this issue Oct 9, 2014
Suspicious RCU usage in qdisc_watchdog call needs to be done inside
rcu_read_lock/rcu_read_unlock. And then Qdisc destroy operations
need to ensure timer is cancelled before removing qdisc structure.

[ 3992.191339] ===============================
[ 3992.191340] [ INFO: suspicious RCU usage. ]
[ 3992.191343] 3.17.0-rc6net-next+ #72 Not tainted
[ 3992.191345] -------------------------------
[ 3992.191347] include/net/sch_generic.h:272 suspicious rcu_dereference_check() usage!
[ 3992.191348]
[ 3992.191348] other info that might help us debug this:
[ 3992.191348]
[ 3992.191351]
[ 3992.191351] rcu_scheduler_active = 1, debug_locks = 1
[ 3992.191353] no locks held by swapper/1/0.
[ 3992.191355]
[ 3992.191355] stack backtrace:
[ 3992.191358] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.17.0-rc6net-next+ #72
[ 3992.191360] Hardware name:                  /DZ77RE-75K, BIOS GAZ7711H.86A.0060.2012.1115.1750 11/15/2012
[ 3992.191362]  0000000000000001 ffff880235803e48 ffffffff8178f92c 0000000000000000
[ 3992.191366]  ffff8802322224a0 ffff880235803e78 ffffffff810c9966 ffff8800a5fe3000
[ 3992.191370]  ffff880235803f30 ffff8802359cd768 ffff8802359cd6e0 ffff880235803e98
[ 3992.191374] Call Trace:
[ 3992.191376]  <IRQ>  [<ffffffff8178f92c>] dump_stack+0x4e/0x68
[ 3992.191387]  [<ffffffff810c9966>] lockdep_rcu_suspicious+0xe6/0x130
[ 3992.191392]  [<ffffffff8167213a>] qdisc_watchdog+0x8a/0xb0
[ 3992.191396]  [<ffffffff810f93f2>] __run_hrtimer+0x72/0x420
[ 3992.191399]  [<ffffffff810f9bcd>] ? hrtimer_interrupt+0x7d/0x240
[ 3992.191403]  [<ffffffff816720b0>] ? tc_classify+0xc0/0xc0
[ 3992.191406]  [<ffffffff810f9c4f>] hrtimer_interrupt+0xff/0x240
[ 3992.191410]  [<ffffffff8109e4a5>] ? __atomic_notifier_call_chain+0x5/0x140
[ 3992.191415]  [<ffffffff8103577b>] local_apic_timer_interrupt+0x3b/0x60
[ 3992.191419]  [<ffffffff8179c2b5>] smp_apic_timer_interrupt+0x45/0x60
[ 3992.191422]  [<ffffffff8179a6bf>] apic_timer_interrupt+0x6f/0x80
[ 3992.191424]  <EOI>  [<ffffffff815ed233>] ? cpuidle_enter_state+0x73/0x2e0
[ 3992.191432]  [<ffffffff815ed22e>] ? cpuidle_enter_state+0x6e/0x2e0
[ 3992.191437]  [<ffffffff815ed567>] cpuidle_enter+0x17/0x20
[ 3992.191441]  [<ffffffff810c0741>] cpu_startup_entry+0x3d1/0x4a0
[ 3992.191445]  [<ffffffff81106fc6>] ? clockevents_config_and_register+0x26/0x30
[ 3992.191448]  [<ffffffff81033c16>] start_secondary+0x1b6/0x260

Fixes: b26b0d1 ("net: qdisc: use rcu prefix and silence sparse warnings")
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amery pushed a commit that referenced this issue Aug 6, 2016
objdump's raw insn output can vary across architectures on the number of
bytes per chunk (bpc) displayed and their endianness.

The code-reading test relied on reading objdump output as 1 bpc. Kaixu
Xia reported test failure on ARM64, where objdump displays 4 bpc:

  70c48:        f90027bf         str        xzr, [x29,#72]
  70c4c:        91224000         add        x0, x0, #0x890
  70c50:        f90023a0         str        x0, [x29,#64]

This patch adds support to read raw insn output for any bpc length.
In case of 2+ bpc it also guesses objdump's display endian.

Reported-and-Tested-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/07f0f7bcbda78deb423298708ef9b6a54d6b92bd.1452592712.git.jstancek@redhat.com
[ Fix up pr_fmt() call to use %zd for size_t variables, fixing the build on Ubuntu cross-compiling to armhf and ppc64 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
amery pushed a commit that referenced this issue Sep 17, 2016
On lubbock board, the probe of the driver crashes by dereferencing very
early a platform_data structure which is not set, in
pxa2xx_configure_sockets().

The stack fixed is :
[    0.244353] SA1111 Microprocessor Companion Chip: silicon revision 1, metal revision 1
[    0.256321] sa1111 sa1111: Providing IRQ336-390
[    0.340899] clocksource: Switched to clocksource oscr0
[    0.472263] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[    0.480469] pgd = c0004000
[    0.483432] [00000004] *pgd=00000000
[    0.487105] Internal error: Oops: f5 [#1] ARM
[    0.491497] Modules linked in:
[    0.494650] CPU: 0 PID: 1 Comm: swapper Not tainted 4.8.0-rc3-00080-g1aaa68426f0c-dirty #2068
[    0.503229] Hardware name: Intel DBPXA250 Development Platform (aka Lubbock)
[    0.510344] task: c3e42000 task.stack: c3e44000
[    0.514984] PC is at pxa2xx_configure_sockets+0x4/0x24 (drivers/pcmcia/pxa2xx_base.c:227)
[    0.520193] LR is at pcmcia_lubbock_init+0x1c/0x38
[    0.525079] pc : [<c0247c30>]    lr : [<c02479b0>]    psr: a0000053
[    0.525079] sp : c3e45e70  ip : 100019ff  fp : 00000000
[    0.536651] r10: c0828900  r9 : c0434838  r8 : 00000000
[    0.541953] r7 : c0820700  r6 : c0857b30  r5 : c3ec1400  r4 : c0820758
[    0.548549] r3 : 00000000  r2 : 0000000c  r1 : c3c09c40  r0 : c3ec1400
[    0.555154] Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
[    0.562450] Control: 0000397f  Table: a0004000  DAC: 00000053
[    0.568257] Process swapper (pid: 1, stack limit = 0xc3e44190)
[    0.574154] Stack: (0xc3e45e70 to 0xc3e46000)
[    0.578610] 5e60:                                     c4849800 00000000 c3ec1400 c024769c
[    0.586928] 5e80: 00000000 c3ec140c c3c0ee0c c3ec1400 c3ec1434 c020c410 c3ec1400 c3ec1434
[    0.595244] 5ea0: c0820700 c080b408 c0828900 c020c5f8 00000000 c0820700 c020c578 c020ac5c
[    0.603560] 5ec0: c3e687cc c3e71e10 c0820700 00000000 c3c02de0 c020bae4 c03c62f7 c03c62f7
[    0.611872] 5ee0: c3e68780 c0820700 c042e034 00000000 c043c440 c020cdec c080b408 00000005
[    0.620188] 5f00: c042e034 c00096c0 c0034440 c01c730c 20000053 ffffffff 00000000 00000000
[    0.628502] 5f20: 00000000 c3ffcb87 c3ffcb90 c00346ac c3e66ba0 c03f7914 00000092 00000005
[    0.636811] 5f40: 00000005 c03f847c 00000091 c03f847c 00000000 00000005 c0434828 00000005
[    0.645125] 5f60: c043482c 00000092 c043c440 c0828900 c0434838 c0418d2c 00000005 00000005
[    0.653430] 5f80: 00000000 c041858c 00000000 c032e9f0 00000000 00000000 00000000 00000000
[    0.661729] 5fa0: 00000000 c032e9f8 00000000 c000f0f0 00000000 00000000 00000000 00000000
[    0.670020] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.678311] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    0.686673] (pxa2xx_configure_sockets) from pcmcia_lubbock_init (/drivers/pcmcia/sa1111_lubbock.c:161)
[    0.696026] (pcmcia_lubbock_init) from pcmcia_probe (/drivers/pcmcia/sa1111_generic.c:213)
[    0.704358] (pcmcia_probe) from driver_probe_device (/drivers/base/dd.c:378 /drivers/base/dd.c:499)
[    0.712848] (driver_probe_device) from __driver_attach (/./include/linux/device.h:983 /drivers/base/dd.c:733)
[    0.721414] (__driver_attach) from bus_for_each_dev (/drivers/base/bus.c:313)
[    0.729723] (bus_for_each_dev) from bus_add_driver (/drivers/base/bus.c:708)
[    0.738036] (bus_add_driver) from driver_register (/drivers/base/driver.c:169)
[    0.746185] (driver_register) from do_one_initcall (/init/main.c:778)
[    0.754561] (do_one_initcall) from kernel_init_freeable (/init/main.c:843 /init/main.c:851 /init/main.c:869 /init/main.c:1016)
[    0.763409] (kernel_init_freeable) from kernel_init (/init/main.c:944)
[    0.771660] (kernel_init) from ret_from_fork (/arch/arm/kernel/entry-common.S:119)
[ 0.779347] Code: c03c6305 c03c631e c03c632e e5903048 (e993000c)
All code
========
   0:	c03c6305 	eorsgt	r6, ip, r5, lsl #6
   4:	c03c631e 	eorsgt	r6, ip, lr, lsl r3
   8:	c03c632e 	eorsgt	r6, ip, lr, lsr #6
   c:	e5903048 	ldr	r3, [r0, #72]	; 0x48
  10:*	e993000c 	ldmib	r3, {r2, r3}		<-- trapping instruction

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
amery pushed a commit that referenced this issue May 10, 2017
When raid56 dev-replace is cancelled by running scrub, we will free
target device without waiting for in-flight bios, causing the following
NULL pointer deference or general protection failure.

 BUG: unable to handle kernel NULL pointer dereference at 00000000000005e0
 IP: generic_make_request_checks+0x4d/0x610
 CPU: 1 PID: 11676 Comm: kworker/u4:14 Tainted: G  O    4.11.0-rc2 #72
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
 Workqueue: btrfs-endio-raid56 btrfs_endio_raid56_helper [btrfs]
 task: ffff88002875b4c0 task.stack: ffffc90001334000
 RIP: 0010:generic_make_request_checks+0x4d/0x610
 Call Trace:
  ? generic_make_request+0xc7/0x360
  generic_make_request+0x24/0x360
  ? generic_make_request+0xc7/0x360
  submit_bio+0x64/0x120
  ? page_in_rbio+0x4d/0x80 [btrfs]
  ? rbio_orig_end_io+0x80/0x80 [btrfs]
  finish_rmw+0x3f4/0x540 [btrfs]
  validate_rbio_for_rmw+0x36/0x40 [btrfs]
  raid_rmw_end_io+0x7a/0x90 [btrfs]
  bio_endio+0x56/0x60
  end_workqueue_fn+0x3c/0x40 [btrfs]
  btrfs_scrubparity_helper+0xef/0x620 [btrfs]
  btrfs_endio_raid56_helper+0xe/0x10 [btrfs]
  process_one_work+0x2af/0x720
  ? process_one_work+0x22b/0x720
  worker_thread+0x4b/0x4f0
  kthread+0x10f/0x150
  ? process_one_work+0x720/0x720
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x2e/0x40
 RIP: generic_make_request_checks+0x4d/0x610 RSP: ffffc90001337bb8

In btrfs_dev_replace_finishing(), we will call
btrfs_rm_dev_replace_blocked() to wait bios before destroying the target
device when scrub is finished normally.

However when dev-replace is aborted, either due to error or cancelled by
scrub, we didn't wait for bios, this can lead to use-after-free if there
are bios holding the target device.

Furthermore, for raid56 scrub, at least 2 places are calling
btrfs_map_sblock() without protection of bio_counter, leading to the
problem.

This patch fixes the problem:
1) Wait for bio_counter before freeing target device when canceling
   replace
2) When calling btrfs_map_sblock() for raid56, use bio_counter to
   protect the call.

Cc: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
amery pushed a commit that referenced this issue Oct 24, 2017
kvmppc_gpa_to_ua() accesses KVM memory slot array via
srcu_dereference_check() and this produces warnings from RCU like below.

This extends the existing srcu_read_lock/unlock to cover that
kvmppc_gpa_to_ua() as well.

We did not hit this before as this lock is not needed for the realmode
handlers and hash guests would use the realmode path all the time;
however the radix guests are always redirected to the virtual mode
handlers and hence the warning.

[   68.253798] ./include/linux/kvm_host.h:575 suspicious rcu_dereference_check() usage!
[   68.253799]
               other info that might help us debug this:

[   68.253802]
               rcu_scheduler_active = 2, debug_locks = 1
[   68.253804] 1 lock held by qemu-system-ppc/6413:
[   68.253806]  #0:  (&vcpu->mutex){+.+.}, at: [<c00800000e3c22f4>] vcpu_load+0x3c/0xc0 [kvm]
[   68.253826]
               stack backtrace:
[   68.253830] CPU: 92 PID: 6413 Comm: qemu-system-ppc Tainted: G        W       4.14.0-rc3-00553-g432dcba58e9c-dirty #72
[   68.253833] Call Trace:
[   68.253839] [c000000fd3d9f790] [c000000000b7fcc8] dump_stack+0xe8/0x160 (unreliable)
[   68.253845] [c000000fd3d9f7d0] [c0000000001924c0] lockdep_rcu_suspicious+0x110/0x180
[   68.253851] [c000000fd3d9f850] [c0000000000e825c] kvmppc_gpa_to_ua+0x26c/0x2b0
[   68.253858] [c000000fd3d9f8b0] [c00800000e3e1984] kvmppc_h_put_tce+0x12c/0x2a0 [kvm]

Fixes: 121f80b ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
amery pushed a commit that referenced this issue Jun 7, 2018
Trivial fix to remove the following sparse warnings:

  arch/powerpc/kernel/module_32.c:112:74: warning: Using plain integer as NULL pointer
  arch/powerpc/kernel/module_32.c:117:74: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:1155:28: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:1230:20: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:1385:36: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:1752:23: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:2084:19: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:2110:32: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:2167:19: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:2183:19: warning: Using plain integer as NULL pointer
  drivers/macintosh/via-pmu.c:277:20: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:155:67: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:247:27: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:249:27: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/setup.c:252:37: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:127:21: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:148:21: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:44:21: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:57:21: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/tlb_hash32.c:87:21: warning: Using plain integer as NULL pointer
  arch/powerpc/kernel/btext.c:160:31: warning: Using plain integer as NULL pointer
  arch/powerpc/kernel/btext.c:167:22: warning: Using plain integer as NULL pointer
  arch/powerpc/kernel/btext.c:274:21: warning: Using plain integer as NULL pointer
  arch/powerpc/kernel/btext.c:285:31: warning: Using plain integer as NULL pointer
  arch/powerpc/include/asm/hugetlb.h:204:16: warning: Using plain integer as NULL pointer
  arch/powerpc/mm/ppc_mmu_32.c:170:21: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/pci.c:1227:23: warning: Using plain integer as NULL pointer
  arch/powerpc/platforms/powermac/pci.c:65:24: warning: Using plain integer as NULL pointer

Also use `--fix` command line option from `script/checkpatch --strict` to
remove the following:

  CHECK: Comparison to NULL could be written "!dispDeviceBase"
  #72: FILE: arch/powerpc/kernel/btext.c:160:
  +	if (dispDeviceBase == NULL)

  CHECK: Comparison to NULL could be written "!vbase"
  #80: FILE: arch/powerpc/kernel/btext.c:167:
  +	if (vbase == NULL)

  CHECK: Comparison to NULL could be written "!base"
  #89: FILE: arch/powerpc/kernel/btext.c:274:
  +	if (base == NULL)

  CHECK: Comparison to NULL could be written "!dispDeviceBase"
  #98: FILE: arch/powerpc/kernel/btext.c:285:
  +	if (dispDeviceBase == NULL)

  CHECK: Comparison to NULL could be written "strstr"
  #117: FILE: arch/powerpc/kernel/module_32.c:117:
  +		if (strstr(secstrings + sechdrs[i].sh_name, ".debug") != NULL)

  CHECK: Comparison to NULL could be written "!Hash"
  #130: FILE: arch/powerpc/mm/ppc_mmu_32.c:170:
  +	if (Hash == NULL)

  CHECK: Comparison to NULL could be written "Hash"
  #143: FILE: arch/powerpc/mm/tlb_hash32.c:44:
  +	if (Hash != NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #152: FILE: arch/powerpc/mm/tlb_hash32.c:57:
  +	if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #161: FILE: arch/powerpc/mm/tlb_hash32.c:87:
  +	if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #170: FILE: arch/powerpc/mm/tlb_hash32.c:127:
  +	if (Hash == NULL) {

  CHECK: Comparison to NULL could be written "!Hash"
  #179: FILE: arch/powerpc/mm/tlb_hash32.c:148:
  +	if (Hash == NULL) {

  ERROR: space required after that ';' (ctx:VxV)
  #192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
  +	for (; node != NULL;node = node->sibling) {

  CHECK: Comparison to NULL could be written "node"
  #192: FILE: arch/powerpc/platforms/powermac/pci.c:65:
  +	for (; node != NULL;node = node->sibling) {

  CHECK: Comparison to NULL could be written "!region"
  #201: FILE: arch/powerpc/platforms/powermac/pci.c:1227:
  +	if (region == NULL)

  CHECK: Comparison to NULL could be written "of_get_property"
  #214: FILE: arch/powerpc/platforms/powermac/setup.c:155:
  +		if (of_get_property(np, "cache-unified", NULL) != NULL && dc) {

  CHECK: Comparison to NULL could be written "!np"
  #223: FILE: arch/powerpc/platforms/powermac/setup.c:247:
  +		if (np == NULL)

  CHECK: Comparison to NULL could be written "np"
  #226: FILE: arch/powerpc/platforms/powermac/setup.c:249:
  +		if (np != NULL) {

  CHECK: Comparison to NULL could be written "l2cr"
  #230: FILE: arch/powerpc/platforms/powermac/setup.c:252:
  +			if (l2cr != NULL) {

  CHECK: Comparison to NULL could be written "via"
  #243: FILE: drivers/macintosh/via-pmu.c:277:
  +	if (via != NULL)

  CHECK: Comparison to NULL could be written "current_req"
  #252: FILE: drivers/macintosh/via-pmu.c:1155:
  +	if (current_req != NULL) {

  CHECK: Comparison to NULL could be written "!req"
  #261: FILE: drivers/macintosh/via-pmu.c:1230:
  +	if (req == NULL || pmu_state != idle

  CHECK: Comparison to NULL could be written "!req"
  #270: FILE: drivers/macintosh/via-pmu.c:1385:
  +			if (req == NULL) {

  CHECK: Comparison to NULL could be written "!pp"
  #288: FILE: drivers/macintosh/via-pmu.c:2084:
  +	if (pp == NULL)

  CHECK: Comparison to NULL could be written "!pp"
  #297: FILE: drivers/macintosh/via-pmu.c:2110:
  +	if (count < 1 || pp == NULL)

  CHECK: Comparison to NULL could be written "!pp"
  #306: FILE: drivers/macintosh/via-pmu.c:2167:
  +	if (pp == NULL)

  CHECK: Comparison to NULL could be written "pp"
  torvalds#315: FILE: drivers/macintosh/via-pmu.c:2183:
  +	if (pp != NULL) {

Link: https://github.com/linuxppc/linux/issues/37
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
amery pushed a commit that referenced this issue Aug 14, 2018
Dave Hansen reported, that it's outright dangerous to keep SMT siblings
disabled completely so they are stuck in the BIOS and wait for SIPI.

The reason is that Machine Check Exceptions are broadcasted to siblings and
the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a
logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
reboots the machine. The MCE chapter in the SDM contains the following
blurb:

    Because the logical processors within a physical package are tightly
    coupled with respect to shared hardware resources, both logical
    processors are notified of machine check errors that occur within a
    given physical processor. If machine-check exceptions are enabled when
    a fatal error is reported, all the logical processors within a physical
    package are dispatched to the machine-check exception handler. If
    machine-check exceptions are disabled, the logical processors enter the
    shutdown state and assert the IERR# signal. When enabling machine-check
    exceptions, the MCE flag in control register CR4 should be set for each
    logical processor.

Reverting the commit which ignores siblings at enumeration time solves only
half of the problem. The core cpuhotplug logic needs to be adjusted as
well.

This thoughtful engineered mechanism also turns the boot process on all
Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
before the secondary CPUs are brought up. Depending on the number of
physical cores the window in which this situation can happen is smaller or
larger. On a HSW-EX it's about 750ms:

MCE is enabled on the boot CPU:

[    0.244017] mce: CPU supports 22 MCE banks

The corresponding sibling #72 boots:

[    1.008005] .... node  #0, CPUs:    #72

That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown. At least it's a
known safe state.

It's obvious that the early boot can be hit by an MCE as well and then runs
into the same situation because MCEs are not yet enabled on the boot CPU.
But after enabling them on the boot CPU, it does not make any sense to
prevent the kernel from recovering.

Adjust the nosmt kernel parameter documentation as well.

Reverts: 2207def ("x86/apic: Ignore secondary threads if nosmt=force")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Luck <tony.luck@intel.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Aug 15, 2018
commit 506a66f upstream

Dave Hansen reported, that it's outright dangerous to keep SMT siblings
disabled completely so they are stuck in the BIOS and wait for SIPI.

The reason is that Machine Check Exceptions are broadcasted to siblings and
the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a
logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
reboots the machine. The MCE chapter in the SDM contains the following
blurb:

    Because the logical processors within a physical package are tightly
    coupled with respect to shared hardware resources, both logical
    processors are notified of machine check errors that occur within a
    given physical processor. If machine-check exceptions are enabled when
    a fatal error is reported, all the logical processors within a physical
    package are dispatched to the machine-check exception handler. If
    machine-check exceptions are disabled, the logical processors enter the
    shutdown state and assert the IERR# signal. When enabling machine-check
    exceptions, the MCE flag in control register CR4 should be set for each
    logical processor.

Reverting the commit which ignores siblings at enumeration time solves only
half of the problem. The core cpuhotplug logic needs to be adjusted as
well.

This thoughtful engineered mechanism also turns the boot process on all
Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
before the secondary CPUs are brought up. Depending on the number of
physical cores the window in which this situation can happen is smaller or
larger. On a HSW-EX it's about 750ms:

MCE is enabled on the boot CPU:

[    0.244017] mce: CPU supports 22 MCE banks

The corresponding sibling linux-sunxi#72 boots:

[    1.008005] .... node  #0, CPUs:    linux-sunxi#72

That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown. At least it's a
known safe state.

It's obvious that the early boot can be hit by an MCE as well and then runs
into the same situation because MCEs are not yet enabled on the boot CPU.
But after enabling them on the boot CPU, it does not make any sense to
prevent the kernel from recovering.

Adjust the nosmt kernel parameter documentation as well.

Reverts: 2207def ("x86/apic: Ignore secondary threads if nosmt=force")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Dec 7, 2018
commit 506a66f upstream

Dave Hansen reported, that it's outright dangerous to keep SMT siblings
disabled completely so they are stuck in the BIOS and wait for SIPI.

The reason is that Machine Check Exceptions are broadcasted to siblings and
the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a
logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
reboots the machine. The MCE chapter in the SDM contains the following
blurb:

    Because the logical processors within a physical package are tightly
    coupled with respect to shared hardware resources, both logical
    processors are notified of machine check errors that occur within a
    given physical processor. If machine-check exceptions are enabled when
    a fatal error is reported, all the logical processors within a physical
    package are dispatched to the machine-check exception handler. If
    machine-check exceptions are disabled, the logical processors enter the
    shutdown state and assert the IERR# signal. When enabling machine-check
    exceptions, the MCE flag in control register CR4 should be set for each
    logical processor.

Reverting the commit which ignores siblings at enumeration time solves only
half of the problem. The core cpuhotplug logic needs to be adjusted as
well.

This thoughtful engineered mechanism also turns the boot process on all
Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
before the secondary CPUs are brought up. Depending on the number of
physical cores the window in which this situation can happen is smaller or
larger. On a HSW-EX it's about 750ms:

MCE is enabled on the boot CPU:

[    0.244017] mce: CPU supports 22 MCE banks

The corresponding sibling linux-sunxi#72 boots:

[    1.008005] .... node  #0, CPUs:    linux-sunxi#72

That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown. At least it's a
known safe state.

It's obvious that the early boot can be hit by an MCE as well and then runs
into the same situation because MCEs are not yet enabled on the boot CPU.
But after enabling them on the boot CPU, it does not make any sense to
prevent the kernel from recovering.

Adjust the nosmt kernel parameter documentation as well.

Reverts: 2207def ("x86/apic: Ignore secondary threads if nosmt=force")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amery pushed a commit that referenced this issue Jan 18, 2019
Hammering the "bank enable" (PBKEN) bit on and off between
every command crashes the Nomadik NHK15 with this message:

Scanning device for bad blocks
Unhandled fault: external abort on non-linefetch (0x008) at 0xcc95e000
pgd = (ptrval)
[cc95e000] *pgd=0b808811, *pte=40000653, *ppte=40000552
Internal error: : 8 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.20.0-rc2+ #72
Hardware name: Nomadik STn8815
PC is at fsmc_exec_op+0x194/0x204
(...)

After a discussion we (me and Boris Brezillon) start to suspect
that this bit does not immediately control the chip select line
at all, it rather enables access to the bank and the hardware
will drive the CS autonomously. If there is a NAND chip connected,
we should keep this enabled.

As fsmc_nand_setup() sets this bit, we can simply remove the
offending code.

Fixes: 550b9fc ("mtd: rawnand: fsmc: Stop implementing ->select_chip()")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 24, 2019
commit baef1c9 upstream.

Using the batch API from the interconnect driver sometimes leads to a
KASAN error due to an access to freed memory. This is easier to trigger
with threadirqs on the kernel commandline.

 BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c
 Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57

 CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G        W         4.19.10 linux-sunxi#72
 Call trace:
  dump_backtrace+0x0/0x2f8
  show_stack+0x20/0x2c
  __dump_stack+0x20/0x28
  dump_stack+0xcc/0x10c
  print_address_description+0x74/0x240
  kasan_report+0x250/0x26c
  __asan_report_load1_noabort+0x20/0x2c
  rpmh_tx_done+0x114/0x12c
  tcs_tx_done+0x450/0x768
  irq_forced_thread_fn+0x58/0x9c
  irq_thread+0x120/0x1dc
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 Allocated by task 385:
  kasan_kmalloc+0xac/0x148
  __kmalloc+0x170/0x1e4
  rpmh_write_batch+0x174/0x540
  qcom_icc_set+0x8dc/0x9ac
  icc_set+0x288/0x2e8
  a6xx_gmu_stop+0x320/0x3c0
  a6xx_pm_suspend+0x108/0x124
  adreno_suspend+0x50/0x60
  pm_generic_runtime_suspend+0x60/0x78
  __rpm_callback+0x214/0x32c
  rpm_callback+0x54/0x184
  rpm_suspend+0x3f8/0xa90
  pm_runtime_work+0xb4/0x178
  process_one_work+0x544/0xbc0
  worker_thread+0x514/0x7d0
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 Freed by task 385:
  __kasan_slab_free+0x12c/0x1e0
  kasan_slab_free+0x10/0x1c
  kfree+0x134/0x588
  rpmh_write_batch+0x49c/0x540
  qcom_icc_set+0x8dc/0x9ac
  icc_set+0x288/0x2e8
  a6xx_gmu_stop+0x320/0x3c0
  a6xx_pm_suspend+0x108/0x124
  adreno_suspend+0x50/0x60
 cr50_spi spi5.0: SPI transfer timed out
  pm_generic_runtime_suspend+0x60/0x78
  __rpm_callback+0x214/0x32c
  rpm_callback+0x54/0x184
  rpm_suspend+0x3f8/0xa90
  pm_runtime_work+0xb4/0x178
  process_one_work+0x544/0xbc0
  worker_thread+0x514/0x7d0
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 The buggy address belongs to the object at fffffff51414ac80
  which belongs to the cache kmalloc-512 of size 512
 The buggy address is located 260 bytes inside of
  512-byte region [fffffff51414ac80, fffffff51414ae80)
 The buggy address belongs to the page:
 page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0
 flags: 0x4000000000008100(slab|head)
 raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680
 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                    ^
  fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

The batch API sets the same completion for each rpmh message that's sent
and then loops through all the messages and waits for that single
completion declared on the stack to be completed before returning from
the function and freeing the message structures. Unfortunately, some
messages may still be in process and 'stuck' in the TCS. At some later
point, the tcs_tx_done() interrupt will run and try to process messages
that have already been freed at the end of rpmh_write_batch(). This will
in turn access the 'needs_free' member of the rpmh_request structure and
cause KASAN to complain. Furthermore, if there's a message that's
completed in rpmh_tx_done() and freed immediately after the complete()
call is made we'll be racing with potentially freed memory when
accessing the 'needs_free' member:

	CPU0                         CPU1
	----                         ----
	rpmh_tx_done()
	 complete(&compl)
	                             wait_for_completion(&compl)
	                             kfree(rpm_msg)
	 if (rpm_msg->needs_free)
	 <KASAN warning splat>

Let's fix this by allocating a chunk of completions for each message and
waiting for all of them to be completed before returning from the batch
API. Alternatively, we could wait for the last message in the batch, but
that may be a more complicated change because it looks like
tcs_tx_done() just iterates through the indices of the queue and
completes each message instead of tracking the last inserted message and
completing that first.

Fixes: c8790cb ("drivers: qcom: rpmh: add support for batch RPMH request")
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Evan Green <evgreen@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 27, 2019
commit baef1c9 upstream.

Using the batch API from the interconnect driver sometimes leads to a
KASAN error due to an access to freed memory. This is easier to trigger
with threadirqs on the kernel commandline.

 BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c
 Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57

 CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G        W         4.19.10 linux-sunxi#72
 Call trace:
  dump_backtrace+0x0/0x2f8
  show_stack+0x20/0x2c
  __dump_stack+0x20/0x28
  dump_stack+0xcc/0x10c
  print_address_description+0x74/0x240
  kasan_report+0x250/0x26c
  __asan_report_load1_noabort+0x20/0x2c
  rpmh_tx_done+0x114/0x12c
  tcs_tx_done+0x450/0x768
  irq_forced_thread_fn+0x58/0x9c
  irq_thread+0x120/0x1dc
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 Allocated by task 385:
  kasan_kmalloc+0xac/0x148
  __kmalloc+0x170/0x1e4
  rpmh_write_batch+0x174/0x540
  qcom_icc_set+0x8dc/0x9ac
  icc_set+0x288/0x2e8
  a6xx_gmu_stop+0x320/0x3c0
  a6xx_pm_suspend+0x108/0x124
  adreno_suspend+0x50/0x60
  pm_generic_runtime_suspend+0x60/0x78
  __rpm_callback+0x214/0x32c
  rpm_callback+0x54/0x184
  rpm_suspend+0x3f8/0xa90
  pm_runtime_work+0xb4/0x178
  process_one_work+0x544/0xbc0
  worker_thread+0x514/0x7d0
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 Freed by task 385:
  __kasan_slab_free+0x12c/0x1e0
  kasan_slab_free+0x10/0x1c
  kfree+0x134/0x588
  rpmh_write_batch+0x49c/0x540
  qcom_icc_set+0x8dc/0x9ac
  icc_set+0x288/0x2e8
  a6xx_gmu_stop+0x320/0x3c0
  a6xx_pm_suspend+0x108/0x124
  adreno_suspend+0x50/0x60
 cr50_spi spi5.0: SPI transfer timed out
  pm_generic_runtime_suspend+0x60/0x78
  __rpm_callback+0x214/0x32c
  rpm_callback+0x54/0x184
  rpm_suspend+0x3f8/0xa90
  pm_runtime_work+0xb4/0x178
  process_one_work+0x544/0xbc0
  worker_thread+0x514/0x7d0
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18

 The buggy address belongs to the object at fffffff51414ac80
  which belongs to the cache kmalloc-512 of size 512
 The buggy address is located 260 bytes inside of
  512-byte region [fffffff51414ac80, fffffff51414ae80)
 The buggy address belongs to the page:
 page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0
 flags: 0x4000000000008100(slab|head)
 raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680
 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 >fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                    ^
  fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

The batch API sets the same completion for each rpmh message that's sent
and then loops through all the messages and waits for that single
completion declared on the stack to be completed before returning from
the function and freeing the message structures. Unfortunately, some
messages may still be in process and 'stuck' in the TCS. At some later
point, the tcs_tx_done() interrupt will run and try to process messages
that have already been freed at the end of rpmh_write_batch(). This will
in turn access the 'needs_free' member of the rpmh_request structure and
cause KASAN to complain. Furthermore, if there's a message that's
completed in rpmh_tx_done() and freed immediately after the complete()
call is made we'll be racing with potentially freed memory when
accessing the 'needs_free' member:

	CPU0                         CPU1
	----                         ----
	rpmh_tx_done()
	 complete(&compl)
	                             wait_for_completion(&compl)
	                             kfree(rpm_msg)
	 if (rpm_msg->needs_free)
	 <KASAN warning splat>

Let's fix this by allocating a chunk of completions for each message and
waiting for all of them to be completed before returning from the batch
API. Alternatively, we could wait for the last message in the batch, but
that may be a more complicated change because it looks like
tcs_tx_done() just iterates through the indices of the queue and
completes each message instead of tracking the last inserted message and
completing that first.

Fixes: c8790cb ("drivers: qcom: rpmh: add support for batch RPMH request")
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Evan Green <evgreen@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Jul 26, 2019
[ Upstream commit 23377c2 ]

When the device is disconnected while passing traffic it is possible
to receive out of order urbs causing a memory leak since the skb linked
to the current tx urb is not removed. Fix the issue deallocating the skb
cleaning up the tx ring. Moreover this patch fixes the following kernel
warning

[   57.480771] usb 1-1: USB disconnect, device number 2
[   57.483451] ------------[ cut here ]------------
[   57.483462] TX urb mismatch
[   57.483481] WARNING: CPU: 1 PID: 32 at drivers/net/wireless/mediatek/mt7601u/dma.c:245 mt7601u_complete_tx+0x165/00
[   57.483483] Modules linked in:
[   57.483496] CPU: 1 PID: 32 Comm: kworker/1:1 Not tainted 5.2.0-rc1+ linux-sunxi#72
[   57.483498] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014
[   57.483502] Workqueue: usb_hub_wq hub_event
[   57.483507] RIP: 0010:mt7601u_complete_tx+0x165/0x1e0
[   57.483510] Code: 8b b5 10 04 00 00 8b 8d 14 04 00 00 eb 8b 80 3d b1 cb e1 00 00 75 9e 48 c7 c7 a4 ea 05 82 c6 05 f
[   57.483513] RSP: 0000:ffffc900000a0d28 EFLAGS: 00010092
[   57.483516] RAX: 000000000000000f RBX: ffff88802c0a62c0 RCX: ffffc900000a0c2c
[   57.483518] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff810a8371
[   57.483520] RBP: ffff88803ced6858 R08: 0000000000000000 R09: 0000000000000001
[   57.483540] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000046
[   57.483542] R13: ffff88802c0a6c88 R14: ffff88803baab540 R15: ffff88803a0cc078
[   57.483548] FS:  0000000000000000(0000) GS:ffff88803eb00000(0000) knlGS:0000000000000000
[   57.483550] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.483552] CR2: 000055e7f6780100 CR3: 0000000028c86000 CR4: 00000000000006a0
[   57.483554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   57.483556] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   57.483559] Call Trace:
[   57.483561]  <IRQ>
[   57.483565]  __usb_hcd_giveback_urb+0x77/0xe0
[   57.483570]  xhci_giveback_urb_in_irq.isra.0+0x8b/0x140
[   57.483574]  handle_cmd_completion+0xf5b/0x12c0
[   57.483577]  xhci_irq+0x1f6/0x1810
[   57.483581]  ? lockdep_hardirqs_on+0x9e/0x180
[   57.483584]  ? _raw_spin_unlock_irq+0x24/0x30
[   57.483588]  __handle_irq_event_percpu+0x3a/0x260
[   57.483592]  handle_irq_event_percpu+0x1c/0x60
[   57.483595]  handle_irq_event+0x2f/0x4c
[   57.483599]  handle_edge_irq+0x7e/0x1a0
[   57.483603]  handle_irq+0x17/0x20
[   57.483607]  do_IRQ+0x54/0x110
[   57.483610]  common_interrupt+0xf/0xf
[   57.483612]  </IRQ>

Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
jwrdegoede pushed a commit to jwrdegoede/linux-sunxi that referenced this issue Nov 6, 2021
The test case btrfs/238 reports the warning below:

 WARNING: CPU: 3 PID: 481 at fs/btrfs/super.c:2509 btrfs_show_devname+0x104/0x1e8 [btrfs]
 CPU: 2 PID: 1 Comm: systemd Tainted: G        W  O 5.14.0-rc1-custom linux-sunxi#72
 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
 Call trace:
   btrfs_show_devname+0x108/0x1b4 [btrfs]
   show_mountinfo+0x234/0x2c4
   m_show+0x28/0x34
   seq_read_iter+0x12c/0x3c4
   vfs_read+0x29c/0x2c8
   ksys_read+0x80/0xec
   __arm64_sys_read+0x28/0x34
   invoke_syscall+0x50/0xf8
   do_el0_svc+0x88/0x138
   el0_svc+0x2c/0x8c
   el0t_64_sync_handler+0x84/0xe4
   el0t_64_sync+0x198/0x19c

Reason:
While btrfs_prepare_sprout() moves the fs_devices::devices into
fs_devices::seed_list, the btrfs_show_devname() searches for the devices
and found none, leading to the warning as in above.

Fix:
latest_dev is updated according to the changes to the device list.
That means we could use the latest_dev->name to show the device name in
/proc/self/mounts, the pointer will be always valid as it's assigned
before the device is deleted from the list in remove or replace.
The RCU protection is sufficient as the device structure is freed after
synchronization.

Reported-by: Su Yue <l@damenly.su>
Tested-by: Su Yue <l@damenly.su>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this issue Mar 30, 2023
[ Upstream commit 3004081 ]

In case runtime PM is enabled, do runtime PM clean up to remove
cpu latency qos request, otherwise driver removal may have below
kernel dump:

[   19.463299] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000048
[   19.472161] Mem abort info:
[   19.474985]   ESR = 0x0000000096000004
[   19.478754]   EC = 0x25: DABT (current EL), IL = 32 bits
[   19.484081]   SET = 0, FnV = 0
[   19.487149]   EA = 0, S1PTW = 0
[   19.490361]   FSC = 0x04: level 0 translation fault
[   19.495256] Data abort info:
[   19.498149]   ISV = 0, ISS = 0x00000004
[   19.501997]   CM = 0, WnR = 0
[   19.504977] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000049f81000
[   19.511432] [0000000000000048] pgd=0000000000000000,
p4d=0000000000000000
[   19.518245] Internal error: Oops: 0000000096000004 [jwrdegoede#1] PREEMPT SMP
[   19.524520] Modules linked in: gpio_ir_recv(+) rc_core [last
unloaded: rc_core]
[   19.531845] CPU: 0 PID: 445 Comm: insmod Not tainted
6.2.0-rc1-00028-g2c397a46d47c linux-sunxi#72
[   19.531854] Hardware name: FSL i.MX8MM EVK board (DT)
[   19.531859] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[   19.551777] pc : cpu_latency_qos_remove_request+0x20/0x110
[   19.557277] lr : gpio_ir_recv_runtime_suspend+0x18/0x30
[gpio_ir_recv]
[   19.557294] sp : ffff800008ce3740
[   19.557297] x29: ffff800008ce3740 x28: 0000000000000000 x27:
ffff800008ce3d50
[   19.574270] x26: ffffc7e3e9cea100 x25: 00000000000f4240 x24:
ffffc7e3f9ef0e30
[   19.574284] x23: 0000000000000000 x22: ffff0061803820f4 x21:
0000000000000008
[   19.574296] x20: ffffc7e3fa75df30 x19: 0000000000000020 x18:
ffffffffffffffff
[   19.588570] x17: 0000000000000000 x16: ffffc7e3f9efab70 x15:
ffffffffffffffff
[   19.595712] x14: ffff800008ce37b8 x13: ffff800008ce37aa x12:
0000000000000001
[   19.602853] x11: 0000000000000001 x10: ffffcbe3ec0dff87 x9 :
0000000000000008
[   19.609991] x8 : 0101010101010101 x7 : 0000000000000000 x6 :
000000000f0bfe9f
[   19.624261] x5 : 00ffffffffffffff x4 : 0025ab8e00000000 x3 :
ffff006180382010
[   19.631405] x2 : ffffc7e3e9ce8030 x1 : ffffc7e3fc3eb810 x0 :
0000000000000020
[   19.638548] Call trace:
[   19.640995]  cpu_latency_qos_remove_request+0x20/0x110
[   19.646142]  gpio_ir_recv_runtime_suspend+0x18/0x30 [gpio_ir_recv]
[   19.652339]  pm_generic_runtime_suspend+0x2c/0x44
[   19.657055]  __rpm_callback+0x48/0x1dc
[   19.660807]  rpm_callback+0x6c/0x80
[   19.664301]  rpm_suspend+0x10c/0x640
[   19.667880]  rpm_idle+0x250/0x2d0
[   19.671198]  update_autosuspend+0x38/0xe0
[   19.675213]  pm_runtime_set_autosuspend_delay+0x40/0x60
[   19.680442]  gpio_ir_recv_probe+0x1b4/0x21c [gpio_ir_recv]
[   19.685941]  platform_probe+0x68/0xc0
[   19.689610]  really_probe+0xc0/0x3dc
[   19.693189]  __driver_probe_device+0x7c/0x190
[   19.697550]  driver_probe_device+0x3c/0x110
[   19.701739]  __driver_attach+0xf4/0x200
[   19.705578]  bus_for_each_dev+0x70/0xd0
[   19.709417]  driver_attach+0x24/0x30
[   19.712998]  bus_add_driver+0x17c/0x240
[   19.716834]  driver_register+0x78/0x130
[   19.720676]  __platform_driver_register+0x28/0x34
[   19.725386]  gpio_ir_recv_driver_init+0x20/0x1000 [gpio_ir_recv]
[   19.731404]  do_one_initcall+0x44/0x2ac
[   19.735243]  do_init_module+0x48/0x1d0
[   19.739003]  load_module+0x19fc/0x2034
[   19.742759]  __do_sys_finit_module+0xac/0x12c
[   19.747124]  __arm64_sys_finit_module+0x20/0x30
[   19.751664]  invoke_syscall+0x48/0x114
[   19.755420]  el0_svc_common.constprop.0+0xcc/0xec
[   19.760132]  do_el0_svc+0x38/0xb0
[   19.763456]  el0_svc+0x2c/0x84
[   19.766516]  el0t_64_sync_handler+0xf4/0x120
[   19.770789]  el0t_64_sync+0x190/0x194
[   19.774460] Code: 910003fd a90153f3 aa0003f3 91204021 (f9401400)
[   19.780556] ---[ end trace 0000000000000000 ]---

Signed-off-by: Li Jun <jun.li@nxp.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants