Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contact #1

Closed
grimaz opened this issue May 30, 2016 · 0 comments
Closed

Contact #1

grimaz opened this issue May 30, 2016 · 0 comments

Comments

@grimaz
Copy link

grimaz commented May 30, 2016

Hello, I did not find another way to contact you regarding building kernel for S905 box. So, let me try here. I am working with 2016-05-04 amlogic release to build ubuntu-kodi pack. But have some problems in result:

  • network and wifi modules do not working yet
  • kodi 17.0a crashes on start with SIGSEGV (not sure why) with log record "ERROR: InitWindowSystem: Could not initialize"

Если у вас есть полчаса времени - напишите письмо, пожалуйта, на grimaz---yandex.ua
Расскажу подробней со всякими логами.

isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 85eb9af ]

in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command

 # tc action add action pedit ex munge ip ttl set 64

dereference a NULL pointer:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
 PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
 Oops: 0002 [150balbes#1] SMP PTI
 CPU: 0 PID: 3336 Comm: tc Tainted: G            E     4.18.0.pedit+ #425
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
 Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
 RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
 RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
 RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
 R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
 R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
 FS:  00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
 Call Trace:
  ? __nla_reserve+0x38/0x50
  tcf_action_dump_1+0xd2/0x130
  tcf_action_dump+0x6a/0xf0
  tca_get_fill.constprop.31+0xa3/0x120
  tcf_action_add+0xd1/0x170
  tc_ctl_action+0x137/0x150
  rtnetlink_rcv_msg+0x263/0x2d0
  ? _cond_resched+0x15/0x40
  ? rtnl_calcit.isra.30+0x110/0x110
  netlink_rcv_skb+0x4d/0x130
  netlink_unicast+0x1a3/0x250
  netlink_sendmsg+0x2ae/0x3a0
  sock_sendmsg+0x36/0x40
  ___sys_sendmsg+0x26f/0x2d0
  ? do_wp_page+0x8e/0x5f0
  ? handle_pte_fault+0x6c3/0xf50
  ? __handle_mm_fault+0x38e/0x520
  ? __sys_sendmsg+0x5e/0xa0
  __sys_sendmsg+0x5e/0xa0
  do_syscall_64+0x5b/0x180
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f75a4583ba0
 Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
 RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
 RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
 RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
 R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
 Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
 CR2: 0000000000000000

Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.

Fixes: 71d0ed7 ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit e04e7a7 ]

This patch fixes the race between netvsc_probe() and
rndis_set_subchannel(), which can cause a deadlock.

These are the related 3 paths which show the deadlock:

path 150balbes#1:
    Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
    Call Trace:
     schedule
     schedule_preempt_disabled
     __mutex_lock
     __device_attach
     bus_probe_device
     device_add
     vmbus_device_register
     vmbus_onoffer
     vmbus_onmessage_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path 150balbes#2:
    schedule
     schedule_preempt_disabled
     __mutex_lock
     netvsc_probe
     vmbus_probe
     really_probe
     __driver_attach
     bus_for_each_dev
     driver_attach_async
     async_run_entry_fn
     process_one_work
     worker_thread
     kthread
     ret_from_fork

path 150balbes#3:
    Workqueue: events netvsc_subchan_work [hv_netvsc]
    Call Trace:
     schedule
     rndis_set_subchannel
     netvsc_subchan_work
     process_one_work
     worker_thread
     kthread
     ret_from_fork

Before path 150balbes#1 finishes, path 150balbes#2 can start to run, because just before
the "bus_probe_device(dev);" in device_add() in path 150balbes#1, there is a line
"object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
immediately try to load hv_netvsc and hence path 150balbes#2 can start to run.

Next, path 150balbes#2 offloads the subchannal's initialization to a workqueue,
i.e. path 150balbes#3, so we can end up in a deadlock situation like this:

Path 150balbes#2 gets the device lock, and is trying to get the rtnl lock;
Path 150balbes#3 gets the rtnl lock and is waiting for all the subchannel messages
to be processed;
Path 150balbes#1 is trying to get the device lock, but since 150balbes#2 is not releasing
the device lock, path 150balbes#1 has to sleep; since the VMBus messages are
processed one by one, this means the sub-channel messages can't be
procedded, so 150balbes#3 has to sleep with the rtnl lock held, and finally 150balbes#2
has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.

With the patch, we can make sure 150balbes#2 gets both the device lock and the
rtnl lock together, gets its job done, and releases the locks, so 150balbes#1
and 150balbes#3 will not be blocked for ever.

Fixes: 8195b13 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 9c86336 ]

If load ip6_vti module and create a network namespace when set
fb_tunnels_only_for_init_net to 1, then exit the namespace will
cause following crash:

[ 6601.677036] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 6601.679057] PGD 8000000425eca067 P4D 8000000425eca067 PUD 424292067 PMD 0
[ 6601.680483] Oops: 0000 [150balbes#1] SMP PTI
[ 6601.681223] CPU: 7 PID: 93 Comm: kworker/u16:1 Kdump: loaded Tainted: G            E     4.18.0+ 150balbes#3
[ 6601.683153] Hardware name: Fedora Project OpenStack Nova, BIOS seabios-1.7.5-11.el7 04/01/2014
[ 6601.684919] Workqueue: netns cleanup_net
[ 6601.685742] RIP: 0010:vti6_exit_batch_net+0x87/0xd0 [ip6_vti]
[ 6601.686932] Code: 7b 08 48 89 e6 e8 b9 ea d3 dd 48 8b 1b 48 85 db 75 ec 48 83 c5 08 48 81 fd 00 01 00 00 75 d5 49 8b 84 24 08 01 00 00 48 89 e6 <48> 8b 78 08 e8 90 ea d3 dd 49 8b 45 28 49 39 c6 4c 8d 68 d8 75 a1
[ 6601.690735] RSP: 0018:ffffa897c2737de0 EFLAGS: 00010246
[ 6601.691846] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000200
[ 6601.693324] RDX: 0000000000000015 RSI: ffffa897c2737de0 RDI: ffffffff9f2ea9e0
[ 6601.694824] RBP: 0000000000000100 R08: 0000000000000000 R09: 0000000000000000
[ 6601.696314] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8dc323c07e00
[ 6601.697812] R13: ffff8dc324a63100 R14: ffffa897c2737e30 R15: ffffa897c2737e30
[ 6601.699345] FS:  0000000000000000(0000) GS:ffff8dc33fdc0000(0000) knlGS:0000000000000000
[ 6601.701068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6601.702282] CR2: 0000000000000008 CR3: 0000000424966002 CR4: 00000000001606e0
[ 6601.703791] Call Trace:
[ 6601.704329]  cleanup_net+0x1b4/0x2c0
[ 6601.705268]  process_one_work+0x16c/0x370
[ 6601.706145]  worker_thread+0x49/0x3e0
[ 6601.706942]  kthread+0xf8/0x130
[ 6601.707626]  ? rescuer_thread+0x340/0x340
[ 6601.708476]  ? kthread_bind+0x10/0x10
[ 6601.709266]  ret_from_fork+0x35/0x40

Reproduce:
modprobe ip6_vti
echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net
unshare -n
exit

This because ip6n->tnls_wc[0] point to fallback device in default, but
in non-default namespace, ip6n->tnls_wc[0] will be NULL, so add the NULL
check comparatively.

Fixes: e2948e5 ("ip6_vti: fix creating fallback tunnel device for vti6")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit c513de4 ]

If the system BIOS does not supply NUMA node information to the
PCI devices, the NUMA node is selected by choosing the current
node.

This can lead to the following crash:

divide error: 0000 SMP
CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: G          IOE
------------   3.10.0-693.21.1.el7.x86_64 150balbes#1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS
SE5C610.86B.01.01.0005.101720141054 10/17/2014
Workqueue: events work_for_cpu_fn
task: ffff880174480fd0 ti: ffff880174488000 task.ti: ffff880174488000
RIP: 0010: [<ffffffffc020ac69>] hfi1_dev_affinity_init+0x129/0x6a0 [hfi1]
RSP: 0018:ffff88017448bbf8  EFLAGS: 00010246
RAX: 0000000000000011 RBX: ffff88107ffba6c0 RCX: ffff88085c22e130
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880824ad0000
RBP: ffff88017448bc48 R08: 0000000000000011 R09: 0000000000000002
R10: ffff8808582b6ca0 R11: 0000000000003151 R12: ffff8808582b6ca0
R13: ffff8808582b6518 R14: ffff8808582b6010 R15: 0000000000000012
FS:  0000000000000000(0000) GS:ffff88085ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efc707404f0 CR3: 0000000001a02000 CR4: 00000000001607f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 hfi1_init_dd+0x14b3/0x27a0 [hfi1]
 ? pcie_capability_write_word+0x46/0x70
 ? hfi1_pcie_init+0xc0/0x200 [hfi1]
 do_init_one+0x153/0x4c0 [hfi1]
 ? sched_clock_cpu+0x85/0xc0
 init_one+0x1b5/0x260 [hfi1]
 local_pci_probe+0x4a/0xb0
 work_for_cpu_fn+0x1a/0x30
 process_one_work+0x17f/0x440
 worker_thread+0x278/0x3c0
 ? manage_workers.isra.24+0x2a0/0x2a0
 kthread+0xd1/0xe0
 ? insert_kthread_work+0x40/0x40
 ret_from_fork+0x77/0xb0
 ? insert_kthread_work+0x40/0x40

If the BIOS is not supplying NUMA information:
  - set the default table count to 1 for all possible nodes
  - select node 0 (instead of current NUMA) node to get consistent
    performance
  - generate an error indicating that the BIOS should be upgraded

Reviewed-by: Gary Leshner <gary.s.leshner@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 21b172e ]

Fix the warning below by calling rhashtable_lookup_fast.
Also, make some code movements for better quality and human
readability.

[  342.450870] WARNING: suspicious RCU usage
[  342.455856] 4.18.0-rc2+ 150balbes#17 Tainted: G           O
[  342.462210] -----------------------------
[  342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
[  342.476568]
[  342.476568] other info that might help us debug this:
[  342.476568]
[  342.486978]
[  342.486978] rcu_scheduler_active = 2, debug_locks = 1
[  342.495211] 4 locks held by modprobe/3934:
[  342.500265]  #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at:
mlx5_unregister_interface+0x18/0x90 [mlx5_core]
[  342.511953]  150balbes#1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
[  342.521109]  150balbes#2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60
[mlx5_core]
[  342.531642]  150balbes#3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
[  342.541206]
[  342.541206] stack backtrace:
[  342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G           O      4.18.0-rc2+ 150balbes#17
[  342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
[  342.565606] Call Trace:
[  342.568861]  dump_stack+0x78/0xb3
[  342.573086]  xdp_rxq_info_unreg+0x3f5/0x6b0
[  342.578285]  ? __call_rcu+0x220/0x300
[  342.582911]  mlx5e_free_rq+0x38/0xc0 [mlx5_core]
[  342.588602]  mlx5e_close_channel+0x20/0x120 [mlx5_core]
[  342.594976]  mlx5e_close_channels+0x26/0x40 [mlx5_core]
[  342.601345]  mlx5e_close_locked+0x44/0x50 [mlx5_core]
[  342.607519]  mlx5e_close+0x42/0x60 [mlx5_core]
[  342.613005]  __dev_close_many+0xb1/0x120
[  342.617911]  dev_close_many+0xa2/0x170
[  342.622622]  rollback_registered_many+0x148/0x460
[  342.628401]  ? __lock_acquire+0x48d/0x11b0
[  342.633498]  ? unregister_netdev+0xe/0x20
[  342.638495]  rollback_registered+0x56/0x90
[  342.643588]  unregister_netdevice_queue+0x7e/0x100
[  342.649461]  unregister_netdev+0x18/0x20
[  342.654362]  mlx5e_remove+0x2a/0x50 [mlx5_core]
[  342.659944]  mlx5_remove_device+0xe5/0x110 [mlx5_core]
[  342.666208]  mlx5_unregister_interface+0x39/0x90 [mlx5_core]
[  342.673038]  cleanup+0x5/0xbfc [mlx5_core]
[  342.678094]  __x64_sys_delete_module+0x16b/0x240
[  342.683725]  ? do_syscall_64+0x1c/0x210
[  342.688476]  do_syscall_64+0x5a/0x210
[  342.693025]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 8d5d885 ("xdp: rhashtable with allocator ID to pointer mapping")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 2f606da ]

Instantiating the sm501 OHCI subdevice results in a kernel warning.

sm501-usb sm501-usb: SM501 OHCI
sm501-usb sm501-usb: new USB bus registered, assigned bus number 1
WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516
ohci_init+0x194/0x2d8
Modules linked in:

CPU: 0 PID: 1 Comm: swapper Tainted: G        W
4.18.0-rc7-00178-g0b5b1f9a78b5 150balbes#1
PC is at ohci_init+0x194/0x2d8
PR is at ohci_init+0x168/0x2d8
PC  : 8c27844c SP  : 8f81dd94 SR  : 40008001
TEA : 29613060
R0  : 00000000 R1  : 00000000 R2  : 00000000 R3  : 00000202
R4  : 8fa98b88 R5  : 8c277e68 R6  : 00000000 R7  : 00000000
R8  : 8f965814 R9  : 8c388100 R10 : 8fa98800 R11 : 8fa98928
R12 : 8c48302c R13 : 8fa98920 R14 : 8c48302c
MACH: 00000096 MACL: 0000017c GBR : 00000000 PR  : 8c278420

Call trace:
 [<(ptrval)>] usb_add_hcd+0x1e8/0x6ec
 [<(ptrval)>] _dev_info+0x0/0x54
 [<(ptrval)>] arch_local_save_flags+0x0/0x8
 [<(ptrval)>] arch_local_irq_restore+0x0/0x24
 [<(ptrval)>] ohci_hcd_sm501_drv_probe+0x114/0x2d8
...

Initialize coherent_dma_mask when creating SM501 subdevices to fix
the problem.

Fixes: b6d6454 ("mfd: SM501 core driver")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 78ee994 ]

Because rfi_flush_fallback runs immediately before the return to
userspace it currently runs with the user r1 (stack pointer). This
means if we oops in there we will report a bad kernel stack pointer in
the exception entry path, eg:

  Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4
  Oops: Bad kernel stack pointer, sig: 6 [150balbes#1]
  LE SMP NR_CPUS=32 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 150balbes#7
  NIP:  c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040
  REGS: c0000000fffe7d40 TRAP: 4100   Not tainted  (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3)
  MSR:  9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE>  CR: 44000442  XER: 20000000
  CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80
  GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020
  ...
  NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80
  LR [0000000010053e00] 0x10053e00

Although the NIP tells us where we were, and the TRAP number tells us
what happened, it would still be nicer if we could report the actual
exception rather than barfing about the stack pointer.

We an do that fairly simply by loading the kernel stack pointer on
entry and restoring the user value before returning. That way we see a
regular oops such as:

  Unrecoverable exception 4100 at c00000000000239c
  Oops: Unrecoverable exception, sig: 6 [150balbes#1]
  LE SMP NR_CPUS=32 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40
  NIP:  c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040
  REGS: c0000000f1e17bb0 TRAP: 4100   Not tainted  (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty)
  MSR:  9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE>  CR: 44000442  XER: 20000000
  CFAR: c00000000000bac8 IRQMASK: 0
  ...
  NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80
  LR [0000000010053e00] 0x10053e00
  Call Trace:
  [c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable)

Note this shouldn't make the kernel stack pointer vulnerable to a
meltdown attack, because it should be flushed from the cache before we
return to userspace. The user r1 value will be in the cache, because
we load it in the return path, but that is harmless.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 4379444 ]

[BUG]
Under certain KVM load and LTP tests, it is possible to hit the
following calltrace if quota is enabled:

BTRFS critical (device vda2): unable to find logical 8820195328 length 4096
BTRFS critical (device vda2): unable to find logical 8820195328 length 4096

WARNING: CPU: 0 PID: 49 at ../block/blk-core.c:172 blk_status_to_errno+0x1a/0x30
CPU: 0 PID: 49 Comm: kworker/u2:1 Not tainted 4.12.14-15-default 150balbes#1 SLE15 (unreleased)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
task: ffff9f827b340bc0 task.stack: ffffb4f8c0304000
RIP: 0010:blk_status_to_errno+0x1a/0x30
Call Trace:
 submit_extent_page+0x191/0x270 [btrfs]
 ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
 __do_readpage+0x2d2/0x810 [btrfs]
 ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 __extent_read_full_page+0xe7/0x100 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 read_extent_buffer_pages+0x1ab/0x2d0 [btrfs]
 ? run_one_async_done+0xc0/0xc0 [btrfs]
 btree_read_extent_buffer_pages+0x94/0xf0 [btrfs]
 read_tree_block+0x31/0x60 [btrfs]
 read_block_for_search.isra.35+0xf0/0x2e0 [btrfs]
 btrfs_search_slot+0x46b/0xa00 [btrfs]
 ? kmem_cache_alloc+0x1a8/0x510
 ? btrfs_get_token_32+0x5b/0x120 [btrfs]
 find_parent_nodes+0x11d/0xeb0 [btrfs]
 ? leaf_space_used+0xb8/0xd0 [btrfs]
 ? btrfs_leaf_free_space+0x49/0x90 [btrfs]
 ? btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
 btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
 btrfs_find_all_roots+0x45/0x60 [btrfs]
 btrfs_qgroup_trace_extent_post+0x20/0x40 [btrfs]
 btrfs_add_delayed_data_ref+0x1a3/0x1d0 [btrfs]
 btrfs_alloc_reserved_file_extent+0x38/0x40 [btrfs]
 insert_reserved_file_extent.constprop.71+0x289/0x2e0 [btrfs]
 btrfs_finish_ordered_io+0x2f4/0x7f0 [btrfs]
 ? pick_next_task_fair+0x2cd/0x530
 ? __switch_to+0x92/0x4b0
 btrfs_worker_helper+0x81/0x300 [btrfs]
 process_one_work+0x1da/0x3f0
 worker_thread+0x2b/0x3f0
 ? process_one_work+0x3f0/0x3f0
 kthread+0x11a/0x130
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x35/0x40

BTRFS critical (device vda2): unable to find logical 8820195328 length 16384
BTRFS: error (device vda2) in btrfs_finish_ordered_io:3023: errno=-5 IO failure
BTRFS info (device vda2): forced readonly
BTRFS error (device vda2): pending csums is 2887680

[CAUSE]
It's caused by race with block group auto removal:

- There is a meta block group X, which has only one tree block
  The tree block belongs to fs tree 257.
- In current transaction, some operation modified fs tree 257
  The tree block gets COWed, so the block group X is empty, and marked
  as unused, queued to be deleted.
- Some workload (like fsync) wakes up cleaner_kthread()
  Which will call btrfs_delete_unused_bgs() to remove unused block
  groups.
  So block group X along its chunk map get removed.
- Some delalloc work finished for fs tree 257
  Quota needs to get the original reference of the extent, which will
  read tree blocks of commit root of 257.
  Then since the chunk map gets removed, the above warning gets
  triggered.

[FIX]
Just let btrfs_delete_unused_bgs() skip block group which still has
pinned bytes.

However there is a minor side effect: currently we only queue empty
blocks at update_block_group(), and such empty block group with pinned
bytes won't go through update_block_group() again, such block group
won't be removed, until it gets new extent allocated and removed.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit 2093275 upstream.

I changed the way mac80211 updates the PM state of the peer.
I forgot that we could also have multicast frames from the
peer and that those frame should of course not change the
PM state of the peer: A peer goes to power save when it
needs to scan, but it won't send the broadcast Probe Request
with the PM bit set.

This made us mark the peer as awake when it wasn't and then
Intel's firmware would fail to transmit because the peer is
asleep according to its database. The driver warned about
this and it looked like this:

 WARNING: CPU: 0 PID: 184 at /usr/src/linux-4.16.14/drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1369 iwl_mvm_rx_tx_cmd+0x53b/0x860
 CPU: 0 PID: 184 Comm: irq/124-iwlwifi Not tainted 4.16.14 150balbes#1
 RIP: 0010:iwl_mvm_rx_tx_cmd+0x53b/0x860
 Call Trace:
  iwl_pcie_rx_handle+0x220/0x880
  iwl_pcie_irq_handler+0x6c9/0xa20
  ? irq_forced_thread_fn+0x60/0x60
  ? irq_thread_dtor+0x90/0x90

The relevant code that spits the WARNING is:

        case TX_STATUS_FAIL_DEST_PS:
                /* the FW should have stopped the queue and not
                 * return this status
                 */
                WARN_ON(1);
                info->flags |= IEEE80211_TX_STAT_TX_FILTERED;

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199967.

Fixes: 9fef654 ("mac80211: always update the PM state of a peer on MGMT / DATA frames")
Cc: <stable@vger.kernel.org>   150balbes#4.16+
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit da1b956 upstream.

There is RaceFuzzer report like below because we have no lock to close
below the race between binder_mmap and binder_alloc_new_buf_locked.
To close the race, let's use memory barrier so that if someone see
alloc->vma is not NULL, alloc->vma_vm_mm should be never NULL.

(I didn't add stable mark intentionallybecause standard android
userspace libraries that interact with binder (libbinder & libhwbinder)
prevent the mmap/ioctl race. - from Todd)

"
Thread interleaving:
CPU0 (binder_alloc_mmap_handler)              CPU1 (binder_alloc_new_buf_locked)
=====                                         =====
// drivers/android/binder_alloc.c
// #L718 (v4.18-rc3)
alloc->vma = vma;
                                              // drivers/android/binder_alloc.c
                                              // #L346 (v4.18-rc3)
                                              if (alloc->vma == NULL) {
                                                  ...
                                                  // alloc->vma is not NULL at this point
                                                  return ERR_PTR(-ESRCH);
                                              }
                                              ...
                                              // #L438
                                              binder_update_page_range(alloc, 0,
                                                      (void *)PAGE_ALIGN((uintptr_t)buffer->data),
                                                      end_page_addr);

                                              // In binder_update_page_range() #L218
                                              // But still alloc->vma_vm_mm is NULL here
                                              if (need_mm && mmget_not_zero(alloc->vma_vm_mm))
alloc->vma_vm_mm = vma->vm_mm;

Crash Log:
==================================================================
BUG: KASAN: null-ptr-deref in __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
BUG: KASAN: null-ptr-deref in atomic_add_unless include/linux/atomic.h:533 [inline]
BUG: KASAN: null-ptr-deref in mmget_not_zero include/linux/sched/mm.h:75 [inline]
BUG: KASAN: null-ptr-deref in binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
Write of size 4 at addr 0000000000000058 by task syz-executor0/11184

CPU: 1 PID: 11184 Comm: syz-executor0 Not tainted 4.18.0-rc3 150balbes#1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x16e/0x22c lib/dump_stack.c:113
 kasan_report_error mm/kasan/report.c:352 [inline]
 kasan_report+0x163/0x380 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 check_memory_region+0x140/0x1a0 mm/kasan/kasan.c:267
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
 __atomic_add_unless include/asm-generic/atomic-instrumented.h:89 [inline]
 atomic_add_unless include/linux/atomic.h:533 [inline]
 mmget_not_zero include/linux/sched/mm.h:75 [inline]
 binder_update_page_range+0xece/0x18e0 drivers/android/binder_alloc.c:218
 binder_alloc_new_buf_locked drivers/android/binder_alloc.c:443 [inline]
 binder_alloc_new_buf+0x467/0xc30 drivers/android/binder_alloc.c:513
 binder_transaction+0x125b/0x4fb0 drivers/android/binder.c:2957
 binder_thread_write+0xc08/0x2770 drivers/android/binder.c:3528
 binder_ioctl_write_read.isra.39+0x24f/0x8e0 drivers/android/binder.c:4456
 binder_ioctl+0xa86/0xf34 drivers/android/binder.c:4596
 vfs_ioctl fs/ioctl.c:46 [inline]
 do_vfs_ioctl+0x154/0xd40 fs/ioctl.c:686
 ksys_ioctl+0x94/0xb0 fs/ioctl.c:701
 __do_sys_ioctl fs/ioctl.c:708 [inline]
 __se_sys_ioctl fs/ioctl.c:706 [inline]
 __x64_sys_ioctl+0x43/0x50 fs/ioctl.c:706
 do_syscall_64+0x167/0x4b0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
"

Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 6e0495c ]

An arbitrary error in ppc4xx_msi_probe() quite likely results in a
crash similar to the following, seen after dma_alloc_coherent()
returned an error.

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc001bff0
  Oops: Kernel access of bad area, sig: 11 [150balbes#1]
  BE Canyonlands
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper Tainted: G        W
  4.18.0-rc6-00010-gff33d1030a6c 150balbes#1
  NIP:  c001bff0 LR: c001c418 CTR: c01faa7c
  REGS: cf82db40 TRAP: 0300   Tainted: G        W
  (4.18.0-rc6-00010-gff33d1030a6c)
  MSR:  00029000 <CE,EE,ME>  CR: 28002024  XER: 00000000
  DEAR: 00000000 ESR: 00000000
  GPR00: c001c418 cf82dbf0 cf828000 cf8de400 00000000 00000000 000000c4 000000c4
  GPR08: c0481ea4 00000000 00000000 000000c4 22002024 00000000 c00025e8 00000000
  GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c0492380 0000004a
  GPR24: 00029000 0000000c 00000000 cf8de410 c0494d60 c0494d60 cf8bebc0 00000001
  NIP [c001bff0] ppc4xx_of_msi_remove+0x48/0xa0
  LR [c001c418] ppc4xx_msi_probe+0x294/0x3b8
  Call Trace:
  [cf82dbf0] [00029000] 0x29000 (unreliable)
  [cf82dc10] [c001c418] ppc4xx_msi_probe+0x294/0x3b8
  [cf82dc70] [c0209fbc] platform_drv_probe+0x40/0x9c
  [cf82dc90] [c0208240] driver_probe_device+0x2a8/0x350
  [cf82dcc0] [c0206204] bus_for_each_drv+0x60/0xac
  [cf82dcf0] [c0207e88] __device_attach+0xe8/0x160
  [cf82dd20] [c02071e0] bus_probe_device+0xa0/0xbc
  [cf82dd40] [c02050c8] device_add+0x404/0x5c4
  [cf82dd90] [c0288978] of_platform_device_create_pdata+0x88/0xd8
  [cf82ddb0] [c0288b70] of_platform_bus_create+0x134/0x220
  [cf82de10] [c0288bcc] of_platform_bus_create+0x190/0x220
  [cf82de70] [c0288cf4] of_platform_bus_probe+0x98/0xec
  [cf82de90] [c0449650] __machine_initcall_canyonlands_ppc460ex_device_probe+0x38/0x54
  [cf82dea0] [c0002404] do_one_initcall+0x40/0x188
  [cf82df00] [c043daec] kernel_init_freeable+0x130/0x1d0
  [cf82df30] [c0002600] kernel_init+0x18/0x104
  [cf82df40] [c000c23c] ret_from_kernel_thread+0x14/0x1c
  Instruction dump:
  90010024 813d0024 2f890000 83c30058 41bd0014 48000038 813d0024 7f89f800
  409d002c 813e000c 57ea103a 3bff0001 <7c69502e> 2f830000 419effe0 4803b26d
  ---[ end trace 8cf551077ecfc42a ]---

Fix it up. Specifically,

- Return valid error codes from ppc4xx_setup_pcieh_hw(), have it clean
  up after itself, and only access hardware after all possible error
  conditions have been handled.
- Use devm_kzalloc() instead of kzalloc() in ppc4xx_msi_probe()

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 93a3922 ]

During remote port loss fault testing, the driver crashed with the
following trace:

general protection fault: 0000 [150balbes#1] SMP
RIP: ... lpfc_nvme_register_port+0x250/0x480 [lpfc]
Call Trace:
 lpfc_nlp_state_cleanup+0x1b3/0x7a0 [lpfc]
 lpfc_nlp_set_state+0xa6/0x1d0 [lpfc]
 lpfc_cmpl_prli_prli_issue+0x213/0x440
 lpfc_disc_state_machine+0x7e/0x1e0 [lpfc]
 lpfc_cmpl_els_prli+0x18a/0x200 [lpfc]
 lpfc_sli_sp_handle_rspiocb+0x3b5/0x6f0 [lpfc]
 lpfc_sli_handle_slow_ring_event_s4+0x161/0x240 [lpfc]
 lpfc_work_done+0x948/0x14c0 [lpfc]
 lpfc_do_work+0x16f/0x180 [lpfc]
 kthread+0xc9/0xe0
 ret_from_fork+0x55/0x80

After registering a new remoteport, the driver is pulling an ndlp pointer
from the lpfc rport associated with the private area of a newly registered
remoteport. The private area is uninitialized, so it's garbage.

Correct by pulling the the lpfc rport pointer from the entering ndlp point,
then ndlp value from at rport. Note the entering ndlp may be replacing by
the rport->ndlp due to an address change swap.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit b34e9a1 ]

The call trace:
XXX/1910 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<ffffffff97008c87>] might_fault+0x57/0xb0

but task is already holding lock:
 (&idev->info_lock){+.+...}, at: [<ffffffffc0638a06>] uio_write+0x46/0x130 [uio]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> 150balbes#1 (&idev->info_lock){+.+...}:
       [<ffffffff96f31fc9>] lock_acquire+0x99/0x1e0
       [<ffffffff975edad3>] mutex_lock_nested+0x93/0x410
       [<ffffffffc063873d>] uio_mmap+0x2d/0x170 [uio]
       [<ffffffff97016b58>] mmap_region+0x428/0x650
       [<ffffffff97017138>] do_mmap+0x3b8/0x4e0
       [<ffffffff96ffaba3>] vm_mmap_pgoff+0xd3/0x120
       [<ffffffff97015261>] SyS_mmap_pgoff+0x1f1/0x270
       [<ffffffff96e387c2>] SyS_mmap+0x22/0x30
       [<ffffffff975ff315>] system_call_fastpath+0x1c/0x21

-> #0 (&mm->mmap_sem){++++++}:
       [<ffffffff96f30e9c>] __lock_acquire+0xdac/0x15f0
       [<ffffffff96f31fc9>] lock_acquire+0x99/0x1e0
       [<ffffffff97008cb4>] might_fault+0x84/0xb0
       [<ffffffffc0638a74>] uio_write+0xb4/0x130 [uio]
       [<ffffffff9706ffa3>] vfs_write+0xc3/0x1f0
       [<ffffffff97070e2a>] SyS_write+0x8a/0x100
       [<ffffffff975ff315>] system_call_fastpath+0x1c/0x21

other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&idev->info_lock);
                               lock(&mm->mmap_sem);
                               lock(&idev->info_lock);
  lock(&mm->mmap_sem);

 *** DEADLOCK ***
1 lock held by XXX/1910:
 #0:  (&idev->info_lock){+.+...}, at: [<ffffffffc0638a06>] uio_write+0x46/0x130 [uio]

stack backtrace:
CPU: 0 PID: 1910 Comm: XXX Kdump: loaded Not tainted 150balbes#1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
Call Trace:
 [<ffffffff975e9211>] dump_stack+0x19/0x1b
 [<ffffffff975e260a>] print_circular_bug+0x1f9/0x207
 [<ffffffff96f2f6a7>] check_prevs_add+0x957/0x960
 [<ffffffff96f30e9c>] __lock_acquire+0xdac/0x15f0
 [<ffffffff96f2fb19>] ? mark_held_locks+0xb9/0x140
 [<ffffffff96f31fc9>] lock_acquire+0x99/0x1e0
 [<ffffffff97008c87>] ? might_fault+0x57/0xb0
 [<ffffffff97008cb4>] might_fault+0x84/0xb0
 [<ffffffff97008c87>] ? might_fault+0x57/0xb0
 [<ffffffffc0638a74>] uio_write+0xb4/0x130 [uio]
 [<ffffffff9706ffa3>] vfs_write+0xc3/0x1f0
 [<ffffffff9709349c>] ? fget_light+0xfc/0x510
 [<ffffffff97070e2a>] SyS_write+0x8a/0x100
 [<ffffffff975ff315>] system_call_fastpath+0x1c/0x21

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 10d255c ]

If segment type in SSA and SIT is inconsistent, we will encounter below
BUG_ON during GC, to avoid this panic, let's just skip doing GC on such
segment.

The bug is triggered with image reported in below link:

https://bugzilla.kernel.org/show_bug.cgi?id=200223

[  388.060262] ------------[ cut here ]------------
[  388.060268] kernel BUG at /home/y00370721/git/devf2fs/gc.c:989!
[  388.061172] invalid opcode: 0000 [150balbes#1] SMP
[  388.061773] Modules linked in: f2fs(O) bluetooth ecdh_generic xt_tcpudp iptable_filter ip_tables x_tables lp ttm drm_kms_helper drm intel_rapl sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel fb_sys_fops ppdev aes_x86_64 syscopyarea crypto_simd sysfillrect parport_pc joydev sysimgblt glue_helper parport cryptd i2c_piix4 serio_raw mac_hid btrfs hid_generic usbhid hid raid6_pq psmouse pata_acpi floppy
[  388.064247] CPU: 7 PID: 4151 Comm: f2fs_gc-7:0 Tainted: G           O    4.13.0-rc1+ 150balbes#26
[  388.065306] Hardware name: Xen HVM domU, BIOS 4.1.2_115-900.260_ 11/06/2015
[  388.066058] task: ffff880201583b80 task.stack: ffffc90004d7c000
[  388.069948] RIP: 0010:do_garbage_collect+0xcc8/0xcd0 [f2fs]
[  388.070766] RSP: 0018:ffffc90004d7fc68 EFLAGS: 00010202
[  388.071783] RAX: ffff8801ed227000 RBX: 0000000000000001 RCX: ffffea0007b489c0
[  388.072700] RDX: ffff880000000000 RSI: 0000000000000001 RDI: ffffea0007b489c0
[  388.073607] RBP: ffffc90004d7fd58 R08: 0000000000000003 R09: ffffea0007b489dc
[  388.074619] R10: 0000000000000000 R11: 0052782ab317138d R12: 0000000000000018
[  388.075625] R13: 0000000000000018 R14: ffff880211ceb000 R15: ffff880211ceb000
[  388.076687] FS:  0000000000000000(0000) GS:ffff880214fc0000(0000) knlGS:0000000000000000
[  388.083277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  388.084536] CR2: 0000000000e18c60 CR3: 00000001ecf2e000 CR4: 00000000001406e0
[  388.085748] Call Trace:
[  388.086690]  ? find_next_bit+0xb/0x10
[  388.088091]  f2fs_gc+0x1a8/0x9d0 [f2fs]
[  388.088888]  ? lock_timer_base+0x7d/0xa0
[  388.090213]  ? try_to_del_timer_sync+0x44/0x60
[  388.091698]  gc_thread_func+0x342/0x4b0 [f2fs]
[  388.092892]  ? wait_woken+0x80/0x80
[  388.094098]  kthread+0x109/0x140
[  388.095010]  ? f2fs_gc+0x9d0/0x9d0 [f2fs]
[  388.096043]  ? kthread_park+0x60/0x60
[  388.097281]  ret_from_fork+0x25/0x30
[  388.098401] Code: ff ff 48 83 e8 01 48 89 44 24 58 e9 27 f8 ff ff 48 83 e8 01 e9 78 fc ff ff 48 8d 78 ff e9 17 fb ff ff 48 83 ef 01 e9 4d f4 ff ff <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
[  388.100864] RIP: do_garbage_collect+0xcc8/0xcd0 [f2fs] RSP: ffffc90004d7fc68
[  388.101810] ---[ end trace 81c73d6e6b7da61d ]---

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 4dbe38d ]

As Wen Xu reported in bugzilla, after image was injected with random data
by fuzzing, inline inode would contain invalid reserved blkaddr, then
during inline conversion, we will encounter illegal memory accessing
reported by KASAN, the root cause of this is when writing out converted
inline page, we will use invalid reserved blkaddr to update sit bitmap,
result in accessing memory beyond sit bitmap boundary.

In order to fix this issue, let's do sanity check with reserved block
address of inline inode to avoid above condition.

https://bugzilla.kernel.org/show_bug.cgi?id=200179

[ 1428.846352] BUG: KASAN: use-after-free in update_sit_entry+0x80/0x7f0
[ 1428.846618] Read of size 4 at addr ffff880194483540 by task a.out/2741

[ 1428.846855] CPU: 0 PID: 2741 Comm: a.out Tainted: G        W         4.17.0+ 150balbes#1
[ 1428.846858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 1428.846860] Call Trace:
[ 1428.846868]  dump_stack+0x71/0xab
[ 1428.846875]  print_address_description+0x6b/0x290
[ 1428.846881]  kasan_report+0x28e/0x390
[ 1428.846888]  ? update_sit_entry+0x80/0x7f0
[ 1428.846898]  update_sit_entry+0x80/0x7f0
[ 1428.846906]  f2fs_allocate_data_block+0x6db/0xc70
[ 1428.846914]  ? f2fs_get_node_info+0x14f/0x590
[ 1428.846920]  do_write_page+0xc8/0x150
[ 1428.846928]  f2fs_outplace_write_data+0xfe/0x210
[ 1428.846935]  ? f2fs_do_write_node_page+0x170/0x170
[ 1428.846941]  ? radix_tree_tag_clear+0xff/0x130
[ 1428.846946]  ? __mod_node_page_state+0x22/0xa0
[ 1428.846951]  ? inc_zone_page_state+0x54/0x100
[ 1428.846956]  ? __test_set_page_writeback+0x336/0x5d0
[ 1428.846964]  f2fs_convert_inline_page+0x407/0x6d0
[ 1428.846971]  ? f2fs_read_inline_data+0x3b0/0x3b0
[ 1428.846978]  ? __get_node_page+0x335/0x6b0
[ 1428.846987]  f2fs_convert_inline_inode+0x41b/0x500
[ 1428.846994]  ? f2fs_convert_inline_page+0x6d0/0x6d0
[ 1428.847000]  ? kasan_unpoison_shadow+0x31/0x40
[ 1428.847005]  ? kasan_kmalloc+0xa6/0xd0
[ 1428.847024]  f2fs_file_mmap+0x79/0xc0
[ 1428.847029]  mmap_region+0x58b/0x880
[ 1428.847037]  ? arch_get_unmapped_area+0x370/0x370
[ 1428.847042]  do_mmap+0x55b/0x7a0
[ 1428.847048]  vm_mmap_pgoff+0x16f/0x1c0
[ 1428.847055]  ? vma_is_stack_for_current+0x50/0x50
[ 1428.847062]  ? __fsnotify_update_child_dentry_flags.part.1+0x160/0x160
[ 1428.847068]  ? do_sys_open+0x206/0x2a0
[ 1428.847073]  ? __fget+0xb4/0x100
[ 1428.847079]  ksys_mmap_pgoff+0x278/0x360
[ 1428.847085]  ? find_mergeable_anon_vma+0x50/0x50
[ 1428.847091]  do_syscall_64+0x73/0x160
[ 1428.847098]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1428.847102] RIP: 0033:0x7fb1430766ba
[ 1428.847103] Code: 89 f5 41 54 49 89 fc 55 53 74 35 49 63 e8 48 63 da 4d 89 f9 49 89 e8 4d 63 d6 48 89 da 4c 89 ee 4c 89 e7 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 56 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 00
[ 1428.847162] RSP: 002b:00007ffc651d9388 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[ 1428.847167] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb1430766ba
[ 1428.847170] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[ 1428.847173] RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
[ 1428.847176] R10: 0000000000008002 R11: 0000000000000246 R12: 0000000000000000
[ 1428.847179] R13: 0000000000001000 R14: 0000000000008002 R15: 0000000000000000

[ 1428.847252] Allocated by task 2683:
[ 1428.847372]  kasan_kmalloc+0xa6/0xd0
[ 1428.847380]  kmem_cache_alloc+0xc8/0x1e0
[ 1428.847385]  getname_flags+0x73/0x2b0
[ 1428.847390]  user_path_at_empty+0x1d/0x40
[ 1428.847395]  vfs_statx+0xc1/0x150
[ 1428.847401]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847405]  do_syscall_64+0x73/0x160
[ 1428.847411]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847466] Freed by task 2683:
[ 1428.847566]  __kasan_slab_free+0x137/0x190
[ 1428.847571]  kmem_cache_free+0x85/0x1e0
[ 1428.847575]  filename_lookup+0x191/0x280
[ 1428.847580]  vfs_statx+0xc1/0x150
[ 1428.847585]  __do_sys_newlstat+0x7e/0xd0
[ 1428.847590]  do_syscall_64+0x73/0x160
[ 1428.847596]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 1428.847648] The buggy address belongs to the object at ffff880194483300
                which belongs to the cache names_cache of size 4096
[ 1428.847946] The buggy address is located 576 bytes inside of
                4096-byte region [ffff880194483300, ffff880194484300)
[ 1428.848234] The buggy address belongs to the page:
[ 1428.848366] page:ffffea0006512000 count:1 mapcount:0 mapping:ffff8801f3586380 index:0x0 compound_mapcount: 0
[ 1428.848606] flags: 0x17fff8000008100(slab|head)
[ 1428.848737] raw: 017fff8000008100 dead000000000100 dead000000000200 ffff8801f3586380
[ 1428.848931] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 1428.849122] page dumped because: kasan: bad access detected

[ 1428.849305] Memory state around the buggy address:
[ 1428.849436]  ffff880194483400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849620]  ffff880194483480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849804] >ffff880194483500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.849985]                                            ^
[ 1428.850120]  ffff880194483580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850303]  ffff880194483600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1428.850498] ==================================================================

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 20cdcaf ]

During the duplication of em28xx state for the second tuner pair
a pointer to alt_max_pkt_size_isoc is copied. During tear down
the second tuner is destroyed first and kfrees alt_max_pkt_size_isoc,
then the first tuner is destroyed and kfrees it again. The property
should only be kfree'd if the tuner is PRIMARY_TS.

[  354.888560] ------------[ cut here ]------------
[  354.888562] kernel BUG at mm/slub.c:296!
[  354.888574] invalid opcode: 0000 [150balbes#1] SMP NOPTI
[  354.888869] CPU: 1 PID: 19 Comm: kworker/1:0 Not tainted 4.18.0-rc1+ 150balbes#20
[  354.889140] Hardware name: MSI MS-7A39/B350M GAMING PRO (MS-7A39), BIOS 2.G0 04/27/2018
[  354.889408] Workqueue: usb_hub_wq hub_event
[  354.889679] RIP: 0010:__slab_free+0x217/0x370
[  354.889942] Code: bb c0 e8 07 41 38 c7 72 39 48 83 c4 70 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 f3 90 49 8b 04 24 a8 01 75 f6 eb 82 <0f> 0b 44 89 45 80 48 89 4d 88 e8 aa fa ff ff 85 c0 74 cc e9 b7 fe
[  354.890598] RSP: 0018:ffffb84c41a4fad0 EFLAGS: 00010246
[  354.890934] RAX: ffff948646e85150 RBX: ffff948646e85150 RCX: ffff948646e85150
[  354.891280] RDX: 00000000820001d9 RSI: fffffa8fd01ba140 RDI: ffff94865e807c00
[  354.891649] RBP: ffffb84c41a4fb70 R08: 0000000000000001 R09: ffffffffc059ce21
[  354.892025] R10: ffff948646e85150 R11: 0000000000000001 R12: fffffa8fd01ba140
[  354.892403] R13: ffff948646e85150 R14: ffff94865e807c00 R15: ffff94864c92e0a0
[  354.892780] FS:  0000000000000000(0000) GS:ffff94865ec40000(0000) knlGS:0000000000000000
[  354.893150] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  354.893530] CR2: 00007f4e476da950 CR3: 000000040112c000 CR4: 00000000003406e0
[  354.893917] Call Trace:
[  354.894315]  ? __dev_printk+0x3c/0x80
[  354.894695]  ? _dev_info+0x64/0x80
[  354.895082]  ? em28xx_free_device+0x41/0x50 [em28xx]
[  354.895464]  kfree+0x17a/0x190
[  354.895852]  ? kfree+0x17a/0x190
[  354.896310]  em28xx_free_device+0x41/0x50 [em28xx]
[  354.896698]  em28xx_usb_disconnect+0xfa/0x110 [em28xx]
[  354.897083]  usb_unbind_interface+0x7a/0x270
[  354.897475]  device_release_driver_internal+0x17c/0x250
[  354.897864]  device_release_driver+0x12/0x20
[  354.898252]  bus_remove_device+0xec/0x160
[  354.898639]  device_del+0x13d/0x320
[  354.899018]  ? usb_remove_ep_devs+0x1f/0x30
[  354.899392]  usb_disable_device+0x9e/0x270
[  354.899772]  usb_disconnect+0x92/0x2a0
[  354.900149]  hub_event+0x98e/0x1650
[  354.900519]  ? sched_clock_cpu+0x11/0xa0
[  354.900890]  process_one_work+0x167/0x3f0
[  354.901251]  worker_thread+0x4d/0x460
[  354.901610]  kthread+0x105/0x140
[  354.901964]  ? rescuer_thread+0x360/0x360
[  354.902318]  ? kthread_associate_blkcg+0xa0/0xa0
[  354.902672]  ret_from_fork+0x22/0x40
[  354.903024] Modules linked in: rc_hauppauge em28xx_rc rc_core si2157 lgdt3306a i2c_mux em28xx_dvb dvb_core videobuf2_vmalloc videobuf2_memops videobuf2_common snd_hda_codec_hdmi nls_iso8859_1 edac_mce_amd kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi aesni_intel snd_seq_midi_event aes_x86_64 snd_rawmidi crypto_simd em28xx cryptd glue_helper asix tveeprom usbnet snd_seq v4l2_common mii videodev snd_seq_device media input_leds snd_timer joydev ccp k10temp wmi_bmof snd soundcore mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables vfio_pci vfio_virqfd irqbypass vfio_iommu_type1 vfio nouveau mxm_wmi video i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_piix4 drm ahci libahci
[  354.905129]  wmi gpio_amdpt gpio_generic hid_generic usbhid hid
[  354.908140] ---[ end trace c230d02716298c34 ]---
[  354.908145] RIP: 0010:__slab_free+0x217/0x370
[  354.908147] Code: bb c0 e8 07 41 38 c7 72 39 48 83 c4 70 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 f3 90 49 8b 04 24 a8 01 75 f6 eb 82 <0f> 0b 44 89 45 80 48 89 4d 88 e8 aa fa ff ff 85 c0 74 cc e9 b7 fe
[  354.908183] RSP: 0018:ffffb84c41a4fad0 EFLAGS: 00010246
[  354.908186] RAX: ffff948646e85150 RBX: ffff948646e85150 RCX: ffff948646e85150
[  354.908189] RDX: 00000000820001d9 RSI: fffffa8fd01ba140 RDI: ffff94865e807c00
[  354.908191] RBP: ffffb84c41a4fb70 R08: 0000000000000001 R09: ffffffffc059ce21
[  354.908193] R10: ffff948646e85150 R11: 0000000000000001 R12: fffffa8fd01ba140
[  354.908195] R13: ffff948646e85150 R14: ffff94865e807c00 R15: ffff94864c92e0a0
[  354.908198] FS:  0000000000000000(0000) GS:ffff94865ec40000(0000) knlGS:0000000000000000
[  354.908201] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  354.908203] CR2: 00007f4e476da950 CR3: 000000016b20a000 CR4: 00000000003406e0

Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Michael Ira Krufky <mkrufky@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit a1933c0 ]

[  155.018460] ======================================================
[  155.021431] WARNING: possible circular locking dependency detected
[  155.024339] 4.18.0-rc3+ 150balbes#5 Tainted: G           OE
[  155.026879] ------------------------------------------------------
[  155.029783] umount/2901 is trying to acquire lock:
[  155.032187] 00000000c4282f1f (kn->count#130){++++}, at: kernfs_remove+0x1f/0x30
[  155.035439]
[  155.035439] but task is already holding lock:
[  155.038892] 0000000056e4307b (&type->s_umount_key#41){++++}, at: deactivate_super+0x33/0x50
[  155.042602]
[  155.042602] which lock already depends on the new lock.
[  155.042602]
[  155.047465]
[  155.047465] the existing dependency chain (in reverse order) is:
[  155.051354]
[  155.051354] -> 150balbes#1 (&type->s_umount_key#41){++++}:
[  155.054768]        f2fs_sbi_store+0x61/0x460 [f2fs]
[  155.057083]        kernfs_fop_write+0x113/0x1a0
[  155.059277]        __vfs_write+0x36/0x180
[  155.061250]        vfs_write+0xbe/0x1b0
[  155.063179]        ksys_write+0x55/0xc0
[  155.065068]        do_syscall_64+0x60/0x1b0
[  155.067071]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  155.069529]
[  155.069529] -> #0 (kn->count#130){++++}:
[  155.072421]        __kernfs_remove+0x26f/0x2e0
[  155.074452]        kernfs_remove+0x1f/0x30
[  155.076342]        kobject_del.part.5+0xe/0x40
[  155.078354]        f2fs_put_super+0x12d/0x290 [f2fs]
[  155.080500]        generic_shutdown_super+0x6c/0x110
[  155.082655]        kill_block_super+0x21/0x50
[  155.084634]        kill_f2fs_super+0x9c/0xc0 [f2fs]
[  155.086726]        deactivate_locked_super+0x3f/0x70
[  155.088826]        cleanup_mnt+0x3b/0x70
[  155.090584]        task_work_run+0x93/0xc0
[  155.092367]        exit_to_usermode_loop+0xf0/0x100
[  155.094466]        do_syscall_64+0x162/0x1b0
[  155.096312]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  155.098603]
[  155.098603] other info that might help us debug this:
[  155.098603]
[  155.102418]  Possible unsafe locking scenario:
[  155.102418]
[  155.105134]        CPU0                    CPU1
[  155.107037]        ----                    ----
[  155.108910]   lock(&type->s_umount_key#41);
[  155.110674]                                lock(kn->count#130);
[  155.113010]                                lock(&type->s_umount_key#41);
[  155.115608]   lock(kn->count#130);

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 42bf546 ]

As Wen Xu reported in below link:

https://bugzilla.kernel.org/show_bug.cgi?id=200183

- Overview
Divide zero in reset_curseg() when mounting a crafted f2fs image

- Reproduce

- Kernel message
[  588.281510] divide error: 0000 [150balbes#1] SMP KASAN PTI
[  588.282701] CPU: 0 PID: 1293 Comm: mount Not tainted 4.18.0-rc1+ 150balbes#4
[  588.284000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  588.286178] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.298166] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.299360] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.300809] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.305272] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.306822] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.308456] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.309623] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0
[  588.311085] Call Trace:
[  588.311637]  f2fs_build_segment_manager+0x103f/0x3410
[  588.316136]  ? f2fs_commit_super+0x1b0/0x1b0
[  588.317031]  ? set_blocksize+0x90/0x140
[  588.319473]  f2fs_mount+0x15/0x20
[  588.320166]  mount_fs+0x60/0x1a0
[  588.320847]  ? alloc_vfsmnt+0x309/0x360
[  588.321647]  vfs_kern_mount+0x6b/0x1a0
[  588.322432]  do_mount+0x34a/0x18c0
[  588.323175]  ? strndup_user+0x46/0x70
[  588.323937]  ? copy_mount_string+0x20/0x20
[  588.324793]  ? memcg_kmem_put_cache+0x1b/0xa0
[  588.325702]  ? kasan_check_write+0x14/0x20
[  588.326562]  ? _copy_from_user+0x6a/0x90
[  588.327375]  ? memdup_user+0x42/0x60
[  588.328118]  ksys_mount+0x83/0xd0
[  588.328808]  __x64_sys_mount+0x67/0x80
[  588.329607]  do_syscall_64+0x78/0x170
[  588.330400]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  588.331461] RIP: 0033:0x7fad848e8b9a
[  588.336022] RSP: 002b:00007ffd7c5b6be8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  588.337547] RAX: ffffffffffffffda RBX: 00000000016f8030 RCX: 00007fad848e8b9a
[  588.338999] RDX: 00000000016f8210 RSI: 00000000016f9f30 RDI: 0000000001700ec0
[  588.340442] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  588.341887] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001700ec0
[  588.343341] R13: 00000000016f8210 R14: 0000000000000000 R15: 0000000000000003
[  588.354891] ---[ end trace 4ce02f25ff7d3df5 ]---
[  588.355862] RIP: 0010:reset_curseg+0x94/0x1a0
[  588.360742] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[  588.361812] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[  588.363485] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[  588.365213] RBP: ffff8801e88d7968 R08: ffffed003c32266f R09: ffffed003c32266f
[  588.366661] R10: 0000000000000001 R11: ffffed003c32266e R12: ffff8801f0337700
[  588.368110] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[  588.370057] FS:  00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  588.372099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  588.373291] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0

- Location
https://elixir.bootlin.com/linux/latest/source/fs/f2fs/segment.c#L2147
        curseg->zone = GET_ZONE_FROM_SEG(sbi, curseg->segno);

If secs_per_zone is corrupted due to fuzzing test, it will cause divide
zero operation when using GET_ZONE_FROM_SEG macro, so we should do more
sanity check with secs_per_zone during mount to avoid this issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit e237401 ]

As Anatoly Trosinenko reported in bugzilla:

How to reproduce:
1. Compile the 73fcb1a version of the kernel using the config attached
2. Unpack and mount the attached filesystem image as F2FS
3. The kernel will BUG() on mount (BUGs are explicitly enabled in config)

[    2.233612] F2FS-fs (sda): Found nat_bits in checkpoint
[    2.248422] ------------[ cut here ]------------
[    2.248857] kernel BUG at fs/f2fs/node.c:1967!
[    2.249760] invalid opcode: 0000 [150balbes#1] SMP NOPTI
[    2.250219] Modules linked in:
[    2.251848] CPU: 0 PID: 944 Comm: mount Not tainted 4.17.0-rc5+ 150balbes#1
[    2.252331] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[    2.253305] RIP: 0010:build_free_nids+0x337/0x3f0
[    2.253672] RSP: 0018:ffffae7fc0857c50 EFLAGS: 00000246
[    2.254080] RAX: 00000000ffffffff RBX: 0000000000000123 RCX: 0000000000000001
[    2.254638] RDX: ffff9aa7063d5c00 RSI: 0000000000000122 RDI: ffff9aa705852e00
[    2.255190] RBP: ffff9aa705852e00 R08: 0000000000000001 R09: ffff9aa7059090c0
[    2.255719] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9aa705852e00
[    2.256242] R13: ffff9aa7063ad000 R14: ffff9aa705919000 R15: 0000000000000123
[    2.256809] FS:  00000000023078c0(0000) GS:ffff9aa707800000(0000) knlGS:0000000000000000
[    2.258654] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.259153] CR2: 00000000005511ae CR3: 0000000005872000 CR4: 00000000000006f0
[    2.259801] Call Trace:
[    2.260583]  build_node_manager+0x5cd/0x600
[    2.260963]  f2fs_fill_super+0x66a/0x17c0
[    2.261300]  ? f2fs_commit_super+0xe0/0xe0
[    2.261622]  mount_bdev+0x16e/0x1a0
[    2.261899]  mount_fs+0x30/0x150
[    2.262398]  vfs_kern_mount.part.28+0x4f/0xf0
[    2.262743]  do_mount+0x5d0/0xc60
[    2.263010]  ? _copy_from_user+0x37/0x60
[    2.263313]  ? memdup_user+0x39/0x60
[    2.263692]  ksys_mount+0x7b/0xd0
[    2.263960]  __x64_sys_mount+0x1c/0x20
[    2.264268]  do_syscall_64+0x43/0xf0
[    2.264560]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.265095] RIP: 0033:0x48d31a
[    2.265502] RSP: 002b:00007ffc6fe60a08 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[    2.266089] RAX: ffffffffffffffda RBX: 0000000000008000 RCX: 000000000048d31a
[    2.266607] RDX: 00007ffc6fe62fa5 RSI: 00007ffc6fe62f9d RDI: 00007ffc6fe62f94
[    2.267130] RBP: 00000000023078a0 R08: 0000000000000000 R09: 0000000000000000
[    2.267670] R10: 0000000000008000 R11: 0000000000000246 R12: 0000000000000000
[    2.268192] R13: 0000000000000000 R14: 00007ffc6fe60c78 R15: 0000000000000000
[    2.268767] Code: e8 5f c3 ff ff 83 c3 01 41 83 c7 01 81 fb c7 01 00 00 74 48 44 39 7d 04 76 42 48 63 c3 48 8d 04 c0 41 8b 44 06 05 83 f8 ff 75 c1 <0f> 0b 49 8b 45 50 48 8d b8 b0 00 00 00 e8 37 59 69 00 b9 01 00
[    2.270434] RIP: build_free_nids+0x337/0x3f0 RSP: ffffae7fc0857c50
[    2.271426] ---[ end trace ab20c06cd3c8fde4 ]---

During loading NAT entries, we will do sanity check, once the entry info
is corrupted, it will cause BUG_ON directly to protect user data from
being overwrited.

In this case, it will be better to just return failure on mount() instead
of panic, so that user can get hint from kmsg and try fsck for recovery
immediately rather than after an abnormal reboot.

https://bugzilla.kernel.org/show_bug.cgi?id=199769

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 76d56d4 ]

If FI_EXTRA_ATTR is set in inode by fuzzing, inode.i_addr[0] will be
parsed as inode.i_extra_isize, then in __recover_inline_status, inline
data address will beyond boundary of page, result in accessing invalid
memory.

So in this condition, during reading inode page, let's do sanity check
with EXTRA_ATTR feature of fs and extra_attr bit of inode, if they're
inconsistent, deny to load this inode.

- Overview
Out-of-bound access in f2fs_iget() when mounting a corrupted f2fs image

- Reproduce

The following message will be got in KASAN build of 4.18 upstream kernel.
[  819.392227] ==================================================================
[  819.393901] BUG: KASAN: slab-out-of-bounds in f2fs_iget+0x736/0x1530
[  819.395329] Read of size 4 at addr ffff8801f099c968 by task mount/1292

[  819.397079] CPU: 1 PID: 1292 Comm: mount Not tainted 4.18.0-rc1+ 150balbes#4
[  819.397082] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  819.397088] Call Trace:
[  819.397124]  dump_stack+0x7b/0xb5
[  819.397154]  print_address_description+0x70/0x290
[  819.397159]  kasan_report+0x291/0x390
[  819.397163]  ? f2fs_iget+0x736/0x1530
[  819.397176]  check_memory_region+0x139/0x190
[  819.397182]  __asan_loadN+0xf/0x20
[  819.397185]  f2fs_iget+0x736/0x1530
[  819.397197]  f2fs_fill_super+0x1b4f/0x2b40
[  819.397202]  ? f2fs_fill_super+0x1b4f/0x2b40
[  819.397208]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397227]  ? set_blocksize+0x90/0x140
[  819.397241]  mount_bdev+0x1c5/0x210
[  819.397245]  ? f2fs_commit_super+0x1b0/0x1b0
[  819.397252]  f2fs_mount+0x15/0x20
[  819.397256]  mount_fs+0x60/0x1a0
[  819.397267]  ? alloc_vfsmnt+0x309/0x360
[  819.397272]  vfs_kern_mount+0x6b/0x1a0
[  819.397282]  do_mount+0x34a/0x18c0
[  819.397300]  ? lockref_put_or_lock+0xcf/0x160
[  819.397306]  ? copy_mount_string+0x20/0x20
[  819.397318]  ? memcg_kmem_put_cache+0x1b/0xa0
[  819.397324]  ? kasan_check_write+0x14/0x20
[  819.397334]  ? _copy_from_user+0x6a/0x90
[  819.397353]  ? memdup_user+0x42/0x60
[  819.397359]  ksys_mount+0x83/0xd0
[  819.397365]  __x64_sys_mount+0x67/0x80
[  819.397388]  do_syscall_64+0x78/0x170
[  819.397403]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  819.397422] RIP: 0033:0x7f54c667cb9a
[  819.397424] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[  819.397483] RSP: 002b:00007ffd8f46cd08 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  819.397496] RAX: ffffffffffffffda RBX: 0000000000dfa030 RCX: 00007f54c667cb9a
[  819.397498] RDX: 0000000000dfa210 RSI: 0000000000dfbf30 RDI: 0000000000e02ec0
[  819.397501] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[  819.397503] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000000000e02ec0
[  819.397505] R13: 0000000000dfa210 R14: 0000000000000000 R15: 0000000000000003

[  819.397866] Allocated by task 139:
[  819.398702]  save_stack+0x46/0xd0
[  819.398705]  kasan_kmalloc+0xad/0xe0
[  819.398709]  kasan_slab_alloc+0x11/0x20
[  819.398713]  kmem_cache_alloc+0xd1/0x1e0
[  819.398717]  dup_fd+0x50/0x4c0
[  819.398740]  copy_process.part.37+0xbed/0x32e0
[  819.398744]  _do_fork+0x16e/0x590
[  819.398748]  __x64_sys_clone+0x69/0x80
[  819.398752]  do_syscall_64+0x78/0x170
[  819.398756]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.399097] Freed by task 159:
[  819.399743]  save_stack+0x46/0xd0
[  819.399747]  __kasan_slab_free+0x13c/0x1a0
[  819.399750]  kasan_slab_free+0xe/0x10
[  819.399754]  kmem_cache_free+0x89/0x1e0
[  819.399757]  put_files_struct+0x132/0x150
[  819.399761]  exit_files+0x62/0x70
[  819.399766]  do_exit+0x47b/0x1390
[  819.399770]  do_group_exit+0x86/0x130
[  819.399774]  __x64_sys_exit_group+0x2c/0x30
[  819.399778]  do_syscall_64+0x78/0x170
[  819.399782]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  819.400115] The buggy address belongs to the object at ffff8801f099c680
                which belongs to the cache files_cache of size 704
[  819.403234] The buggy address is located 40 bytes to the right of
                704-byte region [ffff8801f099c680, ffff8801f099c940)
[  819.405689] The buggy address belongs to the page:
[  819.406709] page:ffffea0007c26700 count:1 mapcount:0 mapping:ffff8801f69a3340 index:0xffff8801f099d380 compound_mapcount: 0
[  819.408984] flags: 0x2ffff0000008100(slab|head)
[  819.409932] raw: 02ffff0000008100 ffffea00077fb600 0000000200000002 ffff8801f69a3340
[  819.411514] raw: ffff8801f099d380 0000000080130000 00000001ffffffff 0000000000000000
[  819.413073] page dumped because: kasan: bad access detected

[  819.414539] Memory state around the buggy address:
[  819.415521]  ffff8801f099c800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.416981]  ffff8801f099c880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.418454] >ffff8801f099c900: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  819.419921]                                                           ^
[  819.421265]  ffff8801f099c980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[  819.422745]  ffff8801f099ca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  819.424206] ==================================================================
[  819.425668] Disabling lock debugging due to kernel taint
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3

The kernel still mounts the image. If you run the following program on the mounted folder mnt,

(poc.c)

static void activity(char *mpoint) {

  char *foo_bar_baz;
  int err;

  static int buf[8192];
  memset(buf, 0, sizeof(buf));

  err = asprintf(&foo_bar_baz, "%s/foo/bar/baz", mpoint);
    int fd = open(foo_bar_baz, O_RDONLY, 0);
  if (fd >= 0) {
      read(fd, (char *)buf, 11);
      close(fd);
  }
}

int main(int argc, char *argv[]) {
  activity(argv[1]);
  return 0;
}

You can get kernel crash:
[  819.457463] F2FS-fs (loop0): Mounted with checkpoint version = 3
[  918.028501] BUG: unable to handle kernel paging request at ffffed0048000d82
[  918.044020] PGD 23ffee067 P4D 23ffee067 PUD 23fbef067 PMD 0
[  918.045207] Oops: 0000 [150balbes#1] SMP KASAN PTI
[  918.046048] CPU: 0 PID: 1309 Comm: poc Tainted: G    B             4.18.0-rc1+ 150balbes#4
[  918.047573] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[  918.049552] RIP: 0010:check_memory_region+0x5e/0x190
[  918.050565] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.054322] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.055400] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.056832] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.058253] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.059717] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.061159] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.062614] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.064246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.065412] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0
[  918.066882] Call Trace:
[  918.067410]  __asan_loadN+0xf/0x20
[  918.068149]  f2fs_find_target_dentry+0xf4/0x270
[  918.069083]  ? __get_node_page+0x331/0x5b0
[  918.069925]  f2fs_find_in_inline_dir+0x24b/0x310
[  918.070881]  ? f2fs_recover_inline_data+0x4c0/0x4c0
[  918.071905]  ? unwind_next_frame.part.5+0x34f/0x490
[  918.072901]  ? unwind_dump+0x290/0x290
[  918.073695]  ? is_bpf_text_address+0xe/0x20
[  918.074566]  __f2fs_find_entry+0x599/0x670
[  918.075408]  ? kasan_unpoison_shadow+0x36/0x50
[  918.076315]  ? kasan_kmalloc+0xad/0xe0
[  918.077100]  ? memcg_kmem_put_cache+0x55/0xa0
[  918.077998]  ? f2fs_find_target_dentry+0x270/0x270
[  918.079006]  ? d_set_d_op+0x30/0x100
[  918.079749]  ? __d_lookup_rcu+0x69/0x2e0
[  918.080556]  ? __d_alloc+0x275/0x450
[  918.081297]  ? kasan_check_write+0x14/0x20
[  918.082135]  ? memset+0x31/0x40
[  918.082820]  ? fscrypt_setup_filename+0x1ec/0x4c0
[  918.083782]  ? d_alloc_parallel+0x5bb/0x8c0
[  918.084640]  f2fs_find_entry+0xe9/0x110
[  918.085432]  ? __f2fs_find_entry+0x670/0x670
[  918.086308]  ? kasan_check_write+0x14/0x20
[  918.087163]  f2fs_lookup+0x297/0x590
[  918.087902]  ? f2fs_link+0x2b0/0x2b0
[  918.088646]  ? legitimize_path.isra.29+0x61/0xa0
[  918.089589]  __lookup_slow+0x12e/0x240
[  918.090371]  ? may_delete+0x2b0/0x2b0
[  918.091123]  ? __nd_alloc_stack+0xa0/0xa0
[  918.091944]  lookup_slow+0x44/0x60
[  918.092642]  walk_component+0x3ee/0xa40
[  918.093428]  ? is_bpf_text_address+0xe/0x20
[  918.094283]  ? pick_link+0x3e0/0x3e0
[  918.095047]  ? in_group_p+0xa5/0xe0
[  918.095771]  ? generic_permission+0x53/0x1e0
[  918.096666]  ? security_inode_permission+0x1d/0x70
[  918.097646]  ? inode_permission+0x7a/0x1f0
[  918.098497]  link_path_walk+0x2a2/0x7b0
[  918.099298]  ? apparmor_capget+0x3d0/0x3d0
[  918.100140]  ? walk_component+0xa40/0xa40
[  918.100958]  ? path_init+0x2e6/0x580
[  918.101695]  path_openat+0x1bb/0x2160
[  918.102471]  ? __save_stack_trace+0x92/0x100
[  918.103352]  ? save_stack+0xb5/0xd0
[  918.104070]  ? vfs_unlink+0x250/0x250
[  918.104822]  ? save_stack+0x46/0xd0
[  918.105538]  ? kasan_slab_alloc+0x11/0x20
[  918.106370]  ? kmem_cache_alloc+0xd1/0x1e0
[  918.107213]  ? getname_flags+0x76/0x2c0
[  918.107997]  ? getname+0x12/0x20
[  918.108677]  ? do_sys_open+0x14b/0x2c0
[  918.109450]  ? __x64_sys_open+0x4c/0x60
[  918.110255]  ? do_syscall_64+0x78/0x170
[  918.111083]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.112148]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.113204]  ? f2fs_empty_inline_dir+0x1e0/0x1e0
[  918.114150]  ? timespec64_trunc+0x5c/0x90
[  918.114993]  ? wb_io_lists_depopulated+0x1a/0xc0
[  918.115937]  ? inode_io_list_move_locked+0x102/0x110
[  918.116949]  do_filp_open+0x12b/0x1d0
[  918.117709]  ? may_open_dev+0x50/0x50
[  918.118475]  ? kasan_kmalloc+0xad/0xe0
[  918.119246]  do_sys_open+0x17c/0x2c0
[  918.119983]  ? do_sys_open+0x17c/0x2c0
[  918.120751]  ? filp_open+0x60/0x60
[  918.121463]  ? task_work_run+0x4d/0xf0
[  918.122237]  __x64_sys_open+0x4c/0x60
[  918.123001]  do_syscall_64+0x78/0x170
[  918.123759]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  918.124802] RIP: 0033:0x7fac96e3e040
[  918.125537] Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 83 3d 09 27 2d 00 00 75 10 b8 02 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e e0 01 00 48 89 04 24
[  918.129341] RSP: 002b:00007fff1b37f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
[  918.130870] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fac96e3e040
[  918.132295] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000122d080
[  918.133748] RBP: 00007fff1b37f9b0 R08: 00007fac9710bbd8 R09: 0000000000000001
[  918.135209] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000400c20
[  918.136650] R13: 00007fff1b37fab0 R14: 0000000000000000 R15: 0000000000000000
[  918.138093] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[  918.147924] CR2: ffffed0048000d82
[  918.148619] ---[ end trace 4ce02f25ff7d3df5 ]---
[  918.149563] RIP: 0010:check_memory_region+0x5e/0x190
[  918.150576] Code: f8 49 c1 e8 03 49 89 db 49 c1 eb 03 4d 01 cb 4d 01 c1 4d 8d 63 01 4c 89 c8 4d 89 e2 4d 29 ca 49 83 fa 10 7f 3d 4d 85 d2 74 32 <41> 80 39 00 75 23 48 b8 01 00 00 00 00 fc ff df 4d 01 d1 49 01 c0
[  918.154360] RSP: 0018:ffff8801e3a1f258 EFLAGS: 00010202
[  918.155411] RAX: ffffed0048000d82 RBX: ffff880240006c11 RCX: ffffffffb8867d14
[  918.156833] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880240006c10
[  918.158257] RBP: ffff8801e3a1f268 R08: 1ffff10048000d82 R09: ffffed0048000d82
[  918.159722] R10: 0000000000000001 R11: ffffed0048000d82 R12: ffffed0048000d83
[  918.161149] R13: ffff8801e3a1f390 R14: 0000000000000000 R15: ffff880240006c08
[  918.162587] FS:  00007fac9732c700(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[  918.164203] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  918.165356] CR2: ffffed0048000d82 CR3: 00000001df77a000 CR4: 00000000000006f0

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 46b3722 ]

We occasionaly hit following assert failure in 'perf top', when processing the
/proc info in multiple threads.

  perf: ...include/linux/refcount.h:109: refcount_inc:
        Assertion `!(!refcount_inc_not_zero(r))' failed.

The gdb backtrace looks like this:

  [Switching to Thread 0x7ffff11ba700 (LWP 13749)]
  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  150balbes#1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
  150balbes#2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
  150balbes#3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
  150balbes#4  0x0000000000535373 in refcount_inc (r=0x7fffdc009be0)
      at ...include/linux/refcount.h:109
  150balbes#5  0x00000000005354f1 in comm_str__get (cs=0x7fffdc009bc0)
      at util/comm.c:24
  150balbes#6  0x00000000005356bd in __comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:72
  150balbes#7  0x000000000053579e in comm_str__findnew (str=0x7fffd000b260 ":2",
      root=0xbed5c0 <comm_str_root>) at util/comm.c:95
  150balbes#8  0x000000000053582e in comm__new (str=0x7fffd000b260 ":2",
      timestamp=0, exec=false) at util/comm.c:111
  150balbes#9  0x00000000005363bc in thread__new (pid=2, tid=2) at util/thread.c:57
  150balbes#10 0x0000000000523da0 in ____machine__findnew_thread (machine=0xbfde38,
      threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:457
  150balbes#11 0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
  ...

The failing assertion is this one:

  REFCOUNT_WARN(!refcount_inc_not_zero(r), ...

The problem is that we keep global comm_str_root list, which
is accessed by multiple threads during the 'perf top' startup
and following 2 paths can race:

  thread 1:
    ...
    thread__new
      comm__new
        comm_str__findnew
          down_write(&comm_str_lock);
          __comm_str__findnew
            comm_str__get

  thread 2:
    ...
    comm__override or comm__free
      comm_str__put
        refcount_dec_and_test
          down_write(&comm_str_lock);
          rb_erase(&cs->rb_node, &comm_str_root);

Because thread 2 first decrements the refcnt and only after then it removes the
struct comm_str from the list, the thread 1 can find this object on the list
with refcnt equls to 0 and hit the assert.

This patch fixes the thread 1 __comm_str__findnew path, by ignoring objects
that already dropped the refcnt to 0. For the rest of the objects we take the
refcnt before comparing its name and release it afterwards with comm_str__put,
which can also release the object completely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20180720101740.GA27176@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit b71c69c ]

Fixes this warning that was provoked by a pairing:

[60258.016221] WARNING: possible recursive locking detected
[60258.021558] 4.15.0-RD1812-BSP 150balbes#1 Tainted: G           O
[60258.027146] --------------------------------------------
[60258.032464] kworker/u5:0/70 is trying to acquire lock:
[60258.037609]  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}, at: [<87759073>] bt_accept_enqueue+0x3c/0x74
[60258.046863]
[60258.046863] but task is already holding lock:
[60258.052704]  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}, at: [<d22d7106>] l2cap_sock_new_connection_cb+0x1c/0x88
[60258.062905]
[60258.062905] other info that might help us debug this:
[60258.069441]  Possible unsafe locking scenario:
[60258.069441]
[60258.075368]        CPU0
[60258.077821]        ----
[60258.080272]   lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
[60258.085510]   lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP);
[60258.090748]
[60258.090748]  *** DEADLOCK ***
[60258.090748]
[60258.096676]  May be due to missing lock nesting notation
[60258.096676]
[60258.103472] 5 locks held by kworker/u5:0/70:
[60258.107747]  #0:  ((wq_completion)%shdev->name#2){+.+.}, at: [<9460d092>] process_one_work+0x130/0x4fc
[60258.117263]  150balbes#1:  ((work_completion)(&hdev->rx_work)){+.+.}, at: [<9460d092>] process_one_work+0x130/0x4fc
[60258.126942]  150balbes#2:  (&conn->chan_lock){+.+.}, at: [<7877c8c3>] l2cap_connect+0x80/0x4f8
[60258.134806]  150balbes#3:  (&chan->lock/2){+.+.}, at: [<2e16c724>] l2cap_connect+0x8c/0x4f8
[60258.142410]  150balbes#4:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.}, at: [<d22d7106>] l2cap_sock_new_connection_cb+0x1c/0x88
[60258.153043]
[60258.153043] stack backtrace:
[60258.157413] CPU: 1 PID: 70 Comm: kworker/u5:0 Tainted: G           O     4.15.0-RD1812-BSP 150balbes#1
[60258.165945] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[60258.172485] Workqueue: hci0 hci_rx_work
[60258.176331] Backtrace:
[60258.178797] [<8010c9fc>] (dump_backtrace) from [<8010ccbc>] (show_stack+0x18/0x1c)
[60258.186379]  r7:80e55fe4 r6:80e55fe4 r5:20050093 r4:00000000
[60258.192058] [<8010cca4>] (show_stack) from [<809864e8>] (dump_stack+0xb0/0xdc)
[60258.199301] [<80986438>] (dump_stack) from [<8016ecc8>] (__lock_acquire+0xffc/0x11d4)
[60258.207144]  r9:5e2bb019 r8:630f974c r7:ba8a5940 r6:ba8a5ed8 r5:815b5220 r4:80fa081c
[60258.214901] [<8016dccc>] (__lock_acquire) from [<8016f620>] (lock_acquire+0x78/0x98)
[60258.222655]  r10:00000040 r9:00000040 r8:808729f0 r7:00000001 r6:00000000 r5:60050013
[60258.230491]  r4:00000000
[60258.233045] [<8016f5a8>] (lock_acquire) from [<806ee974>] (lock_sock_nested+0x64/0x88)
[60258.240970]  r7:00000000 r6:b796e870 r5:00000001 r4:b796e800
[60258.246643] [<806ee910>] (lock_sock_nested) from [<808729f0>] (bt_accept_enqueue+0x3c/0x74)
[60258.255004]  r8:00000001 r7:ba7d3c00 r6:ba7d3ea4 r5:ba7d2000 r4:b796e800
[60258.261717] [<808729b4>] (bt_accept_enqueue) from [<808aa39c>] (l2cap_sock_new_connection_cb+0x68/0x88)
[60258.271117]  r5:b796e800 r4:ba7d2000
[60258.274708] [<808aa334>] (l2cap_sock_new_connection_cb) from [<808a294c>] (l2cap_connect+0x190/0x4f8)
[60258.283933]  r5:00000001 r4:ba6dce00
[60258.287524] [<808a27bc>] (l2cap_connect) from [<808a4a14>] (l2cap_recv_frame+0x744/0x2cf8)
[60258.295800]  r10:ba6dcf24 r9:00000004 r8:b78d8014 r7:00000004 r6:bb05d000 r5:00000004
[60258.303635]  r4:bb05d008
[60258.306183] [<808a42d0>] (l2cap_recv_frame) from [<808a7808>] (l2cap_recv_acldata+0x210/0x214)
[60258.314805]  r10:b78e7800 r9:bb05d960 r8:00000001 r7:bb05d000 r6:0000000c r5:b7957a80
[60258.322641]  r4:ba6dce00
[60258.325188] [<808a75f8>] (l2cap_recv_acldata) from [<8087630c>] (hci_rx_work+0x35c/0x4e8)
[60258.333374]  r6:80e5743c r5:bb05d7c8 r4:b7957a80
[60258.338004] [<80875fb0>] (hci_rx_work) from [<8013dc7c>] (process_one_work+0x1a4/0x4fc)
[60258.346018]  r10:00000001 r9:00000000 r8:baabfef8 r7:ba997500 r6:baaba800 r5:baaa5d00
[60258.353853]  r4:bb05d7c8
[60258.356401] [<8013dad8>] (process_one_work) from [<8013e028>] (worker_thread+0x54/0x5cc)
[60258.364503]  r10:baabe038 r9:baaba834 r8:80e05900 r7:00000088 r6:baaa5d18 r5:baaba800
[60258.372338]  r4:baaa5d00
[60258.374888] [<8013dfd4>] (worker_thread) from [<801448f8>] (kthread+0x134/0x160)
[60258.382295]  r10:ba8310b8 r9:bb07dbfc r8:8013dfd4 r7:baaa5d00 r6:00000000 r5:baaa8ac0
[60258.390130]  r4:ba831080
[60258.392682] [<801447c4>] (kthread) from [<801080b4>] (ret_from_fork+0x14/0x20)
[60258.399915]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:801447c4
[60258.407751]  r4:baaa8ac0 r3:baabe000

Signed-off-by: Philipp Puschmann <pp@emlix.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
[ Upstream commit 018349d ]

When netvsc device is removed it can call reschedule in RCU context.
This happens because canceling the subchannel setup work could (in theory)
cause a reschedule when manipulating the timer.

To reproduce, run with lockdep enabled kernel and unbind
a network device from hv_netvsc (via sysfs).

[  160.682011] WARNING: suspicious RCU usage
[  160.707466] 4.19.0-rc3-uio+ 150balbes#2 Not tainted
[  160.709937] -----------------------------
[  160.712352] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section!
[  160.723691]
[  160.723691] other info that might help us debug this:
[  160.723691]
[  160.730955]
[  160.730955] rcu_scheduler_active = 2, debug_locks = 1
[  160.762813] 5 locks held by rebind-eth.sh/1812:
[  160.766851]  #0: 000000008befa37a (sb_writers#6){.+.+}, at: vfs_write+0x184/0x1b0
[  160.773416]  150balbes#1: 00000000b097f236 (&of->mutex){+.+.}, at: kernfs_fop_write+0xe2/0x1a0
[  160.783766]  150balbes#2: 0000000041ee6889 (kn->count#3){++++}, at: kernfs_fop_write+0xeb/0x1a0
[  160.787465]  150balbes#3: 0000000056d92a74 (&dev->mutex){....}, at: device_release_driver_internal+0x39/0x250
[  160.816987]  150balbes#4: 0000000030f6031e (rcu_read_lock){....}, at: netvsc_remove+0x1e/0x250 [hv_netvsc]
[  160.828629]
[  160.828629] stack backtrace:
[  160.831966] CPU: 1 PID: 1812 Comm: rebind-eth.sh Not tainted 4.19.0-rc3-uio+ 150balbes#2
[  160.832952] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[  160.832952] Call Trace:
[  160.832952]  dump_stack+0x85/0xcb
[  160.832952]  ___might_sleep+0x1a3/0x240
[  160.832952]  __flush_work+0x57/0x2e0
[  160.832952]  ? __mutex_lock+0x83/0x990
[  160.832952]  ? __kernfs_remove+0x24f/0x2e0
[  160.832952]  ? __kernfs_remove+0x1b2/0x2e0
[  160.832952]  ? mark_held_locks+0x50/0x80
[  160.832952]  ? get_work_pool+0x90/0x90
[  160.832952]  __cancel_work_timer+0x13c/0x1e0
[  160.832952]  ? netvsc_remove+0x1e/0x250 [hv_netvsc]
[  160.832952]  ? __lock_is_held+0x55/0x90
[  160.832952]  netvsc_remove+0x9a/0x250 [hv_netvsc]
[  160.832952]  vmbus_remove+0x26/0x30
[  160.832952]  device_release_driver_internal+0x18a/0x250
[  160.832952]  unbind_store+0xb4/0x180
[  160.832952]  kernfs_fop_write+0x113/0x1a0
[  160.832952]  __vfs_write+0x36/0x1a0
[  160.832952]  ? rcu_read_lock_sched_held+0x6b/0x80
[  160.832952]  ? rcu_sync_lockdep_assert+0x2e/0x60
[  160.832952]  ? __sb_start_write+0x141/0x1a0
[  160.832952]  ? vfs_write+0x184/0x1b0
[  160.832952]  vfs_write+0xbe/0x1b0
[  160.832952]  ksys_write+0x55/0xc0
[  160.832952]  do_syscall_64+0x60/0x1b0
[  160.832952]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  160.832952] RIP: 0033:0x7fe48f4c8154

Resolve this by getting RTNL earlier. This is safe because the subchannel
work queue does trylock on RTNL and will detect the race.

Fixes: 7b2ee50 ("hv_netvsc: common detach logic")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit 889c695 upstream.

Deferred struct page init is needed only on systems with large amount of
physical memory to improve boot performance.  32-bit systems do not
benefit from this feature.

Jiri reported a problem where deferred struct pages do not work well with
x86-32:

[    0.035162] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.035725] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.036269] Initializing CPU#0
[    0.036513] Initializing HighMem for node 0 (00036ffe:0007ffe0)
[    0.038459] page:f6780000 is uninitialized and poisoned
[    0.038460] raw: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
[    0.039509] page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
[    0.040038] ------------[ cut here ]------------
[    0.040399] kernel BUG at include/linux/page-flags.h:293!
[    0.040823] invalid opcode: 0000 [150balbes#1] SMP PTI
[    0.041166] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc1_pt_jiri 150balbes#9
[    0.041694] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[    0.042496] EIP: free_highmem_page+0x64/0x80
[    0.042839] Code: 13 46 d8 c1 e8 18 5d 83 e0 03 8d 04 c0 c1 e0 06 ff 80 ec 5f 44 d8 c3 8d b4 26 00 00 00 00 ba 08 65 28 d8 89 d8 e8 fc 71 02 00 <0f> 0b 8d 76 00 8d bc 27 00 00 00 00 ba d0 b1 26 d8 89 d8 e8 e4 71
[    0.044338] EAX: 0000003c EBX: f6780000 ECX: 00000000 EDX: d856cbe8
[    0.044868] ESI: 0007ffe0 EDI: d838df20 EBP: d838df00 ESP: d838defc
[    0.045372] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210086
[    0.045913] CR0: 80050033 CR2: 00000000 CR3: 18556000 CR4: 00040690
[    0.046413] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    0.046913] DR6: fffe0ff0 DR7: 00000400
[    0.047220] Call Trace:
[    0.047419]  add_highpages_with_active_regions+0xbd/0x10d
[    0.047854]  set_highmem_pages_init+0x5b/0x71
[    0.048202]  mem_init+0x2b/0x1e8
[    0.048460]  start_kernel+0x1d2/0x425
[    0.048757]  i386_start_kernel+0x93/0x97
[    0.049073]  startup_32_smp+0x164/0x168
[    0.049379] Modules linked in:
[    0.049626] ---[ end trace 337949378db0abbb ]---

We free highmem pages before their struct pages are initialized:

mem_init()
 set_highmem_pages_init()
  add_highpages_with_active_regions()
   free_highmem_page()
    .. Access uninitialized struct page here..

Because there is no reason to have this feature on 32-bit systems, just
disable it.

Link: http://lkml.kernel.org/r/20180831150506.31246-1-pavel.tatashin@microsoft.com
Fixes: 2e3ca40 ("mm: relax deferred struct page requirements")
Signed-off-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit b45d71f upstream.

Directories and inodes don't necessarily need to be in the same lockdep
class.  For ex, hugetlbfs splits them out too to prevent false positives
in lockdep.  Annotate correctly after new inode creation.  If its a
directory inode, it will be put into a different class.

This should fix a lockdep splat reported by syzbot:

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.18.0-rc8-next-20180810+ #36 Not tainted
> ------------------------------------------------------
> syz-executor900/4483 is trying to acquire lock:
> 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at: inode_lock
> include/linux/fs.h:765 [inline]
> 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at:
> shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
>
> but task is already holding lock:
> 0000000025208078 (ashmem_mutex){+.+.}, at: ashmem_shrink_scan+0xb4/0x630
> drivers/staging/android/ashmem.c:448
>
> which lock already depends on the new lock.
>
> -> 150balbes#2 (ashmem_mutex){+.+.}:
>        __mutex_lock_common kernel/locking/mutex.c:925 [inline]
>        __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
>        ashmem_mmap+0x55/0x520 drivers/staging/android/ashmem.c:361
>        call_mmap include/linux/fs.h:1844 [inline]
>        mmap_region+0xf27/0x1c50 mm/mmap.c:1762
>        do_mmap+0xa10/0x1220 mm/mmap.c:1535
>        do_mmap_pgoff include/linux/mm.h:2298 [inline]
>        vm_mmap_pgoff+0x213/0x2c0 mm/util.c:357
>        ksys_mmap_pgoff+0x4da/0x660 mm/mmap.c:1585
>        __do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>        __se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
>        __x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> -> 150balbes#1 (&mm->mmap_sem){++++}:
>        __might_fault+0x155/0x1e0 mm/memory.c:4568
>        _copy_to_user+0x30/0x110 lib/usercopy.c:25
>        copy_to_user include/linux/uaccess.h:155 [inline]
>        filldir+0x1ea/0x3a0 fs/readdir.c:196
>        dir_emit_dot include/linux/fs.h:3464 [inline]
>        dir_emit_dots include/linux/fs.h:3475 [inline]
>        dcache_readdir+0x13a/0x620 fs/libfs.c:193
>        iterate_dir+0x48b/0x5d0 fs/readdir.c:51
>        __do_sys_getdents fs/readdir.c:231 [inline]
>        __se_sys_getdents fs/readdir.c:212 [inline]
>        __x64_sys_getdents+0x29f/0x510 fs/readdir.c:212
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> -> #0 (&sb->s_type->i_mutex_key#9){++++}:
>        lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
>        down_write+0x8f/0x130 kernel/locking/rwsem.c:70
>        inode_lock include/linux/fs.h:765 [inline]
>        shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
>        ashmem_shrink_scan+0x236/0x630 drivers/staging/android/ashmem.c:455
>        ashmem_ioctl+0x3ae/0x13a0 drivers/staging/android/ashmem.c:797
>        vfs_ioctl fs/ioctl.c:46 [inline]
>        file_ioctl fs/ioctl.c:501 [inline]
>        do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:685
>        ksys_ioctl+0xa9/0xd0 fs/ioctl.c:702
>        __do_sys_ioctl fs/ioctl.c:709 [inline]
>        __se_sys_ioctl fs/ioctl.c:707 [inline]
>        __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:707
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> other info that might help us debug this:
>
> Chain exists of:
>   &sb->s_type->i_mutex_key#9 --> &mm->mmap_sem --> ashmem_mutex
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(ashmem_mutex);
>                                lock(&mm->mmap_sem);
>                                lock(ashmem_mutex);
>   lock(&sb->s_type->i_mutex_key#9);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor900/4483:
>  #0: 0000000025208078 (ashmem_mutex){+.+.}, at:
> ashmem_shrink_scan+0xb4/0x630 drivers/staging/android/ashmem.c:448

Link: http://lkml.kernel.org/r/20180821231835.166639-1-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Suggested-by: NeilBrown <neilb@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit 234b69e upstream.

While reading block, it is possible that io error return due to underlying
storage issue, in this case, BH_NeedsValidate was left in the buffer head.
Then when reading the very block next time, if it was already linked into
journal, that will trigger the following panic.

[203748.702517] kernel BUG at fs/ocfs2/buffer_head_io.c:342!
[203748.702533] invalid opcode: 0000 [150balbes#1] SMP
[203748.702561] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc dm_switch dm_queue_length dm_multipath bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i iw_cxgb4 cxgb4 cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas ipmi_ssif i2c_core ipmi_si ipmi_msghandler acpi_pad pcspkr sb_edac edac_core lpc_ich mfd_core shpchp sg tg3 ptp pps_core ext4 jbd2 mbcache2 sr_mod cdrom sd_mod ahci libahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod
[203748.703024] CPU: 7 PID: 38369 Comm: touch Not tainted 4.1.12-124.18.6.el6uek.x86_64 150balbes#2
[203748.703045] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 2.5.2 01/28/2015
[203748.703067] task: ffff880768139c00 ti: ffff88006ff48000 task.ti: ffff88006ff48000
[203748.703088] RIP: 0010:[<ffffffffa05e9f09>]  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
[203748.703130] RSP: 0018:ffff88006ff4b818  EFLAGS: 00010206
[203748.703389] RAX: 0000000008620029 RBX: ffff88006ff4b910 RCX: 0000000000000000
[203748.703885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000023079fe
[203748.704382] RBP: ffff88006ff4b8d8 R08: 0000000000000000 R09: ffff8807578c25b0
[203748.704877] R10: 000000000f637376 R11: 000000003030322e R12: 0000000000000000
[203748.705373] R13: ffff88006ff4b910 R14: ffff880732fe38f0 R15: 0000000000000000
[203748.705871] FS:  00007f401992c700(0000) GS:ffff880bfebc0000(0000) knlGS:0000000000000000
[203748.706370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[203748.706627] CR2: 00007f4019252440 CR3: 00000000a621e000 CR4: 0000000000060670
[203748.707124] Stack:
[203748.707371]  ffff88006ff4b828 ffffffffa0609f52 ffff88006ff4b838 0000000000000001
[203748.707885]  0000000000000000 0000000000000000 ffff880bf67c3800 ffffffffa05eca00
[203748.708399]  00000000023079ff ffffffff81c58b80 0000000000000000 0000000000000000
[203748.708915] Call Trace:
[203748.709175]  [<ffffffffa0609f52>] ? ocfs2_inode_cache_io_unlock+0x12/0x20 [ocfs2]
[203748.709680]  [<ffffffffa05eca00>] ? ocfs2_empty_dir_filldir+0x80/0x80 [ocfs2]
[203748.710185]  [<ffffffffa05ec0cb>] ocfs2_read_dir_block_direct+0x3b/0x200 [ocfs2]
[203748.710691]  [<ffffffffa05f0fbf>] ocfs2_prepare_dx_dir_for_insert.isra.57+0x19f/0xf60 [ocfs2]
[203748.711204]  [<ffffffffa065660f>] ? ocfs2_metadata_cache_io_unlock+0x1f/0x30 [ocfs2]
[203748.711716]  [<ffffffffa05f4f3a>] ocfs2_prepare_dir_for_insert+0x13a/0x890 [ocfs2]
[203748.712227]  [<ffffffffa05f442e>] ? ocfs2_check_dir_for_entry+0x8e/0x140 [ocfs2]
[203748.712737]  [<ffffffffa061b2f2>] ocfs2_mknod+0x4b2/0x1370 [ocfs2]
[203748.713003]  [<ffffffffa061c385>] ocfs2_create+0x65/0x170 [ocfs2]
[203748.713263]  [<ffffffff8121714b>] vfs_create+0xdb/0x150
[203748.713518]  [<ffffffff8121b225>] do_last+0x815/0x1210
[203748.713772]  [<ffffffff812192e9>] ? path_init+0xb9/0x450
[203748.714123]  [<ffffffff8121bca0>] path_openat+0x80/0x600
[203748.714378]  [<ffffffff811bcd45>] ? handle_pte_fault+0xd15/0x1620
[203748.714634]  [<ffffffff8121d7ba>] do_filp_open+0x3a/0xb0
[203748.714888]  [<ffffffff8122a767>] ? __alloc_fd+0xa7/0x130
[203748.715143]  [<ffffffff81209ffc>] do_sys_open+0x12c/0x220
[203748.715403]  [<ffffffff81026ddb>] ? syscall_trace_enter_phase1+0x11b/0x180
[203748.715668]  [<ffffffff816f0c9f>] ? system_call_after_swapgs+0xe9/0x190
[203748.715928]  [<ffffffff8120a10e>] SyS_open+0x1e/0x20
[203748.716184]  [<ffffffff816f0d5e>] system_call_fastpath+0x18/0xd7
[203748.716440] Code: 00 00 48 8b 7b 08 48 83 c3 10 45 89 f8 44 89 e1 44 89 f2 4c 89 ee e8 07 06 11 e1 48 8b 03 48 85 c0 75 df 8b 5d c8 e9 4d fa ff ff <0f> 0b 48 8b 7d a0 e8 dc c6 06 00 48 b8 00 00 00 00 00 00 00 10
[203748.717505] RIP  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
[203748.717775]  RSP <ffff88006ff4b818>

Joesph ever reported a similar panic.
Link: https://oss.oracle.com/pipermail/ocfs2-devel/2013-May/008931.html

Link: http://lkml.kernel.org/r/20180912063207.29484-1-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit 3e1a127 upstream.

When we disable hotplugging on the GPU, we need to be able to
synchronize with each connector's hotplug interrupt handler before the
interrupt is finally disabled. This can be a problem however, since
nouveau_connector_detect() currently grabs a runtime power reference
when handling connector probing. This will deadlock the runtime suspend
handler like so:

[  861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
[  861.483290]       Tainted: G           O      4.18.0-rc6Lyude-Test+ 150balbes#1
[  861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.486332] kworker/0:2     D    0    61      2 0x80000000
[  861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
[  861.487737] Call Trace:
[  861.488394]  __schedule+0x322/0xaf0
[  861.489070]  schedule+0x33/0x90
[  861.489744]  rpm_resume+0x19c/0x850
[  861.490392]  ? finish_wait+0x90/0x90
[  861.491068]  __pm_runtime_resume+0x4e/0x90
[  861.491753]  nouveau_display_hpd_work+0x22/0x60 [nouveau]
[  861.492416]  process_one_work+0x231/0x620
[  861.493068]  worker_thread+0x44/0x3a0
[  861.493722]  kthread+0x12b/0x150
[  861.494342]  ? wq_pool_ids_show+0x140/0x140
[  861.494991]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.495648]  ret_from_fork+0x3a/0x50
[  861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
[  861.496968]       Tainted: G           O      4.18.0-rc6Lyude-Test+ 150balbes#1
[  861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.498341] kworker/6:2     D    0   320      2 0x80000080
[  861.499045] Workqueue: pm pm_runtime_work
[  861.499739] Call Trace:
[  861.500428]  __schedule+0x322/0xaf0
[  861.501134]  ? wait_for_completion+0x104/0x190
[  861.501851]  schedule+0x33/0x90
[  861.502564]  schedule_timeout+0x3a5/0x590
[  861.503284]  ? mark_held_locks+0x58/0x80
[  861.503988]  ? _raw_spin_unlock_irq+0x2c/0x40
[  861.504710]  ? wait_for_completion+0x104/0x190
[  861.505417]  ? trace_hardirqs_on_caller+0xf4/0x190
[  861.506136]  ? wait_for_completion+0x104/0x190
[  861.506845]  wait_for_completion+0x12c/0x190
[  861.507555]  ? wake_up_q+0x80/0x80
[  861.508268]  flush_work+0x1c9/0x280
[  861.508990]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[  861.509735]  nvif_notify_put+0xb1/0xc0 [nouveau]
[  861.510482]  nouveau_display_fini+0xbd/0x170 [nouveau]
[  861.511241]  nouveau_display_suspend+0x67/0x120 [nouveau]
[  861.511969]  nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[  861.512715]  nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
[  861.513435]  pci_pm_runtime_suspend+0x6b/0x180
[  861.514165]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.514897]  __rpm_callback+0x7a/0x1d0
[  861.515618]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.516313]  rpm_callback+0x24/0x80
[  861.517027]  ? pci_has_legacy_pm_support+0x70/0x70
[  861.517741]  rpm_suspend+0x142/0x6b0
[  861.518449]  pm_runtime_work+0x97/0xc0
[  861.519144]  process_one_work+0x231/0x620
[  861.519831]  worker_thread+0x44/0x3a0
[  861.520522]  kthread+0x12b/0x150
[  861.521220]  ? wq_pool_ids_show+0x140/0x140
[  861.521925]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.522622]  ret_from_fork+0x3a/0x50
[  861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
[  861.523977]       Tainted: G           O      4.18.0-rc6Lyude-Test+ 150balbes#1
[  861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  861.525349] kworker/6:0     D    0  1329      2 0x80000000
[  861.526073] Workqueue: events nvif_notify_work [nouveau]
[  861.526751] Call Trace:
[  861.527411]  __schedule+0x322/0xaf0
[  861.528089]  schedule+0x33/0x90
[  861.528758]  rpm_resume+0x19c/0x850
[  861.529399]  ? finish_wait+0x90/0x90
[  861.530073]  __pm_runtime_resume+0x4e/0x90
[  861.530798]  nouveau_connector_detect+0x7e/0x510 [nouveau]
[  861.531459]  ? ww_mutex_lock+0x47/0x80
[  861.532097]  ? ww_mutex_lock+0x47/0x80
[  861.532819]  ? drm_modeset_lock+0x88/0x130 [drm]
[  861.533481]  drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
[  861.534127]  drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
[  861.534940]  nouveau_connector_hotplug+0x98/0x120 [nouveau]
[  861.535556]  nvif_notify_work+0x2d/0xb0 [nouveau]
[  861.536221]  process_one_work+0x231/0x620
[  861.536994]  worker_thread+0x44/0x3a0
[  861.537757]  kthread+0x12b/0x150
[  861.538463]  ? wq_pool_ids_show+0x140/0x140
[  861.539102]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.539815]  ret_from_fork+0x3a/0x50
[  861.540521]
               Showing all locks held in the system:
[  861.541696] 2 locks held by kworker/0:2/61:
[  861.542406]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543071]  150balbes#1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.543814] 1 lock held by khungtaskd/64:
[  861.544535]  #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[  861.545160] 3 locks held by kworker/6:2/320:
[  861.545896]  #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.546702]  150balbes#1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[  861.547443]  150balbes#2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
[  861.548146] 1 lock held by dmesg/983:
[  861.548889] 2 locks held by zsh/1250:
[  861.549605]  #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[  861.550393]  150balbes#1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[  861.551122] 6 locks held by kworker/6:0/1329:
[  861.551957]  #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  861.552765]  150balbes#1: 00000000ddb499ad ((work_completion)(&notify->work)150balbes#2){+.+.}, at: process_one_work+0x1b3/0x620
[  861.553582]  150balbes#2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
[  861.554357]  150balbes#3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
[  861.555227]  150balbes#4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
[  861.556133]  150balbes#5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]

[  861.557864] =============================================

[  861.559507] NMI backtrace for cpu 2
[  861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G           O      4.18.0-rc6Lyude-Test+ 150balbes#1
[  861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[  861.561948] Call Trace:
[  861.562757]  dump_stack+0x8e/0xd3
[  861.563516]  nmi_cpu_backtrace.cold.3+0x14/0x5a
[  861.564269]  ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[  861.565029]  nmi_trigger_cpumask_backtrace+0xa1/0xae
[  861.565789]  arch_trigger_cpumask_backtrace+0x19/0x20
[  861.566558]  watchdog+0x316/0x580
[  861.567355]  kthread+0x12b/0x150
[  861.568114]  ? reset_hung_task_detector+0x20/0x20
[  861.568863]  ? kthread_create_worker_on_cpu+0x70/0x70
[  861.569598]  ret_from_fork+0x3a/0x50
[  861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
[  861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[  861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[  861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[  861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[  861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[  861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[  861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
[  861.572428] Kernel panic - not syncing: hung_task: blocked tasks

So: fix this by making it so that normal hotplug handling /only/ happens
so long as the GPU is currently awake without any pending runtime PM
requests. In the event that a hotplug occurs while the device is
suspending or resuming, we can simply defer our response until the GPU
is fully runtime resumed again.

Changes since v4:
- Use a new trick I came up with using pm_runtime_get() instead of the
  hackish junk we had before

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
…equests

commit 7fec8f5 upstream.

Currently, nouveau uses the generic drm_fb_helper_output_poll_changed()
function provided by DRM as it's output_poll_changed callback.
Unfortunately however, this function doesn't grab runtime PM references
early enough and even if it did-we can't block waiting for the device to
resume in output_poll_changed() since it's very likely that we'll need
to grab the fb_helper lock at some point during the runtime resume
process. This currently results in deadlocking like so:

[  246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[  246.673398]       Not tainted 4.18.0-rc5Lyude-Test+ 150balbes#2
[  246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.676527] kworker/4:0     D    0    37      2 0x80000000
[  246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[  246.678704] Call Trace:
[  246.679753]  __schedule+0x322/0xaf0
[  246.680916]  schedule+0x33/0x90
[  246.681924]  schedule_preempt_disabled+0x15/0x20
[  246.683023]  __mutex_lock+0x569/0x9a0
[  246.684035]  ? kobject_uevent_env+0x117/0x7b0
[  246.685132]  ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.686179]  mutex_lock_nested+0x1b/0x20
[  246.687278]  ? mutex_lock_nested+0x1b/0x20
[  246.688307]  drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.689420]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[  246.690462]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[  246.691570]  output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[  246.692611]  process_one_work+0x231/0x620
[  246.693725]  worker_thread+0x214/0x3a0
[  246.694756]  kthread+0x12b/0x150
[  246.695856]  ? wq_pool_ids_show+0x140/0x140
[  246.696888]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.697998]  ret_from_fork+0x3a/0x50
[  246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[  246.700153]       Not tainted 4.18.0-rc5Lyude-Test+ 150balbes#2
[  246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.702278] kworker/0:1     D    0    60      2 0x80000000
[  246.703293] Workqueue: pm pm_runtime_work
[  246.704393] Call Trace:
[  246.705403]  __schedule+0x322/0xaf0
[  246.706439]  ? wait_for_completion+0x104/0x190
[  246.707393]  schedule+0x33/0x90
[  246.708375]  schedule_timeout+0x3a5/0x590
[  246.709289]  ? mark_held_locks+0x58/0x80
[  246.710208]  ? _raw_spin_unlock_irq+0x2c/0x40
[  246.711222]  ? wait_for_completion+0x104/0x190
[  246.712134]  ? trace_hardirqs_on_caller+0xf4/0x190
[  246.713094]  ? wait_for_completion+0x104/0x190
[  246.713964]  wait_for_completion+0x12c/0x190
[  246.714895]  ? wake_up_q+0x80/0x80
[  246.715727]  ? get_work_pool+0x90/0x90
[  246.716649]  flush_work+0x1c9/0x280
[  246.717483]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[  246.718442]  __cancel_work_timer+0x146/0x1d0
[  246.719247]  cancel_delayed_work_sync+0x13/0x20
[  246.720043]  drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[  246.721123]  nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[  246.721897]  pci_pm_runtime_suspend+0x6b/0x190
[  246.722825]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.723737]  __rpm_callback+0x7a/0x1d0
[  246.724721]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.725607]  rpm_callback+0x24/0x80
[  246.726553]  ? pci_has_legacy_pm_support+0x70/0x70
[  246.727376]  rpm_suspend+0x142/0x6b0
[  246.728185]  pm_runtime_work+0x97/0xc0
[  246.728938]  process_one_work+0x231/0x620
[  246.729796]  worker_thread+0x44/0x3a0
[  246.730614]  kthread+0x12b/0x150
[  246.731395]  ? wq_pool_ids_show+0x140/0x140
[  246.732202]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.732878]  ret_from_fork+0x3a/0x50
[  246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[  246.734587]       Not tainted 4.18.0-rc5Lyude-Test+ 150balbes#2
[  246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.736113] kworker/4:2     D    0   422      2 0x80000080
[  246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[  246.737665] Call Trace:
[  246.738490]  __schedule+0x322/0xaf0
[  246.739250]  schedule+0x33/0x90
[  246.739908]  rpm_resume+0x19c/0x850
[  246.740750]  ? finish_wait+0x90/0x90
[  246.741541]  __pm_runtime_resume+0x4e/0x90
[  246.742370]  nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[  246.743124]  drm_atomic_commit+0x4a/0x50 [drm]
[  246.743775]  restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[  246.744603]  restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[  246.745373]  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[  246.746220]  drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[  246.746884]  drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[  246.747675]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[  246.748544]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[  246.749439]  nv50_mstm_hotplug+0x15/0x20 [nouveau]
[  246.750111]  drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[  246.750764]  drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[  246.751602]  drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[  246.752314]  process_one_work+0x231/0x620
[  246.752979]  worker_thread+0x44/0x3a0
[  246.753838]  kthread+0x12b/0x150
[  246.754619]  ? wq_pool_ids_show+0x140/0x140
[  246.755386]  ? kthread_create_worker_on_cpu+0x70/0x70
[  246.756162]  ret_from_fork+0x3a/0x50
[  246.756847]
           Showing all locks held in the system:
[  246.758261] 3 locks held by kworker/4:0/37:
[  246.759016]  #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.759856]  150balbes#1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.760670]  150balbes#2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[  246.761516] 2 locks held by kworker/0:1/60:
[  246.762274]  #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.762982]  150balbes#1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.763890] 1 lock held by khungtaskd/64:
[  246.764664]  #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[  246.765588] 5 locks held by kworker/4:2/422:
[  246.766440]  #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[  246.767390]  150balbes#1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[  246.768154]  150balbes#2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[  246.768966]  150balbes#3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[  246.769921]  150balbes#4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[  246.770839] 1 lock held by dmesg/1038:
[  246.771739] 2 locks held by zsh/1172:
[  246.772650]  #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[  246.773680]  150balbes#1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870

[  246.775522] =============================================

After trying dozens of different solutions, I found one very simple one
that should also have the benefit of preventing us from having to fight
locking for the rest of our lives. So, we work around these deadlocks by
deferring all fbcon hotplug events that happen after the runtime suspend
process starts until after the device is resumed again.

Changes since v7:
 - Fixup commit message - Daniel Vetter

Changes since v6:
 - Remove unused nouveau_fbcon_hotplugged_in_suspend() - Ilia

Changes since v5:
 - Come up with the (hopefully final) solution for solving this dumb
   problem, one that is a lot less likely to cause issues with locking in
   the future. This should work around all deadlock conditions with fbcon
   brought up thus far.

Changes since v4:
 - Add nouveau_fbcon_hotplugged_in_suspend() to workaround deadlock
   condition that Lukas described
 - Just move all of this out of drm_fb_helper. It seems that other DRM
   drivers have already figured out other workarounds for this. If other
   drivers do end up needing this in the future, we can just move this
   back into drm_fb_helper again.

Changes since v3:
- Actually check if fb_helper is NULL in both new helpers
- Actually check drm_fbdev_emulation in both new helpers
- Don't fire off a fb_helper hotplug unconditionally; only do it if
  the following conditions are true (as otherwise, calling this in the
  wrong spot will cause Bad Things to happen):
  - fb_helper hotplug handling was actually inhibited previously
  - fb_helper actually has a delayed hotplug pending
  - fb_helper is actually bound
  - fb_helper is actually initialized
- Add __must_check to drm_fb_helper_suspend_hotplug(). There's no
  situation where a driver would actually want to use this without
  checking the return value, so enforce that
- Rewrite and clarify the documentation for both helpers.
- Make sure to return true in the drm_fb_helper_suspend_hotplug() stub
  that's provided in drm_fb_helper.h when CONFIG_DRM_FBDEV_EMULATION
  isn't enabled
- Actually grab the toplevel fb_helper lock in
  drm_fb_helper_resume_hotplug(), since it's possible other activity
  (such as a hotplug) could be going on at the same time the driver
  calls drm_fb_helper_resume_hotplug(). We need this to check whether or
  not drm_fb_helper_hotplug_event() needs to be called anyway

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Oct 10, 2018
commit 79e765a upstream.

On most systems with ACPI hotplugging support, it seems that we always
receive a hotplug event once we re-enable EC interrupts even if the GPU
hasn't even been resumed yet.

This can cause problems since even though we schedule hpd_work to handle
connector reprobing for us, hpd_work synchronizes on
pm_runtime_get_sync() to wait until the device is ready to perform
reprobing. Since runtime suspend/resume callbacks are disabled before
the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
this period will grab a runtime PM ref and return immediately with
-EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
a connector reprobe immediately even if the GPU isn't actually resumed
just yet. This causes various warnings in dmesg and occasionally, also
prevents some displays connected to the dedicated GPU from coming back
up after suspend. Example:

usb 1-4: USB disconnect, device number 14
usb 1-4.1: USB disconnect, device number 15
WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 150balbes#1
Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Workqueue: events nouveau_display_hpd_work [nouveau]
RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? _cond_resched+0x15/0x30
 nouveau_connector_detect+0x2ce/0x520 [nouveau]
 ? _cond_resched+0x15/0x30
 ? ww_mutex_lock+0x12/0x40
 drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
 drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
 nouveau_display_hpd_work+0x2a/0x60 [nouveau]
 process_one_work+0x187/0x340
 worker_thread+0x2e/0x380
 ? pwq_unbound_release_workfn+0xd0/0xd0
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x35/0x40
Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
---[ end trace 55d811b38fc8e71a ]---

So, to fix this we attempt to grab a runtime PM reference in the ACPI
handler itself asynchronously. If the GPU is already awake (it will have
normal hotplugging at this point) or runtime PM callbacks are currently
disabled on the device, we drop our reference without updating the
autosuspend delay. We only schedule connector reprobes when we
successfully managed to queue up a resume request with our asynchronous
PM ref.

This also has the added benefit of preventing redundant connector
reprobes from ACPI while the GPU is runtime resumed!

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <kherbst@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 1, 2019
[ Upstream commit e68599a ]

Mike Galbraith reported a regression caused by the commit 9b6f7e1
("mm: rework memcg kernel stack accounting") on a system with
"cgroup_disable=memory" boot option: the system panics with the following
stack trace:

  BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
  PGD 0 P4D 0
  Oops: 0002 [150balbes#1] PREEMPT SMP PTI
  CPU: 0 PID: 1 Comm: systemd Not tainted 4.19.0-preempt+ #410
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fed4
  RIP: 0010:page_counter_try_charge+0x22/0xc0
  Code: 41 5d c3 c3 0f 1f 40 00 0f 1f 44 00 00 48 85 ff 0f 84 a7 00 00 00 41 56 48 89 f8 49 89 fe 49
  Call Trace:
   try_charge+0xcb/0x780
   memcg_kmem_charge_memcg+0x28/0x80
   memcg_kmem_charge+0x8b/0x1d0
   copy_process.part.41+0x1ca/0x2070
   _do_fork+0xd7/0x3d0
   do_syscall_64+0x5a/0x180
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

The problem occurs because get_mem_cgroup_from_current() returns the NULL
pointer if memory controller is disabled.  Let's check if this is a case
at the beginning of memcg_kmem_charge() and just return 0 if
mem_cgroup_disabled() returns true.  This is how we handle this case in
many other places in the memory controller code.

Link: http://lkml.kernel.org/r/20181029215123.17830-1-guro@fb.com
Fixes: 9b6f7e1 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 1, 2019
[ Upstream commit 751a0cc ]

When a PCIe NVMe device is not present, nvme_dev_remove_admin() calls
blk_cleanup_queue() on the admin queue, which frees the hctx for that
queue.  Moments later, on the same path nvme_kill_queues() calls
blk_mq_unquiesce_queue() on admin queue and tries to access hctx of it,
which leads to following OOPS:

Oops: 0000 [150balbes#1] SMP PTI
RIP: 0010:sbitmap_any_bit_set+0xb/0x40
Call Trace:
 blk_mq_run_hw_queue+0xd5/0x150
 blk_mq_run_hw_queues+0x3a/0x50
 nvme_kill_queues+0x26/0x50
 nvme_remove_namespaces+0xb2/0xc0
 nvme_remove+0x60/0x140
 pci_device_remove+0x3b/0xb0

Fixes: cb4bfda ("nvme-pci: fix hot removal during error handling")
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 1, 2019
commit f3f5ba42c58d56d50f539854d8cc188944e96087 upstream.

The touch timer is set up in intf1. If the second interface does not exist,
the timer and touch input device are not setup and we get the following
error, when touch events are reported via intf0.

kernel BUG at kernel/time/timer.c:956!
invalid opcode: 0000 [150balbes#1] SMP KASAN
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__mod_timer kernel/time/timer.c:956 [inline]
RIP: 0010:__mod_timer kernel/time/timer.c:949 [inline]
RIP: 0010:mod_timer+0x5a2/0xb50 kernel/time/timer.c:1100
Code: 45 10 c7 44 24 14 ff ff ff ff 48 89 44 24 08 48 8d 45 20 48 c7 44 24 18 00 00 00 00 48 89 04 24 e9 5a fc ff ff e8 ae ce 0e 00 <0f> 0b e8 a7 ce 0e 00 4c 89 74 24 20 e9 37 fe ff ff e8 98 ce 0e 00
RSP: 0018:ffff8881db209930 EFLAGS: 00010006
RAX: ffffffff86c2b200 RBX: 00000000ffffa688 RCX: ffffffff83efc583
RDX: 0000000000000100 RSI: ffffffff812f4d82 RDI: ffff8881d2356200
RBP: ffff8881d23561e8 R08: ffffffff86c2b200 R09: ffffed103a46abeb
R10: ffffed103a46abea R11: ffff8881d2355f53 R12: dffffc0000000000
R13: 1ffff1103b64132d R14: ffff8881d2355f50 R15: 0000000000000006
FS:  0000000000000000(0000) GS:ffff8881db200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f75e2799000 CR3: 00000001d3b07000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 imon_touch_event drivers/media/rc/imon.c:1348 [inline]
 imon_incoming_packet.isra.0+0x2546/0x2f10 drivers/media/rc/imon.c:1603
 usb_rx_callback_intf0+0x151/0x1e0 drivers/media/rc/imon.c:1734
 __usb_hcd_giveback_urb+0x1f2/0x470 drivers/usb/core/hcd.c:1654
 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1719
 dummy_timer+0x120f/0x2fa2 drivers/usb/gadget/udc/dummy_hcd.c:1965
 call_timer_fn+0x179/0x650 kernel/time/timer.c:1404
 expire_timers kernel/time/timer.c:1449 [inline]
 __run_timers kernel/time/timer.c:1773 [inline]
 __run_timers kernel/time/timer.c:1740 [inline]
 run_timer_softirq+0x5e3/0x1490 kernel/time/timer.c:1786
 __do_softirq+0x221/0x912 kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0x178/0x1a0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x12f/0x500 arch/x86/kernel/apic/apic.c:1137
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
 </IRQ>
RIP: 0010:default_idle+0x28/0x2e0 arch/x86/kernel/process.c:581
Code: 90 90 41 56 41 55 65 44 8b 2d 44 3a 8f 7a 41 54 55 53 0f 1f 44 00 00 e8 36 ee d0 fb e9 07 00 00 00 0f 00 2d fa dd 4f 00 fb f4 <65> 44 8b 2d 20 3a 8f 7a 0f 1f 44 00 00 5b 5d 41 5c 41 5d 41 5e c3
RSP: 0018:ffffffff86c07da8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000007 RBX: ffffffff86c2b200 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffffff86c2ba4c
RBP: fffffbfff0d85640 R08: ffffffff86c2b200 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 cpuidle_idle_call kernel/sched/idle.c:154 [inline]
 do_idle+0x3b6/0x500 kernel/sched/idle.c:263
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:355
 start_kernel+0x82a/0x864 init/main.c:784
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
Modules linked in:

Reported-by: syzbot+f49d12d34f2321cf4df2@syzkaller.appspotmail.com
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 5, 2019
[ Upstream commit 41d49e7939de5ec532d86494185b2ca2e99c848a ]

When timer_of_init fails, it cleans up after itself by undoing
everything it did during the initialization function.

mtk_syst_init and mtk_gpt_init both call timer_of_cleanup if
timer_of_init fails. timer_of_cleanup try to release the resource
taken.  Since these resources have already been cleaned up by
timer_of_init, we end up getting a few warnings printed:

[    0.001935] WARNING: CPU: 0 PID: 0 at __clk_put+0xe8/0x128
[    0.002650] Modules linked in:
[    0.003058] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.67+ 150balbes#1
[    0.003852] Hardware name: MediaTek MT8183 (DT)
[    0.004446] pstate: 20400085 (nzCv daIf +PAN -UAO)
[    0.005073] pc : __clk_put+0xe8/0x128
[    0.005555] lr : clk_put+0xc/0x14
[    0.005988] sp : ffffff80090b3ea0
[    0.006422] x29: ffffff80090b3ea0 x28: 0000000040e20018
[    0.007121] x27: ffffffc07bfff780 x26: 0000000000000001
[    0.007819] x25: ffffff80090bda80 x24: ffffff8008ec5828
[    0.008517] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
[    0.009216] x21: 0000000000000001 x20: fffffffffffffdfb
[    0.009914] x19: ffffff8009166180 x18: 00000000002bffa8
[    0.010612] x17: ffffffc012996980 x16: 0000000000000000
[    0.011311] x15: ffffffbf004a6800 x14: 3536343038393334
[    0.012009] x13: 2079726576652073 x12: 7eb9c62c5c38f100
[    0.012707] x11: ffffff80090b3ba0 x10: ffffff80090b3ba0
[    0.013405] x9 : 0000000000000004 x8 : 0000000000000040
[    0.014103] x7 : ffffffc079400270 x6 : 0000000000000000
[    0.014801] x5 : ffffffc079400248 x4 : 0000000000000000
[    0.015499] x3 : 0000000000000000 x2 : 0000000000000000
[    0.016197] x1 : ffffff80091661c0 x0 : fffffffffffffdfb
[    0.016896] Call trace:
[    0.017218]  __clk_put+0xe8/0x128
[    0.017654]  clk_put+0xc/0x14
[    0.018048]  timer_of_cleanup+0x60/0x7c
[    0.018551]  mtk_syst_init+0x8c/0x9c
[    0.019020]  timer_probe+0x6c/0xe0
[    0.019469]  time_init+0x14/0x44
[    0.019893]  start_kernel+0x2d0/0x46c
[    0.020378] ---[ end trace 8c1efabea1267649 ]---
[    0.020982] ------------[ cut here ]------------
[    0.021586] Trying to vfree() nonexistent vm area ((____ptrval____))
[    0.022427] WARNING: CPU: 0 PID: 0 at __vunmap+0xd0/0xd8
[    0.023119] Modules linked in:
[    0.023524] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         4.19.67+ 150balbes#1
[    0.024498] Hardware name: MediaTek MT8183 (DT)
[    0.025091] pstate: 60400085 (nZCv daIf +PAN -UAO)
[    0.025718] pc : __vunmap+0xd0/0xd8
[    0.026176] lr : __vunmap+0xd0/0xd8
[    0.026632] sp : ffffff80090b3e90
[    0.027066] x29: ffffff80090b3e90 x28: 0000000040e20018
[    0.027764] x27: ffffffc07bfff780 x26: 0000000000000001
[    0.028462] x25: ffffff80090bda80 x24: ffffff8008ec5828
[    0.029160] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
[    0.029858] x21: 0000000000000000 x20: 0000000000000000
[    0.030556] x19: ffffff800800d000 x18: 00000000002bffa8
[    0.031254] x17: 0000000000000000 x16: 0000000000000000
[    0.031952] x15: ffffffbf004a6800 x14: 3536343038393334
[    0.032651] x13: 2079726576652073 x12: 7eb9c62c5c38f100
[    0.033349] x11: ffffff80090b3b40 x10: ffffff80090b3b40
[    0.034047] x9 : 0000000000000005 x8 : 5f5f6c6176727470
[    0.034745] x7 : 5f5f5f5f28282061 x6 : ffffff80091c86ef
[    0.035443] x5 : ffffff800852b690 x4 : 0000000000000000
[    0.036141] x3 : 0000000000000002 x2 : 0000000000000002
[    0.036839] x1 : 7eb9c62c5c38f100 x0 : 7eb9c62c5c38f100
[    0.037536] Call trace:
[    0.037859]  __vunmap+0xd0/0xd8
[    0.038271]  vunmap+0x24/0x30
[    0.038664]  __iounmap+0x2c/0x34
[    0.039088]  timer_of_cleanup+0x70/0x7c
[    0.039591]  mtk_syst_init+0x8c/0x9c
[    0.040060]  timer_probe+0x6c/0xe0
[    0.040507]  time_init+0x14/0x44
[    0.040932]  start_kernel+0x2d0/0x46c

This commit remove the calls to timer_of_cleanup when timer_of_init
fails since it is unnecessary and actually cause warnings to be printed.

Fixes: a0858f9 ("mediatek: Convert the driver to timer-of")
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/linux-arm-kernel/20190919191315.25190-1-fparent@baylibre.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 5, 2019
[ Upstream commit 7e5d0bf6afcc7bd72f78e7f33570e2e0945624f0 ]

Since commit a211b8c ("ARM: dts: imx6qdl-sabreauto: Add sensors")
a storm of accelerometer interrupts is seen:

[  114.211283] irq 260: nobody cared (try booting with the "irqpoll" option)
[  114.218108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.4 150balbes#1
[  114.223960] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  114.230531] [<c0112858>] (unwind_backtrace) from [<c010cdc8>] (show_stack+0x10/0x14)
[  114.238301] [<c010cdc8>] (show_stack) from [<c0c1aa1c>] (dump_stack+0xd8/0x110)
[  114.245644] [<c0c1aa1c>] (dump_stack) from [<c0193594>] (__report_bad_irq+0x30/0xc0)
[  114.253417] [<c0193594>] (__report_bad_irq) from [<c01933ac>] (note_interrupt+0x108/0x298)
[  114.261707] [<c01933ac>] (note_interrupt) from [<c018ffe4>] (handle_irq_event_percpu+0x70/0x80)
[  114.270433] [<c018ffe4>] (handle_irq_event_percpu) from [<c019002c>] (handle_irq_event+0x38/0x5c)
[  114.279326] [<c019002c>] (handle_irq_event) from [<c019438c>] (handle_level_irq+0xc8/0x154)
[  114.287701] [<c019438c>] (handle_level_irq) from [<c018eda0>] (generic_handle_irq+0x20/0x34)
[  114.296166] [<c018eda0>] (generic_handle_irq) from [<c0534214>] (mxc_gpio_irq_handler+0x30/0xf0)
[  114.304975] [<c0534214>] (mxc_gpio_irq_handler) from [<c0534334>] (mx3_gpio_irq_handler+0x60/0xb0)
[  114.313955] [<c0534334>] (mx3_gpio_irq_handler) from [<c018eda0>] (generic_handle_irq+0x20/0x34)
[  114.322762] [<c018eda0>] (generic_handle_irq) from [<c018f3ac>] (__handle_domain_irq+0x64/0xe0)
[  114.331485] [<c018f3ac>] (__handle_domain_irq) from [<c05215a8>] (gic_handle_irq+0x4c/0xa8)
[  114.339862] [<c05215a8>] (gic_handle_irq) from [<c0101a70>] (__irq_svc+0x70/0x98)
[  114.347361] Exception stack(0xc1301ec0 to 0xc1301f08)
[  114.352435] 1ec0: 00000001 00000006 00000000 c130c340 00000001 c130f688 9785636d c13ea2e8
[  114.360635] 1ee0: 9784907d 0000001a eaf99d78 0000001a 00000000 c1301f10 c0182b00 c0878de4
[  114.368830] 1f00: 20000013 ffffffff
[  114.372349] [<c0101a70>] (__irq_svc) from [<c0878de4>] (cpuidle_enter_state+0x168/0x5f4)
[  114.380464] [<c0878de4>] (cpuidle_enter_state) from [<c08792ac>] (cpuidle_enter+0x28/0x38)
[  114.388751] [<c08792ac>] (cpuidle_enter) from [<c015ef9c>] (do_idle+0x224/0x2a8)
[  114.396168] [<c015ef9c>] (do_idle) from [<c015f3b8>] (cpu_startup_entry+0x18/0x20)
[  114.403765] [<c015f3b8>] (cpu_startup_entry) from [<c1200e54>] (start_kernel+0x43c/0x500)
[  114.411958] handlers:
[  114.414302] [<a01028b8>] irq_default_primary_handler threaded [<fd7a3b08>] mma8452_interrupt
[  114.422974] Disabling IRQ #260

           CPU0       CPU1
....
260:     100001          0  gpio-mxc  31 Level     mma8451

The MMA8451 interrupt triggers as low level, so the GPIO6_IO31 pin
needs to activate its pull up, otherwise it will stay always at low level
generating multiple interrupts.

The current device tree does not configure the IOMUX for this pin, so
it uses whathever comes configured from the bootloader.

The IOMUXC_SW_PAD_CTL_PAD_EIM_BCLK register value comes as 0x8000 from
the bootloader, which has PKE bit cleared, hence disabling the
pull-up.

Instead of relying on a previous configuration from the bootloader,
configure the GPIO6_IO31 pin with pull-up enabled in order to fix
this problem.

Fixes: a211b8c ("ARM: dts: imx6qdl-sabreauto: Add sensors")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-By: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 5, 2019
[ Upstream commit b13bc35193d9e7a8c050a24928ca5c9e7c9a009b ]

On an arm64 ThunderX2 server, the first kmemleak scan would crash [1]
with CONFIG_DEBUG_VM_PGFLAGS=y due to page_to_nid() found a pfn that is
not directly mapped (MEMBLOCK_NOMAP).  Hence, the page->flags is
uninitialized.

This is due to the commit 9f1eb38e0e11 ("mm, kmemleak: little
optimization while scanning") starts to use pfn_to_online_page() instead
of pfn_valid().  However, in the CONFIG_MEMORY_HOTPLUG=y case,
pfn_to_online_page() does not call memblock_is_map_memory() while
pfn_valid() does.

Historically, the commit 68709f4 ("arm64: only consider memblocks
with NOMAP cleared for linear mapping") causes pages marked as nomap
being no long reassigned to the new zone in memmap_init_zone() by
calling __init_single_page().

Since the commit 2d070ea ("mm: consider zone which is not fully
populated to have holes") introduced pfn_to_online_page() and was
designed to return a valid pfn only, but it is clearly broken on arm64.

Therefore, let pfn_to_online_page() call pfn_valid_within(), so it can
handle nomap thanks to the commit f52bb98 ("arm64: mm: always
enable CONFIG_HOLES_IN_ZONE"), while it will be optimized away on
architectures where have no HOLES_IN_ZONE.

[1]
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
  Mem abort info:
    ESR = 0x96000005
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000005
    CM = 0, WnR = 0
  Internal error: Oops: 96000005 [150balbes#1] SMP
  CPU: 60 PID: 1408 Comm: kmemleak Not tainted 5.0.0-rc2+ 150balbes#8
  pstate: 60400009 (nZCv daif +PAN -UAO)
  pc : page_mapping+0x24/0x144
  lr : __dump_page+0x34/0x3dc
  sp : ffff00003a5cfd10
  x29: ffff00003a5cfd10 x28: 000000000000802f
  x27: 0000000000000000 x26: 0000000000277d00
  x25: ffff000010791f56 x24: ffff7fe000000000
  x23: ffff000010772f8b x22: ffff00001125f670
  x21: ffff000011311000 x20: ffff000010772f8b
  x19: fffffffffffffffe x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000
  x15: 0000000000000000 x14: ffff802698b19600
  x13: ffff802698b1a200 x12: ffff802698b16f00
  x11: ffff802698b1a400 x10: 0000000000001400
  x9 : 0000000000000001 x8 : ffff00001121a000
  x7 : 0000000000000000 x6 : ffff0000102c53b8
  x5 : 0000000000000000 x4 : 0000000000000003
  x3 : 0000000000000100 x2 : 0000000000000000
  x1 : ffff000010772f8b x0 : ffffffffffffffff
  Process kmemleak (pid: 1408, stack limit = 0x(____ptrval____))
  Call trace:
   page_mapping+0x24/0x144
   __dump_page+0x34/0x3dc
   dump_page+0x28/0x4c
   kmemleak_scan+0x4ac/0x680
   kmemleak_scan_thread+0xb4/0xdc
   kthread+0x12c/0x13c
   ret_from_fork+0x10/0x18
  Code: d503201f f9400660 36000040 d1000413 (f9400661)
  ---[ end trace 4d4bd7f573490c8e ]---
  Kernel panic - not syncing: Fatal exception
  SMP: stopping secondary CPUs
  Kernel Offset: disabled
  CPU features: 0x002,20000c38
  Memory Limit: none
  ---[ end Kernel panic - not syncing: Fatal exception ]---

Link: http://lkml.kernel.org/r/20190122132916.28360-1-cai@lca.pw
Fixes: 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 5, 2019
[ Upstream commit 7eb9d7675c08937cd11d32b0b40442d4d731c5ee ]

We need to calculate the skb size correctly otherwise we risk triggering
skb_over_panic[1]. The issue is that data_len is added to the skb in a
nl attribute, but we don't account for its header size (nlattr 4 bytes)
and alignment. We account for it when calculating the total size in
the > PSAMPLE_MAX_PACKET_SIZE comparison correctly, but not when
allocating after that. The fix is simple - use nla_total_size() for
data_len when allocating.

To reproduce:
 $ tc qdisc add dev eth1 clsact
 $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129
 $ mausezahn eth1 -b bcast -a rand -c 1 -p 129
 < skb_over_panic BUG(), tail is 4 bytes past skb->end >

[1] Trace:
 [   50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:<NULL>
 [   50.474339][ T3480] ------------[ cut here ]------------
 [   50.481132][ T3480] kernel BUG at net/core/skbuff.c:108!
 [   50.486059][ T3480] invalid opcode: 0000 [150balbes#1] PREEMPT SMP
 [   50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 #108
 [   50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
 [   50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b
 [   50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe <0f> 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22
 [   50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282
 [   50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000
 [   50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff
 [   50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000
 [   50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0
 [   50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084
 [   50.515222][ T3480] FS:  00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000
 [   50.517135][ T3480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [   50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0
 [   50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [   50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [   50.524450][ T3480] Call Trace:
 [   50.525214][ T3480]  skb_put.cold+0x1b/0x1b
 [   50.526171][ T3480]  psample_sample_packet+0x1d3/0x340
 [   50.527307][ T3480]  tcf_sample_act+0x178/0x250
 [   50.528339][ T3480]  tcf_action_exec+0xb1/0x190
 [   50.529354][ T3480]  mall_classify+0x67/0x90
 [   50.530332][ T3480]  tcf_classify+0x72/0x160
 [   50.531286][ T3480]  __dev_queue_xmit+0x3db/0xd50
 [   50.532327][ T3480]  dev_queue_xmit+0x18/0x20
 [   50.533299][ T3480]  packet_sendmsg+0xee7/0x2090
 [   50.534331][ T3480]  sock_sendmsg+0x54/0x70
 [   50.535271][ T3480]  __sys_sendto+0x148/0x1f0
 [   50.536252][ T3480]  ? tomoyo_file_ioctl+0x23/0x30
 [   50.537334][ T3480]  ? ksys_ioctl+0x5e/0xb0
 [   50.540068][ T3480]  __x64_sys_sendto+0x2a/0x30
 [   50.542810][ T3480]  do_syscall_64+0x73/0x1f0
 [   50.545383][ T3480]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 [   50.548477][ T3480] RIP: 0033:0x7f35357d6fb3
 [   50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24
 [   50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 [   50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3
 [   50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003
 [   50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014
 [   50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082
 [   50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003
 [   50.578170][ T3480] Modules linked in: sch_ingress virtio_net
 [   50.580976][ T3480] ---[ end trace 61a515626a595af6 ]---

CC: Yotam Gigi <yotamg@mellanox.com>
CC: Jiri Pirko <jiri@mellanox.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Simon Horman <simon.horman@netronome.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Fixes: 6ae0a62 ("net: Introduce psample, a new genetlink channel for packet sampling")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 13, 2019
commit 7d73170e1c282576419f8b50a771f1fcd2b81a94 upstream.

Doing fuzz test on sbsa uart device, causes a kernel crash
due to NULL pointer dereference:

------------[ cut here ]------------
Unable to handle kernel paging request at virtual address fffffffffffffffc
pgd = ffffffe331723000
[fffffffffffffffc] *pgd=0000002333595003, *pud=0000002333595003, *pmd=00000
Internal error: Oops: 96000005 [150balbes#1] PREEMPT SMP
Modules linked in: ping(O) jffs2 rtos_snapshot(O) pramdisk(O) hisi_sfc(O)
Drv_Nandc_K(O) Drv_SysCtl_K(O) Drv_SysClk_K(O) bsp_reg(O) hns3(O)
hns3_uio_enet(O) hclgevf(O) hclge(O) hnae3(O) mdio_factory(O)
mdio_registry(O) mdio_dev(O) mdio(O) hns3_info(O) rtos_kbox_panic(O)
uart_suspend(O) rsm(O) stp llc tunnel4 xt_tcpudp ipt_REJECT nf_reject_ipv4
iptable_filter ip_tables x_tables sd_mod xhci_plat_hcd xhci_pci xhci_hcd
usbmon usbhid usb_storage ohci_platform ohci_pci ohci_hcd hid_generic hid
ehci_platform ehci_pci ehci_hcd vfat fat usbcore usb_common scsi_mod
yaffs2multi(O) ext4 jbd2 ext2 mbcache ofpart i2c_dev i2c_core uio ubi nand
nand_ecc nand_ids cfi_cmdset_0002 cfi_cmdset_0001 cfi_probe gen_probe
cmdlinepart chipreg mtdblock mtd_blkdevs mtd nfsd auth_rpcgss oid_registry
nfsv3 nfs nfs_acl lockd sunrpc grace autofs4
CPU: 2 PID: 2385 Comm: tty_fuzz_test Tainted: G           O    4.4.193 150balbes#1
task: ffffffe32b23f110 task.stack: ffffffe32bda4000
PC is at uart_break_ctl+0x44/0x84
LR is at uart_break_ctl+0x34/0x84
pc : [<ffffff8393196098>] lr : [<ffffff8393196088>] pstate: 80000005
sp : ffffffe32bda7cc0
x29: ffffffe32bda7cc0 x28: ffffffe32b23f110
x27: ffffff8393402000 x26: 0000000000000000
x25: ffffffe32b233f40 x24: ffffffc07a8ec680
x23: 0000000000005425 x22: 00000000ffffffff
x21: ffffffe33ed73c98 x20: 0000000000000000
x19: ffffffe33ed94168 x18: 0000000000000004
x17: 0000007f92ae9d30 x16: ffffff8392fa6064
x15: 0000000000000010 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000020 x10: 0000007ffdac1708
x9 : 0000000000000078 x8 : 000000000000001d
x7 : 0000000052a64887 x6 : ffffffe32bda7e08
x5 : ffffffe32b23c000 x4 : 0000005fbc5b0000
x3 : ffffff83938d5018 x2 : 0000000000000080
x1 : ffffffe32b23c040 x0 : ffffff83934428f8
virtual start addr offset is 38ac00000
module base offset is 2cd4cf1000
linear region base offset is : 0
Process tty_fuzz_test (pid: 2385, stack limit = 0xffffffe32bda4000)
Stack: (0xffffffe32bda7cc0 to 0xffffffe32bda8000)
7cc0: ffffffe32bda7cf0 ffffff8393177718 ffffffc07a8ec680 ffffff8393196054
7ce0: 000000001739f2e0 0000007ffdac1978 ffffffe32bda7d20 ffffff8393179a1c
7d00: 0000000000000000 ffffff8393c0a000 ffffffc07a8ec680 cb88537fdc8ba600
7d20: ffffffe32bda7df0 ffffff8392fa5a40 ffffff8393c0a000 0000000000005425
7d40: 0000007ffdac1978 ffffffe32b233f40 ffffff8393178dcc 0000000000000003
7d60: 000000000000011d 000000000000001d ffffffe32b23f110 000000000000029e
7d80: ffffffe34fe8d5d0 0000000000000000 ffffffe32bda7e14 cb88537fdc8ba600
7da0: ffffffe32bda7e30 ffffff8393042cfc ffffff8393c41720 ffffff8393c46410
7dc0: ffffff839304fa68 ffffffe32b233f40 0000000000005425 0000007ffdac1978
7de0: 000000000000011d cb88537fdc8ba600 ffffffe32bda7e70 ffffff8392fa60cc
7e00: 0000000000000000 ffffffe32b233f40 ffffffe32b233f40 0000000000000003
7e20: 0000000000005425 0000007ffdac1978 ffffffe32bda7e70 ffffff8392fa60b0
7e40: 0000000000000280 ffffffe32b233f40 ffffffe32b233f40 0000000000000003
7e60: 0000000000005425 cb88537fdc8ba600 0000000000000000 ffffff8392e02e78
7e80: 0000000000000280 0000005fbc5b0000 ffffffffffffffff 0000007f92ae9d3c
7ea0: 0000000060000000 0000000000000015 0000000000000003 0000000000005425
7ec0: 0000007ffdac1978 0000000000000000 00000000a54c910e 0000007f92b95014
7ee0: 0000007f92b95090 0000000052a64887 000000000000001d 0000000000000078
7f00: 0000007ffdac1708 0000000000000020 0000000000000000 0000000000000000
7f20: 0000000000000000 0000000000000010 000000556acf0090 0000007f92ae9d30
7f40: 0000000000000004 000000556acdef10 0000000000000000 000000556acdebd0
7f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
7f80: 0000000000000000 0000000000000000 0000000000000000 0000007ffdac1840
7fa0: 000000556acdedcc 0000007ffdac1840 0000007f92ae9d3c 0000000060000000
7fc0: 0000000000000000 0000000000000000 0000000000000003 000000000000001d
7fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call trace:
Exception stack(0xffffffe32bda7ab0 to 0xffffffe32bda7bf0)
7aa0:                                   0000000000001000 0000007fffffffff
7ac0: ffffffe32bda7cc0 ffffff8393196098 0000000080000005 0000000000000025
7ae0: ffffffe32b233f40 ffffff83930d777c ffffffe32bda7b30 ffffff83930d777c
7b00: ffffffe32bda7be0 ffffff83938d5000 ffffffe32bda7be0 ffffffe32bda7c20
7b20: ffffffe32bda7b60 ffffff83930d777c ffffffe32bda7c10 ffffff83938d5000
7b40: ffffffe32bda7c10 ffffffe32bda7c50 ffffff8393c0a000 ffffffe32b23f110
7b60: ffffffe32bda7b70 ffffff8392e09df4 ffffffe32bda7bb0 cb88537fdc8ba600
7b80: ffffff83934428f8 ffffffe32b23c040 0000000000000080 ffffff83938d5018
7ba0: 0000005fbc5b0000 ffffffe32b23c000 ffffffe32bda7e08 0000000052a64887
7bc0: 000000000000001d 0000000000000078 0000007ffdac1708 0000000000000020
7be0: 0000000000000000 0000000000000000
[<ffffff8393196098>] uart_break_ctl+0x44/0x84
[<ffffff8393177718>] send_break+0xa0/0x114
[<ffffff8393179a1c>] tty_ioctl+0xc50/0xe84
[<ffffff8392fa5a40>] do_vfs_ioctl+0xc4/0x6e8
[<ffffff8392fa60cc>] SyS_ioctl+0x68/0x9c
[<ffffff8392e02e78>] __sys_trace_return+0x0/0x4
Code: b9410ea0 34000160 f9408aa0 f9402814 (b85fc280)
---[ end trace 8606094f1960c5e0 ]---
Kernel panic - not syncing: Fatal exception

Fix this problem by adding NULL checks prior to calling break_ctl ops.

Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1574263133-28259-1-git-send-email-xiaojiangfeng@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 13, 2019
[ Upstream commit a71a29f50de1ef97ab55c151a1598eb12dde379d ]

I2C communication errors (-EREMOTEIO) during the IRQ handler of nxp-nci
result in a NULL pointer dereference at the moment:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    Oops: 0002 [150balbes#1] PREEMPT SMP NOPTI
    CPU: 1 PID: 355 Comm: irq/137-nxp-nci Not tainted 5.4.0-rc6 150balbes#1
    RIP: 0010:skb_queue_tail+0x25/0x50
    Call Trace:
     nci_recv_frame+0x36/0x90 [nci]
     nxp_nci_i2c_irq_thread_fn+0xd1/0x285 [nxp_nci_i2c]
     ? preempt_count_add+0x68/0xa0
     ? irq_forced_thread_fn+0x80/0x80
     irq_thread_fn+0x20/0x60
     irq_thread+0xee/0x180
     ? wake_threads_waitq+0x30/0x30
     kthread+0xfb/0x130
     ? irq_thread_check_affinity+0xd0/0xd0
     ? kthread_park+0x90/0x90
     ret_from_fork+0x1f/0x40

Afterward the kernel must be rebooted to work properly again.

This happens because it attempts to call nci_recv_frame() with skb == NULL.
However, unlike nxp_nci_fw_recv_frame(), nci_recv_frame() does not have any
NULL checks for skb, causing the NULL pointer dereference.

Change the code to call only nxp_nci_fw_recv_frame() in case of an error.
Make sure to log it so it is obvious that a communication error occurred.
The error above then becomes:

    nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121
    nci: __nci_request: wait_for_completion_interruptible_timeout failed 0
    nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121

Fixes: 6be8867 ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 13, 2019
[ Upstream commit d6e1935819db0c91ce4a5af82466f3ab50d17346 ]

Right now serial drivers process sysrq keys deep in their character
receiving code.  This means that they've already grabbed their
port->lock spinlock.  This can end up getting in the way if we've go
to do serial stuff (especially kgdb) in response to the sysrq.

Serial drivers have various hacks in them to handle this.  Looking at
'8250_port.c' you can see that the console_write() skips locking if
we're in the sysrq handler.  Looking at 'msm_serial.c' you can see
that the port lock is dropped around uart_handle_sysrq_char().

It turns out that these hacks aren't exactly perfect.  If you have
lockdep turned on and use something like the 8250_port hack you'll get
a splat that looks like:

  WARNING: possible circular locking dependency detected
  [...] is trying to acquire lock:
  ... (console_owner){-.-.}, at: console_unlock+0x2e0/0x5e4

  but task is already holding lock:
  ... (&port_lock_key){-.-.}, at: serial8250_handle_irq+0x30/0xe4

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> 150balbes#1 (&port_lock_key){-.-.}:
         _raw_spin_lock_irqsave+0x58/0x70
         serial8250_console_write+0xa8/0x250
         univ8250_console_write+0x40/0x4c
         console_unlock+0x528/0x5e4
         register_console+0x2c4/0x3b0
         uart_add_one_port+0x350/0x478
         serial8250_register_8250_port+0x350/0x3a8
         dw8250_probe+0x67c/0x754
         platform_drv_probe+0x58/0xa4
         really_probe+0x150/0x294
         driver_probe_device+0xac/0xe8
         __driver_attach+0x98/0xd0
         bus_for_each_dev+0x84/0xc8
         driver_attach+0x2c/0x34
         bus_add_driver+0xf0/0x1ec
         driver_register+0xb4/0x100
         __platform_driver_register+0x60/0x6c
         dw8250_platform_driver_init+0x20/0x28
	 ...

  -> #0 (console_owner){-.-.}:
         lock_acquire+0x1e8/0x214
         console_unlock+0x35c/0x5e4
         vprintk_emit+0x230/0x274
         vprintk_default+0x7c/0x84
         vprintk_func+0x190/0x1bc
         printk+0x80/0xa0
         __handle_sysrq+0x104/0x21c
         handle_sysrq+0x30/0x3c
         serial8250_read_char+0x15c/0x18c
         serial8250_rx_chars+0x34/0x74
         serial8250_handle_irq+0x9c/0xe4
         dw8250_handle_irq+0x98/0xcc
         serial8250_interrupt+0x50/0xe8
         ...

  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(&port_lock_key);
                                 lock(console_owner);
                                 lock(&port_lock_key);
    lock(console_owner);

   *** DEADLOCK ***

The hack used in 'msm_serial.c' doesn't cause the above splats but it
seems a bit ugly to unlock / lock our spinlock deep in our irq
handler.

It seems like we could defer processing the sysrq until the end of the
interrupt handler right after we've unlocked the port.  With this
scheme if a whole batch of sysrq characters comes in one irq then we
won't handle them all, but that seems like it should be a fine
compromise.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 13, 2019
commit 691505a803a7f223b2af621848d581259c61f77d upstream.

A NULL-pointer dereference was reported in fedora bz#1762199 while
reshaping a raid6 array after adding a fifth drive to an existing
array.

[   47.343549] md/raid:md0: raid level 6 active with 3 out of 5 devices, algorithm 2
[   47.804017] md0: detected capacity change from 0 to 7885289422848
[   47.822083] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
...
[   47.940477] CPU: 1 PID: 14210 Comm: md0_raid6 Tainted: G        W         5.2.18-200.fc30.aarch64 150balbes#1
[   47.949594] Hardware name: AMD Overdrive/Supercharger/To be filled by O.E.M., BIOS ROD1002C 04/08/2016
[   47.958886] pstate: 00400085 (nzcv daIf +PAN -UAO)
[   47.963668] pc : __list_del_entry_valid+0x2c/0xa8
[   47.968366] lr : ccp_tx_submit+0x84/0x168 [ccp]
[   47.972882] sp : ffff00001369b970
[   47.976184] x29: ffff00001369b970 x28: ffff00001369bdb8
[   47.981483] x27: 00000000ffffffff x26: ffff8003b758af70
[   47.986782] x25: ffff8003b758b2d8 x24: ffff8003e6245818
[   47.992080] x23: 0000000000000000 x22: ffff8003e62450c0
[   47.997379] x21: ffff8003dfd6add8 x20: 0000000000000003
[   48.002678] x19: ffff8003e6245100 x18: 0000000000000000
[   48.007976] x17: 0000000000000000 x16: 0000000000000000
[   48.013274] x15: 0000000000000000 x14: 0000000000000000
[   48.018572] x13: ffff7e000ef83a00 x12: 0000000000000001
[   48.023870] x11: ffff000010eff998 x10: 00000000000019a0
[   48.029169] x9 : 0000000000000000 x8 : ffff8003e6245180
[   48.034467] x7 : 0000000000000000 x6 : 000000000000003f
[   48.039766] x5 : 0000000000000040 x4 : ffff8003e0145080
[   48.045064] x3 : dead000000000200 x2 : 0000000000000000
[   48.050362] x1 : 0000000000000000 x0 : ffff8003e62450c0
[   48.055660] Call trace:
[   48.058095]  __list_del_entry_valid+0x2c/0xa8
[   48.062442]  ccp_tx_submit+0x84/0x168 [ccp]
[   48.066615]  async_tx_submit+0x224/0x368 [async_tx]
[   48.071480]  async_trigger_callback+0x68/0xfc [async_tx]
[   48.076784]  ops_run_biofill+0x178/0x1e8 [raid456]
[   48.081566]  raid_run_ops+0x248/0x818 [raid456]
[   48.086086]  handle_stripe+0x864/0x1208 [raid456]
[   48.090781]  handle_active_stripes.isra.0+0xb0/0x278 [raid456]
[   48.096604]  raid5d+0x378/0x618 [raid456]
[   48.100602]  md_thread+0xa0/0x150
[   48.103905]  kthread+0x104/0x130
[   48.107122]  ret_from_fork+0x10/0x18
[   48.110686] Code: d2804003 f2fbd5a3 eb03003f 54000320 (f9400021)
[   48.116766] ---[ end trace 23f390a527f7ad77 ]---

ccp_tx_submit is passed a dma_async_tx_descriptor which is contained in
a ccp_dma_desc and adds it to a ccp channel's pending list:

	list_del(&desc->entry);
	list_add_tail(&desc->entry, &chan->pending);

The problem is that desc->entry may be uninitialized in the
async_trigger_callback path where the descriptor was gotten
from ccp_prep_dma_interrupt which got it from ccp_alloc_dma_desc
which doesn't initialize the desc->entry list head. So, just
initialize the list head to avoid the problem.

Cc: <stable@vger.kernel.org>
Reported-by: Sahaj Sarup <sahajsarup@gmail.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
commit f7bddf1e27d18fbc7d3e3056ba449cfbe4e20b0a upstream.

During a cyclic writeback, extent_write_cache_pages() uses done_index
to update the writeback_index after the current run is over.  However,
instead of current index + 1, it gets to to the current index itself.

Unfortunately, this, combined with returning on EOF instead of looping
back, can lead to the following pathlogical behavior.

1. There is a single file which has accumulated enough dirty pages to
   trigger balance_dirty_pages() and the writer appending to the file
   with a series of short writes.

2. balance_dirty_pages kicks in, wakes up background writeback and sleeps.

3. Writeback kicks in and the cursor is on the last page of the dirty
   file.  Writeback is started or skipped if already in progress.  As
   it's EOF, extent_write_cache_pages() returns and the cursor is set
   to done_index which is pointing to the last page.

4. Writeback is done.  Nothing happens till balance_dirty_pages
   finishes, at which point we go back to 150balbes#1.

This can almost completely stall out writing back of the file and keep
the system over dirty threshold for a long time which can mess up the
whole system.  We encountered this issue in production with a package
handling application which can reliably reproduce the issue when
running under tight memory limits.

Reading the comment in the error handling section, this seems to be to
avoid accidentally skipping a page in case the write attempt on the
page doesn't succeed.  However, this concern seems bogus.

On each page, the code either:

* Skips and moves onto the next page.

* Fails issue and sets done_index to index + 1.

* Successfully issues and continue to the next page if budget allows
  and not EOF.

IOW, as long as it's not EOF and there's budget, the code never
retries writing back the same page.  Only when a page happens to be
the last page of a particular run, we end up retrying the page, which
can't possibly guarantee anything data integrity related.  Besides,
cyclic writes are only used for non-syncing writebacks meaning that
there's no data integrity implication to begin with.

Fix it by always setting done_index past the current page being
processed.

Note that this problem exists in other writepages too.

CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
commit b67a95f2abff0c34e5667c15ab8900de73d8d087 upstream.

The PCI INTx interrupts and other LSI interrupts are handled differently
under a sPAPR platform. When the interrupt source characteristics are
queried, the hypervisor returns an H_INT_ESB flag to inform the OS
that it should be using the H_INT_ESB hcall for interrupt management
and not loads and stores on the interrupt ESB pages.

A default -1 value is returned for the addresses of the ESB pages. The
driver ignores this condition today and performs a bogus IO mapping.
Recent changes and the DEBUG_VM configuration option make the bug
visible with :

  kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612!
  Oops: Exception in kernel mode, sig: 5 [150balbes#1]
  LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le 150balbes#1
  NIP:  c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000
  REGS: c0000000fa45f0d0 TRAP: 0700   Not tainted  (5.4.0-0.rc6.git0.1.fc32.ppc64le)
  ...
  NIP ioremap_page_range+0x4c4/0x6e0
  LR  ioremap_page_range+0x74/0x6e0
  Call Trace:
    ioremap_page_range+0x74/0x6e0 (unreliable)
    do_ioremap+0x8c/0x120
    __ioremap_caller+0x128/0x140
    ioremap+0x30/0x50
    xive_spapr_populate_irq_data+0x170/0x260
    xive_irq_domain_map+0x8c/0x170
    irq_domain_associate+0xb4/0x2d0
    irq_create_mapping+0x1e0/0x3b0
    irq_create_fwspec_mapping+0x27c/0x3e0
    irq_create_of_mapping+0x98/0xb0
    of_irq_parse_and_map_pci+0x168/0x230
    pcibios_setup_device+0x88/0x250
    pcibios_setup_bus_devices+0x54/0x100
    __of_scan_bus+0x160/0x310
    pcibios_scan_phb+0x330/0x390
    pcibios_init+0x8c/0x128
    do_one_initcall+0x60/0x2c0
    kernel_init_freeable+0x290/0x378
    kernel_init+0x2c/0x148
    ret_from_kernel_thread+0x5c/0x80

Fixes: bed81ee ("powerpc/xive: introduce H_INT_ESB hcall")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191203163642.2428-1-clg@kaod.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
commit 2a5cf35 upstream.

There are actually two kinds of discard merge:

- one is the normal discard merge, just like normal read/write request,
and call it single-range discard

- another is the multi-range discard, queue_max_discard_segments(rq->q) > 1

For the former case, queue_max_discard_segments(rq->q) is 1, and we
should handle this kind of discard merge like the normal read/write
request.

This patch fixes the following kernel panic issue[1], which is caused by
not removing the single-range discard request from elevator queue.

Guangwu has one raid discard test case, in which this issue is a bit
easier to trigger, and I verified that this patch can fix the kernel
panic issue in Guangwu's test case.

[1] kernel panic log from Jens's report

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
 PGD 0 P4D 0.
 Oops: 0000 [150balbes#1] SMP PTI
 CPU: 37 PID: 763 Comm: kworker/37:1H Not tainted \
4.20.0-rc3-00649-ge64d9a554a91-dirty 150balbes#14  Hardware name: Wiwynn \
Leopard-Orv2/Leopard-DDR BW, BIOS LBM08   03/03/2017       Workqueue: kblockd \
blk_mq_run_work_fn                                            RIP: \
0010:blk_mq_get_driver_tag+0x81/0x120                                       Code: 24 \
10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 00 00 00 \
0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 20 72 37 \
f6 87 b0 00 00 00 02  RSP: 0018:ffffc90004aabd30 EFLAGS: 00010246                     \
  RAX: 0000000000000003 RBX: ffff888465ea1300 RCX: ffffc90004aabde8
 RDX: 00000000ffffffff RSI: ffffc90004aabde8 RDI: 0000000000000000
 RBP: 0000000000000000 R08: ffff888465ea1348 R09: 0000000000000000
 R10: 0000000000001000 R11: 00000000ffffffff R12: ffff888465ea1300
 R13: 0000000000000000 R14: ffff888465ea1348 R15: ffff888465d10000
 FS:  0000000000000000(0000) GS:ffff88846f9c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000148 CR3: 000000000220a003 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  blk_mq_dispatch_rq_list+0xec/0x480
  ? elv_rb_del+0x11/0x30
  blk_mq_do_dispatch_sched+0x6e/0xf0
  blk_mq_sched_dispatch_requests+0xfa/0x170
  __blk_mq_run_hw_queue+0x5f/0xe0
  process_one_work+0x154/0x350
  worker_thread+0x46/0x3c0
  kthread+0xf5/0x130
  ? process_one_work+0x350/0x350
  ? kthread_destroy_worker+0x50/0x50
  ret_from_fork+0x1f/0x30
 Modules linked in: sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel \
kvm switchtec irqbypass iTCO_wdt iTCO_vendor_support efivars cdc_ether usbnet mii \
cdc_acm i2c_i801 lpc_ich mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq \
button sch_fq_codel nfsd nfs_acl lockd grace auth_rpcgss oid_registry sunrpc nvme \
nvme_core fuse sg loop efivarfs autofs4  CR2: 0000000000000148                        \

 ---[ end trace 340a1fb996df1b9b ]---
 RIP: 0010:blk_mq_get_driver_tag+0x81/0x120
 Code: 24 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 \
00 00 00 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 \
20 72 37 f6 87 b0 00 00 00 02

Fixes: 445251d ("blk-mq: fix discard merge with scheduler attached")
Reported-by: Jens Axboe <axboe@kernel.dk>
Cc: Guangwu Zhang <guazhang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Andre Tomt <andre@tomt.net>
Cc: Jack Wang <jack.wang.usish@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
[ Upstream commit f00b3428a801758243693e046b34226e92bc56b3 ]

A hang was observed in the fcport delete path when the device was
responding slow and an issue-lip path (results in session termination) was
taken.

Fix this by issuing logo requests unconditionally.

PID: 19491  TASK: ffff8e23e67bb150  CPU: 0   COMMAND: "kworker/0:0"
 #0 [ffff8e2370297bf8] __schedule at ffffffffb4f7dbb0
 150balbes#1 [ffff8e2370297c88] schedule at ffffffffb4f7e199
 150balbes#2 [ffff8e2370297c98] schedule_timeout at ffffffffb4f7ba68
 150balbes#3 [ffff8e2370297d40] msleep at ffffffffb48ad9ff
 150balbes#4 [ffff8e2370297d58] qlt_free_session_done at ffffffffc0c32052 [qla2xxx]
 150balbes#5 [ffff8e2370297e20] process_one_work at ffffffffb48bcfdf
 150balbes#6 [ffff8e2370297e68] worker_thread at ffffffffb48bdca6
 150balbes#7 [ffff8e2370297ec8] kthread at ffffffffb48c4f81

Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
commit aa71ecd8d86500da6081a72da6b0b524007e0627 upstream.

In 64bit system. sb->s_maxbytes of shmem filesystem is MAX_LFS_FILESIZE,
which equal LLONG_MAX.

If offset > LLONG_MAX - PAGE_SIZE, offset + len < LLONG_MAX in
shmem_fallocate, which will pass the checking in vfs_fallocate.

	/* Check for wrap through zero too */
	if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
		return -EFBIG;

loff_t unmap_start = round_up(offset, PAGE_SIZE) in shmem_fallocate
causes a overflow.

Syzkaller reports a overflow problem in mm/shmem:

  UBSAN: Undefined behaviour in mm/shmem.c:2014:10
  signed integer overflow: '9223372036854775807 + 1' cannot be represented in type 'long long int'
  CPU: 0 PID:17076 Comm: syz-executor0 Not tainted 4.1.46+ 150balbes#1
  Hardware name: linux, dummy-virt (DT)
  Call trace:
     dump_backtrace+0x0/0x2c8 arch/arm64/kernel/traps.c:100
     show_stack+0x20/0x30 arch/arm64/kernel/traps.c:238
     __dump_stack lib/dump_stack.c:15 [inline]
     ubsan_epilogue+0x18/0x70 lib/ubsan.c:164
     handle_overflow+0x158/0x1b0 lib/ubsan.c:195
     shmem_fallocate+0x6d0/0x820 mm/shmem.c:2104
     vfs_fallocate+0x238/0x428 fs/open.c:312
     SYSC_fallocate fs/open.c:335 [inline]
     SyS_fallocate+0x54/0xc8 fs/open.c:239

The highest bit of unmap_start will be appended with sign bit 1
(overflow) when calculate shmem_falloc.start:

    shmem_falloc.start = unmap_start >> PAGE_SHIFT.

Fix it by casting the type of unmap_start to u64, when right shifted.

This bug is found in LTS Linux 4.1.  It also seems to exist in mainline.

Link: http://lkml.kernel.org/r/1573867464-5107-1-git-send-email-chenjun102@huawei.com
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 17, 2019
[ Upstream commit 5fcaf6982d1167f1cd9b264704f6d1ef4c505d54 ]

I was investigating a crash in our Virtuozzo7 kernel which happened in
in svcauth_unix_set_client. I found out that we access m_client field
in ip_map structure, which was received from sunrpc_cache_lookup (we
have a bit older kernel, now the code is in sunrpc_cache_add_entry), and
these field looks uninitialized (m_client == 0x74 don't look like a
pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID.

It looks like the problem appeared from our previous fix to sunrpc (1):
commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
request")

And we've also found a patch already fixing our patch (2):
commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")

Though the crash is eliminated, I think the core of the problem is not
completely fixed:

Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
cache_fresh_locked which was added in (1) to fix crash. These way
cache_is_valid won't say the cache is valid anymore and in
svcauth_unix_set_client the function cache_check will return error
instead of 0, and we don't count entry as initialized.

But it looks like we need to remove cache_fresh_locked completely in
sunrpc_cache_lookup:

In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so
that cache_requests with no readers also release corresponding
cache_head, to fix their leak.  We with Vasily were not sure if
cache_fresh_locked and cache_fresh_unlocked should be used in pair or
not, so we've guessed to use them in pair.

Now we see that we don't want the CACHE_VALID bit set here by
cache_fresh_locked, as "valid" means "initialized" and there is no
initialization in sunrpc_cache_add_entry. Both expiry_time and
last_refresh are not used in cache_fresh_unlocked code-path and also not
required for the initial fix.

So to conclude cache_fresh_locked was called by mistake, and we can just
safely remove it instead of crutching it with CACHE_NEGATIVE. It looks
ideologically better for me. Hope I don't miss something here.

Here is our crash backtrace:
[13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
[13108726.326365] IP: [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
[13108726.326448] PGD 0
[13108726.326468] Oops: 0002 [150balbes#1] SMP
[13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
[13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat
[13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1]
[13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G        W  O   ------------   3.10.0-862.20.2.vz7.73.29 150balbes#1 73.29
[13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
[13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000
[13108726.328610] RIP: 0010:[<ffffffffc01f79eb>]  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
[13108726.328706] RSP: 0018:ffffa0c2a74bfd80  EFLAGS: 00010246
[13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000
[13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0
[13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001
[13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000
[13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4
[13108726.329022] FS:  0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000
[13108726.329081] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0
[13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[13108726.338908] PKRU: 00000000
[13108726.341047] Call Trace:
[13108726.343074]  [<ffffffff8a2c78b4>] ? groups_alloc+0x34/0x110
[13108726.344837]  [<ffffffffc01f5eb4>] svc_set_client+0x24/0x30 [sunrpc]
[13108726.346631]  [<ffffffffc01f2ac1>] svc_process_common+0x241/0x710 [sunrpc]
[13108726.348332]  [<ffffffffc01f3093>] svc_process+0x103/0x190 [sunrpc]
[13108726.350016]  [<ffffffffc07d605f>] nfsd+0xdf/0x150 [nfsd]
[13108726.351735]  [<ffffffffc07d5f80>] ? nfsd_destroy+0x80/0x80 [nfsd]
[13108726.353459]  [<ffffffff8a2bf741>] kthread+0xd1/0xe0
[13108726.355195]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
[13108726.356896]  [<ffffffff8a9556dd>] ret_from_fork_nospec_begin+0x7/0x21
[13108726.358577]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
[13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 <f0> 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38
[13108726.363769] RIP  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
[13108726.365530]  RSP <ffffa0c2a74bfd80>
[13108726.367179] CR2: 0000000000000074

Fixes: d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 25, 2019
[ Upstream commit 9cf1cd8ee3ee09ef2859017df2058e2f53c5347f ]

In order to set/get/dump, the tipc uses the generic netlink
infrastructure. So, when tipc module is inserted, init function
calls genl_register_family().
After genl_register_family(), set/get/dump commands are immediately
allowed and these callbacks internally use the net_generic.
net_generic is allocated by register_pernet_device() but this
is called after genl_register_family() in the __init function.
So, these callbacks would use un-initialized net_generic.

Test commands:
    #SHELL1
    while :
    do
        modprobe tipc
        modprobe -rv tipc
    done

    #SHELL2
    while :
    do
        tipc link list
    done

Splat looks like:
[   59.616322][ T2788] kasan: CONFIG_KASAN_INLINE enabled
[   59.617234][ T2788] kasan: GPF could be caused by NULL-ptr deref or user memory access
[   59.618398][ T2788] general protection fault: 0000 [150balbes#1] SMP DEBUG_PAGEALLOC KASAN PTI
[   59.619389][ T2788] CPU: 3 PID: 2788 Comm: tipc Not tainted 5.4.0+ #194
[   59.620231][ T2788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   59.621428][ T2788] RIP: 0010:tipc_bcast_get_broadcast_mode+0x131/0x310 [tipc]
[   59.622379][ T2788] Code: c7 c6 ef 8b 38 c0 65 ff 0d 84 83 c9 3f e8 d7 a5 f2 e3 48 8d bb 38 11 00 00 48 b8 00 00 00 00
[   59.622550][ T2780] NET: Registered protocol family 30
[   59.624627][ T2788] RSP: 0018:ffff88804b09f578 EFLAGS: 00010202
[   59.624630][ T2788] RAX: dffffc0000000000 RBX: 0000000000000011 RCX: 000000008bc66907
[   59.624631][ T2788] RDX: 0000000000000229 RSI: 000000004b3cf4cc RDI: 0000000000001149
[   59.624633][ T2788] RBP: ffff88804b09f588 R08: 0000000000000003 R09: fffffbfff4fb3df1
[   59.624635][ T2788] R10: fffffbfff50318f8 R11: ffff888066cadc18 R12: ffffffffa6cc2f40
[   59.624637][ T2788] R13: 1ffff11009613eba R14: ffff8880662e9328 R15: ffff8880662e9328
[   59.624639][ T2788] FS:  00007f57d8f7b740(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
[   59.624645][ T2788] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.625875][ T2780] tipc: Started in single node mode
[   59.626128][ T2788] CR2: 00007f57d887a8c0 CR3: 000000004b140002 CR4: 00000000000606e0
[   59.633991][ T2788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.635195][ T2788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.636478][ T2788] Call Trace:
[   59.637025][ T2788]  tipc_nl_add_bc_link+0x179/0x1470 [tipc]
[   59.638219][ T2788]  ? lock_downgrade+0x6e0/0x6e0
[   59.638923][ T2788]  ? __tipc_nl_add_link+0xf90/0xf90 [tipc]
[   59.639533][ T2788]  ? tipc_nl_node_dump_link+0x318/0xa50 [tipc]
[   59.640160][ T2788]  ? mutex_lock_io_nested+0x1380/0x1380
[   59.640746][ T2788]  tipc_nl_node_dump_link+0x4fd/0xa50 [tipc]
[   59.641356][ T2788]  ? tipc_nl_node_reset_link_stats+0x340/0x340 [tipc]
[   59.642088][ T2788]  ? __skb_ext_del+0x270/0x270
[   59.642594][ T2788]  genl_lock_dumpit+0x85/0xb0
[   59.643050][ T2788]  netlink_dump+0x49c/0xed0
[   59.643529][ T2788]  ? __netlink_sendskb+0xc0/0xc0
[   59.644044][ T2788]  ? __netlink_dump_start+0x190/0x800
[   59.644617][ T2788]  ? __mutex_unlock_slowpath+0xd0/0x670
[   59.645177][ T2788]  __netlink_dump_start+0x5a0/0x800
[   59.645692][ T2788]  genl_rcv_msg+0xa75/0xe90
[   59.646144][ T2788]  ? __lock_acquire+0xdfe/0x3de0
[   59.646692][ T2788]  ? genl_family_rcv_msg_attrs_parse+0x320/0x320
[   59.647340][ T2788]  ? genl_lock_dumpit+0xb0/0xb0
[   59.647821][ T2788]  ? genl_unlock+0x20/0x20
[   59.648290][ T2788]  ? genl_parallel_done+0xe0/0xe0
[   59.648787][ T2788]  ? find_held_lock+0x39/0x1d0
[   59.649276][ T2788]  ? genl_rcv+0x15/0x40
[   59.649722][ T2788]  ? lock_contended+0xcd0/0xcd0
[   59.650296][ T2788]  netlink_rcv_skb+0x121/0x350
[   59.650828][ T2788]  ? genl_family_rcv_msg_attrs_parse+0x320/0x320
[   59.651491][ T2788]  ? netlink_ack+0x940/0x940
[   59.651953][ T2788]  ? lock_acquire+0x164/0x3b0
[   59.652449][ T2788]  genl_rcv+0x24/0x40
[   59.652841][ T2788]  netlink_unicast+0x421/0x600
[ ... ]

Fixes: 7e43690 ("tipc: fix a slab object leak")
Fixes: a62fbcc ("tipc: make subscriber server support net namespace")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 25, 2019
commit 5480e299b5ae57956af01d4839c9fc88a465eeab upstream.

Some time ago the block layer was modified such that timeout handlers are
called from thread context instead of interrupt context. Make it safe to
run the iSCSI timeout handler in thread context. This patch fixes the
following lockdep complaint:

================================
WARNING: inconsistent lock state
5.5.1-dbg+ 150balbes#11 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/7:1H/206 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff88802d9827e8 (&(&session->frwd_lock)->rlock){+.?.}, at: iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi]
{IN-SOFTIRQ-W} state was registered at:
  lock_acquire+0x106/0x240
  _raw_spin_lock+0x38/0x50
  iscsi_check_transport_timeouts+0x3e/0x210 [libiscsi]
  call_timer_fn+0x132/0x470
  __run_timers.part.0+0x39f/0x5b0
  run_timer_softirq+0x63/0xc0
  __do_softirq+0x12d/0x5fd
  irq_exit+0xb3/0x110
  smp_apic_timer_interrupt+0x131/0x3d0
  apic_timer_interrupt+0xf/0x20
  default_idle+0x31/0x230
  arch_cpu_idle+0x13/0x20
  default_idle_call+0x53/0x60
  do_idle+0x38a/0x3f0
  cpu_startup_entry+0x24/0x30
  start_secondary+0x222/0x290
  secondary_startup_64+0xa4/0xb0
irq event stamp: 1383705
hardirqs last  enabled at (1383705): [<ffffffff81aace5c>] _raw_spin_unlock_irq+0x2c/0x50
hardirqs last disabled at (1383704): [<ffffffff81aacb98>] _raw_spin_lock_irq+0x18/0x50
softirqs last  enabled at (1383690): [<ffffffffa0e2efea>] iscsi_queuecommand+0x76a/0xa20 [libiscsi]
softirqs last disabled at (1383682): [<ffffffffa0e2e998>] iscsi_queuecommand+0x118/0xa20 [libiscsi]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&session->frwd_lock)->rlock);
  <Interrupt>
    lock(&(&session->frwd_lock)->rlock);

 *** DEADLOCK ***

2 locks held by kworker/7:1H/206:
 #0: ffff8880d57bf928 ((wq_completion)kblockd){+.+.}, at: process_one_work+0x472/0xab0
 150balbes#1: ffff88802b9c7de8 ((work_completion)(&q->timeout_work)){+.+.}, at: process_one_work+0x476/0xab0

stack backtrace:
CPU: 7 PID: 206 Comm: kworker/7:1H Not tainted 5.5.1-dbg+ 150balbes#11
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: kblockd blk_mq_timeout_work
Call Trace:
 dump_stack+0xa5/0xe6
 print_usage_bug.cold+0x232/0x23b
 mark_lock+0x8dc/0xa70
 __lock_acquire+0xcea/0x2af0
 lock_acquire+0x106/0x240
 _raw_spin_lock+0x38/0x50
 iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi]
 scsi_times_out+0xf4/0x440 [scsi_mod]
 scsi_timeout+0x1d/0x20 [scsi_mod]
 blk_mq_check_expired+0x365/0x3a0
 bt_iter+0xd6/0xf0
 blk_mq_queue_tag_busy_iter+0x3de/0x650
 blk_mq_timeout_work+0x1af/0x380
 process_one_work+0x56d/0xab0
 worker_thread+0x7a/0x5d0
 kthread+0x1bc/0x210
 ret_from_fork+0x24/0x30

Fixes: 287922e ("block: defer timeouts to a workqueue")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191209173457.187370-1-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
[ Upstream commit 90b3b339364c76baa2436445401ea9ade040c216 ]

When doing stress test, we get the following trace:
kernel BUG at lib/dynamic_queue_limits.c:26!
Internal error: Oops - BUG: 0 [150balbes#1] SMP ARM
Modules linked in: hip04_eth
CPU: 0 PID: 2003 Comm: tDblStackPcap0 Tainted: G           O L  4.4.197 150balbes#1
Hardware name: Hisilicon A15
task: c3637668 task.stack: de3bc000
PC is at dql_completed+0x18/0x154
LR is at hip04_tx_reclaim+0x110/0x174 [hip04_eth]
pc : [<c041abfc>]    lr : [<bf0003a8>]    psr: 800f0313
sp : de3bdc2c  ip : 00000000  fp : c020fb10
r10: 00000000  r9 : c39b4224  r8 : 00000001
r7 : 00000046  r6 : c39b4000  r5 : 0078f392  r4 : 0078f392
r3 : 00000047  r2 : 00000000  r1 : 00000046  r0 : df5d5c80
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 32c5387d  Table: 1e189b80  DAC: 55555555
Process tDblStackPcap0 (pid: 2003, stack limit = 0xde3bc190)
Stack: (0xde3bdc2c to 0xde3be000)
[<c041abfc>] (dql_completed) from [<bf0003a8>] (hip04_tx_reclaim+0x110/0x174 [hip04_eth])
[<bf0003a8>] (hip04_tx_reclaim [hip04_eth]) from [<bf0012c0>] (hip04_rx_poll+0x20/0x388 [hip04_eth])
[<bf0012c0>] (hip04_rx_poll [hip04_eth]) from [<c04c8d9c>] (net_rx_action+0x120/0x374)
[<c04c8d9c>] (net_rx_action) from [<c021eaf4>] (__do_softirq+0x218/0x318)
[<c021eaf4>] (__do_softirq) from [<c021eea0>] (irq_exit+0x88/0xac)
[<c021eea0>] (irq_exit) from [<c0240130>] (msa_irq_exit+0x11c/0x1d4)
[<c0240130>] (msa_irq_exit) from [<c0267ba8>] (__handle_domain_irq+0x110/0x148)
[<c0267ba8>] (__handle_domain_irq) from [<c0201588>] (gic_handle_irq+0xd4/0x118)
[<c0201588>] (gic_handle_irq) from [<c0558360>] (__irq_svc+0x40/0x58)
Exception stack(0xde3bdde0 to 0xde3bde28)
dde0: 00000000 00008001 c3637668 00000000 00000000 a00f0213 dd3627a0 c0af6380
de00: c086d380 a00f0213 c0a22a50 de3bde6c 00000002 de3bde30 c0558138 c055813c
de20: 600f0213 ffffffff
[<c0558360>] (__irq_svc) from [<c055813c>] (_raw_spin_unlock_irqrestore+0x44/0x54)
Kernel panic - not syncing: Fatal exception in interrupt

Pre-modification code:
int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
{
[...]
[1]	priv->tx_head = TX_NEXT(tx_head);
[2]	count++;
[3]	netdev_sent_queue(ndev, skb->len);
[...]
}
An rx interrupt occurs if hip04_mac_start_xmit just executes to the line 2,
tx_head has been updated, but corresponding 'skb->len' has not been
added to dql_queue.

And then
hip04_mac_interrupt->__napi_schedule->hip04_rx_poll->hip04_tx_reclaim

In hip04_tx_reclaim, because tx_head has been updated,
bytes_compl will plus an additional "skb-> len"
which has not been added to dql_queue. And then
trigger the BUG_ON(bytes_compl > num_queued - dql->num_completed).

To solve the problem described above, we put
"netdev_sent_queue(ndev, skb->len);"
before
"priv->tx_head = TX_NEXT(tx_head);"

Fixes: a41ea46 ("net: hisilicon: new hip04 ethernet driver")
Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
…free issues

commit 6609fee8897ac475378388238456c84298bff802 upstream.

When a tree mod log user no longer needs to use the tree it calls
btrfs_put_tree_mod_seq() to remove itself from the list of users and
delete all no longer used elements of the tree's red black tree, which
should be all elements with a sequence number less then our equals to
the caller's sequence number. However the logic is broken because it
can delete and free elements from the red black tree that have a
sequence number greater then the caller's sequence number:

1) At a point in time we have sequence numbers 1, 2, 3 and 4 in the
   tree mod log;

2) The task which got assigned the sequence number 1 calls
   btrfs_put_tree_mod_seq();

3) Sequence number 1 is deleted from the list of sequence numbers;

4) The current minimum sequence number is computed to be the sequence
   number 2;

5) A task using sequence number 2 is at tree_mod_log_rewind() and gets
   a pointer to one of its elements from the red black tree through
   a call to tree_mod_log_search();

6) The task with sequence number 1 iterates the red black tree of tree
   modification elements and deletes (and frees) all elements with a
   sequence number less then or equals to 2 (the computed minimum sequence
   number) - it ends up only leaving elements with sequence numbers of 3
   and 4;

7) The task with sequence number 2 now uses the pointer to its element,
   already freed by the other task, at __tree_mod_log_rewind(), resulting
   in a use-after-free issue. When CONFIG_DEBUG_PAGEALLOC=y it produces
   a trace like the following:

  [16804.546854] general protection fault: 0000 [150balbes#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  [16804.547451] CPU: 0 PID: 28257 Comm: pool Tainted: G        W         5.4.0-rc8-btrfs-next-51 150balbes#1
  [16804.548059] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
  [16804.548666] RIP: 0010:rb_next+0x16/0x50
  (...)
  [16804.550581] RSP: 0018:ffffb948418ef9b0 EFLAGS: 00010202
  [16804.551227] RAX: 6b6b6b6b6b6b6b6b RBX: ffff90e0247f6600 RCX: 6b6b6b6b6b6b6b6b
  [16804.551873] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff90e0247f6600
  [16804.552504] RBP: ffff90dffe0d4688 R08: 0000000000000001 R09: 0000000000000000
  [16804.553136] R10: ffff90dffa4a0040 R11: 0000000000000000 R12: 000000000000002e
  [16804.553768] R13: ffff90e0247f6600 R14: 0000000000001663 R15: ffff90dff77862b8
  [16804.554399] FS:  00007f4b197ae700(0000) GS:ffff90e036a00000(0000) knlGS:0000000000000000
  [16804.555039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [16804.555683] CR2: 00007f4b10022000 CR3: 00000002060e2004 CR4: 00000000003606f0
  [16804.556336] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [16804.556968] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [16804.557583] Call Trace:
  [16804.558207]  __tree_mod_log_rewind+0xbf/0x280 [btrfs]
  [16804.558835]  btrfs_search_old_slot+0x105/0xd00 [btrfs]
  [16804.559468]  resolve_indirect_refs+0x1eb/0xc70 [btrfs]
  [16804.560087]  ? free_extent_buffer.part.19+0x5a/0xc0 [btrfs]
  [16804.560700]  find_parent_nodes+0x388/0x1120 [btrfs]
  [16804.561310]  btrfs_check_shared+0x115/0x1c0 [btrfs]
  [16804.561916]  ? extent_fiemap+0x59d/0x6d0 [btrfs]
  [16804.562518]  extent_fiemap+0x59d/0x6d0 [btrfs]
  [16804.563112]  ? __might_fault+0x11/0x90
  [16804.563706]  do_vfs_ioctl+0x45a/0x700
  [16804.564299]  ksys_ioctl+0x70/0x80
  [16804.564885]  ? trace_hardirqs_off_thunk+0x1a/0x20
  [16804.565461]  __x64_sys_ioctl+0x16/0x20
  [16804.566020]  do_syscall_64+0x5c/0x250
  [16804.566580]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [16804.567153] RIP: 0033:0x7f4b1ba2add7
  (...)
  [16804.568907] RSP: 002b:00007f4b197adc88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [16804.569513] RAX: ffffffffffffffda RBX: 00007f4b100210d8 RCX: 00007f4b1ba2add7
  [16804.570133] RDX: 00007f4b100210d8 RSI: 00000000c020660b RDI: 0000000000000003
  [16804.570726] RBP: 000055de05a6cfe0 R08: 0000000000000000 R09: 00007f4b197add44
  [16804.571314] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4b197add48
  [16804.571905] R13: 00007f4b197add40 R14: 00007f4b100210d0 R15: 00007f4b197add50
  (...)
  [16804.575623] ---[ end trace 87317359aad4ba50 ]---

Fix this by making btrfs_put_tree_mod_seq() skip deletion of elements that
have a sequence number equals to the computed minimum sequence number, and
not just elements with a sequence number greater then that minimum.

Fixes: bd989ba ("Btrfs: add tree modification log functions")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
[ Upstream commit a37980ac5be29b83da67bf7d571c6bd9f90f8e45 ]

v4l2-compliance fails with this message:

   warn: v4l2-test-formats.cpp(717): \
   	TRY_FMT cannot handle an invalid pixelformat.
   test VIDIOC_TRY_FMT: FAIL

This causes the following kernel panic:

Unable to handle kernel paging request at virtual address 56595561
pgd = ecd80e00
*pgd=00000000
Internal error: Oops: 205 [150balbes#1] PREEMPT SMP ARM
...
CPU: 0 PID: 930 Comm: v4l2-compliance Not tainted \
	4.14.62-01715-gc8cd67f49a19 150balbes#1
Hardware name: Generic DRA72X (Flattened Device Tree)
task: ece44d80 task.stack: ecc6e000
PC is at __vpe_try_fmt+0x18c/0x2a8 [ti_vpe]
LR is at 0x8

Because the driver fails to properly check the 'num_planes' values for
proper ranges it ends up accessing out of bound data causing the kernel
panic.

Since this driver only handle single or dual plane pixel format, make
sure the provided value does not exceed 2 planes.

Signed-off-by: Benoit Parrot <bparrot@ti.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
…_clear_bit

[ Upstream commit fadcbd2901a0f7c8721f3bdb69eac95c272dc8ed ]

We need to move "spin_lock_irq(&bitmap->counts.lock)" before unmap previous
storage, otherwise panic like belows could happen as follows.

[  902.353802] sdl: detected capacity change from 1077936128 to 3221225472
[  902.616948] general protection fault: 0000 [150balbes#1] SMP
[snip]
[  902.618588] CPU: 12 PID: 33698 Comm: md0_raid1 Tainted: G           O    4.14.144-1-pserver 150balbes#4.14.144-1.1~deb10
[  902.618870] Hardware name: Supermicro SBA-7142G-T4/BHQGE, BIOS 3.00       10/24/2012
[  902.619120] task: ffff9ae1860fc600 task.stack: ffffb52e4c704000
[  902.619301] RIP: 0010:bitmap_file_clear_bit+0x90/0xd0 [md_mod]
[  902.619464] RSP: 0018:ffffb52e4c707d28 EFLAGS: 00010087
[  902.619626] RAX: ffe8008b0d061000 RBX: ffff9ad078c87300 RCX: 0000000000000000
[  902.619792] RDX: ffff9ad986341868 RSI: 0000000000000803 RDI: ffff9ad078c87300
[  902.619986] RBP: ffff9ad0ed7a8000 R08: 0000000000000000 R09: 0000000000000000
[  902.620154] R10: ffffb52e4c707ec0 R11: ffff9ad987d1ed44 R12: ffff9ad0ed7a8360
[  902.620320] R13: 0000000000000003 R14: 0000000000060000 R15: 0000000000000800
[  902.620487] FS:  0000000000000000(0000) GS:ffff9ad987d00000(0000) knlGS:0000000000000000
[  902.620738] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  902.620901] CR2: 000055ff12aecec0 CR3: 0000001005207000 CR4: 00000000000406e0
[  902.621068] Call Trace:
[  902.621256]  bitmap_daemon_work+0x2dd/0x360 [md_mod]
[  902.621429]  ? find_pers+0x70/0x70 [md_mod]
[  902.621597]  md_check_recovery+0x51/0x540 [md_mod]
[  902.621762]  raid1d+0x5c/0xeb0 [raid1]
[  902.621939]  ? try_to_del_timer_sync+0x4d/0x80
[  902.622102]  ? del_timer_sync+0x35/0x40
[  902.622265]  ? schedule_timeout+0x177/0x360
[  902.622453]  ? call_timer_fn+0x130/0x130
[  902.622623]  ? find_pers+0x70/0x70 [md_mod]
[  902.622794]  ? md_thread+0x94/0x150 [md_mod]
[  902.622959]  md_thread+0x94/0x150 [md_mod]
[  902.623121]  ? wait_woken+0x80/0x80
[  902.623280]  kthread+0x119/0x130
[  902.623437]  ? kthread_create_on_node+0x60/0x60
[  902.623600]  ret_from_fork+0x22/0x40
[  902.624225] RIP: bitmap_file_clear_bit+0x90/0xd0 [md_mod] RSP: ffffb52e4c707d28

Because mdadm was running on another cpu to do resize, so bitmap_resize was
called to replace bitmap as below shows.

PID: 38801  TASK: ffff9ad074a90e00  CPU: 0   COMMAND: "mdadm"
   [exception RIP: queued_spin_lock_slowpath+56]
   [snip]
-- <NMI exception stack> --
 150balbes#5 [ffffb52e60f17c58] queued_spin_lock_slowpath at ffffffff9c0b27b8
 150balbes#6 [ffffb52e60f17c58] bitmap_resize at ffffffffc0399877 [md_mod]
 150balbes#7 [ffffb52e60f17d30] raid1_resize at ffffffffc0285bf9 [raid1]
 150balbes#8 [ffffb52e60f17d50] update_size at ffffffffc038a31a [md_mod]
 150balbes#9 [ffffb52e60f17d70] md_ioctl at ffffffffc0395ca4 [md_mod]

And the procedure to keep resize bitmap safe is allocate new storage
space, then quiesce, copy bits, replace bitmap, and re-start.

However the daemon (bitmap_daemon_work) could happen even the array is
quiesced, which means when bitmap_file_clear_bit is triggered by raid1d,
then it thinks it should be fine to access store->filemap since
counts->lock is held, but resize could change the storage without the
protection of the lock.

Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
[ Upstream commit f4d99bdfd124823a81878b44b5e8750b97f73902 ]

Skip end-of-sequence and non-statement lines while walking through lines
list.

The "end-of-sequence" line information means:

 "the current address is that of the first byte after the
  end of a sequence of target machine instructions."
 (DWARF version 4 spec 6.2.2)

This actually means out of scope and we can not probe on it.

On the other hand, the statement lines (is_stmt) means:

 "the current instruction is a recommended breakpoint location.
  A recommended breakpoint location is intended to “represent”
  a line, a statement and/or a semantically distinct subpart
  of a statement."

 (DWARF version 4 spec 6.2.2)

So, non-statement line info also should be skipped.

These can reduce unneeded probe points and also avoid an error.

E.g. without this patch:

  # perf probe -a "clear_tasks_mm_cpumask:1"
  Added new events:
    probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_3 (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_4 (on clear_tasks_mm_cpumask:1)

  You can now use it in all perf tools, such as:

  	perf record -e probe:clear_tasks_mm_cpumask_4 -aR sleep 1

  #

This puts 5 probes on one line, but acutally it's not inlined function.
This is because there are many non statement instructions at the
function prologue.

With this patch:

  # perf probe -a "clear_tasks_mm_cpumask:1"
  Added new event:
    probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)

  You can now use it in all perf tools, such as:

  	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1

  #

Now perf-probe skips unneeded addresses.

Committer testing:

Slightly different results, but similar:

Before:

  # uname -a
  Linux quaco 5.3.8-200.fc30.x86_64 150balbes#1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  #
  # perf probe -a "clear_tasks_mm_cpumask:1"
  Added new events:
    probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
    probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)

  You can now use it in all perf tools, such as:

  	perf record -e probe:clear_tasks_mm_cpumask_2 -aR sleep 1

  #

After:

  # perf probe -a "clear_tasks_mm_cpumask:1"
  Added new event:
    probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)

  You can now use it in all perf tools, such as:

  	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1

  # perf probe -l
    probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
  #

Fixes: 4cc9cec ("perf probe: Introduce lines walker interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241936090.32002.12156347518596111660.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
…bprogram

[ Upstream commit da6cb952a89efe24bb76c4971370d485737a2d85 ]

Filter out instances except for inlined_subroutine and subprogram DIE in
die_walk_instances() and die_is_func_instance().

This fixes an issue that perf probe sets some probes on calling address
instead of a target function itself.

When perf probe walks on instances of an abstruct origin (a kind of
function prototype of inlined function), die_walk_instances() can also
pass a GNU_call_site (a GNU extension for call site) to callback. Since
it is not an inlined instance of target function, we have to filter out
when searching a probe point.

Without this patch, perf probe sets probes on call site address too.This
can happen on some function which is marked "inlined", but has actual
symbol. (I'm not sure why GCC mark it "inlined"):

  # perf probe -D vfs_read
  p:probe/vfs_read _text+2500017
  p:probe/vfs_read_1 _text+2499468
  p:probe/vfs_read_2 _text+2499563
  p:probe/vfs_read_3 _text+2498876
  p:probe/vfs_read_4 _text+2498512
  p:probe/vfs_read_5 _text+2498627

With this patch:

Slightly different results, similar tho:

  # perf probe -D vfs_read
  p:probe/vfs_read _text+2498512

Committer testing:

  # uname -a
  Linux quaco 5.3.8-200.fc30.x86_64 150balbes#1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Before:

  # perf probe -D vfs_read
  p:probe/vfs_read _text+3131557
  p:probe/vfs_read_1 _text+3130975
  p:probe/vfs_read_2 _text+3131047
  p:probe/vfs_read_3 _text+3130380
  p:probe/vfs_read_4 _text+3130000
  # uname -a
  Linux quaco 5.3.8-200.fc30.x86_64 150balbes#1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  #

After:

  # perf probe -D vfs_read
  p:probe/vfs_read _text+3130000
  #

Fixes: db0d2c6 ("perf probe: Search concrete out-of-line instances")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241937063.32002.11024544873990816590.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Dec 31, 2019
commit 246ff09f89e54fdf740a8d496176c86743db3ec7 upstream.

... because interrupts are disabled that early and sending IPIs can
deadlock:

  BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
  no locks held by swapper/1/0.
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
  softirqs last  enabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  Preemption disabled at:
  [<ffffffff8104703b>] start_secondary+0x3b/0x190
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ 150balbes#1
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  Call Trace:
   dump_stack
   ___might_sleep.cold.92
   wait_for_completion
   ? generic_exec_single
   rdmsr_safe_on_cpu
   ? wrmsr_on_cpus
   mce_amd_feature_init
   mcheck_cpu_init
   identify_cpu
   identify_secondary_cpu
   smp_store_cpu_info
   start_secondary
   secondary_startup_64

The function smca_configure() is called only on the current CPU anyway,
therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid
the IPI.

 [ bp: Update commit message. ]

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Jan 4, 2020
[ Upstream commit 238191d65d7217982d69e21c1d623616da34b281 ]

If a faulty initiator fails to bind the socket to the iSCSI connection
before emitting a command, for instance, a subsequent send_pdu, it will
crash the kernel due to a null pointer dereference in sock_sendmsg(), as
shown in the log below.  This patch makes sure the bind succeeded before
trying to use the socket.

BUG: kernel NULL pointer dereference, address: 0000000000000018
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [150balbes#1] SMP PTI
CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.4.0-rc2.iscsi+ 150balbes#13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   24.158246] Workqueue: iscsi_q_0 iscsi_xmitworker
[   24.158883] RIP: 0010:apparmor_socket_sendmsg+0x5/0x20
[...]
[   24.161739] RSP: 0018:ffffab6440043ca0 EFLAGS: 00010282
[   24.162400] RAX: ffffffff891c1c00 RBX: ffffffff89d53968 RCX: 0000000000000001
[   24.163253] RDX: 0000000000000030 RSI: ffffab6440043d00 RDI: 0000000000000000
[   24.164104] RBP: 0000000000000030 R08: 0000000000000030 R09: 0000000000000030
[   24.165166] R10: ffffffff893e66a0 R11: 0000000000000018 R12: ffffab6440043d00
[   24.166038] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d5575a62e90
[   24.166919] FS:  0000000000000000(0000) GS:ffff9d557db80000(0000) knlGS:0000000000000000
[   24.167890] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.168587] CR2: 0000000000000018 CR3: 000000007a838000 CR4: 00000000000006e0
[   24.169451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   24.170320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   24.171214] Call Trace:
[   24.171537]  security_socket_sendmsg+0x3a/0x50
[   24.172079]  sock_sendmsg+0x16/0x60
[   24.172506]  iscsi_sw_tcp_xmit_segment+0x77/0x120
[   24.173076]  iscsi_sw_tcp_pdu_xmit+0x58/0x170
[   24.173604]  ? iscsi_dbg_trace+0x63/0x80
[   24.174087]  iscsi_tcp_task_xmit+0x101/0x280
[   24.174666]  iscsi_xmit_task+0x83/0x110
[   24.175206]  iscsi_xmitworker+0x57/0x380
[   24.175757]  ? __schedule+0x2a2/0x700
[   24.176273]  process_one_work+0x1b5/0x360
[   24.176837]  worker_thread+0x50/0x3c0
[   24.177353]  kthread+0xf9/0x130
[   24.177799]  ? process_one_work+0x360/0x360
[   24.178401]  ? kthread_park+0x90/0x90
[   24.178915]  ret_from_fork+0x35/0x40
[   24.179421] Modules linked in:
[   24.179856] CR2: 0000000000000018
[   24.180327] ---[ end trace b4b7674b6df5f480 ]---

Signed-off-by: Anatol Pomazau <anatol@google.com>
Co-developed-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Co-developed-by: Bharath Ravi <rbharath@google.com>
Signed-off-by: Bharath Ravi <rbharath@google.com>
Co-developed-by: Khazhimsel Kumykov <khazhy@google.com>
Signed-off-by: Khazhimsel Kumykov <khazhy@google.com>
Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Jan 4, 2020
commit 5c9934b6767b16ba60be22ec3cbd4379ad64170d upstream.

We got another syzbot report [1] that tells us we must use
write_lock_irq()/write_unlock_irq() to avoid possible deadlock.

[1]

WARNING: inconsistent lock state
5.5.0-rc1-syzkaller #0 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage.
syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
{HARDIRQ-ON-W} state was registered at:
  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
  __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
  _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319
  sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657
  tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489
  tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585
  tiocsetd drivers/tty/tty_io.c:2337 [inline]
  tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597
  vfs_ioctl fs/ioctl.c:47 [inline]
  file_ioctl fs/ioctl.c:545 [inline]
  do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:749
  __do_sys_ioctl fs/ioctl.c:756 [inline]
  __se_sys_ioctl fs/ioctl.c:754 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754
  do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
irq event stamp: 3946
hardirqs last  enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last  enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199
hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42
softirqs last  enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline]
softirqs last  enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222
softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline]
softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(disc_data_lock);
  <Interrupt>
    lock(disc_data_lock);

 *** DEADLOCK ***

5 locks held by syz-executor826/9605:
 #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
 150balbes#1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413
 150balbes#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
 150balbes#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116
 150balbes#3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823
 150balbes#4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288

stack backtrace:
CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101
 valid_state kernel/locking/lockdep.c:3112 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3309 [inline]
 mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666
 mark_usage kernel/locking/lockdep.c:3554 [inline]
 __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909
 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
 _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223
 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402
 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536
 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50
 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387
 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104
 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761
 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834
 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline]
 serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850
 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126
 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149
 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189
 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206
 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830
 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline]
 do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250
 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607
 </IRQ>
RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline]
RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579
Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7
RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7
RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd
RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000
RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899
R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138
R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000
 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
 __mutex_lock_common kernel/locking/mutex.c:962 [inline]
 __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121
 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665
 __fput+0x2ff/0x890 fs/file_table.c:280
 ____fput+0x16/0x20 fs/file_table.c:313
 task_work_run+0x145/0x1c0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8e7/0x2ef0 kernel/exit.c:797
 do_group_exit+0x135/0x360 kernel/exit.c:895
 __do_sys_exit_group kernel/exit.c:906 [inline]
 __se_sys_exit_group kernel/exit.c:904 [inline]
 __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43fef8
Code: Bad RIP value.
RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000

Fixes: 6e4e2f8 ("6pack,mkiss: fix lock inconsistency")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Jan 4, 2020
[ Upstream commit a33121e5487b424339636b25c35d3a180eaa5f5e ]

In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
device is removed, closing this file leads to a race. This reproduces
easily in a kvm virtual machine:

ts# cat openptp0.c
int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
ts# uname -r
5.5.0-rc3-46cf053e
ts# cat /proc/cmdline
... slub_debug=FZP
ts# modprobe ptp_kvm
ts# ./openptp0 &
[1] 670
opened /dev/ptp0, sleeping 10s...
ts# rmmod ptp_kvm
ts# ls /dev/ptp*
ls: cannot access '/dev/ptp*': No such file or directory
ts# ...woken up
[   48.010809] general protection fault: 0000 [150balbes#1] SMP
[   48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e 150balbes#25
[   48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
[   48.016270] RIP: 0010:module_put.part.0+0x7/0x80
[   48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
[   48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
[   48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
[   48.019470] ...                                              ^^^ a slub poison
[   48.023854] Call Trace:
[   48.024050]  __fput+0x21f/0x240
[   48.024288]  task_work_run+0x79/0x90
[   48.024555]  do_exit+0x2af/0xab0
[   48.024799]  ? vfs_write+0x16a/0x190
[   48.025082]  do_group_exit+0x35/0x90
[   48.025387]  __x64_sys_exit_group+0xf/0x10
[   48.025737]  do_syscall_64+0x3d/0x130
[   48.026056]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   48.026479] RIP: 0033:0x7f53b12082f6
[   48.026792] ...
[   48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
[   48.045001] Fixing recursive fault but reboot is needed!

This happens in:

static void __fput(struct file *file)
{   ...
    if (file->f_op->release)
        file->f_op->release(inode, file); <<< cdev is kfree'd here
    if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
             !(mode & FMODE_PATH))) {
        cdev_put(inode->i_cdev); <<< cdev fields are accessed here

Namely:

__fput()
  posix_clock_release()
    kref_put(&clk->kref, delete_clock) <<< the last reference
      delete_clock()
        delete_ptp_clock()
          kfree(ptp) <<< cdev is embedded in ptp
  cdev_put
    module_put(p->owner) <<< *p is kfree'd, bang!

Here cdev is embedded in posix_clock which is embedded in ptp_clock.
The race happens because ptp_clock's lifetime is controlled by two
refcounts: kref and cdev.kobj in posix_clock. This is wrong.

Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
created especially for such cases. This way the parent device with its
ptp_clock is not released until all references to the cdev are released.
This adds a requirement that an initialized but not exposed struct
device should be provided to posix_clock_register() by a caller instead
of a simple dev_t.

This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix
the race between the release of watchdog_core_data and cdev"). See
details of the implementation in the commit 233ed09 ("chardev: add
helper function to register char devs with a struct device").

Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
Analyzed-by: Stephen Johnston <sjohnsto@redhat.com>
Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isjerryxiao pushed a commit to isjerryxiao/Amlogic_s905-kernel that referenced this issue Jan 4, 2020
[ Upstream commit 94dc550a5062030569d4aa76e10e50c8fc001930 ]

ipv4_pdp_find() is called in TX packet path of GTP.
ipv4_pdp_find() internally uses gtp->tid_hash to lookup pdp context.
In the current code, gtp->tid_hash and gtp->addr_hash are freed by
->dellink(), which is gtp_dellink().
But gtp_dellink() would be called while packets are processing.
So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash.
Instead, dev->priv_destructor() would be used because this callback
is called after all packet processing safely.

Test commands:
    ip link add veth1 type veth peer name veth2
    ip a a 172.0.0.1/24 dev veth1
    ip link set veth1 up
    ip a a 172.99.0.1/32 dev lo

    gtp-link add gtp1 &

    gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2
    ip r a  172.99.0.2/32 dev gtp1
    ip link set gtp1 mtu 1500

    ip netns add ns2
    ip link set veth2 netns ns2
    ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2
    ip netns exec ns2 ip link set veth2 up
    ip netns exec ns2 ip a a 172.99.0.2/32 dev lo
    ip netns exec ns2 ip link set lo up

    ip netns exec ns2 gtp-link add gtp2 &
    ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1
    ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2
    ip netns exec ns2 ip link set gtp2 mtu 1500

    hping3 172.99.0.2 -2 --flood &
    ip link del gtp1

Splat looks like:
[   72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.568916][ T1195] Read of size 8 at addr ffff8880b9a35d28 by task hping3/1195
[   72.569631][ T1195]
[   72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199
[   72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   72.571438][ T1195] Call Trace:
[   72.571764][ T1195]  dump_stack+0x96/0xdb
[   72.572171][ T1195]  ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.572761][ T1195]  print_address_description.constprop.5+0x1be/0x360
[   72.573400][ T1195]  ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.573971][ T1195]  ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.574544][ T1195]  __kasan_report+0x12a/0x16f
[   72.575014][ T1195]  ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.575593][ T1195]  kasan_report+0xe/0x20
[   72.576004][ T1195]  ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[   72.576577][ T1195]  gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ ... ]
[   72.647671][ T1195] BUG: unable to handle page fault for address: ffff8880b9a35d28
[   72.648512][ T1195] #PF: supervisor read access in kernel mode
[   72.649158][ T1195] #PF: error_code(0x0000) - not-present page
[   72.649849][ T1195] PGD a6c01067 P4D a6c01067 PUD 11fb07067 PMD 11f939067 PTE 800fffff465ca060
[   72.652958][ T1195] Oops: 0000 [150balbes#1] SMP DEBUG_PAGEALLOC KASAN PTI
[   72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G    B             5.5.0-rc1 #199
[   72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp]
[ ... ]
[   72.679168][ T1195] Call Trace:
[   72.679603][ T1195]  gtp_build_skb_ip4+0x199/0x1420 [gtp]
[   72.681915][ T1195]  ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp]
[   72.682513][ T1195]  ? lock_acquire+0x164/0x3b0
[   72.682966][ T1195]  ? gtp_dev_xmit+0x35e/0x890 [gtp]
[   72.683481][ T1195]  gtp_dev_xmit+0x3c2/0x890 [gtp]
[ ... ]

Fixes: 459aa66 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants