Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM: dts: stm32: use pll4_p for ethernet PTP clock #4

Closed

Conversation

ddrown
Copy link

@ddrown ddrown commented Apr 23, 2020

IEEE 1588 support needs a clock defined as ptp_ref and this uses pll4_p as the parent clock

before:
$ ./testptp -g ; sleep 1 ; ./testptp -g
clock time: 0.000000000 or Wed Dec 31 18:00:00 1969
clock time: 0.000000000 or Wed Dec 31 18:00:00 1969

after:
$ ./testptp -g ; sleep 1 ; ./testptp -g
clock time: 1587623094.279486315 or Thu Apr 23 01:24:54 2020
clock time: 1587623095.292978915 or Thu Apr 23 01:24:55 2020

IEEE 1588 support needs a clock defined as ptp_ref and this uses pll4_p
as the parent clock
@atorgue
Copy link
Collaborator

@atorgue atorgue commented Apr 23, 2020

Hi Dan,

Thanks for your patch. It's something I use internally to test ptp feature. However, it's not yet in branch as pll4_p could be shared with other devices and his rate changed. I gonna raise an internal ticket to see if we can change default clock tree to make ptp feature enable. In this case, no longer need to add an assigned-clock parent in device tree.

regards
alex

ADESTM pushed a commit that referenced this issue Jun 22, 2020
[ Upstream commit 102210f ]

rmnet_get_port() internally calls rcu_dereference_rtnl(),
which checks RTNL.
But rmnet_get_port() could be called by packet path.
The packet path is not protected by RTNL.
So, the suspicious RCU usage problem occurs.

Test commands:
    modprobe rmnet
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link set veth1 netns nst
    ip link add rmnet0 link veth0 type rmnet mux_id 1
    ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set rmnet1 up
    ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1
    ip link set veth0 up
    ip link set rmnet0 up
    ip a a 192.168.100.1/24 dev rmnet0
    ping 192.168.100.2

Splat looks like:
[  146.630958][ T1174] WARNING: suspicious RCU usage
[  146.631735][ T1174] 5.6.0-rc1+ torvalds#447 Not tainted
[  146.632387][ T1174] -----------------------------
[  146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() !
[  146.634742][ T1174]
[  146.634742][ T1174] other info that might help us debug this:
[  146.634742][ T1174]
[  146.645992][ T1174]
[  146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1
[  146.646937][ T1174] 5 locks held by ping/1174:
[  146.647609][ T1174]  #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980
[  146.662463][ T1174]  #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150
[  146.671696][ T1174]  #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940
[  146.673064][ T1174]  #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150
[  146.690358][ T1174]  #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020
[  146.699875][ T1174]
[  146.699875][ T1174] stack backtrace:
[  146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ torvalds#447
[  146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  146.706565][ T1174] Call Trace:
[  146.707102][ T1174]  dump_stack+0x96/0xdb
[  146.708007][ T1174]  rmnet_get_port.part.9+0x76/0x80 [rmnet]
[  146.709233][ T1174]  rmnet_egress_handler+0x107/0x420 [rmnet]
[  146.710492][ T1174]  ? sch_direct_xmit+0x1e2/0x1020
[  146.716193][ T1174]  rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet]
[  146.717012][ T1174]  dev_hard_start_xmit+0x160/0x740
[  146.717854][ T1174]  sch_direct_xmit+0x265/0x1020
[  146.718577][ T1174]  ? register_lock_class+0x14d0/0x14d0
[  146.719429][ T1174]  ? dev_watchdog+0xac0/0xac0
[  146.723738][ T1174]  ? __dev_queue_xmit+0x15fd/0x2940
[  146.724469][ T1174]  ? lock_acquire+0x164/0x3b0
[  146.725172][ T1174]  __dev_queue_xmit+0x20c7/0x2940
[ ... ]

Fixes: ceed73a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
[ Upstream commit 6c5d911 ]

journal_head::b_transaction and journal_head::b_next_transaction could
be accessed concurrently as noticed by KCSAN,

 LTP: starting fsync04
 /dev/zero: Can't open blockdev
 EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
 EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
 ==================================================================
 BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]

 write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
  __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
  __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
  jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
  (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
  kjournald2+0x13b/0x450 [jbd2]
  kthread+0x1cd/0x1f0
  ret_from_fork+0x27/0x50

 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
  jbd2_write_access_granted+0x1b2/0x250 [jbd2]
  jbd2_write_access_granted at fs/jbd2/transaction.c:1155
  jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
  __ext4_journal_get_write_access+0x50/0x90 [ext4]
  ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
  ext4_mb_new_blocks+0x54f/0xca0 [ext4]
  ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
  ext4_map_blocks+0x3b4/0x950 [ext4]
  _ext4_get_block+0xfc/0x270 [ext4]
  ext4_get_block+0x3b/0x50 [ext4]
  __block_write_begin_int+0x22e/0xae0
  __block_write_begin+0x39/0x50
  ext4_write_begin+0x388/0xb50 [ext4]
  generic_perform_write+0x15d/0x290
  ext4_buffered_write_iter+0x11f/0x210 [ext4]
  ext4_file_write_iter+0xce/0x9e0 [ext4]
  new_sync_write+0x29c/0x3b0
  __vfs_write+0x92/0xa0
  vfs_write+0x103/0x260
  ksys_write+0x9d/0x130
  __x64_sys_write+0x4c/0x60
  do_syscall_64+0x91/0xb05
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 5 locks held by fsync04/25724:
  #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
  #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
  #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
  #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
  #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
 irq event stamp: 1407125
 hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
 softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
 softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

The plain reads are outside of jh->b_state_lock critical section which result
in data races. Fix them by adding pairs of READ|WRITE_ONCE().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
[ Upstream commit a866759 ]

This reverts commit 64e62bd.

This commit ends up causing some lockdep splats due to trying to grab the
payload lock while holding the mgr's lock:

[   54.010099]
[   54.011765] ======================================================
[   54.018670] WARNING: possible circular locking dependency detected
[   54.025577] 5.5.0-rc6-02274-g77381c23ee63 torvalds#47 Not tainted
[   54.031610] ------------------------------------------------------
[   54.038516] kworker/1:6/1040 is trying to acquire lock:
[   54.044354] ffff888272af3228 (&mgr->payload_lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.054957]
[   54.054957] but task is already holding lock:
[   54.061473] ffff888272af3060 (&mgr->lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4
[   54.071193]
[   54.071193] which lock already depends on the new lock.
[   54.071193]
[   54.080334]
[   54.080334] the existing dependency chain (in reverse order) is:
[   54.088697]
[   54.088697] -> #1 (&mgr->lock){+.+.}:
[   54.094440]        __mutex_lock+0xc3/0x498
[   54.099015]        drm_dp_mst_topology_get_port_validated+0x25/0x80
[   54.106018]        drm_dp_update_payload_part1+0xa2/0x2e2
[   54.112051]        intel_mst_pre_enable_dp+0x144/0x18f
[   54.117791]        intel_encoders_pre_enable+0x63/0x70
[   54.123532]        hsw_crtc_enable+0xa1/0x722
[   54.128396]        intel_update_crtc+0x50/0x194
[   54.133455]        skl_commit_modeset_enables+0x40c/0x540
[   54.139485]        intel_atomic_commit_tail+0x5f7/0x130d
[   54.145418]        intel_atomic_commit+0x2c8/0x2d8
[   54.150770]        drm_atomic_helper_set_config+0x5a/0x70
[   54.156801]        drm_mode_setcrtc+0x2ab/0x833
[   54.161862]        drm_ioctl+0x2e5/0x424
[   54.166242]        vfs_ioctl+0x21/0x2f
[   54.170426]        do_vfs_ioctl+0x5fb/0x61e
[   54.175096]        ksys_ioctl+0x55/0x75
[   54.179377]        __x64_sys_ioctl+0x1a/0x1e
[   54.184146]        do_syscall_64+0x5c/0x6d
[   54.188721]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   54.194946]
[   54.194946] -> #0 (&mgr->payload_lock){+.+.}:
[   54.201463]
[   54.201463] other info that might help us debug this:
[   54.201463]
[   54.210410]  Possible unsafe locking scenario:
[   54.210410]
[   54.217025]        CPU0                    CPU1
[   54.222082]        ----                    ----
[   54.227138]   lock(&mgr->lock);
[   54.230643]                                lock(&mgr->payload_lock);
[   54.237742]                                lock(&mgr->lock);
[   54.244062]   lock(&mgr->payload_lock);
[   54.248346]
[   54.248346]  *** DEADLOCK ***
[   54.248346]
[   54.254959] 7 locks held by kworker/1:6/1040:
[   54.259822]  #0: ffff888275c4f528 ((wq_completion)events){+.+.},
at: worker_thread+0x455/0x6e2
[   54.269451]  #1: ffffc9000119beb0
((work_completion)(&(&dev_priv->hotplug.hotplug_work)->work)){+.+.},
at: worker_thread+0x455/0x6e2
[   54.282768]  #2: ffff888272a403f0 (&dev->mode_config.mutex){+.+.},
at: i915_hotplug_work_func+0x4b/0x2be
[   54.293368]  #3: ffffffff824fc6c0 (drm_connector_list_iter){.+.+},
at: i915_hotplug_work_func+0x17e/0x2be
[   54.304061]  #4: ffffc9000119bc58 (crtc_ww_class_acquire){+.+.},
at: drm_helper_probe_detect_ctx+0x40/0xfd
[   54.314855]  #5: ffff888272a40470 (crtc_ww_class_mutex){+.+.}, at:
drm_modeset_lock+0x74/0xe2
[   54.324385]  #6: ffff888272af3060 (&mgr->lock){+.+.}, at:
drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4
[   54.334597]
[   54.334597] stack backtrace:
[   54.339464] CPU: 1 PID: 1040 Comm: kworker/1:6 Not tainted
5.5.0-rc6-02274-g77381c23ee63 torvalds#47
[   54.348893] Hardware name: Google Fizz/Fizz, BIOS
Google_Fizz.10139.39.0 01/04/2018
[   54.357451] Workqueue: events i915_hotplug_work_func
[   54.362995] Call Trace:
[   54.365724]  dump_stack+0x71/0x9c
[   54.369427]  check_noncircular+0x91/0xbc
[   54.373809]  ? __lock_acquire+0xc9e/0xf66
[   54.378286]  ? __lock_acquire+0xc9e/0xf66
[   54.382763]  ? lock_acquire+0x175/0x1ac
[   54.387048]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.393177]  ? __mutex_lock+0xc3/0x498
[   54.397362]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.403492]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.409620]  ? drm_dp_dpcd_access+0xd9/0x101
[   54.414390]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.420517]  ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4
[   54.426645]  ? intel_digital_port_connected+0x34d/0x35c
[   54.432482]  ? intel_dp_detect+0x227/0x44e
[   54.437056]  ? ww_mutex_lock+0x49/0x9a
[   54.441242]  ? drm_helper_probe_detect_ctx+0x75/0xfd
[   54.446789]  ? intel_encoder_hotplug+0x4b/0x97
[   54.451752]  ? intel_ddi_hotplug+0x61/0x2e0
[   54.456423]  ? mark_held_locks+0x53/0x68
[   54.460803]  ? _raw_spin_unlock_irqrestore+0x3a/0x51
[   54.466347]  ? lockdep_hardirqs_on+0x187/0x1a4
[   54.471310]  ? drm_connector_list_iter_next+0x89/0x9a
[   54.476953]  ? i915_hotplug_work_func+0x206/0x2be
[   54.482208]  ? worker_thread+0x4d5/0x6e2
[   54.486587]  ? worker_thread+0x455/0x6e2
[   54.490966]  ? queue_work_on+0x64/0x64
[   54.495151]  ? kthread+0x1e9/0x1f1
[   54.498946]  ? queue_work_on+0x64/0x64
[   54.503130]  ? kthread_unpark+0x5e/0x5e
[   54.507413]  ? ret_from_fork+0x3a/0x50

The proper fix for this is probably cleanup the VCPI allocations when we're
enabling the topology, or on the first payload allocation. For now though,
let's just revert.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 64e62bd ("drm/dp_mst: Remove VCPI while disabling topology mgr")
Cc: Sean Paul <sean@poorly.run>
Cc: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117205149.97262-1-lyude@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
commit 9f61419 upstream.

Looks like the dma_sync calls don't do what we want on armv7 either.
Fixes:

  Unable to handle kernel paging request at virtual address 50001000
  pgd = (ptrval)
  [50001000] *pgd=00000000
  Internal error: Oops: 805 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc6-00271-g9f159ae07f07 #4
  Hardware name: Freescale i.MX53 (Device Tree Support)
  PC is at v7_dma_clean_range+0x20/0x38
  LR is at __dma_page_cpu_to_dev+0x28/0x90
  pc : [<c011c76c>]    lr : [<c01181c4>]    psr: 20000013
  sp : d80b5a88  ip : de96c000  fp : d840ce6c
  r10: 00000000  r9 : 00000001  r8 : d843e010
  r7 : 00000000  r6 : 00008000  r5 : ddb6c000  r4 : 00000000
  r3 : 0000003f  r2 : 00000040  r1 : 50008000  r0 : 50001000
  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 70004019  DAC: 00000051
  Process swapper/0 (pid: 1, stack limit = 0x(ptrval))

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: 3de433c ("drm/msm: Use the correct dma_sync calls in msm_gem")
Tested-by: Fabio Estevam <festevam@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
commit 56df70a upstream.

find_mergeable_vma() can return NULL.  In this case, it leads to a crash
when we access vm_mm(its offset is 0x40) later in write_protect_page.
And this case did happen on our server.  The following call trace is
captured in kernel 4.19 with the following patch applied and KSM zero
page enabled on our server.

  commit e86c59b ("mm/ksm: improve deduplication of zero pages with colouring")

So add a vma check to fix it.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
  Oops: 0000 [#1] SMP NOPTI
  CPU: 9 PID: 510 Comm: ksmd Kdump: loaded Tainted: G OE 4.19.36.bsk.9-amd64 #4.19.36.bsk.9
  RIP: try_to_merge_one_page+0xc7/0x760
  Code: 24 58 65 48 33 34 25 28 00 00 00 89 e8 0f 85 a3 06 00 00 48 83 c4
        60 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 46 08 a8 01 75 b8 <49>
        8b 44 24 40 4c 8d 7c 24 20 b9 07 00 00 00 4c 89 e6 4c 89 ff 48
  RSP: 0018:ffffadbdd9fffdb0 EFLAGS: 00010246
  RAX: ffffda83ffd4be08 RBX: ffffda83ffd4be40 RCX: 0000002c6e800000
  RDX: 0000000000000000 RSI: ffffda83ffd4be40 RDI: 0000000000000000
  RBP: ffffa11939f02ec0 R08: 0000000094e1a447 R09: 00000000abe76577
  R10: 0000000000000962 R11: 0000000000004e6a R12: 0000000000000000
  R13: ffffda83b1e06380 R14: ffffa18f31f072c0 R15: ffffda83ffd4be40
  FS: 0000000000000000(0000) GS:ffffa0da43b80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000040 CR3: 0000002c77c0a003 CR4: 00000000007626e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
    ksm_scan_thread+0x115e/0x1960
    kthread+0xf5/0x130
    ret_from_fork+0x1f/0x30

[songmuchun@bytedance.com: if the vma is out of date, just exit]
  Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com
[akpm@linux-foundation.org: add the conventional braces, replace /** with /*]
Fixes: e86c59b ("mm/ksm: improve deduplication of zero pages with colouring")
Co-developed-by: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Markus Elfring <Markus.Elfring@web.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416025034.29780-1-songmuchun@bytedance.com
Link: http://lkml.kernel.org/r/20200414132905.83819-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
commit e84fe99 upstream.

Without CONFIG_PREEMPT, it can happen that we get soft lockups detected,
e.g., while booting up.

  watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4
  Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
  RIP: __pageblock_pfn_to_page+0x134/0x1c0
  Call Trace:
   set_zone_contiguous+0x56/0x70
   page_alloc_init_late+0x166/0x176
   kernel_init_freeable+0xfa/0x255
   kernel_init+0xa/0x106
   ret_from_fork+0x35/0x40

The issue becomes visible when having a lot of memory (e.g., 4TB)
assigned to a single NUMA node - a system that can easily be created
using QEMU.  Inside VMs on a hypervisor with quite some memory
overcommit, this is fairly easy to trigger.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Jun 22, 2020
commit 00e2176 upstream.

The check for the HWO flag in dwc3_gadget_ep_reclaim_trb_sg()
causes us to break out of the loop before we call
dwc3_gadget_ep_reclaim_completed_trb(), which is what likely
should be clearing the HWO flag.

This can cause odd behavior where we never reclaim all the trbs
in the sg list, so we never call giveback on a usb req, and that
will causes transfer stalls.

This effectively resovles the adb stalls seen on HiKey960
after userland changes started only using AIO in adbd.

Cc: YongQin Liu <yongqin.liu@linaro.org>
Cc: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com>
Cc: Yang Fei <fei.yang@intel.com>
Cc: Thinh Nguyen <thinhn@synopsys.com>
Cc: Tejas Joglekar <tejas.joglekar@synopsys.com>
Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Cc: Jack Pham <jackp@codeaurora.org>
Cc: Josh Gao <jmgao@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org #4.20+
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit 936e6b8 upstream.

Suppose that, for unrelated reasons, FSF requests on behalf of recovery are
very slow and can run into the ERP timeout.

In the case at hand, we did adapter recovery to a large degree.  However
due to the slowness a LUN open is pending so the corresponding fc_rport
remains blocked.  After fast_io_fail_tmo we trigger close physical port
recovery for the port under which the LUN should have been opened.  The new
higher order port recovery dismisses the pending LUN open ERP action and
dismisses the pending LUN open FSF request.  Such dismissal decouples the
ERP action from the pending corresponding FSF request by setting
zfcp_fsf_req->erp_action to NULL (among other things)
[zfcp_erp_strategy_check_fsfreq()].

If now the ERP timeout for the pending open LUN request runs out, we must
not use zfcp_fsf_req->erp_action in the ERP timeout handler.  This is a
problem since v4.15 commit 75492a5 ("s390/scsi: Convert timers to use
timer_setup()"). Before that we intentionally only passed zfcp_erp_action
as context argument to zfcp_erp_timeout_handler().

Note: The lifetime of the corresponding zfcp_fsf_req object continues until
a (late) response or an (unrelated) adapter recovery.

Just like the regular response path ignores dismissed requests
[zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the
ERP timeout handler now needs to ignore dismissed requests.  So simply
return early in the ERP timeout handler if the FSF request is marked as
dismissed in its status flags.  To protect against the race where
zfcp_erp_strategy_check_fsfreq() dismisses and sets
zfcp_fsf_req->erp_action to NULL after our previous status flag check,
return early if zfcp_fsf_req->erp_action is NULL.  After all, the former
ERP action does not need to be woken up as that was already done as part of
the dismissal above [zfcp_erp_action_dismiss()].

This fixes the following panic due to kernel page fault in IRQ context:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d
Oops: 0004 ilc:2 [#1] SMP
Modules linked in: ...
CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G            E  X   ...
Hardware name: IBM 8561 T01 701 (LPAR)
Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp])
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000
           000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02
           0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000
           0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8
Krnl Code: 001fffff80549bd4: a7180000            lhi     %r1,0
           001fffff80549bd8: 4120a0f0            la      %r2,240(%r10)
          #001fffff80549bdc: a53e0003            llilh   %r3,3
          >001fffff80549be0: ba132000            cs      %r1,%r3,0(%r2)
           001fffff80549be4: a7740037            brc     7,1fffff80549c52
           001fffff80549be8: e320b0180004        lg      %r2,24(%r11)
           001fffff80549bee: e31020e00004        lg      %r1,224(%r2)
           001fffff80549bf4: 412020e0            la      %r2,224(%r2)
Call Trace:
 [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp]
 [<00000985915e26f0>] call_timer_fn+0x38/0x190
 [<00000985915e2944>] expire_timers+0xfc/0x190
 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218
 [<0000098591ca7c4c>] __do_softirq+0x144/0x398
 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88
 [<0000098591551b58>] irq_exit+0xb0/0xb8
 [<0000098591510c6a>] do_IRQ+0x82/0xb0
 [<0000098591ca7140>] ext_int_handler+0x128/0x12c
 [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60
([<000009859172ae4c>] clear_huge_page+0xec/0x250)
 [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768
 [<000009859172a712>] __handle_mm_fault+0x88a/0x900
 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0
 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8
 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220
Last Breaking-Event-Address:
 [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp]
Kernel panic - not syncing: Fatal exception in interrupt

Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com
Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> #4.15+
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
[ Upstream commit a066562 ]

Enable promisc mode of PF, set VF link state to enable, and
run iperf of the VF, then do self test of the PF. The self test
will fail with a low frequency, and may cause a use-after-free
problem.

[   87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   87.159722] ==================================================================
[   87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608
[   87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186
[   87.201012]
[   87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty #4
[   87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020
[   87.238292] Call trace:
[   87.243173]  dump_backtrace+0x0/0x280
[   87.250491]  show_stack+0x24/0x30
[   87.257114]  dump_stack+0xe8/0x140
[   87.263911]  print_address_description.isra.8+0x70/0x380
[   87.274538]  __kasan_report+0x12c/0x230
[   87.282203]  kasan_report+0xc/0x18
[   87.288999]  __asan_load1+0x60/0x68
[   87.295969]  hex_dump_to_buffer+0x140/0x608
[   87.304332]  print_hex_dump+0x140/0x1e0
[   87.312000]  hns3_lb_check_skb_data+0x168/0x170
[   87.321060]  hns3_clean_rx_ring+0xa94/0xfe0
[   87.329422]  hns3_self_test+0x708/0x8c0

The length of packet sent by the selftest process is only
128 + 14 bytes, and the min buffer size of a BD is 256 bytes,
and the receive process will make sure the packet sent by
the selftest process is in the linear part, so only check
the linear part in hns3_lb_check_skb_data().

So fix this use-after-free by using skb_headlen() to dump
skb->data instead of skb->len.

Fixes: c39c4d9 ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit f5e5677 upstream.

NULL pointer exception happens occasionally on serial output initiated
by login timeout.  This was reproduced only if kernel was built with
significant debugging options and EDMA driver is used with serial
console.

    col-vf50 login: root
    Password:
    Login timed out after 60 seconds.
    Unable to handle kernel NULL pointer dereference at virtual address 00000044
    Internal error: Oops: 5 [#1] ARM
    CPU: 0 PID: 157 Comm: login Not tainted 5.7.0-next-20200610-dirty #4
    Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
      (fsl_edma_tx_handler) from [<8016eb10>] (__handle_irq_event_percpu+0x64/0x304)
      (__handle_irq_event_percpu) from [<8016eddc>] (handle_irq_event_percpu+0x2c/0x7c)
      (handle_irq_event_percpu) from [<8016ee64>] (handle_irq_event+0x38/0x5c)
      (handle_irq_event) from [<801729e4>] (handle_fasteoi_irq+0xa4/0x160)
      (handle_fasteoi_irq) from [<8016ddcc>] (generic_handle_irq+0x34/0x44)
      (generic_handle_irq) from [<8016e40c>] (__handle_domain_irq+0x54/0xa8)
      (__handle_domain_irq) from [<80508bc8>] (gic_handle_irq+0x4c/0x80)
      (gic_handle_irq) from [<80100af0>] (__irq_svc+0x70/0x98)
    Exception stack(0x8459fe80 to 0x8459fec8)
    fe80: 72286b00 e3359f64 00000001 0000412d a0070013 85c98840 85c98840 a0070013
    fea0: 8054e0d4 00000000 00000002 00000000 00000002 8459fed0 8081fbe8 8081fbec
    fec0: 60070013 ffffffff
      (__irq_svc) from [<8081fbec>] (_raw_spin_unlock_irqrestore+0x30/0x58)
      (_raw_spin_unlock_irqrestore) from [<8056cb48>] (uart_flush_buffer+0x88/0xf8)
      (uart_flush_buffer) from [<80554e60>] (tty_ldisc_hangup+0x38/0x1ac)
      (tty_ldisc_hangup) from [<8054c7f4>] (__tty_hangup+0x158/0x2bc)
      (__tty_hangup) from [<80557b90>] (disassociate_ctty.part.1+0x30/0x23c)
      (disassociate_ctty.part.1) from [<8011fc18>] (do_exit+0x580/0xba0)
      (do_exit) from [<801214f8>] (do_group_exit+0x3c/0xb4)
      (do_group_exit) from [<80121580>] (__wake_up_parent+0x0/0x14)

Issue looks like race condition between interrupt handler fsl_edma_tx_handler()
(called as result of fsl_edma_xfer_desc()) and terminating the transfer with
fsl_edma_terminate_all().

The fsl_edma_tx_handler() handles interrupt for a transfer with already freed
edesc and idle==true.

Fixes: d6be34f ("dma: Add Freescale eDMA engine driver support")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Robin Gong <yibin.gong@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1591877861-28156-2-git-send-email-krzk@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit fc846e9 upstream.

The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked.  Shift amounts greater than or equal to 32 will result in
undefined behavior.  Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity.  Only channels 0 to 15
are valid.

Fixes: a8c66b6 ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable@vger.kernel.org> #4.0+: ef75e14: staging: comedi: verify array index is correct before using it
Cc: <stable@vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit 254503a upstream.

The drm/omap driver was fixed to correct an issue where using a
divider of 32 breaks the DSS despite the TRM stating 32 is a valid
number.  Through experimentation, it appears that 31 works, and
it is consistent with the value used by the drm/omap driver.

This patch fixes the divider for fbdev driver instead of the drm.

Fixes: f76ee89 ("omapfb: copy omapdss & displays for omapfb")
Cc: <stable@vger.kernel.org> #4.5+
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
[b.zolnierkie: mark patch as applicable to stable 4.5+ (was 4.9+)]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630182636.439015-1-aford173@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
[ Upstream commit e24c644 ]

I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        torvalds#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        torvalds#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        torvalds#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        torvalds#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        torvalds#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        torvalds#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        torvalds#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        torvalds#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        torvalds#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit 18c850f upstream.

There's long existed a lockdep splat because we open our bdev's under
the ->device_list_mutex at mount time, which acquires the bd_mutex.
Usually this goes unnoticed, but if you do loopback devices at all
suddenly the bd_mutex comes with a whole host of other dependencies,
which results in the splat when you mount a btrfs file system.

======================================================
WARNING: possible circular locking dependency detected
5.8.0-0.rc3.1.fc33.x86_64+debug #1 Not tainted
------------------------------------------------------
systemd-journal/509 is trying to acquire lock:
ffff970831f84db0 (&fs_info->reloc_mutex){+.+.}-{3:3}, at: btrfs_record_root_in_trans+0x44/0x70 [btrfs]

but task is already holding lock:
ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

 -> #6 (sb_pagefaults){.+.+}-{0:0}:
       __sb_start_write+0x13e/0x220
       btrfs_page_mkwrite+0x59/0x560 [btrfs]
       do_page_mkwrite+0x4f/0x130
       do_wp_page+0x3b0/0x4f0
       handle_mm_fault+0xf47/0x1850
       do_user_addr_fault+0x1fc/0x4b0
       exc_page_fault+0x88/0x300
       asm_exc_page_fault+0x1e/0x30

 -> #5 (&mm->mmap_lock#2){++++}-{3:3}:
       __might_fault+0x60/0x80
       _copy_from_user+0x20/0xb0
       get_sg_io_hdr+0x9a/0xb0
       scsi_cmd_ioctl+0x1ea/0x2f0
       cdrom_ioctl+0x3c/0x12b4
       sr_block_ioctl+0xa4/0xd0
       block_ioctl+0x3f/0x50
       ksys_ioctl+0x82/0xc0
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #4 (&cd->lock){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       sr_block_open+0xa2/0x180
       __blkdev_get+0xdd/0x550
       blkdev_get+0x38/0x150
       do_dentry_open+0x16b/0x3e0
       path_openat+0x3c9/0xa00
       do_filp_open+0x75/0x100
       do_sys_openat2+0x8a/0x140
       __x64_sys_openat+0x46/0x70
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #3 (&bdev->bd_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       __blkdev_get+0x6a/0x550
       blkdev_get+0x85/0x150
       blkdev_get_by_path+0x2c/0x70
       btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs]
       open_fs_devices+0x88/0x240 [btrfs]
       btrfs_open_devices+0x92/0xa0 [btrfs]
       btrfs_mount_root+0x250/0x490 [btrfs]
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       vfs_kern_mount.part.0+0x71/0xb0
       btrfs_mount+0x119/0x380 [btrfs]
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       do_mount+0x8c6/0xca0
       __x64_sys_mount+0x8e/0xd0
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #2 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       btrfs_run_dev_stats+0x36/0x420 [btrfs]
       commit_cowonly_roots+0x91/0x2d0 [btrfs]
       btrfs_commit_transaction+0x4e6/0x9f0 [btrfs]
       btrfs_sync_file+0x38a/0x480 [btrfs]
       __x64_sys_fdatasync+0x47/0x80
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #1 (&fs_info->tree_log_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       btrfs_commit_transaction+0x48e/0x9f0 [btrfs]
       btrfs_sync_file+0x38a/0x480 [btrfs]
       __x64_sys_fdatasync+0x47/0x80
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #0 (&fs_info->reloc_mutex){+.+.}-{3:3}:
       __lock_acquire+0x1241/0x20c0
       lock_acquire+0xb0/0x400
       __mutex_lock+0x7b/0x820
       btrfs_record_root_in_trans+0x44/0x70 [btrfs]
       start_transaction+0xd2/0x500 [btrfs]
       btrfs_dirty_inode+0x44/0xd0 [btrfs]
       file_update_time+0xc6/0x120
       btrfs_page_mkwrite+0xda/0x560 [btrfs]
       do_page_mkwrite+0x4f/0x130
       do_wp_page+0x3b0/0x4f0
       handle_mm_fault+0xf47/0x1850
       do_user_addr_fault+0x1fc/0x4b0
       exc_page_fault+0x88/0x300
       asm_exc_page_fault+0x1e/0x30

other info that might help us debug this:

Chain exists of:
  &fs_info->reloc_mutex --> &mm->mmap_lock#2 --> sb_pagefaults

Possible unsafe locking scenario:

     CPU0                    CPU1
     ----                    ----
 lock(sb_pagefaults);
                             lock(&mm->mmap_lock#2);
                             lock(sb_pagefaults);
 lock(&fs_info->reloc_mutex);

 *** DEADLOCK ***

3 locks held by systemd-journal/509:
 #0: ffff97083bdec8b8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x12e/0x4b0
 #1: ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]
 #2: ffff97083144d6a8 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x3f8/0x500 [btrfs]

stack backtrace:
CPU: 0 PID: 509 Comm: systemd-journal Not tainted 5.8.0-0.rc3.1.fc33.x86_64+debug #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
 dump_stack+0x92/0xc8
 check_noncircular+0x134/0x150
 __lock_acquire+0x1241/0x20c0
 lock_acquire+0xb0/0x400
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 ? lock_acquire+0xb0/0x400
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 __mutex_lock+0x7b/0x820
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 ? kvm_sched_clock_read+0x14/0x30
 ? sched_clock+0x5/0x10
 ? sched_clock_cpu+0xc/0xb0
 btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 start_transaction+0xd2/0x500 [btrfs]
 btrfs_dirty_inode+0x44/0xd0 [btrfs]
 file_update_time+0xc6/0x120
 btrfs_page_mkwrite+0xda/0x560 [btrfs]
 ? sched_clock+0x5/0x10
 do_page_mkwrite+0x4f/0x130
 do_wp_page+0x3b0/0x4f0
 handle_mm_fault+0xf47/0x1850
 do_user_addr_fault+0x1fc/0x4b0
 exc_page_fault+0x88/0x300
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fa3972fdbfe
Code: Bad RIP value.

Fix this by not holding the ->device_list_mutex at this point.  The
device_list_mutex exists to protect us from modifying the device list
while the file system is running.

However it can also be modified by doing a scan on a device.  But this
action is specifically protected by the uuid_mutex, which we are holding
here.  We cannot race with opening at this point because we have the
->s_mount lock held during the mount.  Not having the
->device_list_mutex here is perfectly safe as we're not going to change
the devices at this point.

CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add some comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ADESTM pushed a commit that referenced this issue Sep 25, 2020
commit 2d9a2c5 upstream.

Before v4.15 commit 75492a5 ("s390/scsi: Convert timers to use
timer_setup()"), we intentionally only passed zfcp_adapter as context
argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
adapter recovery, it was unnecessary to sync against races between timeout
and (late) completion.  Likewise, we only passed zfcp_erp_action as context
argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action,
it was unnecessary to sync against races between timeout and (late)
completion.

Meanwhile the timeout handlers get timer_list as context argument and do a
timer-specific container-of to zfcp_fsf_req which can have been freed.

Fix it by making sure that any request timeout handlers, that might just
have started before del_timer(), are completed by using del_timer_sync()
instead. This ensures the request free happens afterwards.

Space time diagram of potential use-after-free:

Basic idea is to have 2 or more pending requests whose timeouts run out at
almost the same time.

req 1 timeout     ERP thread        req 2 timeout
----------------  ----------------  ---------------------------------------
zfcp_fsf_request_timeout_handler
fsf_req = from_timer(fsf_req, t, timer)
adapter = fsf_req->adapter
zfcp_qdio_siosl(adapter)
zfcp_erp_adapter_reopen(adapter,...)
                  zfcp_erp_strategy
                  ...
                  zfcp_fsf_req_dismiss_all
                  list_for_each_entry_safe
                    zfcp_fsf_req_complete 1
                    del_timer 1
                    zfcp_fsf_req_free 1
                    zfcp_fsf_req_complete 2
                                    zfcp_fsf_request_timeout_handler
                    del_timer 2
                                    fsf_req = from_timer(fsf_req, t, timer)
                    zfcp_fsf_req_free 2
                                    adapter = fsf_req->adapter
                                              ^^^^^^^ already freed

Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com
Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> #4.15+
Suggested-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
atorgue added a commit that referenced this issue Nov 2, 2020
Mutex protection in not needed in _allocate_opp_table as it this function
is already called under mutex lock. Furthermore, it fixes the following
issue seen with CONFIG_PROVE_LOCKING flag:

[    3.648912] =========================
[    3.648927] WARNING: held lock freed!
[    3.648944] 5.4.56-00014-g7cc4a3bdec16 #7 Not tainted
[    3.648957] -------------------------
[    3.648976] kworker/1:1/22 is freeing memory df183c00-df183fff, with a lock still held there!
[    3.648991] df183ca0 (&opp_table->lock){+.+.}, at: _opp_get_opp_table+0x1cc/0x224
[    3.649037] 5 locks held by kworker/1:1/22:
[    3.649049]  #0: df0086a0 ((wq_completion)events){+.+.}, at: process_one_work+0x200/0x858
[    3.649082]  #1: df515f1c (deferred_probe_work){+.+.}, at: process_one_work+0x200/0x858
[    3.649110]  #2: d1ff08cc (&dev->mutex){....}, at: __device_attach+0x34/0x17c
[    3.649145]  #3: c139e5e0 (opp_table_lock){+.+.}, at: _opp_get_opp_table+0x1c/0x224
[    3.649175]  #4: df183ca0 (&opp_table->lock){+.+.}, at: _opp_get_opp_table+0x1cc/0x224

Fixes: cf1f361 ("opp: core: fix memory leak in probe deferral")

Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Change-Id: I6da51fab26b698bb803a3a8ddf5454f73d8af956
Reviewed-on: https://gerrit.st.com/c/mpu/oe/st/linux-stm32/+/178474
Reviewed-by: CITOOLS <smet-aci-reviews@lists.codex.cro.st.com>
stalyatech pushed a commit to stalyatech/linux that referenced this issue Feb 26, 2021
commit 936e6b8 upstream.

Suppose that, for unrelated reasons, FSF requests on behalf of recovery are
very slow and can run into the ERP timeout.

In the case at hand, we did adapter recovery to a large degree.  However
due to the slowness a LUN open is pending so the corresponding fc_rport
remains blocked.  After fast_io_fail_tmo we trigger close physical port
recovery for the port under which the LUN should have been opened.  The new
higher order port recovery dismisses the pending LUN open ERP action and
dismisses the pending LUN open FSF request.  Such dismissal decouples the
ERP action from the pending corresponding FSF request by setting
zfcp_fsf_req->erp_action to NULL (among other things)
[zfcp_erp_strategy_check_fsfreq()].

If now the ERP timeout for the pending open LUN request runs out, we must
not use zfcp_fsf_req->erp_action in the ERP timeout handler.  This is a
problem since v4.15 commit 75492a5 ("s390/scsi: Convert timers to use
timer_setup()"). Before that we intentionally only passed zfcp_erp_action
as context argument to zfcp_erp_timeout_handler().

Note: The lifetime of the corresponding zfcp_fsf_req object continues until
a (late) response or an (unrelated) adapter recovery.

Just like the regular response path ignores dismissed requests
[zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the
ERP timeout handler now needs to ignore dismissed requests.  So simply
return early in the ERP timeout handler if the FSF request is marked as
dismissed in its status flags.  To protect against the race where
zfcp_erp_strategy_check_fsfreq() dismisses and sets
zfcp_fsf_req->erp_action to NULL after our previous status flag check,
return early if zfcp_fsf_req->erp_action is NULL.  After all, the former
ERP action does not need to be woken up as that was already done as part of
the dismissal above [zfcp_erp_action_dismiss()].

This fixes the following panic due to kernel page fault in IRQ context:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d
Oops: 0004 ilc:2 [STMicroelectronics#1] SMP
Modules linked in: ...
CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G            E  X   ...
Hardware name: IBM 8561 T01 701 (LPAR)
Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp])
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000
           000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02
           0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000
           0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8
Krnl Code: 001fffff80549bd4: a7180000            lhi     %r1,0
           001fffff80549bd8: 4120a0f0            la      %r2,240(%r10)
          #001fffff80549bdc: a53e0003            llilh   %r3,3
          >001fffff80549be0: ba132000            cs      %r1,%r3,0(%r2)
           001fffff80549be4: a7740037            brc     7,1fffff80549c52
           001fffff80549be8: e320b0180004        lg      %r2,24(%r11)
           001fffff80549bee: e31020e00004        lg      %r1,224(%r2)
           001fffff80549bf4: 412020e0            la      %r2,224(%r2)
Call Trace:
 [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp]
 [<00000985915e26f0>] call_timer_fn+0x38/0x190
 [<00000985915e2944>] expire_timers+0xfc/0x190
 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218
 [<0000098591ca7c4c>] __do_softirq+0x144/0x398
 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88
 [<0000098591551b58>] irq_exit+0xb0/0xb8
 [<0000098591510c6a>] do_IRQ+0x82/0xb0
 [<0000098591ca7140>] ext_int_handler+0x128/0x12c
 [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60
([<000009859172ae4c>] clear_huge_page+0xec/0x250)
 [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768
 [<000009859172a712>] __handle_mm_fault+0x88a/0x900
 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0
 [<0000098591529ef6>] do_dat_exception+0x136/0x3e8
 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220
Last Breaking-Event-Address:
 [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp]
Kernel panic - not syncing: Fatal exception in interrupt

Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com
Fixes: 75492a5 ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> STMicroelectronics#4.15+
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stalyatech pushed a commit to stalyatech/linux that referenced this issue Feb 26, 2021
[ Upstream commit a066562 ]

Enable promisc mode of PF, set VF link state to enable, and
run iperf of the VF, then do self test of the PF. The self test
will fail with a low frequency, and may cause a use-after-free
problem.

[   87.142126] selftest:000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   87.159722] ==================================================================
[   87.174187] BUG: KASAN: use-after-free in hex_dump_to_buffer+0x140/0x608
[   87.187600] Read of size 1 at addr ffff003b22828000 by task ethtool/1186
[   87.201012]
[   87.203978] CPU: 7 PID: 1186 Comm: ethtool Not tainted 5.5.0-rc4-gfd51c473-dirty STMicroelectronics#4
[   87.219306] Hardware name: Huawei TaiShan 2280 V2/BC82AMDA, BIOS TA BIOS 2280-A CS V2.B160.01 01/15/2020
[   87.238292] Call trace:
[   87.243173]  dump_backtrace+0x0/0x280
[   87.250491]  show_stack+0x24/0x30
[   87.257114]  dump_stack+0xe8/0x140
[   87.263911]  print_address_description.isra.8+0x70/0x380
[   87.274538]  __kasan_report+0x12c/0x230
[   87.282203]  kasan_report+0xc/0x18
[   87.288999]  __asan_load1+0x60/0x68
[   87.295969]  hex_dump_to_buffer+0x140/0x608
[   87.304332]  print_hex_dump+0x140/0x1e0
[   87.312000]  hns3_lb_check_skb_data+0x168/0x170
[   87.321060]  hns3_clean_rx_ring+0xa94/0xfe0
[   87.329422]  hns3_self_test+0x708/0x8c0

The length of packet sent by the selftest process is only
128 + 14 bytes, and the min buffer size of a BD is 256 bytes,
and the receive process will make sure the packet sent by
the selftest process is in the linear part, so only check
the linear part in hns3_lb_check_skb_data().

So fix this use-after-free by using skb_headlen() to dump
skb->data instead of skb->len.

Fixes: c39c4d9 ("net: hns3: Add mac loopback selftest support in hns3 driver")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
stalyatech pushed a commit to stalyatech/linux that referenced this issue Feb 26, 2021
commit f5e5677 upstream.

NULL pointer exception happens occasionally on serial output initiated
by login timeout.  This was reproduced only if kernel was built with
significant debugging options and EDMA driver is used with serial
console.

    col-vf50 login: root
    Password:
    Login timed out after 60 seconds.
    Unable to handle kernel NULL pointer dereference at virtual address 00000044
    Internal error: Oops: 5 [STMicroelectronics#1] ARM
    CPU: 0 PID: 157 Comm: login Not tainted 5.7.0-next-20200610-dirty STMicroelectronics#4
    Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
      (fsl_edma_tx_handler) from [<8016eb10>] (__handle_irq_event_percpu+0x64/0x304)
      (__handle_irq_event_percpu) from [<8016eddc>] (handle_irq_event_percpu+0x2c/0x7c)
      (handle_irq_event_percpu) from [<8016ee64>] (handle_irq_event+0x38/0x5c)
      (handle_irq_event) from [<801729e4>] (handle_fasteoi_irq+0xa4/0x160)
      (handle_fasteoi_irq) from [<8016ddcc>] (generic_handle_irq+0x34/0x44)
      (generic_handle_irq) from [<8016e40c>] (__handle_domain_irq+0x54/0xa8)
      (__handle_domain_irq) from [<80508bc8>] (gic_handle_irq+0x4c/0x80)
      (gic_handle_irq) from [<80100af0>] (__irq_svc+0x70/0x98)
    Exception stack(0x8459fe80 to 0x8459fec8)
    fe80: 72286b00 e3359f64 00000001 0000412d a0070013 85c98840 85c98840 a0070013
    fea0: 8054e0d4 00000000 00000002 00000000 00000002 8459fed0 8081fbe8 8081fbec
    fec0: 60070013 ffffffff
      (__irq_svc) from [<8081fbec>] (_raw_spin_unlock_irqrestore+0x30/0x58)
      (_raw_spin_unlock_irqrestore) from [<8056cb48>] (uart_flush_buffer+0x88/0xf8)
      (uart_flush_buffer) from [<80554e60>] (tty_ldisc_hangup+0x38/0x1ac)
      (tty_ldisc_hangup) from [<8054c7f4>] (__tty_hangup+0x158/0x2bc)
      (__tty_hangup) from [<80557b90>] (disassociate_ctty.part.1+0x30/0x23c)
      (disassociate_ctty.part.1) from [<8011fc18>] (do_exit+0x580/0xba0)
      (do_exit) from [<801214f8>] (do_group_exit+0x3c/0xb4)
      (do_group_exit) from [<80121580>] (__wake_up_parent+0x0/0x14)

Issue looks like race condition between interrupt handler fsl_edma_tx_handler()
(called as result of fsl_edma_xfer_desc()) and terminating the transfer with
fsl_edma_terminate_all().

The fsl_edma_tx_handler() handles interrupt for a transfer with already freed
edesc and idle==true.

Fixes: d6be34f ("dma: Add Freescale eDMA engine driver support")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Robin Gong <yibin.gong@nxp.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1591877861-28156-2-git-send-email-krzk@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stalyatech pushed a commit to stalyatech/linux that referenced this issue Feb 26, 2021
commit fc846e9 upstream.

The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked.  Shift amounts greater than or equal to 32 will result in
undefined behavior.  Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity.  Only channels 0 to 15
are valid.

Fixes: a8c66b6 ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable@vger.kernel.org> STMicroelectronics#4.0+: ef75e14: staging: comedi: verify array index is correct before using it
Cc: <stable@vger.kernel.org> STMicroelectronics#4.0+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arnopo pushed a commit to arnopo/linux that referenced this issue Mar 14, 2022
arm32 uses software to simulate the instruction replaced
by kprobe. some instructions may be simulated by constructing
assembly functions. therefore, before executing instruction
simulation, it is necessary to construct assembly function
execution environment in C language through binding registers.
after kasan is enabled, the register binding relationship will
be destroyed, resulting in instruction simulation errors and
causing kernel panic.

the kprobe emulate instruction function is distributed in three
files: actions-common.c actions-arm.c actions-thumb.c, so disable
KASAN when compiling these files.

for example, use kprobe insert on cap_capable+20 after kasan
enabled, the cap_capable assembly code is as follows:
<cap_capable>:
e92d47f0	push	{r4, r5, r6, r7, r8, r9, sl, lr}
e1a05000	mov	r5, r0
e280006c	add	r0, r0, torvalds#108    ; 0x6c
e1a04001	mov	r4, r1
e1a06002	mov	r6, r2
e59fa090	ldr	sl, [pc, torvalds#144]  ;
ebfc7bf8	bl	c03aa4b4 <__asan_load4>
e595706c	ldr	r7, [r5, torvalds#108]  ; 0x6c
e2859014	add	r9, r5, torvalds#20
......
The emulate_ldr assembly code after enabling kasan is as follows:
c06f1384 <emulate_ldr>:
e92d47f0	push	{r4, r5, r6, r7, r8, r9, sl, lr}
e282803c	add	r8, r2, torvalds#60     ; 0x3c
e1a05000	mov	r5, r0
e7e37855	ubfx	r7, r5, STMicroelectronics#16, STMicroelectronics#4
e1a00008	mov	r0, r8
e1a09001	mov	r9, r1
e1a04002	mov	r4, r2
ebf35462	bl	c03c6530 <__asan_load4>
e357000f	cmp	r7, STMicroelectronics#15
e7e36655	ubfx	r6, r5, STMicroelectronics#12, STMicroelectronics#4
e205a00f	and	sl, r5, STMicroelectronics#15
0a000001	beq	c06f13bc <emulate_ldr+0x38>
e0840107	add	r0, r4, r7, lsl STMicroelectronics#2
ebf3545c	bl	c03c6530 <__asan_load4>
e084010a	add	r0, r4, sl, lsl STMicroelectronics#2
ebf3545a	bl	c03c6530 <__asan_load4>
e2890010	add	r0, r9, STMicroelectronics#16
ebf35458	bl	c03c6530 <__asan_load4>
e5990010	ldr	r0, [r9, STMicroelectronics#16]
e12fff30	blx	r0
e356000f	cm	r6, STMicroelectronics#15
1a000014	bne	c06f1430 <emulate_ldr+0xac>
e1a06000	mov	r6, r0
e2840040	add	r0, r4, torvalds#64     ; 0x40
......

when running in emulate_ldr to simulate the ldr instruction, panic
occurred, and the log is as follows:
Unable to handle kernel NULL pointer dereference at virtual address
00000090
pgd = ecb46400
[00000090] *pgd=2e0fa003, *pmd=00000000
Internal error: Oops: 206 [STMicroelectronics#1] SMP ARM
PC is at cap_capable+0x14/0xb0
LR is at emulate_ldr+0x50/0xc0
psr: 600d0293 sp : ecd63af8  ip : 00000004  fp : c0a7c30c
r10: 00000000  r9 : c30897f4  r8 : ecd63cd4
r7 : 0000000f  r6 : 0000000a  r5 : e59fa090  r4 : ecd63c98
r3 : c06ae294  r2 : 00000000  r1 : b7611300  r0 : bf4ec008
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 32c5387d  Table: 2d546400  DAC: 55555555
Process bash (pid: 1643, stack limit = 0xecd60190)
(cap_capable) from (kprobe_handler+0x218/0x340)
(kprobe_handler) from (kprobe_trap_handler+0x24/0x48)
(kprobe_trap_handler) from (do_undefinstr+0x13c/0x364)
(do_undefinstr) from (__und_svc_finish+0x0/0x30)
(__und_svc_finish) from (cap_capable+0x18/0xb0)
(cap_capable) from (cap_vm_enough_memory+0x38/0x48)
(cap_vm_enough_memory) from
(security_vm_enough_memory_mm+0x48/0x6c)
(security_vm_enough_memory_mm) from
(copy_process.constprop.5+0x16b4/0x25c8)
(copy_process.constprop.5) from (_do_fork+0xe8/0x55c)
(_do_fork) from (SyS_clone+0x1c/0x24)
(SyS_clone) from (__sys_trace_return+0x0/0x10)
Code: 0050a0e1 6c0080e2 0140a0e1 0260a0e1 (f801f0e7)

Fixes: 35aa1df ("ARM kprobes: instruction single-stepping support")
Fixes: 4210157 ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: huangshaobo <huangshaobo6@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
@ddrown ddrown closed this Jun 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants