Disable hw TSO for atl1e atheros card #338

Closed

anthonytex

Maybe I'm the only one in the world still using this Ethernet card, but I opened a bug a long time ago and my patch was never merged. This card has broken hardware TSO, as described here, and it causes serious speed problems.

@AntonBoch1244

@anthonytex, send this PR to the torvalds/linux repo at kernel.org instead of this one, because this repo is only a mirror of it.

@anthonytex
Author

Done, thanks @AntonBoch1244

@anthonytex anthonytex closed this Oct 19, 2016
kraj pushed a commit to kraj/linux that referenced this pull request Feb 20, 2018
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Nov 19, 2018
On 19/11/18 14:43, Juri Lelli wrote:
> On 19/11/18 13:52, Peter Zijlstra wrote:
> > On Mon, Nov 19, 2018 at 01:07:18PM +0100, luca abeni wrote:
> >
> > > > On Sun, 18 Nov 2018, syzbot wrote:
> >
> > > > > WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
> > > > > enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
> > >
> > > Here, it looks like a task is invoking sched_setattr() to become
> > > SCHED_DEADLINE when dl_boosted is set...
> > >
> > > Is this possible / correct?
> >
> > Possible, clearly. Correct, only in so far as that it is not a malformed
> > program, but it is very poor design to actually trigger this (of course
> > the fuzzer doesn't care about that).
> >
> > > If this (sched_setattr() with dl_boosted set) should not be possible,
> > > then we have a bug that we need to investigate...
> > >
> > > Otherwise, I suspect we can just remove the WARN_ON at line 628 of
> > > deadline.c
> >
> > I wonder why we put that WARN in there to begin with... git-blame gives
> > us:
> >
> >   98b0a85 ("sched/deadline: Remove useless parameter from setup_new_dl_entity()")
> >
> > So the problem seems to be that if we're boosted, we should maybe not be
> > using our own (newly set) parameters, but those of the donor task.
> >
> > Specifically, our 'suboptimal' deadline inheritance scheme 'requires' us
> > to use the inherited deadline, not our own. So in that respect I think
> > the WARN is valid, although I'm not sure what, apart from actually
> > finishing that PE patch-set we can do about it just now.
>
> Mmm, but, as it was written in the comment that was removed by 295d6d5
> ("sched/deadline: Fix switching to -deadline"), I was still expecting
> that for a boosted task setup_new_dl_entity() shouldn't be called.
> Wonder if this is another manifestation of the problems we have with
> clocks. Need to think more about it.

So, while this looks like nothing more than a stop-gap solution until we
get PE in place, would the following make any sense? It seems I can't
reproduce the warning anymore with it (w/o it usually takes a few secs
to reproduce).

--->8---

From 9326fd2b20269cffef7290bdc5b8173460d3c870 Mon Sep 17 00:00:00 2001
From: Juri Lelli <juri.lelli@redhat.com>
Date: Mon, 19 Nov 2018 16:04:42 +0100
Subject: [PATCH] sched/core: Fix PI boosting between RT and DEADLINE

syzbot reported the following warning:

 WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
 enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
 PM: Basic memory bitmaps freed
 Kernel panic - not syncing: panic_on_warn set ...
 CPU: 1 PID: 6351 Comm: syz-executor0 Not tainted 4.20.0-rc2+ #338
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x244/0x39d lib/dump_stack.c:113
   panic+0x2ad/0x55c kernel/panic.c:188
   __warn.cold.8+0x20/0x45 kernel/panic.c:540
   report_bug+0x254/0x2d0 lib/bug.c:186
   fixup_bug arch/x86/kernel/traps.c:178 [inline]
   do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
   do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
   invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969
 RIP: 0010:enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
 Code: ff 48 8b 8d c8 fe ff ff 48 c1 e6 2a 4c 8b 9d d0 fe ff ff 8b 95 d8 fe
 ff ff 48 8b 85 e0 fe ff ff e9 16 e4 ff ff e8 16 d0 ea ff <0f> 0b e9 17 f1
 ff ff 48 8b bd e8 fe ff ff 4c 89 95 c8 fe ff ff 48
 RSP: 0018:ffff8881ba39fa18 EFLAGS: 00010002
 RAX: 0000000000000000 RBX: ffff8881b9d6c000 RCX: ffff8881b9d6c278
 RDX: ffff8881b9d6c03c RSI: 0000000000000002 RDI: ffff8881daf2d710
 RBP: ffff8881ba39fb78 R08: 0000000000000001 R09: ffff8881daf00000
 R10: 0000001a4d4f1987 R11: ffff8881daf2db3b R12: 1ffff11037473f4e
 R13: ffff8881b9d6c2cc R14: ffff8881daf2ccc0 R15: ffff8881daf2ccc0
   enqueue_task+0x184/0x390 kernel/sched/core.c:730
   __sched_setscheduler+0xe99/0x2190 kernel/sched/core.c:4336
   sched_setattr kernel/sched/core.c:4394 [inline]
   __do_sys_sched_setattr kernel/sched/core.c:4570 [inline]
   __se_sys_sched_setattr kernel/sched/core.c:4549 [inline]
   __x64_sys_sched_setattr+0x1b2/0x2f0 kernel/sched/core.c:4549
   do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
 RIP: 0033:0x457569
 Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
 ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
 RSP: 002b:00007f05ce0a2c78 EFLAGS: 00000246 ORIG_RAX: 000000000000013a
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000000
 RBP: 000000000072bfa0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f05ce0a36d4
 R13: 00000000004c369f R14: 00000000004d5730 R15: 00000000ffffffff

At deadline.c:628 we have:

 623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
 624 {
 625 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 626 	struct rq *rq = rq_of_dl_rq(dl_rq);
 627
 628 	WARN_ON(dl_se->dl_boosted);
 629 	WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline));
        [...]
     }

Which means that setup_new_dl_entity() has been called on a task that is
currently boosted. This shouldn't happen though, as setup_new_dl_entity()
is only called when the 'dynamic' deadline of the new entity is in the
past w.r.t. rq_clock, and boosted tasks shouldn't satisfy this condition.

Digging through the PI code I noticed that the above might in fact happen
if an RT task blocks on an rt_mutex held by a DEADLINE task. In the
first branch of the boosting conditions we only check whether a pi_task's
'dynamic' deadline is earlier than the mutex holder's, and in that case we
set the mutex holder to be dl_boosted. However, since RT 'dynamic' deadlines
are only initialized if such tasks get boosted at some point (or if they
become DEADLINE, of course), RT 'dynamic' deadlines are usually equal to 0,
and this satisfies the aforementioned condition.

Fix it by checking that the potential donor task is actually (even if
only temporarily, because it is in turn boosted) running at DEADLINE
priority before using its 'dynamic' deadline value.

Reported-by: syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
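
The boosting condition described in the commit message above can be modelled in a few lines of userspace C. This is only a sketch, not the kernel code: struct model_task and the two should_boost_*() helpers are hypothetical names made up for illustration, and the definitions of dl_prio() and dl_time_before() are assumed to mirror the kernel's (DEADLINE priorities below MAX_DL_PRIO, wraparound-safe comparison).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_DL_PRIO 0	/* assumption: DEADLINE tasks have prio < 0, RT tasks 0..99 */

/* Hypothetical stand-in for the two task fields the check looks at. */
struct model_task {
	int prio;		/* effective priority */
	uint64_t deadline;	/* 'dynamic' deadline; 0 if never initialized */
};

static bool dl_prio(int prio)
{
	return prio < MAX_DL_PRIO;
}

/* Wraparound-safe "a is earlier than b". */
static bool dl_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Check as described before the fix: only the deadlines are compared. */
static bool should_boost_old(const struct model_task *donor,
			     const struct model_task *holder)
{
	return dl_time_before(donor->deadline, holder->deadline);
}

/* Check as described by the fix: the donor must run at DEADLINE priority. */
static bool should_boost_new(const struct model_task *donor,
			     const struct model_task *holder)
{
	return dl_prio(donor->prio) &&
	       dl_time_before(donor->deadline, holder->deadline);
}
```

With an RT donor whose deadline was never initialized (prio 10, deadline 0) and a DEADLINE mutex holder (prio -1, deadline 1000), should_boost_old() returns true while should_boost_new() returns false, which is the situation the patch is meant to rule out.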
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 17, 2020
rtllib_crypt_ccmp.c: Fixed the error - space required before the
open parenthesis '(' on line #281.

rtllib_crypt_ccmp.c: Fixed the warning - suspect code indent for
conditional statements on line #338

Signed-off-by: Darshan D V <darshandv10@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jul 19, 2020
    rtllib_crypt_ccmp.c: Fixed the warning - suspect code indent for
    conditional statements on line #338

Signed-off-by: Darshan D V <darshandv10@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 24, 2020
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 28, 2020
ubifs_jnl_write_inode() may cause an out-of-bounds read in some situations.
Here is the KASAN report:
[  336.432159] BUG: KASAN: slab-out-of-bounds in ecc_sw_hamming_calculate+0x1dc/0x7d0
[  336.433634] Read of size 4 at addr ffff888019612ff8 by task kworker/u8:4/135
[  336.434605]
[  336.434830] CPU: 1 PID: 135 Comm: kworker/u8:4 Not tainted 5.10.0-11826-gaf2a097952f3-dirty #338
[  336.436050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[  336.437876] Workqueue: writeback wb_workfn (flush-ubifs_0_0)
[  336.438670] Call Trace:
[  336.439021]  ? dump_stack+0xdd/0x126
[  336.439513]  ? print_address_description.constprop.0+0x2c/0x3c0
[  336.440308]  ? _raw_write_lock_irqsave+0x140/0x140
[  336.440921]  ? ecc_sw_hamming_calculate+0x1dc/0x7d0
[  336.441546]  ? ecc_sw_hamming_calculate+0x1dc/0x7d0
[  336.442186]  ? kasan_report.cold+0x5d/0xd8
[  336.442711]  ? nand_reset_op+0x280/0x310
[  336.443218]  ? ecc_sw_hamming_calculate+0x1dc/0x7d0
[  336.443842]  ? __asan_load4+0x77/0x120
[  336.444334]  ? ecc_sw_hamming_calculate+0x1dc/0x7d0
[  336.444963]  ? nand_ecc_sw_hamming_calculate+0x6c/0x80
[  336.445619]  ? rawnand_sw_hamming_calculate+0x12/0x20
[  336.446263]  ? nand_write_page_swecc+0xa9/0x160
[  336.446849]  ? nand_do_write_ops+0x390/0x830
[  336.447406]  ? __writeback_single_inode+0x6cc/0x880
[  336.448041]  ? nand_write_oob+0x78/0x100
[  336.448568]  ? mtd_write_oob_std+0xe2/0x160
[  336.449127]  ? mtd_write_oob+0xec/0x1b0
[  336.449679]  ? mtd_write+0x92/0xf0
[  336.450128]  ? mtd_write_oob+0x1b0/0x1b0
[  336.450633]  ? ubi_self_check_all_ff+0x82/0x2e0 [ubi]
[  336.451328]  ? __list_add_valid+0x2b/0x130
[  336.451865]  ? ubi_io_write+0x2c2/0xa90 [ubi]
[  336.452472]  ? _raw_read_lock_irq+0x90/0x90
[  336.453078]  ? kmem_cache_alloc_trace+0x465/0x8b0
[  336.453749]  ? do_sync_erase+0x350/0x350 [ubi]
[  336.454430]  ? __kasan_check_write+0x20/0x30
[  336.455050]  ? down_write+0xf2/0x190
[  336.455569]  ? down_write_killable+0x1b0/0x1b0
[  336.456221]  ? check_mapping+0x2c/0x590 [ubi]
[  336.456890]  ? ubi_eba_write_leb+0x58a/0xfa0 [ubi]
[  336.457618]  ? __kmalloc+0x490/0x910
[  336.458142]  ? ubifs_jnl_write_inode.cold+0x6f/0x878 [ubifs]
[  336.459033]  ? writeback_sb_inodes+0x3a9/0x9a0
[  336.459672]  ? __writeback_inodes_wb+0xc8/0x170
[  336.460330]  ? wb_writeback+0x637/0x700
[  336.460882]  ? wb_workfn+0x8af/0xb80
[  336.461398]  ? process_one_work+0x467/0x9f0
[  336.462004]  ? worker_thread+0x34d/0x8e0
[  336.462582]  ? kthread+0x204/0x280
[  336.463047]  ? ret_from_fork+0x1f/0x30
[  336.463570]  ? create_prof_cpu_mask+0x30/0x30
[  336.464185]  ? ubi_eba_read_leb_sg+0x1f0/0x1f0 [ubi]
[  336.464917]  ? hrtimer_active+0x9b/0x100
[  336.465468]  ? ubi_leb_write+0x22c/0x2f0 [ubi]
[  336.466130]  ? ubifs_leb_write+0xf2/0x1b0 [ubifs]
[  336.466851]  ? ubifs_wbuf_write_nolock+0x412/0x1280 [ubifs]
[  336.467686]  ? write_head+0xdf/0x1c0 [ubifs]
[  336.468355]  ? ubifs_jnl_write_inode.cold+0x3ec/0x878 [ubifs]
[  336.469183]  ? ret_from_fork+0x1e/0x30
[  336.469707]  ? ubifs_jnl_write_data+0x660/0x660 [ubifs]
[  336.470497]  ? unwind_next_frame+0x247/0xca0
[  336.471095]  ? ret_from_fork+0x1f/0x30
[  336.471574]  ? fprop_reflect_period_percpu.isra.0+0x1f/0x1b0
[  336.472335]  ? generic_writepages+0x93/0x140
[  336.472933]  ? __kasan_check_write+0x20/0x30
[  336.473526]  ? mutex_lock+0xa6/0x110
[  336.474031]  ? __mutex_lock_slowpath+0x30/0x30
[  336.474662]  ? ubifs_write_inode+0x1c3/0x290 [ubifs]
[  336.475446]  ? __writeback_single_inode+0x6cc/0x880
[  336.476155]  ? wbc_attach_and_unlock_inode+0x2b6/0x400
[  336.476891]  ? writeback_sb_inodes+0x3a9/0x9a0
[  336.477528]  ? write_inode_now+0x1e0/0x1e0
[  336.478119]  ? __writeback_inodes_wb+0xc8/0x170
[  336.478770]  ? wb_writeback+0x637/0x700
[  336.479326]  ? __writeback_inodes_wb+0x170/0x170
[  336.479992]  ? current_work+0xa0/0xa0
[  336.480524]  ? _find_next_bit.constprop.0+0x3e/0x140
[  336.481241]  ? find_next_bit+0x18/0x30
[  336.481780]  ? cpumask_next+0x2f/0x40
[  336.482312]  ? wb_workfn+0x8af/0xb80
[  336.482832]  ? update_cfs_group+0x1e/0x1b0
[  336.483421]  ? inode_wait_for_writeback+0x60/0x60
[  336.484106]  ? schedule+0xb7/0x240
[  336.484595]  ? finish_task_switch+0x14e/0x9a0
[  336.485225]  ? __kasan_check_write+0x20/0x30
[  336.485841]  ? __schedule+0x6f4/0x1600
[  336.486382]  ? __kasan_check_read+0x1d/0x30
[  336.486981]  ? read_word_at_a_time+0x16/0x30
[  336.487594]  ? process_one_work+0x467/0x9f0
[  336.488198]  ? worker_thread+0x34d/0x8e0
[  336.488762]  ? rescuer_thread+0x820/0x820
[  336.489344]  ? kthread+0x204/0x280
[  336.489839]  ? kthread_bind+0x50/0x50
[  336.490367]  ? ret_from_fork+0x1f/0x30
[  336.490913]
[  336.491138] Allocated by task 135:
[  336.491629]  kasan_save_stack+0x23/0x60
[  336.492189]  __kasan_kmalloc.constprop.0+0x10b/0x120
[  336.492898]  kasan_kmalloc+0xd/0x20
[  336.493401]  __kmalloc+0x490/0x910
[  336.493890]  ubifs_jnl_write_inode.cold+0x6f/0x878 [ubifs]
[  336.494744]  ubifs_write_inode+0x1c3/0x290 [ubifs]
[  336.495500]  __writeback_single_inode+0x6cc/0x880
[  336.496179]  writeback_sb_inodes+0x3a9/0x9a0
[  336.496791]  __writeback_inodes_wb+0xc8/0x170
[  336.497417]  wb_writeback+0x637/0x700
[  336.497949]  wb_workfn+0x8af/0xb80
[  336.498440]  process_one_work+0x467/0x9f0
[  336.499023]  worker_thread+0x34d/0x8e0
[  336.499567]  kthread+0x204/0x280
[  336.500050]  ret_from_fork+0x1f/0x30
[  336.500570]
[  336.500793] The buggy address belongs to the object at ffff888019612000
[  336.500793]  which belongs to the cache kmalloc-4k of size 4096
[  336.502550] The buggy address is located 4088 bytes inside of
[  336.502550]  4096-byte region [ffff888019612000, ffff888019613000)
[  336.504231] The buggy address belongs to the page:
[  336.504917] page:000000003204ded8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x19610
[  336.506234] head:000000003204ded8 order:3 compound_mapcount:0 compound_pincount:0
[  336.507293] flags: 0x1fffff80010200(slab|head)
[  336.507934] raw: 001fffff80010200 ffffea0000667000 0000000200000002 ffff888010842140
[  336.509038] raw: 0000000000000000 0000000080040004 00000001ffffffff ffff88801956e3c1
[  336.510132] page dumped because: kasan: bad access detected
[  336.510923] pages's memcg:ffff88801956e3c1
[  336.511509]
[  336.511730] Memory state around the buggy address:
[  336.512421]  ffff888019612e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  336.513446]  ffff888019612f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  336.514468] >ffff888019612f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
[  336.515494]                                                                 ^
[  336.516506]  ffff888019613000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  336.517535]  ffff888019613080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  336.518560] ==================================================================

The size of the memory area allocated in ubifs_jnl_write_inode() is not aligned to 8 bytes:
ino_start = ino = kmalloc(write_len, GFP_NOFS);

When ino_start is passed into write_head -> ubifs_wbuf_write_nolock:
    n = aligned_len >> c->max_write_shift;
    if (n) {
      n <<= c->max_write_shift;
      err = ubifs_leb_write(c, wbuf->lnum, buf + written, wbuf->offs, n);
      // The out-of-bounds read occurs here: n bytes are read from buf, but buf comes from
      // @ino_start, whose size is not 8-byte aligned (write_len < n), so the program reads
      // (n - write_len) bytes too many.
    }

Reproducer:
0. config KASAN && apply print.patch
1. mount ubifs on /root/temp
2. run test.sh
3. cd /root/temp && ls // change atime for link_file
4. wait 1~2 minutes

To fix the out-of-bounds read in ubifs_wbuf_write_nolock(), align write_len to 8 bytes only
when allocating the memory. This way the patch does not affect how write_len is used in other
functions, such as ubifs_jnl_write_inode->make_reservation and ubifs_jnl_write_inode->ubifs_node_calc_hash.

Cc: <stable@vger.kernel.org>
Fixes: 1e51764 ("UBIFS: add new flash file system")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210865

Signed-off-by: Chengsong Ke <kechengsong@huawei.com>
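
The arithmetic behind the over-read and the fix can be sketched in plain C. This is a simplified userspace model, not the UBIFS code: align8() and overread() are hypothetical helpers, the model assumes the writer consumes the full 8-byte-aligned length, and the length 4089 used in the note below is only an illustrative value.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of 8. */
static size_t align8(size_t len)
{
	return (len + 7) & ~(size_t)7;
}

/*
 * Number of bytes read past the end of a buffer of alloc_len bytes when
 * the writer consumes align8(write_len) bytes from it (model assumption).
 */
static size_t overread(size_t alloc_len, size_t write_len)
{
	size_t n = align8(write_len);

	return n > alloc_len ? n - alloc_len : 0;
}
```

For a node length of 4089, allocating exactly write_len bytes gives overread(4089, 4089) == 7, while allocating align8(write_len) bytes gives overread(4096, 4089) == 0. The callers that use write_len itself are unaffected because only the allocation size changes.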
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 15, 2021
This commit fixes the following checkpatch.pl errors:

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #285: FILE: ./hal/odm.c:285:
    +void odm_CommonInfoSelfInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #287: FILE: ./hal/odm.c:287:
    +void odm_CommonInfoSelfUpdate(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #289: FILE: ./hal/odm.c:289:
    +void odm_CmnInfoInit_Debug(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #291: FILE: ./hal/odm.c:291:
    +void odm_BasicDbgMessage(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #305: FILE: ./hal/odm.c:305:
    +void odm_RefreshRateAdaptiveMaskCE(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #309: FILE: ./hal/odm.c:309:
    +void odm_RSSIMonitorInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #311: FILE: ./hal/odm.c:311:
    +void odm_RSSIMonitorCheckCE(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #313: FILE: ./hal/odm.c:313:
    +void odm_RSSIMonitorCheck(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #315: FILE: ./hal/odm.c:315:
    +void odm_SwAntDetectInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #323: FILE: ./hal/odm.c:323:
    +void odm_RefreshRateAdaptiveMask(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #325: FILE: ./hal/odm.c:325:
    +void ODM_TXPowerTrackingCheck(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #327: FILE: ./hal/odm.c:327:
    +void odm_RateAdaptiveMaskInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #330: FILE: ./hal/odm.c:330:
    +void odm_TXPowerTrackingInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #338: FILE: ./hal/odm.c:338:
    +void odm_InitHybridAntDiv(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #341: FILE: ./hal/odm.c:341:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #349: FILE: ./hal/odm.c:349:
    +void odm_SetRxIdleAnt(struct DM_ODM_T * pDM_Odm, u8 Ant, bool bDualPath);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #353: FILE: ./hal/odm.c:353:
    +void odm_HwAntDiv(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #363: FILE: ./hal/odm.c:363:
    +void ODM_DMInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #393: FILE: ./hal/odm.c:393:
    +void ODM_DMWatchdog(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #420: FILE: ./hal/odm.c:420:
    +		struct DIG_T * pDM_DigTable = &pDM_Odm->DM_DigTable;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #448: FILE: ./hal/odm.c:448:
    +void ODM_CmnInfoInit(struct DM_ODM_T * pDM_Odm, enum ODM_CMNINFO_E CmnInfo, u32 Value)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #560: FILE: ./hal/odm.c:560:
    +void ODM_CmnInfoHook(struct DM_ODM_T * pDM_Odm, enum ODM_CMNINFO_E CmnInfo, void *pValue)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #689: FILE: ./hal/odm.c:689:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #717: FILE: ./hal/odm.c:717:
    +void ODM_CmnInfoUpdate(struct DM_ODM_T * pDM_Odm, u32 CmnInfo, u64 Value)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #831: FILE: ./hal/odm.c:831:
    +void odm_CommonInfoSelfInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #841: FILE: ./hal/odm.c:841:
    +void odm_CommonInfoSelfUpdate(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #867: FILE: ./hal/odm.c:867:
    +void odm_CmnInfoInit_Debug(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #888: FILE: ./hal/odm.c:888:
    +void odm_BasicDbgMessage(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #935: FILE: ./hal/odm.c:935:
    +void odm_RateAdaptiveMaskInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #937: FILE: ./hal/odm.c:937:
    +	struct ODM_RATE_ADAPTIVE * pOdmRA = &pDM_Odm->RateAdaptive;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #953: FILE: ./hal/odm.c:953:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1083: FILE: ./hal/odm.c:1083:
    +void odm_RefreshRateAdaptiveMask(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1094: FILE: ./hal/odm.c:1094:
    +void odm_RefreshRateAdaptiveMaskCE(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1131: FILE: ./hal/odm.c:1131:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1137: FILE: ./hal/odm.c:1137:
    +	struct ODM_RATE_ADAPTIVE * pRA = &pDM_Odm->RateAdaptive;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1196: FILE: ./hal/odm.c:1196:
    +void odm_RSSIMonitorInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1198: FILE: ./hal/odm.c:1198:
    +	struct RA_T * pRA_Table = &pDM_Odm->DM_RA_Table;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1204: FILE: ./hal/odm.c:1204:
    +void odm_RSSIMonitorCheck(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1217: FILE: ./hal/odm.c:1217:
    +	struct DM_ODM_T * pDM_Odm = &(pHalData->odmpriv);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1234: FILE: ./hal/odm.c:1234:
    +void odm_RSSIMonitorCheckCE(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1243: FILE: ./hal/odm.c:1243:
    +	struct RA_T * pRA_Table = &pDM_Odm->DM_RA_Table;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1306: FILE: ./hal/odm.c:1306:
    +static u8 getSwingIndex(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1330: FILE: ./hal/odm.c:1330:
    +void odm_TXPowerTrackingInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1374: FILE: ./hal/odm.c:1374:
    +void ODM_TXPowerTrackingCheck(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1398: FILE: ./hal/odm.c:1398:
    +void odm_SwAntDetectInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1400: FILE: ./hal/odm.c:1400:
    +	struct SWAT_T * pDM_SWAT_Table = &pDM_Odm->DM_SWAT_Table;

Signed-off-by: Marco Cesati <marcocesati@gmail.com>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Mar 16, 2021
This commit fixes the following checkpatch.pl errors:

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #285: FILE: ./hal/odm.c:285:
    +void odm_CommonInfoSelfInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #287: FILE: ./hal/odm.c:287:
    +void odm_CommonInfoSelfUpdate(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #289: FILE: ./hal/odm.c:289:
    +void odm_CmnInfoInit_Debug(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #291: FILE: ./hal/odm.c:291:
    +void odm_BasicDbgMessage(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #305: FILE: ./hal/odm.c:305:
    +void odm_RefreshRateAdaptiveMaskCE(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #309: FILE: ./hal/odm.c:309:
    +void odm_RSSIMonitorInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #311: FILE: ./hal/odm.c:311:
    +void odm_RSSIMonitorCheckCE(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #313: FILE: ./hal/odm.c:313:
    +void odm_RSSIMonitorCheck(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #315: FILE: ./hal/odm.c:315:
    +void odm_SwAntDetectInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #323: FILE: ./hal/odm.c:323:
    +void odm_RefreshRateAdaptiveMask(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #325: FILE: ./hal/odm.c:325:
    +void ODM_TXPowerTrackingCheck(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #327: FILE: ./hal/odm.c:327:
    +void odm_RateAdaptiveMaskInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #330: FILE: ./hal/odm.c:330:
    +void odm_TXPowerTrackingInit(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #338: FILE: ./hal/odm.c:338:
    +void odm_InitHybridAntDiv(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #341: FILE: ./hal/odm.c:341:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #349: FILE: ./hal/odm.c:349:
    +void odm_SetRxIdleAnt(struct DM_ODM_T * pDM_Odm, u8 Ant, bool bDualPath);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #353: FILE: ./hal/odm.c:353:
    +void odm_HwAntDiv(struct DM_ODM_T * pDM_Odm);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #363: FILE: ./hal/odm.c:363:
    +void ODM_DMInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #393: FILE: ./hal/odm.c:393:
    +void ODM_DMWatchdog(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #420: FILE: ./hal/odm.c:420:
    +		struct DIG_T * pDM_DigTable = &pDM_Odm->DM_DigTable;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #448: FILE: ./hal/odm.c:448:
    +void ODM_CmnInfoInit(struct DM_ODM_T * pDM_Odm, enum ODM_CMNINFO_E CmnInfo, u32 Value)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #560: FILE: ./hal/odm.c:560:
    +void ODM_CmnInfoHook(struct DM_ODM_T * pDM_Odm, enum ODM_CMNINFO_E CmnInfo, void *pValue)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #689: FILE: ./hal/odm.c:689:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #717: FILE: ./hal/odm.c:717:
    +void ODM_CmnInfoUpdate(struct DM_ODM_T * pDM_Odm, u32 CmnInfo, u64 Value)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #831: FILE: ./hal/odm.c:831:
    +void odm_CommonInfoSelfInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #841: FILE: ./hal/odm.c:841:
    +void odm_CommonInfoSelfUpdate(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #867: FILE: ./hal/odm.c:867:
    +void odm_CmnInfoInit_Debug(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #888: FILE: ./hal/odm.c:888:
    +void odm_BasicDbgMessage(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #935: FILE: ./hal/odm.c:935:
    +void odm_RateAdaptiveMaskInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #937: FILE: ./hal/odm.c:937:
    +	struct ODM_RATE_ADAPTIVE * pOdmRA = &pDM_Odm->RateAdaptive;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #953: FILE: ./hal/odm.c:953:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1083: FILE: ./hal/odm.c:1083:
    +void odm_RefreshRateAdaptiveMask(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1094: FILE: ./hal/odm.c:1094:
    +void odm_RefreshRateAdaptiveMaskCE(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1131: FILE: ./hal/odm.c:1131:
    +	struct DM_ODM_T * pDM_Odm,

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1137: FILE: ./hal/odm.c:1137:
    +	struct ODM_RATE_ADAPTIVE * pRA = &pDM_Odm->RateAdaptive;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1196: FILE: ./hal/odm.c:1196:
    +void odm_RSSIMonitorInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1198: FILE: ./hal/odm.c:1198:
    +	struct RA_T * pRA_Table = &pDM_Odm->DM_RA_Table;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1204: FILE: ./hal/odm.c:1204:
    +void odm_RSSIMonitorCheck(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1217: FILE: ./hal/odm.c:1217:
    +	struct DM_ODM_T * pDM_Odm = &(pHalData->odmpriv);

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1234: FILE: ./hal/odm.c:1234:
    +void odm_RSSIMonitorCheckCE(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1243: FILE: ./hal/odm.c:1243:
    +	struct RA_T * pRA_Table = &pDM_Odm->DM_RA_Table;

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1306: FILE: ./hal/odm.c:1306:
    +static u8 getSwingIndex(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1330: FILE: ./hal/odm.c:1330:
    +void odm_TXPowerTrackingInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1374: FILE: ./hal/odm.c:1374:
    +void ODM_TXPowerTrackingCheck(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1398: FILE: ./hal/odm.c:1398:
    +void odm_SwAntDetectInit(struct DM_ODM_T * pDM_Odm)

    ERROR:POINTER_LOCATION: "foo * bar" should be "foo *bar"
    #1400: FILE: ./hal/odm.c:1400:
    +	struct SWAT_T * pDM_SWAT_Table = &pDM_Odm->DM_SWAT_Table;

Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marco Cesati <marcocesati@gmail.com>
Link: https://lore.kernel.org/r/20210315170618.2566-21-marcocesati@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
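The POINTER_LOCATION errors listed in the commit above all concern where the `*` sits in a pointer declaration. As a minimal sketch of the style checkpatch asks for, the snippet below uses a hypothetical `struct dm_odm` and helper names that are not taken from the driver; only the spacing convention comes from the checkpatch output.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's struct DM_ODM_T. */
struct dm_odm {
	int rssi_min;
};

/*
 * checkpatch would flag this spelling of the parameter:
 *     static int get_rssi_min(const struct dm_odm * odm)
 * The preferred form puts the '*' next to the identifier.
 */
static int get_rssi_min(const struct dm_odm *odm)
{
	return odm->rssi_min;
}

/* Small usage helper: the same rule applies to local declarations. */
static int pointer_style_demo(void)
{
	struct dm_odm odm = { .rssi_min = 42 };
	struct dm_odm *p = &odm;	/* "foo *bar", not "foo * bar" */

	assert(p == &odm);
	return get_rssi_min(p);
}
```

Both spellings compile to the same code; the change is purely stylistic, which is why the commit touches only whitespace.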
ojeda added a commit to ojeda/linux that referenced this pull request Jun 3, 2021
GitHub: a couple improvements to the issue templates
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Jan 30, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using another sched field.

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 1, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 2, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 9, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
torvalds pushed a commit that referenced this pull request Feb 10, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
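The guard described in the commit message above can be illustrated with a small user-space sketch. Every type and function name here (`mock_sched`, `mock_ring`, `mock_sched_init`, `mock_sched_fini`, `mock_ring_sw_fini`) is a hypothetical stand-in, not the real DRM scheduler or amdgpu API; the only detail taken from the message is the idea of checking `sched.ops` rather than `sched.ready` before running the fini path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real DRM/amdgpu structures. */
struct mock_sched_ops {
	int unused;
};

struct mock_sched {
	const struct mock_sched_ops *ops;	/* set only by a successful init */
	bool ready;				/* the driver may set this on its own */
	int fini_calls;
};

struct mock_ring {
	struct mock_sched sched;
};

static const struct mock_sched_ops mock_ops = { 0 };

/* Assumed init: records the ops pointer, which marks the scheduler as set up. */
static int mock_sched_init(struct mock_sched *sched)
{
	sched->ops = &mock_ops;
	return 0;
}

/* Assumed fini: only safe to call after mock_sched_init() succeeded. */
static void mock_sched_fini(struct mock_sched *sched)
{
	sched->fini_calls++;
	sched->ops = NULL;
}

/* Tear down a ring, skipping fini when init never ran. Returns true if fini ran. */
static bool mock_ring_sw_fini(struct mock_ring *ring)
{
	if (!ring->sched.ops)	/* 'ready' is not trusted here, as in the message */
		return false;
	mock_sched_fini(&ring->sched);
	return true;
}

/* Usage: a ring marked ready but never initialized must not reach fini. */
static void mock_guard_demo(void)
{
	struct mock_ring ring = { 0 };

	ring.sched.ready = true;		/* probe failed before init */
	assert(!mock_ring_sw_fini(&ring));
	assert(ring.sched.fini_calls == 0);

	assert(mock_sched_init(&ring.sched) == 0);
	assert(mock_ring_sw_fini(&ring));
	assert(ring.sched.fini_calls == 1);
}
```

Keying the guard on a field that only the init routine writes avoids depending on a flag whose meaning the driver has repurposed, which is the concern the message raises about `sched.ready`.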
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 10, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Feb 13, 2023
commit 5ad7bbf upstream.

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently faced a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field: in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Feb 14, 2023
commit 5ad7bbf upstream.

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we faced a driver probe failure on the Steam Deck
recently, and drm_sched_fini() was called even though its counter-part
had not been called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Note that ideally we'd use sched.ready for that; that field is set as the last
thing in drm_sched_init(). But amdgpu seems to "override" the meaning of that
field - in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
someone5678 referenced this pull request in someone5678/zen-kernel Feb 28, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we faced a driver probe failure on the Steam Deck
recently, and drm_sched_fini() was called even though its counter-part
had not been called previously, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check whether the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that, since that field is set as the
last step of drm_sched_init(). But amdgpu seems to "override" the meaning of
that field: in the above oops, for example, it was a GFX ring causing the crash,
and the sched.ready field was set to true in the ring init routine, regardless
of the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://lore.kernel.org/r/20230201164814.1353383-1-gpiccoli@igalia.com
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Peter-JanGootzen pushed a commit to Peter-JanGootzen/linux that referenced this pull request May 7, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine. That function is expected to be called only after the
respective init function, drm_sched_init(), has executed successfully.

We recently hit a driver probe failure on the Steam Deck in which
drm_sched_fini() was called even though its counterpart had never been
called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check whether the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that, since that field is set as the
last step of drm_sched_init(). But amdgpu seems to "override" the meaning of
that field: in the above oops, for example, it was a GFX ring causing the crash,
and the sched.ready field was set to true in the ring init routine, regardless
of the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Oct 27, 2023
Enable cpu v4 tests for LoongArch. Currently, the LoongArch JIT has no
BPF trampoline, so the fentry test `test_ptr_struct_arg` still fails;
a follow-up will address that.
Test results are attached below:

  # ./test_progs -t verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
  #316/1   verifier_bswap/BSWAP, 16:OK
  #316/2   verifier_bswap/BSWAP, 16 @unpriv:OK
  #316/3   verifier_bswap/BSWAP, 32:OK
  #316/4   verifier_bswap/BSWAP, 32 @unpriv:OK
  #316/5   verifier_bswap/BSWAP, 64:OK
  #316/6   verifier_bswap/BSWAP, 64 @unpriv:OK
  #316     verifier_bswap:OK
  #330/1   verifier_gotol/gotol, small_imm:OK
  #330/2   verifier_gotol/gotol, small_imm @unpriv:OK
  #330     verifier_gotol:OK
  #338/1   verifier_ldsx/LDSX, S8:OK
  #338/2   verifier_ldsx/LDSX, S8 @unpriv:OK
  #338/3   verifier_ldsx/LDSX, S16:OK
  #338/4   verifier_ldsx/LDSX, S16 @unpriv:OK
  #338/5   verifier_ldsx/LDSX, S32:OK
  #338/6   verifier_ldsx/LDSX, S32 @unpriv:OK
  #338/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
  #338/8   verifier_ldsx/LDSX, S16 range checking:OK
  #338/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
  #338/10  verifier_ldsx/LDSX, S32 range checking:OK
  #338/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
  #338     verifier_ldsx:OK
  #349/1   verifier_movsx/MOV32SX, S8:OK
  #349/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
  #349/3   verifier_movsx/MOV32SX, S16:OK
  #349/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
  #349/5   verifier_movsx/MOV64SX, S8:OK
  #349/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
  #349/7   verifier_movsx/MOV64SX, S16:OK
  #349/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
  #349/9   verifier_movsx/MOV64SX, S32:OK
  #349/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
  #349/11  verifier_movsx/MOV32SX, S8, range_check:OK
  #349/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
  #349/13  verifier_movsx/MOV32SX, S16, range_check:OK
  #349/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
  #349/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
  #349/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
  #349/17  verifier_movsx/MOV64SX, S8, range_check:OK
  #349/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
  #349/19  verifier_movsx/MOV64SX, S16, range_check:OK
  #349/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
  #349/21  verifier_movsx/MOV64SX, S32, range_check:OK
  #349/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
  #349/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
  #349/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
  #349     verifier_movsx:OK
  #361/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
  #361/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
  #361/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
  #361/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
  #361/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
  #361/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
  #361/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
  #361/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
  #361/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
  #361/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
  #361/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
  #361/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
  #361/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
  #361/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
  #361/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
  #361/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
  #361/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
  #361/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
  #361/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
  #361/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
  #361/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
  #361/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
  #361/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
  #361/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
  #361/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
  #361/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
  #361/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
  #361/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
  #361/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
  #361/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
  #361/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
  #361/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
  #361/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
  #361/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
  #361/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
  #361/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
  #361/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
  #361/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
  #361/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
  #361/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
  #361/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
  #361/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
  #361/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
  #361/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
  #361/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
  #361/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
  #361/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
  #361/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
  #361/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
  #361/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
  #361/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
  #361/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
  #361/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
  #361/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
  #361/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
  #361/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
  #361/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
  #361/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
  #361/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
  #361/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
  #361/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
  #361/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
  #361/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
  #361/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
  #361/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
  #361/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
  #361/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
  #361/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
  #361/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
  #361/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
  #361/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
  #361/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
  #361/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
  #361/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
  #361/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
  #361/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
  #361/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
  #361/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
  #361/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
  #361/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
  #361/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
  #361/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
  #361/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
  #361/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
  #361/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
  #361/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
  #361/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
  #361/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
  #361/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
  #361/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
  #361/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
  #361/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
  #361/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
  #361/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
  #361/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
  #361/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
  #361/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
  #361/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
  #361/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
  #361/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
  #361/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
  #361/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
  #361/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
  #361/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
  #361/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
  #361/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
  #361/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
  #361/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
  #361/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
  #361/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
  #361/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
  #361/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
  #361/113 verifier_sdiv/SDIV32, zero divisor:OK
  #361/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
  #361/115 verifier_sdiv/SDIV64, zero divisor:OK
  #361/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
  #361/117 verifier_sdiv/SMOD32, zero divisor:OK
  #361/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
  #361/119 verifier_sdiv/SMOD64, zero divisor:OK
  #361/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
  #361     verifier_sdiv:OK
  Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t ldsx_insn
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116/2   ldsx_insn/ctx_member_sign_ext:OK
  #116/3   ldsx_insn/ctx_member_narrow_sign_ext:OK
  #116     ldsx_insn:FAIL

  All error logs:
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116     ldsx_insn:FAIL
  Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
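The `-524` in the log above is the kernel-internal ENOTSUPP code ("Operation is not supported"), defined as 524 in the kernel's include/linux/errno.h. It is not part of the user-space errno set, which is presumably why libbpf reports `strerror_r(-524)=22` instead of a readable message. A minimal sketch of a decoder follows; `describe_errno()` and `KERNEL_ENOTSUPP` are hypothetical names introduced here for illustration and are not libbpf or kernel API.

```c
#include <stdio.h>
#include <string.h>

/* Value of ENOTSUPP in the kernel's include/linux/errno.h. The constant is
 * not exported to user space, so it is defined locally (hypothetical name). */
#define KERNEL_ENOTSUPP 524

/* Write a human-readable description of 'err' into 'buf' and return 'buf'.
 * Accepts either sign, since logs usually print the negative form. */
const char *describe_errno(int err, char *buf, size_t len)
{
    int code = err < 0 ? -err : err;

    if (code == KERNEL_ENOTSUPP)
        snprintf(buf, len, "ENOTSUPP (kernel-internal): operation is not supported");
    else
        snprintf(buf, len, "%s", strerror(code));
    return buf;
}
```

Calling `describe_errno(-524, buf, sizeof buf)` yields the ENOTSUPP text, while ordinary codes fall through to the C library's `strerror()`.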
samueldr pushed a commit to samueldr/linux that referenced this pull request Nov 3, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine. That function is expected to be called only after the
respective init function, drm_sched_init(), has executed successfully.

We recently hit a driver probe failure on the Steam Deck in which
drm_sched_fini() was called even though its counterpart had never been
called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check whether the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that, since that field is set as the
last step of drm_sched_init(). But amdgpu seems to "override" the meaning of
that field: in the above oops, for example, it was a GFX ring causing the crash,
and the sched.ready field was set to true in the ring init routine, regardless
of the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Nov 9, 2023
Enable the cpu v4 tests for LoongArch. Currently, the LoongArch JIT has no
BPF trampoline, so the fentry test `test_ptr_struct_arg` still fails;
a follow-up will address that.

Test results are attached below:

  # ./test_progs -t verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
  #316/1   verifier_bswap/BSWAP, 16:OK
  #316/2   verifier_bswap/BSWAP, 16 @unpriv:OK
  #316/3   verifier_bswap/BSWAP, 32:OK
  #316/4   verifier_bswap/BSWAP, 32 @unpriv:OK
  #316/5   verifier_bswap/BSWAP, 64:OK
  #316/6   verifier_bswap/BSWAP, 64 @unpriv:OK
  #316     verifier_bswap:OK
  #330/1   verifier_gotol/gotol, small_imm:OK
  #330/2   verifier_gotol/gotol, small_imm @unpriv:OK
  #330     verifier_gotol:OK
  #338/1   verifier_ldsx/LDSX, S8:OK
  #338/2   verifier_ldsx/LDSX, S8 @unpriv:OK
  #338/3   verifier_ldsx/LDSX, S16:OK
  #338/4   verifier_ldsx/LDSX, S16 @unpriv:OK
  #338/5   verifier_ldsx/LDSX, S32:OK
  #338/6   verifier_ldsx/LDSX, S32 @unpriv:OK
  #338/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
  #338/8   verifier_ldsx/LDSX, S16 range checking:OK
  #338/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
  #338/10  verifier_ldsx/LDSX, S32 range checking:OK
  #338/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
  #338     verifier_ldsx:OK
  #349/1   verifier_movsx/MOV32SX, S8:OK
  #349/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
  #349/3   verifier_movsx/MOV32SX, S16:OK
  #349/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
  #349/5   verifier_movsx/MOV64SX, S8:OK
  #349/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
  #349/7   verifier_movsx/MOV64SX, S16:OK
  #349/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
  #349/9   verifier_movsx/MOV64SX, S32:OK
  #349/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
  #349/11  verifier_movsx/MOV32SX, S8, range_check:OK
  #349/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
  #349/13  verifier_movsx/MOV32SX, S16, range_check:OK
  #349/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
  #349/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
  #349/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
  #349/17  verifier_movsx/MOV64SX, S8, range_check:OK
  #349/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
  #349/19  verifier_movsx/MOV64SX, S16, range_check:OK
  #349/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
  #349/21  verifier_movsx/MOV64SX, S32, range_check:OK
  #349/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
  #349/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
  #349/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
  #349     verifier_movsx:OK
  #361/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
  #361/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
  #361/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
  #361/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
  #361/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
  #361/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
  #361/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
  #361/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
  #361/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
  #361/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
  #361/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
  #361/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
  #361/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
  #361/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
  #361/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
  #361/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
  #361/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
  #361/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
  #361/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
  #361/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
  #361/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
  #361/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
  #361/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
  #361/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
  #361/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
  #361/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
  #361/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
  #361/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
  #361/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
  #361/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
  #361/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
  #361/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
  #361/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
  #361/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
  #361/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
  #361/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
  #361/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
  #361/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
  #361/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
  #361/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
  #361/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
  #361/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
  #361/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
  #361/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
  #361/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
  #361/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
  #361/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
  #361/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
  #361/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
  #361/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
  #361/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
  #361/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
  #361/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
  #361/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
  #361/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
  #361/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
  #361/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
  #361/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
  #361/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
  #361/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
  #361/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
  #361/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
  #361/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
  #361/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
  #361/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
  #361/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
  #361/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
  #361/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
  #361/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
  #361/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
  #361/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
  #361/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
  #361/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
  #361/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
  #361/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
  #361/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
  #361/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
  #361/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
  #361/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
  #361/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
  #361/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
  #361/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
  #361/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
  #361/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
  #361/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
  #361/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
  #361/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
  #361/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
  #361/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
  #361/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
  #361/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
  #361/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
  #361/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
  #361/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
  #361/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
  #361/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
  #361/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
  #361/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
  #361/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
  #361/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
  #361/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
  #361/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
  #361/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
  #361/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
  #361/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
  #361/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
  #361/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
  #361/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
  #361/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
  #361/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
  #361/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
  #361/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
  #361/113 verifier_sdiv/SDIV32, zero divisor:OK
  #361/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
  #361/115 verifier_sdiv/SDIV64, zero divisor:OK
  #361/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
  #361/117 verifier_sdiv/SMOD32, zero divisor:OK
  #361/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
  #361/119 verifier_sdiv/SMOD64, zero divisor:OK
  #361/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
  #361     verifier_sdiv:OK
  Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t ldsx_insn
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116/2   ldsx_insn/ctx_member_sign_ext:OK
  #116/3   ldsx_insn/ctx_member_narrow_sign_ext:OK
  #116     ldsx_insn:FAIL

  All error logs:
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116     ldsx_insn:FAIL
  Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
yetist pushed a commit to loongarchlinux/linux that referenced this pull request Nov 13, 2023
Enable the cpu v4 tests for LoongArch. Currently, we don't have BPF
trampoline in LoongArch JIT, so the fentry test `test_ptr_struct_arg`
still fails; we will follow up on it.

Test result attached below:

  # ./test_progs -t verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
  #316/1   verifier_bswap/BSWAP, 16:OK
  #316/2   verifier_bswap/BSWAP, 16 @unpriv:OK
  #316/3   verifier_bswap/BSWAP, 32:OK
  #316/4   verifier_bswap/BSWAP, 32 @unpriv:OK
  #316/5   verifier_bswap/BSWAP, 64:OK
  #316/6   verifier_bswap/BSWAP, 64 @unpriv:OK
  #316     verifier_bswap:OK
  #330/1   verifier_gotol/gotol, small_imm:OK
  #330/2   verifier_gotol/gotol, small_imm @unpriv:OK
  #330     verifier_gotol:OK
  #338/1   verifier_ldsx/LDSX, S8:OK
  #338/2   verifier_ldsx/LDSX, S8 @unpriv:OK
  #338/3   verifier_ldsx/LDSX, S16:OK
  #338/4   verifier_ldsx/LDSX, S16 @unpriv:OK
  #338/5   verifier_ldsx/LDSX, S32:OK
  #338/6   verifier_ldsx/LDSX, S32 @unpriv:OK
  #338/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
  #338/8   verifier_ldsx/LDSX, S16 range checking:OK
  #338/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
  #338/10  verifier_ldsx/LDSX, S32 range checking:OK
  #338/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
  #338     verifier_ldsx:OK
  #349/1   verifier_movsx/MOV32SX, S8:OK
  #349/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
  #349/3   verifier_movsx/MOV32SX, S16:OK
  #349/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
  #349/5   verifier_movsx/MOV64SX, S8:OK
  #349/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
  #349/7   verifier_movsx/MOV64SX, S16:OK
  #349/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
  #349/9   verifier_movsx/MOV64SX, S32:OK
  #349/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
  #349/11  verifier_movsx/MOV32SX, S8, range_check:OK
  #349/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
  #349/13  verifier_movsx/MOV32SX, S16, range_check:OK
  #349/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
  #349/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
  #349/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
  #349/17  verifier_movsx/MOV64SX, S8, range_check:OK
  #349/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
  #349/19  verifier_movsx/MOV64SX, S16, range_check:OK
  #349/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
  #349/21  verifier_movsx/MOV64SX, S32, range_check:OK
  #349/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
  #349/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
  #349/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
  #349     verifier_movsx:OK
  #361/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
  #361/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
  #361/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
  #361/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
  #361/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
  #361/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
  #361/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
  #361/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
  #361/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
  #361/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
  #361/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
  #361/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
  #361/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
  #361/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
  #361/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
  #361/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
  #361/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
  #361/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
  #361/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
  #361/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
  #361/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
  #361/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
  #361/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
  #361/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
  #361/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
  #361/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
  #361/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
  #361/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
  #361/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
  #361/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
  #361/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
  #361/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
  #361/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
  #361/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
  #361/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
  #361/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
  #361/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
  #361/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
  #361/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
  #361/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
  #361/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
  #361/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
  #361/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
  #361/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
  #361/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
  #361/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
  #361/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
  #361/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
  #361/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
  #361/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
  #361/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
  #361/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
  #361/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
  #361/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
  #361/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
  #361/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
  #361/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
  #361/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
  #361/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
  #361/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
  #361/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
  #361/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
  #361/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
  #361/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
  #361/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
  #361/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
  #361/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
  #361/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
  #361/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
  #361/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
  #361/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
  #361/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
  #361/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
  #361/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
  #361/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
  #361/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
  #361/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
  #361/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
  #361/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
  #361/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
  #361/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
  #361/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
  #361/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
  #361/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
  #361/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
  #361/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
  #361/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
  #361/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
  #361/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
  #361/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
  #361/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
  #361/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
  #361/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
  #361/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
  #361/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
  #361/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
  #361/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
  #361/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
  #361/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
  #361/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
  #361/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
  #361/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
  #361/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
  #361/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
  #361/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
  #361/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
  #361/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
  #361/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
  #361/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
  #361/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
  #361/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
  #361/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
  #361/113 verifier_sdiv/SDIV32, zero divisor:OK
  #361/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
  #361/115 verifier_sdiv/SDIV64, zero divisor:OK
  #361/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
  #361/117 verifier_sdiv/SMOD32, zero divisor:OK
  #361/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
  #361/119 verifier_sdiv/SMOD64, zero divisor:OK
  #361/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
  #361     verifier_sdiv:OK
  Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t ldsx_insn
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116/2   ldsx_insn/ctx_member_sign_ext:OK
  #116/3   ldsx_insn/ctx_member_narrow_sign_ext:OK
  #116     ldsx_insn:FAIL

  All error logs:
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116     ldsx_insn:FAIL
  Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
RevySR pushed a commit to RevySR/linux that referenced this pull request Nov 23, 2023
fozog pushed a commit to fozog/linux that referenced this pull request Nov 30, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine. That function is expected to be called only after the
respective init function, drm_sched_init(), has executed successfully.

It happens that we recently faced a driver probe failure on the Steam
Deck, and drm_sched_fini() was called even though its counterpart had
not been called beforehand, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check whether the drm_sched was properly initialized for a
given ring before calling its fini counterpart.

Note that ideally we'd use sched.ready for that; this field is set as the last
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this
field - in the above oops, for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org