msm8937: KGSL doesn't work and sometimes makes kp when tries to load fw #1

Closed
MrArtemSid opened this issue Aug 26, 2021 · 6 comments

@MrArtemSid
Contributor

KGSL doesn't work on the A505 (msm8937), while it works fine on msm8917 (A308).

Issue:
[ 15.710098] kgsl kgsl-3d0: CP initialization failed to idle
[ 15.710130] kgsl kgsl-3d0: rb=0 pos=0/9 rbbm_status=800001C1/00000000 int_0_status=00800000
[ 15.710140] kgsl kgsl-3d0: hwfault=00000000
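
For comparing these status lines across boots or devices, a small parser can help. This is only a sketch: the line format is taken from the log above, the function names `parse_kgsl_status` and `set_bits` are made up for illustration, and the code lists which bits are set without claiming what any bit means on the A505.

```python
import re

# Pattern follows the "rb=... pos=.../... rbbm_status=.../... int_0_status=..." line in the log above.
KGSL_STATUS_RE = re.compile(
    r"rb=(?P<rb>\d+) pos=(?P<pos_a>\w+)/(?P<pos_b>\w+) "
    r"rbbm_status=(?P<status>[0-9A-Fa-f]+)/(?P<status3>[0-9A-Fa-f]+) "
    r"int_0_status=(?P<int0>[0-9A-Fa-f]+)"
)


def parse_kgsl_status(line):
    """Return the fields of a kgsl status line as a dict, or None if the line does not match."""
    m = KGSL_STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "rb": int(m["rb"]),
        # The base of the pos values is not obvious from the log, so keep them as strings.
        "pos": (m["pos_a"], m["pos_b"]),
        "rbbm_status": int(m["status"], 16),
        "rbbm_status_2": int(m["status3"], 16),  # second value after the slash; its meaning is not assumed here
        "int_0_status": int(m["int0"], 16),
    }


def set_bits(value):
    """List the indexes of the bits that are set in value, lowest first."""
    return [i for i in range(value.bit_length()) if value >> i & 1]


if __name__ == "__main__":
    line = "[ 15.710130] kgsl kgsl-3d0: rb=0 pos=0/9 rbbm_status=800001C1/00000000 int_0_status=00800000"
    fields = parse_kgsl_status(line)
    print(fields)
    print("rbbm_status bits:", set_bits(fields["rbbm_status"]))
    print("int_0_status bits:", set_bits(fields["int_0_status"]))
```

The bit indexes can then be looked up against the register definitions in the kernel tree for the GPU in question.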

ramoops link:
https://gist.github.com/MrArtemSid/52209b0f64c84ba1abccc1e5c96bb7b0

@MrArtemSid
Contributor Author

Another log, from a Xiaomi Redmi 4X (santoni):
https://gist.github.com/MrArtemSid/bfb0181d8cd1a52704a52d17b5ed05c9

@ElectroPerf

ElectroPerf commented Nov 21, 2021

Atom-X-Devs/android_kernel_xiaomi_sdm660@59f6d7d#diff-a0ef6d5f001bb7b32dda89f472249110e88cbe4aa5bb3a3e6623bbb2dde2bb0a

Try this. We need this revert because we don't have the new GPU firmware, so it crashes. Tested on ASUS and Xiaomi 4.19 kernels in our org, @Atom-X-Devs.

Edit: I reviewed the commit history of your default branch and it is missing this commit. Our stock kernel revisions also have this change, by the way, so it is indeed needed.

@MrArtemSid
Contributor Author

@ElectroPerf I tried reverting this, and it didn't help at all. We also tried different combinations of firmware, since the Adreno firmware on msm8937 devices isn't signed.

@wiktorek140

wiktorek140 commented Dec 1, 2021

Hmm, this seems interesting.

[ 11.807139] can't get fw name.

Maybe it's from the GPU driver? It isn't finding the proper file, so the load fails and it crashes. Something like a race condition between the GPU driver and mounting the partition.

Probably, because of these logs:

kgsl kgsl-3d0: Falling back to syfs fallback for: a530_pm4.fw
[ 13.694686] kgsl kgsl-3d0: Falling back to syfs fallback for: a530_pfp.fw
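
One way to check the "file not found" half of this theory after boot is to look for the two firmware files in the usual places. This is a rough sketch only: the directory list is an assumption (the real search paths depend on the kernel's firmware path configuration and on the ROM layout), and `find_firmware` is a made-up helper name. It cannot show a race by itself, only whether the files exist once the partitions are mounted.

```python
from pathlib import Path

# Assumed candidate directories; adjust to the device's actual firmware search paths.
CANDIDATE_DIRS = ["/vendor/firmware", "/system/etc/firmware", "/lib/firmware"]


def find_firmware(names, dirs=CANDIDATE_DIRS):
    """Map each firmware file name to the first directory that contains it, or None."""
    found = {}
    for name in names:
        found[name] = next(
            (str(Path(d)) for d in dirs if (Path(d) / name).is_file()),
            None,
        )
    return found


if __name__ == "__main__":
    # File names taken from the fallback messages in the log above.
    for name, where in find_firmware(["a530_pm4.fw", "a530_pfp.fw"]).items():
        print(f"{name}: {where or 'not found'}")
```

If both files are present after boot but the driver still falls back, that would point toward timing (the driver probing before the partition is mounted) rather than a missing file.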

@MrArtemSid
Contributor Author

MrArtemSid commented Dec 2, 2021

kgsl kgsl-3d0: Falling back to syfs fallback for: a530_pm4.fw
[ 13.694686] kgsl kgsl-3d0: Falling back to syfs fallback for: a530_pfp.fw

This happens even on sdm660, so we can ignore it.

"can't get fw name." comes from sensors, as far as I remember.

MrArtemSid pushed a commit that referenced this issue Dec 11, 2021
[   15.615401] --------gf_parse_dts end---OK.--------
[   15.624789] ------------[ cut here ]------------
[   15.624815] WARNING: CPU: 1 PID: 563 at ../kernel/irq/manage.c:448 enable_irq+0x74/0x9c()
[   15.624821] Unbalanced enable for IRQ 16
[   15.624828] Modules linked in: exfat
[   15.624840] CPU: 1 PID: 563 Comm: gx_fpd Not tainted 3.18.31-Clarity-Personal #1
[   15.624843] Hardware name: Qualcomm Technologies, Inc. MSM8940-PMI8950 QRD SKU7 (DT)
[   15.624847] Call trace:
[   15.624858] [<ffffffc000089754>] dump_backtrace+0x0/0x270
[   15.624862] ---[ end trace 6f1a0564338675ac ]---
[   15.624890] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_reset'
[   15.627915] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_active'
[   15.627931] goodix_fp soc:goodix_fp: IRQ after reset 1
@MrArtemSid
Contributor Author

Fixed by:
6503735
92e96e3
32874a3

MrArtemSid pushed a commit that referenced this issue Mar 28, 2023
Debian's clang carries a patch that makes the default FPU mode
'vfp3-d16' instead of 'neon' for 'armv7-a' to avoid generating NEON
instructions on hardware that does not support them:

https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/raw/5a61ca6f21b4ad8c6ac4970e5ea5a7b5b4486d22/debian/patches/clang-arm-default-vfp3-on-armv7a.patch
https://bugs.debian.org/841474
https://bugs.debian.org/842142
https://bugs.debian.org/914268

This results in the following build error when clang's integrated
assembler is used because the '.arch' directive overrides the '.fpu'
directive:

arch/arm/crypto/curve25519-core.S:25:2: error: instruction requires: NEON
 vmov.i32 q0, #1
 ^
arch/arm/crypto/curve25519-core.S:26:2: error: instruction requires: NEON
 vshr.u64 q1, q0, #7
 ^
arch/arm/crypto/curve25519-core.S:27:2: error: instruction requires: NEON
 vshr.u64 q0, q0, #8
 ^
arch/arm/crypto/curve25519-core.S:28:2: error: instruction requires: NEON
 vmov.i32 d4, #19
 ^

Shuffle the order of the '.arch' and '.fpu' directives so that the code
builds regardless of the default FPU mode. This has been tested against
both clang with and without Debian's patch and GCC.

Bug: 254441685
Cc: stable@vger.kernel.org
Fixes: d8f1308a025f ("crypto: arm/curve25519 - wire up NEON implementation")
Link: https://github.com/ClangBuiltLinux/continuous-integration2/issues/118
Reported-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 44200f2d9b8b52389c70e6c7bbe51e0dc6eaf938)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I2a55fa2141daa99e4ee259acd0660f505c3415ce
MrArtemSid pushed a commit that referenced this issue Mar 28, 2023
… UAF

[ Upstream commit 226fae124b2dac217ea5436060d623ff3385bc34 ]

After a call to console_unlock() in vcs_read() the vc_data struct can be
freed by vc_deallocate(). Because of that, the struct vc_data pointer
load must be done at the top of while loop in vcs_read() to avoid a UAF
when vcs_size() is called.

Syzkaller reported a UAF in vcs_size().

BUG: KASAN: use-after-free in vcs_size (drivers/tty/vt/vc_screen.c:215)
Read of size 4 at addr ffff8881137479a8 by task 4a005ed81e27e65/1537

CPU: 0 PID: 1537 Comm: 4a005ed81e27e65 Not tainted 6.2.0-rc5 #1
Hardware name: Red Hat KVM, BIOS 1.15.0-2.module
Call Trace:
  <TASK>
__asan_report_load4_noabort (mm/kasan/report_generic.c:350)
vcs_size (drivers/tty/vt/vc_screen.c:215)
vcs_read (drivers/tty/vt/vc_screen.c:415)
vfs_read (fs/read_write.c:468 fs/read_write.c:450)
...
  </TASK>

Allocated by task 1191:
...
kmalloc_trace (mm/slab_common.c:1069)
vc_allocate (./include/linux/slab.h:580 ./include/linux/slab.h:720
     drivers/tty/vt/vt.c:1128 drivers/tty/vt/vt.c:1108)
con_install (drivers/tty/vt/vt.c:3383)
tty_init_dev (drivers/tty/tty_io.c:1301 drivers/tty/tty_io.c:1413
     drivers/tty/tty_io.c:1390)
tty_open (drivers/tty/tty_io.c:2080 drivers/tty/tty_io.c:2126)
chrdev_open (fs/char_dev.c:415)
do_dentry_open (fs/open.c:883)
vfs_open (fs/open.c:1014)
...

Freed by task 1548:
...
kfree (mm/slab_common.c:1021)
vc_port_destruct (drivers/tty/vt/vt.c:1094)
tty_port_destructor (drivers/tty/tty_port.c:296)
tty_port_put (drivers/tty/tty_port.c:312)
vt_disallocate_all (drivers/tty/vt/vt_ioctl.c:662 (discriminator 2))
vt_ioctl (drivers/tty/vt/vt_ioctl.c:903)
tty_ioctl (drivers/tty/tty_io.c:2776)
...

The buggy address belongs to the object at ffff888113747800
  which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 424 bytes inside of
  1024-byte region [ffff888113747800, ffff888113747c00)

The buggy address belongs to the physical page:
page:00000000b3fe6c7c refcount:1 mapcount:0 mapping:0000000000000000
     index:0x0 pfn:0x113740
head:00000000b3fe6c7c order:3 compound_mapcount:0 subpages_mapcount:0
     compound_pincount:0
anon flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0010200 ffff888100042dc0 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff888113747880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888113747900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888113747980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                   ^
  ffff888113747a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888113747a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint

Fixes: ac751ef ("console: rename acquire/release_console_sem() to console_lock/unlock()")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Suggested-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Link: https://lore.kernel.org/r/1674577014-12374-1-git-send-email-george.kennedy@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
MrArtemSid pushed a commit that referenced this issue Apr 4, 2023
[   15.615401] --------gf_parse_dts end---OK.--------
[   15.624789] ------------[ cut here ]------------
[   15.624815] WARNING: CPU: 1 PID: 563 at ../kernel/irq/manage.c:448 enable_irq+0x74/0x9c()
[   15.624821] Unbalanced enable for IRQ 16
[   15.624828] Modules linked in: exfat
[   15.624840] CPU: 1 PID: 563 Comm: gx_fpd Not tainted 3.18.31-Clarity-Personal #1
[   15.624843] Hardware name: Qualcomm Technologies, Inc. MSM8940-PMI8950 QRD SKU7 (DT)
[   15.624847] Call trace:
[   15.624858] [<ffffffc000089754>] dump_backtrace+0x0/0x270
[   15.624862] ---[ end trace 6f1a0564338675ac ]---
[   15.624890] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_reset'
[   15.627915] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_active'
[   15.627931] goodix_fp soc:goodix_fp: IRQ after reset 1
MrArtemSid pushed a commit that referenced this issue Apr 30, 2023
[   15.615401] --------gf_parse_dts end---OK.--------
[   15.624789] ------------[ cut here ]------------
[   15.624815] WARNING: CPU: 1 PID: 563 at ../kernel/irq/manage.c:448 enable_irq+0x74/0x9c()
[   15.624821] Unbalanced enable for IRQ 16
[   15.624828] Modules linked in: exfat
[   15.624840] CPU: 1 PID: 563 Comm: gx_fpd Not tainted 3.18.31-Clarity-Personal #1
[   15.624843] Hardware name: Qualcomm Technologies, Inc. MSM8940-PMI8950 QRD SKU7 (DT)
[   15.624847] Call trace:
[   15.624858] [<ffffffc000089754>] dump_backtrace+0x0/0x270
[   15.624862] ---[ end trace 6f1a0564338675ac ]---
[   15.624890] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_reset'
[   15.627915] goodix_fp soc:goodix_fp: Selected 'goodixfp_reset_active'
[   15.627931] goodix_fp soc:goodix_fp: IRQ after reset 1

Change-Id: I9e754ffa26c5a75079d3098fbdb88bba4118f8ff