Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tama][Apollo][AOSP11][4.14] wled_ovp uses 50+% CPU when the screen is on #757

Open
faenil opened this issue Mar 1, 2022 · 15 comments
Open
Labels

Comments

@faenil
Copy link

faenil commented Mar 1, 2022

Platform: Tama
Device: Apollo
Kernel version: 4.14
Android version: AOSP11 r46
Software binaries version: v9a

Previously working on
n/a

Description

When the screen is on, top shows high CPU usage by wled_ovp

Symptoms
UI is stuttery and laggy, perhaps also due to the high CPU usage of wled_ovp

How to reproduce

  • Flash AOSP11/4.4 and the specified OEM binaries
  • Boot the device
  • Observe top output while the screen is on

Additional context
Top excerpt:

   PID USER         PR  NI VIRT  RES  SHR S[%CPU] %MEM     TIME+ ARGS 
   195 root         RT   0    0    0    0 S 61.6   0.0   4:11.87 [irq/94-wled_ovp]
  5856 shell        20   0  10G 4.8M 3.4M R  4.6   0.1   0:00.69 top
   420 root          0 -20    0    0    0 R  3.0   0.0   0:12.55 [kworker/u17:1]
  1120 system       18  -2  16G 407M 359M S  1.3  11.0   0:22.06 system_server
  1681 u0_a122      20   0  14G 212M 152M S  1.0   5.7   0:10.43 com.android.systemui
   358 root         20   0    0    0    0 I  1.0   0.0   0:02.98 [kworker/u16:10]
  5705 root         20   0    0    0    0 R  0.6   0.0   0:00.93 [kworker/u16:2]

Not much in dmesg, beside the radio service failing to start every 1 sec

@faenil faenil added the bug label Mar 1, 2022
@MarijnS95
Copy link
Contributor

MarijnS95 commented Mar 1, 2022

Poor white LEDs.

(OVP stands for Over Voltage Protection. Let's pray the voltage limit is simply set too low)

@faenil faenil changed the title [Tama][AOSP11][4.14] wled_ovp uses 50+% CPU when the screen is on [Tama][Apollo][AOSP11][4.14] wled_ovp uses 50+% CPU when the screen is on Mar 1, 2022
faenil added a commit to faenil/kernel that referenced this issue Mar 3, 2022
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
@faenil
Copy link
Author

faenil commented Mar 3, 2022

PR up :)

@MartinX3
Copy link

MartinX3 commented Mar 3, 2022

Woohoo

Auto-String-Detection means somewhere manual defined ones are wrong?

@MarijnS95
Copy link
Contributor

Let's link back to the PR here too for those not seeing GH's automated back-link: sonyxperiadev/kernel#2499

@faenil
Copy link
Author

faenil commented Mar 3, 2022

Auto-String-Detection means somewhere manual defined ones are wrong?

Yeah that was exactly my thought too...
I cross checked against stock 4.9 and sodp 4.19 dts and drivers and could not find anything clearly wrong, but noticed that they all use auto string detection, so I went for it.

It would be great to understand the root cause, indeed.

My current understanding of this driver is limited, so I appreciate any detailed pointer, else it will take my a few weeks before I put together enough spare time to go through it in detail 😀

Let's link back to the PR here too for those not seeing GH's automated back-link

Thanks for the feedback, will do so from now on 👍

@MarijnS95
Copy link
Contributor

I cross checked against stock 4.9 and sodp 4.10 dts and drivers and could not find anything clearly wrong, but noticed that they all use auto string detection, so I went for it.

That's totally scary. Fwiw, perhaps you meant 4.14 dts or some other version? I don't think we ever shipped on 4.10.

Would be great to paste some GH permalinks to the DTs enabling autodetection, to back up the reason for making such a PR. Cross-check if it's Apollo or Tama-wide, etc.

No need to dive into the driver for this, it's all configured from DT. But if you feel like playing around, delete it and import the much improved version from mainline, my patches should land in 5.17 that finally make it usable.

@faenil
Copy link
Author

faenil commented Mar 4, 2022

That's totally scary. Fwiw, perhaps you meant 4.14 dts or some other version? I don't think we ever shipped on 4.10.

Sorry, I meant 4.19 (I checked 4.9 and 4.19 to spot differences with 4.14), edited the post, thanks

Would be great to paste some GH permalinks to the DTs enabling autodetection, to back up the reason for making such a PR. Cross-check if it's Apollo or Tama-wide, etc.

Will do 👍
Of course it should not be applied to a different set of devices. I have already checked what DT block it was applied to in the other versions and I added it to the same block, but will post permalinks ;)

No need to dive into the driver for this, it's all configured from DT.

Yea, but since the variable names changed, especially some booleans, I also tried to have a quick look at the driver to make sure they were referring to same things, checking some of the register addresses matched etc

But if you feel like playing around, delete it and import the much improved version from mainline, my patches should land in 5.17 that finally make it usable.

Great, sounds like a plan!

@faenil
Copy link
Author

faenil commented Mar 4, 2022

Would be great to paste some GH permalinks to the DTs enabling autodetection, to back up the reason for making such a PR. Cross-check if it's Apollo or Tama-wide, etc.

Stock 4.9: https://github.com/sonyxperiadev/kernel-copyleft/blob/364b61ee85425f131d19a944ab5819bcb0b038c8/arch/arm64/boot/dts/qcom/pmi8998.dtsi#L585

SODP 4.19: https://github.com/sonyxperiadev/kernel/blob/85e4c4974990e0b7c1fdd65af7af2e5308f9c0af/arch/arm64/boot/dts/qcom/pmi8998.dtsi#L550

@faenil
Copy link
Author

faenil commented Mar 6, 2022

@MarijnS95 I have ported your 5.17 wled driver changes to sodp 4.14 (plus the other missing changes, except those requiring newer kernel APIs/functions such as backlight_get_brightness()) 👍

Some of the bugs you fixed were also things that got me confused when I had a quick look at the sodp 4.14 driver (things like iterating over num-strings disregarding the enabled strings). Good to see a confirmation that they were actually bugs 👍

With auto string detection disabled the issues reported in this thread is still present.
I will investigate more as I get the chance. I am curious to learn more.

Regarding auto string detection feature, if my understanding is correct it is always good to have it so that it can take care of hardware faults, but I do agree it is not the right fix for this particular issue, so I want to dig more :)

@MarijnS95
Copy link
Contributor

MarijnS95 commented Mar 7, 2022

@faenil great to hear that about the mainline WLED driver! (Though I think you meant 5.17?).

I won't be able to reply to everything for a while nor elaborate much, but I should have said that the mainline driver won't fix the "too many strings are enabled so you get OVP" bug, that still really seems to be a configuration issue.

If we've seriously used string-detection on previous kernels for apollo (I have not followed the links yet), then let's indeed keep that approach.

Though otoh, try enabling less strings (2 instead of 3) and maybe the ovps disappear, meaning that only 2 WLEDs are connected - and that should be the same for all phones, there's supposedly no panel lottery for Apollo.

EDIT: Don't forget to look at node modifiers that set different properties for this driver, you can find these under &pmi8998_wled.

@faenil
Copy link
Author

faenil commented Mar 7, 2022

@MarijnS95 yes, I meant 5.17 👍
No, I did not expect it to fix the ovp bug, just reported the test result as confirmation, but yes it was expected indeed :)
I mainly hoped the fixes from the mainline driver would perhaps highlight a dt configuration issue.
Need to have a better look at the logs to see if it is printing any additional info.
If not, I will just see what values the auto-string-detection is settling on (need to look at the source, but I assume it tries all combinations from a hardcoded set and then settles on the ones that don't cause OVP to trigger?) and see if it makes sense to use those as default in the dts.

Yeah, I did look at the node modifiers.

Thanks for your advice, I'll tinker with it a bit and let you know.

If we've seriously used string-detection on previous kernels for apollo (I have not followed the links yet), then let's indeed keep that approach.

Thanks 👍
after enabling auto string detection (both on 4.14 and 5.17 drivers) the backlight is now flickering at every boot (I guess that's the auto-string-detection running its logic) while the Sony logo is on. That's not ideal, so I first would like to dig a bit more to see if there's anything we're missing, beside enabling auto string detection 👍
Like, it might be on Stock the dts values are correct and auto-sring-detection is only enabled to take care of hardware faults (and so it doesn't run at every boot) while here we might be using it to workaround a broken dt config (causing it to run at every boot -> flickering)

faenil added a commit to faenil/kernel that referenced this issue Mar 8, 2022
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
@faenil
Copy link
Author

faenil commented Mar 9, 2022

I tested with auto-string-detection on 5.17 wled driver, and unfortunately something is not working.

The dts is the same as sodp 4.14, but while with 4.14 driver the screen flickers at boot and then there are no OVPs, with 5.17 driver the screen turns off.
After adding a few debug logs, it seems the auto string detection algo on 5.17 only runs fine for the first time (detecting that only strings 0 and 1 are valid). However faults keep getting triggered so the auto-string-detection algo is run again and those subsequent times it fails, reporting no valid sinks and leading to the display being turned off.

Note: both 4.14 and 5.17 drivers report only string 0 and 1 as valid (it seems string 2 is not, even though it's in the dts config), at least on my unit.

Need to investigate more.

@MarijnS95
Copy link
Contributor

Given the flickering and wasting startup time, I'd shoot for enabling just 2 strings and - as Alin mentioned - there's supposedly no panel lottery on these devices so there should be no variation on this rule. Still feel that this should be overridden just Apollo and Akari (and Akatsuki should have it disabled for obvious reasons), pmi8998.dtsi may work for our case but is just plain wrong in the global sense.

I haven't actually tested autodection on the mainline driver (neither before nor after my patches), perhaps it never worked or one of my patches broke it?

jerpelea pushed a commit to sonyxperiadev/kernel that referenced this issue Apr 11, 2022
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
@faenil
Copy link
Author

faenil commented Apr 11, 2022

Thanks @MarijnS95, all good points. Yeah, more investigation is needed.

Interesting that the flag was enabled on a global level in the copyleft kernel (unless Akatsuki is using a different archive with that flag turned off)

@MarijnS95
Copy link
Contributor

MarijnS95 commented Apr 11, 2022

Interesting that the flag was enabled on a global level in the copyleft kernel (unless Akatsuki is using a different archive with that flag turned off)

That's the big downside of downstream carrying a single (usually outdated and desynced) copy of various trees per device - code can just be modified at will since it "only runs on this specific device" - so to answer your question: yes Akatsuki (and I think even Apollo and Akari?) each have their own archive(s, if you count all the software versions). In contrast, mainline - and SODP to a diminishing extent - are generic and run on many different devices.

Thaodan pushed a commit to mer-hybris/android_kernel_sony_msm that referenced this issue Jul 13, 2022
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 5, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 11, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 16, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 16, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 17, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 19, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
trautamaki pushed a commit to trautamaki/4.14-android_kernel_oneplus_msm8998 that referenced this issue May 19, 2023
This fixes pmi8998 triggering huge amounts of OVP irqs on Apollo,
causing the ovp IRQ handler to take 50+% CPU when the screen is on,
leading to terrible UI performance.

More info:
sonyxperiadev/bug_tracker#757
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants