KDE Power management crash #228

Closed
shaumux opened this issue Oct 15, 2021 · 13 comments

@shaumux

shaumux commented Oct 15, 2021

I'm experiencing the same issue as #205, but on Gentoo with an AMD 6700XT.
I have updated to ddcutil 1.2 and Plasma 5.23 but still face the same issue.

I've added the ddcutilrc file and will report back once I get the error again.
This is what the backtrace looks like using coredumpctl:

Reading symbols from /usr/lib64/libexec/org_kde_powerdevil...
(No debugging symbols found in /usr/lib64/libexec/org_kde_powerdevil)

warning: core file may not match specified executable file.
[New LWP 15223]
[New LWP 15227]
[New LWP 15228]
[New LWP 15229]
[New LWP 15226]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/libexec/org_kde_powerdevil'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007ff3d97ab53e in raise () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7ff3d4ae6e40 (LWP 15223))]
(gdb) bt
#0 0x00007ff3d97ab53e in raise () at /lib64/libc.so.6
#1 0x00007ff3da9ba77c in KCrash::defaultCrashHandler(int) () at /usr/lib64/libKF5Crash.so.5
#2 0x00007ff3d97ab5c0 in () at /lib64/libc.so.6
#3 0x00007ff3d97ab53e in raise () at /lib64/libc.so.6
#4 0x00007ff3d9795536 in abort () at /lib64/libc.so.6
#5 0x00007ff3d979541f in () at /lib64/libc.so.6
#6 0x00007ff3d97a3ec2 in () at /lib64/libc.so.6
#7 0x00007ff3d2af70f6 in () at /usr/lib64/libddcutil.so.4
#8 0x00007ff3d2aebdde in ddca_open_display2 () at /usr/lib64/libddcutil.so.4
#9 0x00007ff3d3e28753 in () at /usr/lib64/qt5/plugins/kf5/powerdevil/powerdevilupowerbackend.so
#10 0x00007ff3d3e29a2c in () at /usr/lib64/qt5/plugins/kf5/powerdevil/powerdevilupowerbackend.so
#11 0x00007ff3d9e03146 in () at /usr/lib64/libQt5Core.so.5
#12 0x00007ff3d9e0778a in QTimer::timeout(QTimer::QPrivateSignal) () at /usr/lib64/libQt5Core.so.5
#13 0x00007ff3d9dfae5f in QObject::event(QEvent*) () at /usr/lib64/libQt5Core.so.5
#14 0x00007ff3d9dcf528 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib64/libQt5Core.so.5
#15 0x00007ff3d9e212fb in QTimerInfoList::activateTimers() () at /usr/lib64/libQt5Core.so.5
#16 0x00007ff3d9e21c01 in () at /usr/lib64/libQt5Core.so.5
#17 0x00007ff3d7ce5dab in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
#18 0x00007ff3d7ce6068 in () at /usr/lib64/libglib-2.0.so.0
#19 0x00007ff3d7ce611f in g_main_context_iteration () at /usr/lib64/libglib-2.0.so.0
#20 0x00007ff3d9e21fb4 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#21 0x00007ff3d9dcdefb in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib64/libQt5Core.so.5
#22 0x00007ff3d9dd62dd in QCoreApplication::exec() () at /usr/lib64/libQt5Core.so.5
#23 0x000055c797e42c7d in ()
#24 0x00007ff3d97967fd in __libc_start_main () at /lib64/libc.so.6
#25 0x000055c797e42cea in ()

@rockowitz rockowitz added amdgpu navi Radeon Navi chipset labels Oct 16, 2021
@rockowitz
Owner

First, please ensure that PowerDevil is picking up the latest libddcutil. The system log should contain a line that looks like libddcutil[258928]: Initializing. ddcutil version 1.2.0

Please run ddcutil environment --very-verbose and submit the output as an attachment. (--very-verbose is an undocumented option, not intended for general use.) Given that the command probes for monitors using many different techniques, it may well trigger the crash. In any event, the output will be useful.

Since your trace shows the crash occurs in function ddca_open_display2(), please ensure that your ddcutilrc file includes at least the following options in section [libddcutil]:

[libddcutil]
options: --tid --libddcutil-trace-file <your tracefile name> --trace api --trcfunc ddc_open_display 

Again, please send the trace file showing the crash as an attachment rather than including it inline.

I look forward to getting a better understanding of what is happening.

@shaumux
Author

shaumux commented Oct 17, 2021

PowerDevil is definitely picking it up

shaumux@Shaumux-PC ~ $ journalctl -b 0 | grep -i 'ddc'
Oct 17 12:25:52 Shaumux-PC libddcutil[2260]: Initializing. ddcutil version 1.2.0
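
When several reboots are involved, the same check can be scripted. The following is a minimal sketch that scans saved log text for the initialization line shown above. The function name `loaded_versions` and the idea of reading the log from a string are assumptions made for illustration; only the line format comes from this thread.

```python
import re

# Matches lines such as:
#   libddcutil[2260]: Initializing. ddcutil version 1.2.0
INIT_LINE = re.compile(r"libddcutil\[(\d+)\]: Initializing\. ddcutil version (\S+)")


def loaded_versions(log_text):
    """Return (pid, version) pairs, one per libddcutil initialization line."""
    return [(int(pid), version) for pid, version in INIT_LINE.findall(log_text)]


if __name__ == "__main__":
    sample = "Oct 17 12:25:52 Shaumux-PC libddcutil[2260]: Initializing. ddcutil version 1.2.0"
    print(loaded_versions(sample))
```

The log text could come from `journalctl -b 0` output saved to a file. Any version other than the expected one points to an older library still being loaded.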

The output of ddcutil environment --very-verbose
verbose_ddc_env.txt

The trace file
ddcutil_log.txt

Interestingly, with the options enabled in the ddcutilrc file, KDE no longer detects the brightness control, but PowerDevil still seems to crash, only without a backtrace.

@shaumux
Author

shaumux commented Oct 17, 2021

The system log difference with and without the ddcutilrc options:
powerdevil_norc.txt
powerdevil_ddcutilrc.txt

EDIT:
Also, after disabling and re-enabling the options and rebooting, it seems to have generated a slightly different trace file:
ddcutil_log_new.txt

@shaumux
Author

shaumux commented Oct 17, 2021

I was finally able to get a crash with a trace while the options were enabled, after a few reboots. Here's the file; hopefully all this helps.
ddcutil_log_crash.txt

@rockowitz
Owner

I'm baffled as to what's going on. The coredump trace clearly shows the failure as being within ddca_open_display2(). Yet the trace logs always show ddca_open_display2() returning successfully.

Branch 1.2.1-dev contains additional tracing to look at called functions of interest, and to output detailed information for more assert() failures. Perhaps you'll hit a case where the trace file shows a function that doesn't return, or there's useful output in the system log. The libddcutil entry in ddcutilrc should now be:

[libddcutil]
options: --tid --libddcutil-trace-file <your tracefile name> --trace api --trcfunc ddc_open_display --trcfunc ddc_close_display --trcfunc ddc_is_valid_display_ref --trcfunc ddc_is_valid_display_handle --trcfunc lock_distinct_display --trcfunc unlock_distinct_display --trcfunc free_display_handle

If a crash is occurring within libddcutil I'd like to understand it, but since Gentoo is a source based distribution you can simply unset macro WITH_DDCUTIL when building PowerDevil. The PowerDevil code that uses libddcutil is quite rudimentary, and has not been maintained. The developer recommends that it not be used.

@shaumux
Author

shaumux commented Oct 20, 2021

Yes, I can disable the use of ddcutil, but I'll try to help with this issue.

I thought this line in the logs seemed interesting: [ 4918](lock_distinct_display ) Attempting to lock display already locked by current thread

I'll try to use the dev branch and enable the additional tracing, maybe tonight or tomorrow.

@rockowitz
Owner

Thanks so much for your help.

The lock_distinct_display message is indeed interesting. I suspect what's happening is that after a suspend, PowerDevil tries to open the display again without having closed it.

I'm going to take a look at enabling the PowerDevil debug messages along with those from libddcutil. I'll let you know when it makes sense to test again.

@rockowitz
Owner

I've made several enhancements to libddcutil tracing on the 1.2.1-dev branch. In particular, utility option --f3 causes trace messages to be duplicated to the system log. (This is a quick and dirty implementation. Not everything will be caught.)

The libddcutil options string should now be:

options: --f3 --tid --libddcutil-trace-file libtrace.trc --trace api --trcfunc free_display_handle --trcfunc ddc_open_display --trcfunc ddc_close_display --trcfunc ddc_is_valid_display_ref --trcfunc ddc_is_valid_display_handle --trcfunc lock_distinct_display --trcfunc unlock_distinct_display --trcfunc get_distinct_display_ref
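
Because the options string is long and repetitive, it is easy to mistype. One way to avoid that is to generate it, as in the minimal sketch below. It assumes each --trcfunc takes a bare function name. The helper name `build_options` and its parameters are hypothetical; the flags and function names are the ones quoted in this thread.

```python
def build_options(trace_file, functions, to_syslog=True):
    """Compose the 'options:' line for the [libddcutil] section of ddcutilrc."""
    parts = []
    if to_syslog:
        # --f3 is the utility option described above (1.2.1-dev branch):
        # it duplicates trace messages to the system log.
        parts.append("--f3")
    parts += ["--tid", "--libddcutil-trace-file", trace_file, "--trace", "api"]
    for name in functions:
        # Reject anything that is not a plain identifier, e.g. a name that
        # accidentally starts with dashes.
        if not name.isidentifier():
            raise ValueError(f"not a function name: {name!r}")
        parts += ["--trcfunc", name]
    return "options: " + " ".join(parts)


if __name__ == "__main__":
    traced = [
        "free_display_handle",
        "ddc_open_display",
        "ddc_close_display",
        "ddc_is_valid_display_ref",
        "ddc_is_valid_display_handle",
        "lock_distinct_display",
        "unlock_distinct_display",
        "get_distinct_display_ref",
    ]
    print("[libddcutil]")
    print(build_options("libtrace.trc", traced))
```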

I'm not familiar with KDE programming, but as I understand the documentation, setting environment variable QT_LOGGING_RULES causes KDE trace output to be written to the terminal or system log. (See KDE documentation.) So the following should cause almost all powerdevil trace messages to be output:

export QT_LOGGING_RULES="*powerdevil.info=true"

Sending both powerdevil and libddcutil tracing to the system log should give a better handle on what is going on.

If you can duplicate the crash, please submit the contents of the system log starting from the first libddcutil call.

Thanks again for your help.

@shaumux
Author

shaumux commented Oct 21, 2021

Here are the log files
This is the file generated by ddcutil
libtrace.trc.txt

This is the system log from journalctl
journalctl.txt

I had to reboot, log in, log out, and log in again to get the brightness slider, so the lower parts of the logs might be of more interest.

I added QT_LOGGING_RULES="*powerdevil.info=true" to my /etc/environment exactly like that; I'm not really sure whether that worked.

rockowitz added a commit that referenced this issue Oct 22, 2021
…in current thread

Previously asserted failure if display already locked by thread but DREF_OPEN not
set in the Display_Ref.  However, this can occur if the client has created
a new Display_Ref.

Addresses the PowerDevil crash reported in issues #228 and #205.

However, this fix will probably cause failure within PowerDevil itself, which
never calls ddca_close_display(), but instead calls ddca_open_display() with
a new Display_Ref after return from suspend.
@rockowitz
Owner

I see the problem (line 1681 in journalctl.txt). PowerDevil never closes a "Display Handle" (sort of like a file handle for the I2C bus). When awakened from sleep, it tries to open a new Display Handle for the I2C bus for the same display, using a newly created "Display Reference". This puts the data structures into a state I had thought impossible.

Function ddca_open_display() on branch 1.2.1-dev now returns DDCRC_ALREADY_OPEN if the display is already open in the current thread. This fixes libddcutil but, as I read the PowerDevil code, it just pushes the failure location to PowerDevil itself.
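
The sequence described here can be illustrated with a toy model. This is only a sketch and not libddcutil's code: the classes `DisplayRef` and `DisplayRegistry`, the `bus` field, and the string status values are hypothetical stand-ins, and a single thread is assumed. Only the status name DDCRC_ALREADY_OPEN and the open-without-close pattern come from this thread.

```python
DDCRC_OK = "DDCRC_OK"                      # hypothetical stand-in for success
DDCRC_ALREADY_OPEN = "DDCRC_ALREADY_OPEN"  # status name quoted in this thread


class DisplayRef:
    """Hypothetical display reference; two refs can describe the same I2C bus."""

    def __init__(self, bus):
        self.bus = bus


class DisplayRegistry:
    """Tracks which buses are open (single-threaded illustration only)."""

    def __init__(self):
        self._open_buses = set()

    def open_display(self, dref):
        if dref.bus in self._open_buses:
            # Same physical display, possibly reached through a brand-new
            # DisplayRef: report an error status instead of asserting.
            return DDCRC_ALREADY_OPEN
        self._open_buses.add(dref.bus)
        return DDCRC_OK

    def close_display(self, dref):
        self._open_buses.discard(dref.bus)


def resume_without_close():
    """Mimic the client pattern: open, suspend/resume, open again with a new ref."""
    registry = DisplayRegistry()
    before_suspend = DisplayRef(bus=4)
    after_resume = DisplayRef(bus=4)  # new reference, same display
    return registry.open_display(before_suspend), registry.open_display(after_resume)


if __name__ == "__main__":
    print(resume_without_close())
```

In this model the second open fails cleanly, so the caller has to either close the first handle before reopening or handle the error status itself.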

Can we check that libddcutil itself no longer asserts failure, and see how PowerDevil handles the ddca_open_display() failure? I thought that "*powerdevil.info=true" would cause all more severe messages to be output as well, but only info messages appear in the syslog. As I read the documentation, just "*powerdevil=true" is at least one way to cause all messages to be output. So change the QT_LOGGING_RULES statement to:

QT_LOGGING_RULES="*powerdevil=true"
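
The difference between the two rules can be seen in a simplified model of the rule syntax that Qt documents for QLoggingCategory (category pattern, optional message type, true or false). This is a sketch of the matching logic only, not Qt's implementation. The helper names are hypothetical, and the category name "org.kde.powerdevil" is an assumption about what PowerDevil uses.

```python
MESSAGE_TYPES = ("debug", "info", "warning", "critical")


def parse_rule(rule):
    """Split 'pattern[.type]=true|false' into (pattern, type or None, enabled)."""
    left, _, right = rule.partition("=")
    pattern = left.strip()
    enabled = right.strip().lower() == "true"
    msg_type = None
    for candidate in MESSAGE_TYPES:
        if pattern.endswith("." + candidate):
            pattern = pattern[: -len(candidate) - 1]
            msg_type = candidate
            break
    return pattern, msg_type, enabled


def category_matches(pattern, category):
    """Match a category, allowing '*' at the start and/or end of the pattern."""
    starts, ends = pattern.startswith("*"), pattern.endswith("*")
    core = pattern.strip("*")
    if starts and ends:
        return core in category
    if starts:
        return category.endswith(core)
    if ends:
        return category.startswith(core)
    return category == pattern


def rule_decision(rules, category, msg_type):
    """Return True/False from the last matching rule, or None if none matches."""
    decision = None
    for rule in rules.split(";"):
        if "=" not in rule:
            continue
        pattern, rule_type, enabled = parse_rule(rule)
        if category_matches(pattern, category) and rule_type in (None, msg_type):
            decision = enabled
    return decision


if __name__ == "__main__":
    category = "org.kde.powerdevil"  # assumed category name
    for rules in ("*powerdevil.info=true", "*powerdevil=true"):
        for msg_type in MESSAGE_TYPES:
            print(rules, msg_type, rule_decision(rules, category, msg_type))
```

A result of None means no rule applied, so whatever default the category was compiled with stays in effect; that default is not modeled here. Under this model the ".info" rule touches only info messages, while the rule without a type suffix applies to every message type.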

Again, please send the libddcutil trace file and the relevant section of syslog output. Also the coredump trace if PowerDevil fails. Thank you.

@shaumux
Author

shaumux commented Oct 24, 2021

I believe ddcutil is not failing anymore. PowerDevil or something else still takes down the Plasma shell, though I can't find a coredump using coredumpctl; I need to check more on that.

I did get the trace file and system logs though
libtrace.trc.txt
journalctl-22.txt

@rockowitz rockowitz removed amdgpu navi Radeon Navi chipset labels Oct 27, 2021
@rockowitz
Owner

Your logs confirm that, given the fix to function ddca_open_display(), libddcutil no longer crashes, but PowerDevil does fail. The immediate problem is that PowerDevil does not close an existing display handle before attempting to open another display handle using a different display reference. Unless and until PowerDevil is fixed, I suggest you disable use of libddcutil when installing PowerDevil. I haven't installed Gentoo, so I can't say how best to do this when installing the Gentoo PowerDevil package. The CMakeLists.txt file defines the macro automatically whenever ddcutil is installed.

Again, thank you for all your help in remotely debugging this elusive problem.

@shaumux
Author

shaumux commented Nov 15, 2021

I'll close this, as it has been fixed in ddcutil.
There seems to be a workaround, at least on Plasma 5.23.3, now as well.
Thanks for all your effort.
