
picom crashes at random time (glx backend) #1062

Closed
vejkse opened this issue May 6, 2023 · 12 comments


@vejkse

vejkse commented May 6, 2023

Platform

ArchLinux

GPU, drivers, and screen setup

Intel HD Graphics 4600
modesetting (xorg 21.1.7)
mesa 22.3.6

Environment

XMonad

picom version

vgit-05ef1

Diagnostics
**Version:** vgit-05ef1

### Extensions:

* Shape: Yes
* RandR: Yes
* Present: Present

### Misc:

* Use Overlay: Yes
* Config file used: /home/vejksez/.config/picom/picom.conf

### Drivers (inaccurate):

modesetting

### Backend: glx

* Driver vendors:
 * GLX: Mesa Project and SGI
 * GL: Intel
* GL renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
* Accelerated: 1

### Backend: egl

* Driver vendors:
 * EGL: Mesa Project
 * EGL driver: crocus
 * GL: Intel
* GL renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)

Configuration:

Configuration file
# See /etc/xdg/picom.conf.example for an explanation of the options.
# Any change to this file is immediately applied.

backend = "glx"
glx-no-stencil = true
glx-no-rebind-pixmap = true

# Prevent tearing
vsync = true

# while these bugs are not solved:
#    https://github.com/yshui/picom/issues/375
#    https://github.com/yshui/picom/issues/401
#    https://github.com/yshui/picom/issues/237
use-damage = false

# # To try when use-damage will be true:
# max-brightness=0.5

Steps of reproduction

I haven’t found a way to reproduce it reliably. Even though I usually only notice that picom has crashed some time later (I use it mainly to avoid tearing), I know that just at the time it last happened, I changed workspace twice and launched mpv (but audio only, no video, the window was just black).

Expected behavior

No crash.

Current Behavior

Random crashes.

Stack trace

I’m not sure this is what I was supposed to provide, but here is what I have:

  1. The message given by coredumpctl:
           PID: 1113 (picom)
           UID: 1000 (vejksez)
           GID: 100 (users)
        Signal: 11 (SEGV)
     Timestamp: Sat 2023-05-06 17:27:30 CEST (3h 31min ago)
  Command Line: picom --daemon
    Executable: /usr/bin/picom
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (vejksez)
       Boot ID: 57df8f2176d94a05a81735a40fac2549
    Machine ID: 68e3679fba3b4b4b9ea791aa4145a98f
      Hostname: arckes
       Storage: /var/lib/systemd/coredump/core.picom.1000.57df8f2176d94a05a81735a40fac2549.1113.1683386850000000.zst (present)
  Size on Disk: 2.6M
       Message: Process 1113 (picom) of user 1000 dumped core.

                Stack trace of thread 1113:
                #0  0x0000555b66198b91 n/a (picom + 0x39b91)
                #1  0x0000555b6619ca88 n/a (picom + 0x3da88)
                #2  0x0000555b66170016 n/a (picom + 0x11016)
                #3  0x0000555b66170c9c n/a (picom + 0x11c9c)
                #4  0x00007f3ee277b0cb ev_invoke_pending (libev.so.4 + 0x50cb)
                #5  0x00007f3ee277ed10 ev_run (libev.so.4 + 0x8d10)
                #6  0x0000555b6616bacf n/a (picom + 0xcacf)
                #7  0x00007f3ee2161790 n/a (libc.so.6 + 0x23790)
                #8  0x00007f3ee216184a __libc_start_main (libc.so.6 + 0x2384a)
                #9  0x0000555b6616cf55 n/a (picom + 0xdf55)

                Stack trace of thread 1117:
                #0  0x00007f3ee21c0766 n/a (libc.so.6 + 0x82766)
                #1  0x00007f3ee21c2f90 pthread_cond_wait (libc.so.6 + 0x84f90)
                #2  0x00007f3edfd0c4ee n/a (crocus_dri.so + 0x10c4ee)
                #3  0x00007f3edfcbc8bc n/a (crocus_dri.so + 0xbc8bc)
                #4  0x00007f3edfd0c41c n/a (crocus_dri.so + 0x10c41c)
                #5  0x00007f3ee21c3bb5 n/a (libc.so.6 + 0x85bb5)
                #6  0x00007f3ee2245d90 n/a (libc.so.6 + 0x107d90)

                Stack trace of thread 97696:
                #0  0x00007f3ee21c0766 n/a (libc.so.6 + 0x82766)
                #1  0x00007f3ee21c2f90 pthread_cond_wait (libc.so.6 + 0x84f90)
                #2  0x00007f3edfd0c4ee n/a (crocus_dri.so + 0x10c4ee)
                #3  0x00007f3edfcbc8bc n/a (crocus_dri.so + 0xbc8bc)
                #4  0x00007f3edfd0c41c n/a (crocus_dri.so + 0x10c41c)
                #5  0x00007f3ee21c3bb5 n/a (libc.so.6 + 0x85bb5)
                #6  0x00007f3ee2245d90 n/a (libc.so.6 + 0x107d90)
                ELF object binary architecture: AMD x86-64
  2. The output of journalctl (mostly the same thing):
May 06 17:27:30 arckes kernel: picom[1113]: segfault at 18 ip 0000555b66198b91 sp 00007ffc3c7d8078 error 6 in picom[555b6616a000+41000] likely on CPU 4 (core 0, socket 0)
May 06 17:27:30 arckes kernel: Code: 00 00 f2 0f 11 42 08 c3 90 8b 01 89 42 28 8b 41 04 89 42 2c b8 01 00 00 00 c3 0f 1f 80 00 00 00 00 f2 0f 10 01 b8 01 00 00 00 <f2> 0f 11 42 18 c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa f2 0f 10
May 06 17:27:30 arckes systemd[1]: Created slice Slice /system/systemd-coredump.
May 06 17:27:30 arckes systemd[1]: Started Process Core Dump (PID 151227/UID 0).
May 06 17:27:30 arckes dunst[1319]: WARNING: BadWindow (invalid Window parameter)
                                                 ELF object binary architecture: AMD x86-64
                                                 #6  0x00007f3ee2245d90 n/a (libc.so.6 + 0x107d90)
                                                 #5  0x00007f3ee21c3bb5 n/a (libc.so.6 + 0x85bb5)
                                                 #4  0x00007f3edfd0c41c n/a (crocus_dri.so + 0x10c41c)
                                                 #3  0x00007f3edfcbc8bc n/a (crocus_dri.so + 0xbc8bc)
                                                 #2  0x00007f3edfd0c4ee n/a (crocus_dri.so + 0x10c4ee)
                                                 #1  0x00007f3ee21c2f90 pthread_cond_wait (libc.so.6 + 0x84f90)
                                                 #0  0x00007f3ee21c0766 n/a (libc.so.6 + 0x82766)
                                                 Stack trace of thread 97696:

                                                 #6  0x00007f3ee2245d90 n/a (libc.so.6 + 0x107d90)
                                                 #5  0x00007f3ee21c3bb5 n/a (libc.so.6 + 0x85bb5)
                                                 #4  0x00007f3edfd0c41c n/a (crocus_dri.so + 0x10c41c)
                                                 #3  0x00007f3edfcbc8bc n/a (crocus_dri.so + 0xbc8bc)
                                                 #2  0x00007f3edfd0c4ee n/a (crocus_dri.so + 0x10c4ee)
                                                 #1  0x00007f3ee21c2f90 pthread_cond_wait (libc.so.6 + 0x84f90)
                                                 #0  0x00007f3ee21c0766 n/a (libc.so.6 + 0x82766)
                                                 Stack trace of thread 1117:

                                                 #9  0x0000555b6616cf55 n/a (picom + 0xdf55)
                                                 #8  0x00007f3ee216184a __libc_start_main (libc.so.6 + 0x2384a)
                                                 #7  0x00007f3ee2161790 n/a (libc.so.6 + 0x23790)
                                                 #6  0x0000555b6616bacf n/a (picom + 0xcacf)
                                                 #5  0x00007f3ee277ed10 ev_run (libev.so.4 + 0x8d10)
                                                 #4  0x00007f3ee277b0cb ev_invoke_pending (libev.so.4 + 0x50cb)
                                                 #3  0x0000555b66170c9c n/a (picom + 0x11c9c)
                                                 #2  0x0000555b66170016 n/a (picom + 0x11016)
                                                 #1  0x0000555b6619ca88 n/a (picom + 0x3da88)
                                                 #0  0x0000555b66198b91 n/a (picom + 0x39b91)
                                                 Stack trace of thread 1113:

May 06 17:27:31 arckes systemd-coredump[151229]: [🡕] Process 1113 (picom) of user 1000 dumped core.
  3. A zip file containing the executable and the core dump produced by coredumpctl
    picom-coredump+exec.zip
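
For anyone reading the kernel line in the journalctl output above, here is a minimal Python sketch that splits it into its fields. The helper name `decode_segfault` and the dictionary keys are made up for illustration; the bit meanings are the standard x86 page-fault error code bits. Read this way, `error 6` is a write from user mode to a page that is not mapped, and the fault address `18` suggests a store through a NULL pointer at offset 0x18. The marked bytes in the `Code:` line (`<f2> 0f 11 42 18`) appear to decode to `movsd %xmm0,0x18(%rdx)`, which would be consistent with that reading.

```python
import re

# Kernel log line copied from the journalctl output above
# (timestamp and CPU suffix trimmed).
LINE = (
    "picom[1113]: segfault at 18 ip 0000555b66198b91 sp 00007ffc3c7d8078 "
    "error 6 in picom[555b6616a000+41000]"
)

# All numeric fields in this kernel message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)"
    r"\[(?P<start>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel 'segfault at ...' line into fields (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)
    ip = int(m["ip"], 16)
    start = int(m["start"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        # x86 page-fault error code: bit 0 = protection fault (clear means the
        # page was not present), bit 1 = write access, bit 2 = user mode.
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
        # Offset of the faulting instruction inside the mapping the kernel names.
        "ip_offset": ip - start,
    }


info = decode_segfault(LINE)
print(info)
# Frame #0 of the coredumpctl trace is "picom + 0x39b91"; subtracting that
# offset from the instruction pointer gives the address the module starts at.
print(hex(info["ip"] - 0x39B91))
```

The offset relative to the kernel's mapping and the `picom + 0x39b91` offset from coredumpctl differ, presumably because the two tools measure from different starting addresses. Without debug symbols neither offset maps to a source line, so a debug build is still needed.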

Other details

As a workaround, I have downgraded picom to version 9.1. Starting with version 10, it crashes several times a day.

@absolutelynothelix
Collaborator

unfortunately, a core dump from a non-debug build is of little use. if possible, build and run a debug build of picom from the latest commit of the next branch. when a segfault happens, run coredumpctl gdb, then bt, and post the backtrace.

@Monsterovich
Contributor

I have a similar problem on Intel integrated hardware (Intel(R) HD Graphics 520). I will provide a debug file when I catch the crash.

@vejkse
Author

vejkse commented May 14, 2023

I think I managed to install a debug build of picom (I just used meson --buildtype=debug instead of meson --buildtype=release, I hope this is right).

I am now able to reproduce the crash at will with my XMonad configuration, though only by using XMonad-specific commands. I just need to fill a workspace with 5 or 6 windows, and then

  1. go to another workspace
  2. apply the XMonad function killWindow to all windows of the first workspace
  3. delete the first workspace using removeWorkspaceByTag from the XMonad DynamicWorkspaces extension.

Just killing the windows doesn’t crash picom. When there are not that many windows, it doesn’t crash either.

Doing coredumpctl gdb and then bt, I get this:

#0  0x00007f11e3a108ec in ?? () from /usr/lib/libc.so.6
#1  0x00007f11e39c1ea8 in raise () from /usr/lib/libc.so.6
#2  0x00007f11e39ab53d in abort () from /usr/lib/libc.so.6
#3  0x00007f11e39ab45c in ?? () from /usr/lib/libc.so.6
#4  0x00007f11e39ba9f6 in __assert_fail () from /usr/lib/libc.so.6
#5  0x0000563ec987f0d7 in ?? ()
#6  0x0000563ec984ca06 in ?? ()
#7  0x0000563ec984d68c in ?? ()
#8  0x00007f11e3fc60cb in ev_invoke_pending () from /usr/lib/libev.so.4
#9  0x00007f11e3fc9d10 in ev_run () from /usr/lib/libev.so.4
#10 0x0000563ec9847ac8 in ?? ()
#11 0x00007f11e39ac790 in ?? () from /usr/lib/libc.so.6
#12 0x00007f11e39ac84a in __libc_start_main () from /usr/lib/libc.so.6
#13 0x0000563ec9848e25 in ?? ()

which doesn’t look very useful, so I suspect I did something wrong. If someone can tell me what, I can easily redo this, since I can now reproduce the crash at will.

@Monsterovich
Contributor

I was able to catch a crash.

GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/picom...

warning: Can't open file anon_inode:i915.gem which was expanded to anon_inode:i915.gem during file-backed mapping note processing

warning: Can't open file /memfd:xshmfence (deleted) during file-backed mapping note processing

[New LWP 94672]
[New LWP 94754]
[New LWP 94756]
[New LWP 94753]
[New LWP 97127]
[New LWP 94755]
[New LWP 97126]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `picom --daemon --config /home/monsterovich/.config/picom.conf'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140535933342592) at ./nptl/pthread_kill.c:44
44	./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7fd112629380 (LWP 94672))]
(gdb) back
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140535933342592) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140535933342592) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140535933342592, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fd112442476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fd1124287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007fd11242871b in __assert_fail_base (fmt=0x7fd1125dd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x556fff57ef10 "!(w->flags & WIN_FLAGS_PIXMAP_NONE)", file=0x556fff57ee76 "../src/backend/backend.c", line=206, function=<optimized out>)
    at ./assert/assert.c:92
#6  0x00007fd112439e96 in __GI___assert_fail (assertion=0x556fff57ef10 "!(w->flags & WIN_FLAGS_PIXMAP_NONE)", file=0x556fff57ee76 "../src/backend/backend.c", 
    line=206, function=0x556fff57f0b8 <__PRETTY_FUNCTION__.3> "paint_all_new") at ./assert/assert.c:101
#7  0x0000556fff55d8c8 in paint_all_new (ps=0x556ffff75920, t=0x557000c3e000, ignore_damage=false) at ../src/backend/backend.c:206
#8  0x0000556fff514400 in draw_callback_impl (loop=0x7fd112ba9720, ps=0x556ffff75920, revents=8192) at ../src/picom.c:1527
#9  0x0000556fff514525 in draw_callback (loop=0x7fd112ba9720, w=0x556ffff759e0, revents=8192) at ../src/picom.c:1553
#10 0x00007fd112b9d773 in ev_invoke_pending () from /lib/x86_64-linux-gnu/libev.so.4
#11 0x00007fd112ba1041 in ev_run () from /lib/x86_64-linux-gnu/libev.so.4
#12 0x0000556fff51983d in session_run (ps=0x556ffff75920) at ../src/picom.c:2479
#13 0x0000556fff519b9c in main (argc=4, argv=0x7fff73cca578) at ../src/picom.c:2585

@vejkse
Author

vejkse commented May 20, 2023

It doesn’t look like my stack trace with the missing symbols filled in. So maybe this is another bug… But what did you do apart from building with --buildtype=debug to get those symbols?

@Monsterovich
Contributor

> It doesn’t look like my stack trace with the missing symbols filled in. So maybe this is another bug… But what did you do apart from building with --buildtype=debug to get those symbols?

I didn't do anything special; meson generates a binary with debug symbols by default.

@vejkse
Author

vejkse commented May 21, 2023

I found out what I did wrong: I used the PKGBUILD file from the AUR (Arch User Repository) and just changed the --buildtype and installed the resulting package. In the PKGBUILD, after meson, ninja is run. This seems to be the problem. I directly ran the built binary src/picom/build/src/picom without ninja or installing and got the following much better trace:

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
#1  0x00007fdc0d34d2d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007fdc0d2fda08 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007fdc0d2e6538 in __GI_abort () at abort.c:79
#4  0x00007fdc0d2e645c in __assert_fail_base (fmt=0x7fdc0d462b08 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x5614fc8906d8 "!(w->flags & WIN_FLAGS_PIXMAP_NONE)",
    file=file@entry=0x5614fc8905c9 "../src/backend/backend.c", line=line@entry=206,
    function=function@entry=0x5614fc890800 <__PRETTY_FUNCTION__.3> "paint_all_new") at assert.c:92
#5  0x00007fdc0d2f63d6 in __assert_fail (
    assertion=assertion@entry=0x5614fc8906d8 "!(w->flags & WIN_FLAGS_PIXMAP_NONE)",
    file=file@entry=0x5614fc8905c9 "../src/backend/backend.c", line=line@entry=206,
    function=function@entry=0x5614fc890800 <__PRETTY_FUNCTION__.3> "paint_all_new") at assert.c:101
#6  0x00005614fc8720d7 in paint_all_new (ps=0x5614fe09d000, t=<optimized out>, ignore_damage=<optimized out>)
    at ../src/backend/backend.c:206
#7  0x00005614fc83fa06 in draw_callback_impl (loop=loop@entry=0x7fdc0d914060, ps=ps@entry=0x5614fe09d000,
    revents=<optimized out>) at ../src/picom.c:1527
#8  0x00005614fc84068c in draw_callback (loop=0x7fdc0d914060, w=0x5614fe09d0c0, revents=<optimized out>)
    at ../src/picom.c:1553
#9  0x00007fdc0d9090cb in ev_invoke_pending () from /usr/lib/libev.so.4
#10 0x00007fdc0d90cd10 in ev_run () from /usr/lib/libev.so.4
#11 0x00005614fc83aac8 in session_run (ps=0x5614fe09d000) at ../src/picom.c:2479
#12 main (argc=<optimized out>, argv=<optimized out>) at ../src/picom.c:2585

It looks very much like Monsterovich’s trace after all.

@absolutelynothelix
Collaborator

absolutelynothelix commented May 21, 2023

it seems that your crashes are actually assertion failures and they’re new (i don’t remember seeing anything like this recently). and you’re both using the glx backend on an intel gpu, so this issue may be specific to intel gpus. i’ll investigate later.

@Monsterovich
Contributor

> it seems that your crashes are actually assertion failures and they’re new (i don’t remember seeing anything like this recently). and you’re both using the glx backend on an intel gpu, so this issue may be specific to intel gpus. i’ll investigate later.

@absolutelynothelix

I think it has something to do with turning on the screen and enabling compositing (maybe even related to lightdm). There may be a case where a window's pixmap has not been created yet by the time the frame is painted, and picom then crashes. It's strange that there's an assert instead of just skipping the window and printing a warning.
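
The trade-off raised here can be illustrated with a small model. This is not picom's code: `PIXMAP_NONE`, the `Window` class, and both paint functions are hypothetical stand-ins, written in Python only to contrast an assertion (as in the `paint_all_new` backtraces above) with a skip-and-warn loop.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("compositor-sketch")

# Hypothetical flag bit; the value is NOT taken from picom's source.
PIXMAP_NONE = 0x1


@dataclass
class Window:
    name: str
    flags: int = 0


def paint_all_strict(windows):
    """Abort on the first window without a pixmap, like an assert would."""
    painted = []
    for w in windows:
        assert not (w.flags & PIXMAP_NONE), f"{w.name} has no pixmap"
        painted.append(w.name)
    return painted


def paint_all_lenient(windows):
    """Skip windows without a pixmap and log a warning instead of aborting."""
    painted = []
    for w in windows:
        if w.flags & PIXMAP_NONE:
            log.warning("skipping %s: no pixmap yet", w.name)
            continue
        painted.append(w.name)
    return painted


windows = [Window("term-1"), Window("term-2", PIXMAP_NONE), Window("mpv")]
print(paint_all_lenient(windows))
```

The lenient version keeps the compositor running, but it can also hide whatever state bug left the window without a pixmap. An assertion makes that bug visible early, which may be why it is there.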

@absolutelynothelix
Collaborator

does the issue still happen?

@vejkse
Author

vejkse commented Oct 21, 2024

With picom 11.2, it still happened a few times a day until two weeks ago, when something I changed in my system reduced it to about once a week. This may have been an Alacritty update: before it, I could reproduce the crash by killing an XMonad workspace with many Alacritty windows; afterwards, I needed to use other kinds of windows.

I just updated to 12.3 and tried to reproduce it by killing a workspace with many windows, as I had done successfully (i.e. with a core dump) on 11.2 a few minutes earlier, but I could not get a core dump. So it looks like it may be solved in 12.3.

@absolutelynothelix
Collaborator

closing as fixed.
