[Bug]: Random segfaults #1791

Open
andreaemonti opened this issue Mar 29, 2024 · 8 comments
Labels
bug Bug report or bug fix PR display: wayland Issue related to Wayland backend

Comments

@andreaemonti

What happened?

After running smoothly for hours, conky crashes, leaving only a segfault message in the system journal.

I'm sorry it's not very reproducible: it happens at random, hours after launch, which is also why I haven't experimented much.

Errors in the system journal

This is what most of the log entries look like:

> journalctl | grep segfault
...
mar 24 00:40:08 yoga kernel: conky[1891]: segfault at 30 ip 0000000000000030 sp 00007fffd1f7a5d8 error 14 in conky[625591428000+c000] likely on CPU 6 (core 3, socket 0)
mar 24 02:30:08 yoga kernel: conky[1902]: segfault at 30 ip 0000000000000030 sp 00007ffd8f8ab4f8 error 14 in conky[5d130410b000+c000] likely on CPU 2 (core 1, socket 0)
mar 25 11:54:16 yoga kernel: conky[1912]: segfault at 30 ip 0000000000000030 sp 00007ffcecc91e38 error 14 in conky[5846d3b63000+c000] likely on CPU 9 (core 4, socket 0)
mar 25 17:08:37 yoga kernel: conky[1991]: segfault at 30 ip 0000000000000030 sp 00007ffeef79ada8 error 14 in conky[64bcaa9b2000+c000] likely on CPU 1 (core 0, socket 0)
mar 26 12:27:45 yoga kernel: conky[1934]: segfault at 30 ip 0000000000000030 sp 00007ffc81c4c448 error 14 in conky[6409a0570000+c000] likely on CPU 15 (core 7, socket 0)
mar 26 20:18:57 yoga kernel: conky[1878]: segfault at 30 ip 0000000000000030 sp 00007ffc4c391768 error 14 in conky[59c247397000+c000] likely on CPU 6 (core 3, socket 0)
mar 28 11:38:48 yoga kernel: conky[1911]: segfault at 30 ip 0000000000000030 sp 00007ffd93b1dc98 error 14 in conky[5bde0de38000+c000] likely on CPU 10 (core 5, socket 0)

I tried launching conky with the -D debug flag, but I didn't get any more information about the crash. The journal entry looked a bit different, though:

mar 29 16:42:53 yoga kernel: conky[2433]: segfault at 700c9f7d26a0 ip 0000700c9f7d26a0 sp 00007ffdf980fd48 error 15 in libc.so.6[700c9f7d1000+2000] likely on CPU 10 (core 5, socket 0)
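
For anyone reading these lines: the `error` field is a hexadecimal bitmask. The sketch below decodes it, assuming the usual x86 page-fault convention (bit 0 = protection violation, bit 1 = write, bit 2 = user mode, bit 4 = instruction fetch). The function name `decode_pf_error` is made up for illustration.

```shell
# Decode the hexadecimal "error" field of an x86 kernel segfault line.
# Assumes the common x86 page-fault bit layout; other bits are ignored.
decode_pf_error() {
    code=$((0x$1))   # the kernel prints this field in hex

    # Bit 0: set = protection violation, clear = page not present
    if [ $((code & 1)) -ne 0 ]; then
        cause="protection violation"
    else
        cause="page not present"
    fi

    # Bit 4: instruction fetch; bit 1: write; otherwise a read
    if [ $((code & 16)) -ne 0 ]; then
        access="instruction fetch"
    elif [ $((code & 2)) -ne 0 ]; then
        access="write"
    else
        access="read"
    fi

    # Bit 2: set = fault happened in user mode
    if [ $((code & 4)) -ne 0 ]; then
        mode="user mode"
    else
        mode="kernel mode"
    fi

    echo "$access, $cause, $mode"
}

decode_pf_error 14   # instruction fetch, page not present, user mode
decode_pf_error 15   # instruction fetch, protection violation, user mode
```

Read this way, both kinds of entry are instruction fetches from an address that is not valid code (in the first case the fault address equals `ip`, 0x30). That would be consistent with a call through an invalid function pointer, though this is an inference from the log format, not something the logs state directly.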

Error on launch (probably unrelated)

The only error I get as output on the console when I launch conky is
conky: invalid setting of type 'table'
But I cannot understand why

Version

conky 1.19.7_pre compiled 2024-02-26 for Linux x86_64

Which OS/distro are you seeing the problem on?

Arch Linux

Conky config

conky.config = {
	out_to_x = false,
	out_to_wayland = true,
	alignment = 'top_right',
	cpu_avg_samples = 2,
	default_shade_color = '444444',
	draw_graph_borders = false,
	draw_shades = false,
	use_xft = true,
	gap_x = 50,
	gap_y = 50,
	minimum_width = 250,
	net_avg_samples = 2,
	no_buffers = true,

	out_to_stderr = false,

	own_window_argb_visual = true,
	own_window_transparent = true,
	own_window_argb_value = 150,

	update_interval = 1,

	font1 = 'Inter:bold:size=12',
	font2 = 'Inter:size=12',
	font3 = 'DejaVu Sans Mono:size=12'
};


conky.text = [[
${color white}
$alignc${font Inter:size=45}${time %H:%M}${font2 :size=18}${time :%S}
$alignc${font Inter:size=16} ${time %d %B}


#SYSTEM
${font1}CPU:${font2}  $cpu% $alignr $acpitemp°C
${cpugraph 40,250 0000ff ff0000 -t}

${font1}RAM:${font2}  $memperc% $alignr $mem
${memgraph 40,250 00ff00 00ff00}
swap: $swapperc%


${font1}SSD I/O:${font2}  $diskio/s $alignr
${diskiograph 20,250 ffffff ffffff}



#NETWORK
${font1}Network: $alignr${font3} ${addr wlan0}
${font2}$alignc ${alignr 65} speed $alignr total
↑	${alignc -20}${upspeed wlan0}/s ${alignr}${totalup wlan0}
↓	${alignc -20}${downspeed wlan0}/s ${alignr}${totaldown wlan0}



#STORAGE
${font1}root${font2} $alignr ${fs_used /}/${fs_size /}
${fs_bar /}
${font1}home${font2} $alignr ${fs_used /home}/${fs_size /home}
${fs_bar /home}
]];

Stack trace

No response

Relevant log output

No response

@andreaemonti andreaemonti added bug Bug report or bug fix PR triage Issue that hasn't been verified labels Mar 29, 2024
@brndnmtthws
Owner

It would be immensely helpful if you're able to get a stack trace with debug symbols from a core dump. It should be as simple as setting ulimit appropriately (i.e., ulimit -c unlimited) and enabling debug symbols for your system.
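
A rough sketch of those steps, in case it helps. It assumes systemd-coredump is collecting core dumps (so that `coredumpctl` can find them) and that gdb is installed. The function names and file paths are illustrative, not part of conky.

```shell
# Sketch only. Assumes systemd-coredump is collecting core dumps and that
# gdb is installed; the function names here are illustrative.

# Lift the core file size limit for this shell. conky must then be started
# from the same shell for the limit to apply to it.
allow_core_dumps() {
    ulimit -c unlimited
}

# Write the most recent core dump of the given program to a file.
# $1 = program name (e.g. conky), $2 = output path for the core file
save_latest_core() {
    coredumpctl dump "$1" --output="$2"
}

# Print a full backtrace of every thread from a core file, non-interactively.
# $1 = path to the executable, $2 = path to the core file
backtrace_from_core() {
    gdb -batch -ex 'thread apply all backtrace full' "$1" "$2"
}

# Example (not run here):
#   allow_core_dumps
#   conky -c ~/.conkyrc        # hypothetical config path; wait for the crash
#   save_latest_core conky /tmp/conky.core
#   backtrace_from_core "$(command -v conky)" /tmp/conky.core > gdb.txt
```

The backtrace is only readable if debug symbols are available, so a self-built conky would need to be compiled with debug info.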

@brndnmtthws brndnmtthws removed the triage Issue that hasn't been verified label Mar 31, 2024
@andreaemonti
Author

I wasn't sure how to get it, but this should be it: the output of thread apply all backtrace full in gdb, run on the core dump.
gdb.txt

Thanks for the help!
Let me know if there is any other info I should give.

@brndnmtthws
Owner

It's a little difficult to tell which thread the segfault came from. When you load the core dump into gdb, it should print something like this:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000001234 in some_function () from /lib/somelib.so
(gdb)

Can you provide that output as well?

@andreaemonti
Author

Sure! It should be the last one in the log file, aka Thread 1.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `conky -c /home/andrea/.conky/old/home'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000030 in ?? ()
[Current thread is 1 (Thread 0x7121c6378a00 (LWP 1891))]
(gdb)

@brndnmtthws
Owner

Okay thanks, that's what I thought. Looks like a race condition around the call to wl_display_dispatch_pending().

@Caellian
Collaborator

Caellian commented Apr 7, 2024

Can you say which WM you're using?

> The only error I get as output on the console when I launch conky is
> conky: invalid setting of type 'table'
> But I cannot understand why

I can't see why either. Try changing conky.text (remove first half; second half; then half of the broken one; ...), conky.config seems fine to me.

@Caellian
Collaborator

Caellian commented Apr 9, 2024

> I can't see why either. Try changing conky.text (remove first half; second half; then half of the broken one; ...), conky.config seems fine to me.

This is actually a (separate) bug: #1806, ignore it for now :)

@andreaemonti
Author

> Can you say which WM you're using?

I don't know if this is still relevant, but I'm on EndeavourOS with KDE Plasma, so KWin is my WM.

I also tried splitting the conky config into several files to see whether only one of them would crash, but there were no crashes during that test period, so I went back to the single file above.

Today, after a week without crashes, I noticed that it crashed on resuming from suspend. I don't know if this helps narrow down the issue. Previously I only noticed some time later that conky was gone, so I didn't know precisely when it had crashed. But checking the journal for a couple of the earlier segfaults, I saw that one happened while entering suspend and one during poweroff (although I also had a kernel panic there, I believe because of unrelated GPU driver issues). So maybe it is related.
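
One way to check the timing more systematically is to list the segfault lines next to the kernel's suspend markers. This is a minimal sketch: it assumes the kernel logs `PM: suspend entry` and `PM: suspend exit` around each suspend cycle, and the function name is made up.

```shell
# Keep only segfault lines and kernel suspend entry/exit markers from a
# journal stream read on stdin, preserving their original order.
crash_and_suspend_events() {
    grep -E 'segfault at|PM: suspend (entry|exit)'
}

# Example (not run here):
#   journalctl -o short-iso | crash_and_suspend_events
```

A segfault that lands just after a `suspend entry` or `suspend exit` line would support the suspend/resume connection. Crashes far from either marker would point elsewhere.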

@Caellian Caellian added the display: wayland Issue related to Wayland backend label Apr 15, 2024
@Caellian Caellian added this to the Wayland Support milestone May 4, 2024