[BUG] rofi crashes with SIGSEGV #1966
Comments
Interesting, looks like a crash in glib->regex .. can you do a |
Another test does it crash with |
It's the most recent version on Arch Linux.
I couldn't get it to reproduce with that, at least not for the moment. As I said earlier, I haven't found a reproducer so far, so it's a bit hard to tell whether that is because of the changed option.
The backtrace is obtained via Click to see trace
|
thanks. |
I have had the same random crash on Arch Linux (with the lbonn Wayland fork) for a few days; it corresponds to the following update:
I can reproduce the crash by retrying multiple times with lots of threads and a filter:
while sleep 0.1;do timeout 0.1 rofi -threads 100000 -show run -filter a; done
I can no longer reproduce it if I downgrade to pcre2 10.42 or if I use |
Was kinda afraid of that. :( Thanks for testing that it was the pcre update. |
Can you try this patch to see if it helps?
diff --git a/source/view.c b/source/view.c
index aac8c22e..8a9ea5cd 100644
--- a/source/view.c
+++ b/source/view.c
@@ -736,6 +736,7 @@ typedef struct _thread_state_view {
const char *pattern;
/** Length of pattern. */
glong plen;
+ rofi_int_matcher **tokens;
} thread_state_view;
/**
* @param data A thread_state object.
@@ -752,7 +753,7 @@ static void filter_elements(thread_state *ts,
G_GNUC_UNUSED gpointer user_data) {
thread_state_view *t = (thread_state_view *)ts;
for (unsigned int i = t->start; i < t->stop; i++) {
- int match = mode_token_match(t->state->sw, t->state->tokens, i);
+ int match = mode_token_match(t->state->sw, t->tokens, i);
// If each token was matched, add it to list.
if (match) {
t->state->line_map[t->start + t->count] = i;
@@ -1450,6 +1451,7 @@ static gboolean rofi_view_refilter_real(RofiViewState *state) {
unsigned int count = nt;
unsigned int steps = (state->num_lines + nt) / nt;
for (unsigned int i = 0; i < nt; i++) {
+ states[i].tokens = helper_tokenize(pattern, config.case_sensitive);
states[i].state = state;
states[i].start = i * steps;
states[i].stop = MIN(state->num_lines, (i + 1) * steps);
@@ -1478,6 +1480,12 @@ static gboolean rofi_view_refilter_real(RofiViewState *state) {
}
g_cond_clear(&cond);
g_mutex_clear(&mutex);
+ for (unsigned int i = 0; i < nt; i++) {
+ if (states[i].tokens) {
+ helper_tokenize_free(states[i].tokens);
+ states[i].tokens = NULL;
+ }
+ }
for (unsigned int i = 0; i < nt; i++) {
if (j != states[i].start) {
memmove(&(state->line_map[j]), &(state->line_map[states[i].start]), |
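The context lines of the patch above show how the rows are divided between worker threads. A minimal sketch of that arithmetic, in isolation, is given below; the names `chunk`, `chunk_for_worker` and `chunk_size` are hypothetical and not part of rofi, and only the three formulas are taken from the diff.

```c
#include <assert.h>

/* Half-open range [start, stop) of rows handled by one worker.
 * Hypothetical type, used for illustration only. */
typedef struct {
  unsigned int start;
  unsigned int stop;
} chunk;

/* Mirrors the arithmetic visible in the patch context:
 *   steps = (num_lines + nt) / nt
 *   start = i * steps
 *   stop  = MIN(num_lines, (i + 1) * steps) */
chunk chunk_for_worker(unsigned int num_lines, unsigned int nt,
                       unsigned int i) {
  assert(nt > 0 && i < nt);
  unsigned int steps = (num_lines + nt) / nt;
  unsigned int end = (i + 1) * steps;
  chunk c;
  c.start = i * steps;
  c.stop = end < num_lines ? end : num_lines;
  return c;
}

/* Number of rows a worker actually visits. When start >= stop the
 * filter loop `for (i = start; i < stop; i++)` never runs. */
unsigned int chunk_size(chunk c) {
  return c.stop > c.start ? c.stop - c.start : 0;
}
```

With 1000 rows and 2 workers this yields the ranges [0, 501) and [501, 1000). With far more workers than rows, most ranges come out empty, which may be worth keeping in mind when reading reproducers that pass a very large `-threads` value.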
@lbonn I suspect this will be reported to your fork too. |
I have built the package with the patch and will see if I get any more segfaults in the next days 🤔 |
Thanks. |
It seems like the patched version also crashes for me 🤔 Click to see trace
|
Thanks for testing. |
I need to set up a system to debug this myself:
diff --git a/source/view.c b/source/view.c
index aac8c22e..28849cb3 100644
--- a/source/view.c
+++ b/source/view.c
@@ -655,10 +655,6 @@ void rofi_view_set_selected_line(RofiViewState *state,
}
void rofi_view_free(RofiViewState *state) {
- if (state->tokens) {
- helper_tokenize_free(state->tokens);
- state->tokens = NULL;
- }
// Do this here?
// Wait for final release?
widget_free(WIDGET(state->main_window));
@@ -751,8 +747,12 @@ static void rofi_view_call_thread(gpointer data, gpointer user_data) {
static void filter_elements(thread_state *ts,
G_GNUC_UNUSED gpointer user_data) {
thread_state_view *t = (thread_state_view *)ts;
+
+ /** Regexs used for matching */
+ rofi_int_matcher **tokens =
+ helper_tokenize(t->pattern, config.case_sensitive);
for (unsigned int i = t->start; i < t->stop; i++) {
- int match = mode_token_match(t->state->sw, t->state->tokens, i);
+ int match = mode_token_match(t->state->sw, tokens, i);
// If each token was matched, add it to list.
if (match) {
t->state->line_map[t->start + t->count] = i;
@@ -775,6 +775,8 @@ static void filter_elements(thread_state *ts,
t->count++;
}
}
+
+ helper_tokenize_free(tokens);
if (t->acount != NULL) {
g_mutex_lock(t->mutex);
(*(t->acount))--;
@@ -1450,6 +1452,8 @@ static gboolean rofi_view_refilter_real(RofiViewState *state) {
unsigned int count = nt;
unsigned int steps = (state->num_lines + nt) / nt;
for (unsigned int i = 0; i < nt; i++) {
+ // states[i].tokens = helper_tokenize(pattern,
+ // config.case_sensitive);
states[i].state = state;
states[i].start = i * steps;
states[i].stop = MIN(state->num_lines, (i + 1) * steps);
@@ -1478,6 +1482,8 @@ static gboolean rofi_view_refilter_real(RofiViewState *state) {
}
g_cond_clear(&cond);
g_mutex_clear(&mutex);
+ for (unsigned int i = 0; i < nt; i++) {
+ }
for (unsigned int i = 0; i < nt; i++) {
if (j != states[i].start) {
memmove(&(state->line_map[j]), &(state->line_map[states[i].start]), |
Hmm, the Manjaro VM is on 10.42-2. |
Got PCRE2 10.43 compiled locally and made sure that my rofi uses it, but I cannot reproduce the crash. |
As I said, I haven't found a clear reproducer so far 🤔 It occurs sometimes with regular usage, but not reliably 😢 |
I tried this: while sleep 0.1;do timeout 0.1 rofi -threads 100000 -show run -filter a; done for 15 min. |
This also does not crash for me, at least not when I tried yesterday and just now ... The crashes happen quite spaced out during normal usage:
|
ugh, this will make debugging an absolute pain :-P |
For me it seems to reproduce a bit more often.
I don't have time now, but if it is still useful I can try to debug this myself or run whatever tests you would like during the weekend. |
From what I remember, the crashes started a few days ago. The last few pcre(2) updates I installed on Arch Linux:
That indeed makes it likely that this is related to version 10.43 of the pcre2 package. |
Yeah, I also installed this version of pcre2 on the day the crashes started happening for me (see the coredump log from #1966 (comment)):
It really bothers me, though, that your reproducer does not work for me... Are you getting the same trace as I do? It makes me think that we might be hitting different bugs 😆 |
I suspect same bug.. but it might matter what is in the run list :(. |
I can still reproduce with the second patch applied on top of the latest commit (6c38a49), using a reproducible input (to avoid comparing different run lists):
$ while sleep 0.1;do echo -n .;seq 1 1000 | timeout 0.1 ./build/rofi -dmenu -sync -threads 2 -filter 1;done
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................timeout: the monitored command dumped core
...........................................................................................................................................................................................................timeout: the monitored command dumped core
.....................timeout: the monitored command dumped core
.............^C
(If you don't see the rofi window pop up, you might need to adjust the timeout.) Here are the coredump timings:
And here's the stack:
|
I just ran
Yes, from a quick check it looks exactly the same as your backtrace (same function names, same number of frames between |
Just a quick check. Did you also apply that configuration? |
Yes, I copied the arch package build flags. |
Thanks. When I have some time I'll look more into this. |
OK, on Debian with a locally compiled pcre2 I can reproduce it. Progress. |
#0 sljit_remove_free_block (free_block=0x7fe109bffff0) at src/sljit/allocator_src/sljitExecAllocatorCore.c:140
140 free_block->next->prev = free_block->prev;
[Current thread is 1 (Thread 0x7fe1037fe6c0 (LWP 151385))]
(gdb) print free_block
$1 = (struct free_block *) 0x7fe109bffff0
(gdb) print free_block->next
$2 = (struct free_block *) 0x10102464c457f
(gdb) print free_block->next->prev
Cannot access memory at address 0x10102464c4597
So free_block->next is invalid and its prev field cannot be read... memory corruption? |
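One detail of the gdb output above can be checked mechanically: on a little-endian machine (an assumption here, the thread does not state the architecture) the bogus `next` value 0x10102464c457f is stored as the bytes 7f 45 4c 46 02 01 01 00, which is how a 64-bit little-endian ELF file begins. The sketch below only decodes the value; the helper names `le_bytes` and `looks_like_elf_magic` are hypothetical, and the observation by itself does not identify what overwrote the free block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write the 8 bytes of `value` into out[0..7], least significant first,
 * i.e. the order they occupy in memory on a little-endian host. */
void le_bytes(uint64_t value, unsigned char out[8]) {
  for (int i = 0; i < 8; i++) {
    out[i] = (unsigned char)(value >> (8 * i));
  }
}

/* Returns 1 if the first four bytes in memory are the ELF magic.
 * The literal is split as "\x7f" "ELF" because "\x7fELF" would let the
 * hex escape swallow the 'E' as another hex digit. */
int looks_like_elf_magic(uint64_t value) {
  unsigned char b[8];
  le_bytes(value, b);
  return memcmp(b, "\x7f" "ELF", 4) == 0;
}
```

Calling `looks_like_elf_magic(0x10102464c457fULL)` returns 1. The faulting address 0x10102464c4597 is exactly 0x18 bytes above that value, which matches reading a `prev` field at offset 24 from the garbage pointer.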
Not sure if this is relevant: PCRE2Project/pcre2#318 |
On the other hand glib explicitly presents GRegex as thread-safe https://docs.gtk.org/glib/struct.Regex.html. Something is funny here... pcre2 multi-threading doc is here https://www.pcre.org/current/doc/html/pcre2api.html#SEC17. Looking at the glib code may bring some insight... |
Glib and rofi seemed fine from a cursory look.
@dennisschagt Thanks for bisecting this by the way! From this set of changes, I think this is the most suspicious part:
Some If someone can test if moving back |
Looks like it fixes it for me (tm). |
For what it's worth, valgrind reports tons of errors every time when using rofi, so reproducibility is not an issue for me as long as I'm running rofi using valgrind. Unless they are false positives, they should probably be fixed. |
For me most of the valgrind errors are in the font library pango uses and other dependencies. A lot of these can be hidden by using the suppression files shipped for these libraries. I think this one is kinda confirmed to be a bug in pcre lib |
Closing as it seems to be fixed upstream in PCRE. Regrettable we need to wait for this to trickle down into distributions. Workaround is probably running with -threads 1 |
@lbonn was already kind enough to submit a backport MR to Arch Linux: https://gitlab.archlinux.org/archlinux/packaging/packages/pcre2/-/merge_requests/4 |
:) Otherwise, it should mainly affect Debian experimental and Fedora 41. On NixOS unstable, the JIT is accidentally disabled: NixOS/nixpkgs#300056 |
Rofi version (rofi -v)
Version: 1.7.5
Configuration
https://gist.github.com/christian-heusel/c27b5f9bcfdc4f223c4c420ab7755348
Theme
https://gist.github.com/christian-heusel/a42f58d3dee964958ddef4480020d66f
Timing report
No response
Launch command
rofi -lines "4" -fake-transparency -show run
Step to reproduce
A bit unclear on how to reproduce exactly, crashes from time to time only.
Expected behavior
Rofi does not crash
Actual behavior
Rofi crashes with SIGSEGV.
Additional information
Here is the trace from the crash:
Using wayland display server protocol
I've checked if the issue exists in the latest stable release