i3 locks up randomly, unable to click or switch tabs #2979

Open
ElonSatoshi opened this Issue Sep 23, 2017 · 15 comments

ElonSatoshi commented Sep 23, 2017

Output of i3 --moreversion 2>&- || i3 --version:

Binary i3 version: 4.14 (2017-09-04) © 2009 Michael Stapelberg and contributors
Running i3 version: 4.14 (2017-09-04) (pid 15211) abort…)
Loaded i3 config: /home/elonsatoshi/.config/i3/config (Last modified: Wed 20 Sep 2017 08:25:08 PM CDT, 243725 seconds ago)

The i3 binary you just called: /usr/bin/i3
The i3 binary you are running: i3

URL to a logfile as per http://i3wm.org/docs/debugging.html:

http://logs.i3wm.org/logs/5709717443706880.bz2

What I did:

Nothing in particular. It happens while I'm using Icecat. It happens while i3lock is active and I've been away for a long time. It happens at random times. It happens whenever I try to turn down the screen brightness (which I do by holding Fn and pressing End). This is on a Lenovo ThinkPad T400 running Parabola GNU/Linux-libre with OpenRC, using the base-rc package from the Parabola repos.

What I saw:

i3 does not respond to the keyboard, but the active window does. I cannot click on anything. I can see i3blocks' clock ticking away. Caps Lock works. If I switch to tty2 with ctrl-alt-2 and then back to i3 with alt-1, the window bars at the top of the screen disappear (they turn black).

What I expected instead:
I expect i3 to continue running without making me run killall xinit in
tty2 and then startx.

i3bot Sep 23, 2017

I don’t see a link to logs.i3wm.org. Did you follow http://i3wm.org/docs/debugging.html? (In case you actually provided a link to a logfile, please ignore me.)

Airblader Sep 23, 2017

Member

It sounds like i3 may have crashed. Can you check from another tty whether i3 is still running in this scenario? If it is, can you attach with gdb to see where it is stuck?

ElonSatoshi Sep 23, 2017

In past occurrences, pidof i3 returned a PID. I'll attach gdb the next time it happens. (Starts studying how to attach gdb to a thing.)

Airblader added the bug label Sep 23, 2017

Airblader Sep 23, 2017

Member

From another TTY you can just use

gdb $(which i3) $(pidof i3)

But note that you might have to run it with sudo. From the gdb shell you can then dump a backtrace. Make sure you have the debug symbols for i3 installed, though; otherwise the backtrace will not be very useful. :-)

Roughly how often does this happen, i.e., how reproducible is it?

ElonSatoshi Sep 24, 2017

At least a few times a day.

In fact, it just happened while I was watching a clip of a nazi being knocked out with one punch. I ran gdb just as instructed. There were no debugging symbols, BUT I think what was at the end is important.

It said:

Program received signal SIGTTIN, Stopped (tty input)
0x00007f650aa202d0 in epoll_pwait() from /usr/lib/libc.so.6

Which (I think) means that something is sending kill signals to i3 just for the fun of it >:(

ElonSatoshi Sep 24, 2017

Update: I am able to consistently reproduce the issue by lowering the screen brightness, which is done by holding Fn and pressing End or Home on my laptop. What might be relevant is that, before I had installed Xorg and i3, I saw little ^@s appear when adjusting the brightness during pacman operations.

stapelberg Sep 24, 2017

Member

Can you obtain an strace log of this issue please? Use e.g. strace -o /tmp/strace.log -s 2048 -v -tt -p $(pidof i3), then upload /tmp/strace.log

ElonSatoshi commented Sep 24, 2017

stapelberg Sep 25, 2017

Member

From the strace:

14:03:03.999474 epoll_pwait(6, 0x556324902270, 64, 59743, NULL, 8) = -1 EINTR (Interrupted system call)
14:03:07.653049 --- SIGTTIN {si_signo=SIGTTIN, si_code=SI_KERNEL} ---
14:03:07.653183 --- stopped by SIGTTIN ---
14:03:14.855499 --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=26675, si_uid=1001} ---
14:03:14.855565 --- SIGCONT {si_signo=SIGCONT, si_code=SI_KERNEL} ---

So, yes, something is sending SIGTTIN to i3, for which the default signal action is to stop the process.

http://curiousthing.org/sigttin-sigttou-deep-dive-linux gives some more background.

I think the next step is to figure out what exactly is triggering the SIGTTIN and why. It sounds like maybe your keyboard produces codes which Linux interprets incorrectly.

As a temporary workaround, you can change the i3 source to ignore SIGTTIN, but I suspect this issue is reproducible with other programs (possibly only those running on a tty).

I’m saying this is a temporary workaround because I’m not comfortable changing i3 to always ignore SIGTTIN until we know what the root cause of the issue is and whether ignoring SIGTTIN has any possible unwanted side effects.
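
As background for the strace output above: si_code=SI_KERNEL means the signal was generated by the kernel rather than by another process calling kill(). The tty layer sends SIGTTIN to a whole process group when a member of a background process group tries to read from its controlling terminal. The sketch below is one way to check which side of that condition a process is on. It is only an illustration: in_foreground_group is a hypothetical helper that does not exist in i3, and whether a background read is what actually happens in this issue is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of i3.
 * Returns  1 if the caller is in the foreground process group of the
 *            terminal behind fd (reads are allowed),
 *          0 if it is in a background group (a read would normally make
 *            the kernel send SIGTTIN to the whole group),
 *         -1 if fd is not the caller's controlling terminal. */
int in_foreground_group(int fd) {
    pid_t fg = tcgetpgrp(fd);
    if (fg == (pid_t)-1)
        return -1;
    return fg == getpgrp() ? 1 : 0;
}

#ifdef SKETCH_MAIN /* build with -DSKETCH_MAIN to get a standalone demo */
int main(void) {
    int fds[2];
    if (pipe(fds) != 0) {
        perror("pipe");
        return 1;
    }
    /* Sanity check: a pipe is never a terminal. */
    assert(in_foreground_group(fds[0]) == -1);

    /* Report the state of stdin for the current process. */
    printf("stdin: %d\n", in_foreground_group(STDIN_FILENO));
    return 0;
}
#endif
```

Because the signal goes to the whole process group, the process that gets stopped does not have to be the one that attempted the read; any member of the same group is affected.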

ElonSatoshi Sep 25, 2017

How do I change the i3 source to ignore SIGTTIN?

stapelberg Sep 25, 2017

Member

This should do the trick (compiles, but untested):

diff --git i/src/main.c w/src/main.c
index 0d1457fd..b9f772e5 100644
--- i/src/main.c
+++ w/src/main.c
@@ -883,6 +883,7 @@ int main(int argc, char *argv[]) {
     /* Ignore SIGPIPE to survive errors when an IPC client disconnects
      * while we are sending them a message */
     signal(SIGPIPE, SIG_IGN);
+    signal(SIGTTIN, SIG_IGN);
 
     /* Autostarting exec-lines */
     if (autostart) {
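
To see the effect of that one-line change without rebuilding i3, here is a minimal standalone sketch of the same idea. It uses sigaction() where the patch uses signal(), which is a substitution and not what the patch does; ignore_sigttin and sigttin_is_ignored are hypothetical helper names, not i3 functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper: set the disposition of SIGTTIN to "ignore".
 * Returns 0 on success, -1 on error (errno is set by sigaction). */
int ignore_sigttin(void) {
    struct sigaction sa;
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTTIN, &sa, NULL);
}

/* Hypothetical helper: query the current disposition.
 * Returns 1 if SIGTTIN is ignored, 0 if not, -1 on error. */
int sigttin_is_ignored(void) {
    struct sigaction cur;
    if (sigaction(SIGTTIN, NULL, &cur) != 0)
        return -1;
    return cur.sa_handler == SIG_IGN;
}

#ifdef SKETCH_MAIN /* build with -DSKETCH_MAIN to get a standalone demo */
int main(void) {
    if (ignore_sigttin() != 0) {
        perror("sigaction");
        return 1;
    }
    assert(sigttin_is_ignored() == 1);

    /* With the default disposition this would stop the process. */
    raise(SIGTTIN);
    puts("still running after SIGTTIN");
    return 0;
}
#endif
```

One side effect worth knowing about: an ignored disposition is inherited across fork() and execve(), so programs started by a process that ignores SIGTTIN will ignore it too unless something resets it first.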

ElonSatoshi Sep 27, 2017

How do I go about finding the cause of the offending signals?

stapelberg Sep 27, 2017

Member

I’d recommend swapping out parts of your environment until you find the one that causes the issue.

You could, for example, boot a Linux live distribution and see whether the same behavior occurs, and which software versions are in use on that system.

Of particular interest are the Linux kernel version (containing many drivers), the Xorg server and all Xorg drivers (check /var/log/Xorg.0.log).
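
While swapping components, it can also help to record the same siginfo fields that strace printed earlier in this thread (si_code, and si_pid when the signal came from another process). The following is a minimal sketch under the assumption of a POSIX system; install_sigttin_logger, sigttin_seen and on_sigttin are hypothetical names, and nothing like this exists in i3. Note that installing any handler replaces the default stop action, so a process using it would no longer freeze on SIGTTIN.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Details of the last SIGTTIN, written by the handler. */
static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_code = 0;
static volatile sig_atomic_t last_pid = 0;

/* Handler: only stores values, which is async-signal-safe.
 * si_pid is meaningful only for signals sent by a process
 * (for example si_code == SI_USER), not for kernel-generated ones. */
static void on_sigttin(int signo, siginfo_t *info, void *ctx) {
    (void)ctx;
    last_signo = signo;
    last_code = info->si_code;
    last_pid = (sig_atomic_t)info->si_pid;
}

/* Hypothetical helper: install the recording handler.
 * Returns 0 on success, -1 on error. */
int install_sigttin_logger(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigttin;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTTIN, &sa, NULL);
}

/* Returns 1 once a SIGTTIN has been recorded, 0 otherwise. */
int sigttin_seen(void) {
    return last_signo == SIGTTIN;
}

#ifdef SKETCH_MAIN /* build with -DSKETCH_MAIN to get a standalone demo */
int main(void) {
    if (install_sigttin_logger() != 0) {
        perror("sigaction");
        return 1;
    }
    raise(SIGTTIN); /* stand-in for the real, unexplained signal */
    assert(sigttin_seen());
    printf("SIGTTIN seen: si_code=%d si_pid=%d\n",
           (int)last_code, (int)last_pid);
    return 0;
}
#endif
```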

MarcelPa Nov 30, 2017

I had (have) a similar issue with i3, running the current git version (built by myself):

Binary i3 version: 4.14-170-gf9efc09b (2017-11-26, branch "next") © 2009 Michael Stapelberg and contributors
Running i3 version: 4.14-170-gf9efc09b (2017-11-26, branch "next") (pid 25977)

The lockups seemed very random to me; I was not using function keys when the error occurred. The symptoms as well as the gdb output were the same: i3 got a SIGTTIN signal.

I had a custom i3bar running (i3status-rust, to be exact). Now I use the default i3bar, and the error has not recurred since then (~3 days). In the next few days (when I will have more time to spare) I will plug in my custom i3bar again; hopefully I will then be able to confirm the source of my error.

Hope this helps in some way!

polygonalenippel Sep 21, 2018

Edit: running Binary i3 version: 4.15.0.1 (03-13-2018) on kernel 4.19.0-1-MANJARO

I can confirm this bug still occurs, though only when a secondary monitor is attached, thus forcing me to start a secondary X session utilizing video card drivers (by means of nvidia-xrun in my case).

After looking at what you people were up to and carefully inspecting my start-up routine, I stumbled upon this command in my xinitrc, which is what eventually gets executed (or something along those lines):

exec dbus-launch --sh-syntax --exit-with-session i3 --shmlog-size 0

After removing the --exit-with-session switch, i3 no longer crashes when I use the function keys.

I hope this helps to further pinpoint the exact cause of this issue. Tag me if I can be of any assistance.
