SBT stuck running inside Ninja #1190

Closed
darabos opened this issue Oct 4, 2016 · 2 comments

darabos commented Oct 4, 2016

Maybe what I am trying to do is a terrible idea, but it seems to demonstrate a strange bug in SBT or Ninja. SBT is "Scala Build Tool". I have a complex project where each part has its own build system (such as SBT) and I just want a higher level dependency description to tie them together. ("We need to rebuild this subproject only if that subproject changed.") I'm just trying different tools for this, and I tried Ninja with interesting results.

$ mkdir ninja-vs-sbt && cd ninja-vs-sbt
$ sbt about
[info] Set current project to ninja-vs-sbt (in build file:/home/darabos/ninja-vs-sbt/)
[info] This is sbt 0.13.11
[info] The current project is {file:/home/darabos/ninja-vs-sbt/}ninja-vs-sbt 0.1-SNAPSHOT
[info] The current project is built against Scala 2.10.6
[info] Available Plugins: sbt.plugins.IvyPlugin, sbt.plugins.JvmPlugin, sbt.plugins.CorePlugin, sbt.plugins.JUnitXmlReportPlugin
[info] sbt, sbt plugins, and build definitions are using Scala 2.10.6
$ ninja --version
1.7.1
$ cat <<EOF > build.ninja
rule sbt
  command = sbt about
build x: sbt
EOF
$ ninja
[0/1] sbt about

It just hangs.

$ ps aux | grep sbt
darabos  21829  0.0  0.0   4508   752 pts/30   T    23:47   0:00 /bin/sh -c sbt about
darabos  21830  0.0  0.0  21416  3288 pts/30   T    23:47   0:00 bash /usr/bin/sbt about
darabos  21899 10.8  0.2 3716728 91712 pts/30  Tl   23:47   0:00 java -Xms1024m -Xmx1024m -XX:ReservedCodeCacheSize=128m -XX:MaxMetaspaceSize=256m -jar /usr/share/sbt-launcher-packaging/bin/sbt-launch.jar about
$ kill -9 21829
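
The `T` in the STAT column of the `ps` output above marks a process that has been stopped by a job-control signal. A minimal sketch of a helper that checks this for a given PID is shown below. The function name `is_stopped` is hypothetical, and it assumes a `ps` that accepts `-o stat=` and `-p` (the procps `ps` on Linux and the stock `ps` on macOS should both qualify).

```shell
# is_stopped PID
# Succeeds (exit status 0) if the process state reported by ps starts with "T",
# i.e. the process is stopped by a job-control signal such as SIGTTOU or SIGSTOP.
# Fails if the process is in any other state or does not exist.
is_stopped() {
  # "stat=" prints only the state column, with no header line.
  state=$(ps -o stat= -p "$1" 2>/dev/null | tr -d '[:space:]')
  case "$state" in
    T*) return 0 ;;
    *)  return 1 ;;
  esac
}
```

For example, `is_stopped 21899 && echo "JVM is stopped"` would report on the JVM from the listing above.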

The JVM starts, but I cannot connect to it with jstack to tell what it is up to. From strace it looks like it gets stuck after running stty:

[pid 22332] execve("/bin/stty", ["stty", "-icanon", "min", "1", "-icrnl", "-inlcr", "-ixon"], [/* 38 vars */]) = 0
[pid 22332] brk(NULL)                   = 0xfff000
[pid 22332] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid 22332] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6134247000
[pid 22332] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
[pid 22332] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[pid 22332] fstat(3, {st_mode=S_IFREG|0644, st_size=143212, ...}) = 0
[pid 22332] mmap(NULL, 143212, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6134224000
[pid 22332] close(3)                    = 0
[pid 22332] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid 22332] open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
[pid 22332] read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
[pid 22332] fstat(3, {st_mode=S_IFREG|0755, st_size=1864888, ...}) = 0
[pid 22332] mmap(NULL, 3967488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f6133c5b000
[pid 22332] mprotect(0x7f6133e1b000, 2093056, PROT_NONE) = 0
[pid 22332] mmap(0x7f613401a000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bf000) = 0x7f613401a000
[pid 22332] mmap(0x7f6134020000, 14848, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f6134020000
[pid 22332] close(3)                    = 0
[pid 22332] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6134223000
[pid 22332] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6134222000
[pid 22332] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6134221000
[pid 22332] arch_prctl(ARCH_SET_FS, 0x7f6134222700) = 0
[pid 22332] mprotect(0x7f613401a000, 16384, PROT_READ) = 0
[pid 22332] mprotect(0x610000, 4096, PROT_READ) = 0
[pid 22332] mprotect(0x7f6134249000, 4096, PROT_READ) = 0
[pid 22332] munmap(0x7f6134224000, 143212) = 0
[pid 22332] brk(NULL)                   = 0xfff000
[pid 22332] brk(0x1020000)              = 0x1020000
[pid 22332] open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
[pid 22332] fstat(3, {st_mode=S_IFREG|0644, st_size=11798864, ...}) = 0
[pid 22332] mmap(NULL, 11798864, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f613311a000
[pid 22332] close(3)                    = 0
[pid 22332] open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
[pid 22332] fstat(3, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
[pid 22332] read(3, "# Locale name alias data base.\n#"..., 4096) = 2995
[pid 22332] read(3, "", 4096)           = 0
[pid 22332] close(3)                    = 0
[pid 22332] open("/usr/lib/locale/UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22332] open("/usr/share/locale-langpack/UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 22332] ioctl(0, TCGETS, {B9600 opost isig icanon echo ...}) = 0
[pid 22332] ioctl(0, TCGETS, {B9600 opost isig icanon echo ...}) = 0
[pid 22332] ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B9600 opost isig -icanon echo ...} <unfinished ...>
[pid 22331] <... wait4 resumed> 0x7ffe510eec7c, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 22332] <... ioctl resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 22331] --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_KERNEL} ---
[pid 22332] --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_KERNEL} ---
[pid 22331] --- stopped by SIGTTOU ---
[pid 22332] --- stopped by SIGTTOU ---

After that, the SIGTTOU signal appears to stop all the parent processes as well. (There are 21 more "stopped by SIGTTOU" lines for different PIDs.)

I decided to raise this issue with Ninja rather than SBT because SBT works fine when I run it normally. It works fine when I run it from a script, or when I redirect its inputs and outputs to /dev/null. It looks more like Ninja may put it in a sort of broken environment.

I see the same behavior on Ubuntu 16.04 and macOS 10.12.

evmar (Collaborator) commented Oct 5, 2016

Ninja runs the subprocess in a pty. Perhaps sbt queries something about its environment in a manner that Ninja doesn't set up right.

I vaguely recall we changed our signal handling recently, perhaps that is related?

Google found this random page: https://github.com/timjr/tomograph/tree/master/doc/openstack-patches which says "We use setsid instead of nohup because sbt seems to try to frob the terminal so it gets a SIGTTOU and stops otherwise:"

darabos (Author) commented Oct 6, 2016

It looks like SBT does not deal well with SIGTTOU. It gets SIGTTOU because it uses JLine, which tries to change the settings of the controlling terminal (the stty call in the strace output) even when SBT is not the foreground process. It can be reproduced without Ninja:

$ sbt about &
[1]+  Stopped                 sbt about

(Note that the program is just stopped, not terminated. It finishes successfully if you fg, but otherwise hangs forever.)

There is a workaround:

sbt -Djline.terminal=jline.UnsupportedTerminal about &

I can just add this flag to any SBT command.
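
Applied to the build.ninja from the original report, the workaround might look like the sketch below. It assumes the flag behaves the same under Ninja as it does with `sbt about &`, which this thread does not confirm directly.

```shell
# Recreate the build.ninja from the report, with the JLine flag added to the
# sbt command so that JLine does not try to reconfigure the terminal.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat <<'EOF' > build.ninja
rule sbt
  command = sbt -Djline.terminal=jline.UnsupportedTerminal about
build x: sbt
EOF
```

Running `ninja` in the same directory would then invoke sbt with the flag set.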

I guess I'll raise this issue with SBT. Sorry for the noise here! 😊
