MacOS poll() is broken - gsi with --enable-poll consumes 100% CPU #403
I assume this version is compiled with clang. |
Putting gsi in |
Compiling with the default configure flags from the official release, the problem is gone:
The same release 4.9.3 is used as the basis of the Homebrew build. As mentioned above, the configure flags used by Homebrew are:
Which one would be most likely to cause the problem? I can try compiling from the official source with some of those flags to find out if I can reproduce the problem. |
Tried again with
|
What does |
Those just set UTF-8, right? https://www.iro.umontreal.ca/~gambit/doc/gambit.html#Runtime-options |
Yes, they set the default character encoding to UTF-8 ( |
Would threads be a likely culprit? Is there a way to disable multi-threading, or a Scheme procedure / REPL command to view some kind of thread status while gsi is running? If I understood correctly, use of multiple OS threads is disabled in the default build, but "green threads" (i.e. threads emulated by the Gambit runtime) are always on. |
I'll try to reproduce on my end. It may have to do with the UTF-8 parser. |
Thanks! I'll also retry on my end with |
I have now tried builds with each of these configure option sets:
None of those exhibit the problem that the pre-compiled build from Homebrew has. |
Did you try with/without Otherwise, start in lldb or gdb and check where it is looping... |
I didn't enable that one for any of my manual builds. |
With |
Nor with |
If you have access to a Mac, please try |
I reinstalled gambit-scheme using brew on my mac and I don't get the 100% usage... Maybe you have to debug that on your end. |
I wonder if the problem might be caused by an unusual setting of your tty. Can you start gsi from a bash shell in a plain Terminal and see if the problem goes away? |
Interesting. Here's an
Doesn't seem all that different from the build without the problem. |
Using the ordinary Mac Terminal.app doesn't make a difference, and bash is my shell. |
The executable |
Each time I stopped the process above, it was at about 97% CPU (which I guess is the maximum that the kernel gives to a normal process without a special scheduling priority). When I stopped it, the CPU usage naturally also stopped. When I continued it, the CPU usage "revved up" from zero to 97% in about 5 seconds. The rev-up takes a little longer when running in the debugger. I don't know why it takes several seconds to get to full usage. |
I tried stopping the process when it hits 97% CPU and then single-stepping it. It runs lots of simple instructions in ___H___thread. I think at one point it got into a stack overflow handler. Here's a backtrace:
At one point it got there via device_select. Does this say anything obvious? |
Seems like it is selecting on the terminal (as it should) but maybe the select or parsing doesn't work as expected. OK, |
|
I always wondered about that one... any idea why? As for the 100% CPU usage problem, we will have to use some of the heavy duty debugging tricks to figure this one out, to see what Scheme code is being executed. Please |
You mean the
Sounds like you have some impressive debugging support built in. Unfortunately for debugging purposes, all of the builds I have made manually with |
I confirm your binary is using 100% CPU. |
|
I missed
And indeed, here's the issue: Homebrew/homebrew-core#39850 What I don't understand is why the old |
For next time, you can get the configure options of gsi with The |
Which one does Mac Gambit use with I believe the libuv source code is the canonical reference for how to do all this portable and fast polling stuff nowadays. I plan to study it myself as well. |
I did |
Just replicated the problem from the 4.9.3 source tarball:
|
Yes I'm looking into the issue, and am baffled that |
In other words, poll always claims the terminal fd is available for reading but |
Yes, that's what I understand (haven't actually tested with a trace). This document might be useful to better understand |
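One way to test this hypothesis without a full trace is to count how often poll reports a descriptor as readable while a non-blocking read on it finds nothing. The following is only a minimal sketch under that assumption (the exact symptom is cut off in the comments above); the function name `count_spurious_wakeups` and its parameters are hypothetical and are not part of Gambit.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Polls fd up to `rounds` times. Each time poll() reports POLLIN, tries a
 * non-blocking read(). Returns how many times poll() claimed readability
 * but read() failed with EAGAIN/EWOULDBLOCK, or -1 on error.
 * Note: data that is actually available gets consumed by the read. */
int count_spurious_wakeups(int fd, int rounds, int timeout_ms)
{
    assert(fd >= 0 && rounds >= 0);

    /* Switch to non-blocking mode so that read() cannot hang. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;

    int spurious = 0;
    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int n = poll(&pfd, 1, timeout_ms);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted by a signal */
            spurious = -1;
            break;
        }
        if (n == 0) continue;              /* timed out: nothing reported */
        if (pfd.revents & POLLIN) {
            char buf[256];
            ssize_t r = read(fd, buf, sizeof buf);
            if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                spurious++;                /* poll said yes, read said no */
        }
    }

    fcntl(fd, F_SETFL, flags);             /* restore the original flags */
    return spurious;
}
```

A result well above zero for an idle terminal descriptor would be consistent with a runtime that spins because poll keeps waking it up; a result of zero would point elsewhere.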
I wonder if Linux has the same issue with |
|
FreeBSD 12.0-RELEASE amd64 running in VirtualBox with |
... or macOS is itself buggy... (I have another outstanding bug with macOS concerning the interval timer when running on multi-core... the interrupt rate is divided by the number of cores being used) |
hmmm... https://daniel.haxx.se/blog/2016/10/11/poll-on-mac-10-12-is-broken/ It says the bug was fixed, but maybe there is still a problem with ttys... |
There seems to be a bug in poll in macOS. It returns
|
This is particularly strange because /dev/stdin is an alias for the controlling terminal given that the program's stdin was not redirected. |
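For anyone who wants to repeat this kind of check, here is a minimal sketch that asks both poll and select, with a zero timeout, whether the same descriptor is readable. It is an illustration only, not the test program referred to above; the function names are hypothetical, and which paths to try (for example /dev/stdin versus /dev/tty) is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if poll() reports fd readable right now, 0 if not, -1 on error. */
int readable_by_poll(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, 0);              /* timeout 0: do not block */
    if (n < 0) return -1;
    return n > 0 && (pfd.revents & POLLIN) != 0;
}

/* Same question, answered by select(). */
int readable_by_select(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);    /* select() cannot handle larger fds */
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = { 0, 0 };          /* zero timeout: do not block */
    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0) return -1;
    return n > 0 && FD_ISSET(fd, &rfds);
}

/* Opens path read-only and compares the two answers.
 * Returns 1 if they agree, 0 if they disagree, -1 on error. */
int poll_and_select_agree(const char *path)
{
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0) return -1;
    int p = readable_by_poll(fd);
    int s = readable_by_select(fd);
    close(fd);
    if (p < 0 || s < 0) return -1;
    return p == s;
}
```

On a system where both calls behave correctly, the two answers should match for an idle terminal; a mismatch on one path but not on another would narrow down where the disagreement lies.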
It's very plausible that something subtle like this is broken in the Unix layer of MacOS. The curl programmers' experiences are informative. Thanks for taking the time to write that test program. I get the same results as you. Additionally:
I would start by studying what libuv does as they have a good reputation for this stuff and it's actively maintained. However, they tend to prefer the OS-native polling framework, such as kqueue on BSD (including MacOS), to portable facilities like select and poll so it may be a lot of work to adopt their approach. They appear to have a dedicated TTY API - some example code: https://github.com/libuv/libuv/blob/master/docs/code/tty/main.c. Docs front page: http://docs.libuv.org/en/v1.x/ |
I just compiled Bigloo yesterday and they have simply imported libuv into their source tree. That approach may not make sense for Gambit which aims to be extremely portable (though being portable and lightweight are also among the main goals of libuv). We discussed how to handle some murky Unix signal stuff in Scheme with Göran, and both came to the conclusion that it'd be best to start by studying what libuv does and possibly just adopt it wholesale. |
It's a pity that the OS native facilities are so buggy, non-standard or hard to use that third-party wrapper libraries are needed but we just have to live with that. Every Scheme is going to hit problems like this eventually. |
Apparently MacOS poll() is also broken on named pipes and it's the same bug that affects TTYs: https://stackoverflow.com/questions/591826/it-really-looks-like-os-x-has-a-bug-when-using-poll-on-a-named-pipe-fifo |
Unfortunately Apple is very slow at responding to bug reports, so I'm not even going to report this and don't have any hopes it will be resolved (it seems to have been a known issue for some time). Since
As for libev, it would be nice to add this as an optional feature for Gambit, i.e. |
That's a perfectly appropriate solution. Indeed, Apple bugs can go unfixed for years. Thanks for taking good care of Gambit and being very responsive with this issue.
That sounds like the best of both worlds. Not sure how easy it is to integrate libuv (or libev or libevent) into Gambit's existing I/O and scheduling framework. |
Gambit installed on Mac OS Mojave from the Homebrew package manager (brew install gambit-scheme) consumes all available CPU even when idle: just start gsi from the shell and wait a few seconds.
The Homebrew formula (which can be viewed with brew edit gambit-scheme) configures Gambit with:
Possibly related Linux problem: #126
Does this have something to do with signals and/or threads?