-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix io:columns/0 timeout when invoked from application callback #351
Conversation
This patch fixes an issue where io:columns/0 times out when invoked from any application callback (or any supervisor/supervised module since the group leader is inherited). To reproduce the issue, one must simply call io:columns() from any application callback. You will notice the process will block for 2 seconds which then times out and returns {:error, :enotsup}. Note this bug only happens inside the erlang shell (using -noshell or escripts do not trigger the bug). To fix the bug, it is important to understand how io requests flow from application callback processes. Here are the steps followed: 1. Since io:columns/1 is timing out, the first step is to find out who is the group leader for the application callback process. Using process_info/1, we can see the parent process is the application_master and handles io_requests by delegating them to the group_leader. 2. By inspecting the application_master process, we can find the group_leader the message is sent is the registered process named user. The process is running the group module which does handle io:columns/1 requests delegating them to user_drv process. 3. The user_drv process does handle tty_geometry requests, except that a clause above ends up short-circuiting all tty_geometry requests from the user process. This patch moves the user clause below the specific driver messages.
I'm not against this patch, this sounds like a true bug. However, I'm wondering what design decision leads to fetching the number of IO columns in an application callback. Is this done in one of the start (or start phases) on behalf of a supervisor or other process, and is there an expectation that the terminal is never resized? If so, this sounds like the logic should be moved elsewhere where it can be modified more frequently. (if it's done for a one-off message that happens to need the width just for itself, then I have pretty much nothing to say about it) |
Notice it is not only the application callback. Since the supervisor and all supervisor children "inherit" from the application callback process, all of them will have the same issue. Calling io:columns/0 in a GenServer or inside a Cowboy handler will all trigger the 2 seconds wait. I mentioned the application callback because it is the easiest way to reproduce the issue. |
Patch has passed first testings and has been assigned to be reviewed |
This patch also fix a bug where there is a 2 seconds delay to read the shell encoding (via |
merged, will be visible on github soon (tm) |
Thank you! |
This patch fixes an issue where io:columns/0 times out when
invoked from any application callback (or any supervisor/supervised module
since the group leader is inherited).
To reproduce the issue, one must simply call io:columns() from any
application callback. You will notice the process will block for 2 seconds
which then times out and returns {:error, :enotsup}.
Note this bug only happens inside the erlang shell (using -noshell or
escripts do not trigger the bug).
To fix the bug, it is important to understand how io requests flow from
application callback processes. Here are the steps followed:
Since io:columns/1 is timing out, the first step is to find out who is the
group leader for the application callback process. Using process_info/1, we
can see the parent process is the application_master and it does handle
io_requests by delegating to the group_leader:
otp/lib/kernel/src/application_master.erl
Lines 172 to 174 in aca0b61
By inspecting the application_master process, we can find the group_leader
the message is sent above is the registered process named user. The process is
running the group module which does handle io:columns/1 requests:
otp/lib/kernel/src/group.erl
Lines 263 to 276 in aca0b61
Those requests are delegated to the underlying driver defined here:
otp/lib/kernel/src/group.erl
Line 37 in aca0b61
The driver is the user_drv process. The user_drv process does handle
tty_geometry requests, except that a clause above ends up ignoring the
tty_geometry requests from the user process:
otp/lib/kernel/src/user_drv.erl
Lines 175 to 180 in aca0b61
This patch moves the user clause below the specific driver messages.