Support :on_timeout in Task.async_stream and friends #6009
Maybe ":exit (default) - ..."?
This way we don't need the trailing "Defaults to :exit.".
Good point, will update.
Don't we log some messages when we kill?
Mh, I don't think so. Should we?
No, I was just asking/checking. :)
Given the 2 possible ways to leak timers I think we need to send the timer to the Monitor, instead of the Caller.
I believe that sending the timer to the Monitor and cancelling it there would also let us run slightly faster in a multi-core scenario:
- cancel the timer using `async: true, info: false` so we don't block to cancel (this might be a call to the timer wheel on another scheduler), as we wouldn't care about the result and don't need to flush in the Monitor
- send the pid to the Caller, then start the timer (don't block to add to the timer wheel)
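A minimal sketch of the non-blocking cancellation suggested above, using `Process.cancel_timer/2` with the `:async` and `:info` options. The message name `:timeout_sketch` and the 60-second delay are illustrative assumptions, not taken from the PR.

```elixir
# Start a timer that would deliver a message to this process in 60 seconds.
# (:timeout_sketch is a hypothetical message name used only for this example.)
timer_ref = Process.send_after(self(), :timeout_sketch, 60_000)

# async: true -> the cancel request is issued without waiting for it to complete.
# info: false -> no result is reported back, so there is nothing to flush later.
cancel_result = Process.cancel_timer(timer_ref, async: true, info: false)
```

With both options set, the call only acknowledges the request; it does not say whether the timer was still pending, which is acceptable when the result is not needed.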
When starting a timer for a different process it is not possible to guarantee that the other process receives the timer reference, because it is always possible for the process to get an exit signal in between starting the timer and sending the timer reference. Therefore, to prevent timer leaks we would require the different process to always go down, i.e. using links, monitors or supervisors. Unfortunately we can't guarantee this here.
It could happen in the following situation:
- `on_timeout: :exit` and catches exit when stream enumerated
- Monitor process gets exit signal that makes it exit between `Process.send_after` and `send`
- Caller is trapping exits when it receives the exit signal from monitor process (need not be when spawning monitor process but still possible if `:kill`)
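The race described above can be sketched in isolation. This is only an illustration under assumptions: the helper process stands in for the Monitor, the `exit(:shutdown)` call simulates an exit signal arriving at the worst moment, and `:leaked_timer_sketch` is a hypothetical message name.

```elixir
caller = self()

# Hypothetical helper standing in for the Monitor process.
spawn(fn ->
  # The timer targets the caller, so it keeps running after this process dies.
  _timer_ref = Process.send_after(caller, :leaked_timer_sketch, 50)

  # Simulate an exit that happens before the reference is sent to the caller.
  # The caller never learns the reference and so cannot cancel the timer.
  exit(:shutdown)
end)

# The timer message still arrives even though its creator is gone.
leaked? =
  receive do
    :leaked_timer_sketch -> true
  after
    1_000 -> false
  end
```

The timer is tied to its destination rather than to the process that created it, which is why the creator exiting does not clean it up.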
If the caller kills the timed out task and is not trapping exits, it will get a :killed exit signal from the propagating links. Would it be more useful to have semantics that match Task.yield and Task.shutdown, where we would have nil on timeout? (Note that Task.shutdown does a safe unlink+shutdown/kill).
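For reference, a small sketch of the Task.yield and Task.shutdown semantics mentioned here. The 50 ms timeout and the never-finishing task are assumptions chosen to force the timeout path.

```elixir
# A task that never replies, so the yield below is guaranteed to time out.
task = Task.async(fn -> Process.sleep(:infinity) end)

# Task.yield/2 returns nil on timeout. Task.shutdown/2 then stops the task
# and returns nil as well when no reply arrived while shutting down.
result = Task.yield(task, 50) || Task.shutdown(task, :brutal_kill)
```

The calling process keeps running afterwards even though it is linked to the task and is not trapping exits.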
nil + safe kill sounds okay to me, @josevalim thoughts?
We should also test the safe kill, possibly by not trapping exits during the tests.
There can be pending timers in our waiting map; we would need to cancel all these timers on stream_close, as the Caller might catch the exit/1.
    case waiting do
      %{^position => {_, {:ok, _} = ok}} -> Map.put(waiting, position, {nil, ok})
      %{^position => {_, :running}} -> Map.put(waiting, position, {nil, {:exit, reason}})
      %{^position => {_, :timed_out}} -> Map.put(waiting, position, {nil, {:exit, :killed}})
We cannot guarantee the reason is :killed here; it can be any reason. For example, the task can exit while the timer message is in the message queue of the Monitor. Since the reason would be :killed if the task was killed, we could just use the reason in the :exit. However, the reason might also be :killed because the task was killed for another cause.
Therefore I think we should use another value when the state is :timed_out and the reason is :killed, and otherwise use {:exit, reason}.
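A minimal sketch of the rule proposed here. The module and function names are hypothetical, and `:timeout` is only a placeholder for the "other value", which had not been chosen at this point in the thread.

```elixir
defmodule ExitReasonSketch do
  # Hypothetical helper: decide what to emit for a task that went down.
  # Only a task that was marked as timed out AND died with :killed is
  # reported with the dedicated value; every other case keeps its reason.
  def result(:timed_out, :killed), do: {:exit, :timeout}
  def result(_state, reason), do: {:exit, reason}
end
```

A task that exits on its own while the timer message is still queued therefore keeps its real exit reason.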
@fishcakez okay, got it. So one option would be to Process.exit(task, :kill) but emit {:exit, :killed_for_timeout}, and another option would be to Process.exit(task, :kill) and emit {:exit, reason_task_died}, did I get it correctly? If so, which one do you think is better?
    case running_tasks do
      %{^ref => {position, _type, pid, _timer_ref}} ->
        send(parent_pid, {:killed_for_timeout, {monitor_ref, position}})
        Process.exit(pid, :kill)
If we aren't trapping exits and on_timeout: :exit this creates 2 possibilities.
Either the task is killed and the exit signal propagates, causing the Caller to get a :killed exit signal. Or the Caller handles the :killed_for_timeout message and tells the Monitor to stop all tasks. The Monitor receives the stop and then traps exits before the task exits. Then the Caller will call exit({:timeout, ..}). These differing semantics are complex. I think we would need to unlink from the Monitor, perhaps similar to Task.shutdown.
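One way to picture the unlink-then-kill idea is the sketch below. It is an illustration only: the module name is hypothetical and this is not the code from the PR or from Task.shutdown.

```elixir
defmodule SafeKillSketch do
  # Hypothetical helper: kill a linked process without the :killed exit
  # signal propagating back to the calling process.
  def kill(pid) do
    # Monitor first so we still learn when and why the process went down.
    ref = Process.monitor(pid)
    # Drop the link so the kill cannot take the caller down with it.
    Process.unlink(pid)
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> reason
    end
  end
end
```

Because the link is removed before the kill, the caller behaves the same whether or not it is trapping exits.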
    send(parent_pid, {:killed_for_timeout, {monitor_ref, position}})
    caller = self()
    ref = make_ref()
    enforcer = spawn(fn ->
If trapping exits we could skip spawning, unsure if worth complexity of that optimisation though.
      %{^position => {_, {:ok, _} = ok}} -> Map.put(waiting, position, {nil, ok})
      %{^position => {_, :running}} -> Map.put(waiting, position, {nil, {:exit, reason}})
    - %{^position => {_, :timed_out}} -> Map.put(waiting, position, {nil, {:exit, :killed}})
    + %{^position => {_, :timed_out}} -> Map.put(waiting, position, {nil, {:exit, :timed_out}})
GenServer.start_link does {:error, :timeout} : https://github.com/erlang/otp/blob/cd412d911efbda23e7dd3aef5cf910defc886211/lib/stdlib/src/proc_lib.erl#L350
Should we use :timeout as well?
Need new final review.
    case waiting do
      %{^position => {_, {:ok, _} = ok}} -> Map.put(waiting, position, {nil, ok})
      %{^position => {_, :running}} -> Map.put(waiting, position, {nil, {:exit, reason}})
      %{^position => {_, :timed_out}} -> Map.put(waiting, position, {nil, {:exit, :timeout}})
Maybe we should use :timeout here too for consistency?
I thought about that, I wasn't sure. I don't have any argument for :timed_out here though, so I updated to use :timeout. Should I change the message we send with Process.send_after/3 as well?
Probably yes :), I updated that as well. Let me know how it looks.
josevalim
left a comment
It looks good to me. If @fishcakez agrees, we shall merge it.
🎉!
`:on_timeout` can be `:kill_task | :exit`, where `:exit` is the current behaviour.
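A short usage sketch of the option, assuming an Elixir release that ships `:on_timeout` for `Task.async_stream/3`. The sleep durations and the 200 ms timeout are illustrative values picked so that exactly one task times out.

```elixir
results =
  [10, 2_000]
  |> Task.async_stream(
    fn ms ->
      # Each task sleeps for the given number of milliseconds, then returns it.
      Process.sleep(ms)
      ms
    end,
    timeout: 200,
    on_timeout: :kill_task
  )
  |> Enum.to_list()

# The fast task yields {:ok, 10}; the slow one is killed and reported
# as {:exit, :timeout} instead of exiting the process running the stream.
```

With the default `on_timeout: :exit`, the same timeout would instead exit the process that enumerates the stream.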