Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inaccurate docs regarding gen_server.call/2,3 when server is stopped during a call without reply? #7510

Closed
smaximov opened this issue Jul 25, 2023 · 2 comments · Fixed by #7511
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM

Comments

@smaximov
Copy link
Contributor

smaximov commented Jul 25, 2023

Describe the bug
Erlang docs for gen_server:call/2,3 state that a call can exit with {normal, _Location} if handle_call/3 returns {stop, normal, _} without replying:

normal
{shutdown,Term}
The server stopped during the call by returning {stop,Reason,_} from its Module:handle_call/3 callback, without replying. See also stop/3.

But I encountered a situation where a gen_server call can exit with the normal reason while handle_call/3 returns a reply and handle_info/2 stops the server process by returning {stop, normal, State}.

To Reproduce

Consider the following code (full code with an Elixir example here).

gen_server_call_normal.erl:

-module(gen_server_call_normal).
-behaviour(supervisor).
-export([run/3, init/1]).

run(Iters, Timeout, Puts) ->
  {ok, Sup} = supervisor:start_link(?MODULE, []),

  try
    loop(#{}, Iters, Sup, Timeout, Puts)
  after
    gen_server:stop(Sup)
  end.

init(_InitArg) ->
  SupFlags = #{strategy => simple_one_for_one},
  ChildSpecs = [#{id => gen_server_call_normal_server,
                  start => {gen_server_call_normal_server, start_link, []},
                  restart => transient}],
  {ok, {SupFlags, ChildSpecs}}.

loop(Acc, 0, _Sup, _Timeout, _Puts) ->
  Acc;

loop(Acc, Iters, Sup, Timeout, Puts) ->
  {ok, Server} = supervisor:start_child(Sup, [{Iters, Timeout, Puts}]),

  receive
  after
    Timeout -> ok
  end,

  try
    gen_server:call(Server, do_work),
    Acc1 = maps:update_with(success, fun(X) -> X + 1 end, 1, Acc),
    loop(Acc1, Iters - 1, Sup, Timeout, Puts)
  catch
    exit:{Reason, _Request} ->
      Acc2 = maps:update_with(Reason, fun(X) -> X + 1 end, 1, Acc),
      loop(Acc2, Iters - 1, Sup, Timeout, Puts)
  end.

gen_server_call_normal_server.erl:

-module(gen_server_call_normal_server).
-behaviour(gen_server).
-export([start_link/1, init/1, handle_call/3, handle_info/2, handle_cast/2]).

start_link(Arg) ->
  gen_server:start_link(?MODULE, Arg, []).

init({Iteration, Timeout, Puts}) ->
  erlang:send_after(Timeout, self(), shutdown),
  {ok, {Iteration, Puts}}.

handle_call(do_work, _From, State) ->
  {reply, ok, State}.

handle_info(shutdown, {Iteration, Puts} = State) ->
  if
    Puts -> io:format("~p: exiting...~n", [Iteration]);
    true -> ok
  end,
  {stop, normal, State}.

handle_cast(_Msg, State) ->
  {noreply, State}.

Perform 100 gen_server calls in the shell:

6> gen_server_call_normal:run(100, 100, true).
% [...skip output...]
#{normal => 100}

This will return a map with exit reasons (or success if a call didn't result in an exit) as keys and how many times each corresponding exit reason occurred as values. The numbers can differ from run to run, in this particular invocation all 100 calls resulted in normal exits.

Expected behavior
The map doesn't contain normal as a key; i.e. not a single call results in a normal exit since we stop the server process from inside the handle_info/2 callback, not the handle_call/3 callback.

Affected versions
25.3.2.4.

Additional context
I encountered this "bug" in an Elixir project and this is a stripped down more-or-less-literal translation to Erlang; because I don't usually code in Erlang it can be incorrect. Also, interestingly enough, if we don't print to the standard output inside the handle_info/2 callback (i.e., pass false as the third argument of gen_server_call_normal:run/3) the issue seems to no longer occur:

4> gen_server_call_normal:run(100, 100, false).
#{noproc => 100}
@smaximov smaximov added the bug Issue is reported as a bug label Jul 25, 2023
@garazdawi
Copy link
Contributor

garazdawi commented Jul 26, 2023

Yes, the documentation is incorrect/inaccurate. You will get a normal exit reason if the process returns {stop,normal,_} from any callback while the call is in progress. What happens is this:

  1. Caller sets up a monitor and sends message.
  2. Server timeout triggers and stops the server with reason normal.
  3. Caller gets monitor 'DOWN' signal with reason normal and returns that as reason.

When the caller gets the 'DOWN' it has no way of knowing whether it was its call or something else that cause the process to exit, so it just returns the reason to the caller.

The io:format needs to be there because an io:format does a receive which forces a synchronization of the signal queue. Without it the monitor signal in the message queue will not be processed until after the death of the process, which means that it will receive noproc reasons.

Without io:format

  1. Caller sends monitor and message signals.
  2. Server timeout triggers and stops the server with reason normal.
  3. When terminating server sees that it has a monitor signal to handle and sends noproc to caller.

With io:format

  1. Caller sends monitor and message signals.
  2. Server timeout triggers
  3. Server does io:format (i.e. a receive) and detects the monitor signal
  4. Server terminates and sees that it has a monitor to handle and sends normal to caller.

You should get exactly the same behavior by only adding a receive unknown -> ok after 0 -> ok end instead of the io:format.

@garazdawi garazdawi linked a pull request Jul 26, 2023 that will close this issue
@jhogberg jhogberg added the team:VM Assigned to OTP team VM label Jul 31, 2023
jhogberg added a commit that referenced this issue Jul 31, 2023
* jhogberg-patch-1:
  gen_server: Clarify normal shutdown return on gen_server:call/2,3

OTP-18690
PR-7511
GH-7510
@jhogberg
Copy link
Contributor

I've merged the documentation update now, thanks for your report!

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants