ERL-1291: Race condition in gen_server: start can return already_started when there is no server. #4259

Closed
OTP-Maintainer opened this issue Jun 24, 2020 · 2 comments
Labels
bug Issue is reported as a bug priority:low team:PS Assigned to OTP team PS wontfix Issue will not be fixed by OTP

Comments

@OTP-Maintainer

Original reporter: rjmh
Affected version: Not Specified
Component: stdlib
Migrated from: https://bugs.erlang.org/browse/ERL-1291


The code that calls the init callback contains the following, to handle a stop result:

      {ok, {stop, Reason}} ->
        %% For consistency, we must make sure that the
        %% registered name (if any) is unregistered before
        %% the parent process is notified about the failure.
        %% (Otherwise, the parent process could get
        %% an 'already_started' error if it immediately
        %% tried starting the process again.)
        gen:unregister_name(Name0),

The code that calls handle_call has no such step. As a result, if a call returns {stop, Reason, Reply, NState} and the caller tries to restart the server shortly afterwards, start can return an already_started error even though the server has already stopped.

We have a test that does this, and see already_started once every few thousand tests.
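
For illustration, the scenario can be sketched roughly as below. This is a minimal sketch and not the actual test: the module name `stop_race`, the `run/1` function, and the use of a local name are all assumptions. It starts a locally registered server, stops it from handle_call, restarts it immediately, and counts how often the restart is rejected with already_started.

```erlang
-module(stop_race).
-behaviour(gen_server).

%% Hypothetical reproduction harness; names are illustrative only.
-export([run/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Run N start/stop rounds and return how many start attempts
%% were rejected with already_started.
run(N) when is_integer(N), N >= 0 ->
    run(N, 0).

run(0, Hits) ->
    Hits;
run(N, Hits) ->
    case gen_server:start({local, ?MODULE}, ?MODULE, [], []) of
        {ok, Pid} ->
            %% The reply arrives here; the server process may still
            %% be in the middle of terminating at this point.
            ok = gen_server:call(Pid, stop),
            run(N - 1, Hits);
        {error, {already_started, _OldPid}} ->
            %% The name of the previous instance is still registered.
            %% Give it a chance to finish exiting, then retry.
            erlang:yield(),
            run(N, Hits + 1)
    end.

init([]) ->
    {ok, nostate}.

%% Stop from a call, replying to the caller.
handle_call(stop, _From, State) ->
    {stop, normal, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Calling `stop_race:run(100000)` returns the number of rejected start attempts. Whether and how often it is non-zero depends on scheduling, so a result of 0 on a given run does not show that the race is absent.
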
@OTP-Maintainer

raimo said:

This ancient code is probably written to cover the specific case where a service is started in a tight loop, polling for some condition only known to the service.


When a server terminates, there is no code that cleans up the name. The problem seems to have always been ignored, on the assumption that the name gets cleaned up by the normal process mechanisms. That opens up a race, e.g. when global finds out that the process has died and unregisters the name, but finds out later than someone who restarts the service.

 

The name and how it is registered (Name0) are lost in gen_server after init. It might be fixable, but adding a call to global or to a via module during process termination sounds a bit risky, so I think it is unfortunately not a dead simple and obvious bug fix...

@OTP-Maintainer OTP-Maintainer added bug Issue is reported as a bug team:PS Assigned to OTP team PS priority:low labels Feb 10, 2021
@RaimoNiskanen RaimoNiskanen added the wontfix Issue will not be fixed by OTP label Feb 22, 2023
@RaimoNiskanen

I guess the best you can do is to use start_monitor and wait for the DOWN message. Still, if the name is registered through global or via, this seems to be an impossible problem to solve.
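
A minimal sketch of that workaround, assuming OTP 23 or later (where gen_server:start_monitor/4 is available) and a locally registered name. The module name `monitored_restart` and its functions are hypothetical. The caller holds on to the monitor reference returned by start_monitor and does not restart until the matching DOWN message has arrived. This relies on the local name being released by the time DOWN is delivered, and per the caveat above it does not help with global or via names.

```erlang
-module(monitored_restart).
-behaviour(gen_server).

%% Hypothetical example module; names are illustrative only.
-export([start/0, stop/2, cycle/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the server and monitor it in one step.
%% Returns {ok, {Pid, MonRef}} on success.
start() ->
    gen_server:start_monitor({local, ?MODULE}, ?MODULE, [], []).

%% Ask the server to stop, then block until its DOWN message
%% arrives so that a following start/0 does not race with the exit.
stop(Pid, MonRef) ->
    ok = gen_server:call(Pid, stop),
    receive
        {'DOWN', MonRef, process, Pid, _Reason} ->
            ok
    after 5000 ->
        %% Arbitrary timeout chosen for this sketch.
        error(down_timeout)
    end.

%% Start and stop the server N times in a row.
cycle(0) ->
    ok;
cycle(N) when is_integer(N), N > 0 ->
    {ok, {Pid, MonRef}} = start(),
    ok = stop(Pid, MonRef),
    cycle(N - 1).

init([]) ->
    {ok, nostate}.

handle_call(stop, _From, State) ->
    {stop, normal, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```
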
