Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition while reconnecting. #92

Closed
ololoru opened this issue Nov 9, 2016 · 5 comments
Closed

Race condition while reconnecting. #92

ololoru opened this issue Nov 9, 2016 · 5 comments

Comments

@ololoru
Copy link

ololoru commented Nov 9, 2016

I've discovered race condition while testing reconnection logic. Here below is setup I'm using:

  1. Erlang (R16) is running on Macos, I start erlang shell and create new connection via eredis:start_link
    and do basic put/get test. I'm using default database, and no authentication
  2. Redis is started in docker container. (I'm using native docker for macos)
  3. I stop redis service in docker and see race condition results either in process crash or closed socket saved in eredis_client state.

When connection dies and socket receives tcp_closed message eredis_client:reconnect_loop is called in a separate process where it:

  1. Creates new socket
  2. Changes controlling process to original eredis_client PID
  3. Sends message to eredis_client PID to replace connection

There are the following issues with described approach:

  • Between steps 1) and 2) newly created socket(connection) dies and spawned process (where reconnect_loop is executed) receives tcp_closed message what causes eredis client process using dead socket
  • Between steps 2. and 3. socket(connection) dies and eredis_client receives tcp_closed prior to connection_ready message while socket is undefined what causes process termination on unhandled_message
@ntrepid8
Copy link

ntrepid8 commented Feb 11, 2017

I see this issue pop up when trying to connect to Redis version 3.0.6, but do not see it when connecting to Redis version 2.8.4.

edit: this turned out to be a different networking issue on my server running Redis 3.0.6, please disregard...

@knutin
Copy link
Collaborator

knutin commented Feb 11, 2017 via email

@benbro
Copy link

benbro commented Mar 14, 2017

How is this different than any other socket failure?
The socket can close in the following steps:

  • After calling gen_tcp:controlling_process/2 in reconnect_loop/2
  • in your PR, after sending the messages in the queue with [Client ! M || M <- Msgs]
  • At any other point in time

do_request(Req, From, State) returns an error when the socket is closed. The client can handle the error and retry. Isn't it enough?

Maybe we should handle the connection_ready message when the socket isn't undefined in the state?

@benbro
Copy link

benbro commented Mar 14, 2017

Another option is to check for a live connection on every request and connect otherwise:
https://github.com/interline/epgsql_pool/blob/master/src/epgsql_pool_worker.erl

@knutin
Copy link
Collaborator

knutin commented Aug 23, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants