Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new_worker failed cause poolby crashed #72

Closed
linbo opened this issue Apr 14, 2015 · 6 comments
Closed

new_worker failed cause poolby crashed #72

linbo opened this issue Apr 14, 2015 · 6 comments

Comments

@linbo
Copy link

linbo commented Apr 14, 2015

Hi,

We use poolboy to manage redis connection pool, in production environment find redis connection closed sometimes.

Checked code in poolboy.erl line 275, find if new_worker failed, in our example, maybe redis start failed due to network error, it will cause poolboy crashed, all other existing redis connections will be closed.

Maybe new_worker should deal with failed case, and shouldn't affect the other processes managed by poolboy.

@devinus
Copy link
Owner

devinus commented Apr 14, 2015

@linbo Add smarts to your worker to handle disconnects. This is not for the pool itself to handle. See https://github.com/interline/epgsql_pool/blob/master/src/epgsql_pool_worker.erl#L58 for inspiration.

@devinus devinus closed this as completed Apr 14, 2015
@linbo
Copy link
Author

linbo commented Apr 14, 2015

@devinus @Vagabond @onkel-dirtus I don't get it. I mean poolboy_worker init/start_link failed, it will cause poolboy crashed.

As my understanding, poolboy is a pool, start new worker failed should not cause other workers in the pool terminated.

And refer to https://github.com/devinus/poolboy/blob/master/src/poolboy_worker.erl, start_link can returned {error, Reason}, why eat up error information but return ok?

@devinus
Copy link
Owner

devinus commented Apr 14, 2015

@linbo What does your worker look like?

@linbo
Copy link
Author

linbo commented Apr 15, 2015

@devinus It is eredis (https://github.com/wooga/eredis), now I refer to https://github.com/phoenixframework/phoenix/blob/master/lib/phoenix/pubsub/redis_conn.ex, try add a proxy process to deal with this issue.

@devinus
Copy link
Owner

devinus commented Apr 15, 2015

Yes, your worker itself needs to deal with disconnects/reconnects like Phoenix's example. You never want to let a supervisor handle disconnects/reconnects because of worker churn and restart strategy taking down the entire pool.

@linbo
Copy link
Author

linbo commented Apr 16, 2015

Hi, I don't know where can discuss poolboy design, just comment here.

I add a proxy process (https://github.com/yunba/eredis_pool/blob/master/src/eredis_proxy.erl), normally it works fine. But for redis connection timeout, I find eredis_proxy will be closed by poolboy.

I checked code (https://github.com/devinus/poolboy/blob/master/src/poolboy.erl#L309), when checkin process and overflow >0, it will close checkin process.

For my scenario, maybe network or other cases happened, new overflow process connect to redis are all timeout, at same time old good checkin processes will be closed. Then this time, maybe all connected redis client will be closed.

I am not sure how to deal this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants