Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremely slow queries with timeouts can incapacitate the pool #20

Open
Licenser opened this issue Feb 17, 2017 · 4 comments
Open

Extremely slow queries with timeouts can incapacitate the pool #20

Licenser opened this issue Feb 17, 2017 · 4 comments

Comments

@Licenser
Copy link

Running extremely slow querries, that time out can result in the entire pool becoming unusable. So far it looks like the following happens:

We have:
A pool with the workers [A, B, C]

  1. Start very slow query, worker A get checked out and handed the query.
  2. Pool has now the workers [B, C].
  3. Worker A returns with a timeout (the slow query still running) and gets added back to the pool as the 1st element of the list.
  4. A new query checks out A again, but A is still busy with the slow (timed out) query resulting in it becoming inoperative.
  5. As A is the 1st worker in the pool now every 1st checkout will now always time out.

This has been verified by:

  1. getting the state of pool after 3 showing A being first element agian
  2. querying A's via get_state (no reply)
  3. querying process_info for A (showing the query of 4 in the message queue as well as the get_state request).
  4. spawning the query 4 in a off process, waiting 500ms and then executing 4 again (forcing A to be taken off the queue by the spawned process) and receiving correct results.
@seriyps
Copy link
Member

seriyps commented Feb 17, 2017

Do you expect A to be killed in case of timeout?
FYI, pooler's return_member/3 have 3rd argument indicating if worker should be returned to pool or killed and replaced. But poolboy have nothing like this

@Licenser
Copy link
Author

Given the PR above, I think yes that is a decent approach, however I'm happy to be convinced otherwise.

@davidw
Copy link
Member

davidw commented Feb 17, 2017

@seriyps this is nice in pooler, too:

You can configure Pooler to periodically check for and remove members that have not been used recently to reduce the member count back to its initial size.

I and others kept arguing for something like that to be added to poolboy, but I don't know if they ever did. It makes a lot of sense for an expensive resource to not be 'returned' right away when you have a burst of traffic.

@Licenser
Copy link
Author

Agreed, I think the suggested 'kill on timeout' is sensible tradeoff, as long as the connection works within the requiested timeframe it can keep on living but connections that time out have to die, but at least the death is quick and merciful.

@Licenser Licenser changed the title Extremely low queries with timeouts can incapacitate the pool Extremely slow queries with timeouts can incapacitate the pool Feb 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants