TiFlash cop thread pool can not handle request with high QPS #3696
Comments
I think that if the number of pending cop requests is more than k times the size of the coprocessor thread pool, TiFlash should simply reply with something like "TiFlash is busy" to the caller instead of leaving the request pending in the thread pool. That way TiFlash can recover from a large number of useless retry requests.
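A minimal sketch of this idea, written in Python for brevity rather than in TiFlash's own code. The names `AdmissionGate`, `BusyError`, and `handle` are hypothetical and do not come from TiFlash; the sketch only assumes a counter of pending requests that is capped at k times the pool size.

```python
import threading


class BusyError(Exception):
    """Raised when a request is shed instead of being queued (hypothetical)."""


class AdmissionGate:
    """Caps the number of pending requests at k * pool_size (illustrative only)."""

    def __init__(self, pool_size, k):
        self.limit = pool_size * k
        self._pending = 0
        self._lock = threading.Lock()

    def try_admit(self):
        # Admit the request only if the pending count is still below the cap.
        with self._lock:
            if self._pending >= self.limit:
                return False
            self._pending += 1
            return True

    def release(self):
        # Called once an admitted request has finished (or failed).
        with self._lock:
            self._pending -= 1


def handle(gate, work):
    """Run `work` if admitted; otherwise fail fast with a busy error."""
    if not gate.try_admit():
        raise BusyError("TiFlash is busy")
    try:
        return work()
    finally:
        gate.release()
```

The point of failing fast is that a rejected request costs almost nothing on the server, whereas a queued request that the caller has already given up on still consumes memory and, later, a worker thread.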
This behavior LGTM.
Are there any plans to implement this behavior? @LittleFall
This should also take the Elastic Thread Pool/Dynamic Thread Pool model into consideration. /cc @bestwoody @fuzhe1989
@JaySon-Huang It depends on both TiFlash and TiDB. Do we use an exponential backoff retry strategy?
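For reference, a generic sketch of exponential backoff with full jitter. This is not a description of what TiDB actually does; `backoff_delays` and its parameters are hypothetical names chosen for illustration.

```python
import random


def backoff_delays(base_ms, cap_ms, attempts, rng=None):
    """Return one randomized delay (in ms) per retry attempt.

    The upper bound doubles on every attempt and is capped at cap_ms;
    the actual delay is drawn uniformly from [0, bound] ("full jitter").
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        bound = min(cap_ms, base_ms * 2 ** attempt)
        delays.append(rng.uniform(0, bound))
    return delays


if __name__ == "__main__":
    # Example: 8 retries, starting at 100 ms and capped at 2 s.
    for i, delay in enumerate(backoff_delays(100, 2000, 8)):
        print(f"retry {i}: wait up to {delay:.0f} ms")
```

With a strategy like this, callers that receive a busy or timeout response spread their retries out over time instead of resending immediately, which reduces the extra load placed on an already overloaded server.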
Another similar problem from asktug: https://asktug.com/t/topic/694336/24
Closed because #6438 has basically been implemented.
Enhancement
One of our users executes queries like
select count(*) from table where (`url` like 'xxx%') and `uid` in (....)
in TiDB. If the `uid` list has more than several hundred entries, TiDB chooses to route the request to TiFlash, at about 15 QPS. However, TiFlash cannot handle those queries quickly enough, so the requests queue up in the coprocessor thread pool. Requests pile up while TiDB sees them all as timed out and retries, which sends even more requests to TiFlash. Eventually, TiFlash runs out of memory.
"cop_dag" means the coprocessor requests that are being executed and ...
"cop" means the sum of the coprocessor requests that are being executed and those that are queued.
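The retry feedback loop described in this report can be illustrated with a toy queue model. It is only a sketch under stated assumptions: the 15 QPS arrival rate comes from the report, while the service capacity and the timeout in the usage example are made-up numbers, and `simulate` is a hypothetical helper, not TiFlash or TiDB code. The model assumes that a timed-out request is resent once by the client while its stale copy stays in the server queue.

```python
from collections import deque


def simulate(arrival_per_s, capacity_per_s, timeout_s, seconds, retry=True):
    """Return the server queue length after each simulated second."""
    queue = deque()  # each entry: [enqueue_time, already_retried]
    history = []
    for now in range(seconds):
        # New client requests arrive.
        for _ in range(arrival_per_s):
            queue.append([now, False])
        if retry:
            # Requests that waited too long are resent by the client,
            # but the server still holds the stale copy in its queue.
            retries = 0
            for item in queue:
                if not item[1] and now - item[0] >= timeout_s:
                    item[1] = True
                    retries += 1
            queue.extend([now, False] for _ in range(retries))
        # The server completes at most capacity_per_s requests, oldest first.
        for _ in range(min(capacity_per_s, len(queue))):
            queue.popleft()
        history.append(len(queue))
    return history


if __name__ == "__main__":
    # Assumed numbers: capacity of 10 requests/s and a 2 s client timeout.
    print("no retry:  ", simulate(15, 10, 2, 20, retry=False)[-1])
    print("with retry:", simulate(15, 10, 2, 20, retry=True)[-1])
```

Without retries, the backlog in this model grows by the fixed gap between arrivals and capacity (5 requests per second here). With retries, every timed-out request adds a second copy, so the backlog grows faster, which matches the pile-up described in the report.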