TiFlash cop thread pool can not handle request with high QPS #3696

JaySon-Huang · 2021-12-21T11:03:22Z

Enhancement

One of our users executes queries like select count(*) from table where (`url` like 'xxx%') and `uid` in (....) in TiDB. When the `uid` list contains more than several hundred values, TiDB chooses to route the request to TiFlash. These requests arrive at about 15 QPS.

However, TiFlash cannot handle those queries quickly enough, so the requests queue up in the coprocessor thread pool. Requests keep piling up while TiDB sees them all as timed out and retries, which sends even more requests to TiFlash. Eventually, TiFlash runs out of memory.

"cop_dag" means the coprocessor requests that are being executed and ...

"cop" means the sum of the coprocessor requests that are being executed and the requests that are waiting in the queue.


JaySon-Huang · 2021-12-21T11:07:32Z

/cc @LittleFall @windtalker @zanmato1984

JaySon-Huang · 2021-12-22T03:53:45Z

I think that if the number of pending cop requests exceeds k times the size of the coprocessor thread pool, TiFlash should simply reply with something like "TiFlash is busy" to the caller instead of leaving the request pending in the thread pool. That way, TiFlash can recover from a large number of useless retry requests.
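A minimal sketch of this idea, not TiFlash's actual implementation. The names `CopAdmission` and `ServerBusy` are hypothetical, and `pending` here counts every admitted request that has not finished yet, which is only one possible reading of "pending":

```python
import threading


class ServerBusy(Exception):
    """Returned to the caller instead of queueing the request."""


class CopAdmission:
    """Reject new requests once pending work reaches k * pool_size."""

    def __init__(self, pool_size, k):
        # Hypothetical tunables: pool_size is the coprocessor thread
        # pool size, k is the factor proposed in the comment above.
        self.max_pending = k * pool_size
        self.pending = 0
        self._lock = threading.Lock()

    def admit(self):
        """Call before handing a request to the thread pool."""
        with self._lock:
            if self.pending >= self.max_pending:
                # Fail fast so the caller can back off instead of
                # adding another request to an already long queue.
                raise ServerBusy("TiFlash is busy")
            self.pending += 1

    def done(self):
        """Call when a request finishes, successfully or not."""
        with self._lock:
            self.pending -= 1
```

With `pool_size=2` and `k=3`, the seventh concurrent `admit()` raises `ServerBusy`, and a slot frees up again after one `done()`.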

LittleFall · 2021-12-23T06:32:47Z

> I think that if the number of pending cop requests exceeds k times the size of the coprocessor thread pool, TiFlash should simply reply with something like "TiFlash is busy" to the caller instead of leaving the request pending in the thread pool. That way, TiFlash can recover from a large number of useless retry requests.

This behavior LGTM.

JaySon-Huang · 2021-12-23T09:40:04Z

Are there any plans to implement this behavior? @LittleFall

JaySon-Huang · 2021-12-24T07:08:43Z

We should also take the Elastic Thread Pool/Dynamic Thread Pool model into consideration. /cc @bestwoody @fuzhe1989

fuzhe1989 · 2021-12-24T10:47:26Z

@JaySon-Huang It depends on both TiFlash and TiDB. Do we use an exponential backoff retry strategy?
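For reference, an exponential backoff schedule generally looks like the sketch below. The function name and the base delay, cap, and jitter choices are assumptions for illustration, not TiDB's actual backoff configuration:

```python
import random


def backoff_delays(base_ms, cap_ms, attempts, rng=None):
    """Yield one sleep duration in milliseconds per retry attempt.

    The ceiling doubles on every attempt until it reaches cap_ms.
    Each delay is drawn uniformly from [0, ceiling] ("full jitter")
    so that many clients do not retry at the same instant.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap_ms, base_ms * 2 ** attempt)
        yield rng.uniform(0, ceiling)


if __name__ == "__main__":
    # Hypothetical values: 100 ms base, 2 s cap, 8 attempts.
    for i, delay in enumerate(backoff_delays(100, 2000, 8)):
        print(f"retry {i}: sleep up to {delay:.0f} ms")
```

Without growing delays, every timed-out request is resent at roughly the original rate, which is the amplification described in this issue.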

JaySon-Huang · 2021-12-30T11:02:53Z

Reproduced when running a QA test that uses only 8c for TiKV. The CPU usage of all TiKV nodes is high, which makes all read index requests time out.

JaySon-Huang · 2022-06-29T07:33:07Z

Another similar problem from asktug: https://asktug.com/t/topic/694336/24
In this case, the number of cop tasks reached about 30k and made TiFlash hit the limit of /proc/sys/vm/max_map_count. No more threads could be created, and TiFlash crashed.
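On Linux, this limit and a process's current number of mappings can be inspected directly. The sketch below uses hypothetical helper names. It does not assume any per-thread figure, because the number of mappings each thread consumes depends on the stack and allocator setup:

```python
from pathlib import Path


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the per-process mapping limit, or None if unavailable."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


def count_mappings(pid="self"):
    """Count the mappings of a process: one line in maps per mapping."""
    p = Path(f"/proc/{pid}/maps")
    if not p.exists():
        return None
    with p.open() as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    limit = read_max_map_count()
    used = count_mappings()
    if limit is None or used is None:
        print("procfs is not available on this system")
    else:
        print(f"{used} of {limit} mappings in use")
```

Pass the TiFlash process ID to `count_mappings` to watch how close it gets to the limit as cop tasks accumulate.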

LittleFall · 2023-01-13T09:52:23Z

Closed because #6438 has been basically implemented.

JaySon-Huang added the type/enhancement label Dec 21, 2021

windtalker assigned gengliqi Apr 26, 2022

LittleFall assigned LittleFall and unassigned gengliqi Sep 28, 2022

LittleFall mentioned this issue Sep 28, 2022: Resource temporarily unavailable when thread construct #6046 (Closed)

LittleFall mentioned this issue Dec 7, 2022: Refine handle logic of coprocessor request #6438 (Open, 13 tasks)

LittleFall closed this as completed Jan 13, 2023

JaySon-Huang mentioned this issue Jul 4, 2023: limit the queued task number and queued duration of coprocess task. (#6394) #7739 (Merged, 12 tasks)
