Optimize query uniqueness check in ProcessList #57106
Conversation
This is an automated comment for commit f447c54 with a description of existing statuses. It's updated for the latest CI run.
src/Interpreters/ProcessList.cpp
Outdated
@@ -313,6 +311,9 @@ ProcessListEntry::~ProcessListEntry()
    }
}

if (auto query_user = parent.queries_to_user.find(query_id); query_user != parent.queries_to_user.end())
    parent.queries_to_user.erase(query_user);
I may be wrong, but at first glance we should erase from queries_to_user after cancelled_cv.wait(), otherwise we are going to have a race condition here.
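The ordering the reviewer suggests can be sketched as follows. This is an illustration only, with simplified stand-in types (`Entry`, `ProcessListLike`, `removeQuery`, and the `is_cancelling` flag are hypothetical, not the real ClickHouse classes); the point is that the `queries_to_user` entry is erased only after `cancelled_cv.wait()` has completed, so a concurrent thread cannot re-register the same `query_id` while the old query is still being torn down.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <unordered_map>

struct Entry
{
    bool is_cancelling = false;  // set while a KILL/cancel is in flight
};

struct ProcessListLike
{
    std::mutex mutex;
    std::condition_variable cancelled_cv;
    std::unordered_map<std::string, Entry> queries_to_user;

    void removeQuery(const std::string & query_id)
    {
        std::unique_lock lock(mutex);

        // 1. Wait until no cancellation is in flight for this query.
        cancelled_cv.wait(lock, [&]
        {
            auto it = queries_to_user.find(query_id);
            return it == queries_to_user.end() || !it->second.is_cancelling;
        });

        // 2. Only now is it safe to drop the uniqueness-map entry;
        //    erasing before the wait would open the race the reviewer describes.
        queries_to_user.erase(query_id);
    }
};
```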
Hi @adikus, I'm wondering: in your test, how many users are there? We have also noticed ClickHouse spending time on ProcessList processing, but in our case there is only one user issuing up to 2K queries. I think the single-user scenario cannot benefit much from this patch, right?
I don't think this patch benefits your case much, since it mainly avoids the cost of scanning over the user_process_list, while in your case there is just one user.
Hi!
In our ClickHouse use case, we have a lot of users (currently a couple thousand per ClickHouse instance), all of whom are expected to run queries.
Recently we tried to push this even further (to a couple tens of thousands of users per instance), which resulted in a significant degradation of our QPS throughput. We were able to achieve only 100 to 200 QPS, with each query taking up to 30s (the expected run time for these queries is under 100ms).
With tracing logs enabled, we determined that most queries were spending their time waiting on this lock:
https://github.com/ClickHouse/ClickHouse/blob/c2f4dc0f14f55cf9785da86cb2bf07c556e22d1e/src/Interpreters/ProcessList.cpp#L82C27-L82C27
The reason was that inserts into the ProcessList were taking too long, specifically in this for loop:
src/Interpreters/ProcessList.cpp, lines 185 to 194 at c2f4dc0
The for loop iterates over every user the server has ever seen, which explains why most queries spend so much time on this lock (we saw up to 10ms spent in the loop, which would explain the ~100 QPS limit we observed).
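The arithmetic behind that limit is worth spelling out (this is a back-of-the-envelope illustration, not code from the PR): if every query insert holds the global ProcessList mutex for about 10 ms, inserts are fully serialized, so throughput is capped at roughly 1 / 0.010 s = 100 queries per second, matching the observed 100 to 200 QPS.

```cpp
// Back-of-the-envelope throughput cap under a serialized 10 ms critical section.
constexpr double lock_hold_seconds = 0.010;          // ~10 ms spent in the loop
constexpr double max_qps = 1.0 / lock_hold_seconds;  // ~100 inserts per second
```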
This PR replaces the for loop with a single lookup into a std::unordered_map. We tested this patch in our environment and achieved over 2000 QPS on the same workload as before (it could definitely go higher).
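The shape of the change can be sketched like this. The struct and method names below (`ProcessListSketch`, `tryRegister`, `unregister`) are simplified stand-ins for illustration, not the real ClickHouse code; only the idea matches the PR: a flat `queries_to_user` map gives an O(1) uniqueness check for a query_id, instead of scanning the process list of every user the server has ever seen.

```cpp
#include <string>
#include <unordered_map>

// Simplified sketch: query_id uniqueness via one hash lookup
// instead of a loop over all per-user process lists.
struct ProcessListSketch
{
    // query_id -> user currently running it
    std::unordered_map<std::string, std::string> queries_to_user;

    // Returns false if the query_id is already in use (duplicate query).
    bool tryRegister(const std::string & query_id, const std::string & user)
    {
        // Single hash lookup + insert replaces the old for loop over users.
        auto [it, inserted] = queries_to_user.emplace(query_id, user);
        return inserted;
    }

    void unregister(const std::string & query_id)
    {
        queries_to_user.erase(query_id);
    }
};
```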
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Improve performance of executing queries for use cases with many users
Documentation entry for user-facing changes