Overhead from the pg_stat_kcache extension #41
Thanks @vitabaks for posting! Worth noting that this report comes from our (postgres.ai) new bot activities; it works on top of GPT-4 Turbo with lots of additional components: https://twitter.com/samokhvalov/status/1743151620555477083. @anayrat raised a couple of questions in https://twitter.com/Adrien_nayrat/status/1744288348217151991. The whole pipeline is here; we collect 70+ artifacts for each iteration, browsable here (or just see the .zip provided above, it's the same). The non-AI part of the automation is here. We can quickly reproduce things if additional checks are needed, but it should be straightforward on any machine.

It is also interesting that pgss (pg_stat_statements) demonstrates noticeable overhead at this scale for trivial pgbench workloads (https://twitter.com/postgres_ai/status/1747690825709215793) – obviously, contention when updating stats for a single query record. But the pgss overhead is much, much lower than pgsk's, so the question is: why such a significant difference?
Hi @vitabaks. I'm assuming the bottleneck comes from the internal lock that protects the array where we store the queryid for each backend, in case parallel workers are launched. That lock was initially added as a precaution, but there shouldn't be any risk of concurrent modification while reading the value, so I don't think it's necessary. The lock should have been harmless, but with a high client count I can see how it would affect performance. Can you try the "remove_queryids_lock" branch that I just pushed? https://github.com/powa-team/pg_stat_kcache/tree/remove_queryids_lock
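For reference, building and installing a branch of the extension typically follows the standard PGXS flow; the exact commands below (paths, `sudo` use, restart command) are assumptions about the test environment, not instructions from this thread:

```bash
# Fetch the branch mentioned above and build it with PGXS
git clone --branch remove_queryids_lock https://github.com/powa-team/pg_stat_kcache.git
cd pg_stat_kcache
make               # requires pg_config for the target cluster on PATH
sudo make install
# Restart PostgreSQL so the patched shared library is loaded,
# e.g.: sudo systemctl restart postgresql
```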
@rjuju Thanks for the quick response. I tested your patch (on c3d-standard-360); here are the results.

Without the patch:

With the patch:

Conclusions: by eliminating the lock, the overhead is gone.
@vitabaks thanks for testing! And that's great news that this is enough to remove the overhead. I just merged the commit into the main branch. I will wait a bit just in case and do a release early next week.
Thanks guys!
Thanks! |
I just released version 2.2.3! Thanks again for the report and testing the patch! |
Please take a look at the following results of a synthetic (read-only) pgbench test, which we ran on these servers:

- c3-standard-176 (176 vCPU Intel, 704 GB memory)
- c3d-standard-360 (360 vCPU AMD, 1440 GB memory)

We observe the degradation with more than 100 clients:
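A minimal sketch of a read-only pgbench run like the one described (the scale factor, duration, and database name here are assumptions; the report's exact parameters are in the attached artifacts):

```bash
pgbench -i -s 1000 testdb              # initialize; scale is an assumption
for c in 1 50 100 150; do              # client counts matching the profiles below
  pgbench -S -c "$c" -j "$c" -T 60 -P 10 testdb
done
```

`-S` selects the built-in read-only (SELECT-only) script, and `-P 10` prints progress every 10 seconds so the degradation is visible during the run.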
Analyzing the wait event profile (based on pg_wait_sampling), we see that the number of `LWLock:pg_stat_kcache` wait events grows with the number of clients, until eventually pg_stat_kcache becomes the TOP-1 event in the profile.

1 client:
50 clients:
100 clients:
150 clients:
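The per-client-count profiles above can be collected from pg_wait_sampling roughly like this (a sketch: the view and columns come from the pg_wait_sampling extension, and the database name is an assumption):

```bash
psql testdb -c "
  SELECT event_type, event, sum(count) AS samples
  FROM pg_wait_sampling_profile
  GROUP BY event_type, event
  ORDER BY samples DESC
  LIMIT 10;"
```

With pg_stat_kcache under contention, the `LWLock` / `pg_stat_kcache` row climbs to the top of this output as the client count grows.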
In the attachment you will find artifacts including settings, postgres stats, logs, and more: