New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tenant monitoring performance improvements #6868
Tenant monitoring performance improvements #6868
Conversation
Codecov Report
@@ Coverage Diff @@
## main #6868 +/- ##
==========================================
- Coverage 93.29% 92.68% -0.61%
==========================================
Files 272 272
Lines 57896 57935 +39
==========================================
- Hits 54012 53699 -313
- Misses 3884 4236 +352 |
211a865
to
0ffc710
Compare
bbebbbe
to
ccd464b
Compare
|
||
monitor->tenants = ShmemInitHash("citus_stats_tenants hash", | ||
StatTenantsLimit * 3, StatTenantsLimit * 3, | ||
&info, HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By using HASH_BLOBS
we can we can remove the hash
and match
functions, assuming that TenantStatsHashKey passes the assert_valid_hash_key2
check. See hash_helpers.{c,h}
stats->score = 0; | ||
stats->lastScoreReduction = 0; | ||
|
||
if (!found) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would it ever be found? Seems like we could simplify this a little bit and pass in NULL
instead of &found
to hash_search
.
CreateTenantStatsHashKey(char *tenantAttribute, uint32 colocationGroupId) | ||
{ | ||
TenantStatsHashKey *key = (TenantStatsHashKey *) palloc(sizeof(TenantStatsHashKey)); | ||
memset(key->tenantAttribute, 0, MAX_TENANT_ATTRIBUTE_LENGTH); | ||
strlcpy(key->tenantAttribute, tenantAttribute, MAX_TENANT_ATTRIBUTE_LENGTH); | ||
key->colocationGroupId = colocationGroupId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit/food for thought: It might be nice to fill in memory from the caller instead of allocating here. Then you can store the data on the stack instead of on the heap. (which should be a bit faster, and makes sure you don't need to use pfree
)
CreateTenantStatsHashKey(char *tenantAttribute, uint32 colocationGroupId) | |
{ | |
TenantStatsHashKey *key = (TenantStatsHashKey *) palloc(sizeof(TenantStatsHashKey)); | |
memset(key->tenantAttribute, 0, MAX_TENANT_ATTRIBUTE_LENGTH); | |
strlcpy(key->tenantAttribute, tenantAttribute, MAX_TENANT_ATTRIBUTE_LENGTH); | |
key->colocationGroupId = colocationGroupId; | |
FillTenantStatsHashKey(TenantStatsHashKey *key, char *tenantAttribute, uint32 colocationGroupId) | |
{ | |
memset(key->tenantAttribute, 0, MAX_TENANT_ATTRIBUTE_LENGTH); | |
strlcpy(key->tenantAttribute, tenantAttribute, MAX_TENANT_ATTRIBUTE_LENGTH); | |
key->colocationGroupId = colocationGroupId; |
2dfa19c
to
301c258
Compare
TenantStatsHashKey *key = (TenantStatsHashKey *) palloc(sizeof(TenantStatsHashKey)); | ||
FillTenantStatsHashKey(key, AttributeToTenant, AttributeToColocationGroupId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant as an advantage of this approach was that you can store TenantsStatsHashKey on the stack instead.
TenantStatsHashKey *key = (TenantStatsHashKey *) palloc(sizeof(TenantStatsHashKey)); | |
FillTenantStatsHashKey(key, AttributeToTenant, AttributeToColocationGroupId); | |
TenantStatsHashKey key = {0}; | |
FillTenantStatsHashKey(&key, AttributeToTenant, AttributeToColocationGroupId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good now and benchmark results look great. Some extra benchmarks on machines with more cores would be needed to find the final impact of the new code, but the changes are clearly an improvement. If we are going to backport this we should be doing valgrind tests again.
- [x] Use spinlock instead of lwlock per tenant [b437aa9](b437aa9) - [x] Use hashtable to store tenant stats [ccd464b](ccd464b) - [x] Introduce a new GUC for specifying the sampling rate of new tenant entries in the tenant monitor. [a8d3805](a8d3805) Below are the pgbench metrics with select-only workloads from my local machine. Here is the [script](https://gist.github.com/gokhangulbiz/7a2308470597dc06734ff7c08f87c656) I used for benchmarking. | | Connection Count | Initial Implementation (TPS) | On/Off Diff | Final Implementation -Run#1 (TPS) | On/Off Diff | Final Implementation -Run#2 (TPS) | On/Off Diff | Final Implementation -Run#3 (TPS) | On/Off Diff | Avg On/Off Diff | | --- | ---------------- | ---------------------------- | ----------- | ---------------------------------- | ----------- | ---------------------------------- | ----------- | ---------------------------------- | ----------- | --------------- | | On | 32 | 37488.69839 | \-17% | 42859.94402 | \-5% | 43379.63121 | \-2% | 42636.2264 | \-7% | \-5% | | Off | 32 | 43909.83121 | | 45139.63151 | | 44188.77425 | | 45451.9548 | | | | On | 300 | 30463.03538 | \-15% | 33265.19957 | \-7% | 34685.87233 | \-2% | 34682.5214 | \-1% | \-3% | | Off | 300 | 35105.73594 | | 35637.45423 | | 35331.33447 | | 35113.3214 | | | (cherry picked from commit 2c509b7)
Below are the pgbench metrics with select-only workloads from my local machine. Here is the script I used for benchmarking.