Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add locking mechanism for tenant monitoring probabilistic approach #7026

Merged
merged 19 commits into from
Jul 3, 2023

Conversation

gokhangulbiz
Copy link
Contributor

@gokhangulbiz gokhangulbiz commented Jun 21, 2023

This PR

  • Addresses a concurrency issue in the probabilistic approach of tenant monitoring by acquiring a shared lock for tenant existence checks.
  • Changes citus.stat_tenants_sample_rate_for_new_tenants type to double
  • Renames citus.stat_tenants_sample_rate_for_new_tenants to citus.stat_tenants_untracked_sample_rate

@gokhangulbiz gokhangulbiz requested a review from JelteF June 22, 2023 09:45
@gokhangulbiz gokhangulbiz marked this pull request as ready for review June 22, 2023 09:45
@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from 3e64a99 to bc602c4 Compare June 22, 2023 12:06
Comment on lines 303 to 304
/* Set a good seed for random */
srandom((unsigned int) time(NULL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need seed the random generator. It's already done by Postgres at process start.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the InitProcessGlobals function in case you're interested.

Copy link
Contributor

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realised this did not have test yet.

@@ -2473,12 +2473,12 @@ RegisterCitusConfigVariables(void)
NULL, NULL, NULL);


DefineCustomIntVariable(
DefineCustomRealVariable(
"citus.stat_tenants_sample_rate_for_new_tenants",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name needs some PM input.

@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from 012e259 to bffa600 Compare June 22, 2023 13:01
@@ -12,6 +12,7 @@
#include "unistd.h"

#include "access/hash.h"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary newline

@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Merging #7026 (32a3414) into main (ac24e11) will decrease coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #7026      +/-   ##
==========================================
- Coverage   93.34%   93.32%   -0.02%     
==========================================
  Files         273      273              
  Lines       58511    58514       +3     
==========================================
- Hits        54618    54611       -7     
- Misses       3893     3903      +10     

@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from 2f6d7b6 to 13976e0 Compare June 22, 2023 15:12
@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from 13976e0 to 34c9b69 Compare June 22, 2023 16:09
Copy link
Contributor

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good, with the exception that we still need a decision on the name of the GUC.

Copy link
Contributor

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but somehow the test is still a little bit flaky. I also don't see obvious reasons why that is happening though. The only thing that might help is adding a SELECT pg_sleep(0.1); after the pg_reload_conf(). We needed that in a few other tests because pg_reload_conf() is asynchronous. We already sleep here until the next period, but that might be to short if the new period is almost starting.

If that still doesn't help let's remove the final part of the test where the sample rate is 1. It's already tested earlier in this test file implicitly anyway.

@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from 5513f0a to 20a2d17 Compare July 3, 2023 07:34
@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch 2 times, most recently from 16278e6 to c39f851 Compare July 3, 2023 08:50
@gokhangulbiz gokhangulbiz force-pushed the gokhangulbiz/fix-monitor-lock branch from c39f851 to d64ff68 Compare July 3, 2023 09:29
@gokhangulbiz gokhangulbiz requested a review from JelteF July 3, 2023 09:57
@gokhangulbiz gokhangulbiz merged commit e0d3476 into main Jul 3, 2023
77 checks passed
@gokhangulbiz gokhangulbiz deleted the gokhangulbiz/fix-monitor-lock branch July 3, 2023 10:08
gokhangulbiz added a commit that referenced this pull request Jul 12, 2023
aykut-bozkurt pushed a commit that referenced this pull request Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants