Make "List Prometheus rules" API more responsive while synching rule groups #2289
Thanks @zenador for working on this! I left a few comments.
Thanks @zenador for continuing to work on this. The new design is an improvement, but there are still a few things I would like to fix or clean up. Thanks!
@@ -141,33 +137,50 @@ func (r *DefaultMultiTenantManager) syncRulesToManager(ctx context.Context, user
return
Above we call r.mapper.MapRules(), which is not thread safe. Currently syncRulesToManager() should never be called concurrently, but we haven't documented that, so we should at least update the function comment to make it clear.
Added a comment about this not being thread safe for the same user. I'm not sure if it's completely not safe due to the underlying MkdirAll call.
I think the race would be in the creation of new files + deletion of old ones.
I'm going to address my own latest comments.
userManagerMtx sync.RWMutex
userManagers map[string]RulesManager
Note to reviewers: I split it to better signal that userManagerMtx only protects userManagers. userManagerMetrics is already thread safe.
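To illustrate the note above, here is a minimal sketch of a mutex that guards only the map, with the read path releasing the lock before calling into the per-tenant manager. It is not the code from this PR: tenantManagers, rulesManager, staticManager, set and ruleGroups are hypothetical names made up for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// rulesManager is a hypothetical stand-in for the per-tenant RulesManager.
type rulesManager interface {
	RuleGroups() []string
}

// staticManager is a trivial rulesManager used only for this example.
type staticManager struct{ groups []string }

func (m staticManager) RuleGroups() []string { return m.groups }

// tenantManagers is a hypothetical, reduced multi-tenant manager.
type tenantManagers struct {
	// userManagerMtx protects userManagers and nothing else.
	userManagerMtx sync.RWMutex
	userManagers   map[string]rulesManager
}

func newTenantManagers() *tenantManagers {
	return &tenantManagers{userManagers: map[string]rulesManager{}}
}

// set stores the manager for a user while holding the write lock.
func (t *tenantManagers) set(user string, m rulesManager) {
	t.userManagerMtx.Lock()
	defer t.userManagerMtx.Unlock()
	t.userManagers[user] = m
}

// ruleGroups holds the read lock only for the map lookup and releases it
// before calling into the manager, so a slow manager call never blocks writers.
func (t *tenantManagers) ruleGroups(user string) []string {
	t.userManagerMtx.RLock()
	m, ok := t.userManagers[user]
	t.userManagerMtx.RUnlock()
	if !ok {
		return nil
	}
	return m.RuleGroups()
}

func main() {
	tm := newTenantManagers()
	tm.set("tenant-a", staticManager{groups: []string{"alerts", "recording"}})
	fmt.Println(tm.ruleGroups("tenant-a"))
}
```

Keeping the field comment next to the mutex makes the scope of the lock explicit to future readers, which is the intent of the split described above.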
Thanks @zenador and @56quarters for the joint work! I pushed some last fixes and double checked the logic, which I think is exactly the same as on main (before this refactoring), except that the lock is now held only for a short period.
I will ask for another maintainer review, since I also mangled this PR.
…t Prometheus rules" API more responsive while synching rule groups
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
fb0dada to 037ff7c
I pushed a CHANGELOG entry and rebased. @stevesg could you review this PR, please?
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Looks good 👍
…groups (grafana#2289)
* Try to minimise the locking in DefaultMultiTenantManager to make "List Prometheus rules" API more responsive while synching rule groups
* Code review changes
* Address review feedback
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
* Addressed self-review comments
Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Added CHANGELOG entry
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Is there a way to test this?
What this PR does
Minimise the locking in DefaultMultiTenantManager to make "List Prometheus rules" API more responsive while synching rule groups.
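The general idea can be sketched as follows, assuming the slow part of a sync is building the per-tenant manager: the lock is taken briefly to look up the map, released while the manager is built, and taken again only to store the result. This is an illustrative sketch rather than the actual change; multiTenant, manager, get and syncUser are hypothetical names, and like syncRulesToManager() in the review above, syncUser assumes it is never called concurrently for the same user.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical per-tenant rules manager.
type manager struct{ groups []string }

// multiTenant is a hypothetical, reduced multi-tenant manager.
type multiTenant struct {
	mtx      sync.RWMutex // protects managers only
	managers map[string]*manager
}

func newMultiTenant() *multiTenant {
	return &multiTenant{managers: map[string]*manager{}}
}

// get is the read path used by a "list rules" style API call.
func (m *multiTenant) get(user string) (*manager, bool) {
	m.mtx.RLock()
	defer m.mtx.RUnlock()
	mgr, ok := m.managers[user]
	return mgr, ok
}

// syncUser creates the manager for a user if it does not exist yet.
// The potentially slow build function runs without any lock held.
// Assumption: never called concurrently for the same user.
func (m *multiTenant) syncUser(user string, groups []string, build func([]string) *manager) {
	if _, ok := m.get(user); ok {
		return
	}

	// Slow work happens here, outside the lock, so readers are not blocked.
	mgr := build(groups)

	// Hold the write lock only for the map insert.
	m.mtx.Lock()
	m.managers[user] = mgr
	m.mtx.Unlock()
}

func main() {
	mt := newMultiTenant()
	mt.syncUser("tenant-a", []string{"alerts"}, func(g []string) *manager {
		return &manager{groups: g}
	})
	mgr, ok := mt.get("tenant-a")
	fmt.Println(ok, mgr.groups)
}
```

Because the build callback runs without the lock, it can itself call get() without deadlocking, which would not be possible if the write lock were held for the whole sync.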
Which issue(s) this PR fixes or relates to
Fixes #2283
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]