-
Notifications
You must be signed in to change notification settings - Fork 437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix huge bottleneck in notification emails #4200
Conversation
Before:
(aka roughly one mail a minute) After:
Now roughly one mail every 3 seconds. Still not perfect, but we should get the event backlog down with this speed. |
Codecov Report
@@ Coverage Diff @@
## master #4200 +/- ##
===========================================
+ Coverage 46.58% 87.75% +41.17%
===========================================
Files 340 340
Lines 22439 18567 -3872
===========================================
+ Hits 10453 16294 +5841
+ Misses 11986 2273 -9713
|
This is likely the reason I got 4308 emails this morning instead of the usual ~100. If it is - good job! |
fyi: the delayed job backlog is done as of this morning - we had >11.000 when I started to look at the problem. |
@coolo Nice work thanks for fixing this! Somewhat annoyed I didn't see this problem myself.. I thought that by INNER JOINing the watched_projects and users this wouldn't be a problem but obviously that wasn't the case. |
src/api/app/models/event/request.rb
Outdated
@@ -134,8 +134,8 @@ def source_watchers | |||
|
|||
def find_watchers(project_key) | |||
project_names = payload['actions'].map { |action| action[project_key] }.uniq | |||
projects = Project.where(name: project_names).joins(watched_projects: :user) | |||
projects.flat_map { |project| project.watched_projects.map(&:user) } | |||
projects = WatchedProject.where(project: Project.where(name: project_names)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small point but please change the variable name from projects
to watched_projects
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we had an unneeded join
🙈
The way source_watchers and target_watchers were implemented was pretty suboptimal. It took >20 seconds to dig out 128 users watching openSUSE:Factory by going through the WatchedProject objects connected to these 128 users and listed the users for these, which resulted in 128x128 SELECT user ... - but all we want is the list of 128 User objects (to filter subscriptions for them) So the new query results in one SQL statement taking the database a friction of a second
0f880bd
to
b0d7d26
Compare
@evanrolfe if it helps your ego - the problem was only so severe because Max created 4000 requests from the very often watched openSUSE:Factory. Normally we don't have so many requests to or from it, that you would notice. For the curious: this is how the SQL looks before and after:
|
projects = Project.where(name: project_names).joins(watched_projects: :user) | ||
projects.flat_map { |project| project.watched_projects.map(&:user) } | ||
watched_projects = WatchedProject.where(project: Project.where(name: project_names)) | ||
User.where(id: watched_projects.select(:user_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before we where returning an array, while now a relations is returned. Maybe we want to keep returning an array using find
instead:
User.find(watched_projects.select(:user_id))
I also find it more readable. I do not know if using find
instead of where
can affect performance (either in a negative or a positive way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO returning a relation instead of an array is an improvement since you can always call .to_a
on a relataion but you cant really turn an array into a relation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't matter - the next call on the relation is .each - which will cast into an array. But when you return a relation the caller has the freedom to use in_batches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, then let's merge this! 😉
The way source_watchers and target_watchers were implemented was pretty
suboptimal. It took >20 seconds to dig out 128 users watching
openSUSE:Factory by going through the WatchedProject objects connected
to these 128 users and listed the users for these, which resulted in
128x128 SELECT user ... - but all we want is the list of 128 User
objects (to filter subscriptions for them)
So the new query results in one SQL statement taking the database a
friction of a second