Revert friends-of-friends follow recommendation query to using a CTE #29619
Also fixes ordering back to the initial intent
On the ordering change here -- was this an oversight we missed in review? Or was the intention to change the ordering, but due to the perf issues we're going back to original ordering?
The ordering change in my earlier PR was an oversight of mine: I misunderstood the original ordering, and there were no tests to catch the change. I only noticed that the ordering had changed when I started again from the old CTE.
Cool - given that, I agree that the spec changes here do recapture what the original pre-refactor ordering looked like. I'll defer to you all on the query change / perf implications / etc.
We tested the queries on mastodon.social and perf is similar to the previous code
To add to that, we tested multiple versions of the query for multiple sets of active mastodon.social users (random 400 users, 5 with very few follows, 5 with large amounts of follows and blocks) and the results were always pretty consistent: the code in this PR is roughly on par with the old query that was missing filtering, while the code before this PR was way slower (10 to 50 times slower). The code in main...mjankowski:mastodon:fix-order-in-floof-foof-fooferoni-changes was consistently ~1.5× slower than that of this PR, which could be due to some extra work instead of reusing the CTE for filtering.
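A comparison like the one described above can be sketched as a small timing harness. This is only an illustration, not the script that was actually run on mastodon.social: `median_timings`, `slowdown_ratios`, and the stand-in lambdas are hypothetical names, and a real run would call the actual query variants for each sampled account.

```ruby
require 'benchmark'

# Hypothetical helper: run every query variant against the same sample of
# account ids and keep the (upper) median wall-clock time per variant.
def median_timings(variants, account_ids)
  variants.transform_values do |run_query|
    times = account_ids.map do |id|
      Benchmark.realtime { run_query.call(id) }
    end
    times.sort[times.size / 2]
  end
end

# Express each variant's median as a multiple of a chosen baseline variant.
def slowdown_ratios(medians, baseline:)
  base = medians.fetch(baseline)
  medians.transform_values { |seconds| seconds / base }
end

# Stand-in callables (assumptions); replace with code that runs each query.
variants = {
  cte: ->(account_id) { account_id * 2 },
  composed_scopes: ->(account_id) { account_id * 2 },
}

medians = median_timings(variants, [1, 2, 3, 4, 5])
puts slowdown_ratios(medians, baseline: :cte).inspect
```

Using the median rather than the mean keeps a single slow outlier (for example, a cold cache on the first account) from dominating the comparison.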
This is a good question. I guess understanding the query plan is the way to go, but even that is specific to what's actually in the database. Also, extracting the query is a bit of a pain depending on how the query is written. I've been toying with populating fake data, but I think we'd need to take shortcuts in generation for it to be usable.
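For the query-plan route, one rough way to get a plan out of PostgreSQL is to wrap the extracted SQL in `EXPLAIN (ANALYZE, BUFFERS)`. The helper below is a hypothetical sketch (`explain_analyze` is not an existing Mastodon method); it only builds the statement, and it assumes the SQL string has already been extracted, for example with `to_sql` on an ActiveRecord relation.

```ruby
# Hypothetical helper: wrap a SQL string so PostgreSQL reports the plan it
# actually executed. Note that ANALYZE really runs the statement.
def explain_analyze(sql)
  statement = sql.strip.chomp(';')
  "EXPLAIN (ANALYZE, BUFFERS) #{statement}"
end

# Illustrative query only; the real recommendation query is more involved.
puts explain_analyze('SELECT id FROM accounts LIMIT 10;')
```

As noted above, the resulting plan depends on the data in the database it is run against, so plans from a development database may not match production.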
One thing I suspect may have been happening with the poorly performing query is that, for the setup of

Before I attempt this -- are we open to repeating the style/compositional improvement of these changes, if we can preserve the current more performant query? (putting aside like, framework sql-quoting style and whatnot). That current WIP branch (that you linked) restores the composed scope approach - but adds an additional join which is not present in this restored/performant version. I suspect that contributes to the ~1.5x perf drop on that branch.
Note that I have not verified whether this performs better than the previous version. I also used subqueries instead of the usual approach of using Account.not_excluded_by_account and Account.not_domain_blocked_by_account because: