-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Ensure that aggregating search shows the post with the higest rank. #10177
Merged
tgxworld
merged 1 commit into
discourse:master
from
tgxworld:ensure_proper_ranking_after_aggregating_search
Jul 14, 2020
Merged
FIX: Ensure that aggregating search shows the post with the higest rank. #10177
tgxworld
merged 1 commit into
discourse:master
from
tgxworld:ensure_proper_ranking_after_aggregating_search
Jul 14, 2020
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
tgxworld
force-pushed
the
ensure_proper_ranking_after_aggregating_search
branch
5 times, most recently
from
July 7, 2020 08:52
15b5b2b
to
9dd29bb
Compare
tgxworld
changed the title
FIX: Ensure that aggregating search shows the right posts.
FIX: Ensure that aggregating search shows the post with the higest rank.
Jul 7, 2020
tgxworld
force-pushed
the
ensure_proper_ranking_after_aggregating_search
branch
from
July 7, 2020 08:58
9dd29bb
to
2e19304
Compare
ZogStriP
approved these changes
Jul 7, 2020
ZogStriP
reviewed
Jul 7, 2020
tgxworld
force-pushed
the
ensure_proper_ranking_after_aggregating_search
branch
2 times, most recently
from
July 8, 2020 03:19
f81683b
to
7245c09
Compare
To be honest this change makes me nervous, a lot of stuff is shuffled, that said overall risk (at least conceptually) is not too high given we already had a MIN in the query to start with. Go ahead and merge, watch it carefully please. |
@SamSaffron One thing I can do is to introduce this behind a hidden site setting and enable it on dev/meta first. |
tgxworld
force-pushed
the
ensure_proper_ranking_after_aggregating_search
branch
2 times, most recently
from
July 14, 2020 05:08
9081623
to
1824952
Compare
Previously, we would only take either the `MIN` or `MAX` for `post_number` during aggregation meaning that the ranking is not considered. ``` require 'benchmark/ips' Benchmark.ips do |x| x.config(time: 10, warmup: 2) x.report("current aggregate search query") do DB.exec <<~SQL SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT topics.id, min(posts.post_number) post_number FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN ( SELECT categories.id WHERE categories.search_priority = 1 ) ) AND ((categories.id IS NULL) OR (NOT categories.read_restricted)) GROUP BY topics.id ORDER BY MAX(( TS_RANK_CD( post_search_data.search_data, TO_TSQUERY('english', '''postgres'':*ABCD'), 1|32 ) * ( CASE categories.search_priority WHEN 2 THEN 0.6 WHEN 3 THEN 0.8 WHEN 4 THEN 1.2 WHEN 5 THEN 1.4 ELSE CASE WHEN topics.closed THEN 0.9 ELSE 1 END END ) ) ) DESC, topics.bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number; SQL end x.report("current aggregate search query with proper ranking") do DB.exec <<~SQL SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT subquery.topic_id id, (ARRAY_AGG(subquery.post_number))[1] post_number, MAX(subquery.rank) rank, MAX(subquery.bumped_at) bumped_at FROM (SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id", ( TS_RANK_CD( post_search_data.search_data, TO_TSQUERY('english', '''postgres'':*ABCD'), 1|32 ) * ( CASE categories.search_priority WHEN 2 THEN 0.6 WHEN 3 THEN 0.8 WHEN 4 THEN 1.2 WHEN 5 THEN 1.4 ELSE CASE WHEN topics.closed THEN 0.9 ELSE 1 END END ) ) rank, topics.bumped_at bumped_at FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN ( SELECT categories.id WHERE categories.search_priority = 1 ) ) AND ((categories.id IS NULL) OR (NOT categories.read_restricted))) subquery GROUP BY subquery.topic_id ORDER BY rank DESC, bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number; SQL end x.compare! end ``` ``` Warming up -------------------------------------- current aggregate search query 1.000 i/100ms current aggregate search query with proper ranking 1.000 i/100ms Calculating ------------------------------------- current aggregate search query 17.726 (± 0.0%) i/s - 178.000 in 10.045107s current aggregate search query with proper ranking 17.802 (± 0.0%) i/s - 178.000 in 10.002230s Comparison: current aggregate search query with proper ranking: 17.8 i/s current aggregate search query: 17.7 i/s - 1.00x (± 0.00) slower ```
tgxworld
force-pushed
the
ensure_proper_ranking_after_aggregating_search
branch
from
July 14, 2020 05:47
1824952
to
d8c796b
Compare
Merged in d8c796b |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously, we would only take either the
MIN
orMAX
forpost_number
during aggregation meaning that the ranking is notconsidered.