Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: do not allow title stuffing to dominate search #21464

Merged
merged 1 commit into from May 10, 2023

Conversation

SamSaffron
Copy link
Member

We were giving topics with repeated words extra weight in search index.
This meant that it was trivial to stuff words into title to dominate in search
given we search for exact title matches first.

The following tweak means that:

invite invited invites
and
invite some stuff

Both rank the same for title searching.

Titles are short and punchy, duplicating words should not give special
weight.

Requires a full reindex to take effect.

We were giving topics with repeated words extra weight in search index.
This meant that it was trivial to stuff words into title to dominate in search
given we search for exact title matches first.

The following tweak means that:

`invite invited invites`
and
`invite some stuff`

Both rank the same for title searching.

Titles are short and punchy, duplicating words should not give special
weight.

Requires a full reindex to take effect.
@discoursebot
Copy link

This pull request has been mentioned on Discourse Meta. There might be relevant details there:

https://meta.discourse.org/t/making-documentation-more-prominent-in-search/263206/10

@SamSaffron SamSaffron merged commit bd32912 into main May 10, 2023
13 checks passed
@SamSaffron SamSaffron deleted the search_by_title_tweak branch May 10, 2023 01:48
enduvar pushed a commit to ForgottenWorld/discourse that referenced this pull request Sep 8, 2023
We were giving topics with repeated words extra weight in search index.
This meant that it was trivial to stuff words into title to dominate in search
given we search for exact title matches first.

The following tweak means that:

`invite invited invites`
and
`invite some stuff`

Both rank the same for title searching.

Titles are short and punchy, duplicating words should not give special
weight.

Requires a full reindex to take effect.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants