Add more accurate hashtag search #11579

Merged
Gargron merged 2 commits into master from feature-improve-tags-search on Aug 18, 2019

Conversation

Gargron (Member) commented Aug 15, 2019

Similar to #11537, use edge n-grams to autocomplete tags. Additional scoring is provided through:

  • Gaussian decay on last_status_at
  • Unique uses over the last 7 days

Only hashtags that have been reviewed and are listable can appear in searches.
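
A minimal sketch of the ideas above, in plain Python rather than the Elasticsearch configuration this PR actually changes. `edge_ngrams` shows which prefixes an edge n-gram index would store for a tag, and `gauss_decay` follows the general Gaussian decay shape (1.0 at the origin, `decay` at a distance of `scale`). The `rank` helper, the gram sizes, the 7-day scale for the decay, and multiplying the two signals together are all assumptions made for illustration; the thread does not say how the PR combines them.

```python
import math


def edge_ngrams(term, min_gram=2, max_gram=15):
    """Return the leading prefixes of `term`, from min_gram to max_gram characters."""
    term = term.lower()
    longest = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, longest + 1)]


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: 1.0 while distance <= offset, `decay` at offset + scale."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = max(0.0, abs(distance) - offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


def rank(query, tags, scale_days=7.0):
    """Hypothetical ranking helper.

    tags: list of (name, days_since_last_status, unique_uses_last_7_days).
    Keeps tags whose edge n-grams contain the query, then orders them by
    (1 + uses) * recency decay. The multiplication is an assumption.
    """
    q = query.lower()
    scored = [
        (name, (1 + uses) * gauss_decay(days, scale_days))
        for name, days, uses in tags
        if q in edge_ngrams(name)
    ]
    return [name for name, _ in sorted(scored, key=lambda s: s[1], reverse=True)]
```

With these assumptions, a tag used heavily a month ago ranks below a tag used moderately today, because the decay term shrinks quickly once the last activity is several multiples of `scale_days` in the past.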

Gargron force-pushed the feature-improve-tags-search branch 10 times, most recently from 2e40455 to 27ee6e3 on August 17, 2019 00:45
Using ElasticSearch to index hashtags with edge n-grams and score
them by usage within the last 7 days since last activity. Only
hashtags that have been reviewed and are listable can appear in
searches, unless they match the query exactly
Gargron force-pushed the feature-improve-tags-search branch from 27ee6e3 to 6b4d326 on August 17, 2019 00:51
mayaeh (Contributor) commented Aug 17, 2019

It may behave strangely if the hashtag contains numbers or characters from languages other than English.
reference video: here
(This video is from my testing server running on this branch.)

Gargron (Member, Author) commented Aug 17, 2019

Okay, I do not want to merge it until I understand why that happens.

Gargron merged commit cc0a55c into master on Aug 18, 2019
Gargron deleted the feature-improve-tags-search branch on August 18, 2019 01:45
mayaeh pushed a commit to mastodon-ja-l10n-team/mastodon that referenced this pull request Aug 18, 2019
* Add more accurate hashtag search

Using ElasticSearch to index hashtags with edge n-grams and score
them by usage within the last 7 days since last activity. Only
hashtags that have been reviewed and are listable can appear in
searches, unless they match the query exactly

* Fix search analyzer dropping non-ascii characters
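
The thread does not state what caused the analyzer to drop non-ASCII characters, so the following is only an illustration of that class of bug, not the fix from this PR. Both function names are hypothetical. The first normalizer keeps only ASCII letters, digits, and underscores, which silently empties or truncates tags written in other scripts; the second uses a Unicode-aware character class.

```python
import re


def ascii_only_normalize(tag):
    """Hypothetical buggy normalizer: strips everything outside [a-z0-9_]."""
    return re.sub(r"[^a-z0-9_]", "", tag.lower())


def unicode_normalize(tag):
    """Hypothetical Unicode-aware normalizer: in Python 3, \\w on str
    patterns matches word characters from any script."""
    return re.sub(r"[^\w]", "", tag.lower())


if __name__ == "__main__":
    for tag in ["日本語2019", "Café", "mastodon"]:
        print(tag, repr(ascii_only_normalize(tag)), repr(unicode_normalize(tag)))
```

A tag that mixes digits with non-English text is a useful test case here, because an ASCII-only filter leaves just the digits behind and the tag then matches unrelated queries.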
hiyuki2578 pushed a commit to ProjectMyosotis/mastodon that referenced this pull request Oct 2, 2019