This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

Investigate use of the BM25 algorithm to search image titles #288

Closed
kgodey opened this issue Apr 27, 2019 · 1 comment
Labels
✨ goal: improvement Improvement to an existing feature 🙅 status: discontinued Not suitable for work as repo is in maintenance 🏷 status: label work required Needs proper labelling before it can be worked on

Comments

@kgodey
Contributor

kgodey commented Apr 27, 2019

The similarity algorithm used to search titles was switched from BM25 to boolean in #281 to avoid ranking repeated words in titles higher.

We should investigate switching back to BM25 and setting the k1 tuning parameter to a low value for the title field only, so that repeated words in a title add little or nothing to its score.

See #281 (review) and BM25 algorithm docs for more info.
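
To illustrate why a low k1 might address the repeated-word problem, here is a minimal sketch of the term-frequency part of the standard BM25 formula. The function name `bm25_tf_weight`, the default field lengths, and the sample k1 values are illustrative assumptions, not values taken from this repository. The settings dictionary at the end assumes an Elasticsearch 7-style backend, which this issue does not state explicitly; the names `title_bm25` and `title` are hypothetical.

```python
def bm25_tf_weight(tf, k1, b=0.75, field_len=5, avg_field_len=5):
    """Term-frequency factor of BM25 for one term (the IDF factor is omitted).

    tf            -- how many times the term occurs in the field
    k1            -- saturation parameter; lower values flatten the curve
    b             -- length-normalization strength
    field_len     -- length of this field in terms (illustrative default)
    avg_field_len -- average field length in the index (illustrative default)
    """
    length_norm = 1 - b + b * (field_len / avg_field_len)
    return tf * (k1 + 1) / (tf + k1 * length_norm)


# Compare a title that repeats a word three times against one that uses it once.
for k1 in (1.2, 0.1, 0.0):
    once = bm25_tf_weight(1, k1)
    thrice = bm25_tf_weight(3, k1)
    print(f"k1={k1}: tf=1 -> {once:.3f}, tf=3 -> {thrice:.3f}")


# Hypothetical per-field configuration, ASSUMING an Elasticsearch 7-style index.
# The similarity name and field name are made up for illustration.
title_similarity_settings = {
    "settings": {
        "index": {
            "similarity": {
                "title_bm25": {"type": "BM25", "k1": 0.1, "b": 0.75}
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {"type": "text", "similarity": "title_bm25"}
        }
    },
}
```

With an average-length field, a single occurrence always gets a weight of 1. Three occurrences get roughly 1.57 at k1 = 1.2, roughly 1.06 at k1 = 0.1, and exactly 1 at k1 = 0, so a low k1 keeps the repeated-word boost small. Unlike the boolean similarity, BM25 with a low k1 still applies the IDF factor and length normalization.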

@kgodey kgodey added this to Features (API) in Backlog Apr 27, 2019
@kgodey kgodey added this to the Backlog milestone Jul 11, 2019
@kgodey kgodey modified the milestones: Backlog, Q3 Sprint 3 Jul 19, 2019
@kgodey kgodey modified the milestones: Q3 Sprint 3, Q3 Sprint 4, Backlog Aug 26, 2019
@annatuma annatuma added this to To Be Prioritized in Backlog Nov 14, 2019
@annatuma annatuma removed this from the Backlog milestone Dec 5, 2019
@annatuma

@aldenstpage I'm putting this in Q2 of the backlog, given that there are other search algorithm improvements scheduled for then. Please evaluate if this is a fit for community contributions and if so, label it accordingly.

@annatuma annatuma moved this from To Be Prioritized to Q2 2020 in Backlog Jan 22, 2020
@annatuma annatuma moved this from Q2 2020 to Q3 2020 in Backlog Jun 12, 2020
@kgodey kgodey moved this from Q3 2020 to tmp in Backlog Aug 13, 2020
@kgodey kgodey added ✨ goal: improvement Improvement to an existing feature and removed enhancement labels Sep 24, 2020
@annatuma annatuma moved this from Q3 2020 to Q4 2020 in Backlog Oct 1, 2020
@cc-open-source-bot cc-open-source-bot added the 🏷 status: label work required Needs proper labelling before it can be worked on label Dec 2, 2020
@kgodey kgodey moved this from Q4 2020 to CC Search in Backlog Dec 2, 2020
@kgodey kgodey added this to [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020
@kgodey kgodey removed this from [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020
@kgodey kgodey added the 🙅 status: discontinued Not suitable for work as repo is in maintenance label Dec 16, 2020
@kgodey kgodey closed this as completed Dec 16, 2020
@kgodey kgodey moved this from CC Search to Done in Backlog Dec 16, 2020
Development

No branches or pull requests

4 participants