Search: Keywords filter should not exclude stopwords #1859

Closed
joachimtingvold opened this issue Dec 31, 2021 · 10 comments
Labels: enhancement (Refactoring, improvement or maintenance task), released (Available in the stable release), ux (Impacts User Experience)

Comments

@joachimtingvold

joachimtingvold commented Dec 31, 2021

What does not work as expected?

After adding meme or memes as the description and keyword for a picture, searching with meme or memes as the query string returns zero results. Searching with keywords:meme or keywords:memes returns a large number of images (it seems to be all images, but I have not verified this). Re-indexing has no effect.

How can we reproduce it?

  1. Add "meme" and/or "memes" as description and keyword(s).
  2. Click on "Search"
  3. Search for meme or memes, and it shows zero results
  4. Search for keywords:meme or keywords:memes, and it shows many results

What behavior do you expect?

It should show images that have meme or memes as description and/or keyword when searching for meme or memes (and not zero results, as it does now).

It should show only images that have meme or memes as a keyword when searching for keywords:meme or keywords:memes (and not "all" images, as it does now).

More observations

Trying to use kkthxbye as the description and keyword, it seems to work as expected when searching for kkthxbye or keywords:kkthxbye (i.e. only the one image shows up in both searches).

It also shows results for substring searches, down to 4 characters. If you search for kkt, it shows no images, and if you search for keywords:kkt, it shows "all" images again.

There seems to be an exception to the "shorter than 4 characters" limit: if an image has a label with fewer characters (for example "cat"), that label is also added as a keyword, and searching for cat and keywords:cat both return only the actual pictures with those labels/keywords. Trying kkk as a manual keyword did not work, but fun works.

Software

PhotoPrism 211215-93b26f19-Linux-x86_64
Chrome version 96.0.4664.110 (Official Build) (x86_64)
macOS 11.6.2

@joachimtingvold added the "bug" (Something isn't working) label on Dec 31, 2021
@joachimtingvold
Author

[Three screenshots attached]

@lastzero
Member

It might be on the full-text stoplist in the txt package. Not sure why the word was added; we didn't create the lists ourselves. In that case, the code works as expected and you just need to remove these words from the list.

@joachimtingvold
Author

joachimtingvold commented Dec 31, 2021

You seem to be correct. Both meme and memes are present in pkg/txt/stopwords.go.

However, that also seems to be directly related to why keywords:<string> searches yield "all" images as the result (if <string> is a word from the stoplist), even though none of the images match that search. Is that also expected behaviour?
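
A minimal Go sketch of why this might happen, assuming the query builder drops stopwords before it builds its conditions. The stopword set, `filterWords`, and `matches` are hypothetical names for illustration and not PhotoPrism's actual code. Once every word has been removed, no condition is left, so every photo matches:

```go
package main

import (
	"fmt"
	"strings"
)

// stopwords is a tiny hypothetical stand-in for the list in pkg/txt/stopwords.go.
var stopwords = map[string]bool{"meme": true, "memes": true, "the": true}

// filterWords lowercases the query, splits it on whitespace, and drops stopwords.
func filterWords(query string) []string {
	var words []string
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if stopwords[w] {
			continue
		}
		words = append(words, w)
	}
	return words
}

// matches reports whether every remaining search word is a prefix of at least
// one photo keyword. With zero remaining words the loop never runs, so the
// function returns true for any photo.
func matches(photoKeywords, words []string) bool {
	for _, w := range words {
		found := false
		for _, k := range photoKeywords {
			if strings.HasPrefix(k, w) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	photo := []string{"beach", "sunset"}

	// "meme" is a stopword here, so no condition remains and the photo matches.
	fmt.Println(matches(photo, filterWords("meme")))

	// "kkthxbye" is kept as a condition, so the unrelated photo is excluded.
	fmt.Println(matches(photo, filterWords("kkthxbye")))
}
```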

@lastzero
Member

We could add code that returns an empty result if all words are stopwords. When there are other filters, they must be ignored.

@joachimtingvold
Author

Yeah, that would make more sense. If you have a large number of photos, it's hard to differentiate between "many photos that actually match the search query" and "all images returned by default due to random stopwords".

@lastzero changed the title from "Bug: Weird search results based on description and keywords" to "Search: Return empty result if query consists of stopwords only" on Dec 31, 2021
@lastzero self-assigned this on Dec 31, 2021
@lastzero added the "enhancement" (Refactoring, improvement or maintenance task) and "ux" (Impacts User Experience) labels and removed the "bug" (Something isn't working) label on Jan 2, 2022
@lastzero changed the title from "Search: Return empty result if query consists of stopwords only" to "Search: Return empty result if keywords contain stopwords only" on Jan 3, 2022
@lastzero added the "please-test" (Ready for acceptance test) label on Jan 3, 2022
@lastzero
Member

lastzero commented Jan 3, 2022

Meme and memes have been removed from the stopwords list. When keywords contain stopwords only, an empty result is returned with an error ("invalid request"). We can add a more helpful message later if needed (needs translation).
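
The behaviour described above could look roughly like the following Go sketch. It is an illustration under stated assumptions, not the actual change: `ErrInvalidRequest`, `keywordConditions`, and the stopword set are hypothetical names. The filter is rejected only when it contained words and all of them were stopwords.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidRequest is a hypothetical error value mirroring the
// "invalid request" message mentioned in the comment.
var ErrInvalidRequest = errors.New("invalid request")

// stopwords is a small hypothetical sample, not the real list.
var stopwords = map[string]bool{"the": true, "and": true, "with": true}

// keywordConditions returns the search words left after stopword removal.
// If the filter had words but none survived, it returns ErrInvalidRequest
// instead of an empty condition list that would match every photo.
func keywordConditions(filter string) ([]string, error) {
	fields := strings.Fields(strings.ToLower(filter))

	var words []string
	for _, w := range fields {
		if !stopwords[w] {
			words = append(words, w)
		}
	}

	if len(fields) > 0 && len(words) == 0 {
		return nil, ErrInvalidRequest
	}

	return words, nil
}

func main() {
	for _, filter := range []string{"the and", "the beach"} {
		words, err := keywordConditions(filter)
		if err != nil {
			fmt.Printf("%q: no results (%v)\n", filter, err)
			continue
		}
		fmt.Printf("%q: search for %v\n", filter, words)
	}
}
```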

A new Development Preview will be available for testing soon.

@lastzero changed the title from "Search: Return empty result if keywords contain stopwords only" to "Search: Display no results if keywords contain only stopwords" on Jan 3, 2022
@joachimtingvold
Author

Thanks. I kinda worked around it by using a different keyword.

Is there a reason why stopwords are applied to specific keywords: searches, btw? That seems somewhat counterintuitive. I can understand stopwords for free-text search, and maybe for specific filters, but not for keywords?

@lastzero
Member

lastzero commented Jan 4, 2022

Correct, it seems it should be possible to search for any string when using the keywords: filter rather than the full-text search (which also indexes keywords). This would require using a different query builder, as the current one filters out stopwords.
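
As a rough Go sketch of that idea, assuming two separate word-splitting helpers: one for the full-text search that removes stopwords, and one for the keywords: filter that keeps every word. `fullTextWords`, `keywordFilterWords`, and the stopword set are hypothetical names, not functions from the PhotoPrism code base.

```go
package main

import (
	"fmt"
	"strings"
)

// stopwords is a small hypothetical sample, not the real list.
var stopwords = map[string]bool{"the": true, "and": true, "with": true}

// fullTextWords splits a full-text query and drops stopwords.
func fullTextWords(query string) []string {
	var words []string
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if !stopwords[w] {
			words = append(words, w)
		}
	}
	return words
}

// keywordFilterWords splits a keywords: filter value without consulting the
// stopword list. It only lowercases the words and removes duplicates.
func keywordFilterWords(filter string) []string {
	seen := make(map[string]bool)
	var words []string
	for _, w := range strings.Fields(strings.ToLower(filter)) {
		if seen[w] {
			continue
		}
		seen[w] = true
		words = append(words, w)
	}
	return words
}

func main() {
	query := "The beach"

	// The full-text path ignores "the"; the keywords path keeps it.
	fmt.Println(fullTextWords(query))
	fmt.Println(keywordFilterWords(query))
}
```

With this split, a keyword that happens to be on the stoplist still produces a search condition when it is used with the keywords: filter.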

The keywords: filter might have been added more recently. Since we were mainly working on other issues, the keywords field and search were changed a few times to do people a "quick" favor, without having time to rethink and rework the whole implementation.

@joachimtingvold
Author

I see. No worries. It's just a minor "inconvenience"; I just have to check the stopword list before I assign any keywords from now on (=

@lastzero changed the title from "Search: Display no results if keywords contain only stopwords" to "Search: Keywords filter should not exclude stopwords" on Jan 5, 2022
@lastzero
Member

lastzero commented Jan 5, 2022

Should be fixed with this 😉

A new Development Preview build will be available for testing soon.

@graciousgrey added the "released" (Available in the stable release) label and removed the "please-test" (Ready for acceptance test) label on Jan 7, 2022
Projects
Status: Release 🌈

3 participants