-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement full text search #9197
base: master
Are you sure you want to change the base?
Conversation
Changes in OpenAPI specification found. Expand to see details.What's Changed
|
5b056ee
to
9563c2c
Compare
a02c137
to
d73052b
Compare
I discovered that |
b5b2d5e
to
357e933
Compare
All good. It's now using a content table and linking through the guid so if the rowid changes it won't matter. |
357e933
to
adec3a2
Compare
Found issues with input sanitization. Back to WIP. |
adec3a2
to
13e3af3
Compare
Should be ready for review. It is now not possible to use the search term in an unsanitized form. |
13e3af3
to
3497914
Compare
3497914
to
e0122a0
Compare
How are the fields in Looking at the code if seems like you cannot search for the year ( But of course this should only work in combination with the title because searching for |
The question is, do you add things like year, tags, etc to the FTS index? The more you add to the index, the more false positives you will get. The backend already supports searching by those so I lean towards no. For example, a FTS title and year search can be done like this: |
No, I don't think those fields have to be indexed, they just have to be included in the search in a smart way.
I think Jellyfin should provide a server side way to search through the API that works in every client. For example, Musicbrainz allows searching in a way similar to what you described through its API: https://musicbrainz.org/doc/Indexed_Search_Syntax There's also IMDb, TheTVDB any many others which have no issues with fuzzy searching combined fields like I think it would be awesome if Jellyfin could also recognize years at the end of a query and search for the combination of search term and year as well as the normal search. Of course both of these features are for a follow-up PR, but I wanted to mention it before this is merged. |
FYI, I'm not sure if my example is align with your PR but, if I search for the TV series "Dark", it does not show up in the results. I can't get it to show up at all. |
Changes
This implements a true full text search using sqlite's built in fts5 extension. Currently, jellyfin is trying to implement full text search by doing some sql trickery, but it doesn't support things like keyword searches. For example, it doesn't support a search like
90 day what now
, which in a FTS would return90 Day Fiancé - What Now! (2017)
.This new implementation supports FTS with three different search types, of which one can be specified by the client: Phrase, Prefix, and Keyword. It defaults to Prefix, which makes it backwards compatible with the existing search.
In addition, this implementation is using the Porter tokenizer, which implements the Porter stemming algorithm. (EDIT: I took out the Porter tokenizer. It can give confusing results.) The fts5 extension provides a rank for each search so the results are returned in rank order.This should require no intervention on the user's part. On startup, the user's FTS index is seeded automatically if their index is empty. Real time FTS index updates are achieved using db triggers.
Two changes were made from the existing search
Tags are not included in the FTS index. It doesn't make sense to include it because a FTS plus tag search can be done as follows:
searchTerm=<search term>&tags=<tags>
.ProviderIds
is in the FTS index. This means a search like the following will return the item of interest (if it exists):searchTerm=Tmdb%3D61575
orsearchTerm=Tmdb+61575