More sorting options #1225

Open
admtech opened this issue Jun 1, 2024 · 3 comments
Labels
good first issue Good for newcomers pg_search Issue related to `pg_search/` priority-2-medium Medium priority issue user-request This issue was directly requested by a user

Comments

@admtech

admtech commented Jun 1, 2024

What
`pg_search` always sorts results by BM25 score. Relevance is usually the first and most important sort order, but our users have asked for other sort options, e.g. by timestamp. That means returning all content that matches the search terms, but ordered by timestamp rather than by relevance.

Why
This does not work well at the moment because there is no way to specify an additional sort field. Even if a TIMESTAMP field (e.g. "stamp") is included in the index, I cannot use it for sorting. I have to fetch every match from the index first and then re-sort with ORDER BY, and performance suffers when many sessions do this.

How
Add the ability to specify a custom sort order. Any field that is defined in the index should be usable as a sort field. Elasticsearch supports this as well.
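To illustrate the request, here is a minimal conceptual sketch in Python. It is not `pg_search` or Tantivy code: the `Doc` type, the `matches` helper, and the sample data are all hypothetical. It only shows the idea of keeping the top `k` matches by an indexed field (`stamp`) while scanning, instead of materializing every match and sorting afterwards.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: int
    text: str
    stamp: int  # stands in for an indexed TIMESTAMP field (hypothetical)


def matches(doc: Doc, term: str) -> bool:
    # Naive whole-word match; a real index would use its inverted index here.
    return term.lower() in doc.text.lower().split()


def top_k_by_stamp(docs, term: str, k: int):
    # Stream over the matches and keep only the k newest,
    # rather than collecting all matches and sorting them.
    hits = (d for d in docs if matches(d, term))
    return heapq.nlargest(k, hits, key=lambda d: d.stamp)


DOCS = [
    Doc(1, "postgres full text search", 100),
    Doc(2, "search with bm25 scoring", 300),
    Doc(3, "unrelated content", 400),
    Doc(4, "faster search sorting", 200),
]

newest = top_k_by_stamp(DOCS, "search", 2)
print([d.doc_id for d in newest])  # [2, 4]
```

Document 3 has the newest timestamp but does not match the term, so it is excluded. The point of the sketch is that only `k` candidates are held at a time, which is what makes sorting inside the index cheaper than a full re-sort with ORDER BY.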

@rebasedming
Collaborator

Looks like the solution is for us to expose Tantivy's order_by_fast_field. https://docs.rs/tantivy/latest/tantivy/collector/struct.TopDocs.html#method.order_by_fast_field

I can take this on over the next few days.

@rebasedming rebasedming self-assigned this Jun 3, 2024
@neilyio neilyio added priority-2-medium Medium priority issue pg_search Issue related to `pg_search/` user-request This issue was directly requested by a user labels Jun 4, 2024
@colexbruhn

> I can take this on over the next few days.

Awesome, I was just looking to see if there was a better way to do this. For my use-case, I'd like to find relevant documents using bm25 and then sort them relative to user ratings. This is nice to have when user queries don't have enough content/context for semantic vector search.
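A rough sketch of this use case, assuming the search step yields `(doc_id, bm25_score, rating)` tuples. The tuple layout, the function name, and the sample values are hypothetical and only illustrate the idea: take the `n` most relevant hits first, then order that short list by user rating.

```python
import heapq


def rerank_by_rating(hits, n: int):
    # Step 1: keep the n hits with the highest BM25 score.
    top = heapq.nlargest(n, hits, key=lambda h: h[1])
    # Step 2: order those n hits by user rating, best first.
    return sorted(top, key=lambda h: h[2], reverse=True)


# (doc_id, bm25_score, rating): illustrative values only
HITS = [
    ("a", 4.2, 3.5),
    ("b", 3.9, 4.8),
    ("c", 1.1, 5.0),
    ("d", 3.0, 4.1),
]

print([h[0] for h in rerank_by_rating(HITS, 3)])  # ['b', 'd', 'a']
```

Document `c` has the best rating but falls outside the top 3 by relevance, so it never reaches the rating sort.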

@philippemnoel philippemnoel added the good first issue Good for newcomers label Jun 25, 2024
@philippemnoel
Collaborator

> > I can take this on over the next few days.
>
> Awesome, I was just looking to see if there was a better way to do this. For my use-case, I'd like to find relevant documents using bm25 and then sort them relative to user ratings. This is nice to have when user queries don't have enough content/context for semantic vector search.

Thanks for sharing. We'll look to prioritize this soon.

5 participants