fix: find_and_filter for inmemory #1642

Merged: jupyterjazz merged 2 commits into main from fix-filter-and-find on Jun 13, 2023

@jupyterjazz (Contributor) commented Jun 12, 2023

#1640

Hybrid search (find + filter) on InMemoryExactNNIndex returned matches in the wrong order, ranking low-similarity (lower-scoring) documents first. This PR fixes it by adding an option to sort matches by score in reverse (descending) order.

# prepare a query
q_doc = MyDoc(embedding=np.random.rand(128), text='query')

query = (
    db.build_query()
    .find(query=q_doc, search_field='embedding')
    .filter(filter_query={'text': {'$exists': True}})
    .build()
)

results = db.execute_query(query)
# Before: results were sorted from worst to best match
# Now: results are sorted correctly, with the best matches first
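
The ordering problem can be reproduced without the library. The following is a minimal sketch, not the helper changed in this PR: `sort_matches` and its arguments are hypothetical names, and the scores are made-up similarities where a higher value means a better match.

```python
from typing import List, Sequence, Tuple


def sort_matches(
    docs: Sequence[str], scores: Sequence[float], reverse_order: bool = False
) -> Tuple[List[str], List[float]]:
    """Sort docs by score (hypothetical stand-in, not DocArray's helper)."""
    if len(docs) != len(scores):
        raise ValueError('docs and scores must have the same length')
    # Pair each doc with its score and sort on the score only.
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=reverse_order)
    return [d for d, _ in ranked], [s for _, s in ranked]


docs = ['a', 'b', 'c']
sims = [0.2, 0.9, 0.5]  # made-up similarity scores: higher is better

# Ascending sort (the default) puts the worst match first.
ascending_docs, _ = sort_matches(docs, sims)
# Descending sort puts the best match first.
descending_docs, _ = sort_matches(docs, sims, reverse_order=True)
```

With similarity scores, only the `reverse_order=True` call yields best-first results, which is the behavior described above for the in-memory index.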

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Comment on lines 31 to 40

Args:
    doc_index: Document index instance.
        Either InMemoryExactNNIndex or HnswDocumentIndex.
    query: Dictionary containing search and filtering configuration.
    reverse_order: Flag indicating whether to sort in descending order. If set to
        False (default), the sorting will be in ascending order.

Returns:
    Sorted documents and their corresponding scores.
Member

This is not the right style for a docstring.

Contributor Author

I got too used to langchain docstrings lol
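
For comparison, here is a sketch of the same docstring in reST field-list style (`:param:` / `:return:`). This assumes that is the convention the reviewer has in mind, which the thread does not spell out. The function name `run_find_and_filter`, its signature, and the summary line are hypothetical; only the parameter descriptions come from the quoted docstring.

```python
from typing import Any, Dict, List, Tuple


def run_find_and_filter(  # hypothetical name and signature, for illustration only
    doc_index: Any, query: Dict[str, Any], reverse_order: bool = False
) -> Tuple[List[Any], List[float]]:
    """Run a combined find and filter query and sort the matches by score.

    :param doc_index: Document index instance, either InMemoryExactNNIndex
        or HnswDocumentIndex.
    :param query: Dictionary containing search and filtering configuration.
    :param reverse_order: Flag indicating whether to sort in descending order.
        If set to False (default), the sorting will be in ascending order.
    :return: Sorted documents and their corresponding scores.
    """
    # Only the docstring layout is illustrated here; the body is not part of the sketch.
    raise NotImplementedError
```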

@samsja (Member) left a comment

Code looks good.

Though I have to say, it is unclear why adding this reverse sorting fixes the original problem. Can you add a comment somewhere in the code to explain it?

FYI: the docstrings are not in the right format.
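
One plausible reading of why the sort direction matters, offered as an assumption rather than something confirmed in this thread: a backend that scores matches by similarity needs a descending sort, while one that reports distances needs an ascending sort. The sketch below uses plain Python with made-up vectors and no DocArray calls.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
candidates = {'near': [0.9, 0.1], 'far': [0.1, 0.9]}  # made-up vectors

similarity = {name: cosine_similarity(query, vec) for name, vec in candidates.items()}
distance = {name: 1.0 - sim for name, sim in similarity.items()}

# Similarity: the best match has the HIGHEST score, so sort descending.
best_by_similarity = sorted(similarity, key=similarity.get, reverse=True)[0]
# Distance: the best match has the LOWEST score, so sort ascending.
best_by_distance = sorted(distance, key=distance.get)[0]
# An ascending sort applied to similarities returns the worst match first.
worst_first = sorted(similarity, key=similarity.get)[0]
```

Both correctly directed sorts pick `'near'`; the ascending sort over similarities picks `'far'`, which matches the worst-first symptom reported in the PR description.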

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
@jupyterjazz requested a review from samsja on June 13, 2023 07:42
@github-actions
📝 Docs are deployed on https://ft-fix-filter-and-find--jina-docs.netlify.app 🎉

@jupyterjazz merged commit f36c621 into main on Jun 13, 2023
21 checks passed
@jupyterjazz deleted the fix-filter-and-find branch on June 13, 2023 08:06