WIP: hybrid search with fastembed #553
Conversation
qdrant_client/qdrant_fastembed.py
(Outdated)

```diff
@@ -89,6 +105,36 @@ def set_model(
         )
         self._embedding_model_name = embedding_model_name

+    def set_sparse_model(
+        self,
+        model_name: Optional[str],
```
`set_model` has `embedding_model_name`, however `set_sparse_model` has `model_name`.

We have no tests for `query_batch`, I'll add them.
* WIP: hybrid search with fastembed
* hybrid queries with fastembed
* test for hybrid
* fix typo
* new: extend hybrid search tests, fix mypy, small refactoring (#554)
* refactor: align model name parameters in setters, update tests
* fix: fix async
* fix: add a good test, fix sparse vectors in query batch
* refactoring: reduce branching, refactor fastembed tests

Co-authored-by: George <george.panchuk@qdrant.tech>
```python
for response in responses:
    for i, scored_point in enumerate(response):
        if scored_point.id in scores:
            scores[scored_point.id] += compute_score(i)
        else:
            point_pile[scored_point.id] = scored_point
            scores[scored_point.id] = compute_score(i)
```
Can we limit the responses that get processed inside the loop? Since we know that we've only returned 10, perhaps we could process some multiple of that?

Here is the scenario I'm trying to avoid: the responses are quite large, say a thousand points each, and we end up running the loop over all of them. Alternatively, we could implement this as a NumPy matrix operation.
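One way to bound the work in the loop, as the comment suggests, is to truncate each response before fusing. This is a hypothetical sketch, not part of the PR: `fuse_truncated`, `max_per_response`, and the `Point` stand-in are all invented names; the scoring matches the `compute_score` formula with `ranking_constant = 2`.

```python
from collections import namedtuple
from typing import Dict, List

Point = namedtuple("Point", ["id"])  # minimal stand-in for models.ScoredPoint


def fuse_truncated(
    responses: List[List[Point]], max_per_response: int = 100
) -> Dict[int, float]:
    """RRF scores over only the top `max_per_response` hits of each response."""
    scores: Dict[int, float] = {}
    for response in responses:
        # slicing bounds the inner loop regardless of how large a response is
        for i, point in enumerate(response[:max_per_response]):
            scores[point.id] = scores.get(point.id, 0.0) + 1.0 / (2 + i)
    return scores
```

Points beyond the cap simply never enter the score table, so the loop cost is `O(len(responses) * max_per_response)` rather than proportional to the full response sizes.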
Could you elaborate on limiting the responses?
We might try rewriting it with NumPy, but I am not sure whether it is actually worth it: we would still need to iterate over all the responses to map the ids to the scores, and then we would need to sum up the scores, etc.
Can we limit the number of responses we evaluate? For loops can be slow when someone passes too large a list.

A NumPy implementation saves some of this compute, fwiw: qdrant/fastembed#165
The implementation in qdrant/fastembed#165 involves three more loops:

```python
# 1
all_items = set(item for rank_list in rank_lists for item, _ in rank_list)
# 2
item_to_index = {item: idx for idx, item in enumerate(all_items)}
# 3
for list_idx, rank_list in enumerate(rank_lists):
    for item, rank in rank_list:
        rank_matrix[item_to_index[item], list_idx] = rank
```
So the question is whether it is actually more efficient. (I don't dispute that it might be faster thanks to the vectorized arithmetic, I am just a little doubtful.)

> Can we limit the number of responses we evaluate? For loops can be slow when someone passes too large a list.

I don't really see how to do it at the moment; we can't sort the responses in less than O(n).
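One place where the full sort can be avoided, when `limit` is much smaller than the number of fused points, is the final ranking step: `heapq.nlargest` selects the top k in O(n log k) instead of O(n log n). A sketch under that assumption (`top_k` is an invented helper, not what the PR implements):

```python
import heapq
from typing import Dict, List, Tuple


def top_k(scores: Dict[str, float], limit: int) -> List[Tuple[str, float]]:
    """Return the `limit` highest-scoring (id, score) pairs without a full sort."""
    # heapq.nlargest keeps a heap of size `limit`: O(n log limit) overall
    return heapq.nlargest(limit, scores.items(), key=lambda item: item[1])
```

This only helps the sort at the end; the score-accumulation loop still has to visit every returned point once.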
The NumPy version is actually slower:
```python
from typing import Dict, List
import random
import time

import numpy as np

from qdrant_client import models


def rrf(rank_lists, alpha=2, default_rank=1000):
    """
    Reciprocal Rank Fusion (RRF) using NumPy for large rank lists.

    :param rank_lists: A list of rank lists. Each rank list should be a list of (item, rank) tuples.
    :param alpha: The parameter alpha used in the RRF formula. Default is 2.
    :param default_rank: The default rank assigned to items not present in a rank list. Default is 1000.
    :return: Sorted list of items based on their RRF scores.
    """
    # Consolidate all unique items from all rank lists
    all_items = set(item for rank_list in rank_lists for item, _ in rank_list)

    # Create a mapping of items to indices
    item_to_index = {item: idx for idx, item in enumerate(all_items)}

    # Initialize a matrix to hold the ranks, filled with the default rank
    rank_matrix = np.full((len(all_items), len(rank_lists)), default_rank)

    # Fill in the actual ranks from the rank lists
    for list_idx, rank_list in enumerate(rank_lists):
        for item, rank in rank_list:
            rank_matrix[item_to_index[item], list_idx] = rank

    # Calculate RRF scores using NumPy operations
    rrf_scores = np.sum(1.0 / (alpha + rank_matrix), axis=1)

    # Sort items based on RRF scores
    sorted_indices = np.argsort(-rrf_scores)  # Negative for descending order

    # Retrieve sorted items
    sorted_items = [
        (list(item_to_index.keys())[idx], rrf_scores[idx]) for idx in sorted_indices
    ]
    return sorted_items


def rank_list(search_result: List[models.ScoredPoint]):
    return [(point.id, rank + 1) for rank, point in enumerate(search_result)]


def reciprocal_rank_fusion(
    responses: List[List[models.ScoredPoint]], limit: int = 1000
) -> List[models.ScoredPoint]:
    def compute_score(pos: int) -> float:
        ranking_constant = (
            2  # the constant mitigates the impact of high rankings by outlier systems
        )
        return 1 / (ranking_constant + pos)

    scores: Dict[models.ExtendedPointId, float] = {}
    point_pile = {}
    for response in responses:
        for i, scored_point in enumerate(response):
            if scored_point.id in scores:
                scores[scored_point.id] += compute_score(i)
            else:
                point_pile[scored_point.id] = scored_point
                scores[scored_point.id] = compute_score(i)

    sorted_scores = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return sorted_scores[:limit]
    # sorted_points = []  # commented out to make the output the same as the numpy version
    # for point_id, score in sorted_scores[:limit]:
    #     point = point_pile[point_id]
    #     point.score = score
    #     sorted_points.append(point)
    # return sorted_points


if __name__ == "__main__":
    num_points = 1000
    ids = list(range(num_points))
    a = [
        models.ScoredPoint(id=ids[i], score=random.random(), version=1)
        for i in range(num_points)
    ]
    random.shuffle(ids)
    b = [
        models.ScoredPoint(id=ids[i], score=random.random(), version=1)
        for i in range(num_points)
    ]
    random.shuffle(ids)
    c = [
        models.ScoredPoint(id=ids[i], score=random.random(), version=1)
        for i in range(num_points)
    ]

    start = time.perf_counter()
    rrf([rank_list(a), rank_list(b), rank_list(c)])
    print('numpy, scored point conversion included', time.perf_counter() - start)

    l_a = rank_list(a)
    l_b = rank_list(b)
    l_c = rank_list(c)
    start = time.perf_counter()
    rrf([l_a, l_b, l_c])
    print('numpy, scored point conversion excluded', time.perf_counter() - start)

    start = time.perf_counter()
    reciprocal_rank_fusion([a, b, c], limit=len(a))
    print('lists', time.perf_counter() - start)
```

```
numpy, scored point conversion included 0.004260916990460828
numpy, scored point conversion excluded 0.0036911249917466193
lists 0.0007186669972725213
```
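As a worked check of the score formula used by `reciprocal_rank_fusion` above (ranking constant 2, 0-based positions), here is the fused score of a point ranked first in two response lists and third in a third list:

```python
def compute_score(pos: int, ranking_constant: int = 2) -> float:
    # reciprocal rank, damped by a small constant to soften outlier rankings
    return 1 / (ranking_constant + pos)


# first (position 0) in two lists, third (position 2) in one:
fused = compute_score(0) + compute_score(0) + compute_score(2)  # 0.5 + 0.5 + 0.25 = 1.25
```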
All Submissions:

* Did you create your branch from `dev`?

New Feature Submissions:

* Did you install `pre-commit` with `pip3 install pre-commit` and set up hooks with `pre-commit install`?

Changes to Core Features: