refactor: QdrantRM #979

Anush008 · 2024-05-06T07:23:46Z

Description

This refactor allows users to query Qdrant with any implementation of BaseSentenceVectorizer, the default being the new FastEmbedVectorizer.
The field containing the document content in the Qdrant payload can be specified. It doesn't necessarily have to be "document" anymore.
Qdrant supports multiple named vectors, and this update allows specifying one for retrieval. If none is specified, it defaults to the first vector found.

The current implementation (before this PR) relies on qdrant_client's query_batch() abstraction, which uses FastEmbed internally.
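
To illustrate the three options described above, here is a minimal, self-contained sketch. It is not the code in this PR: `ToyRetriever`, `toy_vectorizer`, and the in-memory `points` list are hypothetical stand-ins for QdrantRM, a vectorizer implementation, and a Qdrant collection. Only the parameter roles come from the description: a pluggable vectorizer, a configurable payload field, and an optional vector name that falls back to the first one found.

```python
from typing import Callable, List, Optional

Vector = List[float]


def toy_vectorizer(texts: List[str]) -> List[Vector]:
    """Hypothetical vectorizer: counts of 'a' and 'b' in each text."""
    return [[float(t.count("a")), float(t.count("b"))] for t in texts]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


class ToyRetriever:
    """In-memory illustration only; not QdrantRM and not the qdrant_client API."""

    def __init__(
        self,
        points: List[dict],
        vectorizer: Callable[[List[str]], List[Vector]] = toy_vectorizer,
        document_field: str = "document",
        vector_name: Optional[str] = None,
        k: int = 3,
    ):
        self.points = points
        self.vectorizer = vectorizer
        self.document_field = document_field
        # Assumption: fall back to the first named vector on the first point.
        self.vector_name = vector_name or next(iter(points[0]["vectors"]))
        self.k = k

    def __call__(self, query: str) -> List[str]:
        query_vector = self.vectorizer([query])[0]
        # Rank stored points by similarity to the query, best first.
        ranked = sorted(
            self.points,
            key=lambda p: dot(query_vector, p["vectors"][self.vector_name]),
            reverse=True,
        )
        # Read the text from the configured payload field.
        return [p["payload"][self.document_field] for p in ranked[: self.k]]


points = [
    {"vectors": {"text": [3.0, 0.0]}, "payload": {"body": "aaa"}},
    {"vectors": {"text": [0.0, 3.0]}, "payload": {"body": "bbb"}},
]
retriever = ToyRetriever(points, document_field="body", k=1)
top = retriever("banana")
```

Because the defaults mirror the old behaviour (a "document" field and an automatically chosen vector), callers that pass none of the new arguments would be unaffected in this sketch.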

Breaking?

The default values of the new parameters ensure backward compatibility.

Ankush-Chander · 2024-05-06T08:23:42Z

Hi @Anush008
Will this refactor also support SparseVectors?

Anush008 · 2024-05-06T08:26:20Z

Not this one.

If DSPy had an interface for sparse vector providers, perhaps in BaseSentenceVectorizer, we could support it in QdrantRM.

Ankush-Chander · 2024-05-06T08:41:40Z

I was thinking along these lines. Please correct me if I am missing something.

qdrant_client.search expects the same query_vector parameter for dense as well as sparse search. It can be

query_vector=models.NamedSparseVector(
    name="text",
    vector=models.SparseVector(
        indices=[1, 7],
        values=[2.0, 1.0],
    ),
),

for sparse embedding.

and

query_vector=[0.2, 0.1, 0.9, 0.7],

for dense embedding.

So if we generalize the vectorizer function and let the user return a vector consistent with the collection, it should work for dense, sparse, and sparse-embed vectors alike, without DSPy intervention.
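
The idea above can be sketched without any Qdrant dependency. In the snippet below, `SparseQuery`, `score`, and the sample stored vectors are hypothetical. They only mimic the shape of the dense list and the indices/values pair shown earlier, and are not qdrant_client types.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SparseQuery:
    """Hypothetical stand-in for a named sparse vector (indices plus values)."""
    name: str
    indices: List[int]
    values: List[float]


DenseQuery = List[float]
QueryVector = Union[DenseQuery, SparseQuery]


def score(query_vector: QueryVector, stored) -> float:
    """Dot product that accepts whichever vector kind the vectorizer returned."""
    if isinstance(query_vector, SparseQuery):
        # Sparse case: `stored` is a dict mapping dimension index -> weight.
        return sum(
            value * stored.get(index, 0.0)
            for index, value in zip(query_vector.indices, query_vector.values)
        )
    # Dense case: `stored` is a list of floats with the same length as the query.
    return sum(q * s for q, s in zip(query_vector, stored))


sparse_score = score(
    SparseQuery(name="text", indices=[1, 7], values=[2.0, 1.0]),
    {1: 0.5, 3: 4.0, 7: 2.0},
)
dense_score = score([0.2, 0.1, 0.9, 0.7], [1.0, 0.0, 1.0, 0.0])
```

The caller never needs to know which kind of vector it is handling; the type returned by the user-supplied vectorizer decides the path.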

Anush008 · 2024-05-06T08:48:07Z

vectorizer is of type BaseSentenceVectorizer, which has abstract methods to generate dense vectors.

If it can support sparse vectors, we can have them here.
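
As a rough illustration of what such support could look like, the sketch below defines a hypothetical `BaseSparseVectorizer` with one abstract method. Neither that class nor `TermCountVectorizer` exists in DSPy; the names, the return type, and the toy vocabulary are assumptions made for this example.

```python
import abc
from typing import Dict, List, Tuple

SparseVector = Tuple[List[int], List[float]]  # (indices, values)


class BaseSparseVectorizer(abc.ABC):
    """Hypothetical interface; DSPy does not define this class."""

    @abc.abstractmethod
    def __call__(self, texts: List[str]) -> List[SparseVector]:
        """Return one (indices, values) pair per input text."""


class TermCountVectorizer(BaseSparseVectorizer):
    """Toy implementation: term counts over a fixed vocabulary."""

    def __init__(self, vocabulary: Dict[str, int]):
        self.vocabulary = vocabulary

    def __call__(self, texts: List[str]) -> List[SparseVector]:
        vectors = []
        for text in texts:
            counts: Dict[int, float] = {}
            for token in text.lower().split():
                index = self.vocabulary.get(token)
                if index is not None:  # skip out-of-vocabulary tokens
                    counts[index] = counts.get(index, 0.0) + 1.0
            indices = sorted(counts)
            vectors.append((indices, [counts[i] for i in indices]))
        return vectors


vectorizer = TermCountVectorizer({"qdrant": 1, "vector": 7})
sparse = vectorizer(["Qdrant vector vector search"])
```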

Anush008 · 2024-05-15T18:10:36Z

Hey @arnavsinghvi11. Just bumping this PR. Please take a look when possible.

arnavsinghvi11 · 2024-05-15T20:48:51Z

LGTM. Thanks @Anush008!

refactor: QdrantRM

f8f7203

Anush008 force-pushed the refactor-qdrant branch from ac5fad4 to f8f7203 Compare May 6, 2024 07:30

Anush008 added 3 commits May 8, 2024 08:42

chore: Simplified return qdrant_rm.py

33949d0

Merge branch 'stanfordnlp:main' into refactor-qdrant

73146ee

Merge branch 'stanfordnlp:main' into refactor-qdrant

0c17657

arnavsinghvi11 merged commit 0e595a7 into stanfordnlp:main May 15, 2024

Anush008 deleted the refactor-qdrant branch May 16, 2024 15:41
