Fix error querying PineconeVectorStore using sparse query mode #12967
@@ -433,7 +433,7 @@ def query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResul
 "values": [v * (1 - query.alpha) for v in sparse_vector["values"]],
 }

-query_embedding = None
+query_embedding = [0.0] * len(query.query_embedding)
This is specifically to fix the case where query_mode is sparse, right?
Won't providing a vector of zeros still kind of influence the search result? If we are in sparse mode, should we just query without the vector kwarg? Or should we set alpha so that the dense zero vector is ignored completely?
yup
The issue is that if we query without the vector kwarg, the request fails unless we pass a record id instead. In their hybrid search tutorial, Pinecone says to scale the dense vector by alpha (in this case zero) before querying, so using a vector of zeros should be fine here (it's overwritten later for the other query modes anyway): https://www.pinecone.io/learn/hybrid-search-intro/
Ah great, that works for me!
Ok wait, one more worry haha -- we should ensure that the query embedding is not None before doing this
true, added a check
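For readers following the thread, here is a minimal sketch of the alpha scaling the reply refers to: the dense vector is weighted by alpha and the sparse values by (1 - alpha), matching the `v * (1 - query.alpha)` expression in the diff context. The function name `scale_for_hybrid` and the plain dict layout for the sparse vector are assumptions made for illustration, not code from this PR or from the Pinecone client.

```python
from typing import Dict, List, Tuple, Union

# Assumed sparse layout for illustration: {"indices": [...], "values": [...]}
SparseVector = Dict[str, List[Union[int, float]]]


def scale_for_hybrid(
    dense: List[float], sparse: SparseVector, alpha: float
) -> Tuple[List[float], SparseVector]:
    """Weight dense by alpha and sparse by (1 - alpha). Hypothetical helper."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


# With alpha = 0 the dense part collapses to zeros and the sparse part is unchanged.
dense, sparse = scale_for_hybrid(
    [0.5, -0.25], {"indices": [3, 7], "values": [1.0, 2.0]}, alpha=0.0
)
```

With alpha set to zero, every dense component is zero, so under a dot-product metric the dense part adds nothing to the score. That is why an all-zero vector is arguably the same thing as "sparse only" rather than a competing signal.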
…lama#12967) * fix sparse query pinecone * none checking
Description
Pinecone always expects a vector (or technically also a record id) when querying. When using sparse query mode, we were setting the vector to None and making the request with that, resulting in an error. Changed to make the request with a vector filled with zeroes instead.
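The change described above, together with the None check requested in review, could look roughly like the sketch below. The helper name `zero_dense_vector` and the `fallback_dimension` parameter are hypothetical; the exact check that was merged is not shown on this page, so treat this as one possible shape rather than the actual implementation.

```python
from typing import List, Optional


def zero_dense_vector(
    query_embedding: Optional[List[float]],
    fallback_dimension: Optional[int] = None,
) -> List[float]:
    """Build an all-zero dense vector for a sparse-only query. Hypothetical helper."""
    if query_embedding is not None:
        # Match the dimension of the embedding the caller supplied.
        return [0.0] * len(query_embedding)
    if fallback_dimension is not None:
        # Assumption: the caller knows the index dimension some other way.
        return [0.0] * fallback_dimension
    # Without either, len(None) would raise a TypeError; fail with a clear message.
    raise ValueError("need a query embedding or an explicit dimension")


placeholder = zero_dense_vector([0.12, -0.4, 0.9])
```

Raising early keeps the failure local and readable, instead of surfacing as a `TypeError` from `len(None)` or as a rejected request further down.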