TypeError: unsupported operand type(s) for -: 'int' and 'simsimd.DistancesTensor' #19905

Open
5 tasks done
amitjoy opened this issue Apr 2, 2024 · 3 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: google Primarily related to Google GenAI or VertexAI integrations stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed Ɑ: vector store Related to vector store module

Comments


amitjoy commented Apr 2, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

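from langchain_community.vectorstores.pgvector import PGVector
from langchain_google_vertexai import VertexAIEmbeddings

# gcp_auth is defined elsewhere in the reporter's application.
_embedding = VertexAIEmbeddings(
    project="PROJECT_ID",
    credentials=gcp_auth.credentials,
    model_name="textembedding-gecko@003",
    location="PROJECT_LOCATION",
)

_db = PGVector(
    connection_string="CONN",
    collection_name="COLLECTION_NAME",
    embedding_function=_embedding,
    use_jsonb=True,
)

# Raises the TypeError shown in the stack trace below.
_db.search(query="test", search_type="mmr")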

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/amit/telly/telly-backend/backend/agent/healthcheck/components/hc_vector_db.py", line 29, in check_health
    db.search(query="test", search_type=self.settings.vector_db.retriever.type)
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_core/vectorstores.py", line 160, in search
    return self.max_marginal_relevance_search(query, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 1236, in max_marginal_relevance_search
    return self.max_marginal_relevance_search_by_vector(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 1314, in max_marginal_relevance_search_by_vector
    docs_and_scores = self.max_marginal_relevance_search_with_score_by_vector(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 1196, in max_marginal_relevance_search_with_score_by_vector
    mmr_selected = maximal_marginal_relevance(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_community/vectorstores/utils.py", line 34, in maximal_marginal_relevance
    similarity_to_query = cosine_similarity(query_embedding, embedding_list)[0]
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/amit/telly/telly-backend/backend/venv/lib/python3.12/site-packages/langchain_community/utils/math.py", line 29, in cosine_similarity
    Z = 1 - simd.cdist(X, Y, metric="cosine")
        ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for -: 'int' and 'simsimd.DistancesTensor'
           INFO     Cannot execute Healthcheck: hc-vector-db  hc_vector_db.py:33

Description

Library Versions Used:

  • langchain~=0.1.14
  • langchain-google-vertexai~=0.1.2
  • simsimd~=4.2.2

System Info

NA
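
The failing expression in the traceback is `1 - simd.cdist(...)`. Python first tries `int.__sub__`, then the right operand's reflected `__rsub__`; when neither handles the operation, it raises this TypeError. A minimal sketch of that mechanism, using a hypothetical stand-in class `OpaqueTensor` (an assumption for illustration, not simsimd's actual type):

```python
class OpaqueTensor:
    """Hypothetical stand-in for a result type that defines no arithmetic."""

    def __init__(self, values):
        self.values = list(values)


def try_subtract(tensor):
    """Attempt `1 - tensor`; return the error text, or None if it succeeds."""
    try:
        1 - tensor
    except TypeError as exc:
        return str(exc)
    return None


message = try_subtract(OpaqueTensor([0.0, 0.5]))
print(message)
```

The printed message has the same shape as the one in the stack trace, with the stand-in class name in place of `simsimd.DistancesTensor`.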

@dosubot dosubot bot added Ɑ: vector store Related to vector store module 🔌: google Primarily related to Google GenAI or VertexAI integrations 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Apr 2, 2024
@liugddx
Copy link
Contributor

liugddx commented Apr 3, 2024

Let me see

@rachel-pai

I also encountered a similar problem with LaserEmbeddings embeddings :(

@rachel-pai

Installing simsimd==3.7.7 works for me, following andy-tmpt-me's answer in #18022.
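
To tell whether the pin above is relevant to a given environment, the installed simsimd version can be compared against 3.7.7. This is only a sketch: the helper names are hypothetical, and the 3.7.7 threshold is taken from the comments in this thread rather than from simsimd's changelog.

```python
import re
from importlib import metadata

# Last version reported as working in this thread (an assumption, not verified).
LAST_KNOWN_GOOD = (3, 7, 7)


def parse_version(text):
    """Return (major, minor, patch) from a version string, or None if unparseable."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(group) for group in match.groups()) if match else None


def is_newer_than_known_good(text):
    """True if the version string is strictly newer than LAST_KNOWN_GOOD."""
    parsed = parse_version(text)
    return parsed is not None and parsed > LAST_KNOWN_GOOD


def installed_simsimd_version():
    """Return the installed simsimd version string, or None if not installed."""
    try:
        return metadata.version("simsimd")
    except metadata.PackageNotFoundError:
        return None


version = installed_simsimd_version()
if version is None:
    print("simsimd is not installed")
elif is_newer_than_known_good(version):
    print(f"simsimd {version} is newer than 3.7.7; the error above may apply")
else:
    print(f"simsimd {version} is at or below 3.7.7")
```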

baskaryan added a commit that referenced this issue Jun 4, 2024
…re while using simsimd beyond v3.7.7 (#22271)

- [ ] **Packages affected**:
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
  - partners/milvus: fix `cosine_similarity` to support simsimd beyond 3.7.7
  - partners/mongodb: fix `cosine_similarity` to support simsimd beyond 3.7.7
  - partners/pinecone: fix `cosine_similarity` to support simsimd beyond 3.7.7
  - partners/qdrant: fix `cosine_similarity` to support simsimd beyond 3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
    - **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (screenshot attached). It looks like a variant of
#18022.
Up to 3.7.7, simd.cdist returned an ndarray, but it now returns a
simsimd.DistancesTensor, which cannot take part in a broadcast operation
with numpy. This change also removes the need to explicitly cast
`Z` to a numpy array.
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**:
1. I started with community, but since Milvus, MongoDB, Pinecone, and
Qdrant needed similar changes, I modified their files as well.
If touching multiple packages in one PR is not the norm, I can
remove them from this PR and raise separate ones.
2. I have run the tests and verified that they pass. Since only MongoDB
had tests, I ran theirs and verified that they pass as well. Screenshots
attached:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. It may not fit well with the
CI/CD setup, since simsimd is not a required dependency. I
have just imported simsimd to ensure the simsimd cosine similarity path is
invoked. However, it's not a good approach. Suggestions are welcome, and I
can make the required changes on the PR. Please provide guidance, as I
am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
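
The approach described in the commit message above can be sketched as follows: convert the value returned by `simd.cdist` to a numpy array before subtracting it from 1. This is a simplified illustration under stated assumptions, not the merged diff. It assumes 2-D inputs, requires numpy, and falls back to a plain numpy computation when simsimd is not installed.

```python
import numpy as np


def cosine_similarity(X, Y):
    """Row-wise cosine similarity between two 2-D collections of vectors."""
    X = np.asarray(X, dtype=np.float32)
    Y = np.asarray(Y, dtype=np.float32)
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must have the same number of columns")

    try:
        import simsimd as simd
    except ImportError:
        # Fallback without simsimd: dot products divided by the norm products.
        norms = np.outer(np.linalg.norm(X, axis=1), np.linalg.norm(Y, axis=1))
        with np.errstate(divide="ignore", invalid="ignore"):
            similarity = (X @ Y.T) / norms
        # Zero-length vectors produce NaN or inf; report them as 0 similarity.
        return np.nan_to_num(similarity, nan=0.0, posinf=0.0, neginf=0.0)

    # Wrap the result in np.array so the subtraction is done by numpy,
    # whatever type simd.cdist returns.
    return 1 - np.array(simd.cdist(X, Y, metric="cosine"))


sim = cosine_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]])
print(sim)
```

Wrapping the result in `np.array` is a no-op when `cdist` already returns an ndarray, so the same line should work across the simsimd versions discussed here.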
hinthornw pushed a commit that referenced this issue Jun 20, 2024
…re while using simsimd beyond v3.7.7 (#22271)
@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jul 9, 2024