Seeking Assistance: Incompatibility of Vector Index Model and Embedding Function Dimensions in Neo4j #13387

Open
KaifAhmad1 opened this issue Jan 21, 2024 · 1 comment


Example Code

# Create embeddings for the sentences and store them in the graph database
import os

from langchain_community.embeddings import HuggingFaceBgeEmbeddings

model_name = "BAAI/bge-base-en-v1.5"
model_kwargs = {"device": "cpu"}
encode_kwargs = {"normalize_embeddings": True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs
)
from langchain.graphs import Neo4jGraph

graph = Neo4jGraph(
    url=os.environ["NEO4J_URI"],
    username=os.environ["NEO4J_USERNAME"],
    password=os.environ["NEO4J_PASSWORD"]
)
from neo4j import GraphDatabase

uri = os.environ["NEO4J_URI"]
username = os.environ["NEO4J_USERNAME"]
password = os.environ["NEO4J_PASSWORD"]
driver = GraphDatabase.driver(uri, auth=(username, password))
session = driver.session()

result = session.run("SHOW VECTOR INDEXES")

for record in result:
    print(record)

from langchain_community.vectorstores import Neo4jVector

# Instantiate Neo4j vector from documents
# (`documents` is assumed to be a list of Document objects loaded earlier)
neo4j_vector = Neo4jVector.from_documents(
    documents,
    HuggingFaceBgeEmbeddings(),
    name="graph_qa_index",
    url=os.environ["NEO4J_URI"],
    username=os.environ["NEO4J_USERNAME"],
    password=os.environ["NEO4J_PASSWORD"]
)

Description / Actual Behaviour

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-b09e1b2ff4ef> in <cell line: 2>()
      1 # Instantiate Neo4j vector from documents
----> 2 neo4j_vector = Neo4jVector.from_documents(
      3     documents,
      4     HuggingFaceBgeEmbeddings(),
      5      url=os.environ["NEO4J_URI"],

2 frames
/usr/local/lib/python3.10/dist-packages/langchain_community/vectorstores/neo4j_vector.py in __from(cls, texts, embeddings, embedding, metadatas, ids, create_id_index, search_type, **kwargs)
    445         # If the index already exists, check if embedding dimensions match
    446         elif not store.embedding_dimension == embedding_dimension:
--> 447             raise ValueError(
    448                 f"Index with name {store.index_name} already exists."
    449                 "The provided embedding function and vector index "

ValueError: Index with name vector already exists.The provided embedding function and vector index dimensions do not match.
Embedding function dimension: 1024
Vector index dimension: 768

The embedding model configured for HuggingFaceBgeEmbeddings is BAAI/bge-base-en-v1.5, which produces 768-dimensional embeddings. That matches the vector index dimension of 768. Even so, running the code above raises a dimension mismatch error that reports an embedding function dimension of 1024.
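
The check that fails in the traceback compares the length of a vector produced by the embedding function against the dimension stored on the existing index. Below is a minimal, self-contained sketch of that comparison. The helper names and the two stand-in embedding functions are hypothetical; they only mimic models that return 768 and 1024 values and do not call LangChain or Neo4j. One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the `Neo4jVector.from_documents` call above constructs a fresh `HuggingFaceBgeEmbeddings()` with no arguments instead of reusing the configured `embeddings` object, so the model that actually runs may not be BAAI/bge-base-en-v1.5.

```python
from typing import Callable, List

# Type alias for anything that turns a string into a vector.
EmbedFn = Callable[[str], List[float]]


def embedding_dimension(embed_query: EmbedFn) -> int:
    """Return the length of the vector produced for a probe string."""
    return len(embed_query("foo"))


def check_index_compatibility(embed_query: EmbedFn, index_dimension: int) -> None:
    """Raise ValueError if the embedding size differs from the index size."""
    dim = embedding_dimension(embed_query)
    if dim != index_dimension:
        raise ValueError(
            f"Embedding function dimension: {dim}, "
            f"vector index dimension: {index_dimension}"
        )


# Hypothetical stand-ins for real embedding models (assumption: sizes only).
def fake_base_model(text: str) -> List[float]:
    return [0.0] * 768


def fake_large_model(text: str) -> List[float]:
    return [0.0] * 1024


# A 768-dimensional embedder is compatible with a 768-dimensional index.
check_index_compatibility(fake_base_model, 768)

# A 1024-dimensional embedder is rejected by the same index.
try:
    check_index_compatibility(fake_large_model, 768)
except ValueError as exc:
    print(exc)
```

Running the same kind of probe against the real embedding object (for example `len(embeddings.embed_query("foo"))`) before calling `from_documents` would show which dimension is actually being produced. Passing the already configured `embeddings` object, or writing to a different index, are two things one might try; neither is verified here.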

System Info

Python version: 3.10.10
Operating System: Windows 11
pip == 23.3.1
langchain == 0.1.0
transformers == 4.36.2
sentence_transformers == 2.2.2
unstructured == 0.12.0

Expected Behaviour

The code should run and store the document embeddings in the Neo4j vector store without raising an error.

KaifAhmad1 added the bug label Jan 21, 2024

klaren (Member) commented Jan 22, 2024

How was the index created?

If it was created by LangChain (https://github.com/langchain-ai/langchain), I think the issue is on their side.
