# Vector & Relationships

## Installation

This notebook requires the following dependencies:

In [None]:
%pip install neo4j-graphrag python-dotenv

## Connecting to Neo4j

The following cell creates the Neo4j Python Driver instance that the retrievers use to connect to the database. Connection details are read from the environment variables set in your `.env` file, with local defaults as a fallback.


In [None]:
%load_ext dotenv
%dotenv

from os import getenv

NEO4J_URL = getenv("NEO4J_URI") or "neo4j://localhost:7687"
NEO4J_USERNAME = getenv("NEO4J_USERNAME") or "neo4j"
NEO4J_PASSWORD = getenv("NEO4J_PASSWORD") or "neoneoneo"
NEO4J_DATABASE = getenv("NEO4J_DATABASE") or "neo4j"

from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    NEO4J_URL,
    auth=(NEO4J_USERNAME, NEO4J_PASSWORD)
)

driver.verify_connectivity()  # Raises an exception if the connection is not successful
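The `getenv(...) or default` pattern above falls back to the default when a variable is unset and also when it is set to an empty string, which differs from passing the default as the second argument to `getenv`. A minimal sketch of the difference, assuming a hypothetical helper `env_or_default` and a made-up variable name `EXAMPLE_NEO4J_URI` (neither is part of this notebook):

```python
import os


def env_or_default(name: str, default: str) -> str:
    """Return the variable's value, or `default` if it is unset or empty."""
    return os.getenv(name) or default


# Hypothetical variable name, used only for this illustration.
os.environ["EXAMPLE_NEO4J_URI"] = ""

# `or` treats the empty string as missing and falls back to the default.
with_or = env_or_default("EXAMPLE_NEO4J_URI", "neo4j://localhost:7687")

# The two-argument form returns the empty string because the variable is set.
with_default_arg = os.getenv("EXAMPLE_NEO4J_URI", "neo4j://localhost:7687")

print(repr(with_or))
print(repr(with_default_arg))
```

The `or` form is the more forgiving choice when a `.env` file contains a key with no value.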


## Plain vector search

A vector index called `chunkEmbeddings` already exists in the database. You can [create your own using the `create_vector_index` function](https://github.com/neo4j/neo4j-graphrag-python?tab=readme-ov-file#creating-a-vector-index) or [populate an existing index using the `upsert_vectors` function](https://github.com/neo4j/neo4j-graphrag-python?tab=readme-ov-file#populating-a-vector-index).

In [None]:
INDEX_NAME = "chunkEmbeddings"

from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorRetriever

# Create an Embedder object
embedder = OpenAIEmbeddings()

# Initialize the retriever
retriever = VectorRetriever(
    driver,
    neo4j_database=NEO4J_DATABASE,
    index_name=INDEX_NAME,
    embedder=embedder
)

# Instantiate the LLM
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
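To make the idea behind a vector lookup concrete, here is a small sketch that ranks toy chunks by cosine similarity to a query vector. It is an illustration only: the three-dimensional vectors and chunk texts are made up, cosine similarity is an assumption about how the index is configured, and a real vector index typically uses an approximate nearest-neighbour structure rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, chunks, k):
    """Return the k (score, text) pairs with the highest similarity."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up embeddings; real embeddings have hundreds or thousands of dimensions.
toy_chunks = [
    ("supply chain disruption", [0.9, 0.1, 0.0]),
    ("foreign exchange exposure", [0.1, 0.9, 0.0]),
    ("new product launch", [0.0, 0.1, 0.9]),
]
toy_query = [0.8, 0.2, 0.0]

results = top_k(toy_query, toy_chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The `top_k` argument plays the same role as the `top_k` value passed in `retriever_config` when searching.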

The `GraphRAG` class creates a retrieval pipeline that accepts a user input, uses a retriever to fetch the context, and uses an LLM to generate an answer.

In [None]:
from neo4j_graphrag.generation import GraphRAG

# Instantiate the RAG pipeline
rag = GraphRAG(
    retriever=retriever,
    llm=llm
)

# Query the graph
query = "What are the top risk factors that Apple faces? Cite your sources."

vector_response = rag.search(query_text=query, return_context=True, retriever_config={"top_k": 5})

print(vector_response.answer)
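The retrieve-then-generate flow can be sketched in a few lines. This is not the `GraphRAG` implementation; `simple_rag`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins that only show the order of the steps: fetch context, build a prompt, ask the model.

```python
from typing import Callable, List


def simple_rag(
    query_text: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 5,
) -> dict:
    """Fetch context for the query, build a prompt, and generate an answer."""
    context_items = retrieve(query_text, top_k)
    context = "\n".join(context_items)
    prompt = f"Context:\n{context}\n\nQuestion: {query_text}\nAnswer:"
    return {"answer": generate(prompt), "context": context_items}


def fake_retrieve(query_text: str, top_k: int) -> List[str]:
    """Stand-in retriever: returns the first top_k items of a fixed corpus."""
    corpus = [
        "chunk about supply chains",
        "chunk about competition",
        "chunk about regulation",
    ]
    return corpus[:top_k]


def fake_generate(prompt: str) -> str:
    """Stand-in LLM: reports the prompt size instead of calling a model."""
    return f"(answer based on a prompt of {len(prompt)} characters)"


result = simple_rag(
    "What risks does the company face?", fake_retrieve, fake_generate, top_k=2
)
print(result["answer"])
print(result["context"])
```

Returning the context alongside the answer mirrors what `return_context=True` is used for above: it lets you inspect what the model was given.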

## Adding context via relationships

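The pipeline above answers from text chunks alone, so the response tends to be generic and can vary between runs. Extending retrieval with relationships grounds the answer in facts stored in the knowledge graph, which makes it more specific and repeatable. The `VectorCypherRetriever` class does this by running a Cypher retrieval query that traverses the graph from each node matched by the vector search.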

In [None]:
from neo4j_graphrag.retrievers import VectorCypherRetriever

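# --- VectorCypherRetriever Example: Detailed Search with Context
# `node` is the chunk matched by the vector search; the query walks from it to
# its document, the company that filed it, and the risks that company faces.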
detail_context_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)-[:FILED]-(company:Company)-[:FACES_RISK]->(risk:RiskFactor)
RETURN company.name AS company, collect(DISTINCT risk.name) AS risks, node.text AS context
"""

vector_cypher_retriever = VectorCypherRetriever(
    driver=driver,
    neo4j_database=NEO4J_DATABASE,  # same database as the plain vector retriever
    index_name=INDEX_NAME,
    embedder=embedder,
    retrieval_query=detail_context_query
)

rag = GraphRAG(retriever=vector_cypher_retriever, llm=llm)

query = "What are the top risk factors that Apple faces? Cite your sources."

vector_cypher_response = rag.search(query_text=query, return_context=True, retriever_config={"top_k": 5})

In [None]:
print(vector_cypher_response.answer)


In [None]:
for item in vector_cypher_response.retriever_result.items:
    print(item.content)
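The effect of the retrieval query can be mimicked with plain dictionaries: start from the chunks returned by the vector search, follow the links to document and company, and attach that company's risks. Everything in this sketch (identifiers, company names, risks) is made up for illustration and does not come from the database.

```python
# Toy graph stored as plain dictionaries; every name and value is made up.
chunk_to_document = {"chunk-1": "doc-A", "chunk-2": "doc-B"}
document_to_company = {"doc-A": "Company X", "doc-B": "Company Y"}
company_to_risks = {
    "Company X": ["Supply chain", "Competition", "Supply chain"],
    "Company Y": ["Regulation"],
}
chunk_text = {"chunk-1": "text of chunk 1", "chunk-2": "text of chunk 2"}


def expand(chunk_ids):
    """Follow chunk -> document -> company -> risks for each vector hit."""
    rows = []
    for chunk_id in chunk_ids:
        document = chunk_to_document[chunk_id]
        company = document_to_company[document]
        rows.append(
            {
                "company": company,
                # De-duplicate like collect(DISTINCT ...); sorting is added
                # here only to make the toy output stable.
                "risks": sorted(set(company_to_risks[company])),
                "context": chunk_text[chunk_id],
            }
        )
    return rows


vector_hits = ["chunk-2", "chunk-1"]  # order as returned by the vector search
rows = expand(vector_hits)
for row in rows:
    print(row)
```

Each row carries both the matched text and the structured facts reached through relationships, which is what the LLM then receives as context.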

## Evaluating the responses

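You can use **Noise Sensitivity** to measure how often irrelevant information, or noise, in the retrieved documents leads to incorrect claims in the answer; lower scores are better. The cells below also compute **Context Precision**, which scores how highly the relevant retrieved chunks are ranked when compared against a reference.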

In [None]:
%pip install ragas langchain-openai

In [None]:

from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextPrecisionWithReference, NoiseSensitivity

# Wrap the chat model directly so the GraphRAG `llm` defined earlier is not overwritten
evaluator_llm = LangchainLLMWrapper(ChatOpenAI())

context_precision = LLMContextPrecisionWithReference(llm=evaluator_llm)
noise_sensitivity = NoiseSensitivity(llm=evaluator_llm, mode="irrelevant")

metrics = [
    context_precision,
    noise_sensitivity
]
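As a rough illustration of what a context precision score captures, the sketch below computes an average-precision style value from a list of relevance verdicts, one per retrieved chunk in rank order. It assumes the formula is the sum of precision@k over the ranks holding a relevant chunk, divided by the number of relevant chunks; in Ragas the verdicts come from the evaluator LLM, whereas here they are hard-coded, and `context_precision` is a hypothetical helper rather than a Ragas function.

```python
def context_precision(verdicts):
    """Average precision over ranked verdicts.

    verdicts[i] is 1 if the chunk at rank i + 1 was judged relevant, else 0.
    """
    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, verdict in enumerate(verdicts, start=1):
        if verdict:
            relevant_so_far += 1
            # precision@rank, counted only at ranks that hold a relevant chunk
            weighted_sum += relevant_so_far / rank
    return weighted_sum / relevant_so_far if relevant_so_far else 0.0


# Same number of relevant chunks, different ranking quality.
early = context_precision([1, 1, 0, 0])
late = context_precision([0, 0, 1, 1])

print(f"relevant chunks ranked first: {early:.3f}")
print(f"relevant chunks ranked last:  {late:.3f}")
```

Under this assumption, a retriever that places relevant chunks near the top scores higher than one that returns the same chunks further down the list.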

In [None]:
# Ground truth/reference data from the database
reference = driver.execute_query(
    "MATCH (c:Company {name: 'APPLE INC'})-[:FACES_RISK]->(r) RETURN r.name AS risk ORDER BY risk",
    database_=NEO4J_DATABASE,
    result_transformer_=lambda result: ", ".join([row['risk'] for row in result])
)

reference

In [None]:
from ragas import SingleTurnSample

# Ensure that each retrieved context is a string and not a sequence/object
def flatten_and_stringify_contexts(contexts):
    flat_contexts = []
    for item in contexts:
        # If item.content is a list or sequence, join its elements; else, str it
        content = getattr(item, "content", item)
        if isinstance(content, (list, tuple)):
            flat_contexts.append(" ".join(str(x) for x in content))
        else:
            flat_contexts.append(str(content))
    return flat_contexts

vector_contexts = flatten_and_stringify_contexts(vector_response.retriever_result.items)
cypher_contexts = flatten_and_stringify_contexts(vector_cypher_response.retriever_result.items)

vector_result = SingleTurnSample(
    user_input=query,
    reference=reference,
    retrieved_contexts=vector_contexts,
    response=vector_response.answer,
)

# cypher_contexts is already a flat list of strings, so it can be passed as-is
cypher_result = SingleTurnSample(
    user_input=query,
    reference=reference,
    retrieved_contexts=cypher_contexts,
    response=vector_cypher_response.answer,
)


In [None]:
print("Vector:")
for metric in metrics:
    try:
        print(metric.name, "vector", await metric.single_turn_ascore(vector_result))
    except ValueError as e:
        print(metric.name, "vector", e)


print("\nVector + Relationships:")
for metric in metrics:
    try:
        print(metric.name, "cypher", await metric.single_turn_ascore(cypher_result))
    except ValueError as e:
        print(metric.name, "cypher", e)
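The top-level `await` used above works because a notebook already runs an event loop. If you move the scoring loop into a plain Python script, one option is to wrap it in a coroutine and start it with `asyncio.run`. The sketch below shows that shape with a hypothetical `FakeMetric` class that only imitates the `name` attribute and the async `single_turn_ascore` method; it makes no LLM calls, and its scores are made up.

```python
import asyncio


class FakeMetric:
    """Hypothetical stand-in exposing `name` and an async scoring method."""

    def __init__(self, name, score):
        self.name = name
        self._score = score

    async def single_turn_ascore(self, sample):
        await asyncio.sleep(0)  # placeholder for an LLM call
        return self._score


async def score_all(metrics, sample):
    """Score one sample with every metric, recording errors as strings."""
    scores = {}
    for metric in metrics:
        try:
            scores[metric.name] = await metric.single_turn_ascore(sample)
        except ValueError as error:
            scores[metric.name] = str(error)
    return scores


fake_metrics = [FakeMetric("metric_a", 0.75), FakeMetric("metric_b", 0.25)]

# In a script there is no running event loop, so start one explicitly.
# Inside a notebook, use `await score_all(...)` instead.
scores = asyncio.run(score_all(fake_metrics, sample=None))
print(scores)
```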
