# Vector & Relationships

## Installation

This notebook requires the following dependencies:

In [None]:
%pip install neo4j-graphrag langchain-core langchain-openai langchain-neo4j langgraph python-dotenv

## Connecting to Neo4j

The following cell creates the Neo4j Python Driver instance that the retrievers use to connect to the database. Connection details are read from the environment variables set in your `.env` file, falling back to local defaults when a variable is unset.


In [None]:
%load_ext dotenv
%dotenv

from os import getenv

NEO4J_URL = getenv("NEO4J_URI") or "neo4j://localhost:7687"
NEO4J_USERNAME = getenv("NEO4J_USERNAME") or "neo4j"
NEO4J_PASSWORD = getenv("NEO4J_PASSWORD") or "neoneoneo"
NEO4J_DATABASE = getenv("NEO4J_DATABASE") or "neo4j"

from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    NEO4J_URL,
    auth=(NEO4J_USERNAME, NEO4J_PASSWORD)
)

driver.verify_connectivity()  # Raises an exception if the connection fails
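
The cell above uses `getenv(...) or default` rather than passing a default to `getenv`. A minimal sketch of the difference, using a plain dictionary as a stand-in for the process environment (the helper name `env_or_default` is hypothetical and not part of the notebook):

```python
from typing import Mapping


def env_or_default(env: Mapping[str, str], name: str, default: str) -> str:
    """Return env[name], or `default` when the variable is unset OR empty."""
    return env.get(name) or default


default_uri = "neo4j://localhost:7687"

# A dict stands in for os.environ so the example has no side effects
unset_env = {}
empty_env = {"NEO4J_URI": ""}
set_env = {"NEO4J_URI": "neo4j://db.example:7687"}

print(env_or_default(unset_env, "NEO4J_URI", default_uri))  # falls back
print(env_or_default(empty_env, "NEO4J_URI", default_uri))  # also falls back
print(env_or_default(set_env, "NEO4J_URI", default_uri))    # uses the value

# By contrast, a plain default argument keeps an empty string as-is
print(repr(empty_env.get("NEO4J_URI", default_uri)))
```

The `or` form is convenient when a `.env` file contains a key with a blank value, which would otherwise be passed to the driver as an empty string.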


## Plain vector search

A vector index called `chunkEmbeddings` already exists in the database.  You can [create your own using the `create_vector_index` function](https://github.com/neo4j/neo4j-graphrag-python?tab=readme-ov-file#creating-a-vector-index) or [populate an existing index using the `upsert_vectors` function](https://github.com/neo4j/neo4j-graphrag-python?tab=readme-ov-file#populating-a-vector-index).

In [None]:
INDEX_NAME = "chunkEmbeddings"

from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorRetriever

# Create an Embedder object
embedder = OpenAIEmbeddings()

# Initialize the retriever
retriever = VectorRetriever(
    driver,
    neo4j_database=NEO4J_DATABASE,
    index_name=INDEX_NAME,
    embedder=embedder
)

# Instantiate the LLM
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
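
Conceptually, a vector retriever embeds the query and returns the `top_k` stored chunks whose embeddings are most similar to it. The sketch below illustrates that idea with hand-written three-dimensional vectors and exact cosine similarity. It is an illustration only: the vectors and chunk texts are made up, and a real vector index typically uses an approximate nearest-neighbour search over much larger embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, chunks, k):
    """Score every (text, vector) pair and keep the k most similar."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up chunks and embeddings, for illustration only
toy_chunks = [
    ("supply chain disruption", [0.9, 0.1, 0.0]),
    ("quarterly dividend", [0.0, 0.2, 0.9]),
    ("currency fluctuation", [0.7, 0.6, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], toy_chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```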

The `GraphRAG` class creates a retrieval pipeline that accepts a user input, uses a retriever to fetch the context, and uses an LLM to generate an answer.
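
The retrieve-then-generate flow described above can be sketched in plain Python. This is not how `GraphRAG` is implemented; `toy_rag_search`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins that only show the order of the steps and how the retrieved context ends up in the prompt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyAnswer:
    answer: str
    context: List[str]


def toy_rag_search(
    query_text: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 5,
) -> ToyAnswer:
    # 1. Fetch context for the question
    context = retrieve(query_text, top_k)
    # 2. Build a prompt that contains the context and the question
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query_text}\nAnswer:"
    # 3. Ask the language model to answer from that prompt
    return ToyAnswer(answer=generate(prompt), context=context)


def fake_retrieve(query_text: str, top_k: int) -> List[str]:
    """Stub retriever: returns the first top_k entries of a fixed corpus."""
    corpus = [
        "Chunk about supply chain risk.",
        "Chunk about currency risk.",
        "Chunk about litigation.",
    ]
    return corpus[:top_k]


def fake_generate(prompt: str) -> str:
    """Stub LLM: reports how many chunks it was given."""
    return f"(stub answer based on {prompt.count('Chunk')} chunks)"


result = toy_rag_search("What are the top risk factors?", fake_retrieve, fake_generate, top_k=2)
print(result.answer)
print(result.context)
```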

In [None]:
from neo4j_graphrag.generation import GraphRAG

# Instantiate the RAG pipeline
rag = GraphRAG(
    retriever=retriever,
    llm=llm
)

# Query the graph
query = "What are the top risk factors that Apple faces?"

vector_response = rag.search(query_text=query, return_context=True, retriever_config={"top_k": 5})

print(vector_response.answer)

The top risk factors that Apple faces include:

1. **Macroeconomic and Industry Risks**: Apple's operations and performance are significantly dependent on global and regional economic conditions. Adverse economic conditions such as slow growth, recession, high unemployment, inflation, tighter credit, higher interest rates, and currency fluctuations can adversely impact consumer confidence and spending, affecting demand for Apple's products and services.

2. **Supply Chain Risks**: Apple's supply chain is large and complex, with a majority of supplier facilities located outside the U.S. The company faces risks of supply shortages and price increases, especially for custom components available from only one source. Delays or constraints in the supply of components can materially adversely affect Apple's business.

3. **Political and Global Events**: Political events, trade disputes, war, terrorism, natural disasters, public health issues, and other business interruptions can disrupt inte

## Adding context via relationships

The pipeline above answers from the text of the retrieved chunks alone, so the answer tends to be generic and can vary between runs.  Following relationships from each retrieved chunk adds structured facts from the knowledge graph to the context, which grounds the answer in the contents of the graph.  We do this with the `VectorCypherRetriever` class.

In [9]:
from neo4j_graphrag.retrievers import VectorCypherRetriever

detail_context_query = """
MATCH (node)-[:FROM_DOCUMENT]-(doc:Document)-[:FILED]-(company:Company)-[:FACES_RISK]->(risk:RiskFactor)
RETURN company.name AS company, doc.path AS document, collect(DISTINCT risk.name) AS risks, node.text AS context
"""

vector_cypher_retriever = VectorCypherRetriever(
    driver=driver,
    neo4j_database=NEO4J_DATABASE,
    index_name=INDEX_NAME,
    embedder=embedder,
    retrieval_query=detail_context_query
)

rag = GraphRAG(retriever=vector_cypher_retriever, llm=llm)

query = "What are the top risk factors that Apple faces?"

vector_cypher_response = rag.search(query_text=query, return_context=True, retriever_config={"top_k": 5})
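
In the retrieval query, `node` refers to a chunk matched by the vector search, and the `MATCH` pattern walks from that chunk to its document, the company that filed it, and the risks that company faces. The sketch below mimics that expansion over a tiny in-memory graph. All data and helper names (`expand_hit`, `expand_hits`, the `ACME` company) are hypothetical, and the dictionaries only approximate what the Cypher pattern does.

```python
# Toy graph stored as lookup tables (made-up data, for illustration only)
chunk_text = {"chunk-1": "Text about suppliers.", "chunk-2": "Text about credit.", "chunk-3": "Orphan text."}
chunk_to_doc = {"chunk-1": "doc-a.pdf", "chunk-2": "doc-a.pdf"}  # chunk-3 has no document
doc_to_company = {"doc-a.pdf": "ACME"}
company_risks = {"ACME": ["Cyber-attacks", "Credit risk", "Cyber-attacks"]}


def expand_hit(chunk_id):
    """Follow chunk -> document -> company -> risks, like the MATCH pattern."""
    doc = chunk_to_doc.get(chunk_id)
    company = doc_to_company.get(doc)
    if doc is None or company is None:
        return None  # a plain MATCH yields no row when the path does not exist
    return {
        "company": company,
        "document": doc,
        # De-duplicate while keeping first-seen order, similar to collect(DISTINCT ...)
        "risks": list(dict.fromkeys(company_risks.get(company, []))),
        "context": chunk_text[chunk_id],
    }


def expand_hits(chunk_ids):
    """Expand every vector hit and drop those without a matching path."""
    rows = (expand_hit(chunk_id) for chunk_id in chunk_ids)
    return [row for row in rows if row is not None]


rows = expand_hits(["chunk-1", "chunk-3", "chunk-2"])
for row in rows:
    print(row)
```

One consequence worth noting: because the pattern uses `MATCH` rather than `OPTIONAL MATCH`, a chunk that is not connected to a document, company, and risk contributes nothing to the context, so fewer than `top_k` items may be returned.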

In [13]:
print(vector_cypher_response.answer)


for item in vector_cypher_response.retriever_result.items:
    print(item.content)


Apple faces several top risk factors, including:

1. **Macroeconomic and Industry Risks**: Adverse global and regional economic conditions can significantly impact Apple's operations and performance. Factors such as slow growth, recession, high unemployment, inflation, tighter credit, higher interest rates, and currency fluctuations can adversely affect consumer confidence and spending, thereby impacting demand for Apple's products and services.

2. **Supply Chain Risks**: Apple is subject to significant risks of supply shortages and price increases, particularly for custom components available from only one source. Initial capacity constraints, delays, or constraints in the supply of components can materially adversely affect Apple's business, results of operations, and financial condition.

3. **Technological and Product Risks**: The frequent introduction of new products, short product life cycles, and rapid technological advances pose risks. Design and manufacturing defects in Apple

## Evaluating the responses

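You can use **Noise Sensitivity** to measure the amount of irrelevant information, or noise, in the retrieved documents.  The cells below also score **Context Precision**, which checks whether the retrieved chunks that are relevant to the reference are ranked near the top.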

In [None]:
%pip install ragas langchain-openai

In [None]:

from langchain_openai import ChatOpenAI
from ragas import EvaluationDataset, SingleTurnSample, evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextPrecisionWithReference, LLMContextPrecisionWithoutReference, NoiseSensitivity

# Wrap the chat model directly instead of rebinding `llm`,
# which still holds the OpenAILLM used by the RAG pipelines above
evaluator_llm = LangchainLLMWrapper(ChatOpenAI())


context_precision = LLMContextPrecisionWithReference(llm=evaluator_llm)
# context_precision_without_reference = LLMContextPrecisionWithoutReference(llm=evaluator_llm)
noise_sensitivity = NoiseSensitivity(llm=evaluator_llm, mode="irrelevant")


metrics = [
    context_precision,
    noise_sensitivity
]
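
To give a feel for what a context precision score means, here is a minimal sketch of the rank-weighted formula as described in the Ragas documentation: the mean of precision@k over the ranks that hold a relevant chunk. It assumes the relevance verdicts are already known as 0/1 values, whereas the real metric asks the evaluator LLM to produce them, so treat this as an approximation rather than the library's implementation.

```python
def context_precision(verdicts):
    """Rank-weighted precision.

    verdicts[i] is 1 if the chunk at rank i + 1 was judged relevant, else 0.
    """
    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, verdict in enumerate(verdicts, start=1):
        if verdict:
            relevant_so_far += 1
            # precision@rank, counted only at ranks that hold a relevant chunk
            weighted_sum += relevant_so_far / rank
    return weighted_sum / relevant_so_far if relevant_so_far else 0.0


# Same number of relevant chunks, different ordering
print(context_precision([1, 1, 0, 0]))  # relevant chunks ranked first
print(context_precision([0, 0, 1, 1]))  # relevant chunks ranked last
print(context_precision([0, 0, 0, 0]))  # nothing relevant
```

Two retrievers that return the same relevant chunks can therefore score differently if one of them ranks those chunks lower.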

In [None]:
# Ground truth/reference data from the database
reference = driver.execute_query(
    "MATCH (c:Company {name: 'APPLE INC'})-[:FACES_RISK]->(r) RETURN r.name AS risk ORDER BY risk",
    database_=NEO4J_DATABASE,
    result_transformer_=lambda result: ", ".join([row['risk'] for row in result])
)

reference

"Ability to extend or renew component supply agreements, Adverse macroeconomic conditions, Aggressive price competition, Availability of components at acceptable prices, Change in tax laws, Changes due to competition, market conditions, legal and regulatory requirements affecting the App Store, Charter provisions discouraging takeover, Commodity pricing fluctuations, Compliance with data protection laws, Concentration of stock ownership, Credit risk, Cyber-attacks, Developers focusing efforts on competing platforms, Economic conditions, Evolving industry standards, Failure to make digital content available on commercially reasonable terms, Failure to obtain or create digital content that appeals to the Company's customers, Foreign Exchange Rate Risk, Frequent introduction of new products, General safety, security, and crisis management hazards, Geography, Industry-wide shortage and significant commodity pricing fluctuations, Information technology system failures and network disruption

In [None]:
vector_result = SingleTurnSample(
    user_input=query,
    reference=reference,
    retrieved_contexts=[str(item.content) for item in vector_response.retriever_result.items],
    response=vector_response.answer,
)

cypher_result = SingleTurnSample(
    user_input=query,
    reference=reference,
    retrieved_contexts=[str(item.content) for item in vector_cypher_response.retriever_result.items],
    response=vector_cypher_response.answer,
)


In [None]:
print("Vector:")
for metric in metrics:
    try:
        print(metric.name, "vector", await metric.single_turn_ascore(vector_result))
    except ValueError as e:
        print(metric.name, "vector", e)


print("Vector + Relationships:")
for metric in metrics:
    try:
        print(metric.name, "cypher", await metric.single_turn_ascore(cypher_result))
    except ValueError as e:
        print(metric.name, "cypher", e)
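
The two loops above differ only in the sample they score, so they could be folded into one helper that returns the scores as a dictionary. The sketch below shows the shape of such a helper using a hypothetical `StubMetric` class in place of the Ragas metrics; it only assumes each metric exposes a `name` attribute and an awaitable `single_turn_ascore(sample)` method, as used in the cell above.

```python
import asyncio


class StubMetric:
    """Hypothetical stand-in with the same `name` / `single_turn_ascore` shape."""

    def __init__(self, name, value):
        self.name = name
        self._value = value

    async def single_turn_ascore(self, sample):
        if self._value is None:
            raise ValueError(f"{self.name} could not score this sample")
        return self._value


async def score_sample(metrics, sample):
    """Score one sample with every metric, recording errors instead of raising."""
    scores = {}
    for metric in metrics:
        try:
            scores[metric.name] = await metric.single_turn_ascore(sample)
        except ValueError as error:
            scores[metric.name] = f"error: {error}"
    return scores


stub_metrics = [StubMetric("context_precision", 0.8), StubMetric("noise_sensitivity", None)]

# In a script, asyncio.run drives the coroutine. Inside a notebook cell,
# an event loop is already running, so use `await score_sample(...)` instead.
scores = asyncio.run(score_sample(stub_metrics, sample={"user_input": "demo"}))
print(scores)
```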
