# Vector Retriever

In the previous notebook, you performed vector search manually using Cypher queries. Now you'll use the **VectorRetriever** class from neo4j-graphrag, which abstracts away the complexity and provides a clean API for semantic search.

You'll also learn to use the **GraphRAG** class, which combines retrieval with LLM generation to build a complete question-answering pipeline.

**Prerequisites:** Complete [02 Embeddings](02_embeddings.ipynb) first to populate the graph with embeddings and create the vector index.

**Learning Objectives:**
- Use VectorRetriever for semantic search
- Inspect retrieval results and similarity scores
- Build a GraphRAG pipeline for question answering
- Understand how retrieved context improves LLM responses

## Install Dependencies

First, install the required packages. This only needs to be run once per session.

In [None]:
# Install neo4j-graphrag with Bedrock support
%pip install "neo4j-graphrag[bedrock] @ git+https://github.com/neo4j-partners/neo4j-graphrag-python.git@bedrock-embeddings" python-dotenv pydantic-settings nest-asyncio -q

In [None]:
from neo4j_graphrag.retrievers import VectorRetriever
from neo4j_graphrag.generation import GraphRAG

from data_utils import Neo4jConnection, get_llm, get_embedder

## Connect to Neo4j

Create and verify the connection to your Neo4j graph database.

In [None]:
neo4j = Neo4jConnection().verify()
driver = neo4j.driver

## Initialize LLM and Embedder

Set up the Large Language Model (LLM) and the embedding model for GraphRAG workflows. Both models are configured in `CONFIG.txt`:

- **LLM**: Uses AWS Bedrock via the `MODEL_ID` setting
- **Embedder**: Uses AWS Bedrock via the `EMBEDDING_MODEL_ID` setting

In [None]:
# Initialize LLM and Embedder from AWS Bedrock
llm = get_llm()
embedder = get_embedder()

print(f"LLM: {llm.model_id}")
print(f"Embedder: {embedder.model_id}")

## Initialize Vector Retriever

The `VectorRetriever` class handles all the complexity of semantic search:
- Automatically embeds your query using the provided embedder
- Queries the Neo4j vector index
- Returns results with similarity scores and content

This is much cleaner than writing manual Cypher queries for every search!
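Conceptually, the three steps above amount to a nearest-neighbour lookup. The following is a minimal sketch in plain Python, not the library's implementation: `toy_embed` is a hypothetical stand-in for the Bedrock embedder, and the in-memory list of chunks stands in for the Neo4j vector index.

```python
import math

def toy_embed(text, vocabulary):
    """Hypothetical stand-in for an embedding model: word counts over a tiny vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def toy_vector_search(query_text, chunks, vocabulary, top_k=2):
    """Embed the query, score every chunk, and return the top_k (score, chunk) pairs."""
    query_vector = toy_embed(query_text, vocabulary)
    scored = [
        (cosine_similarity(query_vector, toy_embed(chunk, vocabulary)), chunk)
        for chunk in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Illustrative data only; none of this comes from the workshop graph.
vocabulary = ["battery", "thermal", "cooling", "water", "ingress"]
chunks = [
    "battery thermal cooling loop",
    "water ingress protection for the pack",
    "supplier contact details",
]
results = toy_vector_search("battery thermal requirements", chunks, vocabulary, top_k=2)
for score, chunk in results:
    print(f"{score:.4f}  {chunk}")
```

A real vector index avoids scoring every chunk by using an approximate nearest-neighbour structure, but the inputs and outputs have the same shape: query text in, ranked chunks with scores out.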

In [None]:
# Initialize Vector Retriever
vector_retriever = VectorRetriever(
    driver=driver,
    index_name='requirement_embeddings',
    embedder=embedder,
    return_properties=['text']
)

print("Vector Retriever initialized!")

**VectorRetriever Parameters:**
- `driver`: The Neo4j Python driver connection
- `index_name`: Name of the vector index to search (`requirement_embeddings`)
- `embedder`: The embedding model to convert queries to vectors
- `return_properties`: List of node properties to include in results (e.g., `['text']`)

> **Tip:** You can add more properties to `return_properties` like `['text', 'index']` to get additional metadata about each chunk.

---

## Diagnostic Search

Before building the full RAG pipeline, it's useful to inspect raw retrieval results. This helps you verify:
- The vector index is working correctly
- The right chunks are being retrieved for your queries
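- Similarity scores are reasonable (higher is better; what counts as a good score depends on the embedding model and the index's similarity function, so treat values such as 0.7+ as a rough guide rather than a fixed rule)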
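If you want to apply a score cut-off programmatically, one possible approach is sketched below. `filter_by_score` and its `0.7` default are illustrative, not part of neo4j-graphrag, and the `SimpleNamespace` objects only mimic the `content` and `metadata` attributes that this notebook reads from result items.

```python
from types import SimpleNamespace

def filter_by_score(items, min_score=0.7):
    """Hypothetical helper: keep items whose metadata has a numeric score >= min_score."""
    kept = []
    for item in items:
        score = (item.metadata or {}).get("score")
        if isinstance(score, (int, float)) and score >= min_score:
            kept.append(item)
    return kept

# Stand-in result items with made-up scores, for illustration only.
sample_items = [
    SimpleNamespace(content="Cooling loop requirement", metadata={"score": 0.91}),
    SimpleNamespace(content="Supplier contact details", metadata={"score": 0.42}),
    SimpleNamespace(content="Chunk without a score", metadata={}),
]
relevant = filter_by_score(sample_items, min_score=0.7)
print([item.content for item in relevant])
```

With real search output, the same helper could be applied to `result.items` after running the search cell.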

In [None]:
# Simple Vector Search
query = "What are the thermal management requirements for the battery?"
result = vector_retriever.search(query_text=query, top_k=5)

print(f"Query: \"{query}\"")
print(f"Number of results returned: {len(result.items)}\n")
for item in result.items:
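    score = item.metadata.get('score')
    node_id = item.metadata.get('id', 'N/A')
    content_preview = str(item.content)[:100]
    # Format the score only when it is numeric; ':.4f' raises ValueError on a string fallback
    score_text = f"{score:.4f}" if isinstance(score, (int, float)) else "N/A"
    print(f"Score: {score_text}, Content: {content_preview}..., id: {node_id}")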

**How it works:**  
1. The example `query` string is defined: "What are the thermal management requirements for the battery?"
2. `vector_retriever.search()` embeds the query text, searches the vector index, and returns the top 5 matches by vector similarity.
3. Each result is printed with:
    * The similarity score (`Score`)
    * A snippet of the retrieved content (`Content`)
    * The unique identifier for each chunk (`id`)

This diagnostic helps you verify that the vector search is working and inspect the quality of the top results for your query.

> **Tip:**
> Inspecting the returned results to verify relevance can help you to adjust your chunking or embedding strategy.

## Graph Retrieval-Augmented Generation (GraphRAG)

Now let's combine retrieval with generation. The `GraphRAG` class orchestrates a complete RAG pipeline:

1. **Retrieve** - Use the VectorRetriever to find relevant chunks
2. **Augment** - Format the retrieved chunks as context for the LLM
3. **Generate** - Send the query + context to the LLM for a grounded answer

This is the core pattern of RAG: instead of asking the LLM to answer from its training data alone, we provide relevant context from our knowledge graph so the answer is grounded in actual data.
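The three steps can be sketched in a few lines of plain Python. This is an illustration of the pattern only: `fake_retrieve` and `fake_generate` are hypothetical stand-ins for the retriever and the LLM, and the prompt wording is an assumption rather than the template `GraphRAG` actually uses.

```python
def build_prompt(question, context_chunks):
    """Augment: place the retrieved chunks ahead of the question in one prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

def toy_rag(question, retrieve, generate, top_k=2):
    """Run retrieve -> augment -> generate with caller-supplied functions."""
    chunks = retrieve(question, top_k)        # 1. Retrieve
    prompt = build_prompt(question, chunks)   # 2. Augment
    return generate(prompt)                   # 3. Generate

def fake_retrieve(question, top_k):
    """Hypothetical retriever: returns the first top_k chunks of a fixed list."""
    corpus = [
        "Example chunk about cooling.",
        "Example chunk about sealing.",
        "Example chunk about suppliers.",
    ]
    return corpus[:top_k]

def fake_generate(prompt):
    """Hypothetical LLM: reports the prompt size instead of calling a model."""
    return f"(model output for a prompt of {len(prompt)} characters)"

question = "What is the temperature limit?"
prompt = build_prompt(question, fake_retrieve(question, 2))
answer = toy_rag(question, fake_retrieve, fake_generate)
print(prompt)
print(answer)
```

Passing the retriever and generator in as functions mirrors how `GraphRAG` is constructed from a retriever and an LLM, which is what lets you swap either one without touching the rest of the pipeline.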

In [None]:
# Initialize GraphRAG and Perform Search
query = "What are the thermal management requirements for the battery?"
rag = GraphRAG(
    llm=llm,
    retriever=vector_retriever
)
response = rag.search(query, retriever_config={"top_k": 5}, return_context=True)

print(f"Query: \"{query}\"")
print(f"Number of results returned: {len(response.retriever_result.items)}\n")
print("Answer:")
print(response.answer)

**How the GraphRAG Pipeline Works:**

1. **Query received**: "What are the thermal management requirements for the battery?"
2. **Retrieval**: The VectorRetriever finds the top-k most similar chunks
3. **Context formatting**: The retrieved chunks are formatted into a prompt
4. **LLM generation**: The LLM configured via `MODEL_ID` receives both the question and the context
5. **Response**: The LLM generates an answer grounded in the retrieved data

**Key parameters:**
- `retriever_config={"top_k": 5}`: Retrieve 5 chunks to use as context
- `return_context=True`: Include the retrieved chunks in the response (useful for debugging)
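When `return_context=True` is set, the retrieved chunks can be listed alongside the answer to check what the LLM was given. A minimal sketch follows; `summarize_context` is a hypothetical helper, and the `SimpleNamespace` objects below only imitate the `retriever_result.items`, `content`, and `metadata` attributes used elsewhere in this notebook.

```python
from types import SimpleNamespace

def summarize_context(response, preview_chars=80):
    """Hypothetical helper: one line per retrieved chunk with score and content preview."""
    retriever_result = getattr(response, "retriever_result", None)
    if retriever_result is None:
        return []  # no context attached to the response
    lines = []
    for item in retriever_result.items:
        score = (item.metadata or {}).get("score")
        score_text = f"{score:.4f}" if isinstance(score, (int, float)) else "N/A"
        lines.append(f"[{score_text}] {str(item.content)[:preview_chars]}")
    return lines

# Stand-in response with made-up values, for illustration only.
fake_response = SimpleNamespace(
    answer="Example answer",
    retriever_result=SimpleNamespace(
        items=[
            SimpleNamespace(content="Example chunk about cooling.", metadata={"score": 0.9}),
            SimpleNamespace(content="Example chunk without a score.", metadata=None),
        ]
    ),
)
context_lines = summarize_context(fake_response)
for line in context_lines:
    print(line)
```

With a real pipeline, the same call would be `summarize_context(response)` on the object returned by `rag.search(...)`.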

The answer is now **grounded** in your actual manufacturing requirement data rather than the LLM's general knowledge!

## Try Different Queries

Experiment with the GraphRAG pipeline by editing the `queries` list below. Each query is answered using the top 3 retrieved chunks.

In [None]:
# Try different queries
queries = [
    "What are the energy density specifications for battery cells?",
    "What safety standards must the BMS comply with?",
    "How is the battery pack protected against water ingress?"
]

for query in queries:
    print(f"\nQuery: \"{query}\"")
    print("-" * 60)
    response = rag.search(query, retriever_config={"top_k": 3})
    print(f"Answer: {response.answer}")

## Summary

In this notebook, you built your first complete GraphRAG pipeline:

1. **VectorRetriever** - Abstracts vector search into a simple `search()` method. No more manual Cypher queries for embedding lookups.

2. **Diagnostic inspection** - Viewing raw retrieval results helps debug and tune your chunking/embedding strategy.

3. **GraphRAG pipeline** - Combines retrieval + LLM generation for grounded question answering. The LLM's response is based on actual data from your knowledge graph.

**Current limitation:** The VectorRetriever only returns the matched chunks themselves. But what if a question requires context that spans multiple chunks, or needs related graph data like which component a requirement belongs to? In the next notebook, you'll learn to use **VectorCypherRetriever** to traverse graph relationships and include adjacent chunks and related manufacturing data for expanded context.

---

**Next:** [Vector Cypher Retriever](04_vector_cypher_retriever.ipynb)

In [None]:
# Cleanup
neo4j.close()