# LangGraph with Vector Similarity Search

Building intelligent agents that combine LangGraph workflows with semantic search capabilities.

## Learning Objectives

By the end of this notebook, you will be able to:

1. **Integrate vector stores with LangGraph** - Connect a ChromaDB vector store to LangGraph agents to enable semantic search
2. **Create semantic search tools** - Build custom tools that perform vector similarity search and return relevant document chunks
3. **Combine multiple tools** - Design agents that can intelligently use both semantic search tools (for information retrieval) and calculation tools (for processing)
4. **Handle complex queries** - Process user requests that require both searching a knowledge base and performing calculations on the retrieved information

## 1. Environment Setup

In [None]:
# Core imports
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_core.documents import Document
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.prebuilt import ToolNode
from langchain_google_genai import ChatGoogleGenerativeAI

from dotenv import load_dotenv
from typing import Literal, List

load_dotenv("../../.env")
print("✅ Environment loaded")

## 2. Initialize LLM

In [None]:
# Initialize LLM
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0.3,
    max_tokens=1024
)

print("✅ LLM initialized")

## 3. Connect to Vector Store

In [None]:
# ChromaDB and Embeddings setup
from langchain_community.vectorstores import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Configuration
PERSIST_DIR = "../../chroma_db"
COLLECTION_NAME = "toyota_specs"
EMBED_MODEL_ID = "gemini-embedding-001"

# Initialize embeddings (uses GOOGLE_API_KEY from environment)
embeddings_model = GoogleGenerativeAIEmbeddings(
    model=EMBED_MODEL_ID,
    output_dimensionality=768
)

# Connect to vectorstore
vectorstore = Chroma(
    collection_name=COLLECTION_NAME,
    embedding_function=embeddings_model,
    persist_directory=PERSIST_DIR
)

print(f"✅ Connected to vectorstore: {COLLECTION_NAME}")
print(f"✅ Using embedding model: {EMBED_MODEL_ID}")

In [None]:
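# Inspect the stored records to confirm the collection is populated
# (this returns every record, so the output can be large)
vectorstore.get()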

## 4. Vector Similarity Search Helper

Create a helper function to perform semantic search against the vector database.

In [None]:
# Vector similarity search helper
def vector_similarity_search(
    query: str, 
    vectorstore, 
    k: int = 5
) -> List[str]:
    """Return the page content of the k documents most similar to the query."""
    docs = vectorstore.similarity_search(query, k=k)
    return [doc.page_content for doc in docs]

In [None]:
docs = vector_similarity_search(
    "What is the base price of the Toyota Camry?", 
    vectorstore, 
    k=5
)

In [None]:
docs
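
To make the ranking step behind `similarity_search` more concrete, here is a minimal sketch of top-k retrieval over embedding vectors. It is an illustration only: the three-dimensional vectors, the `cosine_similarity` and `top_k` helpers, and the toy corpus are all hypothetical, and cosine similarity is assumed for simplicity. The distance metric ChromaDB actually applies depends on how the collection was configured.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    corpus: List[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[str]:
    """Return the texts of the k corpus entries most similar to query_vec."""
    ranked = sorted(
        corpus,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical 3-dimensional "embeddings"; real ones have hundreds of dimensions.
toy_corpus = [
    ("Camry pricing", [0.9, 0.1, 0.0]),
    ("RAV4 fuel economy", [0.1, 0.9, 0.1]),
    ("Corolla pricing", [0.8, 0.2, 0.1]),
]
toy_query = [1.0, 0.0, 0.0]  # pretend this vector encodes a pricing question

top_two = top_k(toy_query, toy_corpus, k=2)
print(top_two)
```

The two pricing entries rank above the fuel-economy entry because their vectors point in nearly the same direction as the query vector.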

## 5. Define LangGraph Tools

Create two tools: one that wraps the vector search helper and one that calculates EMI (Equated Monthly Installment).

In [None]:
# Tool 1: Vehicle Search
@tool
def search_vehicles(query: str, max_results: int = 5) -> str:
    """
    Search Toyota vehicle database using semantic similarity.
    
    Use this tool when users need information about Toyota vehicles,
    including specifications, pricing, fuel efficiency, or comparisons.
    
    Args:
        query: Natural language search query about Toyota vehicles
        max_results: Maximum number of results to return (default: 5)
    
    Returns:
        Formatted string with vehicle information
    """
    docs = vector_similarity_search(query, vectorstore, k=max_results)
    
    result = "Vehicle Search Results:\n"
    result += "=" * 60 + "\n"
    for i, doc in enumerate(docs, 1):
        result += f"\nResult {i}:\n{doc}\n"
    result += "=" * 60
    
    return result

print("✅ search_vehicles tool defined")

In [None]:
search_vehicles.invoke({
    "query": "Which Toyota sedan is most fuel-efficient under $30,000?",
    "max_results": 3
})

In [None]:
# Tool 2: EMI Calculator
@tool
def emi_calculator(principal: float, annual_interest_rate: float, tenure_months: int, currency: str) -> str:
    """
    Calculate the EMI (Equated Monthly Installment) for a loan.
    
    Use this tool when users want to know their monthly loan payment,
    total repayment amount, or total interest for a loan.
    
    Args:
        principal: The loan amount
        annual_interest_rate: Annual interest rate as percentage (e.g., 8.5)
        tenure_months: Loan tenure in months
        currency: Currency code (USD, EUR, GBP, INR, JPY)
    """
    if principal <= 0 or annual_interest_rate < 0 or tenure_months <= 0:
        return "Error: Invalid input parameters"
    
    monthly_interest_rate = annual_interest_rate / 12 / 100
    
    if monthly_interest_rate == 0:
        emi = principal / tenure_months
        total_payment = principal
        total_interest = 0
    else:
        emi = principal * monthly_interest_rate * \
              pow(1 + monthly_interest_rate, tenure_months) / \
              (pow(1 + monthly_interest_rate, tenure_months) - 1)
        total_payment = emi * tenure_months
        total_interest = total_payment - principal
    
    return (
        f"EMI Calculation Result:\n"
        f"  Loan Amount: {principal:,.2f} {currency}\n"
        f"  Interest Rate: {annual_interest_rate}% per annum\n"
        f"  Tenure: {tenure_months} months\n"
        f"  Monthly EMI: {emi:,.2f} {currency}\n"
        f"  Total Payment: {total_payment:,.2f} {currency}\n"
        f"  Total Interest: {total_interest:,.2f} {currency}"
    )

print("✅ emi_calculator tool defined")
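
The EMI formula used in the tool, `P * r * (1 + r)^n / ((1 + r)^n - 1)` with monthly rate `r` and `n` payments, can be sanity-checked by simulating the loan month by month: if the payment is right, the balance reaches zero after the final installment. The sketch below does that for the loan from Test 2. The helper names `emi` and `remaining_balance` are hypothetical and exist only for this check.

```python
def emi(principal: float, annual_rate_pct: float, months: int) -> float:
    """Fixed monthly payment for a fully amortizing loan."""
    r = annual_rate_pct / 12 / 100  # monthly rate as a fraction
    if r == 0:
        return principal / months  # interest-free loan: split evenly
    growth = (1 + r) ** months
    return principal * r * growth / (growth - 1)


def remaining_balance(principal: float, annual_rate_pct: float, months: int) -> float:
    """Simulate every payment and return the balance left at the end."""
    r = annual_rate_pct / 12 / 100
    payment = emi(principal, annual_rate_pct, months)
    balance = principal
    for _ in range(months):
        # Interest accrues on the outstanding balance, then the payment is applied.
        balance = balance * (1 + r) - payment
    return balance


payment = emi(30_000, 8.5, 36)
leftover = remaining_balance(30_000, 8.5, 36)
print(f"Monthly payment: {payment:,.2f}")
print(f"Balance after final payment: {leftover:.6f}")
```

A leftover balance that is zero up to floating-point rounding confirms that the closed-form payment and the month-by-month schedule agree.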

## 6. Build LangGraph Workflow

Construct the agent graph with an LLM node, a tool node, and routing logic.

In [None]:
# Build LangGraph workflow
tools = [search_vehicles, emi_calculator]
llm_with_tools = llm.bind_tools(tools)

def call_llm(state: MessagesState):
    """LLM node: Calls LLM with current messages."""
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState) -> Literal["tools", "__end__"]:
    """Router: Check if agent wants to use tools."""
    last_message = state["messages"][-1]
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return END

# Build graph
workflow = StateGraph(MessagesState)
workflow.add_node("llm", call_llm)
workflow.add_node("tools", ToolNode(tools))
workflow.add_edge(START, "llm")
workflow.add_conditional_edges("llm", should_continue, {"tools": "tools", END: END})
workflow.add_edge("tools", "llm")

app = workflow.compile()
print("✅ Graph compiled")
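
The control flow that this graph encodes (LLM, then tools if requested, then back to the LLM) can be sketched as an ordinary loop. The version below is a simplified stand-in, not how LangGraph is implemented: messages are plain dicts, `fake_llm` is a scripted placeholder for the model, and the `add` tool is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: plain dicts instead of LangChain message objects.
Message = Dict[str, object]


def fake_llm(messages: List[Message]) -> Message:
    """Scripted model: request one tool call, then answer once a result exists."""
    if any(m["role"] == "tool" for m in messages):
        return {"role": "ai", "content": "Done.", "tool_calls": []}
    return {
        "role": "ai",
        "content": "",
        "tool_calls": [{"name": "add", "args": {"a": 2, "b": 3}}],
    }


# Hypothetical tool registry keyed by tool name.
TOOLS: Dict[str, Callable[..., object]] = {"add": lambda a, b: a + b}


def run_agent(messages: List[Message]) -> List[Message]:
    """Alternate between the model and the tools until no tool call is requested."""
    messages = list(messages)  # copy so the caller's list is not mutated
    while True:
        reply = fake_llm(messages)  # plays the role of the "llm" node
        messages.append(reply)
        if not reply["tool_calls"]:  # router: no tool calls means we are done
            return messages
        for call in reply["tool_calls"]:  # plays the role of the "tools" node
            output = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": str(output)})


history = run_agent([{"role": "human", "content": "What is 2 + 3?"}])
print([m["role"] for m in history])
```

The `if not reply["tool_calls"]` check corresponds to `should_continue`, and the unconditional return to the top of the loop corresponds to the `tools` to `llm` edge.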

## 7. Test the Agent

Execute various queries to test the agent's capabilities.

### Test 1: Simple Vehicle Search

In [None]:
# Test: Simple vehicle search
state = {
    "messages": [
        HumanMessage(content="Which Toyota sedan is most fuel-efficient under $30,000?")
    ]
}

print("Query: Which Toyota sedan is most fuel-efficient under $30,000?")
print("=" * 70)

result = app.invoke(state)

print(f"\nTotal messages: {len(result['messages'])}")
print("\nFinal Response:")
print("=" * 70)
print(result['messages'][-1].content)
print("=" * 70)

### Test 2: Combined Search and Calculation

In [None]:
# Test: combined search and EMI calculation
query2 = (
    "What is the base price of the Toyota Camry, "
    "and what is the EMI for a $30,000 loan at 8.5% per annum for 36 months?"
)
state2 = {"messages": [HumanMessage(content=query2)]}

print(f"Query: {query2}")
print("=" * 70)

result2 = app.invoke(state2)

print(f"\nTotal messages: {len(result2['messages'])}")
print("\nFinal Response:")
print("=" * 70)
print(result2['messages'][-1].content)
print("=" * 70)

In [None]:
result2["messages"]

In [None]:
result2["messages"][-1].content

## 8. Streaming Execution

Observe the agent's decision-making process in real-time through streaming.

In [None]:
# Streaming execution
state_stream = {
    "messages": [
        HumanMessage(content="Show me affordable SUVs with good fuel economy")
    ]
}

print("Streaming Execution")
print("=" * 70)
print("Query: Show me affordable SUVs with good fuel economy\n")

step_count = 0

for event in app.stream(state_stream):
    for node_name, data in event.items():
        step_count += 1
        print(f"\n[Step {step_count}] Node: '{node_name}'")
        print("-" * 60)
        
        if "messages" in data:
            for msg in data["messages"]:
                if isinstance(msg, AIMessage):
                    if msg.tool_calls:
                        # A single AI message may request several tools
                        for call in msg.tool_calls:
                            print(f"  🔍 Calling {call['name']}")
                            print(f"     Args: {call['args']}")
                    else:
                        print("  💬 Final response generated")

                elif isinstance(msg, ToolMessage):
                    print("  ✅ Tool executed")
                    first_line = msg.content.split('\n')[0]
                    print(f"     Result: {first_line}")

print("\n" + "=" * 70)
print(f"Total steps: {step_count}")
print("=" * 70)

## Conclusion

In this notebook, you learned:

✅ **Vector store integration** - Connected ChromaDB to LangGraph agents, enabling semantic search over Toyota vehicle specifications

✅ **Semantic search tools** - Created a `search_vehicles` tool that performs vector similarity search and returns relevant document chunks based on natural language queries

✅ **Multi-tool agents** - Built agents that intelligently combine semantic search (for retrieving vehicle information) with calculation tools (for EMI computation)

✅ **Complex query handling** - Processed queries requiring both information retrieval from the knowledge base and mathematical calculations, demonstrating the power of combining RAG with agentic workflows

### Key Patterns Demonstrated

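- **Parallel execution** - When a query has independent parts, as in Test 2 where the loan amount is stated explicitly, the model can request the search and the calculation in the same turn, and `ToolNode` runs both calls before control returns to the LLM
- **Sequential execution** - When the calculation depends on retrieved data (for example, an EMI on a vehicle's base price), the agent searches first and then passes the retrieved price to the calculator on the next loop iteration
- **Conversational context** - `MessagesState` accumulates every message within a run, so passing the earlier messages back in with a new `HumanMessage` supports follow-up questions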
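
As a rough illustration of the parallel pattern, the sketch below dispatches two independent tool calls concurrently and collects the results in request order. It is written under stated assumptions rather than taken from LangGraph: the `lookup_price` and `monthly_payment` functions, the `TOOLS` registry, and the dict-shaped tool calls are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def lookup_price(model: str) -> str:
    # Hypothetical placeholder; the notebook's real search queries ChromaDB.
    return f"price lookup for {model}"


def monthly_payment(principal: float, months: int) -> str:
    # Hypothetical interest-free calculation, just to have a second tool.
    return f"{principal / months:.2f}"


TOOLS: Dict[str, Callable[..., str]] = {
    "lookup_price": lookup_price,
    "monthly_payment": monthly_payment,
}

# Two independent requests, as a model might emit them in a single turn.
tool_calls = [
    {"name": "lookup_price", "args": {"model": "Camry"}},
    {"name": "monthly_payment", "args": {"principal": 30_000, "months": 36}},
]


def run_tool_calls(calls: List[dict]) -> List[str]:
    """Run every requested tool concurrently; results keep the request order."""

    def run_one(call: dict) -> str:
        return TOOLS[call["name"]](**call["args"])

    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_one, calls))  # map preserves input order


results = run_tool_calls(tool_calls)
print(results)
```

Concurrency only helps when the calls are independent. In the sequential pattern, the second call needs the first call's output, so it cannot be issued until the next turn.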

### Next Steps

Now that you understand vector similarity search with LangGraph, you're ready to explore two of these patterns in more detail:

- **Sequential execution patterns** - Deep dive into multi-step dependent workflows where each tool's output feeds into the next
- **Conversational context management** - Learn how to maintain and leverage conversation history across multiple user turns for more natural interactions