# LangChain v1 RAG Application Examples

This notebook demonstrates usage patterns for the refactored RAG application using **LangChain v1**.

## Contents
1. Setup and Initialization
2. Document Ingestion Patterns
3. RAG Chat with Agents
4. Advanced Agent Patterns
5. Retriever Customization
6. Conversation Memory
7. Structured Outputs
8. Error Handling and Observability

## 1. Setup and Initialization

In [1]:
# Add project root to path
import sys
from pathlib import Path
sys.path.insert(0, str(Path.cwd().parent / 'src'))

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Verify imports
from acc_llamaindex.config import config
from acc_llamaindex.infrastructure.llm_providers.langchain_provider import get_llm, get_embeddings
from acc_llamaindex.infrastructure.db.chroma_client import chroma_client
from acc_llamaindex.application.ingest_documents_service.service import ingest_service
from acc_llamaindex.application.chat_service.service import chat_service

print("✓ All imports successful")

2025-10-16 21:05:32.193 | INFO     | acc_llamaindex.application.ingest_documents_service.service:__init__:38 - IngestDocumentsService initialized with documents_path: /Users/kevinknox/coding/acc-llamaindex/data/documents
2025-10-16 21:05:32.240 | INFO     | acc_llamaindex.application.chat_service.service:__init__:18 - ChatService initialized


✓ All imports successful


## 2. Document Ingestion Patterns

### Pattern 1: Basic Document Ingestion

In [2]:
# Initialize services
get_llm()
get_embeddings()
chroma_client.initialize()

print(f"Documents path: {config.documents_path}")
print(f"ChromaDB path: {config.chroma_db_path}")
print(f"Chunk size: {config.chunk_size}, overlap: {config.chunk_overlap}")

2025-10-16 21:05:34.369 | INFO     | acc_llamaindex.infrastructure.llm_providers.langchain_provider:_initialize_llm:61 - Initializing LLM with provider: openai
2025-10-16 21:05:34.598 | INFO     | acc_llamaindex.infrastructure.llm_providers.openai_provider:initialize_llm:28 - Initializing ChatOpenAI with model: gpt-5-nano-2025-08-07
2025-10-16 21:05:34.708 | INFO     | acc_llamaindex.infrastructure.llm_providers.openai_provider:initialize_llm:38 - ChatOpenAI initialized successfully
2025-10-16 21:05:34.708 | INFO     | acc_llamaindex.infrastructure.llm_providers.langchain_provider:_initialize_embeddings:92 - Initializing embeddings with provider: openai
2025-10-16 21:05:34.708 | INFO     | acc_llamaindex.infrastructure.llm_providers.openai_provider:in

Documents path: /Users/kevinknox/coding/acc-llamaindex/data/documents
ChromaDB path: /Users/kevinknox/coding/acc-llamaindex/data/chroma_db
Chunk size: 1024, overlap: 200
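
The chunk size and overlap reported above control how each document is split before embedding. As a rough illustration only, here is a character-based sliding window. The application's actual splitter is not shown in this notebook and may count tokens or respect separators instead; `chunk_text` is a hypothetical helper, not part of the project.

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 200) -> list[str]:
    """Split text into character windows of chunk_size that overlap by chunk_overlap.

    Hypothetical helper for illustration; not the project's splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample_text = "".join(str(i % 10) for i in range(2000))
sample_chunks = chunk_text(sample_text, chunk_size=1024, chunk_overlap=200)
```

With these settings, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.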


In [4]:
# Ingest documents from default directory
result = ingest_service.ingest_documents_from_directory()

print(f"Success: {result.success}")
print(f"Documents processed: {result.documents_processed}")
print(f"Documents failed: {result.documents_failed}")
print(f"Message: {result.message}")
print(f"\nCollection stats: {result.collection_stats}")

2025-10-16 21:06:27.065 | INFO     | acc_llamaindex.application.ingest_documents_service.service:ingest_documents_from_directory:78 - Starting document ingestion from: /Users/kevinknox/coding/acc-llamaindex/data/documents
2025-10-16 21:06:27.067 | INFO     | acc_llamaindex.application.ingest_documents_service.service:ingest_documents_from_directory:91 - Found 9 documents to process
2025-10-16 21:06:27.068 | INFO     | acc_llamaindex.application.ingest_documents_service.service:ingest_documents_from_directory:102 - Loading document: langchain_intro.txt
2025-10-16 21:06:27.068 | INFO     | acc_llamaindex.application.ingest_documents_service.service:ingest_documents_from_directory:102 - Loading document: AGOA Trade Fact Sheet_final.pdf
2025-10-16 21:06:27.183 | INFO

Success: True
Documents processed: 9
Documents failed: 0
Message: Successfully ingested 9 documents (2937 chunks)

Collection stats: {'collection_name': 'documents', 'document_count': 11748, 'status': 'active'}
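
Note that the reported `document_count` (11748) is exactly four times the 2937 chunks ingested in this run, which would be consistent with the same files having been added to the collection on earlier runs. One common way to make re-ingestion idempotent is to derive each chunk's ID from its source and content, so that a re-run overwrites the existing record instead of adding a new one. The sketch below is an assumption about how that could be done, not a description of `ingest_service`. `make_chunk_id`, `upsert_chunks`, and the in-memory `store` dict are hypothetical stand-ins for a vector store's upsert-by-ID behaviour.

```python
import hashlib


def make_chunk_id(source: str, chunk_index: int, text: str) -> str:
    """Derive a stable ID from the source name, chunk position, and chunk text."""
    payload = f"{source}\x00{chunk_index}\x00{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:32]


def upsert_chunks(store: dict[str, str], source: str, chunks: list[str]) -> int:
    """Insert or overwrite chunks keyed by deterministic ID; return the store size."""
    for index, text in enumerate(chunks):
        store[make_chunk_id(source, index, text)] = text
    return len(store)


# Hypothetical in-memory store standing in for a vector store collection
store: dict[str, str] = {}
upsert_chunks(store, "langchain_intro.txt", ["chunk one", "chunk two"])
size_after_rerun = upsert_chunks(store, "langchain_intro.txt", ["chunk one", "chunk two"])
```

Running the same ingestion twice leaves the store size unchanged, because both runs produce the same IDs.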


### Pattern 2: Ingest from Custom Directory

In [None]:
# Create a temporary directory that is removed automatically, even if ingestion fails
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    # Create a test document
    test_doc = Path(temp_dir) / "test.txt"
    test_doc.write_text(
        "This is a test document about artificial intelligence and machine learning.",
        encoding="utf-8",
    )

    # Ingest from the custom directory
    result = ingest_service.ingest_documents_from_directory(temp_dir)
    print(f"Ingested {result.documents_processed} documents from custom directory")

### Pattern 3: Ingest Single File

In [None]:
# Ingest a single file
single_file = "../data/documents/langchain_intro.txt"
result = ingest_service.ingest_single_file(single_file)

print(f"Success: {result.success}")
print(f"Message: {result.message}")

## 3. RAG Chat with Agents

### Pattern 1: Basic Chat

In [None]:
# Initialize chat service
chat_service.initialize()

# Simple chat query
response = chat_service.chat("What is LangChain v1?")

print(f"Success: {response['success']}")
print(f"\nResponse:\n{response['response']}")

### Pattern 2: Chat with Conversation History

In [None]:
# Start a conversation
conversation_history = []

# First message
response1 = chat_service.chat(
    "What are the key features of LangChain?",
    conversation_history=conversation_history
)
print("User: What are the key features of LangChain?")
print(f"Assistant: {response1['response'][:200]}...\n")

# Add to history
conversation_history.extend([
    {"role": "user", "content": "What are the key features of LangChain?"},
    {"role": "assistant", "content": response1['response']}
])

# Follow-up message
response2 = chat_service.chat(
    "Can you explain more about agents?",
    conversation_history=conversation_history
)
print("User: Can you explain more about agents?")
print(f"Assistant: {response2['response'][:200]}...")
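
Because the full history list is passed on every call, it grows with each turn. Below is a minimal sketch of one way to bound it, assuming the same `{"role", "content"}` message dicts used above. `trim_history` and `max_turns` are hypothetical names and are not part of `chat_service`.

```python
def trim_history(history: list[dict[str, str]], max_turns: int = 5) -> list[dict[str, str]]:
    """Keep only the most recent max_turns user/assistant pairs.

    Assumes the history alternates user and assistant messages, as in the cell above.
    """
    if max_turns <= 0:
        return []
    return history[-2 * max_turns:]


# Build an illustrative 8-turn history and keep the last 2 turns
long_history = []
for turn in range(8):
    long_history.append({"role": "user", "content": f"question {turn}"})
    long_history.append({"role": "assistant", "content": f"answer {turn}"})

recent_history = trim_history(long_history, max_turns=2)
```

The trimmed list could then be passed as `conversation_history` in place of the full list.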

## 4. Advanced Agent Patterns

### Pattern 1: Direct Agent Creation with Custom Tools

In [None]:
from langchain.agents import create_agent
from langchain_core.tools import tool
from datetime import datetime

# Create custom tools
@tool
def get_current_time() -> str:
    """Get the current time in a human-readable format."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

@tool
def search_documents(query: str) -> str:
    """Search the document knowledge base for relevant information."""
    vector_store = chroma_client.get_vector_store()
    docs = vector_store.similarity_search(query, k=3)
    if not docs:
        return "No relevant documents found."
    return "\n\n".join([doc.page_content for doc in docs])

# Create agent with multiple tools
llm = get_llm()
agent = create_agent(
    model=llm,
    tools=[get_current_time, search_documents],
    system_prompt="You are a helpful assistant with access to document search and time utilities."
)

# Test the agent
response = agent.invoke({
    "messages": [{"role": "user", "content": "What time is it and what do the documents say about RAG?"}]
})

print("Agent response:")
for msg in response["messages"]:
    if hasattr(msg, 'content') and msg.content:
        print(f"{msg.__class__.__name__}: {msg.content[:200]}...")

### Pattern 2: Agent with Dynamic Model Selection

In [None]:
from langchain.agents.middleware import wrap_model_call, ModelRequest
from langchain_openai import ChatOpenAI

# Create different models for different complexity
basic_model = ChatOpenAI(model="gpt-5-nano-2025-08-07", temperature=0.3)
advanced_model = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

@wrap_model_call
def dynamic_model_selection(request: ModelRequest, handler):
    """Select model based on message count."""
    message_count = len(request.state["messages"])
    
    # Use advanced model for complex conversations
    if message_count > 5:
        print(f"Using advanced model (message count: {message_count})")
        request.model = advanced_model
    else:
        print(f"Using basic model (message count: {message_count})")
        request.model = basic_model
    
    return handler(request)

# Create agent with dynamic model selection
agent = create_agent(
    model=basic_model,
    tools=[search_documents],
    middleware=[dynamic_model_selection]
)

# Test with simple query
response = agent.invoke({"messages": [{"role": "user", "content": "Hello!"}]})
print("\nSimple query completed")

## 5. Retriever Customization

### Pattern 1: Custom Retriever with Score Threshold

In [None]:
# Get vector store
vector_store = chroma_client.get_vector_store()

# Similarity search with scores
query = "What is retrieval-augmented generation?"
results = vector_store.similarity_search_with_score(query, k=5)

print(f"Query: {query}\n")
for i, (doc, score) in enumerate(results, 1):
    print(f"Result {i} (score: {score:.4f}):")
    print(f"{doc.page_content[:150]}...\n")
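
The cell above prints scores but does not apply a threshold. Here is a minimal sketch of the filtering step, under the assumption that the score is a distance (lower means more similar), which is the usual behaviour of `similarity_search_with_score` for Chroma-backed stores. Verify this against your store before choosing a cutoff. `filter_by_max_distance` and the cutoff `0.5` are illustrative, and the sample data uses plain strings in place of `Document` objects.

```python
from typing import Any


def filter_by_max_distance(
    results: list[tuple[Any, float]], max_distance: float
) -> list[tuple[Any, float]]:
    """Keep (doc, score) pairs whose distance is at most max_distance, closest first."""
    kept = [(doc, score) for doc, score in results if score <= max_distance]
    return sorted(kept, key=lambda pair: pair[1])


# Illustrative data only; in the notebook, pass the `results` list from the cell above
sample_results = [("doc-a", 0.31), ("doc-b", 0.72), ("doc-c", 0.12)]
relevant = filter_by_max_distance(sample_results, max_distance=0.5)
```

If the store returns similarities instead (higher is better), the comparison and sort order would need to be reversed.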

### Pattern 2: Multi-Query Retrieval

In [None]:
# Multiple related queries
queries = [
    "What is LangChain?",
    "How do agents work?",
    "What are the benefits of RAG?"
]

all_results = []
for query in queries:
    results = vector_store.similarity_search(query, k=2)
    all_results.extend(results)

# Deduplicate on the full chunk text, keeping the first occurrence
unique_docs = {}
for doc in all_results:
    unique_docs.setdefault(doc.page_content, doc)

print(f"Retrieved {len(unique_docs)} unique documents from {len(queries)} queries")
for i, doc in enumerate(list(unique_docs.values())[:3], 1):
    print(f"\nDocument {i}:")
    print(f"{doc.page_content[:150]}...")
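
Concatenating the per-query results, as above, discards each document's rank within its query. One alternative is reciprocal rank fusion (RRF), which scores a document by summing `1 / (k + rank)` over every result list in which it appears. This is a different technique from the cell above, sketched on plain string IDs. `reciprocal_rank_fusion` is a hypothetical helper, and `k=60` is a commonly used default rather than a value taken from this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists into one, best first, using reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):
            scores[item] += 1.0 / (k + rank)  # earlier ranks contribute more
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# "b" appears in all three lists, so it ends up ranked first
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a"], ["b", "d"]])
```

Documents that several queries agree on rise to the top, which makes it easier to pick a small, high-confidence context for the LLM.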

## 6. Conversation Memory

### Pattern 1: Persistent Conversation with Checkpointer

In [None]:
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.runnables import RunnableConfig

# Create agent with memory
checkpointer = InMemorySaver()

agent = create_agent(
    model=get_llm(),
    tools=[search_documents],
    checkpointer=checkpointer,
    system_prompt="You are a helpful assistant. Remember the conversation context."
)

# Conversation thread (named thread_config so it does not shadow the application `config` imported in section 1)
thread_config: RunnableConfig = {"configurable": {"thread_id": "user-123"}}

# First message
response1 = agent.invoke(
    {"messages": [{"role": "user", "content": "My name is Alice"}]},
    thread_config
)
print("User: My name is Alice")
print(f"Assistant: {response1['messages'][-1].content[:100]}...\n")

# Second message (same thread_id, so the agent should remember the name)
response2 = agent.invoke(
    {"messages": [{"role": "user", "content": "What's my name?"}]},
    thread_config
)
print("User: What's my name?")
print(f"Assistant: {response2['messages'][-1].content}")

## 7. Structured Outputs

### Pattern 1: Extract Structured Data from Documents

In [None]:
from pydantic import BaseModel, Field
from langchain.agents.structured_output import ToolStrategy

# Define schema
class DocumentSummary(BaseModel):
    """Summary of document content."""
    title: str = Field(description="Main topic or title")
    key_points: list[str] = Field(description="List of key points mentioned")
    category: str = Field(description="Document category (e.g., technical, guide, reference)")

# Create agent with structured output
agent = create_agent(
    model=get_llm(),
    tools=[search_documents],
    response_format=ToolStrategy(DocumentSummary)
)

# Get structured response
response = agent.invoke({
    "messages": [{"role": "user", "content": "Summarize what the documents say about LangChain"}]
})

if "structured_response" in response:
    summary = response["structured_response"]
    print(f"Title: {summary.title}")
    print(f"Category: {summary.category}")
    print("\nKey Points:")
    for i, point in enumerate(summary.key_points, 1):
        print(f"{i}. {point}")

## 8. Error Handling and Observability

### Pattern 1: Graceful Error Handling

In [None]:
from langchain.agents.middleware import wrap_tool_call
from langchain_core.messages import ToolMessage

@wrap_tool_call
def handle_tool_errors(request, handler):
    """Catch and handle tool execution errors."""
    try:
        return handler(request)
    except Exception as e:
        print(f"Tool error caught: {str(e)[:100]}")
        return ToolMessage(
            content="Tool execution failed. Please try rephrasing your request.",
            tool_call_id=request.tool_call["id"]
        )

# Create agent with error handling
agent = create_agent(
    model=get_llm(),
    tools=[search_documents],
    middleware=[handle_tool_errors]
)

# Test with query
response = agent.invoke({"messages": [{"role": "user", "content": "Search for information"}]})
print("Agent handled potential errors gracefully")
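
The middleware above turns any tool exception into a fallback message on the first failure. For transient failures (for example, a vector store timeout), it can be worth retrying before giving up. Below is a minimal sketch in plain Python. `call_with_retries` is a hypothetical helper, and the injectable `sleep` parameter exists only so the delays can be observed without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Illustrative flaky operation: fails twice, then succeeds
attempts = {"count": 0}
observed_delays: list[float] = []


def flaky_search() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


retry_result = call_with_retries(flaky_search, sleep=observed_delays.append)
```

Inside `handle_tool_errors`, the call `handler(request)` could be wrapped as `call_with_retries(lambda: handler(request))`, leaving the existing `except` branch as the final fallback.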

### Pattern 2: LangSmith Tracing

In [None]:
import os
from langsmith import traceable

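# Check if a LangSmith API key is configured.
# Newer SDKs read LANGSMITH_API_KEY; LANGCHAIN_API_KEY is the older name.
# Traces are only sent when tracing is also switched on (e.g. LANGSMITH_TRACING=true).
if os.getenv("LANGSMITH_API_KEY") or os.getenv("LANGCHAIN_API_KEY"):
    print("LangSmith API key found")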
    print(f"Project: {os.getenv('LANGCHAIN_PROJECT', 'default')}")
    
    @traceable(run_type="chain", name="custom_rag_chain")
    def custom_rag_chain(query: str) -> str:
        """Custom RAG chain with tracing."""
        # Retrieve documents
        vector_store = chroma_client.get_vector_store()
        docs = vector_store.similarity_search(query, k=3)
        
        # Generate response
        context = "\n\n".join([doc.page_content for doc in docs])
        llm = get_llm()
        response = llm.invoke(
            f"Based on this context:\n{context}\n\nAnswer: {query}"
        )
        return response.content
    
    # Test with tracing
    result = custom_rag_chain("What are the benefits of RAG?")
    print(f"\nResponse: {result[:200]}...")
    print("\n✓ Check LangSmith for full trace details")
else:
    print("LangSmith not configured. Set LANGCHAIN_API_KEY to enable tracing.")

## Summary

This notebook demonstrated:

1. **Document Ingestion**: Multiple patterns for loading documents into the vector store
2. **RAG Chat**: Basic and conversational chat patterns
3. **Advanced Agents**: Custom tools, dynamic model selection, and middleware
4. **Retrievers**: Custom retrieval strategies and multi-query patterns
5. **Memory**: Conversation persistence with checkpointers
6. **Structured Outputs**: Extracting validated data from LLM responses
7. **Error Handling**: Graceful degradation and tool error management
8. **Observability**: LangSmith tracing integration

## Next Steps

- Experiment with different embedding models
- Try various chunk sizes and overlaps
- Implement reranking for better retrieval
- Add evaluation metrics (faithfulness, relevance, etc.)
- Explore LangGraph for complex workflows
- Add streaming responses for better UX
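
As a starting point for the evaluation step, retrieval quality can be measured without an LLM by comparing retrieved chunk IDs against a hand-labelled set of relevant IDs. The sketch below covers only retrieval-side relevance. Faithfulness of generated answers needs a different approach and is not shown. `precision_at_k`, `recall_at_k`, and the sample IDs are hypothetical.

```python
def precision_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are labelled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant IDs that appear in the top-k retrieved IDs."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        return 0.0  # nothing to find, so recall is reported as zero by convention here
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


# Illustrative labels: three chunks are relevant, two of them are in the top 3
retrieved = ["d1", "d7", "d3", "d9"]
relevant = {"d1", "d3", "d4"}
p_at_3 = precision_at_k(retrieved, relevant, k=3)
r_at_3 = recall_at_k(retrieved, relevant, k=3)
```

Tracking these two numbers while varying chunk size, overlap, or the embedding model gives a simple way to compare the configurations listed above.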