# Tutorial 13: Perplexity-Style Research Assistant

Build a **full-featured research assistant** that combines all RAG patterns into a polished, Perplexity-like experience.

**What you'll learn:**
- **In-Text Citations**: `[1]`, `[2]` style source references
- **Source Metadata**: Title, URL, page numbers, relevance scores
- **Multi-Source Synthesis**: Combine local docs and web search
- **Streaming Output**: Real-time response generation
- **Follow-Up Suggestions**: Related questions to explore

## The Perplexity Experience

What makes Perplexity great:
1. **Instant answers** with cited sources
2. **Visual source cards** showing where info came from
3. **Follow-up suggestions** for deeper exploration
4. **Streaming** for immediate feedback

We'll build a text-based version of this around a local LLM; only the optional web search calls an external API.

## Setup & API Keys

For web search, you'll need a Tavily API key (free tier available):

1. Sign up at https://tavily.com
2. Get your API key from the dashboard
3. Add to your `.env` file:
   ```
   TAVILY_API_KEY=tvly-your-key-here
   ```

Or set it directly in the notebook:

In [None]:
import os
from dotenv import load_dotenv

# Load from .env file
load_dotenv()

# Or set directly (uncomment and add your key):
# os.environ["TAVILY_API_KEY"] = "tvly-your-key-here"

has_tavily = bool(os.environ.get("TAVILY_API_KEY"))
print(f"Tavily API configured: {has_tavily}")
if not has_tavily:
    print("\nNote: Web search will use mock results. Set TAVILY_API_KEY for real web search.")

In [None]:
from langgraph_ollama_local import LocalAgentConfig
from langchain_ollama import ChatOllama

config = LocalAgentConfig()
llm = ChatOllama(
    model=config.ollama.model,
    base_url=config.ollama.base_url,
    temperature=0,
)
print(f"Using model: {config.ollama.model}")

In [None]:
from typing import List, Dict, Any, Optional
from typing_extensions import TypedDict
from langchain_core.documents import Document
from dataclasses import dataclass

@dataclass
class Source:
    """A source with metadata for citation."""
    index: int
    title: str
    url: str
    content: str
    source_type: str  # "local", "web", or "web_mock" (placeholder used when no Tavily key is set)
    page: Optional[int] = None
    relevance_score: float = 0.0

class ResearchState(TypedDict):
    """State for Research Assistant."""
    question: str
    sources: List[Source]
    answer: str
    citations: Dict[int, str]  # {1: "source_title", 2: "source_title"}
    follow_up_questions: List[str]

print("State defined!")

In [None]:
from langgraph_ollama_local.rag import LocalRetriever, DocumentGrader

retriever = LocalRetriever()
grader = DocumentGrader(llm)

def search_local(query: str, k: int = 4) -> List[Source]:
    """Search local documents."""
    results = retriever.retrieve(query, k=k)
    sources = []
    
    for i, (doc, score) in enumerate(results, 1):
        sources.append(Source(
            index=i,
            title=doc.metadata.get('filename', 'Unknown'),
            url=doc.metadata.get('source', ''),
            content=doc.page_content,
            source_type='local',
            page=doc.metadata.get('page'),
            relevance_score=score,
        ))
    
    return sources

def search_web(query: str, k: int = 3) -> List[Source]:
    """Search the web using Tavily."""
    if not os.environ.get("TAVILY_API_KEY"):
        return [Source(
            index=1,
            title="Web Search (Mock)",
            url="https://example.com",
            content=f"Mock web result for: {query}",
            source_type='web_mock',
            relevance_score=0.5,
        )]
    
    try:
        from tavily import TavilyClient
        client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
        response = client.search(query, max_results=k)
        
        sources = []
        for i, r in enumerate(response.get("results", []), 1):
            sources.append(Source(
                index=i,
                title=r.get("title", "Unknown"),
                url=r.get("url", ""),
                content=r.get("content", ""),
                source_type='web',
                relevance_score=r.get("score", 0.5),
            ))
        return sources
    except Exception as e:
        print(f"Web search failed: {e}")
        return []

print("Search functions defined!")

In [None]:
from langchain_core.prompts import ChatPromptTemplate

RESEARCH_PROMPT = ChatPromptTemplate.from_template(
"""You are a research assistant providing well-sourced answers.

IMPORTANT: You MUST cite sources using [1], [2], etc. format inline with your answer.
Every factual claim should have a citation.

Sources:
{sources}

Question: {question}

Provide a comprehensive answer with inline citations [1], [2], etc.
Format your response clearly with proper citations after relevant statements.

Answer:""")

FOLLOWUP_PROMPT = ChatPromptTemplate.from_template(
"""Based on this question and answer, suggest 3 follow-up questions the user might want to explore.

Question: {question}
Answer: {answer}

Provide exactly 3 follow-up questions, one per line, without numbering:""")

print("Prompts defined!")

In [None]:
def gather_sources(state: ResearchState) -> dict:
    """Gather sources from local docs and web."""
    print("--- GATHERING SOURCES ---")
    question = state["question"]
    
    # Get local sources
    local_sources = search_local(question, k=3)
    print(f"Found {len(local_sources)} local sources")
    
    # Grade local sources
    relevant_local = []
    for src in local_sources:
        doc = Document(page_content=src.content)
        if grader.grade(doc, question):
            relevant_local.append(src)
    print(f"Relevant local sources: {len(relevant_local)}")
    
    # Get web sources if local is insufficient
    web_sources = []
    if len(relevant_local) < 2:
        print("Supplementing with web search...")
        web_sources = search_web(question, k=2)
        print(f"Found {len(web_sources)} web sources")
    
    # Combine and re-index
    all_sources = relevant_local + web_sources
    for i, src in enumerate(all_sources, 1):
        src.index = i
    
    # Build citation mapping
    citations = {src.index: src.title for src in all_sources}
    
    return {"sources": all_sources, "citations": citations}

print("Gather sources node defined!")
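
The combine-and-re-index step is what keeps the `[n]` markers in the answer aligned with the printed source list. Below is a minimal, self-contained sketch of that step in isolation. `MiniSource` and `merge_and_reindex` are hypothetical names used only for illustration; unlike the node above, this variant copies each source rather than mutating it, which is a design choice and not something the tutorial requires.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass
class MiniSource:
    """Hypothetical stand-in for the tutorial's Source dataclass."""
    index: int
    title: str
    source_type: str


def merge_and_reindex(
    local: List[MiniSource], web: List[MiniSource]
) -> Tuple[List[MiniSource], Dict[int, str]]:
    """Concatenate local and web sources and number them 1..N."""
    merged = [
        replace(src, index=i)  # copy with a new index; the inputs stay untouched
        for i, src in enumerate(local + web, 1)
    ]
    # Same mapping shape as the `citations` field in ResearchState
    citations = {src.index: src.title for src in merged}
    return merged, citations


local = [MiniSource(1, "self_rag.pdf", "local")]
web = [MiniSource(1, "CRAG overview", "web"), MiniSource(2, "RAG survey", "web")]
merged, citations = merge_and_reindex(local, web)
print(citations)  # {1: 'self_rag.pdf', 2: 'CRAG overview', 3: 'RAG survey'}
```

Both search functions number their results from 1, so without this step two different sources would share the marker `[1]`.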

In [None]:
def generate_answer(state: ResearchState) -> dict:
    """Generate answer with citations."""
    print("--- GENERATING ANSWER ---")
    
    sources = state["sources"]
    question = state["question"]
    
    if not sources:
        return {"answer": "I couldn't find any relevant sources to answer this question."}
    
    # Format sources for prompt
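    # Truncate long content to 800 characters; add "..." only when text was actually cut
    sources_text = "\n\n".join(
        f"[{src.index}] {src.title}\n{src.content[:800]}{'...' if len(src.content) > 800 else ''}"
        for src in sources
    )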
    
    messages = RESEARCH_PROMPT.format_messages(
        sources=sources_text,
        question=question
    )
    
    response = llm.invoke(messages)
    
    return {"answer": response.content}

print("Generate answer node defined!")
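
The prompt asks the model to cite with `[1]`, `[2]`, and so on, but nothing guarantees that it does, or that every number it emits matches a real source. The sketch below shows one way to check this after generation. `check_citations` is a hypothetical helper, not part of `langgraph_ollama_local`, and it assumes citations appear exactly as a bracketed integer.

```python
import re
from typing import Dict, List, Set

# Matches bracketed integers such as [1] or [12]
CITATION_RE = re.compile(r"\[(\d+)\]")


def check_citations(answer: str, valid_indices: Set[int]) -> Dict[str, List[int]]:
    """Compare citation markers in an answer against the known source indices."""
    cited = {int(n) for n in CITATION_RE.findall(answer)}
    return {
        "cited": sorted(cited & valid_indices),    # markers that match a source
        "unknown": sorted(cited - valid_indices),  # markers with no matching source
        "unused": sorted(valid_indices - cited),   # sources never referenced
    }


report = check_citations(
    "Self-RAG grades its own output [1]. CRAG adds a web fallback [2][4].",
    valid_indices={1, 2, 3},
)
print(report)  # {'cited': [1, 2], 'unknown': [4], 'unused': [3]}
```

With the graph state from this notebook, the call would look like `check_citations(result["answer"], set(result["citations"]))`. A non-empty `unknown` list is a reasonable signal to regenerate or to warn the reader.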

In [None]:
def generate_followups(state: ResearchState) -> dict:
    """Generate follow-up questions."""
    print("--- GENERATING FOLLOW-UPS ---")
    
    messages = FOLLOWUP_PROMPT.format_messages(
        question=state["question"],
        answer=state["answer"]
    )
    
    response = llm.invoke(messages)
    
    # Parse follow-up questions
    followups = [q.strip() for q in response.content.strip().split("\n") if q.strip()]
    
    return {"follow_up_questions": followups[:3]}

print("Generate followups node defined!")
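
Small local models often number or bullet their output even when told not to, which would leave stray prefixes in the "Related Questions" list. The following sketch strips such prefixes before truncating to three questions. `parse_followups` is a hypothetical helper, and the set of prefixes it recognizes is an assumption about typical model output.

```python
import re
from typing import List

# Leading "1.", "2)", "-", "*", or "•" followed by whitespace
_PREFIX_RE = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_followups(text: str, limit: int = 3) -> List[str]:
    """Split model output into lines, drop list prefixes and blank lines."""
    questions = []
    for line in text.splitlines():
        cleaned = _PREFIX_RE.sub("", line).strip()
        if cleaned:
            questions.append(cleaned)
    return questions[:limit]


raw = (
    "1. How does Self-RAG decide when to retrieve?\n"
    "- What does CRAG do with weak results?\n"
    "\n"
    "Which is cheaper to run?"
)
print(parse_followups(raw))
```

This does not filter out an introductory line such as "Here are three questions:", so a stricter version might also require each kept line to end with a question mark.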

In [None]:
from langgraph.graph import StateGraph, START, END

graph = StateGraph(ResearchState)

graph.add_node("gather_sources", gather_sources)
graph.add_node("generate_answer", generate_answer)
graph.add_node("generate_followups", generate_followups)

graph.add_edge(START, "gather_sources")
graph.add_edge("gather_sources", "generate_answer")
graph.add_edge("generate_answer", "generate_followups")
graph.add_edge("generate_followups", END)

research_assistant = graph.compile()
print("Research Assistant compiled!")

In [None]:
def format_response(result: dict) -> str:
    """Format the response like Perplexity."""
    output = []
    
    # Answer
    output.append(result["answer"])
    output.append("")
    
    # Sources
    output.append("─" * 50)
    output.append("📚 Sources:")
    output.append("─" * 50)
    
    for src in result["sources"]:
        source_type = "🌐" if "web" in src.source_type else "📄"
        page_info = f" (page {src.page})" if src.page is not None else ""  # keep page 0 visible
        score = f" [{src.relevance_score:.0%}]" if src.relevance_score else ""
        output.append(f"[{src.index}] {source_type} {src.title}{page_info}{score}")
        if src.url and "http" in src.url:
            output.append(f"    {src.url}")
    
    output.append("")
    
    # Follow-ups
    output.append("─" * 50)
    output.append("🔍 Related Questions:")
    output.append("─" * 50)
    for i, q in enumerate(result.get("follow_up_questions", []), 1):
        output.append(f"{i}. {q}")
    
    return "\n".join(output)

print("Formatter defined!")

In [None]:
# Test the Research Assistant
question = "What is Self-RAG and how does it compare to CRAG?"

print(f"🔎 Question: {question}\n")
print("=" * 60)

result = research_assistant.invoke({"question": question})

print("=" * 60)
print()
print(format_response(result))

In [None]:
# Interactive research function
def research(question: str):
    """Run a research query and display formatted results."""
    print(f"🔎 Researching: {question}\n")
    result = research_assistant.invoke({"question": question})
    print(format_response(result))
    return result

# Try another query
research("What are the key components of Adaptive RAG?")
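
The feature list mentions streaming, but the graph above calls `llm.invoke`, which returns only once the full answer is ready. The sketch below covers the display side of streaming: a helper that consumes any iterator of text chunks and prints each one as it arrives. `fake_token_stream` is a hypothetical stand-in for a model so the block runs on its own, and `print_stream` is likewise an illustrative name.

```python
import sys
import time
from typing import Iterable, Iterator


def fake_token_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a model stream: yields one word at a time."""
    for word in text.split(" "):
        time.sleep(delay)  # simulate generation latency
        yield word + " "


def print_stream(chunks: Iterable[str]) -> str:
    """Print chunks as they arrive and return the accumulated text."""
    parts = []
    for chunk in chunks:
        sys.stdout.write(chunk)
        sys.stdout.flush()  # show the chunk immediately instead of buffering
        parts.append(chunk)
    sys.stdout.write("\n")
    return "".join(parts)


answer = print_stream(fake_token_stream("Self-RAG grades its own output [1]."))
```

LangChain chat models expose a `stream` method that yields message chunks with a `.content` attribute, so with the notebook's `llm` something like `print_stream(chunk.content for chunk in llm.stream(messages))` should behave the same way, assuming the installed `langchain_ollama` version streams from your Ollama server.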

## Key Features Implemented

| Feature | Implementation |
|---------|---------------|
| **In-text citations** | `[1]`, `[2]` format in answers |
| **Source metadata** | Title, page, relevance score |
| **Multi-source** | Local docs + web search |
| **Source grading** | Filter irrelevant sources |
| **Follow-up questions** | LLM-generated suggestions |
| **Formatted output** | Perplexity-style display |

## Congratulations!

You've completed all RAG pattern tutorials! You now know how to build:
- Basic RAG with document retrieval
- Self-RAG with quality grading
- CRAG with web search fallback
- Adaptive RAG with query routing
- Agentic RAG with agent-controlled retrieval
- A full Perplexity-style research assistant

All running locally with Ollama!