# Research Agent with LangGraph and Ollama


This notebook implements a research agent that:
 1. Generates search queries
 2. Performs web research
 3. Summarizes findings
 4. Reflects on results to identify knowledge gaps

We'll use LangGraph for the agent's workflow and Ollama for local LLM inference.


## 1. Setup and Imports

In [22]:
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
# os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
# os.environ["TAVILY_API_KEY"] = "your-api-key"
# os.environ["OPENAI_API_KEY"] = 'your-api-key'
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"

In [23]:
import os
import json
from typing import Optional, Any, List
from typing_extensions import Literal, Annotated

from pydantic import BaseModel, Field
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableConfig
from langchain_ollama import ChatOllama
from langgraph.graph import START, END, StateGraph

## 2. Configuration Model
First, let's define our configuration using Pydantic instead of dataclasses.

In [None]:
class Configuration(BaseModel):
    """The configurable fields for the research assistant."""
    max_web_research_loops: int = 3
    local_llm: str = "deepseek-r1"

    @classmethod
    def from_runnable_config(
        cls, config: Optional[RunnableConfig] = None
    ) -> "Configuration":
        """Create a Configuration instance from a RunnableConfig."""
        configurable = (
            config["configurable"] if config and "configurable" in config else {}
        )
        values: dict[str, Any] = {
            field: os.environ.get(field.upper(), configurable.get(field))
            for field in cls.model_fields.keys()
        }
        return cls(**{k: v for k, v in values.items() if v is not None})
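
The lookup in `from_runnable_config` gives an environment variable named after the upper-cased field precedence over the matching `configurable` entry, and leaves the class default in place when neither is set. A stdlib-only sketch of that precedence rule follows; `resolve_settings` and its `environ` parameter are hypothetical names used for illustration and are not part of the notebook.

```python
from typing import Any, Dict, List, Mapping

def resolve_settings(
    field_names: List[str],
    configurable: Mapping[str, Any],
    environ: Mapping[str, str],
) -> Dict[str, Any]:
    """Mirror the lookup order: environment variable first, then `configurable`."""
    values = {
        name: environ.get(name.upper(), configurable.get(name))
        for name in field_names
    }
    # Fields that resolve to None are dropped so the model's own defaults apply.
    return {k: v for k, v in values.items() if v is not None}

fields = ["max_web_research_loops", "local_llm"]
from_config = resolve_settings(fields, {"local_llm": "llama3.3"}, environ={})
from_env = resolve_settings(
    fields, {"local_llm": "llama3.3"}, environ={"LOCAL_LLM": "deepseek-r1"}
)
print(from_config)  # {'local_llm': 'llama3.3'}
print(from_env)     # {'local_llm': 'deepseek-r1'}
```

Values read from the environment arrive as strings, so a setting such as `MAX_WEB_RESEARCH_LOOPS=5` relies on Pydantic's default coercion to become an `int`.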

# %% [markdown]
# ## 3. State Models
# Now let's define our state models using Pydantic

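# %%
import operator

class SummaryState(BaseModel):
    """Main state model for the research agent."""
    research_topic: Optional[str] = None
    search_query: Optional[str] = None
    # operator.add makes LangGraph append each node's returned list instead of replacing it,
    # so results and sources accumulate across research loops.
    web_research_results: Annotated[List[str], operator.add] = Field(default_factory=list)
    sources_gathered: Annotated[List[str], operator.add] = Field(default_factory=list)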
    research_loop_count: int = 0
    running_summary: Optional[str] = None

    # Pydantic v2 style config; the nested `class Config` form is deprecated in v2.
    model_config = {"arbitrary_types_allowed": True}

class SummaryStateInput(BaseModel):
    """Input state model."""
    research_topic: Optional[str] = None

class SummaryStateOutput(BaseModel):
    """Output state model."""
    running_summary: Optional[str] = None
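
How list fields such as `web_research_results` and `sources_gathered` evolve across loops depends on whether the field carries a reducer. In LangGraph, a state key without a reducer is overwritten by each node's return value, while a key annotated with one (for example `Annotated[list, operator.add]`) is combined with the existing value. The sketch below simulates both behaviours in plain Python; `apply_update` is a hypothetical helper, not LangGraph's implementation.

```python
import operator
from typing import Any, Callable, Dict, Optional

Reducer = Callable[[Any, Any], Any]

def apply_update(
    state: Dict[str, Any],
    update: Dict[str, Any],
    reducers: Optional[Dict[str, Reducer]] = None,
) -> Dict[str, Any]:
    """Return a new state: reduce keys that have a reducer, overwrite the rest."""
    reducers = reducers or {}
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged

# Two successive node returns, each carrying one formatted source (made-up data).
first = {"sources_gathered": ["* Source A : https://example.com/a"]}
second = {"sources_gathered": ["* Source B : https://example.com/b"]}

overwritten = apply_update(apply_update({"sources_gathered": []}, first), second)

add_lists = {"sources_gathered": operator.add}
accumulated = apply_update(
    apply_update({"sources_gathered": []}, first, add_lists), second, add_lists
)

print(len(overwritten["sources_gathered"]))  # 1
print(len(accumulated["sources_gathered"]))  # 2
```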

# %% [markdown]
# ## 4. Utility Functions
# Let's define our helper functions for web search and source formatting

# %%
from langsmith import traceable
from tavily import TavilyClient

def deduplicate_and_format_sources(search_response, max_tokens_per_source, include_raw_content=True):
    """
    Takes either a single search response or list of responses from Tavily API and formats them.
    Limits the raw_content to approximately max_tokens_per_source.
    """
    # Convert input to list of results
    if isinstance(search_response, dict):
        sources_list = search_response['results']
    elif isinstance(search_response, list):
        sources_list = []
        for response in search_response:
            if isinstance(response, dict) and 'results' in response:
                sources_list.extend(response['results'])
            else:
                sources_list.extend(response)
    else:
        raise ValueError("Input must be either a dict with 'results' or a list of search results")
    
    # Deduplicate by URL
    unique_sources = {}
    for source in sources_list:
        if source['url'] not in unique_sources:
            unique_sources[source['url']] = source
    
    # Format output
    formatted_text = "Sources:\n\n"
    for source in unique_sources.values():
        formatted_text += f"Source {source['title']}:\n===\n"
        formatted_text += f"URL: {source['url']}\n===\n"
        formatted_text += f"Most relevant content from source: {source['content']}\n===\n"
        if include_raw_content:
            char_limit = max_tokens_per_source * 4
            raw_content = source.get('raw_content', '')
            if raw_content is None:
                raw_content = ''
                print(f"Warning: No raw_content found for source {source['url']}")
            if len(raw_content) > char_limit:
                raw_content = raw_content[:char_limit] + "... [truncated]"
            formatted_text += f"Full source content limited to {max_tokens_per_source} tokens: {raw_content}\n\n"
                
    return formatted_text.strip()

def format_sources(search_results):
    """Format search results into a bullet-point list of sources."""
    return '\n'.join(
        f"* {source['title']} : {source['url']}"
        for source in search_results['results']
    )

@traceable
def tavily_search(query, include_raw_content=True, max_results=3):
    """Search the web using the Tavily API."""
    tavily_client = TavilyClient()
    return tavily_client.search(query, 
                         max_results=max_results, 
                         include_raw_content=include_raw_content)
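
`deduplicate_and_format_sources` relies on two simple rules: the first result seen for a URL wins, and `raw_content` is cut at roughly four characters per token. The sketch below isolates those two rules on a hand-written response dict. The sample data and the helper names `dedupe_by_url` and `truncate_to_token_budget` are made up for illustration, and the result shape (`results`, `url`, `title`, `raw_content`) is assumed from the function above rather than taken from the Tavily documentation.

```python
from typing import Any, Dict, List, Optional

CHARS_PER_TOKEN = 4  # same rough heuristic as the function above

def dedupe_by_url(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first result for each URL, preserving order."""
    unique: Dict[str, Dict[str, Any]] = {}
    for item in results:
        unique.setdefault(item["url"], item)
    return list(unique.values())

def truncate_to_token_budget(text: Optional[str], max_tokens: int) -> str:
    """Treat None as empty text and cut anything beyond the character budget."""
    text = text or ""
    limit = max_tokens * CHARS_PER_TOKEN
    return text[:limit] + "... [truncated]" if len(text) > limit else text

sample = {
    "results": [
        {"url": "https://example.com/a", "title": "A", "raw_content": "x" * 50},
        {"url": "https://example.com/a", "title": "A (duplicate)", "raw_content": None},
        {"url": "https://example.com/b", "title": "B", "raw_content": None},
    ]
}

unique = dedupe_by_url(sample["results"])
print([s["title"] for s in unique])  # ['A', 'B']
print(truncate_to_token_budget(unique[0]["raw_content"], max_tokens=5))
```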

# %% [markdown]
# ## 5. Agent Prompts
# Define the instruction prompts for our agent's components

# %%
query_writer_instructions = """Your goal is to generate a targeted web search query.

The query will gather information related to a specific topic.

Topic:
{research_topic}

Return your query as a JSON object:
{{
    "query": "string",
    "aspect": "string",
    "rationale": "string"
}}
"""

summarizer_instructions = """Your goal is to generate a high-quality summary of the web search results.

When EXTENDING an existing summary:
1. Seamlessly integrate new information without repeating what's already covered
2. Maintain consistency with the existing content's style and depth
3. Only add new, non-redundant information
4. Ensure smooth transitions between existing and new content

When creating a NEW summary:
1. Highlight the most relevant information from each source
2. Provide a concise overview of the key points related to the report topic
3. Emphasize significant findings or insights
4. Ensure a coherent flow of information

CRITICAL REQUIREMENTS:
- Start IMMEDIATELY with the summary content - no introductions or meta-commentary
- Focus ONLY on factual, objective information
- Maintain a consistent technical depth
- Avoid redundancy and repetition
- DO NOT use phrases like "based on the new results" or "according to additional sources"
- DO NOT add a References or Works Cited section
- Begin directly with the summary text
"""

reflection_instructions = """You are an expert research assistant analyzing a summary about {research_topic}.

Your tasks:
1. Identify knowledge gaps or areas that need deeper exploration
2. Generate a follow-up question that would help expand your understanding
3. Focus on technical details, implementation specifics, or emerging trends that weren't fully covered

Ensure the follow-up question is self-contained and includes necessary context for web search.

Return your analysis as a JSON object:
{{ 
    "knowledge_gap": "string",
    "follow_up_query": "string"
}}"""
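
The doubled braces in `query_writer_instructions` and `reflection_instructions` matter because both templates are passed through `str.format`: `{{` and `}}` come out as literal braces, while `{research_topic}` is substituted. A small sketch with a shortened, hypothetical template shows the effect:

```python
import json

template = """Topic:
{research_topic}

Return your query as a JSON object:
{{
    "query": "string"
}}
"""

prompt = template.format(research_topic="LangGraph agents")
print(prompt)

# After formatting, the JSON skeleton has single braces and parses as valid JSON.
skeleton = prompt[prompt.index("{"):]
print(json.loads(skeleton))  # {'query': 'string'}
```

A template with single braces around the JSON skeleton would instead make `str.format` treat the skeleton as a placeholder and raise an error.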

# %% [markdown]
# ## 6. Agent Nodes
# Define the core functions that make up our agent's workflow

# %%
def generate_query(state: SummaryState, config: RunnableConfig):
    """Generate a query for web search"""
    query_writer_instructions_formatted = query_writer_instructions.format(
        research_topic=state.research_topic
    )
    print("Research topic:")
    print(state.research_topic)

    configurable = Configuration.from_runnable_config(config)
    llm_json_mode = ChatOllama(model=configurable.local_llm, temperature=0, format="json")
    result = llm_json_mode.invoke(
        [SystemMessage(content=query_writer_instructions_formatted),
        HumanMessage(content=f"Given this research topic: {state.research_topic}, generate a query for web search, your output should contain a json with a query key:")]
    )
    print("result")
    print(result)   
    query = json.loads(result.content)
    
    print("Query created in generate_query: ", query)
    
    return {"search_query": query['query']}

def web_research(state: SummaryState):
    """Gather information from the web"""
    print("Current search query:")
    print(state.search_query)
    print(state.research_topic)
    search_results = tavily_search(state.search_query, include_raw_content=True, max_results=1)
    search_str = deduplicate_and_format_sources(search_results, max_tokens_per_source=1000)
    return {
        "sources_gathered": [format_sources(search_results)], 
        "research_loop_count": state.research_loop_count + 1, 
        "web_research_results": [search_str]
    }

def summarize_sources(state: SummaryState, config: RunnableConfig):
    """Summarize the gathered sources"""
    existing_summary = state.running_summary
    most_recent_web_research = state.web_research_results[-1]

    if existing_summary:
        human_message_content = (
            f"Extend the existing summary: {existing_summary}\n\n"
            f"Include new search results: {most_recent_web_research} "
            f"That addresses the following topic: {state.research_topic}"
        )
    else:
        human_message_content = (
            f"Generate a summary of these search results: {most_recent_web_research} "
            f"That addresses the following topic: {state.research_topic}"
        )

    configurable = Configuration.from_runnable_config(config)
    llm = ChatOllama(model=configurable.local_llm, temperature=0)
    result = llm.invoke(
        [SystemMessage(content=summarizer_instructions),
        HumanMessage(content=human_message_content)]
    )

    running_summary = result.content
    return {"running_summary": running_summary}

def reflect_on_summary(state: SummaryState, config: RunnableConfig):
    """Reflect on the summary and generate a follow-up query"""
    configurable = Configuration.from_runnable_config(config)
    llm_json_mode = ChatOllama(model=configurable.local_llm, temperature=0, format="json")
    result = llm_json_mode.invoke(
        [SystemMessage(content=reflection_instructions.format(research_topic=state.research_topic)),
        HumanMessage(content=f"Identify a knowledge gap and generate a follow-up web search query based on our existing knowledge: {state.running_summary}")]
    )   
    follow_up_query = json.loads(result.content)
    query = follow_up_query.get('follow_up_query')

    # Fall back to a generic query if the model omitted the key or left it empty
    if not query:
        return {"search_query": f"Tell me more about {state.research_topic}"}

    return {"search_query": query}

def finalize_summary(state: SummaryState):
    """Finalize the summary"""
    all_sources = "\n".join(source for source in state.sources_gathered)
    final_summary = f"## Summary\n\n{state.running_summary}\n\n### Sources:\n{all_sources}"
    return {"running_summary": final_summary}

def route_research(state: SummaryState, config: RunnableConfig) -> Literal["finalize_summary", "web_research"]:
    """Route the research based on the follow-up query"""
    configurable = Configuration.from_runnable_config(config)
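    # web_research has already incremented research_loop_count, so a strict comparison
    # caps the total number of searches at max_web_research_loops.
    if state.research_loop_count < configurable.max_web_research_loops: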
        return "web_research"
    else:
        return "finalize_summary"
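
To see how many searches a given `max_web_research_loops` allows, it helps to trace the counter by hand: `web_research` increments `research_loop_count` before `route_research` checks it. The sketch below simulates that loop without any LLM or search call. `count_searches` and its `inclusive` flag are hypothetical, with `inclusive=True` modelling a `<=` comparison in the router and `inclusive=False` a strict `<`.

```python
def count_searches(max_loops: int, inclusive: bool) -> int:
    """Count web_research runs before routing picks finalize_summary."""
    loop_count = 0
    searches = 0
    while True:
        # web_research: one search, then the counter is incremented
        searches += 1
        loop_count += 1
        # route_research: decide whether to search again
        keep_going = loop_count <= max_loops if inclusive else loop_count < max_loops
        if not keep_going:
            return searches

print(count_searches(3, inclusive=True))   # 4
print(count_searches(3, inclusive=False))  # 3
```

With an inclusive comparison the agent performs one more search than the configured maximum, which is worth keeping in mind when budgeting API calls.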

# %% [markdown]
# ## 7. Build and Run the Agent
# Now let's put it all together and create our research agent

# %%
# Create the graph
builder = StateGraph(SummaryState, 
                    input=SummaryStateInput, 
                    output=SummaryStateOutput, 
                    config_schema=Configuration)

# Add nodes
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("summarize_sources", summarize_sources)
builder.add_node("reflect_on_summary", reflect_on_summary)
builder.add_node("finalize_summary", finalize_summary)

# Add edges
builder.add_edge(START, "generate_query")
builder.add_edge("generate_query", "web_research")
builder.add_edge("web_research", "summarize_sources")
builder.add_edge("summarize_sources", "reflect_on_summary")
builder.add_conditional_edges("reflect_on_summary", route_research)
builder.add_edge("finalize_summary", END)

# Compile the graph
graph = builder.compile()

# %% [markdown]
# ## 8. Example Usage
# Here's how to use the research agent

# %%
# Example research topic
research_topic = "Write a quick report on the latest LLMs that came out in 2025."

# Run the research agent
config = {"configurable": {"max_web_research_loops": 3, "local_llm": "llama3.3"}}
result = graph.invoke(
    {"research_topic": research_topic},
    config=config
)

Research topic:
Write a quick report on the latest LLMs that came out in 2025.
result
content='{"query": "latest large language models 2025", "aspect": "technological advancements", "rationale": "To gather information on the newest LLMs released in 2025, focusing on their capabilities, applications, and potential impact on various industries."}' additional_kwargs={} response_metadata={'model': 'llama3.3', 'created_at': '2025-01-29T14:30:22.693042Z', 'done': True, 'done_reason': 'stop', 'total_duration': 24745997667, 'load_duration': 818026792, 'prompt_eval_count': 129, 'prompt_eval_duration': 16830000000, 'eval_count': 56, 'eval_duration': 6798000000, 'message': Message(role='assistant', content='', images=None, tool_calls=None)} id='run-7fd57ac3-fbea-4597-bcdb-b0265690350c-0' usage_metadata={'input_tokens': 129, 'output_tokens': 56, 'total_tokens': 185}
Query created in generate_query:  {'query': 'latest large language models 2025', 'aspect': 'technological advancements', 'rationale':

In [21]:
from IPython.display import Markdown

Markdown(result["running_summary"])

## Summary

<think>
Alright, I need to create a quick report summarizing the latest large language models (LLMs) as of 2025 based on the provided sources. The user has given specific instructions for summarization: starting immediately, focusing only on facts, maintaining technical depth without redundancy, and avoiding certain phrases.

First, I'll review the source content about LLMs from 2024 and a partial report from 2025. The key points include:

1. **Definition of LLMs**: Large language models are AI systems trained on vast datasets to understand and generate human language using transformer architectures.
2. **Architectures**: They use neural networks, specifically transformers, which allow for context-aware processing.
3. **Applications**: Beyond text generation, they're used in various industries like healthcare, finance, and customer service due to their ability to process large amounts of data quickly.
4. **Challenges**: Concerns about bias, inaccuracy, and toxicity limit broader adoption and raise ethical issues.
5. **Future Directions**: Approaches like self-training, fact-checking, and sparse expertise are being explored to mitigate these issues.

From the sources, I can gather that GPT-3 and its successor GPT-4 are highlighted as significant models. Additionally, BLOOM is mentioned but only partially due to truncation in the source content.

I need to structure this information into a concise report without using certain phrases. The report should be technical yet clear, focusing on facts and avoiding any unnecessary jargon that might not add value or could be misinterpreted.

I'll start by introducing LLMs, their architecture, applications, challenges, and then discuss the latest models like GPT-3 and GPT-4, along with emerging approaches to improve them. I should ensure each section flows logically into the next, providing a clear progression of ideas.

Finally, I'll conclude with a summary that encapsulates the key points without introducing new information.
</think>

**Summary of Large Language Models (LLMs) as of 2025**

Large language models are AI systems designed to generate and understand human-like text by analyzing vast datasets. Utilizing transformer architectures, these models process context-aware information, enabling tasks such as text generation and comprehension.

These models find applications across industries, leveraging their ability to handle large data efficiently for various purposes like customer service and healthcare. However, challenges related to bias, accuracy, and toxicity constrain broader adoption and raise ethical concerns.

Emerging approaches aim to enhance these models by incorporating self-training, fact-checking, and sparse expertise to address biases and improve reliability. Notable advancements include GPT-3 and its successor, GPT-4, which represent significant milestones in AI development, with ongoing research focused on refining their capabilities for real-world applications.

### Sources:
* The Future of Large Language Models in 2025 - AIMultiple : https://research.aimultiple.com/future-of-large-language-models/
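
The `<think>...</think>` block at the top of the summary is the model's reasoning trace, which reasoning models such as `deepseek-r1` can emit inline with the answer. One way to keep it out of `running_summary` would be to strip it from `result.content` before `summarize_sources` returns. A minimal sketch, assuming the trace is always wrapped in a literal `<think>` tag pair; `strip_think_blocks` is a hypothetical helper, not part of the notebook:

```python
import re

# Non-greedy match so that several separate blocks are each removed on their own.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def strip_think_blocks(text: str) -> str:
    """Remove <think>...</think> spans and any surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()

raw = "<think>\nPlan the report first.\n</think>\n\n**Summary**\n\nLLMs are ..."
cleaned = strip_think_blocks(raw)
print(cleaned)
```

An unterminated `<think>` tag (for example, from a truncated generation) would not match this pattern and would be left in place.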

In [20]:
result["sources_gathered"]

KeyError: 'sources_gathered'
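
The `KeyError` follows from the `output=SummaryStateOutput` argument passed to `StateGraph`: the dict returned by `graph.invoke` carries only the keys declared on the output model, and `sources_gathered` is not one of them. The sketch below imitates that filtering in plain Python (`project_to_output` is a hypothetical stand-in, not a LangGraph function). In the notebook itself, one option would be to add a `sources_gathered` field to `SummaryStateOutput`.

```python
from typing import Any, Dict, Iterable

def project_to_output(state: Dict[str, Any], output_keys: Iterable[str]) -> Dict[str, Any]:
    """Keep only the keys that the output schema declares."""
    allowed = set(output_keys)
    return {k: v for k, v in state.items() if k in allowed}

# Made-up final state standing in for the agent's internal SummaryState.
final_state = {
    "research_topic": "example topic",
    "running_summary": "## Summary ...",
    "sources_gathered": ["* Example : https://example.com"],
}

summary_only = project_to_output(final_state, ["running_summary"])
print("sources_gathered" in summary_only)  # False

with_sources = project_to_output(final_state, ["running_summary", "sources_gathered"])
print(with_sources["sources_gathered"])
```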