# Advanced Grounded Search Agent for Real-Time Research Assistance

## Overview and Purpose
This research workflow uses **two specialized AI agents** to produce research responses grounded in live web search results and, optionally, PDF documents. The system combines:

- **Real-time Google Search grounding** for current, accurate information
- **Multi-document PDF processing** (support for multiple files or none)
- **Structured JSON output** with detailed research components
- **Two-agent architecture** that separates search from synthesis, so tool use and structured output run in separate model calls

## Architecture
The workflow uses a **dual-agent design** to avoid compatibility issues between tool usage and structured output in a single model call:

1. **Search Agent**: Uses Google Search grounding tools to gather current information from the web
2. **Synthesis Agent**: Processes search results + PDF content to create structured, comprehensive responses

Because each call uses only one of the two features, search tools and structured output never have to be enabled together.

## Use Cases
Intended for:
- Academic researchers needing current literature reviews
- Business analysts requiring market research with document analysis
- Students combining course materials with current developments
- Developers researching latest frameworks and best practices

## Import Required Libraries
Dependencies for the two agents: the LangChain Gemini chat model, the Google Search tool type, Pydantic for the output schema, and `base64` for PDF encoding.

In [None]:
import base64
from langchain_google_genai import ChatGoogleGenerativeAI
from google.ai.generativelanguage_v1beta.types import Tool as GenAITool
from pydantic import BaseModel, Field
from langchain_core.messages import HumanMessage
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from typing import List

## Configuration Setup
Before running the agents, ensure you have:

1. **Google Gemini API Key**: Get from [Google AI Studio](https://aistudio.google.com/)
2. **PDF Files**: Place PDF files in the same directory or provide full paths
3. **Internet Connection**: Required for Google Search grounding
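
The cells further down hard-code the key as a placeholder string. A minimal sketch of reading it from the environment instead, so the key stays out of the notebook; `resolve_api_key` is a hypothetical helper, and the variable name `GOOGLE_API_KEY` is an assumption you can change:

```python
import os
from getpass import getpass


def resolve_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, prompting only if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Interactive fallback for notebook sessions; the prompt text is illustrative.
        key = getpass(f"Enter value for {env_var}: ").strip()
    if not key:
        raise ValueError(f"{env_var} is empty")
    return key
```

The returned value could then be passed as `google_api_key=resolve_api_key()` when constructing the models.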

## Enhanced Structured Output Models
Defines Pydantic models for organizing research output:

### Component Models
`SearchInsight` records one web finding with its source URL and relevance, `FileInsight` summarizes a single PDF, and `ResearchAnswer` splits the answer into a summary, detailed analysis, key findings, and a conclusion.

### ResearchResponse Model
The top-level output structure includes:
- **Query & Answer**: `query` holds the original query and `answer` holds the `ResearchAnswer` breakdown
- **Search Insights**: `search_insights` lists findings from web research, and `search_sources` lists all source URLs
- **File Insights**: `file_insights` holds the analysis of each provided PDF document
- **Synthesis**: `synthesis` integrates the search results with the file content

This ensures responses are well-organized, traceable, and machine-readable for downstream processing.

In [2]:
# Enhanced models for better structured output

class SearchInsight(BaseModel):
    """Individual search finding with source attribution"""
    finding: str = Field(description="A key finding or insight from search")
    source_url: str = Field(description="The URL where this information was found")
    relevance: str = Field(description="Brief explanation of why this is relevant")

class FileInsight(BaseModel):
    """Insights extracted from a single PDF file"""
    filename: str = Field(description="Name of the PDF file")
    key_points: List[str] = Field(description="Main points extracted from this file")
    relevance_to_query: str = Field(description="How this file relates to the research query")
    summary: str = Field(description="Brief summary of the file's content")

class ResearchAnswer(BaseModel):
    """Comprehensive answer structure"""
    summary: str = Field(description="Concise overview of findings")
    detailed_analysis: str = Field(description="In-depth analysis and explanation")
    key_findings: List[str] = Field(description="Bulleted list of main discoveries")
    conclusion: str = Field(description="Final synthesis and recommendations")

class ResearchResponse(BaseModel):
    """Complete structured research response"""
    query: str = Field(description="The original research query")
    answer: ResearchAnswer = Field(description="Comprehensive answer structure")
    search_insights: List[SearchInsight] = Field(description="Detailed findings from web search")
    search_sources: List[str] = Field(description="All source URLs found during search")
    file_insights: List[FileInsight] = Field(description="Analysis from all provided PDF files")
    synthesis: str = Field(description="Integration of search results and file content")

## Initialize LLM with Structured Output
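Initializes two Gemini 2.5 Flash clients with streaming callbacks: `search_llm` for the tool-enabled search agent and `synthesis_llm` for the synthesis agent. Only the synthesis client is wrapped with `with_structured_output`, so that it returns responses matching the `ResearchResponse` model. Replace the `google_api_key` placeholder with a valid API key.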

In [None]:
# LLM for search agent (with tools)
search_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    callbacks=[StreamingStdOutCallbackHandler()],
    google_api_key="your-api-key"  # Replace with your actual API key
)

# LLM for synthesis agent (structured output)
synthesis_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    callbacks=[StreamingStdOutCallbackHandler()],
    google_api_key="your-api-key"  # Replace with your actual API key
)

# Configure structured output for the synthesis agent only
structured_synthesis_llm = synthesis_llm.with_structured_output(ResearchResponse, method="json_mode")

## PDF Processing Utilities
Utilities for handling PDF files as base64-encoded content for multimodal processing:

- **Single PDF**: `load_pdf_as_base64` converts one PDF file to a base64 string
- **Multiple PDFs**: The synthesis message builder calls it once for each path in a list
- **Optional PDFs**: When no paths are given, no file parts are attached and the workflow runs on search results alone

Together these cover workflows with zero, one, or several documents.

In [4]:
def load_pdf_as_base64(pdf_path: str) -> str:
    """Load a single PDF file and encode as base64"""
    with open(pdf_path, 'rb') as file:
        pdf_bytes = file.read()
    return base64.b64encode(pdf_bytes).decode('utf-8')
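
The function above raises if a path does not exist. One possible way to load a whole list while tolerating missing files is sketched below. `load_pdfs_as_base64` is a hypothetical helper, not part of the original notebook, and skipping missing files with a printed warning is an assumption about the desired behavior:

```python
import base64
from pathlib import Path
from typing import Dict, List, Optional


def load_pdfs_as_base64(pdf_paths: Optional[List[str]] = None) -> Dict[str, str]:
    """Encode each existing PDF as base64, keyed by the path as given.

    Returns an empty dict when no paths are supplied, so callers can treat
    "no documents" and "some documents" the same way.
    """
    encoded: Dict[str, str] = {}
    for raw_path in pdf_paths or []:
        path = Path(raw_path)
        if not path.is_file():
            # Assumption: skip missing files with a warning instead of aborting.
            print(f"Skipping missing file: {raw_path}")
            continue
        encoded[raw_path] = base64.b64encode(path.read_bytes()).decode("utf-8")
    return encoded
```

Keying by the original path string keeps two files with the same name in different directories from overwriting each other.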

## Search Agent Functions
Defines the first agent, which uses Google Search grounding to gather information about the query. It returns free-form text and has no structured output constraints. The two cells that follow it build the multimodal message for the second agent and run the synthesis step.

In [5]:
def run_search_agent(query: str) -> str:
    """
    First agent: Performs internet search using Google Search grounding
    Returns raw search results and findings as text
    """
    search_message = HumanMessage(content=f"""
    Research the following query thoroughly using internet search:
    {query}
    
    Please provide:
    1. Key findings and recent information
    2. Important sources and URLs
    3. Relevant facts and data
    4. Any recent developments or trends
    
    Focus on comprehensive information gathering from reliable sources.
    """)
    
    # Use tools for grounding with Google Search
    response = search_llm.invoke(
        [search_message],
        tools=[GenAITool(google_search={})]
    )
    
    return response.content
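
The return annotation above is `str`, but LangChain message content can also be a list of strings and dicts. A minimal sketch of flattening either shape into text before it is interpolated into the synthesis prompt; `content_to_text` is a hypothetical helper, and it assumes dict parts carry their text under a `"text"` key:

```python
from typing import Any, List, Union


def content_to_text(content: Union[str, List[Any]]) -> str:
    """Flatten message content into a single string."""
    if isinstance(content, str):
        return content
    parts: List[str] = []
    for item in content:
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, dict) and isinstance(item.get("text"), str):
            parts.append(item["text"])
        # Parts without text (for example images) are ignored in this sketch.
    return "\n".join(parts)
```

With this in place, the search agent could end with `return content_to_text(response.content)`.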

In [6]:
def create_synthesis_message(query: str, search_results: str, pdf_paths: List[str] = None) -> HumanMessage:
    """
    Creates a message for the synthesis agent that combines search results with PDF content
    """
    content = [{
        "type": "text", 
        "text": f"""
        Based on the search results below and the provided PDFs (if any), create a comprehensive structured response for this query: {query}

        SEARCH RESULTS:
        {search_results}

        Please synthesize this information with any insights from the PDFs to provide a complete research response.
        Return your response as JSON with these top-level fields: query, answer (with summary, detailed_analysis, key_findings, conclusion), search_insights, search_sources, file_insights, and synthesis.
        """
    }]

    if pdf_paths:
        for pdf_path in pdf_paths:
            pdf_base64 = load_pdf_as_base64(pdf_path)
            content.append({
                "type": "file",
                "source_type": "base64",
                "mime_type": "application/pdf",
                "data": pdf_base64
            })

    return HumanMessage(content=content)
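
Base64 encoding grows each file by about a third, which matters when several PDFs are attached inline. A minimal sketch for estimating the encoded size before building the message follows. `base64_length` and `fits_inline` are hypothetical helpers, and `max_chars` is a placeholder for whatever request limit applies to your API, not a documented value:

```python
from typing import Iterable


def base64_length(num_bytes: int) -> int:
    """Number of characters in the padded base64 encoding of num_bytes bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((num_bytes + 2) // 3)


def fits_inline(file_sizes: Iterable[int], max_chars: int) -> bool:
    """Check whether the encoded files together stay within max_chars."""
    return sum(base64_length(size) for size in file_sizes) <= max_chars
```

File sizes in bytes can be obtained with `os.path.getsize(path)` before any file is read into memory.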

In [7]:
def run_synthesis_agent(query: str, search_results: str, pdf_paths: List[str] = None) -> ResearchResponse:
    """
    Second agent: Synthesizes search results with PDF content using structured output
    Accepts an optional list of PDF paths and returns a structured ResearchResponse
    """
    synthesis_message = create_synthesis_message(query, search_results, pdf_paths)
    
    # Use structured output (no tools needed)
    response = structured_synthesis_llm.invoke([synthesis_message])
    
    return response

## Main Research Agent Orchestrator
Orchestrates the two-agent workflow: first calls the search agent to gather information from the internet using Google Search grounding, then calls the synthesis agent to combine search results with PDF insights and produce structured output.

In [8]:
def run_research_agent(query: str, pdf_paths: List[str] = None) -> ResearchResponse:
    """
    Main orchestrator function that coordinates the two-agent workflow
    Accepts an optional list of PDF paths and returns a structured ResearchResponse
    """
    print("Step 1: Running search agent to gather information from the internet...")
    search_results = run_search_agent(query)
    
    print("\nStep 2: Running synthesis agent to combine search results with PDF content...")
    final_response = run_synthesis_agent(query, search_results, pdf_paths)
    
    return final_response
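
The same search-then-synthesize sequence can be written with the two agents passed in as callables, which makes it possible to exercise the control flow without any API calls. This is a sketch under stated assumptions: `run_pipeline` and the stand-in functions are hypothetical, and treating an empty search result as an error is a design choice rather than the notebook's behavior:

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def run_pipeline(
    query: str,
    search_fn: Callable[[str], str],
    synthesize_fn: Callable[[str, str, Optional[List[str]]], T],
    pdf_paths: Optional[List[str]] = None,
) -> T:
    """Run search, then synthesis, failing early if the search returns no text."""
    search_results = search_fn(query)
    if not search_results.strip():
        # Assumption: stop here instead of synthesizing from PDFs alone.
        raise RuntimeError("search step returned no text")
    return synthesize_fn(query, search_results, pdf_paths)


# Offline usage with stand-in functions (no API calls are made).
def fake_search(query: str) -> str:
    return f"findings for: {query}"


def fake_synthesis(query: str, results: str, pdfs: Optional[List[str]]) -> dict:
    return {"query": query, "chars": len(results), "pdfs": pdfs or []}


demo = run_pipeline("model compression", fake_search, fake_synthesis)
```

With the real agents, the call would be `run_pipeline(query, run_search_agent, run_synthesis_agent, pdf_paths)`.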

## Execute Two-Agent Workflow
Runs the complete two-agent workflow: first agent searches the internet for information using Google Search grounding, then the second agent synthesizes the search results with PDF content to produce a structured response. This approach separates tool usage from structured output to avoid compatibility issues.

In [None]:
if __name__ == "__main__":
    query = "Summarize recent advancements in model compression for LLMs from 2024 and 2025"
    pdf_paths = ["sample.pdf"]  # Replace with paths to sample PDFs
    result = run_research_agent(query, pdf_paths)
    print("\nFinal Research Response:")
    print(result)

Step 1: Running search agent to gather information from the internet...

Step 2: Running synthesis agent to combine search results with PDF content...

Final Research Response:
query='Summarize recent advancements in model compression for LLMs from 2024 and 2025' answer=ResearchAnswer(summary='Recent advancements in Large Language Model (LLM) compression (2024-2025) emphasize efficiency and practical deployment over sheer model size. Key techniques like quantization, pruning, and knowledge distillation continue to evolve, with a strong focus on post-training methods and hardware-aware optimization. A significant development is the rise of Mixture-of-Experts (MoE) architectures, which enable massive models with efficient inference by activating only a subset of parameters. Knowledge distillation is being leveraged to transfer complex reasoning and emergent abilities to smaller models. These efforts aim to reduce the high computational costs and memory requirements of LLMs, making them m

In [10]:
# Utility functions for enhanced workflow

def display_research_summary(result: ResearchResponse) -> None:
    """Pretty print research results summary"""
    print("RESEARCH SUMMARY")
    print("=" * 50)
    print(f"Query: {result.query}")
    print(f"Summary: {result.answer.summary}")
    print(f"\nKey Findings ({len(result.answer.key_findings)}):")
    for i, finding in enumerate(result.answer.key_findings, 1):
        print(f"   {i}. {finding}")
    
    print(f"\nSearch Sources ({len(result.search_sources)}):")
    for i, source in enumerate(result.search_sources[:3], 1):  # Show first 3
        print(f"   {i}. {source}")
    if len(result.search_sources) > 3:
        print(f"   ... and {len(result.search_sources) - 3} more sources")
    
    print(f"\nFiles Analyzed ({len(result.file_insights)}):")
    for insight in result.file_insights:
        print(f"   {insight.filename}: {insight.summary[:300]}...")
    
    print(f"\nSynthesis: {result.synthesis[:500]}...")
    print("=" * 50)

display_research_summary(result)

RESEARCH SUMMARY
Query: Summarize recent advancements in model compression for LLMs from 2024 and 2025
Summary: Recent advancements in Large Language Model (LLM) compression (2024-2025) emphasize efficiency and practical deployment over sheer model size. Key techniques like quantization, pruning, and knowledge distillation continue to evolve, with a strong focus on post-training methods and hardware-aware optimization. A significant development is the rise of Mixture-of-Experts (MoE) architectures, which enable massive models with efficient inference by activating only a subset of parameters. Knowledge distillation is being leveraged to transfer complex reasoning and emergent abilities to smaller models. These efforts aim to reduce the high computational costs and memory requirements of LLMs, making them more accessible, faster, and more cost-effective for real-world applications. Open-source models and multi-model deployment strategies are further democratizing LLM utilization.

Key F