### Web Search Agent 3

## Validation and Reflection Agent (LLM feedback loop)

This agent:
- First performs a DuckDuckGo web search
- Evaluates the results with an LLM (Groq)
- Decides whether to accept the results as they are, rewrite the query and search again, or answer directly using the LLM

In [54]:
import os
from typing import List, Optional

from ddgs import DDGS
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq
from pydantic import BaseModel

# Load variables from a local .env file into the environment.
load_dotenv()
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

# ChatGroq picks up GROQ_API_KEY from the environment, so it is not passed explicitly.
llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    temperature=0.7
)

Defining the `pydantic` data model used to validate each search result:

In [55]:
class SearchResult(BaseModel):
    title: str
    url: str
    snippet: Optional[str] = None  # explicit default, so the field can be omitted

Defining the duckduckgo search function:

In [56]:
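def ddg_search(query: str, max_results: int = 5) -> List[SearchResult]:
    """
    Search DuckDuckGo for `query` and return up to `max_results` results.

    If the search fails or returns nothing, a single placeholder
    SearchResult with an empty URL is returned instead of raising.
    """
    try:
        with DDGS() as ddgs:
            results = ddgs.text(query, region="wt-wt", safesearch="moderate", max_results=max_results)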
            lines = []
            for r in results:
                lines.append(SearchResult(
                    title=r["title"],
                    url=r["href"],
                    snippet=r.get("body", "")
                ))
            if not lines:
                lines.append(SearchResult(
                    title="No results found",
                    url="",
                    snippet="No relevant information available."
                ))
            return lines
    except Exception as e:
        return [SearchResult(
            title="Error",
            url="",
            snippet=f"An error occurred while searching: {str(e)}"
        )]
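
Because `ddg_search` reports both "no results" and errors as a placeholder entry rather than raising, a caller may want to tell placeholders apart from genuine hits before spending an LLM call on them. The sketch below is one possible check, not part of the original notebook: `Result` is a hypothetical dataclass standing in for `SearchResult` so the snippet runs without `pydantic`, and `has_real_results` is a made-up helper name. It relies only on the fact that both placeholders above carry an empty `url`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical stand-in for SearchResult (same three fields)."""
    title: str
    url: str
    snippet: Optional[str] = None


def has_real_results(results: List[Result]) -> bool:
    """Return True if at least one entry carries a non-empty URL.

    Both placeholder entries produced by ddg_search ("No results found"
    and "Error") use url="", so a non-empty URL marks a genuine hit.
    """
    return any(r.url for r in results)


placeholder = [Result(title="Error", url="", snippet="An error occurred while searching: timeout")]
real = [Result(title="Example", url="https://example.com", snippet="Some text")]

print(has_real_results(placeholder))
print(has_real_results(real))
```

A check like this could sit between the search and the reflection step so that the agent skips the LLM call, or retries, when the search itself failed.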

### Making the PromptTemplate for the Reflective Chain
The prompt asks the LLM to evaluate the search results for relevance and completeness, and then to do one of three things: accept the results, suggest a rewritten query, or provide a direct answer.

In [57]:
REFLECTION_PROMPT = PromptTemplate(
    input_variables=["query", "search_results"],
    template=(
        "You are an expert AI assistant. A user asked: {query}\n"
        "Here are the web search results:\n{search_results}\n"
        "If these results adequately answer the question, reply:\n"
        "ACCEPT\n"
        "If they do not, provide only the rewritten concise technical search query after the word 'REWRITE:' "
        "without any additional explanation or punctuation.\n"
        "If you can answer confidently without further search, reply:\n"
        "ANSWER: Your detailed answer here.\n"
        "Do not provide any additional text or explanation."
    )
)
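
To see what the model actually receives, the template can be filled in by hand. The sketch below copies the template text into a plain string and uses `str.format`, which handles the same `{query}` and `{search_results}` placeholders; it does not import LangChain, and the sample query and result text are made up for illustration.

```python
# Plain-string copy of the template text from REFLECTION_PROMPT.
TEMPLATE = (
    "You are an expert AI assistant. A user asked: {query}\n"
    "Here are the web search results:\n{search_results}\n"
    "If these results adequately answer the question, reply:\n"
    "ACCEPT\n"
    "If they do not, provide only the rewritten concise technical search query after the word 'REWRITE:' "
    "without any additional explanation or punctuation.\n"
    "If you can answer confidently without further search, reply:\n"
    "ANSWER: Your detailed answer here.\n"
    "Do not provide any additional text or explanation."
)

# Made-up inputs, only to show the rendered prompt.
rendered = TEMPLATE.format(
    query="what is BM25",
    search_results="1. Title:Intro to BM25\nURL:https://example.com/bm25\nSnippet:BM25 is a ranking function.\n",
)
print(rendered)
```

Reading the rendered text makes the contract explicit: the reply is expected to be exactly `ACCEPT`, or to contain `REWRITE:` or `ANSWER:` followed by the payload.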

Initialising the reflection chain by piping the prompt into the LLM:

In [58]:
prompt = REFLECTION_PROMPT
llm_chain = prompt | llm

Writing a helper function to format the search results as text for LLM input:

In [59]:
def format_search_results(results: List[SearchResult]) -> str:
    lines = []
    for i, r in enumerate(results, start=1):
        if r.snippet:
            snippet = r.snippet
        else:
            snippet = "No snippet available."
        if r.url:
            url = r.url
        else:
            url = "No URL available."
        lines.append(f"{i}. Title:{r.title}\nURL:{url}\nSnippet:{snippet}\n") 
    return "\n".join(lines)
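
A quick standalone run shows the text layout the LLM sees, including the fallbacks for a missing URL or snippet. This is a sketch under assumptions: `Result` is a hypothetical `NamedTuple` standing in for `SearchResult`, `format_results` is a condensed copy of the helper above (using `or` for the fallbacks), and the sample entries are made up.

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    """Hypothetical stand-in for SearchResult."""
    title: str
    url: str
    snippet: Optional[str] = None


def format_results(results: List[Result]) -> str:
    """Condensed copy of format_search_results with the same output format."""
    lines = []
    for i, r in enumerate(results, start=1):
        # Empty or missing values fall back to a readable placeholder.
        snippet = r.snippet or "No snippet available."
        url = r.url or "No URL available."
        lines.append(f"{i}. Title:{r.title}\nURL:{url}\nSnippet:{snippet}\n")
    return "\n".join(lines)


sample = [
    Result("Intro to BM25", "https://example.com/bm25", "BM25 is a ranking function."),
    Result("No results found", "", None),
]
text = format_results(sample)
print(text)
```

Each entry ends with a newline and entries are joined with another newline, so the results appear as numbered blocks separated by a blank line.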

Defining the main agent function. The first cell is the bare search-and-reflect loop; the second cell redefines `agent_function` so that it also records each step the agent took and prints that trace before returning.

In [60]:
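def agent_function(user_query: str, max_search_results: int = 5, max_reflections: int = 2) -> str:
    """Search, let the LLM reflect on the results, and act on its verdict.

    Runs at most `max_reflections` search-and-reflect rounds.
    """
    current_query = user_query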
    for attempt in range(max_reflections):
        results = ddg_search(query=current_query, max_results=max_search_results)
        results_f = format_search_results(results)
        message = llm_chain.invoke({"query": current_query, "search_results": results_f})
        reflection_response = message.content.strip()
        
        # Accept logic
        if reflection_response == "ACCEPT":
            summary_lines = [f"{r.title} - {r.url}" for r in results]
            return "Search results accepted:\n" + "\n".join(summary_lines)
        
        # Try to extract REWRITE from anywhere in the text
        if "REWRITE:" in reflection_response:
            # Take everything after 'REWRITE:' (if there are multiple, pick the last one)
            rewritten_parts = reflection_response.split("REWRITE:")
            new_query = rewritten_parts[-1].strip()
            current_query = new_query
            continue
        
        # Try to extract ANSWER from anywhere in the text
        if "ANSWER:" in reflection_response:
            answer_parts = reflection_response.split("ANSWER:")
            answer = answer_parts[-1].strip()
            return answer
        
        # Otherwise, fallback
        return f"Could not interpret reflection response. Here's the raw output:\n{reflection_response}"
    return "Reflection attempts exhausted. Latest results:\n" + format_search_results(results)
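
The branching above can also be pulled out into a small parser that turns the raw reply into an `(action, payload)` pair, which makes the decision logic easy to test without calling the LLM. The sketch below is one possible shape, not the notebook's own code: `parse_reflection` is a hypothetical name, and tolerating replies such as `accept.` is an added assumption about how a model might deviate from the exact `ACCEPT` string. The marker precedence (REWRITE before ANSWER, text after the last marker) follows the agent code.

```python
from typing import Tuple


def parse_reflection(text: str) -> Tuple[str, str]:
    """Map a raw reflection reply to (action, payload).

    action is one of "accept", "rewrite", "answer", or "unknown".
    """
    cleaned = text.strip()
    # Assumption: accept minor deviations such as "Accept." or "ACCEPT!".
    if cleaned.rstrip(".!").upper() == "ACCEPT":
        return ("accept", "")
    # Same precedence as the agent: REWRITE is checked before ANSWER,
    # and the text after the last occurrence of the marker is used.
    for marker, action in (("REWRITE:", "rewrite"), ("ANSWER:", "answer")):
        if marker in cleaned:
            return (action, cleaned.split(marker)[-1].strip())
    # Anything else is passed back unchanged for the fallback branch.
    return ("unknown", cleaned)


print(parse_reflection("ACCEPT"))
print(parse_reflection("REWRITE: popular summer tourist destinations in India"))
print(parse_reflection("ANSWER: Dense retrieval uses learned embeddings."))
print(parse_reflection("I am not sure."))
```

With a parser like this, the loop body reduces to a dispatch on `action`, and each branch can be unit tested with plain strings.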


In [61]:
def agent_function(user_query: str, max_search_results: int = 5, max_reflections: int = 2) -> str:
    """Same loop as above, but records each step and prints the trace before returning."""
    current_query = user_query
    steps = []
    for attempt in range(max_reflections):
        steps.append(f"Step {attempt+1}: Searching with query: '{current_query}'")
        results = ddg_search(query=current_query, max_results=max_search_results)
        results_f = format_search_results(results)
        
        message = llm_chain.invoke({"query": current_query, "search_results": results_f})
        reflection_response = message.content.strip()
        steps.append(f"Reflection Response: {reflection_response}")

        if reflection_response == "ACCEPT":
            steps.append("Action: Accepting current results")
            print("\n".join(steps))
            summary_lines = [f"{r.title} - {r.url}" for r in results]
            return "Search results accepted:\n" + "\n".join(summary_lines)
        
        if "REWRITE:" in reflection_response:
            rewritten_parts = reflection_response.split("REWRITE:")
            new_query = rewritten_parts[-1].strip()
            steps.append(f"Action: Rewriting query to: '{new_query}'")
            current_query = new_query
            continue
        
        if "ANSWER:" in reflection_response:
            answer_parts = reflection_response.split("ANSWER:")
            answer = answer_parts[-1].strip()
            steps.append("Action: Returning direct answer from LLM")
            print("\n".join(steps))
            return answer
        
        steps.append("Action: Could not interpret response, fallback")
        print("\n".join(steps))
        return f"Could not interpret reflection response. Here's the raw output:\n{reflection_response}"
    
    steps.append("Reflection attempts exhausted.")
    print("\n".join(steps))
    return "Reflection attempts exhausted. Latest results:\n" + format_search_results(results)
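
Since `agent_function` calls DuckDuckGo and Groq directly, it cannot be exercised offline. One way to check the control flow on its own is to pass the search and reflection steps in as callables and script their replies. The sketch below assumes that design: `run_reflection_loop`, `fake_search`, and `fake_reflect` are hypothetical names, the stubs return made-up text, and on ACCEPT the sketch returns the formatted results rather than the `title - url` summary used above.

```python
from typing import Callable, List


def run_reflection_loop(
    user_query: str,
    search: Callable[[str], str],
    reflect: Callable[[str, str], str],
    max_reflections: int = 2,
) -> str:
    """Same control flow as agent_function, with search and LLM injected."""
    current_query = user_query
    formatted = ""
    for _ in range(max_reflections):
        formatted = search(current_query)
        response = reflect(current_query, formatted).strip()
        if response == "ACCEPT":
            return "Search results accepted:\n" + formatted
        if "REWRITE:" in response:
            # Search again with the rewritten query on the next round.
            current_query = response.split("REWRITE:")[-1].strip()
            continue
        if "ANSWER:" in response:
            return response.split("ANSWER:")[-1].strip()
        return "Could not interpret reflection response. Here's the raw output:\n" + response
    return "Reflection attempts exhausted. Latest results:\n" + formatted


# Hypothetical stubs: record the queries and script the "LLM" replies.
seen_queries: List[str] = []
scripted_replies = iter(["REWRITE: summer destinations India", "ACCEPT"])


def fake_search(query: str) -> str:
    seen_queries.append(query)
    return f"1. Title:Result for {query}"


def fake_reflect(query: str, results: str) -> str:
    return next(scripted_replies)


output = run_reflection_loop("Where to go in India in summer?", fake_search, fake_reflect)
print(seen_queries)
print(output)
```

The scripted run mirrors the first test below: one rewrite followed by an accept, so the search stub is called twice, first with the original question and then with the rewritten query.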


## Testing the Agent

In [65]:
test_query = "What are popular tourist destinations in India during summer?"
output = agent_function(test_query)
print(output)

Step 1: Searching with query: 'What are popular tourist destinations in India during summer?'
Reflection Response: REWRITE: popular summer tourist destinations in India
Action: Rewriting query to: 'popular summer tourist destinations in India'
Step 2: Searching with query: 'popular summer tourist destinations in India'
Reflection Response: ACCEPT
Action: Accepting current results
Search results accepted:
What are some family-friendly summer tourist destinations ... | Medium - https://medium.com/@raindropsresort/what-are-some-family-friendly-summer-tourist-destinations-in-south-india-1b5f14bf5b70
Exploring the Best Tourist Destinations in India in... - NYC 360 News - https://www.nyc360news.com/exploring-the-best-tourist-destinations-in-india-in-june-and-july
Top 10 places to visit in June in India . [Video] | Top places to travel... - https://in.pinterest.com/pin/top-10-places-to-visit-in-june-in-india-video--338121884554460963/
Best Places To Visit In May In India - https://www.travela

In [64]:
test_query = "Compare sparse vs dense retrieval methods for document search in modern NLP systems."
output = agent_function(test_query)
print(output)

Step 1: Searching with query: 'Compare sparse vs dense retrieval methods for document search in modern NLP systems.'
Reflection Response: ANSWER: In modern NLP systems, sparse and dense retrieval methods are two prominent approaches for document search. Sparse retrieval methods typically represent documents and queries using high-dimensional, sparse vectors, often based on term-frequency inverse-document-frequency (TF-IDF) weights. This approach excels in traditional information retrieval tasks, especially when dealing with large document collections, as it can efficiently filter out irrelevant documents.

On the other hand, dense retrieval methods utilize low-dimensional, dense vectors, usually learned through neural network-based embedding models. These models capture semantic relationships between words and documents, enabling more accurate matching of queries and documents based on their meanings. Dense embeddings have shown impressive performance in various NLP tasks, including do

In [66]:
query = "What are the top Bollywood movies released in 2024?"
output = agent_function(query)
print(output)

Step 1: Searching with query: 'What are the top Bollywood movies released in 2024?'
Reflection Response: REWRITE: top bollywood movies 2024 release date and name
Action: Rewriting query to: 'top bollywood movies 2024 release date and name'
Step 2: Searching with query: 'top bollywood movies 2024 release date and name'
Reflection Response: ANSWER: According to the search results, some of the top Bollywood movies of 2024 include Amar Singh Chamkila, and over 30 big-budget movies are scheduled for release in 2024. However, the exact names and release dates of all the movies are not provided in the search results. Some of the upcoming movies mentioned include Dunki, Sam Bahadur, Animal, and Salaar, but these are scheduled for release in December 2023, not 2024. The search results suggest that 2024 will be a blockbuster year for Bollywood with a range of genres, including thrillers and comedies, but do not provide a comprehensive list of the top Bollywood movies of 2024.
Action: Returning d