# Task 1: Implement any 3 web search agents with error handling and fallbacks

In [1]:
import dotenv
dotenv.load_dotenv()

True

In [2]:
# Importing Libraries

from tavily import TavilyClient
import requests
import os
from bs4 import BeautifulSoup
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

## Web Search Agents

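I used three web search agents: Tavily, Serper, and Brave Search. Each function returns a list of results whose entries include `url`, `title`, and `content` keys, or the string `"ERROR"` on failure.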

### Agent 1: Tavily

In [3]:
# Tavily Search Function

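def Tavily(query):
    try:
        tavily_client = TavilyClient()  # reads TAVILY_API_KEY from the environment
        # max_results is the Tavily SDK's parameter for capping the number of results
        response = tavily_client.search(query, max_results=5)['results']
        return response

    except Exception as e:
        # print("Tavily failed with Error:", e)
        return "ERROR"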
    
print(Tavily("What is RAG?"))

[{'url': 'https://www.ibm.com/think/topics/retrieval-augmented-generation', 'title': 'What is RAG (Retrieval Augmented Generation)?', 'content': 'Retrieval augmented generation (RAG) is an architecture for optimizing the performance of an artificial intelligence (AI) model by connecting it with external knowledge bases. RAG allows generative AI models to access additional external knowledge bases, such as internal organizational data, scholarly journals and specialized datasets. RAG systems connect models with supplemental external data in real-time and incorporate up-to-date information into generated responses. By combining generative AI with retrieval systems, RAG models can retrieve and integrate information from multiple data sources in response to complex queries. RAG AI systems plug models into internal data to equip customer support chatbots with the latest knowledge about a company’s products, services and policies. * The **generator**: A generative AI model that creates an ou

In [4]:
# Deliberately faulty variant (invalid API key) to exercise the error path

def Tavily_Faulty(query):
    try:
        tavily_client = TavilyClient(api_key="happyindependenceday")
        response = tavily_client.search(query, max_results=5)['results']
        return response

    except Exception as e:
        # print(f"Tavily failed with Error: {e}")
        return "ERROR"
    
print(Tavily_Faulty("What is RAG?"))

ERROR
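
Both Tavily functions above swallow the exception and return a bare `"ERROR"`, so the reason for the failure is lost unless the commented-out `print` is restored. One possible way to keep the same return contract while still recording the cause is a small decorator. This is only a sketch: `safe_search` and `flaky_provider` are hypothetical names that do not appear in the notebook.

```python
import functools
import logging

logger = logging.getLogger("websearch")

def safe_search(func):
    """Wrap a search function so any exception is logged and turned into "ERROR"."""
    @functools.wraps(func)
    def wrapper(query):
        try:
            results = func(query)
        except Exception as exc:
            # Keep the reason for the failure instead of discarding it.
            logger.warning("%s failed: %s", func.__name__, exc)
            return "ERROR"
        # Treat an empty result list as a failure too.
        return results if results else "ERROR"
    return wrapper

@safe_search
def flaky_provider(query):
    # Hypothetical stand-in for a real API call made with an invalid key.
    raise PermissionError("invalid API key")

print(flaky_provider("What is RAG?"))  # ERROR
```

With this shape, each provider function would only contain the API call itself, and the error handling would live in one place.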


### Agent 2: Serper

In [5]:
# Serper Search Function

def Serper(query):
    api_key = os.getenv("SERPER_API_KEY")
    url = "https://google.serper.dev/search"

    headers = {
        "X-API-KEY": api_key,
        "Content-Type": "application/json"
    }

    payload = {
        "q": query,
        "num": 5  # Serper's result-count parameter is "num"
    }

    try:
        response = requests.post(url, headers=headers, json=payload, timeout=10)
        response.raise_for_status()
        results = response.json()
        
        output = []
        for item in results.get("organic", [])[:3]:
            output.append({
                "url": item.get("link"),
                "title": item.get("title"),
                "content": item.get("snippet")
            })
        if output:
            return output
        else:
            # print("Serper failed with Error : No Results Found")
            return "ERROR"

    except Exception as e:
        # print(f"Serper failed with Error: {e}")
        return "ERROR"

print(Serper("What is RAG"))

[{'url': 'https://aws.amazon.com/what-is/retrieval-augmented-generation/', 'title': 'What is RAG? - Retrieval-Augmented Generation AI Explained - AWS', 'content': 'RAG is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources ...'}, {'url': 'https://cloud.google.com/use-cases/retrieval-augmented-generation', 'title': 'What is Retrieval-Augmented Generation (RAG)? - Google Cloud', 'content': 'RAG (Retrieval-Augmented Generation) is an AI framework that combines the strengths of traditional information retrieval systems (such as search and databases)'}, {'url': 'https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/', 'title': 'What Is Retrieval-Augmented Generation aka RAG - NVIDIA Blog', 'content': 'Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched ...'}]
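
In `Serper`, a single timeout or transient HTTP error immediately produces `"ERROR"`. A possible refinement is to retry a few times with exponential backoff before giving up. The sketch below is not part of the notebook: `with_retries` and `unstable_call` are hypothetical names, and the `sleep` parameter exists only so the demo can run without actually waiting.

```python
import time

def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # Wait only if another attempt is still to come.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error

# Demo with a stand-in call that fails twice and then succeeds.
calls = {"count": 0}

def unstable_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return [{"url": "https://example.com", "title": "ok", "content": "ok"}]

waits = []
results = with_retries(unstable_call, sleep=waits.append)  # record waits instead of sleeping
print(results[0]["title"], waits)  # ok [1.0, 2.0]
```

Inside a provider function, the `requests.post(...)` call would be wrapped in a small function passed as `func`, and the existing `except` block would still convert a final failure into `"ERROR"`.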


### Agent 3: Brave Search

In [6]:
# Brave Search Function

def Brave(query):
    api_key = os.getenv("BRAVE_API_KEY")
    url = "https://api.search.brave.com/res/v1/web/search"

    headers = {
        "Accept": "application/json",
        "X-Subscription-Token": api_key
    }

    params = {
        "q": query,
        "count": 5
    }

    try:
        response = requests.get(url, headers=headers, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()

        output = []
        for item in data.get("web", {}).get("results", []):
            output.append({
                "url": item.get("url"),
                "title": item.get("title"),
                "content": item.get("description")
            })

        if output:
            return output
        else:
            # print("Brave failed with Error : No Results Found")
            return "ERROR"

    except Exception as e:
        # print(f"Brave failed with Error: {e}")
        return "ERROR"


print(Brave("What is RAG"))

[{'url': 'https://aws.amazon.com/what-is/retrieval-augmented-generation/', 'title': 'What is RAG? - Retrieval-Augmented Generation AI Explained - AWS', 'content': 'How can AWS support your Retrieval-Augmented Generation requirements? Retrieval-Augmented Generation (RAG) is <strong>the process of optimizing the output of a large language model</strong>, so it references an authoritative knowledge base outside of its training data sources before generating a response.'}, {'url': 'https://cloud.google.com/use-cases/retrieval-augmented-generation', 'title': 'What is Retrieval-Augmented Generation (RAG)? | Google Cloud', 'content': 'RAG (Retrieval-Augmented Generation) is <strong>an AI framework that combines the strengths of traditional information retrieval systems (such as search and databases) with the capabilities of generative large language models (LLMs).</strong> By combining your data and world knowledge with LLM language skills, ...'}, {'url': 'https://blogs.nvidia.com/blog/what-i
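
The Brave descriptions in the output above contain `<strong>` markup, while the Serper snippets are plain text. If the snippets are ever used directly, a small cleanup step could make the three providers consistent. This is a sketch using only the standard library; `clean_snippet` is a hypothetical helper, and a regular expression is only adequate for short snippets like these, not for full HTML pages.

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")

def clean_snippet(text):
    """Remove HTML tags, decode entities, and collapse whitespace in a snippet."""
    if not text:
        return ""
    without_tags = TAG_PATTERN.sub("", text)
    decoded = html.unescape(without_tags)
    return " ".join(decoded.split())

sample = "RAG is <strong>the process of optimizing the output</strong> of a model &amp; more."
print(clean_snippet(sample))
# RAG is the process of optimizing the output of a model & more.
```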

### Final Web Search Function

In [7]:
# Final web search function: try Tavily first, then Brave, then Serper
def WebSearch(query):
    tavily_results = Tavily(query)
    if tavily_results == "ERROR":
        brave_results = Brave(query)
        if brave_results == "ERROR":
            serper_results = Serper(query)
            if serper_results == "ERROR":
                return "ERROR"
            return serper_results
        else:
            return brave_results
    else:
        return tavily_results

In [8]:
# Example usage of the WebSearch function
results = WebSearch("What is RAG?")
if results == "ERROR":
    print("All search providers failed")
else:
    for result in results:
        print(result)

{'url': 'https://www.ibm.com/think/topics/retrieval-augmented-generation', 'title': 'What is RAG (Retrieval Augmented Generation)?', 'content': 'Retrieval augmented generation (RAG) is an architecture for optimizing the performance of an artificial intelligence (AI) model by connecting it with external knowledge bases. RAG allows generative AI models to access additional external knowledge bases, such as internal organizational data, scholarly journals and specialized datasets. RAG systems connect models with supplemental external data in real-time and incorporate up-to-date information into generated responses. By combining generative AI with retrieval systems, RAG models can retrieve and integrate information from multiple data sources in response to complex queries. RAG AI systems plug models into internal data to equip customer support chatbots with the latest knowledge about a company’s products, services and policies. * The **generator**: A generative AI model that creates an out
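
The nested `if`/`else` in `WebSearch` grows one level deeper with every provider. The same fallback order could be expressed as a loop over a list of providers, which also makes it easy to report which one answered. This is a sketch under the notebook's convention that a provider returns `"ERROR"` on failure; `search_with_fallback` and the two stub providers are hypothetical names used so the example runs without API keys.

```python
def search_with_fallback(query, providers):
    """Try each (name, function) pair in order; return the first usable result."""
    for name, provider in providers:
        results = provider(query)
        if results != "ERROR":
            return name, results
    return None, "ERROR"

# Stand-in providers so the sketch runs without API keys.
def failing_provider(query):
    return "ERROR"

def working_provider(query):
    return [{"url": "https://example.com", "title": query, "content": "stub"}]

name, results = search_with_fallback(
    "What is RAG?",
    [("first", failing_provider), ("second", working_provider)],
)
print(name, results[0]["title"])  # second What is RAG?
```

In the notebook, the provider list would be `[("Tavily", Tavily), ("Brave", Brave), ("Serper", Serper)]`, matching the order used in `WebSearch`.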

# Task 2: Implement web scraping with RAG

## Complete Website Scraping

In [9]:
# Scraping Function using BeautifulSoup
def scrape_page(url):
    try:
        response = requests.get(url, timeout=10, headers={"User-Agent": "Mozilla/5.0"})
        response.raise_for_status()
        
        soup = BeautifulSoup(response.text, "html.parser")

        paragraphs = [p.get_text() for p in soup.find_all("p")]
        text = "\n".join(paragraphs)
        
        return text if text.strip() else "ERROR"
    except Exception as e:
        return "ERROR"

In [10]:
# Function to process web search results and scrape content
def web_results(results):
    if results == "ERROR":
        return "ERROR"
    
    scraped_results = []
    for result in results:
        if result == "ERROR":
            continue
        # .get avoids a KeyError if a provider omits the URL
        url = result.get("url")
        if url:
            scraped_text = scrape_page(url)
            if scraped_text != "ERROR":
                doc = Document(page_content=scraped_text, metadata={"source": url})
                scraped_results.append(doc)
    return scraped_results if scraped_results else "ERROR"

## RAG Implementation

In [11]:
embed_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")


In [12]:
def RAG(query):
    results = WebSearch(query)
    scraped_docs = web_results(results)
    if scraped_docs == "ERROR":
        return "ERROR: Failed to scrape web results"

    print("Encoding text...")
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
    all_splits = text_splitter.split_documents(scraped_docs)
    # Index the split documents directly so each chunk keeps its source URL metadata
    vectorstore = FAISS.from_documents(all_splits, embed_model)

    print("Retrieving Relevant Information...")
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
    
    relevant_docs = retriever.invoke(query)
    context = "\n\n".join([doc.page_content for doc in relevant_docs])

    return context
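
To make the `chunk_size=500, chunk_overlap=100` settings in `RAG` more concrete, the sketch below splits a string into fixed-size character windows that share 100 characters with their neighbour. It is a deliberate simplification: `RecursiveCharacterTextSplitter` first tries to break on separators such as paragraphs, lines, and spaces, so its chunks will generally differ from these. `sliding_chunks` is a hypothetical helper, not a LangChain function.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

text = "".join(str(i % 10) for i in range(1200))
chunks = sliding_chunks(text)
print([len(c) for c in chunks])             # [500, 500, 400]
print(chunks[0][-100:] == chunks[1][:100])  # True
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.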

In [13]:
print(RAG("Who is harry potter and what is the storyline of it?"))

Encoding text...
Retrieving Relevant Information...
Harry Potter: The Story That Changed the World

all about a young boy called Harry Potter, who is born into a magical world of
wizards and witches however Lord Voldemort, the villain of the books, heard of
a prophecy where Harry was linked in. Voldemort being scared of death, set out
to kill Harry so he would no longer be in any danger. However, this backfires
terribly as both of Harry's parents are killed and when Voldemort tries to kill
Harry his powers are destroyed leaving Harry with no more than a lightning

Harry discovers that not only is he a wizard, but he is a famous one. He meets two best friends, Ron Weasley and Hermione Granger, and makes his first enemy, Draco Malfoy. At Hogwarts the three friends are all placed into the Gryffindor house. Harry has a knack for the school sport, Quidditch, and is recruited onto the Gryffindor team as its star Seeker.

Harry discovers that not only is he a wizard, but he is a famous one. H
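
In the output above, the same passage ("Harry discovers that not only is he a wizard...") appears to start twice, which can happen when several scraped pages carry the same text. Duplicate chunks take up context space without adding information, so one option is to filter them before joining. This is a sketch with a hypothetical helper, `dedupe_chunks`, that compares chunks after normalizing whitespace and case; it only catches exact repeats, not near-duplicates.

```python
def dedupe_chunks(chunks):
    """Drop chunks whose whitespace-normalized, lowercased text was already seen."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = " ".join(chunk.split()).lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique

retrieved = [
    "Harry discovers that he is a wizard.",
    "Harry  discovers that he is a wizard.",  # same text, extra space
    "He meets Ron Weasley and Hermione Granger.",
]
context = "\n\n".join(dedupe_chunks(retrieved))
print(context.count("Harry"))  # 1
```

In `RAG`, this would be applied to `[doc.page_content for doc in relevant_docs]` before building `context`.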

## Building Agent

In [14]:

prompt_template_main = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You talk like a Philosophy Professor who tries to explain everything intuitively to your students rather than just throwing technical terms at them. Answer all questions to the best of your ability, but do not give any information you are not sure about.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

prompt_template_refine = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are part of an AI agent workflow. We run web scraping for the user's query, but the query may contain references to the past conversation. Your task is to convert the query into an ideal standalone query for web scraping through web search agents.
            You also have the option to return 'NO' if web search is not required at all.
            Be very precise: return either the refined query or 'NO' if web search is not required. You are not supposed to answer the user query itself.""",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

In [15]:
memory = MemorySaver()
model = init_chat_model("gemini-2.5-flash-lite", model_provider="google_genai")
workflow1 = StateGraph(state_schema=MessagesState)
workflow2 = StateGraph(state_schema=MessagesState)

def call_model_main(state: MessagesState):
    prompt=prompt_template_main.invoke(state)
    response = model.invoke(prompt)
    return {"messages": response}

def call_model_refine(state: MessagesState):
    prompt=prompt_template_refine.invoke(state)
    response = model.invoke(prompt)
    return {"messages": response}


workflow1.add_edge(START, "model")
workflow1.add_node("model", call_model_main)
workflow2.add_edge(START, "model")
workflow2.add_node("model", call_model_refine)

app1 = workflow1.compile(checkpointer=memory)
app2 = workflow2.compile(checkpointer=memory)
config = {"configurable": {"thread_id": "000"}}


In [16]:

def Answer(query):
    prompt0 = f"""Crude Question: {query}
            Query for Web Search:"""
    Web_input=[HumanMessage(content=prompt0)]
    web_prompt = app2.invoke({"messages": Web_input},config)
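    web_query = web_prompt["messages"][-1].content.strip()

    # Normalize the reply so variants such as "No." or "'NO'" still skip the web search
    if web_query.strip("'\".").upper() == "NO":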
        input_messages = [HumanMessage(content=query)]
        output = app1.invoke({"messages": input_messages}, config)
    else:
        print(f"Searching the web for: {web_query}")
        context = RAG(web_query)
        print("Finishing Up...\n\n")
        prompt = f""" Context from Web: {context}

            Question: {query}
            Answer:"""
        input_messages = [HumanMessage(content=prompt)]
        output = app1.invoke({"messages": input_messages}, config)
    

    return output["messages"][-1]
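
When scraping fails, `RAG` returns the string `"ERROR: Failed to scrape web results"`, and `Answer` passes that string to the main model as if it were web context. One way to avoid this is to build the prompt in a helper that falls back to the plain question when the context is unusable. This is a sketch: `build_prompt` is a hypothetical name, and it assumes that real scraped context never begins with the word "ERROR".

```python
def build_prompt(query, context):
    """Return the prompt for the main model, skipping unusable web context."""
    if not context or context.startswith("ERROR"):
        # No usable context: fall back to the plain question.
        return query
    return f"Context from Web: {context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What is RAG?", "ERROR: Failed to scrape web results"))
# What is RAG?
print(build_prompt("What is RAG?", "RAG combines retrieval with generation."))
```

The returned string would then be wrapped in a `HumanMessage` and sent to `app1`, as in both branches of `Answer`.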

## Examples

In [17]:
Answer("Hey there, I am Vishesh, I study at IIT Kanpur").pretty_print()


Ah, Vishesh from IIT Kanpur! A pleasure to make your acquaintance. It's always heartening to hear from bright minds engaging with the world of ideas.

So, tell me, what's on your mind today? What philosophical ponderings have brought you here? Don't hesitate to ask, no matter how "crude" you think it might be. Sometimes, the most profound insights emerge from the simplest, most direct questions. Lay it on me!


In [18]:
Answer("Who is harry potter and what is the storyline of it?").pretty_print()

Searching the web for: Who is Harry Potter and what is the storyline of Harry Potter
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Ah, a question about Harry Potter! It's a story that has captured the imagination of many, hasn't it? Think of it like this:

At its heart, Harry Potter is the story of a young boy, Harry, who discovers he's not just an ordinary kid living a rather difficult life with his aunt, uncle, and cousin. He's actually a wizard, born into a hidden world filled with magic, spells, and fantastical creatures.

The core of the storyline revolves around Harry's journey as he enters a special school called Hogwarts School of Witchcraft and Wizardry. Here, he learns to harness his magical abilities, makes deep friendships, and discovers more about his past.

The central conflict, the engine that drives the narrative, is Harry's ongoing struggle against a truly dark and powerful wizard named Lord Voldemort. Voldemort, driven by a fear of death and a

In [19]:
Answer("What are it's main characters?").pretty_print()

Searching the web for: Main characters of Harry Potter
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Ah, an excellent follow-up question, Vishesh! When we talk about the "main characters" in a story like Harry Potter, we're really talking about the individuals who are most central to the plot, the ones whose actions and relationships drive the narrative forward.

From what we've gathered and what's generally known about the series, the absolute cornerstone, the very heart of it all, is **Harry Potter** himself. He's the titular character, the one we follow from his humble beginnings to his confrontations with the dark wizard, Voldemort.

But a hero rarely stands alone, does he? Harry's journey is profoundly shaped by his two closest friends, who are absolutely essential to the story:

*   **Ron Weasley**: He's Harry's first true friend at Hogwarts, a loyal and often humorous companion from a large, loving wizarding family. Ron provides a lot of the emotional gr

In [20]:
Answer("Timeline for academic registeration at IITK?").pretty_print()

Searching the web for: IIT Kanpur academic registration timeline
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Ah, a practical question about academic procedures at IIT Kanpur! Navigating the registration process is an important step for any student.

Based on the context you've provided, it seems there are different dates for different groups of students. For **continuing students**, the **normal registration date** is mentioned as **July 27th**.

There's also a mention of a **late registration date** being **August 5th**. This suggests that if you miss the initial registration period, there's a window to register afterward, though it usually comes with an additional charge, which you've noted as 1000 rupees.

Regarding your specific situation, returning on August 28th and wondering about biometric attendance access until then, that's a very precise logistical question. The provided text doesn't directly answer whether biometric access will be available until 

In [21]:
Answer("How to make Hyderabadi Chicken Biryani at home in cooker?").pretty_print()

Searching the web for: How to make Hyderabadi Chicken Biryani at home in a cooker
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Ah, a culinary quest for Hyderabadi Chicken Biryani made in a cooker! That's a fantastic choice; biryani is indeed a jewel of Indian cuisine, and a cooker can be a great way to achieve those aromatic layers.

While the text you provided gives us a wonderful overview of what biryani is, its festive nature, and some serving suggestions, it doesn't contain the specific step-by-step instructions for making it in a cooker. It mentions there are "two easy ways to make biryani while still retaining the authentic taste" and offers troubleshooting tips for stovetop cooking (like using a heat diffuser to prevent burning), but the actual recipe details for a cooker aren't here.

To make Hyderabadi Chicken Biryani in a cooker, you'd typically be looking for a method that involves marinating the chicken with yogurt, spices, and herbs, then layering

In [22]:
Answer("What would be the best side-dish with it?").pretty_print()

Searching the web for: Best side dishes for Hyderabadi Chicken Biryani
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Ah, excellent question, Vishesh! When you have a dish as rich and flavorful as Hyderabadi Chicken Biryani, the right side dish can truly elevate the entire meal. Think of it like adding a complementary color to a painting – it enhances the overall experience.

Based on the context provided, there are several excellent options that are considered classic accompaniments:

*   **Raita**: This is a yogurt-based side dish, often made with cooling ingredients like cucumber, mint, or onion. The cool, creamy texture of raita provides a wonderful contrast to the heat and richness of the biryani, cleansing the palate between bites.
*   **Pickle**: Indian pickles, known as *achaar*, are typically spicy and tangy. A good pickle adds a burst of sharp flavor that cuts through the richness of the biryani, offering a different kind of zest.
*   **Chutney**: Simi

In [23]:
Answer("What are your views, should I continue my education or I shall drop college and become a monk because there is too much study here and I just want to eat and sleep in life").pretty_print()


Vishesh, that's a deeply personal question, and it touches upon some fundamental aspects of life, purpose, and happiness. As an AI, I don't have personal "views" or the capacity to experience life in the way a human does, so I can't tell you what *you* should do. However, I can help you think through some of the considerations involved.

Let's approach this by looking at the two paths you've presented: continuing your education and becoming a monk, with the underlying desire being to "eat and sleep."

**Continuing Your Education (and the "Too Much Study" Aspect):**

*   **The Nature of Education:** It's true, higher education, especially at an institution like IIT Kanpur, is demanding. It requires significant effort, focus, and often involves periods of intense study. This is often the case because the goal is to equip you with specialized knowledge and skills that can lead to a variety of fulfilling careers and opportunities.
*   **The "Eat and Sleep" Desire:** This desire is a very 

In [None]:

Answer("Why is my thumb a thumb not a toe?").pretty_print()

Searching the web for: Why is a thumb a thumb and not a toe
Encoding text...
Retrieving Relevant Information...
Finishing Up...



Vishesh, that's a wonderfully simple question that gets to the heart of how we categorize things, and how our bodies are designed! It's about both function and language.

From the context you've provided, the primary reason a thumb is called a "thumb" and not a "toe" boils down to two main points:

1.  **Functionality (Opposability):** The most significant difference highlighted is that the **thumb is an opposable digit**. Think about it: you can bring your thumb across your palm to meet your fingers. This allows us to grasp, grip, and manipulate objects with incredible dexterity. Toes, on the other hand, are generally aligned parallel to each other and aren't designed for this kind of grasping. They are primarily for balance, pushing off the ground when walking, and providing support.

2.  **Language and Naming:** Our language reflects these functional dif