# 🔍 Building an AI-Powered Web Search Agent with OpenAI and Tavily 🚀

Hey there! Welcome to this exciting guide where we'll create something awesome - a smart search agent that combines the power of OpenAI's language models with Tavily's search capabilities! 🌟 

## 🎯 What We'll Build

We're going to create a super cool search agent that can:
1. 🌐 Search the web in real-time for accurate information
2. 🧠 Use OpenAI's powerful GPT models to understand and process search results
3. ⚡ Provide contextual and up-to-date responses to queries

## ✅ Prerequisites

Before we jump in, make sure you have these things ready:
- 🔑 An OpenAI API key
- 🎯 A Tavily API key (get one at tavily.com)

## 🎮 Part 1: Setting Up Our Environment

First things first - let's get our tools ready! We'll need to install the Tavily Python package to interact with their search API:


In [10]:
# Install necessary libraries
!pip install streamlit openai requests python-dotenv



In [11]:
# Load environment variables
from dotenv import load_dotenv
load_dotenv()

import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")


In [12]:
if TAVILY_API_KEY:
    print("✅ Tavily API Key loaded successfully.")
else:
    print("❌ Error: Tavily API Key not found. Check your .env file.")


✅ Tavily API Key loaded successfully.



## 🛠️ Part 2: Building Our Search Tools

Let's create the foundation of our search agent! We'll define a set of tools that our AI can use to search the web:

In [13]:
import json
from openai import OpenAI
from tavily import TavilyClient
import pprint

In [14]:
# Initialize Tavily
tavily = TavilyClient(api_key=TAVILY_API_KEY)

# Search query Ombu Urban Lab
import requests

def web_search(city, topic, timeframe, num_results=5):
    query = f"{topic} urban planning report OR government policy study OR academic paper in {city} during {timeframe}"
    url = "https://api.tavily.com/search"
    headers = {"Authorization": f"Bearer {TAVILY_API_KEY}"}
    payload = {"query": query, "num_results": num_results}

    response = requests.post(url, json=payload, headers=headers)
    if response.status_code == 200:
        return response.json().get("results", [])
    else:
        print("Error:", response.text)
        return []


In [15]:
web_search("Tokyo", "Green Areas", "2020-2025", num_results=5)

[{'title': "Policy Assessment of Japan's 'Decarbonisation-Leading Regions'",
  'url': 'https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/smc2.70002?af=R',
  'content': 'Refers to smart urban development for urban areas. Ministry of economy, trade and industry: Ministry of land, infrastructure, transport and tourism: Digital garden city-state concept: 2021: Cabinet office: Using digital technology to solve social issues and enhance the attractiveness of local communities. Refers to smart rural development.',
  'score': 0.16791369,
  'raw_content': None},
 {'title': 'Urban form and its impacts on air pollution and access to green space ...',
  'url': 'https://pmc.ncbi.nlm.nih.gov/articles/PMC9876236/',
  'content': 'Access to green spaces in urban areas has been recognized as an essential component of the urban environment . Green spaces in cities mostly include semi-natural vegetation cover, such as street trees, lawns, parks, gardens, forests, green roofs . Therefore, this study 

In [16]:
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for urban planning information, reports, studies, or academic papers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query including topic, city, and timeframe"}
                },
                "required": ["query"]
            },
        },
    },
]



## 🎓 Part 3: Creating Our AI Agent

Now comes the exciting part! Let's create our AI agent that can understand questions and use our search tools to find answers:


In [21]:
from datetime import datetime

messages = [
    {
        "role": "system",
        "content": f"""
        You are a helpful Urban Research Assistant specializing in mobility, urban planning, green spaces, and public policies.
        
        Your mission is to:
        1. Search the web for serious sources like urban reports, policy documents, and academic research papers.
        2. Summarize the main points clearly in bullet form.
        3. Generate 1–2 research hypotheses based on the summaries.
        
        Make as many tool calls as needed to find good-quality sources before responding.
        Focus on reliable information, avoid blogs, tourism websites, and commercial content.
        
        Current date: {datetime.now().strftime('%Y-%m-%d')}
        """
    },
    {
        "role": "user",
        "content": "Find urban planning trends related to shared mobility in Utrecht for 2020-2025."
    }
]


In [23]:
def invoke_model(messages):
    # Initialize the OpenAI client
    client = OpenAI()

    # Make a ChatGPT API call with tool calling
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )

    return completion.choices[0].message.content

In [24]:
# Initialize the OpenAI client
client = OpenAI()

# Make a ChatGPT API call with tool calling
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=TOOLS,
    messages=messages
)

response = completion.choices[0].message
pprint.pprint(response.tool_calls)

# Parse the response to get the tool call arguments
if response.tool_calls:
    # Process each tool call
    for tool_call in response.tool_calls:
        # Get the tool call arguments
        tool_call_arguments = json.loads(tool_call.function.arguments)
        if tool_call.function.name == "search_web":
            print("Searching for", tool_call_arguments)
            search_results = search_web(tool_call_arguments["query"])
            messages.append({"role": "assistant", "content": f"{tool_call_arguments["query"]}: {search_results}"})
    print(invoke_model(messages))

else:
    # If there are no tool calls, return the response content
    print(response.content)

[ChatCompletionMessageToolCall(id='call_K0qsw7cVmAsHDHyeeAyz6JwC', function=Function(arguments='{"query":"Utrecht shared mobility urban planning trends 2020-2025"}', name='web_search'), type='function')]
I’ll conduct a search for academic reports, urban planning documents, and other credible sources regarding the trends in shared mobility in Utrecht from 2020 to 2025. Please hold on while I gather the information. 

### Summary of Findings on Urban Planning Trends Related to Shared Mobility in Utrecht (2020-2025)

1. **Integration of Shared Mobility with Public Transport**:
   - Local authorities are emphasizing the integration of shared mobility services (such as bikes, scooters, and car-sharing) with public transport to enhance connectivity and reduce dependency on private cars.
   - The introduction of unified ticketing systems is being explored to provide seamless transitions between different modes of transport.

2. **Sustainability Goals**:
   - Utrecht aims to become carbon-neut

In [26]:
# create a function call the OpenAi API 

def agent(messages):

    # Initialize the OpenAI client
    client = OpenAI()

    # Make a ChatGPT API call with tool calling
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        tools=TOOLS, # here we pass the tools to the LLM
        messages=messages
    )

    # Get the response from the LLM
    response = completion.choices[0].message

    # Parse the response to get the tool call arguments
    if response.tool_calls:
        # Process each tool call
        for tool_call in response.tool_calls:
            # Get the tool call arguments
            tool_call_arguments = json.loads(tool_call.function.arguments)
            if tool_call.function.name == "save_memory":
                return save_memory(tool_call_arguments["memory"])
            elif tool_call.function.name == "web_search":
                search_results = search_web(tool_call_arguments["query"])
                messages.append({"role": "assistant", "content": f"Here are the search results: {search_results}"})
                return invoke_model(messages)
    else:
        # If there are no tool calls, return the response content
        return response.content

In [25]:
messages

[{'role': 'system',
  'content': '\n        You are a helpful Urban Research Assistant specializing in mobility, urban planning, green spaces, and public policies.\n        \n        Your mission is to:\n        1. Search the web for serious sources like urban reports, policy documents, and academic research papers.\n        2. Summarize the main points clearly in bullet form.\n        3. Generate 1–2 research hypotheses based on the summaries.\n        \n        Make as many tool calls as needed to find good-quality sources before responding.\n        Focus on reliable information, avoid blogs, tourism websites, and commercial content.\n        \n        Current date: 2025-04-28\n        '},
 {'role': 'user',
  'content': 'Find urban planning trends related to shared mobility in Utrecht for 2020-2025.'}]