# AI Competitor Intelligence Agent Tutorial
This notebook demonstrates how to build an AI-powered competitor analysis system.

## What we'll learn:
1. Setting up API connections (Exa, Firecrawl, OpenAI)
2. Finding competitor URLs automatically
3. Extracting competitor data from websites
4. Generating structured comparison reports
5. Creating actionable business insights

In [2]:
# Import required libraries
from exa_py import Exa
from phi.agent import Agent
from phi.tools.firecrawl import FirecrawlTools
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo
import pandas as pd

print("✅ All libraries imported successfully!")

✅ All libraries imported successfully!


In [None]:
# Set up API keys (replace the empty strings with your own keys; never commit real keys)
OPENAI_API_KEY = ""
FIRECRAWL_API_KEY = ""
EXA_API_KEY = ""

# Initialize Exa client
exa = Exa(api_key=EXA_API_KEY)
print("🔑 API keys configured")

🔑 API keys configured
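Hardcoding keys in a notebook makes them easy to leak. A minimal sketch of an alternative, assuming the keys are exported as environment variables under the same names used above (the `load_api_keys` helper is hypothetical and not part of any of the libraries imported here):

```python
import os

# Names match the variables used in this notebook; adjust if yours differ.
REQUIRED_KEYS = ("OPENAI_API_KEY", "FIRECRAWL_API_KEY", "EXA_API_KEY")

def load_api_keys(env=None, required=REQUIRED_KEYS):
    """Return a dict of API keys read from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}

# In the notebook, this could replace the hardcoded strings:
# keys = load_api_keys()
# exa = Exa(api_key=keys["EXA_API_KEY"])
```

Failing early with the list of missing names is usually easier to debug than an authentication error raised later by one of the clients.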


In [5]:
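# Firecrawl tool configured to crawl sites (not single-page scrape), capped by limit=5
firecrawl_tools = FirecrawlTools(
    api_key=FIRECRAWL_API_KEY,
    scrape=False,
    crawl=True,
    limit=5
)

# Agent that crawls and summarizes competitor websites
firecrawl_agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini", api_key=OPENAI_API_KEY),
    tools=[firecrawl_tools, DuckDuckGo()],
    show_tool_calls=True,
    markdown=True
)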



In [6]:
comparision_agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini", api_key=OPENAI_API_KEY),
    show_tool_calls=True,
    markdown=True
)

print("AI agent initialized successfully")

AI agent initialized successfully


In [9]:
def get_competitor_urls(url=None, description=None):
    """
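    Find competitor URLs using Exa's search capabilities.

    Args:
        url: Company website URL. Takes precedence when both arguments are given.
        description: Company description text. Used only when no URL is given.

    Returns:
        List of competitor URLs.

    Raises:
        ValueError: If neither a URL nor a description is provided.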
    """
    if url:
        # Find similar companies based on URL
        result = exa.find_similar(
            url=url,
            num_results=10,
            exclude_source_domain=True,
            category="company"
        )
    elif description:
        # Search for companies based on description
        result = exa.search(
            description,
            type="neural",
            category="company",
            use_autoprompt=True,
            num_results=10
        )
    else:
        raise ValueError("Please provide either a URL or a description.")
    
    competitor_urls = [item.url for item in result.results]
    return competitor_urls

In [10]:
# Both arguments are passed here, but the function checks `url` first,
# so the description is ignored and Exa's find_similar is used.
get_competitor_urls(
    url="https://www.moneycontrol.com/",
    description="For a financial portal born in late 1999, just when bursting of the dotcom bubble was about to nearly bring down both financial markets and the fledgling worldwide web, we couldn't have chosen a more difficult time to launch. But it was really passion and belief that saw us through. A single-minded passion to become the country's greatest resource for financial information on the Internet. And the belief, that through it, we would be able to make a difference to people's financial lives."
)

['https://moneycontrol.net/',
 'https://www.cnbctv18.com/',
 'https://stat2.moneycontrol.com/',
 'https://www.etnownews.com/',
 'https://www.capitalmarket.com/',
 'https://investmentguruindia.com/',
 'https://www.goodreturns.in/',
 'https://img.moneycontrol.co.in/',
 'https://www.zeebiz.com/hindi/']

In [15]:
test_url = "https://www.moneycontrol.com/"
competitors = get_competitor_urls(url=test_url)
print(f"Found competitors: {competitors}")

Found competitors: ['https://moneycontrol.net/', 'https://www.cnbctv18.com/', 'https://stat2.moneycontrol.com/', 'https://www.etnownews.com/', 'https://www.capitalmarket.com/', 'https://investmentguruindia.com/', 'https://www.goodreturns.in/', 'https://img.moneycontrol.co.in/', 'https://www.zeebiz.com/hindi/']
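Several of the returned URLs share the source site's name (`moneycontrol.net`, `stat2.moneycontrol.com`, `img.moneycontrol.co.in`) and are unlikely to be real competitors. A minimal sketch of a post-filter, assuming the brand name is passed in by hand; `drop_same_brand` is a hypothetical helper, not part of Exa's API, and the label match is a simple heuristic rather than a proper public-suffix lookup:

```python
from urllib.parse import urlparse

def drop_same_brand(urls, brand):
    """Remove URLs whose hostname has `brand` as one of its dot-separated labels."""
    brand = brand.lower()
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # "stat2.moneycontrol.com" -> ["stat2", "moneycontrol", "com"]
        if brand not in host.split("."):
            kept.append(url)
    return kept

found = [
    "https://moneycontrol.net/",
    "https://www.cnbctv18.com/",
    "https://stat2.moneycontrol.com/",
    "https://www.etnownews.com/",
    "https://www.capitalmarket.com/",
    "https://investmentguruindia.com/",
    "https://www.goodreturns.in/",
    "https://img.moneycontrol.co.in/",
    "https://www.zeebiz.com/hindi/",
]
filtered = drop_same_brand(found, "moneycontrol")
print(filtered)
```

With the list above, the three `moneycontrol` hosts are dropped and the remaining six URLs keep their original order.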


In [16]:
# TODO: Adjust this function to work with multiple URLs

def extract_competitor_info(competitor_url: str):
    """
    Extract detailed information from competitor websites
    
    Args:
        competitor_url: URL of competitor website
        
    Returns:
        Dictionary with competitor data
    """
    try:
        # Use AI agent to crawl and summarize the website
        crawl_response = firecrawl_agent.run(f"Crawl and summarize {competitor_url}")
        crawled_data = crawl_response.content
        
        return {
            "competitor": competitor_url,
            "data": crawled_data
        }
    except Exception as e:
        print(f"Error extracting info for {competitor_url}: {e}")
        return {
            "competitor": competitor_url,
            "error": str(e)
        }

# Test the function on the first competitor URL
sample_data = extract_competitor_info(competitors[0])
print("Sample competitor data extracted!")
print(f"Data length: {len(str(sample_data))}")

Sample competitor data extracted!
Data length: 722
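The function above handles one URL at a time. One possible way to extend it to a list of URLs is a thin wrapper that calls the extractor per URL and separates successes from failures using the `"error"` key it already returns. This is a sketch only: `extract_many` and its `max_sites` parameter are hypothetical names, and `fake_extract` is a stand-in so the example runs without API calls.

```python
def extract_many(urls, extract_fn, max_sites=None):
    """Apply `extract_fn` to each URL and split the records into (successes, failures)."""
    successes, failures = [], []
    # urls[:None] is the full list, so max_sites=None means "no cap"
    for url in urls[:max_sites]:
        record = extract_fn(url)
        if "error" in record:
            failures.append(record)
        else:
            successes.append(record)
    return successes, failures

def fake_extract(url):
    # Stand-in that mimics the return shape of extract_competitor_info
    if "bad" in url:
        return {"competitor": url, "error": "simulated failure"}
    return {"competitor": url, "data": f"summary of {url}"}

ok, failed = extract_many(
    ["https://a.example/", "https://bad.example/", "https://b.example/"],
    fake_extract,
    max_sites=2,
)
print(len(ok), len(failed))
```

In the notebook, the call would look like `extract_many(competitors, extract_competitor_info, max_sites=3)`. Capping the number of sites keeps crawl cost and runtime predictable while experimenting.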


In [17]:
print(sample_data)

{'competitor': 'https://moneycontrol.net/', 'data': "\nRunning:\n - crawl_website(url=https://moneycontrol.net/, limit=5)\n\nThe website **[Moneycontrol.net](https://moneycontrol.net/)** currently appears to be parked and is not offering any content related to financial news or market updates. Instead, it states that the domain is up for sale and provides contact information for purchasing it. \n\n### Summary\n- **Status**: The domain is for sale.\n- **Content**: No financial news or stock market updates available.\n- **Contact Information**: For purchasing, call BuyDomains.com.\n\nIf you're looking for information about financial markets or Indian economy news, please refer to other reputable finance websites."}
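The sample above shows that a returned URL can turn out to be a parked domain rather than a competitor. A rough sketch of a keyword check that flags such records before they reach the comparison step; the phrase list is an assumption based on this one output and will miss other wordings, and `looks_parked` is a hypothetical helper:

```python
# Phrases taken from the sample summary above; extend as needed.
PARKED_HINTS = (
    "appears to be parked",
    "domain is for sale",
    "domain is up for sale",
)

def looks_parked(record, hints=PARKED_HINTS):
    """Return True if the record's summary text contains a parked-domain phrase."""
    text = str(record.get("data", "")).lower()
    return any(hint in text for hint in hints)

parked_record = {
    "competitor": "https://moneycontrol.net/",
    "data": "The website currently appears to be parked. The domain is for sale.",
}
live_record = {
    "competitor": "https://www.cnbctv18.com/",
    "data": "Business news and market coverage.",
}
print(looks_parked(parked_record), looks_parked(live_record))
```

Because the summary is generated by a language model, the wording may vary between runs, so a check like this is best treated as a first-pass filter rather than a reliable classifier.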
