# AI Competitor Intelligence Agent Tutorial
This notebook demonstrates how to build an AI-powered competitor analysis system.

## What we'll learn:
1. Setting up API connections (Exa, Firecrawl, OpenAI)
2. Finding competitor URLs automatically
3. Extracting competitor data from websites
4. Generating structured comparison reports
5. Creating actionable business insights

In [7]:
%pip install exa-py==1.7.1 firecrawl-py==1.9.0 duckduckgo-search==7.2.1 phidata==2.7.3 streamlit==1.41.1

Note: you may need to restart the kernel to use updated packages.


In [8]:
# Import required libraries
from exa_py import Exa
from phi.agent import Agent
from phi.tools.firecrawl import FirecrawlTools
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo
import pandas as pd

print("✅ All libraries imported successfully!")

✅ All libraries imported successfully!


In [9]:
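# Set up API keys (enter each key at its prompt; the input is hidden)
import getpass
import os

# Keep each key in a variable too, because the cells below refer to these names
OPENAI_API_KEY = getpass.getpass("OpenAI API key: ")
FIRECRAWL_API_KEY = getpass.getpass("Firecrawl API key: ")
EXA_API_KEY = getpass.getpass("Exa API key: ")

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
os.environ["FIRECRAWL_API_KEY"] = FIRECRAWL_API_KEY
os.environ["EXA_API_KEY"] = EXA_API_KEY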
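
An empty key is easy to submit by accident at a hidden prompt, and the failure only shows up later as an HTTP 400 from the API. The helper below is a minimal sketch of one way to guard against that. It reuses a key already present in the environment and prompts only when the key is missing. The name `require_api_key` is hypothetical and is not part of any of the libraries used here.

```python
import getpass
import os


def require_api_key(name: str) -> str:
    """Return a non-empty API key, prompting only if the environment lacks it.

    Hypothetical helper for this notebook; not part of exa-py, firecrawl-py or phidata.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        # getpass hides the typed value; the prompt names the expected key.
        key = getpass.getpass(f"Enter {name}: ").strip()
    if not key:
        # Fail here instead of sending an empty key to the API later.
        raise ValueError(f"{name} must not be empty")
    os.environ[name] = key
    return key
```

With this in place, a cell could read `EXA_API_KEY = require_api_key("EXA_API_KEY")`, and rerunning the cell would not prompt again once the key is set.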

In [10]:
# Initialize Exa client
exa = Exa(api_key=EXA_API_KEY)
print("🔑 API keys configured")

🔑 API keys configured


In [11]:
firecrawl_tools = FirecrawlTools(
    api_key=FIRECRAWL_API_KEY,
    scrape=False,
    crawl=True,
    limit=5
)

firecrawl_agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini", api_key=OPENAI_API_KEY),
    tools=[firecrawl_tools, DuckDuckGo()],
    show_tool_calls=True,
    markdown=True
)



In [12]:
comparision_agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini", api_key=OPENAI_API_KEY),
    show_tool_calls=True,
    markdown=True
)

print("AI agents initialized successfully")

AI agents initialized successfully


In [13]:
def get_competitor_urls(url=None, description=None):
    """
    Find competitor URLs using Exa's search capabilities
    
    Args:
        url: Company website URL
        description: Company description text
    
    Returns:
        List of competitor URLs
    """
    if url:
        # Find similar companies based on URL
        result = exa.find_similar(
            url=url,
            num_results=10,
            exclude_source_domain=True,
            category="company"
        )
    elif description:
        # Search for companies based on description
        result = exa.search(
            description,
            type="neural",
            category="company",
            use_autoprompt=True,
            num_results=10
        )
    else:
        raise ValueError("Please provide either a URL or a description.")
    
    competitor_urls = [item.url for item in result.results]
    return competitor_urls

In [14]:
# Both arguments are passed here, so the `url` branch runs and the description is ignored.
get_competitor_urls("https://www.moneycontrol.com/","For a financial portal born in late 1999, just when bursting of the dotcom bubble was about to nearly bring down both financial markets and the fledgling worldwide web, we couldn't have chosen a more difficult time to launch. But it was really passion and belief that saw us through. A single-minded passion to become the country's greatest resource for financial information on the Internet. And the belief, that through it, we would be able to make a difference to people's financial lives.")

ValueError: Request failed with status code 400: {"requestId":"fde7c52d5bd24df619c0c97f5e5f1a38","error":"x-api-key header must not be empty"}

In [None]:
test_url = "https://www.moneycontrol.com/"
competitors = get_competitor_urls(url=test_url)
print(f"Found competitors: {competitors}")

Found competitors: ['https://moneycontrol.net/', 'https://www.cnbctv18.com/', 'https://stat2.moneycontrol.com/', 'https://www.etnownews.com/', 'https://www.capitalmarket.com/', 'https://investmentguruindia.com/', 'https://www.goodreturns.in/', 'https://img.moneycontrol.co.in/', 'https://www.zeebiz.com/hindi/']
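
Several of the results above (`moneycontrol.net`, `stat2.moneycontrol.com`, `img.moneycontrol.co.in`) carry the same brand name as the source site, so they are unlikely to be real competitors. The sketch below shows one possible post-filter using only the standard library. The function name `filter_same_brand` is hypothetical. The brand heuristic (the first hostname label after `www`) is an assumption that works for `www.moneycontrol.com` but would misfire on a source such as `app.example.com`.

```python
from typing import List
from urllib.parse import urlparse


def filter_same_brand(source_url: str, candidates: List[str]) -> List[str]:
    """Drop candidates whose hostname contains the source site's brand label.

    Hypothetical helper. The brand is guessed as the first hostname label
    after an optional "www", which is a rough heuristic.
    """
    host = (urlparse(source_url).hostname or "").lower()
    labels = [part for part in host.split(".") if part and part != "www"]
    if not labels:
        # No usable hostname, so nothing can be filtered.
        return list(candidates)
    brand = labels[0]

    kept: List[str] = []
    seen_hosts = set()
    for url in candidates:
        cand_host = (urlparse(url).hostname or "").lower()
        if brand in cand_host.split("."):
            continue  # same brand on another domain or subdomain
        if cand_host in seen_hosts:
            continue  # keep only one URL per hostname
        seen_hosts.add(cand_host)
        kept.append(url)
    return kept
```

Calling `filter_same_brand(test_url, competitors)` on the list above would keep the six URLs that do not contain the `moneycontrol` label.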


In [None]:
# TODO: adjust this function to work with multiple URLs

def extract_competitor_info(competitor_url: str):
    """
    Extract detailed information from competitor websites
    
    Args:
        competitor_url: URL of competitor website
        
    Returns:
        Dictionary with competitor data
    """
    try:
        # Use AI agent to crawl and summarize the website
        crawl_response = firecrawl_agent.run(f"Crawl and summarize {competitor_url}")
        crawled_data = crawl_response.content
        
        return {
            "competitor": competitor_url,
            "data": crawled_data
        }
    except Exception as e:
        print(f"Error extracting info for {competitor_url}: {e}")
        return {
            "competitor": competitor_url,
            "error": str(e)
        }

# Test the function 
sample_data = extract_competitor_info(competitors[0])
print("Sample competitor data extracted!")
print(f"Data length: {len(str(sample_data))}")

Sample competitor data extracted!
Data length: 850


In [None]:
print(sample_data)

{'competitor': 'https://moneycontrol.net/', 'data': '\nRunning:\n - crawl_website(url=https://moneycontrol.net/, limit=5)\n\nThe website **Moneycontrol.net** appears to be currently for sale. The content suggests that individuals can purchase the domain through BuyDomains.com by contacting their numbers.\n\n### Summary:\n\n- **Website:** [Moneycontrol.net](https://moneycontrol.net/)\n- **Status:** Domain is available for sale.\n- **Contact Information for Purchase:** \n  - Phone: 781-373-6841 or 844-896-7299\n\nUnfortunately, the website does not seem to provide any current financial information, articles, or services related to money management or investing, which might typically be expected from a financial news platform.\n\nFor further details or to inquire about purchasing the domain, visitors are directed to the BuyDomains website.'}
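
The extraction function handles one URL at a time. A minimal sketch of running it over a whole list follows. The name `extract_many` is hypothetical, and the extraction function is passed in as a parameter so the loop can be tried with a stub before spending API calls. It assumes each record is a dictionary that contains an `"error"` key on failure, as `extract_competitor_info` returns.

```python
from typing import Callable, Dict, List, Tuple


def extract_many(
    urls: List[str],
    extract_fn: Callable[[str], Dict[str, str]],
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Run extract_fn on every URL and split the records into successes and failures.

    Hypothetical helper; extract_fn is expected to behave like
    extract_competitor_info (one dict per URL, with an "error" key on failure).
    """
    successes: List[Dict[str, str]] = []
    failures: List[Dict[str, str]] = []
    for url in urls:
        try:
            record = extract_fn(url)
        except Exception as exc:
            # Guard against extractors that raise instead of returning an error record.
            record = {"competitor": url, "error": str(exc)}
        if "error" in record:
            failures.append(record)
        else:
            successes.append(record)
    return successes, failures
```

In this notebook the call would look like `results, errors = extract_many(competitors, extract_competitor_info)`. Each URL triggers a crawl and a model call, so a short list is a sensible first run.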
