# Agentic AI for Market Intelligence

## Understanding Market Trends for a Startup’s Product Idea

This notebook explores how multi-agent AI systems can autonomously gather and analyze market intelligence to help startups evaluate product ideas.

We build AI agents that perform the following tasks:

1. **Web Scraping** – Using LlamaIndex's `SimpleWebPageReader` to collect market data from relevant sources.
2. **Sentiment Analysis** – Analyzing customer sentiment about similar products or industry trends.
3. **Competitive Intelligence** – Extracting key insights about competitors, potential gaps, and market trends.

### Why This Matters for Startups

- **Validate Product-Market Fit:** Identify whether there is demand for the startup’s product idea.
- **Understand Competitor Positioning:** Learn from existing competitors and their market strategies.
- **Discover Emerging Trends:** Use AI-driven insights to spot trends that influence product development.

### Requirements
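Ensure you have the necessary dependencies installed:
- `llama-index`
- `llama-index-llms-groq`
- `llama-index-readers-web`
- `llama-index-vector-stores-faiss`
- `faiss-cpu`
- `groq`
- `beautifulsoup4`
- `requests`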

Let's begin by setting up the required tools and agents.


In [1]:
!pip install llama-index beautifulsoup4 requests groq llama-index-llms-groq llama-index-readers-web faiss-cpu


Collecting llama-index
  Downloading llama_index-0.12.15-py3-none-any.whl.metadata (12 kB)
Collecting groq
  Downloading groq-0.16.0-py3-none-any.whl.metadata (14 kB)
Collecting llama-index-llms-groq
  Downloading llama_index_llms_groq-0.3.1-py3-none-any.whl.metadata (2.3 kB)
Collecting llama-index-readers-web
  Downloading llama_index_readers_web-0.3.5-py3-none-any.whl.metadata (1.2 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.10.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.4 kB)
Collecting llama-index-agent-openai<0.5.0,>=0.4.0 (from llama-index)
  Downloading llama_index_agent_openai-0.4.3-py3-none-any.whl.metadata (727 bytes)
Collecting llama-index-cli<0.5.0,>=0.4.0 (from llama-index)
  Downloading llama_index_cli-0.4.0-py3-none-any.whl.metadata (1.5 kB)
Collecting llama-index-core<0.13.0,>=0.12.15 (from llama-index)
  Downloading llama_index_core-0.12.15-py3-none-any.whl.metadata (2.5 kB)
Collecting llama-index-embeddings-openai<0.4.0,>=0.3.0 (from llama-index)
  Down

## Import Required Modules

We import necessary components from `llama_index` to set up web scraping, sentiment analysis, and competitive intelligence agents.


In [2]:
%pip install llama-index-vector-stores-faiss

Collecting llama-index-vector-stores-faiss
  Downloading llama_index_vector_stores_faiss-0.3.0-py3-none-any.whl.metadata (658 bytes)
Downloading llama_index_vector_stores_faiss-0.3.0-py3-none-any.whl (3.9 kB)
Installing collected packages: llama-index-vector-stores-faiss
Successfully installed llama-index-vector-stores-faiss-0.3.0


In [4]:
# LlamaIndex: tools, agents, workflow context, and indexing
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.core.tools import FunctionTool
from llama_index.core.agent.workflow import ReActAgent, AgentWorkflow
from llama_index.core.workflow import Context
from llama_index.readers.web import SimpleWebPageReader
from llama_index.vector_stores.faiss import FaissVectorStore

# LLM providers
from llama_index.llms.openai import OpenAI
from llama_index.llms.groq import Groq

# Third-party helpers
import faiss
import requests
from bs4 import BeautifulSoup


INFO:faiss.loader:Loading faiss with AVX2 support.
INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
INFO:faiss:Failed to load GPU Faiss: name 'GpuIndexIVFFlat' is not defined. Will not load constructor refs for GPU indexes.


## Define Tools

We use LlamaIndex’s `SimpleWebPageReader` to scrape content from webpages. This tool extracts raw text from URLs for further processing.


In [5]:
def scrape_webpage(url: str) -> str:
    """Fetch a webpage and return its content as plain text."""
    try:
        # html_to_text=True converts the raw HTML into readable text.
        documents = SimpleWebPageReader(html_to_text=True).load_data([url])
        # Optional: depending on the use case, index the documents with FAISS
        # instead of returning the raw text.
        # d = 1536  # embedding dimension; must match the embedding model in use
        # faiss_index = faiss.IndexFlatL2(d)
        # vector_store = FaissVectorStore(faiss_index=faiss_index)
        # storage_context = StorageContext.from_defaults(vector_store=vector_store)
        # index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
        # return index
        return documents[0].text if documents else "No content found."
    except Exception as e:
        # Return the error as text so the agent can see what went wrong.
        return f"Error: {e}"

scraper_tool = FunctionTool.from_defaults(fn=scrape_webpage, name="web_scraper", description="Extracts text from a webpage.")



async def write_report(ctx: Context, report_content: str) -> str:
    """Useful for writing a report on a given topic. Your input should be a markdown formatted report."""
    current_state = await ctx.get("state")
    current_state["report_content"] = report_content
    await ctx.set("state", current_state)
    return "Report written."
writer_tool = FunctionTool.from_defaults(fn=write_report, name="report_writer", description="Writes a markdown-formatted report on a given topic and stores it in the workflow state.")


async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
    """Useful for recording notes on a given topic. Your input should be notes with a title to save the notes under."""
    current_state = await ctx.get("state")
    if "research_notes" not in current_state:
        current_state["research_notes"] = {}
    current_state["research_notes"][notes_title] = notes
    await ctx.set("state", current_state)
    return "Notes recorded."
notes_tool = FunctionTool.from_defaults(fn=record_notes, name="note-taker", description="Records notes on a given topic under the given title in the workflow state.")


async def review_report(ctx: Context, review: str) -> str:
    """Useful for reviewing a report and providing feedback. Your input should be a review of the report."""
    current_state = await ctx.get("state")
    current_state["review"] = review
    await ctx.set("state", current_state)
    return "Report reviewed."
reviewer_tool = FunctionTool.from_defaults(fn=review_report, name="reviews_report", description="Stores a review of the report, with feedback, in the workflow state.")
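
The three state tools above share one read-modify-write pattern: fetch the `"state"` dictionary from the context, update a key, and store it back. The sketch below illustrates that pattern in isolation. `InMemoryContext` is a hypothetical stand-in written for this example, not the LlamaIndex `Context` class, and the note titles and texts are placeholder values.

```python
import asyncio
from typing import Any, Dict


class InMemoryContext:
    """Hypothetical stand-in for the workflow context: an async key-value store."""

    def __init__(self) -> None:
        # Assumption: the state starts as an empty dictionary.
        self._data: Dict[str, Any] = {"state": {}}

    async def get(self, key: str) -> Any:
        return self._data[key]

    async def set(self, key: str, value: Any) -> None:
        self._data[key] = value


async def record_notes(ctx: InMemoryContext, notes: str, notes_title: str) -> str:
    # Same read-modify-write steps as the notebook's record_notes tool.
    state = await ctx.get("state")
    state.setdefault("research_notes", {})[notes_title] = notes
    await ctx.set("state", state)
    return "Notes recorded."


async def demo() -> Dict[str, Any]:
    ctx = InMemoryContext()
    await record_notes(ctx, "first note", "topic-a")
    await record_notes(ctx, "second note", "topic-b")
    return await ctx.get("state")


# In a notebook cell, use `final_state = await demo()` instead of asyncio.run.
final_state = asyncio.run(demo())
print(final_state)
```

Because every tool writes under its own key (`research_notes`, `report_content`, `review`), the agents can share one state dictionary without overwriting each other's results.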

## Define Sentiment Analysis Tool

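We use OpenAI’s LLM to analyze sentiment from extracted content. The function classifies text as **Positive**, **Negative**, or **Neutral**. It references a global `llm` that is defined in a later cell; Python resolves the name when the tool is called, so that cell must run before the agents do.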


In [6]:
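def analyze_sentiment(text: str) -> str:
    """Classify the sentiment of the given text as Positive, Negative, or Neutral."""
    prompt = f"Analyze the sentiment of the following text and return Positive, Negative, or Neutral: \n\n{text}"
    # llm.complete returns a CompletionResponse; return its text to match the `-> str` annotation.
    return llm.complete(prompt).text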

sentiment_tool = FunctionTool.from_defaults(fn=analyze_sentiment, name="sentiment_analysis", description="Analyzes sentiment of the given text.")
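
The prompt asks for one of three labels, but a model may answer with a full sentence. As a minimal sketch, the hypothetical helper `normalize_sentiment` below (not part of the original notebook) maps free-form output to one label by picking whichever label appears first in the text. Falling back to `Neutral` for unrecognized output is an assumption made for this example.

```python
LABELS = ("Positive", "Negative", "Neutral")


def normalize_sentiment(raw: str) -> str:
    """Map free-form model output to Positive, Negative, or Neutral."""
    lowered = raw.lower()
    # Record where each label first appears (-1 means it does not appear).
    positions = {label: lowered.find(label.lower()) for label in LABELS}
    found = {label: pos for label, pos in positions.items() if pos >= 0}
    if not found:
        # Assumption: treat output without a recognizable label as Neutral.
        return "Neutral"
    # Simple heuristic: the label mentioned earliest wins.
    return min(found, key=found.get)


print(normalize_sentiment("Sentiment: Positive. The tone is upbeat."))
```

This heuristic does not handle negation (for example, "not positive"), so it is best treated as a light cleanup step rather than a classifier.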


## Define Competitive Intelligence Tool

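This tool prompts the LLM to extract key insights, trends, and competitor strategies from the scraped text. Note that the agents defined later in this notebook do not include this tool in their `tools` lists.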


In [7]:
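def extract_competitive_insights(text: str) -> str:
    """Extract key insights, market trends, and competitor strategies from the given text."""
    prompt = f"Extract key insights, market trends, and competitor strategies from the following text: \n\n{text}"
    # llm.complete returns a CompletionResponse; return its text to match the `-> str` annotation.
    return llm.complete(prompt).text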

competitive_tool = FunctionTool.from_defaults(fn=extract_competitive_insights, name="competitive_intelligence", description="Extracts market trends and competitor insights.")


## Initialize the LLMs

We create two LLM clients, a Llama model served by Groq and OpenAI’s `gpt-4o-mini`, and select the OpenAI model to process and analyze scraped web data.


In [18]:
from google.colab import userdata
GROQ_API_KEY = userdata.get('GROQ_API_KEY')
OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")


llama_llm = Groq(model="llama-3.3-70b-specdec", api_key=GROQ_API_KEY)
openai_llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)

llm = openai_llm
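
The cell above reads the API keys from Colab's `userdata`, which is only available inside Google Colab. As a sketch for running the notebook elsewhere, the hypothetical helper `get_secret` below (not part of the original notebook) tries Colab first and falls back to environment variables. `DEMO_KEY` is a placeholder name used only for the demonstration.

```python
import os
from typing import Optional


def get_secret(name: str) -> Optional[str]:
    """Return a secret from Colab userdata if available, else from the environment."""
    try:
        # This import only succeeds inside Google Colab.
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is not defined there.
        pass
    return os.environ.get(name)


# Demonstration with a placeholder variable.
os.environ["DEMO_KEY"] = "abc"
print(get_secret("DEMO_KEY"))
```

With this helper, the two key lookups could be written as `get_secret("GROQ_API_KEY")` and `get_secret("OPENAI_API_KEY")`.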

## Define Multi-Agent System

We create three AI agents:
1. **ResearchAgent** – Researches market data from webpages and records notes.
2. **WriteAgent** – Writes a report from the research notes and assesses sentiment.
3. **ReviewAgent** – Reviews the report and provides feedback.

Sentiment analysis is a tool that the WriteAgent uses, not a separate agent.

These agents collaborate to perform autonomous market intelligence tasks.


In [19]:

from llama_index.core.agent.workflow import FunctionAgent, ReActAgent

research_agent = ReActAgent(
    name="ResearchAgent",
    description="Useful for searching the web for information on a given topic and recording notes on the topic.",
    system_prompt=(
        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
        "Once notes are recorded and you are satisfied, you should hand off control to the WriteAgent to write a report on the topic. "
        "You should have at least some notes on a topic before handing off control to the WriteAgent. "
        "Extract key insights, market trends, and competitor strategies. "
        "Restrict your token usage to fewer than 50000 tokens."
    ),
    llm=llm,
    tools=[scraper_tool, notes_tool],
    can_handoff_to=["WriteAgent"],
)

write_agent = ReActAgent(
    name="WriteAgent",
    description="Useful for writing a report on a given topic.",
    system_prompt=(
        "You are the WriteAgent that can write a report on a given topic. "
        "Assess the sentiment on a given topic. "
        "Your report should be in a markdown format. The content should be grounded in the research notes. "
        "Once the report is written, you should get feedback at least once from the ReviewAgent."
    ),
    llm=llm,
    tools=[writer_tool, sentiment_tool],
    can_handoff_to=["ReviewAgent", "ResearchAgent"],
)

review_agent = ReActAgent(
    name="ReviewAgent",
    description="Useful for reviewing a report and providing feedback.",
    system_prompt=(
        "You are the ReviewAgent that can review the written report and provide feedback. "
        "Your review should either approve the current report or request changes for the WriteAgent to implement. "
        "If you have feedback that requires changes, you should hand off control to the WriteAgent to implement the changes after submitting the review."
    ),
    llm=llm,
    tools=[reviewer_tool],
    can_handoff_to=["WriteAgent"],
)
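
The `can_handoff_to` lists above form a small directed graph between agents. The sketch below copies those lists into a plain dictionary and checks two things: that every handoff target is a defined agent, and that every agent is reachable from the root. `find_unknown_targets` and `reachable_from` are hypothetical helpers written for this example, not LlamaIndex functions.

```python
from collections import deque
from typing import Dict, List, Set

# Handoff rules copied from the can_handoff_to arguments above.
HANDOFFS: Dict[str, List[str]] = {
    "ResearchAgent": ["WriteAgent"],
    "WriteAgent": ["ReviewAgent", "ResearchAgent"],
    "ReviewAgent": ["WriteAgent"],
}


def find_unknown_targets(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per agent, the handoff targets that are not defined as agents."""
    unknown: Dict[str, List[str]] = {}
    for agent, targets in graph.items():
        missing = [target for target in targets if target not in graph]
        if missing:
            unknown[agent] = missing
    return unknown


def reachable_from(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Breadth-first search over handoffs, starting at the root agent."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


print(find_unknown_targets(HANDOFFS))
print(sorted(reachable_from(HANDOFFS, "ResearchAgent")))
```

A misspelled agent name in a `can_handoff_to` list would show up in the first check, and an agent that no other agent can hand off to would be missing from the second.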


In summary, **ResearchAgent**, **WriteAgent**, and **ReviewAgent** are agents, while sentiment analysis is a tool that the WriteAgent calls, not an agent.

Note: an agent acts like a subject-matter expert (SME), and tools are the skills or abilities we provide to it.

## Set Up the Agent Workflow

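The `AgentWorkflow` lets our agents work together: execution starts at the root agent (`ResearchAgent`), and each agent hands off control according to its `can_handoff_to` list, covering web scraping, sentiment analysis, and report review.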


In [20]:
agent_workflow = AgentWorkflow(
    agents=[research_agent, write_agent, review_agent],
    root_agent=research_agent.name
)


## Running the Multi-Agent Workflow

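We test our system by extracting market intelligence from a sample webpage. In the recorded run below, the workflow stopped with an OpenAI rate-limit error (HTTP 429) because the request exceeded the tokens-per-minute limit.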


In [21]:
# url = "https://economictimes.indiatimes.com/"  # Replace with a real URL
# result = await agent_workflow.run(user_msg=f"Analyze industry trends for healthcare from {url}")

url = "https://economictimes.indiatimes.com/budget"
result = await agent_workflow.run(user_msg=f"Analyze the budget for me {url}")
result


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 20.000000 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 19.940000 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completi

WorkflowRuntimeError: Error in step 'run_agent_step': Error code: 429 - {'error': {'message': 'Rate limit reached for gpt-4o-mini in organization org-XqxsrErw5IF0yK8HMOvuryK2 on tokens per min (TPM): Limit 60000, Used 58279, Requested 50030. Please try again in 48.309s. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}
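
The error above reports a single request of 50030 tokens against a limit of 60000 tokens per minute. The likely cause is that `scrape_webpage` returns the full page text, which an instruction in the system prompt cannot shrink. One possible mitigation, sketched below, is to truncate the scraped text before it reaches the LLM. `estimate_tokens` and `truncate_to_token_budget` are hypothetical helpers, and the ratio of 4 characters per token is a rough assumption rather than a property of any particular tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; assumes about 4 characters per token."""
    return int(len(text) / chars_per_token) + 1


def truncate_to_token_budget(
    text: str, max_tokens: int, chars_per_token: float = 4.0
) -> str:
    """Cut text so its estimated token count stays within max_tokens."""
    max_chars = int(max_tokens * chars_per_token)
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Back up to the last space so a word is not split in half.
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]
    return cut


# Placeholder text standing in for a large scraped page.
page_text = "word " * 10000
short_text = truncate_to_token_budget(page_text, max_tokens=1000)
print(len(page_text), len(short_text), estimate_tokens(short_text))
```

Inside `scrape_webpage`, the return value could be passed through `truncate_to_token_budget` with a budget well below the per-minute limit. For an exact count, a tokenizer matching the model would be needed.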

## Streaming Agent Events

We monitor agent actions by streaming events, displaying web scraping, sentiment analysis, and intelligence gathering steps.


In [22]:
from llama_index.core.agent.workflow import AgentOutput, ToolCall, ToolCallResult

url = "https://economictimes.indiatimes.com/budget"
handler = agent_workflow.run(user_msg=f"Analyze the budget for me {url}")

current_agent = None
async for event in handler.stream_events():
    if hasattr(event, "current_agent_name") and event.current_agent_name != current_agent:
        current_agent = event.current_agent_name
        print(f"\n{'='*50}")
        print(f"🤖 Agent: {current_agent}")
        print(f"{'='*50}\n")
    elif isinstance(event, AgentOutput):
        if event.response.content:
            print("📤 Output:", event.response.content)
        if event.tool_calls:
            print("🛠️ Tools Used:", [call.tool_name for call in event.tool_calls])
    elif isinstance(event, ToolCallResult):
        print(f"🔧 Tool Result ({event.tool_name}): {event.tool_output}")



🤖 Agent: ResearchAgent



INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


📤 Output: Thought: The current language of the user is: English. I need to use a tool to help me gather more information about the budget from the provided link.
Action: web_scraper
Action Input: {"url":"https://economictimes.indiatimes.com/budget"}
🛠️ Tools Used: ['web_scraper']


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 0.463645 seconds


🔧 Tool Result (web_scraper): [![The Economic Times](https://img.etimg.com/photo/msid-76931895/et-
logo.jpg)](/ "The Economic Times")[Budget
2025](https://economictimes.indiatimes.com/budget "Budget 2025")

English EditionEnglish
Edition[हिन्दी](https://hindi.economictimes.com/)[ગુજરાતી](https://gujarati.economictimes.com/)[मराठी](https://marathi.economictimes.com/)[বাংলা](https://bengali.economictimes.com/)[ಕನ್ನಡ](https://kannada.economictimes.com/)[മലയാളം](https://malayalam.economictimes.com/)[தமிழ்](https://tamil.economictimes.com/)[తెలుగు](https://telugu.economictimes.com/)

| 01 February, 2025, 12:54 PM IST | [Today's ePaper](https://epaper.indiatimes.com/timesepaper/publication-the-economic-times,city-delhi.cms)

[ My Watchlist
](https://economictimes.indiatimes.com/watchlist?source=homepage&medium=header&campaign=watchlist)[Subscribe](javascript:cdpGoToPlan\(\))

[Sign In](javascript:objUser.login\(\))

Search

+

[Home](/)

[BUDGET'25](https://economictimes.indiatimes.com/budget

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 39.440000 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 1.877860 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
ERROR:asyncio:Exception in callback Dispatcher.span.<locals>.wrapper.<locals>.handle_future_result(span_id='Workflow.run...-0eebb1546158', bound_args=<BoundArgumen...mory': None})>, instance=<llama_index....x7dc1e8717710>, context=<_contextvars...x7dc1e8925e40>)(<WorkflowHand...exceeded'}}")>) at /usr/local/lib/python3.11/dist-packages/llama_index/core/instrumentation/dispatcher.py:273
handle: <Handle Dispatcher.span.<locals>.wrapper.<locals>.handle_future_result(span_id='Workflow.run...-0eebb1546158', bound_a