# Auto-GPT Style Agent with Explicit Chain-of-Thought (CoT)

This notebook demonstrates a simplified **Auto-GPT** flow in which the agent **plans** its steps via **Chain-of-Thought (CoT)**. The example user query is:
> **“Find the price of the best EV in the market based on buyer sentiment.”**

The agent’s process:
1. **Identify** the need to determine buyer sentiment from a review site.
2. **Scrape** the review site to figure out which EV is top-rated.
3. **Scrape** an auto marketplace site for pricing details.
4. **Provide** a final answer combining both details.

We’ll break the code into **four sections**:
1. **Web Tools** (Mocks site data and provides search/scrape functions)
2. **LLM Adapter** (Mock LLM that returns explicit chain-of-thought steps)
3. **Auto-GPT Style Agent** (Main loop that parses chain-of-thought, visits sites, and decides when to finalize)
4. **Main Orchestration** (Runs the scenario with a user query)

The LLM's output uses three markers:
- **`CHAIN_OF_THOUGHT:`** for step-by-step reasoning.
- **`PLAN_TO_VISIT:`** to decide which sites/tools to call.
- **`FINAL_ANSWER:`** once it has enough data to respond.
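
The three markers can be pulled apart with a small parser. The sketch below is one possible way to do it, not the notebook's own code: `ParsedReply` and `parse_reply` are hypothetical names, and it assumes the plan is written as a bracketed, comma-separated list.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedReply:
    """Hypothetical container for the three sections of an LLM reply."""
    thoughts: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)
    final_answer: Optional[str] = None


def parse_reply(text: str) -> ParsedReply:
    parsed = ParsedReply()

    # Reasoning runs from CHAIN_OF_THOUGHT: up to the next marker (or the end).
    cot = re.search(
        r"CHAIN_OF_THOUGHT:\s*(.*?)(?=PLAN_TO_VISIT:|FINAL_ANSWER:|\Z)",
        text,
        re.DOTALL,
    )
    if cot:
        parsed.thoughts = [ln.strip() for ln in cot.group(1).splitlines() if ln.strip()]

    # The plan is assumed to be a bracketed, comma-separated keyword list.
    plan = re.search(r"PLAN_TO_VISIT:\s*\[(.*?)\]", text)
    if plan:
        parsed.plan = [kw.strip() for kw in plan.group(1).split(",") if kw.strip()]

    # Everything after FINAL_ANSWER: is the answer.
    final = re.search(r"FINAL_ANSWER:\s*(.*)", text, re.DOTALL)
    if final:
        parsed.final_answer = final.group(1).strip()

    return parsed


reply = (
    "CHAIN_OF_THOUGHT:\n"
    "1. Need sentiment data first.\n"
    "PLAN_TO_VISIT: [review sentiment]"
)
parsed = parse_reply(reply)
print(parsed.thoughts)      # ['1. Need sentiment data first.']
print(parsed.plan)          # ['review sentiment']
print(parsed.final_answer)  # None
```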

Let's begin!

## Section 1: Web Tools
We simulate two websites:
- **review_evs.com**: buyer sentiment (which EV is top-rated).
- **auto_market.com**: pricing data for certain EVs.

We define:
- `web_search_tool(query)`: a mock function that returns relevant sites based on keywords.
- `web_scrape_tool(url)`: returns the site content.


In [None]:
# web_tools.py

# Mock data: websites and their content
MOCK_WEBSITES = {
    "review_evs.com": (
        "Buyer Sentiment Summary:\n"
        " - Tesla Model Y: Highly rated by most users, praising range and performance.\n"
        " - Nissan Leaf: Moderate reviews, good for city driving but limited range.\n"
        " - E-Car Pro: Newer entrant, positive early feedback on interior.\n"
        "Overall, the top-rated EV by user sentiment is the Tesla Model Y."
    ),
    "auto_market.com": (
        "Pricing Data:\n"
        " - Tesla Model Y: Starting at $54,000.\n"
        " - Nissan Leaf: Starting at $28,000.\n"
        " - E-Car Pro: Starting at $39,000.\n"
    )
}

def web_search_tool(query):
    """
    Mock function to 'search' the web for relevant sites based on the query.
    We'll just match query keywords to decide which sites to visit.
    """
    results = []
    query_lower = query.lower()

    if "review" in query_lower or "sentiment" in query_lower:
        results.append("review_evs.com")
    if "price" in query_lower or "market" in query_lower:
        results.append("auto_market.com")

    # If no direct match, return a default site
    if not results:
        results.append("review_evs.com")

    return results

def web_scrape_tool(url):
    """
    Mock function to fetch text content from a site.
    """
    return MOCK_WEBSITES.get(url, "No content found for this URL.")
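
As the number of mock sites grows, the chain of `if` statements in `web_search_tool` could be replaced by a lookup table. The following is a minimal sketch of that alternative; `KEYWORD_TO_SITE`, `DEFAULT_SITE`, and `search_by_table` are hypothetical names, and the keyword-to-site pairs simply mirror the conditions used above.

```python
# Hypothetical table: each keyword maps to the mock site it should surface.
KEYWORD_TO_SITE = {
    "review": "review_evs.com",
    "sentiment": "review_evs.com",
    "price": "auto_market.com",
    "market": "auto_market.com",
}
DEFAULT_SITE = "review_evs.com"


def search_by_table(query: str) -> list:
    """Return the sites whose keywords occur in the query, without duplicates."""
    query_lower = query.lower()
    results = []
    for keyword, site in KEYWORD_TO_SITE.items():
        if keyword in query_lower and site not in results:
            results.append(site)
    # Same fallback as the mock: default to the review site.
    return results or [DEFAULT_SITE]


print(search_by_table("review sentiment"))            # ['review_evs.com']
print(search_by_table("price market"))                # ['auto_market.com']
print(search_by_table("best EV price by sentiment"))  # ['review_evs.com', 'auto_market.com']
print(search_by_table("charging stations"))           # ['review_evs.com']
```

Adding a site then means adding table rows rather than another branch.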

## Section 2: LLM Adapter
Here, we implement a mock `call_llm` function that:
- Joins the conversation log into a single lowercased string.
- Checks whether that string already contains the **review data** and the **pricing data**.
- If both are present, it produces a **final answer**.
- Otherwise, it returns a **chain-of-thought** and a **plan** naming what to look up next.

This simulates a **Chain-of-Thought** approach, where the LLM enumerates reasoning steps explicitly.

In [None]:
# llm_adapter.py

def call_llm(conversation, temperature=0.2):
    """
    Simulated LLM call that explicitly includes Chain-of-Thought for demonstration.

    `conversation` is a list of log strings. `temperature` is accepted only to
    mirror a real LLM API; the mock ignores it.

    We inspect the conversation to see what's been scraped.
    If we have both sentiment data and price data, we produce a final answer.
    Otherwise, we produce a plan to visit more sites.
    """
    full_text = "\n".join(conversation).lower()

    # The agent logs each scrape as "System: SCRAPED <site> -> ...", so look for
    # that marker (lowercased) rather than any mention of the site name.
    has_review_data = "scraped review_evs.com" in full_text
    has_price_data = "scraped auto_market.com" in full_text

    # If we have both sentiment data and pricing, finalize
    if has_review_data and has_price_data:
        return (
            "CHAIN_OF_THOUGHT:\n"
            "1. I've identified the top-rated EV from buyer sentiment (Tesla Model Y).\n"
            "2. I've found the price on the auto_market.com site ($54,000).\n"
            "Therefore, I can form the final answer.\n"
            "FINAL_ANSWER: The best EV based on buyer sentiment is the Tesla Model Y, "
            "which is priced around $54,000."
        )

    # Otherwise, generate chain-of-thought and a plan
    chain_of_thought = [
        "1. The user wants the best EV by buyer sentiment AND its price.",
        "2. First, gather sentiment data from a review site.",
        "3. Then, gather pricing data from an auto marketplace site."
    ]

    # Pick the next lookup: reviews first, then pricing.
    # (The case where both are present is handled by the early return above.)
    if not has_review_data:
        chain_of_thought.append("4. We still need to visit a review site to see buyer sentiment.")
        plan_str = "[review sentiment]"
    else:
        chain_of_thought.append("4. We have buyer sentiment but still need pricing info.")
        plan_str = "[price market]"

    return (
        "CHAIN_OF_THOUGHT:\n" +
        "\n".join(chain_of_thought) + "\n" +
        "PLAN_TO_VISIT: " + plan_str
    )
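
The branching inside the mock can also be read as a three-state decision. The sketch below isolates it as a pure function; `NextStep` and `decide_next_step` are hypothetical names, and it assumes scraped pages are logged as `System: SCRAPED <site> -> ...`, which is how the agent in Section 3 records them.

```python
from enum import Enum
from typing import List


class NextStep(Enum):
    VISIT_REVIEWS = "review sentiment"
    VISIT_PRICES = "price market"
    FINALIZE = "final answer"


def decide_next_step(conversation: List[str]) -> NextStep:
    """Decide what the mock LLM should do next, based only on the log."""
    text = "\n".join(conversation).lower()
    if "scraped review_evs.com" not in text:
        return NextStep.VISIT_REVIEWS
    if "scraped auto_market.com" not in text:
        return NextStep.VISIT_PRICES
    return NextStep.FINALIZE


log = ["User: Find the price of the best EV in the market based on buyer sentiment."]
print(decide_next_step(log))  # NextStep.VISIT_REVIEWS

log.append("System: SCRAPED review_evs.com -> Buyer Sentiment Summary: (content)")
print(decide_next_step(log))  # NextStep.VISIT_PRICES

log.append("System: SCRAPED auto_market.com -> Pricing Data: (content)")
print(decide_next_step(log))  # NextStep.FINALIZE
```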

## Section 3: Auto-GPT Style Agent
An **Auto-GPT** style loop:
1. We have a conversation log where we store user queries, LLM responses, and any system messages.
2. Each iteration, we **call** the LLM.
3. If we see a `FINAL_ANSWER`, we **stop**.
4. If we see `PLAN_TO_VISIT: [...]`, we parse the keywords, use a **web search** to get sites, then **scrape** them.
5. We add the scraped content to the conversation and loop again until we have enough data.
6. We limit the loop via `max_iterations` to avoid infinite cycles.
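
The steps above can be sketched as a single function that receives the LLM and the two tools as arguments, which makes the loop easy to test in isolation. This is an illustrative outline rather than the class defined below; `run_loop` and `fake_llm` are hypothetical stand-ins.

```python
import re
from typing import Callable, List

PLAN_RE = re.compile(r"PLAN_TO_VISIT:\s*\[(.*?)\]")


def run_loop(
    query: str,
    llm: Callable[[List[str]], str],
    search: Callable[[str], List[str]],
    scrape: Callable[[str], str],
    max_iterations: int = 5,
) -> str:
    log = [f"User: {query}"]
    visited = set()

    for _ in range(max_iterations):
        reply = llm(log)
        log.append(f"Agent: {reply}")

        # Stop as soon as the LLM commits to an answer.
        if "FINAL_ANSWER:" in reply:
            return reply.split("FINAL_ANSWER:")[-1].strip()

        match = PLAN_RE.search(reply)
        if not match:
            continue  # no plan this round; ask the LLM again

        for keyword in (kw.strip() for kw in match.group(1).split(",")):
            if not keyword:
                continue
            for site in search(keyword):
                if site in visited:
                    continue  # scrape each site at most once
                visited.add(site)
                log.append(f"System: SCRAPED {site} -> {scrape(site)}")

    return "No final answer within the iteration limit."


# Tiny stand-in for the LLM (hypothetical, not the notebook's mock).
def fake_llm(log: List[str]) -> str:
    if any(line.startswith("System: SCRAPED") for line in log):
        return "CHAIN_OF_THOUGHT:\n1. Data collected.\nFINAL_ANSWER: done"
    return "CHAIN_OF_THOUGHT:\n1. Need data.\nPLAN_TO_VISIT: [anything]"


result = run_loop("q", fake_llm, lambda kw: ["example.test"], lambda url: "page text")
print(result)  # done
```

Passing the tools in as callables also means the mocks can later be swapped for real implementations without touching the loop.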

In [None]:
# autogpt_agent.py

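import re

# When each section is saved as its own file, import the helpers from them.
# In a single notebook the earlier cells have already defined these names,
# so a failed import is safe to ignore.
try:
    from llm_adapter import call_llm
    from web_tools import web_search_tool, web_scrape_tool
except ImportError:
    pass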

class AutoGPTCoTAgent:
    def __init__(self, max_iterations=5):
        self.conversation_log = []
        self.visited_sites = set()
        self.max_iterations = max_iterations

    def run(self, user_query):
        # Add user query to log
        self.conversation_log.append(f"User: {user_query}")

        for _ in range(self.max_iterations):
            # 1. Call the LLM with our conversation log
            llm_response = call_llm(self.conversation_log)
            self.conversation_log.append(f"Agent: {llm_response}")

            # 2. Check for final answer
            if "FINAL_ANSWER:" in llm_response:
                final_answer = llm_response.split("FINAL_ANSWER:")[-1].strip()
                return final_answer

            # 3. Otherwise parse "PLAN_TO_VISIT:" with a regex
            match = re.search(r"PLAN_TO_VISIT:\s*\[(.*?)\]", llm_response)
            if match:
                plan_str = match.group(1)
                # e.g., "review sentiment" or "price market"; skip empty entries
                keywords = [kw.strip() for kw in plan_str.split(",") if kw.strip()]

                for kw in keywords:
                    # 4. Use web_search_tool to find relevant sites
                    sites = web_search_tool(kw)
                    self.conversation_log.append(f"System: Searching for '{kw}' -> {sites}")

                    # For each site, if not visited, scrape
                    for site in sites:
                        if site not in self.visited_sites:
                            content = web_scrape_tool(site)
                            self.visited_sites.add(site)
                            # Add the scraped content to the conversation
                            self.conversation_log.append(f"System: SCRAPED {site} -> {content}")
            else:
                # No plan found; might need more context or next iteration
                self.conversation_log.append("System: No plan found, waiting for next step.")

        # If we exit without final answer
        return "Sorry, I couldn't complete this research in time."

## Section 4: Main Orchestration
We now instantiate **AutoGPTCoTAgent** and pass it the user query. The agent visits the review site, then the marketplace site, and returns the requested price. The script prints only the final answer; the step-by-step chain-of-thought is stored in `agent.conversation_log`.

In [None]:
# main.py

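# Needed only when Section 3 is saved as autogpt_agent.py; in a single
# notebook the class is already defined by the previous cell.
try:
    from autogpt_agent import AutoGPTCoTAgent
except ImportError:
    pass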

if __name__ == "__main__":
    user_query = "Find the price of the best EV in the market based on buyer sentiment."

    agent = AutoGPTCoTAgent(max_iterations=5)
    final_answer = agent.run(user_query)

    print("=== FINAL ANSWER ===")
    print(final_answer)

## How to Run
1. **Run all cells** in the notebook.
2. Check the **FINAL ANSWER** printed by the last cell. It should say something like:
   > "The best EV based on buyer sentiment is the Tesla Model Y, which is priced around \$54,000."
3. If you want to **debug** or inspect the conversation, print `agent.conversation_log` after `agent.run(...)` returns.
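
Raw log entries can span many lines, so a compact view is often easier to scan. The helper below is a hypothetical convenience (`format_log` is not part of the notebook); it assumes every entry starts with a `Role: ` prefix, as the agent's entries do, and shows only the first line of each entry.

```python
def format_log(log, width=80):
    """Render one numbered line per log entry: index, role, first line of body."""
    lines = []
    for i, entry in enumerate(log, start=1):
        role, _, body = entry.partition(": ")
        first_line = body.splitlines()[0] if body else ""
        if len(first_line) > width:
            first_line = first_line[: width - 3] + "..."
        lines.append(f"{i:>2} [{role}] {first_line}")
    return "\n".join(lines)


# Sample entries in the same shape the agent writes them.
sample_log = [
    "User: Find the price of the best EV in the market based on buyer sentiment.",
    "Agent: CHAIN_OF_THOUGHT:\n1. The user wants the best EV by buyer sentiment AND its price.",
    "System: SCRAPED review_evs.com -> Buyer Sentiment Summary:\n - Tesla Model Y: Highly rated.",
]
print(format_log(sample_log))
```

In the notebook, the same call would be `print(format_log(agent.conversation_log))`.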

## Example Conversation Log
The log looks roughly like this (long entries are abbreviated):
```
User: Find the price of the best EV in the market based on buyer sentiment.
Agent: CHAIN_OF_THOUGHT:
1. The user wants the best EV by buyer sentiment AND its price.
2. First, gather sentiment...
PLAN_TO_VISIT: [review sentiment]
System: Searching for 'review sentiment' -> ['review_evs.com']
System: SCRAPED review_evs.com -> Buyer Sentiment Summary...
Agent: CHAIN_OF_THOUGHT:
1. The user wants the best EV by buyer sentiment AND its price.
...
4. We have buyer sentiment but still need pricing info.
PLAN_TO_VISIT: [price market]
System: Searching for 'price market' -> ['auto_market.com']
System: SCRAPED auto_market.com -> Pricing Data: Tesla Model Y...
Agent: CHAIN_OF_THOUGHT:
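1. I've identified the top-rated EV from buyer sentiment (Tesla Model Y).
2. I've found the price on the auto_market.com site ($54,000).
Therefore, I can form the final answer.
FINAL_ANSWER: The best EV based on buyer sentiment is the Tesla Model Y, which is priced around $54,000.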
```

## Key Points
1. **Explicit Chain-of-Thought**: The `CHAIN_OF_THOUGHT:` section spells out the reasoning steps.
2. **Auto-GPT Loop**: The agent calls the LLM repeatedly, checking each time if we should **visit more sites** or **provide a final answer**.
3. **Tool Usage**: The agent uses `web_search_tool` to find sites, then `web_scrape_tool` to get content. This content is appended to the conversation.
4. **Iteration Limit**: We use `max_iterations=5` to avoid infinite loops.
5. **Mocked Data**: In production, you'd replace the mock tools and the mock LLM with **real** web searches, retrieval APIs, an actual model call, or other data sources.
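
As one illustration of the last point, the mock scraper could be replaced by a thin wrapper around a real fetcher. This is a rough sketch under stated assumptions, not a production scraper: `fetch_with_urllib` and `make_scrape_tool` are hypothetical names, the fetcher returns raw page text without HTML parsing, and `urlopen` needs a full URL with a scheme (such as `https://...`), unlike the bare site names used in the mock.

```python
import urllib.request
from typing import Callable


def fetch_with_urllib(url: str, timeout: float = 10.0) -> str:
    """Fetch a page body as text using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def make_scrape_tool(
    fetch: Callable[[str], str] = fetch_with_urllib,
    max_chars: int = 2000,
) -> Callable[[str], str]:
    """Build a scrape function with the same shape as web_scrape_tool(url)."""

    def scrape(url: str) -> str:
        try:
            text = fetch(url)
        except Exception as exc:  # network errors, bad URLs, timeouts
            return f"No content found for this URL. ({type(exc).__name__})"
        # Truncate so one large page cannot flood the conversation log.
        return text[:max_chars]

    return scrape


# Exercise the wrapper with a fake fetcher so the example needs no network.
offline_scrape = make_scrape_tool(
    fetch=lambda url: "Pricing Data: " + "x" * 5000,
    max_chars=20,
)
print(offline_scrape("https://example.com"))  # Pricing Data: xxxxxx
```

Keeping the fetcher injectable lets the agent loop stay unchanged while tests continue to run offline.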

This approach highlights how **Chain-of-Thought** can be made **explicit** in an **Auto-GPT style** agent, facilitating transparent planning and multi-step problem-solving.