# 🤖✨ AI-Powered Small Business Reflection Notebook 🚀📈

---

Welcome to the **AI Reflexion Workflow Notebook**!  
This notebook demonstrates a powerful, iterative, and modular approach to answering complex questions about **how Artificial Intelligence (AI) is reshaping small businesses and entrepreneurship**.  
It uses open LLMs, web search, and LangGraph's graph-based orchestration to produce referenced answers.  
Let's dive in! 🏊‍♂️🐬

---

## 🗂️ Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [How It Works](#how-it-works)
- [Workflow Diagram](#workflow-diagram)
- [Setup & Requirements](#setup--requirements)
- [Code Walkthrough](#code-walkthrough)
- [Reflexion Mechanism Explained](#reflexion-mechanism-explained)
- [Sample Output](#sample-output)
- [Customization Tips](#customization-tips)
- [Credits & References](#credits--references)

---

## 📝 Overview

This notebook is a **hands-on demonstration** of an AI-powered reflexion workflow for answering research questions with citations.  
It combines:

- **Large Language Models (LLMs)** via HuggingFace 🤗
- **LangChain** for prompt engineering and workflow orchestration 🔗
- **Tavily Search** for real-time web search 🔍
- **Graph-based iterative refinement** for answer improvement 🔄

The goal:  
**Produce concise, accurate, and referenced answers to complex business questions using a multi-step, self-improving process.**

---

## 🌟 Key Features

- **Iterative Reflexion**: The AI answers, searches, revises, and repeats for better quality.
- **Citations & References**: Every answer is backed by real web sources.
- **Customizable Prompts**: Easily adapt word limits, formats, and search strategies.
- **Modular Graph Workflow**: Each step is a node in a LangGraph `MessageGraph`.
- **GPU/CPU Support**: Runs on your available hardware.
- **User-Friendly Output**: Clear, readable, and well-cited answers.

---

## 🛠️ How It Works

1. **User asks a question** (e.g., "How is AI reshaping small businesses?")
2. **First Responder Node**:  
    - The LLM generates an initial answer and suggests search queries.
3. **Search Node**:  
    - Tavily Search fetches relevant web results for those queries.
4. **Revisor Node**:  
    - The LLM revises its answer using the search results, adding citations.
5. **Event Loop**:  
    - The search and revise steps repeat until `MAX_ITERATIONS` search rounds have run.
6. **Final Output**:  
    - A polished, referenced answer is presented to the user.
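
The steps above can be sketched in plain Python, with the LLM and the search tool replaced by stand-in functions. The names `run_reflexion`, `fake_draft`, `fake_search`, and `fake_revise` are hypothetical and are not part of the notebook; the sketch only illustrates the control flow, assuming one draft followed by a fixed number of search and revise rounds.

```python
from typing import Callable, List, Tuple


def run_reflexion(
    question: str,
    draft: Callable[[str], Tuple[str, List[str]]],
    search: Callable[[List[str]], List[str]],
    revise: Callable[[str, List[str]], str],
    max_iterations: int = 2,
) -> str:
    """Draft once, then alternate search and revise max_iterations times."""
    answer, queries = draft(question)
    for _ in range(max_iterations):
        results = search(queries)
        answer = revise(answer, results)
    return answer


# Stand-ins for the LLM and the search tool (illustrative only).
def fake_draft(question: str) -> Tuple[str, List[str]]:
    return f"Draft answer to: {question}", ["AI small business"]


def fake_search(queries: List[str]) -> List[str]:
    return [f"result for '{q}'" for q in queries]


def fake_revise(answer: str, results: List[str]) -> str:
    return f"{answer} [+{len(results)} source(s)]"


final = run_reflexion(
    "How is AI reshaping small businesses?", fake_draft, fake_search, fake_revise
)
print(final)
```

Passing the three steps in as callables keeps the loop independent of any particular model or search provider.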

---

## 🗺️ Workflow Diagram

```
👤 User Question
        |
        v
🤖 First Responder (LLM) ————→ 🔍 Search (Tavily) ————→ ✍️ Revisor (LLM)
                                ^                                |
                                |                                v
                                <———— (Reflexion Loop) ———— [Up to N times]
```

---

## ⚙️ Setup & Requirements

**Install dependencies:**
```python
!pip install python-dotenv langchain langgraph langchain_community langchain_huggingface langchain_google_vertexai transformers
```

**Authenticate with HuggingFace and Tavily:**
```python
from huggingface_hub import notebook_login
notebook_login()
!huggingface-cli login

import getpass, os
os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API key:\n")
```

---

## 🧩 Code Walkthrough

### 1. **Imports & Initialization**
- Loads all necessary modules: `transformers`, `langchain`, `langgraph`, etc.
- Sets up device (GPU/CPU) and initializes the LLM pipeline.

### 2. **Prompt Engineering**
- Defines dynamic and simplified prompts for both the initial answer and revision steps.
- Prompts request a fixed-length answer (150 words) with explicit `ANSWER:` and `SEARCH:` markers, and inputs are truncated to keep the context small.
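
As a rough illustration of a configurable prompt, the sketch below fills a plain string template with a word limit. `FIRST_RESPONDER_TEMPLATE` and `build_first_prompt` are hypothetical names, and the sketch uses `str.format` rather than the `ChatPromptTemplate` the notebook relies on.

```python
FIRST_RESPONDER_TEMPLATE = (
    "Answer this question in exactly {word_limit} words: {question}\n\n"
    "Provide your answer followed by 2 search terms:\n"
    "ANSWER: [your {word_limit}-word answer]\n"
    "SEARCH: term1, term2"
)


def build_first_prompt(
    question: str, word_limit: int = 150, max_question_chars: int = 100
) -> str:
    """Fill the template, truncating long questions to keep the prompt small."""
    return FIRST_RESPONDER_TEMPLATE.format(
        word_limit=word_limit,
        question=question[:max_question_chars],
    )


prompt = build_first_prompt("How is AI reshaping small businesses?", word_limit=120)
print(prompt)
```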

### 3. **Parsing Functions**
- Robust functions extract answers and search queries from LLM outputs.
- Ensures clean, reliable parsing even if the LLM output varies.
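
A minimal sketch of this kind of parsing, assuming the LLM was asked to reply in an `ANSWER: ... SEARCH: term1, term2` layout. The function name `parse_answer_and_queries` and the default queries are illustrative choices, not the notebook's exact implementation.

```python
from typing import Any, Dict, List

DEFAULT_QUERIES: List[str] = ["AI small business", "business automation"]


def parse_answer_and_queries(text: str, max_queries: int = 2) -> Dict[str, Any]:
    """Split 'ANSWER: ... SEARCH: a, b' output; fall back to defaults if markers are missing."""
    _, answer_marker, after_answer = text.partition("ANSWER:")
    if not answer_marker:
        # No marker at all: treat the whole text as the answer.
        return {"answer": text.strip(), "search_queries": list(DEFAULT_QUERIES)}

    answer, search_marker, after_search = after_answer.partition("SEARCH:")
    queries = [q.strip() for q in after_search.split(",") if q.strip()] if search_marker else []
    return {
        "answer": answer.strip(),
        "search_queries": queries[:max_queries] or list(DEFAULT_QUERIES),
    }


raw = "ANSWER: AI automates routine work.\nSEARCH: AI bookkeeping, AI chatbots, extra"
parsed = parse_answer_and_queries(raw)
print(parsed)
```

Using `str.partition` avoids index errors when a marker is missing, so the fallback path needs no exception handling.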

### 4. **Workflow Nodes**
- **First Responder**: Generates the first answer and search terms.
- **Search Executor**: Runs Tavily search and formats results.
- **Revisor**: Improves the answer using search data and adds citations.
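
To show how search results can become numbered citations, here is a small sketch that formats result dictionaries as `[n] title: url` lines and pulls URLs back out of text. The helper names and the sample data are hypothetical; the `title` and `url` keys are assumed to match what the search tool returns.

```python
import re
from typing import Dict, List


def format_results(results: List[Dict[str, str]], limit: int = 3) -> str:
    """Render up to `limit` results as numbered '[n] title: url' lines."""
    lines = []
    for i, result in enumerate(results[:limit], start=1):
        title = result.get("title", "Untitled")[:40]
        url = result.get("url", "")
        lines.append(f"[{i}] {title}: {url}")
    return "\n".join(lines)


def extract_urls(text: str) -> List[str]:
    """Return every http(s) URL found in the text, in order of appearance."""
    return re.findall(r"https?://[^\s\]]+", text)


results = [
    {"title": "Example source A", "url": "https://example.com/a"},
    {"title": "Example source B", "url": "https://example.org/b"},
]
block = format_results(results)
print(block)
print(extract_urls(block))
```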

### 5. **Event Loop & Graph**
- Uses `MessageGraph` to orchestrate the flow.
- The event loop controls the number of reflexion cycles.
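
The routing decision can be sketched without any framework: count how many search results are already in the message history and stop once the limit is reached. `Message`, `next_step`, and the `END` string below are stand-ins for the LangGraph and LangChain types, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

END = "__end__"  # stand-in for the graph's end-of-run sentinel


@dataclass
class Message:
    role: str  # "human", "ai", or "tool"
    content: str


def next_step(state: List[Message], max_iterations: int) -> str:
    """Return END once enough search rounds have run, otherwise loop back to search."""
    searches_done = sum(m.role == "tool" for m in state)
    return END if searches_done >= max_iterations else "search"


state = [
    Message("human", "question"),
    Message("ai", "draft"),
    Message("tool", "results"),
    Message("ai", "revision 1"),
]
first = next_step(state, max_iterations=2)

state += [Message("tool", "more results"), Message("ai", "revision 2")]
second = next_step(state, max_iterations=2)

print(first, second)
```

Because the count is derived from the history itself, no separate iteration counter has to be threaded through the graph.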

### 6. **Testing & Output**
- Runs the workflow on a sample question.
- Prints each step, including intermediate and final answers with references.

---

## 🔄 Reflexion Mechanism Explained

**Reflexion** is the process of **iterative self-improvement**:
- The AI doesn't just answer once: it **reflects** on its answer, checks new information, and **revises**.
- Each cycle uses **fresh search results** to ground the answer in real, up-to-date sources.
- The process stops after a set number of iterations (`MAX_ITERATIONS`).

**Why is this powerful?**  
It mimics how a human researcher would answer:  
Draft → Research → Revise → Repeat → Finalize! 🧑‍💻🔁

---

## 📝 Sample Output

> **Question:**  
> *How is AI reshaping small businesses and entrepreneurship?*

> **Final Answer:**  
> *Artificial Intelligence (AI) is revolutionizing the landscape of small businesses and entrepreneurship by automating tasks, analyzing data, and providing personalized recommendations. This empowers entrepreneurs to focus on strategic decision-making, leading to increased efficiency, productivity, and profitability. Moreover, AI-powered tools can provide valuable insights and predictions, enabling entrepreneurs to make informed decisions that optimize their operations and achieve long-term success [1].  Additionally, AI can automate repetitive tasks such as data entry, freeing up entrepreneurs to focus on more creative and strategic endeavors [2]. By leveraging AI, small businesses can gain access to a wide range of tools and resources that would be difficult or expensive for them to afford otherwise. This allows them to compete more effectively with larger enterprises [1].*

> **References:**  
> [1] https://www.salesforce.com/ai-small-business/  
> [2] https://www.forbes.com/ai-business-guide/

---

## 🛠️ Customization Tips

- **Change the LLM**: Swap out `"google/gemma-2b-it"` for another Hugging Face text-generation model.
- **Adjust Iterations**: Set `MAX_ITERATIONS` for more or fewer reflexion cycles.
- **Modify Prompts**: Tweak the prompt templates for different answer styles or lengths.
- **Plug in Other Search Tools**: Replace Tavily with your preferred search API.
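
One way to make the search backend swappable is to pass it in as a function. The sketch below assumes a backend is any callable that takes a query string and returns a list of dictionaries with `title` and `url` keys; `run_searches` and `offline_search` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[Dict[str, str]]]


def run_searches(
    queries: List[str], search_fn: SearchFn, per_query: int = 3
) -> List[Dict[str, str]]:
    """Run each query through search_fn, skipping queries whose search raises."""
    collected: List[Dict[str, str]] = []
    for query in queries:
        try:
            collected.extend(search_fn(query)[:per_query])
        except Exception as exc:
            print(f"Search failed for {query!r}: {exc}")
    return collected


def offline_search(query: str) -> List[Dict[str, str]]:
    """Hypothetical backend returning canned data, handy for testing offline."""
    if not query:
        raise ValueError("empty query")
    return [{"title": f"Note on {query}", "url": "https://example.com/" + query.replace(" ", "-")}]


hits = run_searches(["AI tools", ""], offline_search)
print(hits)
```

A real backend would wrap the provider's client in a function with the same shape, leaving the rest of the workflow unchanged.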

---

## 👩‍💻👨‍💻 Credits & References

- **LangChain**: Modular LLM orchestration framework ([GitHub](https://github.com/langchain-ai/langchain))
- **HuggingFace Transformers**: State-of-the-art LLMs ([Website](https://huggingface.co/))
- **Tavily Search**: Real-time web search for LLMs ([Website](https://www.tavily.com/))
- **Notebook Author**: abuzar01440 (abuzarbhutta@gmail.com, abuzarbhutta.0@outlook.com)

---

## 🎉 Happy Experimenting! 🚀

Feel free to fork, adapt, and extend this notebook for your own AI-powered research workflows.  
If you have questions or want to share your results, drop a comment or open an issue!  

---

<p>
  <b>Let AI do the heavy lifting for your business insights! 💡🤓</b>
</p>

In [None]:
!pip install python-dotenv langchain langgraph

In [None]:
import os
import langchain
import langgraph

In [None]:
from huggingface_hub import notebook_login
notebook_login()

In [None]:
!huggingface-cli login

In [None]:
!pip install langchain_community langchain_huggingface

In [None]:
! pip install langchain_google_vertexai

In [None]:
import datetime
import re
from typing import List

import torch
from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage  # To structure messages
from langgraph.graph import END, MessageGraph

print("All imports done")

In [None]:
import os
import getpass
if not os.environ.get("TAVILY_API_KEY"):
    os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API key:\n")

In [None]:
MAX_NEW_TOKENS = 256  # You can change this value
MAX_ITERATIONS = 2    # Maximum number of revision cycles (used in the last cell)
TEMPERATURE = 0.6



In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
print("--" * 33)

pipe = pipeline(
    "text-generation",
    model="google/gemma-2b-it",
    device=device,  # run the model on the detected device
    max_new_tokens=MAX_NEW_TOKENS,
    temperature=TEMPERATURE,
    do_sample=True,
    repetition_penalty=1.1,
)


llm = HuggingFacePipeline(pipeline=pipe)
print("\nLangChain LLM (Gemma 2B via HuggingFacePipeline) initialized successfully!")

In [None]:
print("Configuration:")
print(f"- Max New Tokens: {MAX_NEW_TOKENS}")
print(f"- Max Iterations: {MAX_ITERATIONS}")
print(f"- Temperature: {TEMPERATURE}")
print("=" * 50)

In [None]:
# Generate output using the Gemma 2B LLM
llm.invoke("What role does AI play in developing small businesses?")

In [None]:

MAX_ITERATIONS = 3  # Overrides the value of 2 set in the configuration cell

from langchain.prompts import ChatPromptTemplate

# Word limit (not wired into the prompts below, which hard-code 150 words)
WORD_LIMIT = 176


# MUCH SIMPLER PROMPTS - NO CONFUSION
first_responder_prompt = ChatPromptTemplate.from_messages([
    ("user", "Answer this question in exactly 150 words: {question}\n\nProvide your answer followed by 2 search terms:\nANSWER: [your 150-word answer]\nSEARCH: term1, term2"),
])

revisor_prompt = ChatPromptTemplate.from_messages([
    ("user", "Improve this answer using the search data. Write ONLY the improved answer with citations [1], [2]:\n\nOriginal: {original_answer}\nSearch Data: {search_results}\n\nIMPROVED ANSWER:"),
])

def parse_first_response(text: str):
    """Simple parsing for first response"""
    try:
        if "ANSWER:" in text:
            answer_section = text.split("ANSWER:")[1]
            if "SEARCH:" in answer_section:
                answer = answer_section.split("SEARCH:")[0].strip()
                search_section = answer_section.split("SEARCH:")[1].strip()
                # Simple comma-split for search terms
                search_queries = [q.strip() for q in search_section.split(",")[:2]]
            else:
                answer = answer_section.strip()
                search_queries = ["AI small business", "business automation"]
        else:
            answer = text.strip()[:400]  # Use first 400 chars
            search_queries = ["AI small business", "business automation"]

        # Clean up answer
        if len(answer) < 50:
            answer = "AI is transforming small businesses by automating processes, improving customer service, and providing data-driven insights for better decision making."

        return {
            "answer": answer,
            "search_queries": search_queries[:2]
        }
    except Exception as e:
        print(f"Parse error: {e}")
        return {
            "answer": "AI helps small businesses through automation and intelligent decision-making tools.",
            "search_queries": ["AI business tools", "small business automation"]
        }

def parse_revised_response(text: str):
    """Simple parsing for revised response"""
    try:
        # Look for IMPROVED ANSWER section
        if "IMPROVED ANSWER:" in text:
            answer = text.split("IMPROVED ANSWER:")[1].strip()
        elif "REVISED:" in text:
            answer = text.split("REVISED:")[1].strip()
        else:
            answer = text.strip()

        # Clean answer - remove any meta-commentary
        lines = answer.split('\n')
        clean_lines = []
        for line in lines:
            line = line.strip()
            # Skip meta-commentary lines
            if not line.startswith(('*', '**Improvements', '**Additional', '**Search Results', '**Notes')):
                clean_lines.append(line)

        answer = ' '.join(clean_lines).strip()

        # Extract URLs
        references = re.findall(r'https?://[^\s\]]+', answer)

        # Ensure minimum length
        if len(answer) < 100:
            answer = "AI is revolutionizing small businesses by providing automation tools, customer insights, and operational efficiency improvements that help companies compete effectively."

        return {
            "answer": answer,
            "references": references[:5]
        }
    except Exception as e:
        print(f"Parse error: {e}")
        return {
            "answer": "AI empowers small businesses with automation and intelligent tools.",
            "references": []
        }

def first_responder_chain(state: List[BaseMessage]) -> List[BaseMessage]:
    """Generate initial answer - SIMPLIFIED"""
    user_question = state[0].content

    prompt = first_responder_prompt.invoke({"question": user_question[:100]})
    response = llm.invoke(prompt.to_string())

    # Clean response
    clean_response = response
    if prompt.to_string() in response:
        clean_response = response.replace(prompt.to_string(), "").strip()

    print(f"🟢 First response: {clean_response[:150]}...")

    parsed = parse_first_response(clean_response)

    tool_call = {
        "name": "AnswerQuestion",
        "args": parsed,
        "id": "call_001"
    }

    return [AIMessage(content=clean_response, tool_calls=[tool_call])]

def revisor_chain(state: List[BaseMessage]) -> List[BaseMessage]:
    """Revise answer - MUCH SIMPLER"""
    iteration_count = sum(isinstance(msg, ToolMessage) for msg in state)
    print(f"🔄 Revision {iteration_count}")

    # Get the last actual answer content
    original_answer = ""
    search_results = ""

    # Find most recent search results
    for msg in reversed(state):
        if isinstance(msg, ToolMessage):
            search_results = msg.content
            break

    # Find the most recent answer
    for msg in reversed(state):
        if isinstance(msg, AIMessage) and hasattr(msg, 'tool_calls') and msg.tool_calls:
            try:
                original_answer = msg.tool_calls[0]["args"]["answer"]
                break
            except (KeyError, IndexError, TypeError):
                continue

    if not original_answer:
        original_answer = "AI helps small businesses automate tasks and improve efficiency."

    print(f"✅ Using answer: {original_answer[:80]}...")

    # Very simple prompt
    prompt = revisor_prompt.invoke({
        "original_answer": original_answer[:300],  # Keep it short
        "search_results": search_results[:200]     # Keep it short
    })

    response = llm.invoke(prompt.to_string())

    # Clean response
    clean_response = response
    if prompt.to_string() in response:
        clean_response = response.replace(prompt.to_string(), "").strip()

    print(f"🟢 Revised: {clean_response[:150]}...")

    parsed = parse_revised_response(clean_response)

    tool_call = {
        "name": "ReviseAnswer",
        "args": parsed,
        "id": f"revision_{iteration_count}"
    }

    return [AIMessage(content=clean_response, tool_calls=[tool_call])]

def execute_tools(state: List[BaseMessage]) -> List[BaseMessage]:
    """Execute search - SIMPLIFIED"""
    last_ai_message = state[-1]

    if not hasattr(last_ai_message, "tool_calls") or not last_ai_message.tool_calls:
        return []

    tool_call = last_ai_message.tool_calls[0]
    iteration = sum(isinstance(msg, ToolMessage) for msg in state) + 1

    # Get search queries
    if tool_call["name"] == "AnswerQuestion":
        search_queries = tool_call["args"].get("search_queries", ["AI small business"])
    else:
        # For revisions, use predefined searches
        search_queries = ["AI small business examples", "AI business benefits"]

    print(f"🔍 Searching: {search_queries}")

    # Execute search
    all_results = []
    try:
        tavily_tool = TavilySearchResults(max_results=3)
        for query in search_queries[:3]:
            try:
                results = tavily_tool.invoke(query)
                if results:
                    all_results.extend(results[:3])  # Max 3 per query
            except Exception as e:
                print(f"Search failed: {e}")
    except Exception:
        # Fallback data
        all_results = [
            {'title': 'AI for Small Business', 'url': 'https://www.salesforce.com/ai-small-business/', 'content': 'AI tools help automate tasks'},
            {'title': 'Business Automation Guide', 'url': 'https://www.forbes.com/ai-business-guide/', 'content': 'Automation improves efficiency'}
        ]

    # Format results SIMPLY
    formatted = []
    for i, result in enumerate(all_results[:3], 1):
        title = result.get('title', 'AI Resource')[:40]
        url = result.get('url', 'https://example.com')
        formatted.append(f"[{i}] {title}: {url}")

    result_text = "\n".join(formatted)

    return [ToolMessage(content=result_text, tool_call_id=tool_call["id"])]

# SIMPLE EVENT LOOP
def event_loop(state: List[BaseMessage]) -> str:
    """Simple iteration control"""
    count = sum(isinstance(item, ToolMessage) for item in state)

    print(f"🔄 Iteration {count}/{MAX_ITERATIONS}")

    if count >= MAX_ITERATIONS:
        print("✅ Done!")
        return END

    return "search"

# Build graph
graph = MessageGraph()
graph.add_node("draft", first_responder_chain)
graph.add_node("search", execute_tools)
graph.add_node("revise", revisor_chain)

graph.add_edge("draft", "search")
graph.add_edge("search", "revise")
graph.add_conditional_edges("revise", event_loop)
graph.set_entry_point("draft")

app = graph.compile()

# Test
if __name__ == "__main__":
    print(f"\n📅 Date: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("👤 User: abuzar01440")
    print("\n🚀 TESTING SIMPLIFIED WORKFLOW")
    print("=" * 50)

    try:
        response = app.invoke([
            HumanMessage(content="How is AI reshaping small businesses and entrepreneurship?")
        ])

        print(f"\n✅ Completed with {len(response)} messages")

        # Find final answer
        final_answer = None
        final_refs = []

        for msg in reversed(response):
            if isinstance(msg, AIMessage) and hasattr(msg, 'tool_calls') and msg.tool_calls:
                if msg.tool_calls[0]["name"] == "ReviseAnswer":
                    final_answer = msg.tool_calls[0]["args"]["answer"]
                    final_refs = msg.tool_calls[0]["args"].get("references", [])
                    break

        if final_answer:
            print("\n" + "="*60)
            print("🎯 FINAL ANSWER:")
            print("="*60)
            print(final_answer)

            if final_refs:
                print(f"\n📚 REFERENCES ({len(final_refs)}):")
                for i, ref in enumerate(final_refs, 1):
                    print(f"[{i}] {ref}")

            print(f"\n📊 Stats: {len(final_answer)} chars, ~{len(final_answer.split())} words")
        else:
            print("❌ No final answer found")

    except Exception as e:
        print(f"❌ Error: {e}")
        import traceback
        traceback.print_exc()

Artificial Intelligence (AI) is revolutionizing the landscape of small businesses and entrepreneurship by automating tasks, analyzing data, and providing personalized recommendations. This empowers entrepreneurs to focus on strategic decision-making, leading to increased efficiency, productivity, and profitability. Moreover, AI-powered tools can provide valuable insights and predictions, enabling entrepreneurs to make informed decisions that optimize their operations and achieve long-term success [1].  Additionally, AI can automate repetitive tasks such as data entry, freeing up entrepreneurs to focus on more creative and strategic endeavors [2]. By leveraging AI, small businesses can gain access to a wide range of tools and resources that would be difficult or expensive for them to afford otherwise. This allows them to compete more effectively with larger enterprises [1].
