An intelligent conversational AI agent that converts social media conversations into qualified business leads using state-of-the-art LLM technology and agentic workflows.
- Overview
- Key Features
- Architecture
- Installation
- Usage
- WhatsApp Integration
- Technical Deep Dive
- Demo & Examples
- Project Structure
- Future Enhancements
This project implements a production-ready conversational AI agent for AutoStream, a fictional SaaS company offering AI-powered video editing tools. The agent demonstrates advanced capabilities in:
- Intent Recognition - Classifying user intent with context-awareness
- RAG-Powered Knowledge Retrieval - Answering questions using a local knowledge base
- Lead Qualification - Detecting high-intent users and capturing their information
- Tool Execution - Triggering backend actions through validated workflows
Tech Stack: Python, LangChain, LangGraph, Google Gemini 2.5 Flash
- Multi-turn context awareness
- Distinguishes between greetings, inquiries, and high-intent signals
- Fallback logic for robustness
- Handles conversation state transitions
- Direct context injection for accurate responses
- Structured knowledge base with comprehensive product information
- Fallback knowledge base if file is missing
- Optimized prompts for concise, helpful answers
- Progressive information gathering (Name → Email → Platform)
- Multi-strategy extraction:
- Regex patterns for email/platform
- LLM-assisted name extraction
- Heuristic validation
- Prevents infinite loops with attempt tracking
- Professional confirmation messages
- Comprehensive error handling
- Graceful degradation on failures
- Extensive logging for debugging
- Type hints and documentation
- State management with validation
I chose LangGraph over AutoGen for this implementation because:
1. Explicit State Management: LangGraph provides a clear, type-safe state graph that makes conversation flow predictable and debuggable. The `AgentState` TypedDict explicitly defines all state variables, preventing bugs from implicit state changes.

2. Deterministic Flow Control: Unlike AutoGen's agent-to-agent communication, which can be non-deterministic, LangGraph uses explicit edges and conditional routing. This ensures:
   - Predictable conversation paths
   - Easier testing and validation
   - Clear separation of concerns

3. Production Scalability: LangGraph's architecture naturally extends to:
   - Persistent state stores (Redis, PostgreSQL)
   - Streaming responses for real-time UX
   - Checkpointing for conversation resume
   - Easy integration with FastAPI/Flask

4. Tool Calling Precision: LangGraph's conditional routing ensures tools are called only once their preconditions are met (all three lead fields collected), preventing premature API calls.
The agent maintains state through a strongly typed `AgentState` dictionary:

```python
class AgentState(TypedDict):
    messages: List[BaseMessage]    # Conversation history
    lead_info: dict                # {name, email, platform, validated}
    intent: str                    # Current intent (GREETING/INQUIRY/HIGH_INTENT)
    conversation_stage: str       # Workflow stage
    missing_fields: List[str]      # Tracks incomplete data
    attempts: int                  # Prevents infinite loops
```

State persistence across turns is achieved through:
- The `operator.add` annotation for message accumulation
- Explicit state updates in each node's return value
- Conditional routing based on state inspection

Memory window: the last 3 messages provide context for intent classification, while the full conversation history is retained for reference.
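As a minimal, self-contained sketch of this pattern (with a stand-in `BaseMessage`, since the real class comes from `langchain_core`), the `operator.add` annotation and the 3-message window look like this:

```python
import operator
from typing import Annotated, List, TypedDict

# Stand-in for langchain_core.messages.BaseMessage, so the sketch runs alone
class BaseMessage:
    def __init__(self, content: str):
        self.content = content

class AgentState(TypedDict):
    # Annotated with operator.add: LangGraph appends each node's returned
    # messages to the running history instead of overwriting it
    messages: Annotated[List[BaseMessage], operator.add]
    lead_info: dict
    intent: str
    conversation_stage: str
    missing_fields: List[str]
    attempts: int

def classification_context(state: AgentState) -> List[BaseMessage]:
    # Only the last 3 messages feed the intent classifier
    return state["messages"][-3:]
```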
```
                  ┌───────┐
                  │ START │
                  └───┬───┘
                      │
                      ▼
            ┌───────────────────┐
            │ Intent Classifier │
            └─────────┬─────────┘
                      │
       ┌──────────────┼──────────────────┐
       │ [GREETING]   │ [INQUIRY]        │ [HIGH_INTENT]
       ▼              ▼                  ▼
  ┌──────────┐     ┌─────┐        ┌──────────────┐
  │ Greeting │     │ RAG │        │ Lead Capture │◄─┐
  └────┬─────┘     └──┬──┘        └──────┬───────┘  │
       │              │                  ├──────────┘
       │              │                  │  (loops while fields missing)
       └──────────────┴──────────────────┘
                      │
                      ▼
                 ┌─────────┐
                 │   END   │
                 └─────────┘
```
Routing Logic:
- If in the QUALIFYING stage with missing fields → stay in Lead Capture
- Otherwise, route based on the latest intent classification
- Lead Capture loops until all fields validated
- All nodes eventually reach END state
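The routing rules above can be sketched as a single router function; in LangGraph this is the callable handed to `add_conditional_edges` (node names here are illustrative, not the project's actual identifiers):

```python
def route_after_classification(state: dict) -> str:
    """Decide the next node, mirroring the routing rules above."""
    # Mid-qualification with incomplete data: keep collecting lead fields
    if state.get("conversation_stage") == "QUALIFYING" and state.get("missing_fields"):
        return "lead_capture"
    # Otherwise route on the latest intent classification
    return {
        "GREETING": "greeting",
        "INQUIRY": "rag",
        "HIGH_INTENT": "lead_capture",
    }.get(state.get("intent", ""), "rag")  # fall back to RAG for unknown intents
```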
- Python 3.9 or higher
- Google API Key (Gemini)
1. Clone the repository

```bash
git clone https://github.com/yourusername/autostream-ai-agent.git
cd autostream-ai-agent
```

2. Create a virtual environment (recommended)

```bash
# Windows
python -m venv autoenv
autoenv\Scripts\activate
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

4. Set up environment variables

Create a `.env` file in the project root:

```
GOOGLE_API_KEY=your_gemini_api_key_here
```

To get your Gemini API key:
- Visit Google AI Studio
- Create a new API key
- Copy it into your `.env` file

5. Verify the knowledge base

Ensure `knowledge_base.md` is in the project root. A fallback is included if the file is missing.
Run the agent:

```bash
python agent.py
```

You: Hi there!
Agent: Hello! 👋 Welcome to AutoStream.
I'm here to help you streamline your video editing workflow. I can:
• Explain our pricing and features
• Answer questions about our AI-powered tools
• Help you get started with a 14-day free trial
What would you like to know?
You: What's the difference between Basic and Pro?
Agent: The key differences between our plans:
Basic Plan ($29/month):
- 10 videos/month, 720p resolution
- Email support during business hours
Pro Plan ($79/month):
- Unlimited videos, 4K resolution
- AI captions in 40+ languages
- 24/7 premium support
- Priority processing (2x faster)
- Advanced AI features like smart scene detection
Plus, you can try Pro free for 14 days with no credit card needed!
You: That sounds great! I want to try the Pro plan for my YouTube channel.
Agent: Great! Let's get you set up. What's your name?
You: John Doe
Agent: Thanks, John! What email address should we use to contact you?
You: john@example.com
Agent: Perfect! Which platform do you primarily create content for? (YouTube, Instagram, TikTok, etc.)
You: YouTube
Agent: Perfect! I've got everything I need, John. ✅
Your information has been captured:
• Name: John Doe
• Email: john@example.com
• Platform: YouTube
Our team will reach out within 24 hours to get you started with AutoStream Pro...
To integrate this agent with WhatsApp, I would use the WhatsApp Business API with a webhook-based architecture:
WhatsApp User → WhatsApp Cloud API → Webhook Server → AI Agent → Response → WhatsApp
1. Infrastructure Setup
To establish the connection, the system uses a FastAPI web server that acts as a secure gateway to listen for real-time events from WhatsApp. This server exposes two critical endpoints: a GET route that handles the initial verification "handshake" by checking a security token to confirm the server's legitimacy with Meta, and a POST route that actively captures incoming message payloads. When a user sends a message, WhatsApp pushes the data to this listener, where the system extracts the sender's phone number and the text body to pass them downstream for immediate processing.
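A hedged sketch of this handling, reduced to pure functions so it stands alone: `extract_incoming_messages` and `verify_handshake` are hypothetical helpers that the POST and GET routes would call, and the `entry → changes → value → messages` nesting follows the Cloud API's documented webhook shape.

```python
from typing import List, Optional, Tuple

def extract_incoming_messages(payload: dict) -> List[Tuple[str, str]]:
    """Pull (sender_phone, text_body) pairs out of a webhook POST payload."""
    results = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            for msg in change.get("value", {}).get("messages", []):
                sender = msg.get("from", "")            # sender's phone number
                text = msg.get("text", {}).get("body", "")
                if sender and text:
                    results.append((sender, text))
    return results

def verify_handshake(mode: str, token: str, challenge: str,
                     expected_token: str) -> Optional[str]:
    """GET-route check: echo the challenge only when the verify token matches."""
    if mode == "subscribe" and token == expected_token:
        return challenge
    return None  # the route would answer HTTP 403 in this case
```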
2. Session Management
Because WhatsApp communication is asynchronous and stateless, the system utilizes Redis as a high-speed database to maintain the conversational memory for each unique user. When a message arrives, the code uses the sender's phone number as a key to look up their specific conversation history; if a session is found, the previous context is loaded, but if not, a fresh state is initialized. After the AI generates a response, the updated state is serialized and saved back to Redis with a set expiration time (e.g., 24 hours), ensuring the agent remembers the context of the chat while automatically clearing out old, inactive data.
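A minimal sketch of this pattern, assuming the agent state is already JSON-serializable (real message objects would need conversion first); `redis_client` stands for a `redis.Redis` instance, and `get`/`setex` are standard redis-py calls:

```python
import json

SESSION_TTL_SECONDS = 24 * 60 * 60  # clear inactive sessions after 24 hours

def fresh_state() -> dict:
    # Same shape as the agent's AgentState, initialized empty
    return {"messages": [], "lead_info": {}, "intent": "",
            "conversation_stage": "START", "missing_fields": [], "attempts": 0}

def load_session(redis_client, phone: str) -> dict:
    # The sender's phone number keys the session
    raw = redis_client.get(f"session:{phone}")
    return json.loads(raw) if raw is not None else fresh_state()

def save_session(redis_client, phone: str, state: dict) -> None:
    # setex writes the value and (re)sets the 24h expiry in one call
    redis_client.setex(f"session:{phone}", SESSION_TTL_SECONDS, json.dumps(state))
```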
3. WhatsApp-Specific Adaptations
The final stage involves formatting and transmitting the AI's output back to the user through the official WhatsApp Cloud API. The system includes a safety utility that automatically splits any AI response exceeding 4,096 characters into smaller chunks to comply with WhatsApp's strict message limits. Once the text is prepared, the system authenticates using a secure access token and dispatches an HTTP POST request to the Meta Graph API, routing the reply specifically to the user's phone number to complete the interaction loop.
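A sketch of the chunking and dispatch step: `split_message` is a hypothetical helper, and the endpoint and body shape follow the Cloud API's documented text-message format.

```python
import json
import urllib.request

WHATSAPP_TEXT_LIMIT = 4096  # Cloud API cap on a text message body

def split_message(text: str, limit: int = WHATSAPP_TEXT_LIMIT) -> list:
    """Split long agent replies into chunks, preferring line-break boundaries."""
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # Hard-split any single line that exceeds the limit on its own
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        if current and len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks or [text]

def send_whatsapp_text(phone_number_id: str, to: str, text: str, token: str) -> None:
    """Dispatch each chunk to the Meta Graph API, addressed to the user's number."""
    url = f"https://graph.facebook.com/v18.0/{phone_number_id}/messages"
    for chunk in split_message(text):
        body = json.dumps({"messaging_product": "whatsapp", "to": to,
                           "type": "text", "text": {"body": chunk}}).encode()
        req = urllib.request.Request(url, data=body, method="POST", headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"})
        urllib.request.urlopen(req)
```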
4. Deployment Architecture
```
┌─────────────────┐
│  WhatsApp User  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  WhatsApp Cloud │
│       API       │
└────────┬────────┘
         │
         ▼ HTTPS Webhook
┌─────────────────┐
│  Load Balancer  │
│   (nginx/AWS)   │
└────────┬────────┘
         │
    ┌────┴─────┐
    │          │
    ▼          ▼
┌────────┐ ┌────────┐
│FastAPI │ │FastAPI │  (Horizontal scaling)
│Server 1│ │Server 2│
└───┬────┘ └───┬────┘
    │          │
    └────┬─────┘
         │
         ▼
┌─────────────────┐
│   Redis Cache   │  (Session storage)
│  + Agent State  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   PostgreSQL    │  (Lead storage)
└─────────────────┘
```
5. Required Components
- Webhook Server: FastAPI/Flask hosted on cloud (AWS, GCP, Railway)
- Public HTTPS URL: Required by WhatsApp (use ngrok for testing)
- Session Store: Redis for conversation state
- Database: PostgreSQL for captured leads
- Message Queue: Celery/RabbitMQ for async processing (optional but recommended)
6. Key Considerations
- Rate Limiting: WhatsApp has rate limits (1000 messages/user/day for free tier)
- Message Templates: Pre-approved templates needed for proactive messages
- Media Handling: Support images/videos users might send
- Latency: Target <3 second response time (use streaming if longer)
- Error Handling: Implement retry logic for failed webhooks
- Security: Validate webhook signatures to prevent spoofing
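The signature check from the last bullet can be sketched as follows; Meta signs each webhook POST body with the app secret and sends the digest in the `X-Hub-Signature-256` header:

```python
import hashlib
import hmac

def valid_webhook_signature(app_secret: str, raw_body: bytes, header: str) -> bool:
    """Verify Meta's X-Hub-Signature-256 header ("sha256=<hex>") for a webhook POST."""
    if not header.startswith("sha256="):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison guards against timing attacks
    return hmac.compare_digest(expected, header[len("sha256="):])
```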
Unlike traditional vector database approaches, this implementation uses direct context injection for RAG:
```python
# The entire markdown knowledge base is loaded into the context window
prompt = f"""You are AutoStream's AI assistant.

Knowledge Base:
{KNOWLEDGE_BASE_TEXT}

User Query: {query}
"""
```

Advantages:
- No embedding API costs or complexity
- Perfect for small knowledge bases (<10k tokens)
- 100% retrieval accuracy (entire context available)
- Simpler to debug and maintain
When to use vector DB instead:
- Knowledge base >20k tokens
- Need semantic search across many documents
- Frequent updates requiring re-indexing
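A minimal sketch of the load-with-fallback and prompt assembly described above (`FALLBACK_KB` here is a hypothetical one-liner; the project ships a fuller fallback):

```python
from pathlib import Path

# Hypothetical trimmed fallback used when knowledge_base.md is absent
FALLBACK_KB = "AutoStream: Basic $29/month, Pro $79/month, 14-day free trial."

def load_knowledge_base(path: str = "knowledge_base.md") -> str:
    """Direct context injection: the whole markdown file rides in every prompt."""
    p = Path(path)
    return p.read_text(encoding="utf-8") if p.exists() else FALLBACK_KB

def build_rag_prompt(query: str, kb_text: str) -> str:
    return ("You are AutoStream's AI assistant.\n\n"
            f"Knowledge Base:\n{kb_text}\n\n"
            f"User Query: {query}\n")
```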
Three-layer approach for robustness:
- LLM Classification: Primary method using Gemini with structured prompt
- Keyword Fallback: If LLM returns invalid category, use keyword matching
- Context Awareness: Uses last 3 messages to understand conversation flow
```
# Example: Detecting HIGH_INTENT after a pricing question
History:
  User: "What's your pricing?"
  Agent: [explains plans]
  User: "That sounds perfect!"   # Context makes this HIGH_INTENT
```

Progressive disclosure pattern:

```
Start → Ask Name → Ask Email → Ask Platform → Execute Tool → End
              ↑________|___________|____________|
                       (Repeat if missing)
```
Validation layers:
- Regex extraction (email, platform keywords)
- LLM-assisted extraction (name from natural language)
- User confirmation before tool execution
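The regex layer can be sketched as below; the platform keyword table is illustrative, not the project's actual list:

```python
import re
from typing import Optional

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Keyword -> canonical label; first match wins (hypothetical table)
PLATFORMS = {"youtube": "YouTube", "instagram": "Instagram",
             "tiktok": "TikTok", "twitch": "Twitch", "linkedin": "LinkedIn"}

def extract_email(text: str) -> Optional[str]:
    m = EMAIL_RE.search(text)
    return m.group(0) if m else None

def extract_platform(text: str) -> Optional[str]:
    lowered = text.lower()
    for keyword, label in PLATFORMS.items():
        if keyword in lowered:
            return label
    return None
```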
User: How much does AutoStream cost?
Agent: AutoStream offers two plans:
1. Basic Plan - $29/month
- 10 videos/month
- 720p resolution
- Email support
2. Pro Plan - $79/month (Most Popular)
- Unlimited videos
- 4K resolution
- AI captions
- 24/7 support
Both plans come with a 14-day free trial. Which features interest you most?
User: What's your refund policy?
Agent: We offer a 7-day money-back guarantee on all plans. If you're not satisfied
within the first week, we'll refund your payment, no questions asked. After 7 days,
refunds are not available, but you can cancel anytime to avoid future charges.
Is there anything specific about the policy you'd like to know?
User: I want to sign up for Pro plan
Agent: Great! Let's get you set up. What's your name?
User: Sarah Johnson
Agent: Thanks, Sarah! What email address should we use to contact you?
User: sarah.j@gmail.com
Agent: Perfect! Which platform do you primarily create content for?
User: I create content for Instagram and TikTok
Agent: Perfect! I've got everything I need, Sarah. ✅
[Lead Captured Successfully - see console output]
```
autostream-ai-agent/
│
├── agent.py              # Main agent implementation
├── knowledge_base.md     # RAG knowledge base
├── requirements.txt      # Python dependencies
├── .env                  # Environment variables (create this)
├── .env.example          # Template for environment variables
│
├── README.md             # This file
├── DEMO.md               # Detailed demo scenarios
│
└── tests/                # Unit tests (future)
    ├── test_intent.py
    ├── test_rag.py
    └── test_lead_capture.py
```
This project demonstrates proficiency in:
1. LLM Application Development
   - Prompt engineering for classification and extraction
   - Context window management
   - Temperature tuning for consistency

2. State Machine Design
   - Conditional routing based on conversation stage
   - Loop prevention and error recovery
   - State persistence strategies

3. Production Engineering
   - Comprehensive error handling
   - Logging and debugging
   - Type safety with TypedDict
   - Graceful degradation

4. AI Agent Patterns
   - Tool calling workflows
   - Progressive information gathering
   - Multi-turn context tracking
- Add unit tests (pytest)
- Implement conversation logging to database
- Add monitoring/metrics (response time, success rate)
- Support multiple LLM providers (OpenAI, Anthropic)
- Multi-language support
- Voice input/output integration
- Sentiment analysis for better intent detection
- A/B testing framework for prompt variations
- Deploy as REST API (FastAPI)
- Add Redis for distributed state
- Implement conversation replay/debugging UI
- Real WhatsApp integration
- Slack/Discord bot versions
MIT License - feel free to use this project as a learning resource or foundation for your own agents.
- LangChain Team - For the excellent LangGraph framework
- Google DeepMind - For Gemini API access
- ServiceHive - For the challenging and educational assignment
⭐ If you found this helpful, please consider starring the repository!
Last Updated: January 2026