Skip to content

ashish-doing/autostream-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AutoStream Conversational AI Agent

A production-ready conversational AI agent for AutoStream, a fictional SaaS platform providing automated video editing tools for content creators. Built as part of the ServiceHive ML Intern assignment.


How to Run Locally

Prerequisites

Setup

# 1. Clone the repository
git clone https://github.com/ashish-doing/autostream-agent
cd autostream-agent

# 2. Install dependencies
pip install -r requirements.txt

# 3. Create a .env file in the root directory
echo "GROQ_API_KEY=your_groq_api_key_here" > .env

# 4. Run the agent
python main.py

Example Conversation Flow

You: Hi
Agent: Hello! Welcome to AutoStream...

You: What are your pricing plans?
Agent: We have two plans...

You: I want to try the Pro plan for my YouTube channel
Agent: That's great! Could I get your name first?

You: Ashish
You: ashish@gmail.com
You: YouTube
Agent: Lead captured successfully!

Architecture Explanation

Why LangGraph?

LangGraph was chosen over AutoGen because it provides explicit, deterministic state management through a typed state graph. For a lead capture agent, predictability is critical — we need guaranteed control over when the lead capture tool fires, and LangGraph's node-based architecture makes this straightforward. AutoGen's multi-agent conversation model introduces unnecessary complexity for a single-agent workflow.

How State is Managed

The agent uses a TypedDict-based AgentState that persists across all conversation turns within a session. The state tracks:

  • messages — full conversation history using LangGraph's add_messages reducer
  • intent — classified intent of the latest user message
  • lead_name, lead_email, lead_platform — collected incrementally across turns
  • lead_captured — boolean flag to prevent duplicate tool calls
  • awaiting — tracks which lead field the agent is currently collecting

This design ensures the agent never triggers mock_lead_capture() until all three fields are collected, and never prematurely exits the collection flow regardless of what the user says mid-collection.

RAG Pipeline

Product knowledge (pricing, features, policies) is stored in a local knowledge_base.json file. The rag.py module performs keyword-based retrieval to find the most relevant context, which is then injected into the LLM prompt. This keeps responses grounded and prevents hallucination of non-existent plans or features.


WhatsApp Deployment via Webhooks

To deploy this agent on WhatsApp, the following architecture would be used:

  1. WhatsApp Business API — Register a WhatsApp Business account and obtain API credentials via Meta's developer portal.

  2. Webhook Endpoint — Deploy a FastAPI or Flask server that exposes a POST /webhook endpoint. WhatsApp sends incoming messages to this URL.

  3. Message Handling — On each incoming webhook event, extract the user's phone number and message text, then pass it into the LangGraph agent's invoke() method with the appropriate session state.

  4. Session Persistence — Store per-user AgentState in a Redis or PostgreSQL database, keyed by phone number, so state persists across messages.

  5. Reply — After the agent responds, use the WhatsApp Business API to send the response back to the user's phone number.

WhatsApp User
     ↓ message
WhatsApp Business API
     ↓ webhook POST
FastAPI Server (/webhook)
     ↓ invoke
LangGraph Agent
     ↓ response
WhatsApp Business API
     ↓ reply
WhatsApp User

Project Structure

autostream-agent/
├── main.py              # Entry point, conversation loop
├── agent.py             # LangGraph agent, state management, intent detection
├── rag.py               # Knowledge base loader and retrieval
├── tools.py             # mock_lead_capture tool
├── knowledge_base.json  # AutoStream pricing, features, policies
├── .env                 # API keys (not committed)
├── .gitignore
├── requirements.txt
└── README.md

Tech Stack

  • Language: Python 3.9+
  • Framework: LangGraph + LangChain
  • LLM: Llama 3.1 8B via Groq API (free tier)
  • State Management: LangGraph TypedDict state graph
  • Knowledge Base: Local JSON file with keyword-based RAG

About

Conversational AI agent for AutoStream SaaS — intent detection, RAG, and lead capture built with LangGraph + Groq

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages