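A multi-strategy AI reasoning system that implements four techniques from recent AI research papers: Chain-of-Thought with self-consistency voting, Tree-of-Thoughts, ReAct, and multi-agent collaboration. Built with Groq LLM, Tavily Search, and ChromaDB, with rate limiting and streaming for deployment.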
| Feature | Description |
|---|---|
| 🔗 Chain-of-Thought | Step-by-step reasoning with self-consistency voting |
| 🌳 Tree-of-Thoughts | Multi-path exploration with beam search |
| ⚡ ReAct Agent | Reasoning + Acting with real web search |
| 👥 Multi-Agent | Planner → Worker → Critic collaboration |
| 🧠 LLM Auto-Classifier | Intelligent strategy routing based on task type |
| 🌐 Real Web Search | Tavily API integration for live information |
| 💾 Vector Memory | ChromaDB for persistent knowledge storage |
| 🛡️ Rate Limiting | API protection (10/min, 100/day per user) |
| 🌊 Streaming | Real-time response streaming |
Try it now: https://huggingface.co/spaces/SaiTejaSrivilli/ai-agent-system
┌─────────────────────────────────────────────────────────────┐
│                         User Input                          │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                🧠 LLM-Based Auto-Classifier                 │
│        (Intelligent routing based on task analysis)         │
└─────────────────────────┬───────────────────────────────────┘
                          │
          ┌───────────────┼───────────────┬───────────────┐
          ▼               ▼               ▼               ▼
   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
   │ Chain-of    │ │ Tree-of     │ │ ReAct       │ │ Multi-      │
   │ Thought     │ │ Thoughts    │ │ Agent       │ │ Agent       │
   │             │ │             │ │             │ │             │
   │ • 3 paths   │ │ • Beam=3    │ │ • Search    │ │ • Planner   │
   │ • Voting    │ │ • Depth=3   │ │ • Memory    │ │ • Worker    │
   │ • Consensus │ │ • Scoring   │ │ • Tools     │ │ • Critic    │
   └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
          │               │               │               │
          └───────────────┴───────────────┴───────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                      📤 Final Response                      │
│              (Answer + Metadata + Confidence)               │
└─────────────────────────────────────────────────────────────┘
The system uses an LLM-based classifier (not keyword matching) to intelligently route queries:
def _classify(self, task: str) -> str:
    """LLM-based intelligent task classification."""
    classify_prompt = f"""Classify this task into ONE category:
- cot: Math problems, calculations, logic puzzles
- tot: Creative tasks, design, brainstorming
- react: Research questions, factual queries, current events
- multi: Complex writing, essays, detailed analysis
Task: "{task}"
Category:"""
    response = self.llm.generate(classify_prompt, temperature=0.1)
    # Returns: cot, tot, react, or multi

| Query | Strategy | Reason |
|---|---|---|
| "Calculate 15% tip on $85" | Chain-of-Thought | Math calculation |
| "Design a logo for a coffee shop" | Tree-of-Thoughts | Creative task |
| "What are the latest AI developments?" | ReAct Agent | Research/current events |
| "Write an analysis of remote work" | Multi-Agent | Complex writing task |
| "Explain quantum computing" | ReAct Agent | Factual explanation |
| "Create a marketing strategy" | Multi-Agent | Complex planning |
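Because the classifier replies in free-form text, the routing step presumably has to normalise that reply before dispatching to a strategy. The following is a minimal sketch of one way to do it, not the code in `app.py`: `parse_category`, `STRATEGY_NAMES`, and the fallback to `react` are all hypothetical names and choices made for illustration.

```python
import re

# Labels from the classification prompt, mapped to the strategy names in the table.
STRATEGY_NAMES = {
    "cot": "Chain-of-Thought",
    "tot": "Tree-of-Thoughts",
    "react": "ReAct Agent",
    "multi": "Multi-Agent",
}


def parse_category(raw_reply: str, default: str = "react") -> str:
    """Return the first known label found in the reply, else the default.

    Falling back to "react" is an assumption made for this sketch.
    """
    for token in re.findall(r"[a-z]+", raw_reply.lower()):
        if token in STRATEGY_NAMES:
            return token
    return default


if __name__ == "__main__":
    for reply in ["cot", " Category: ToT.", "I am not sure"]:
        label = parse_category(reply)
        print(f"{reply!r} -> {label} ({STRATEGY_NAMES[label]})")
```

Scanning for the first known token tolerates replies that wrap the label in extra words or punctuation, which a low but non-zero temperature can still produce.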
# Generates 3 independent reasoning paths
# Uses majority voting for final answer
# Smart answer extraction with money detection ($5 not just 5)
Example Output:
- Path 1: "$5" (via step-by-step calculation)
- Path 2: "$5" (via different approach)
- Path 3: "$5" (via verification)
- Final: "$5" with 100% confidence

# Explores multiple solution branches
# Beam width: 3, Depth: 3
# Scores and prunes paths for best solutions
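The score-and-prune step described above can be sketched as a plain beam search. This is an illustrative sketch under stated assumptions, not the project's implementation: `beam_search`, `expand`, and `score` are hypothetical names, and in the real system both callables would presumably be backed by LLM calls rather than the toy functions used here.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]


def beam_search(
    root: str,
    expand: Callable[[Path], List[str]],
    score: Callable[[Path], float],
    beam_width: int = 3,
    depth: int = 3,
) -> List[Tuple[float, Path]]:
    """Keep the `beam_width` best partial paths at each of `depth` levels."""
    beam: List[Path] = [(root,)]
    for _ in range(depth):
        candidates = [path + (thought,) for path in beam for thought in expand(path)]
        if not candidates:  # nothing left to expand
            break
        # Score every candidate and prune to the best `beam_width`.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return [(score(path), path) for path in beam]


if __name__ == "__main__":
    # Toy stand-ins for the LLM calls: each thought is a digit and a
    # path's score is the sum of its digits.
    def expand(path: Path) -> List[str]:
        return ["1", "2", "3"]

    def score(path: Path) -> float:
        return float(sum(int(t) for t in path[1:]))

    best_score, best_path = beam_search("start", expand, score)[0]
    print(best_score, best_path)
```

With a beam width of 3 and depth of 3, at most 3 paths survive each level, so the number of scored candidates grows linearly with depth instead of exponentially.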
Example: "Design an AI fitness feature"
- Branch 1: Personalized workout AI (Score: 8.5)
- Branch 2: Real-time form correction (Score: 9.0) ← Selected
- Branch 3: Social fitness challenges (Score: 7.5)

# Available Tools:
# - web_search: Tavily API for real-time info
# - memory_search: ChromaDB vector search
# - memory_store: Save important information
# - calculate: Math operations
# Reasoning loop:
Thought → Action → Observation → Thought → ... → Answer

# Three specialized agents:
# 1. Planner: Creates execution plan
# 2. Worker: Executes tasks with full content
# 3. Critic: Reviews and improves quality
# Produces direct content, not descriptions

Protects API usage with per-user limits:
| Limit | Value | Purpose |
|---|---|---|
| Per Minute | 10 requests | Prevents spam |
| Per Day | 100 requests | Protects daily quota |
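One way to enforce the two limits in the table is a sliding window over per-user timestamps. The sketch below is an assumption about how such a limiter could work, not the code in `app.py`; `SlidingWindowLimiter`, `check`, and the message wording are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class SlidingWindowLimiter:
    """Per-user limiter with a one-minute and a one-day window."""

    def __init__(self, max_per_minute: int = 10, max_per_day: int = 100):
        self.max_per_minute = max_per_minute
        self.max_per_day = max_per_day
        # user id (for example an IP address) -> timestamps of accepted requests
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, user: str, now: Optional[float] = None) -> Tuple[bool, str]:
        """Record the request if allowed; return (allowed, message)."""
        now = time.time() if now is None else now
        hits = self._hits[user]
        # Forget requests older than one day.
        while hits and now - hits[0] >= 86_400:
            hits.popleft()
        last_minute = sum(1 for t in hits if now - t < 60)
        if last_minute >= self.max_per_minute:
            return False, "Too many requests this minute, please wait a moment."
        if len(hits) >= self.max_per_day:
            return False, "Daily limit reached, please come back tomorrow."
        hits.append(now)
        return True, f"{len(hits)}/{self.max_per_day} requests used today"
```

Passing `now` explicitly makes the limiter easy to test without sleeping; rejected requests are not recorded, so a blocked user does not extend their own lockout.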
class RateLimiter:
    def __init__(self, max_per_minute=10, max_per_day=100):
        # Tracks requests by user IP
        # Shows friendly messages when limited
        # Displays current usage stats
        self.max_per_minute = max_per_minute
        self.max_per_day = max_per_day

# Clone repository
git clone https://github.com/SaiTejaSrivilli/ai-agent-system.git
cd ai-agent-system
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GROQ_API_KEY="your-groq-key"
export TAVILY_API_KEY="your-tavily-key" # Optional
# Run
python app.py

- Create a new Space on HuggingFace
- Upload `app.py` and `requirements.txt`
- Add secrets in Settings:
  - `GROQ_API_KEY` (Required)
  - `TAVILY_API_KEY` (Optional, for real web search)
- Space will auto-deploy
| Key | Required | Free Tier | Get It |
|---|---|---|---|
| `GROQ_API_KEY` | ✅ Yes | ✅ Generous | console.groq.com |
| `TAVILY_API_KEY` | ❌ Optional | ✅ 1000/month | tavily.com |
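The required/optional split in the table can be checked once at startup. This is a minimal sketch assuming the keys are read from environment variables as in the setup steps; `load_api_keys` and the shape of the returned dictionary are hypothetical, not part of `app.py`.

```python
import os
from typing import Dict, Mapping, Optional


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Read the two keys from the environment and report which features are on."""
    env = os.environ if env is None else env
    groq_key = env.get("GROQ_API_KEY", "").strip()
    if not groq_key:
        raise RuntimeError("GROQ_API_KEY is required; see console.groq.com")
    tavily_key = env.get("TAVILY_API_KEY", "").strip()
    return {
        "groq_api_key": groq_key,
        "tavily_api_key": tavily_key or None,
        # Without a Tavily key, real web search is unavailable.
        "web_search_enabled": bool(tavily_key),
    }
```

Failing fast on a missing `GROQ_API_KEY` surfaces configuration mistakes at launch instead of on the first user query.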
ai-agent-system/
├── app.py # Main application (~1200 lines)
├── requirements.txt # Dependencies
├── README.md # Documentation
└── LICENSE # MIT License
Lines 1-50: Imports & Configuration
Lines 51-120: LLM Client with Streaming
Lines 121-250: Web Search Tool (Tavily + Fallback)
Lines 251-350: Vector Memory (ChromaDB)
Lines 351-500: Chain-of-Thought Reasoner
Lines 501-650: Tree-of-Thoughts Reasoner
Lines 651-800: ReAct Agent
Lines 801-950: Multi-Agent System
Lines 951-1050: Creative Agent (Orchestrator)
Lines 1051-1200: Gradio UI & Event Handlers
Query: "A bakery sells cupcakes for $3. Tom buys 5 and pays with $20. How much change?"
Reasoning:
Step 1: Cost = 5 × $3 = $15
Step 2: Change = $20 - $15 = $5
Answer: $5
Confidence: high (3/3 paths agreed)
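The "3/3 paths agreed" result above comes from self-consistency voting over extracted answers. The sketch below shows one way the money-aware extraction and majority vote could fit together; `extract_answer`, `majority_vote`, and both regular expressions are assumptions for illustration, not the project's actual code.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# A dollar amount such as "$5" or "$1,250.50" is preferred over a bare number.
MONEY = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")
NUMBER = re.compile(r"-?\d+(?:,\d{3})*(?:\.\d+)?")


def extract_answer(text: str) -> Optional[str]:
    """Return the last dollar amount in the text, else the last number."""
    for pattern in (MONEY, NUMBER):
        matches = pattern.findall(text)
        if matches:
            return matches[-1]
    return None


def majority_vote(paths: List[str]) -> Tuple[Optional[str], float]:
    """Return the most common extracted answer and the share of paths agreeing."""
    answers = [a for a in map(extract_answer, paths) if a is not None]
    if not answers:
        return None, 0.0
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(paths)


if __name__ == "__main__":
    paths = [
        "Cost = 5 × $3 = $15, so change = $20 - $15 = $5",
        "Tom pays $20 and owes $15. Answer: $5",
        "Check: $15 + $5 = $20, so the change is $5",
    ]
    print(majority_vote(paths))
```

Taking the last match rather than the first reflects the assumption that a reasoning path states its final answer at the end, after intermediate amounts such as `$15`.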
Query: "What is Chain of Thought prompting?"
Thought: I need to search for information about CoT prompting
Action: web_search("Chain of Thought prompting AI")
Observation: [Search results about CoT...]
Thought: I found relevant information, let me synthesize
Answer: Chain of Thought prompting is a technique where...
Tools Used: web_search, memory_store
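The trace above follows the Thought → Action → Observation loop. A minimal sketch of such a loop is shown here with a scripted stand-in for the LLM so it runs offline; `react_loop`, the `Action: tool("argument")` syntax, and the regular expressions are assumptions for illustration and may differ from what `app.py` does.

```python
import re
from typing import Callable, Dict, List

ACTION = re.compile(r'Action:\s*(\w+)\("(.*)"\)')
ANSWER = re.compile(r"Answer:\s*(.*)", re.DOTALL)


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Dict[str, object]:
    """Alternate model output and tool calls until the model gives an answer."""
    transcript = f"Query: {question}\n"
    tools_used: List[str] = []
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        answer = ANSWER.search(reply)
        if answer:
            return {"answer": answer.group(1).strip(), "tools_used": tools_used}
        action = ACTION.search(reply)
        if action is None:
            continue  # no tool call and no answer: ask the model again
        name, argument = action.groups()
        if name in tools:
            observation = tools[name](argument)
            tools_used.append(name)
        else:
            observation = f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"
    return {"answer": None, "tools_used": tools_used}


if __name__ == "__main__":
    # Scripted stand-in for the LLM so the sketch runs without an API key.
    script = iter([
        "Thought: I need to search for information about CoT prompting\n"
        'Action: web_search("Chain of Thought prompting AI")',
        "Thought: I found relevant information, let me synthesize\n"
        "Answer: Chain of Thought prompting is a technique where...",
    ])
    result = react_loop(
        "What is Chain of Thought prompting?",
        llm=lambda transcript: next(script),
        tools={"web_search": lambda q: f"[Search results about {q}]"},
    )
    print(result)
```

The `max_steps` cap is a design choice in this sketch: it bounds API usage if the model never emits an `Answer:` line.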
Query: "Write an analysis of remote work benefits"
Planner: Created 3-section plan
Worker: Generated full analysis content
Critic: Improved clarity and added examples
Output: [Complete 500+ word analysis]
Quality Score: 8.5/10
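The Planner → Worker → Critic hand-off shown above can be sketched as three sequential calls over one LLM callable. This is an illustrative sketch, not the project's implementation: `run_pipeline`, the prompt prefixes, and the returned dictionary are hypothetical, and the quality score from the example is left out because its computation is not described.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]


def run_pipeline(task: str, llm: LLM) -> Dict[str, object]:
    """Run Planner, Worker, and Critic in sequence over a single LLM callable."""
    # Planner: ask for one section title per line.
    plan_text = llm(f"PLANNER: list the sections needed for: {task}")
    plan: List[str] = [
        line.strip("- ").strip() for line in plan_text.splitlines() if line.strip()
    ]

    # Worker: write full content for each planned section.
    sections = [llm(f"WORKER: write the section '{title}' for: {task}") for title in plan]
    draft = "\n\n".join(sections)

    # Critic: review the draft and return an improved version.
    final = llm(f"CRITIC: improve clarity and add examples:\n{draft}")
    return {"plan": plan, "draft": draft, "output": final}


if __name__ == "__main__":
    # Canned replies stand in for the real model.
    def fake_llm(prompt: str) -> str:
        if prompt.startswith("PLANNER"):
            return "- Productivity\n- Cost savings\n- Work-life balance"
        if prompt.startswith("WORKER"):
            return "Section text."
        return "Improved: " + prompt.split("\n", 1)[1]

    result = run_pipeline("Write an analysis of remote work benefits", fake_llm)
    print(result["plan"])
```

Keeping each role as a separate prompt means the Critic sees only the finished draft, which matches the "reviews and improves quality" role described earlier.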
| Paper | Authors | Year | Technique |
|---|---|---|---|
| Chain-of-Thought Prompting | Wei et al. | 2022 | Step-by-step reasoning |
| Self-Consistency | Wang et al. | 2022 | Multiple paths + voting |
| Tree of Thoughts | Yao et al. | 2023 | Tree search reasoning |
| ReAct | Yao et al. | 2022 | Reasoning + Acting |
This project showcases:
- AI/ML Engineering: LLM integration, prompt engineering, agent architectures
- Software Architecture: Clean code, modular design, error handling
- API Integration: Groq, Tavily, HuggingFace APIs
- Full-Stack Development: Gradio UI, async processing, streaming
- Production Practices: Rate limiting, graceful degradation, logging
- Research Implementation: Converting academic papers to working code
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
MIT License - see LICENSE for details.
Sai Teja Srivilli
- 📂 GitHub
- 🤗 HuggingFace
⭐ Star this repo if you find it useful!