# 🚀 Hero Project: On-Device AI Agent with Vision and RAG

**Pocket Agents: A Practical Guide to On‑Device Artificial Intelligence**

This notebook demonstrates a complete local AI agent system using:
- **Atomic Agents Framework** for agent orchestration
- **Qwen3-4B-Instruct** for vision-language processing in GGUF format
- **ChromaDB** for vector storage and RAG
- **Local processing** - everything runs on your device!

## 🎯 What You'll Learn

1. **RAG Q&A**: Answer questions using a knowledge base with optional image context
2. **Vision Capabilities**: Process uploaded images for analysis
3. **Task Automation**: Execute file operations and system tasks
4. **Agent Orchestration**: Build composable AI agents
5. **Local Deployment**: Run the complete AI system on-device

## 🚀 Quick Setup

Run the setup script: `./setup_and_test.sh`

## 📋 Learning Flow

1. **Model Loading**: Load Qwen3-4B-Instruct with vision support
2. **Vector Store**: Set up ChromaDB with sample knowledge base
3. **RAG Agent**: Test question answering with retrieval
4. **Task Agent**: Test task automation with tools
5. **Vision Demo**: Test image analysis capabilities
6. **Performance**: Measure system performance
7. **Gradio UI**: Launch interactive web interface


In [None]:
# KERNEL CHECK - Make sure you're using the correct Python environment
import sys
import os

print("🐍 Python Environment Check")
print("=" * 50)
print(f"Python version: {sys.version}")
print(f"Python executable: {sys.executable}")
print(f"Current working directory: {os.getcwd()}")

# Change to the hero-project root (the parent of the notebooks/ directory).
# Note: __file__ is not defined inside a notebook, so start from the current
# working directory. Adjust this path based on your setup.
notebook_dir = os.getcwd()
hero_project_dir = os.path.abspath(os.path.join(notebook_dir, '..'))
os.chdir(hero_project_dir)
print(f"✅ Changed to: {os.getcwd()}")

# Check if we're in the right directory
if not os.path.exists('src/model_loader.py'):
    print("⚠️ Warning: Not in hero-project directory. Please run from companion-code/hero-project/")
else:
    print("✅ In correct directory")

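# Add src/ and the project root to the import path.
# Absolute paths keep imports working even if the working directory changes later.
project_root = os.getcwd()
for path in (os.path.join(project_root, 'src'), project_root):
    if path not in sys.path:
        sys.path.append(path)
print("✅ Added src and current directory to Python path")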


🐍 Python Environment Check
Python version: 3.11.9 (main, May 12 2025, 23:53:03) [Clang 17.0.0 (clang-1700.0.13.3)]
Python executable: /Users/freddyayala/Documents/GitHub/slm-ebook/companion-code/chapters/chapter-08/venv/bin/python3
Current working directory: /Users/freddyayala/Documents/GitHub/slm-ebook/companion-code/hero-project/notebooks
✅ Added src to Python path


In [None]:
# Import all required libraries
import warnings
warnings.filterwarnings("ignore")

import torch
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import time
import os

# Import our custom modules
from src.model_loader import Qwen3VLLoader
from src.vector_store import VectorStore
from src.agents.rag_agent import RAGAgent
from src.agents.task_agent import TaskAgent

print("✅ All imports successful!")
print(f"   PyTorch: {torch.__version__}")
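# Report the best available PyTorch backend (CUDA, Apple Silicon MPS, or CPU)
mps_backend = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
if torch.cuda.is_available():
    device_name = "CUDA"
elif mps_backend is not None and mps_backend.is_available():
    device_name = "MPS"
else:
    device_name = "CPU"
print(f"   Device: {device_name}")
if torch.cuda.is_available():
    print(f"   GPU: {torch.cuda.get_device_name(0)}")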


ModuleNotFoundError: No module named 'model_loader'

## 🧠 STEP 1: LOAD QWEN3-4B-INSTRUCT MODEL

Load the Qwen3-4B-Instruct model for text generation and tool calling.


In [None]:
# Initialize and load the model
print("🔄 Loading Qwen3-4B-Instruct model...")
print("=" * 60)

model_loader = Qwen3VLLoader()

# Get model information
model_info = model_loader.get_model_info()
print("\n📊 Model Information:")
for key, value in model_info.items():
    print(f"   {key}: {value}")

print("\n✅ Model loaded successfully!")


In [None]:
# Test basic text generation
print("🧪 Testing basic text generation...")
print("=" * 50)

test_messages = [
    {"role": "user", "content": "Hello! Can you tell me about small language models?"}
]

start_time = time.time()
response = model_loader.generate_response(test_messages)
end_time = time.time()

print("🤖 Model Response:")
print(response)
print(f"\n⏱️ Generation time: {end_time - start_time:.2f} seconds")
print(f"📝 Response length: {len(response)} characters")
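
The cell above times a single call with `time.time()`. For the performance step, a slightly more robust measurement could look like the sketch below. It uses `time.perf_counter()`, discards a warm-up call, and reports the median over a few runs. The helper `time_call` and the stub `fake_generate` are hypothetical names introduced here for illustration; in the notebook you would pass `model_loader.generate_response` instead of the stub.

```python
import statistics
import time


def time_call(fn, *args, warmup=1, runs=3, **kwargs):
    """Call fn several times and summarize wall-clock latency in seconds."""
    for _ in range(warmup):  # warm-up calls are executed but not timed
        fn(*args, **kwargs)

    timings = []
    result = None
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        result = fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)

    stats = {
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }
    return result, stats


# Hypothetical stand-in for model_loader.generate_response (assumption)
def fake_generate(messages):
    time.sleep(0.01)  # simulate a small amount of work
    return f"echo: {messages[-1]['content']}"


response, stats = time_call(fake_generate, [{"role": "user", "content": "Hello!"}])
print(response)
print(f"median {stats['median_s']:.3f}s (min {stats['min_s']:.3f}s, max {stats['max_s']:.3f}s)")
```

The median is less sensitive than a single reading to one-off delays such as first-call cache warm-up.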


## 🗄️ STEP 2: SETUP VECTOR STORE WITH CHROMADB

Create a knowledge base using ChromaDB for RAG (Retrieval-Augmented Generation).
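
To make the retrieval idea concrete, here is a minimal, dependency-free sketch of similarity search. It is not how ChromaDB or the project's `VectorStore` is implemented: the word-count "embedding" is a toy stand-in for a real embedding model, and the function names and sample documents are hypothetical. The principle is the same, though: embed the query, score each document, and return the closest matches.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list, n_results: int = 2) -> list:
    """Return (score, document) pairs for the n_results best matches, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:n_results]


# Hypothetical sample documents, for illustration only
docs = [
    "Small language models run on local devices with limited memory.",
    "ChromaDB stores embeddings and returns the nearest documents.",
    "Gradio builds simple web interfaces for machine learning demos.",
]

top = search("What are small language models?", docs, n_results=2)
for score, doc in top:
    print(f"{score:.2f}  {doc}")
```

A real vector store replaces `embed` with a learned embedding model and replaces the linear scan with an index, but the query flow is the same.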


In [None]:
# Initialize vector store
print("🔄 Setting up ChromaDB vector store...")
print("=" * 50)

vector_store = VectorStore()

# Create sample documents
print("\n📚 Creating sample knowledge base...")
vector_store.save_sample_documents()

# Get collection info
collection_info = vector_store.get_collection_info()
print("\n📊 Vector Store Information:")
for key, value in collection_info.items():
    print(f"   {key}: {value}")

print("\n✅ Vector store ready!")


In [None]:
# Test vector search
print("🔍 Testing vector search...")
print("=" * 40)

test_query = "What are small language models?"
search_results = vector_store.search(test_query, n_results=2)

print(f"Query: {test_query}")
print(f"Found {search_results['count']} relevant documents:")
print()

for i, (doc, metadata) in enumerate(zip(search_results['documents'], search_results['metadatas']), 1):
    print(f"📄 Document {i}:")
    print(f"   Source: {metadata.get('filename', 'Unknown')}")
    print(f"   Content: {doc[:200]}...")
    print()

print("✅ Vector search working!")


## 🤖 STEP 3: INITIALIZE RAG AGENT

Create the RAG agent that combines retrieval with generation.
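
As a rough sketch of what "combines retrieval with generation" means, the code below retrieves passages, places them in a system prompt, and asks the model to answer from that context. This is an assumption about the general pattern, not the actual `RAGAgent` implementation. `build_rag_messages`, `rag_answer`, and the two stubs are hypothetical names; the returned dictionary mirrors the `answer` and `context` keys used in this notebook.

```python
def build_rag_messages(question, passages):
    """Pack retrieved passages and the user question into chat messages."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    system_prompt = (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}"
    )
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
    return context, messages


def rag_answer(question, retrieve, generate):
    """Retrieve passages, then generate an answer grounded in them."""
    passages = retrieve(question)
    context, messages = build_rag_messages(question, passages)
    return {"answer": generate(messages), "context": context}


# Hypothetical stubs standing in for the vector store and the model (assumptions)
def fake_retrieve(question):
    return [
        "Small language models can run on local devices.",
        "Local processing keeps user data on the device.",
    ]


def fake_generate(messages):
    return f"(model output for: {messages[-1]['content']})"


result = rag_answer(
    "What are the benefits of small language models?", fake_retrieve, fake_generate
)
print(result["answer"])
print(f"Context used: {len(result['context'])} characters")
```

Passing `retrieve` and `generate` as callables keeps the orchestration logic testable without loading a model.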


In [37]:
# Initialize RAG agent
print("🔄 Initializing RAG Agent...")
print("=" * 40)

rag_agent = RAGAgent(model_loader, vector_store)
print("✅ RAG Agent initialized!")

# Test RAG agent with a question
print("\n🧪 Testing RAG Agent...")
test_question = "What is artificial intelligence?"
result = rag_agent.run(test_question)

print(f"\n🤖 Question: {test_question}")
print(f"🤖 Answer: {result['answer']}")
print(f"📚 Context used: {len(result['context'])} characters")

print("\n✅ RAG Agent working!")


🔄 Initializing RAG Agent...
✅ RAG Agent initialized
✅ RAG Agent initialized!

🧪 Testing RAG Agent...
🔍 Searching knowledge base for: 'What is artificial intelligence?'
🤖 Generating response...

🤖 Question: What is artificial intelligence?
🤖 Answer: Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence. These tasks include learning, reasoning, problem-solving, perception, and language understanding.
📚 Context used: 1201 characters

✅ RAG Agent working!

## 🔧 STEP 4: INITIALIZE TASK AGENT

Create the task agent that carries out instructions by calling tools such as `file_write`.


In [38]:
# Initialize Task Agent
print("🔄 Initializing Task Agent...")
print("=" * 40)

task_agent = TaskAgent(model_loader)
print("✅ Task Agent initialized!")

# Test Task agent with file operations
print("\n🧪 Testing Task Agent...")
test_task = "Create a file called test_agent.txt with the content: Hello from the AI agent!"
result = task_agent.run(test_task)

print(f"\n🤖 Task: {test_task}")
print(f"🤖 Result: {result['result']}")
print(f"🔧 Tools used: {result['tools_used']}")

print("\n✅ Task Agent working!")


🔄 Initializing Task Agent...
✅ Task Agent initialized
✅ Task Agent initialized!

🧪 Testing Task Agent...
🤖 Processing task: 'Create a file called test_agent.txt with the content: Hello from the AI agent!'
🔧 Agent decided to use tool: file_write

🤖 Task: Create a file called test_agent.txt with the content: Hello from the AI agent!
🤖 Result: The file `test_agent.txt` has been successfully created with the content: "Hello from the AI agent!"
🔧 Tools used: ['file_write']

✅ Task Agent working!
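
The log above shows the agent selecting the `file_write` tool. One common way to wire this up is a registry that maps tool names to Python functions, plus a dispatcher that executes the call the model emits. The sketch below assumes the model returns a JSON object with `tool` and `arguments` fields. That format, the `TOOLS` registry, and `run_tool_call` are hypothetical and may differ from how `TaskAgent` works internally.

```python
import json
import tempfile
from pathlib import Path


def file_write(path: str, content: str) -> str:
    """Write content to path and return a short status message."""
    Path(path).write_text(content, encoding="utf-8")
    return f"Wrote {len(content)} characters to {path}"


# Registry of callable tools, keyed by the name the model is told about
TOOLS = {"file_write": file_write}


def run_tool_call(raw_call: str) -> dict:
    """Parse a JSON tool call (assumed format) and execute the matching tool."""
    call = json.loads(raw_call)
    name = call["tool"]
    if name not in TOOLS:
        return {"result": f"Unknown tool: {name}", "tools_used": []}
    output = TOOLS[name](**call["arguments"])
    return {"result": output, "tools_used": [name]}


# Simulate the model's decision, writing into a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "test_agent.txt"
    model_output = json.dumps(
        {
            "tool": "file_write",
            "arguments": {"path": str(target), "content": "Hello from the AI agent!"},
        }
    )
    outcome = run_tool_call(model_output)
    written = target.read_text(encoding="utf-8")

print(outcome["tools_used"])
print(written)
```

Rejecting unknown tool names before dispatch keeps the model from invoking anything outside the registry.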


In [39]:
# 🎉 FINAL DEMONSTRATION
print("🎯 HERO PROJECT: COMPLETE DEMONSTRATION")
print("=" * 60)

# Test 1: RAG Agent
print("\n🔍 DEMO 1: RAG Agent Question Answering")
print("-" * 50)
rag_question = "What are the benefits of small language models?"
rag_result = rag_agent.run(rag_question)
print(f"Question: {rag_question}")
print(f"Answer: {rag_result['answer'][:200]}...")

# Test 2: Task Agent
print("\n🔧 DEMO 2: Task Agent File Operations")
print("-" * 50)
task_instruction = "Create a file called hero_demo.txt with a summary of what we learned"
task_result = task_agent.run(task_instruction)
print(f"Task: {task_instruction}")
print(f"Result: {task_result['result']}")

# Test 3: Verify file creation
print("\n📁 DEMO 3: Verify File Creation")
print("-" * 50)
try:
    with open('hero_demo.txt', 'r', encoding='utf-8') as f:
        content = f.read()
    file_ok = True
    print("✅ File created successfully!")
    print(f"Content: {content[:100]}...")
except FileNotFoundError:
    file_ok = False
    print("❌ File not found")

print("\n🎉 HERO PROJECT DEMONSTRATION COMPLETE!")
print("✅ RAG Agent: Working")
print("✅ Task Agent: Working")
# Report file operations based on the check above instead of assuming success
print(f"{'✅' if file_ok else '❌'} File Operations: {'Working' if file_ok else 'Failed'}")
print("✅ Real Agentic AI: Achieved!")


🎯 HERO PROJECT: COMPLETE DEMONSTRATION

🔍 DEMO 1: RAG Agent Question Answering
--------------------------------------------------
🔍 Searching knowledge base for: 'What are the benefits of small language models?'
🤖 Generating response...
Question: What are the benefits of small language models?
Answer: The benefits of small language models (SLMs) include:

- **Local processing and privacy**: Models can run on local devices, reducing the need to send data to remote servers and enhancing user privacy....

🔧 DEMO 2: Task Agent File Operations
--------------------------------------------------
🤖 Processing task: 'Create a file called hero_demo.txt with a summary of what we learned'
🔧 Agent decided to use tool: file_write
Task: Create a file called hero_demo.txt with a summary of what we learned
Result: The file `hero_demo.txt` has been successfully created with a summary of what we learned. The summary includes key aspects of autonomous decision-making, tool usage, proactive behavior, infor