AI Agent from Scratch

A fully functional AI agent built from scratch in Python, without using any agent frameworks. The agent features conversational memory, tool use, and an agentic reasoning loop powered by Google's Gemini 2.0 Flash.

🌟 Features

  • 💬 Conversational Memory: Maintains context across conversations with configurable history limits
  • 🔧 Tool Use: Extensible tool system with built-in tools (calculator, time, note-taking)
  • 🤖 Agentic Loop: Multi-step reasoning with autonomous tool selection and execution
  • 💾 Persistence: Save and load conversation history
  • 🎨 Interactive CLI: User-friendly command-line interface
  • 📦 Fast Setup: Uses uv for blazing-fast package management

๐Ÿ—๏ธ Architecture

ai-agent-scratch/
├── agent.py           # Core agent logic and Gemini integration
├── memory.py          # Conversation history and context management
├── tools.py           # Tool registry and execution
├── main.py            # Interactive CLI interface
├── test_gemini.py     # Test suite
├── .env               # Environment variables (API keys)
└── README.md          # This file

Components

  1. Agent Core (agent.py): Orchestrates the entire system, manages API calls to Gemini, and handles the agentic loop
  2. Memory System (memory.py): Stores conversation history, manages context, and provides long-term fact storage
  3. Tool Registry (tools.py): Defines available tools, handles tool execution, and manages tool schemas
  4. Main Interface (main.py): Provides interactive CLI and programmatic usage examples
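
As a rough sketch of how these components might be wired together inside AIAgent (illustrative only; the actual constructor in agent.py may differ):

# Illustrative sketch of the composition, not the repository's exact code
import google.generativeai as genai
from memory import Memory
from tools import ToolRegistry

class AIAgent:
    def __init__(self, api_key, model="gemini-2.0-flash-exp", system_prompt=None, max_tokens=8192):
        genai.configure(api_key=api_key)              # set up the Gemini client
        self.model = genai.GenerativeModel(model)     # model handle used for all calls
        self.system_prompt = system_prompt or "You are a helpful assistant."
        self.max_tokens = max_tokens
        self.memory = Memory(max_history=50)          # conversation context (memory.py)
        self.tools = ToolRegistry()                   # built-in and custom tools (tools.py)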

🚀 Quick Start

Prerequisites

  • Python 3 (the commands below use python3)
  • A Google API key for the Gemini API (stored as GOOGLE_API_KEY)

Installation

  1. Install uv (fast Python package manager):
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc  # or ~/.zshrc
  2. Clone or create the project:
mkdir ai-agent-scratch
cd ai-agent-scratch
  3. Set up the project:
# Initialize with uv
uv init

# Create virtual environment
uv venv

# Activate virtual environment
source .venv/bin/activate  # On Linux/macOS
# .venv\Scripts\activate   # On Windows

# Install dependencies
uv pip install google-generativeai python-dotenv
  4. Configure API key:

Create a .env file:

echo "GOOGLE_API_KEY=your_api_key_here" > .env

Replace your_api_key_here with your actual Google API key.

  5. Copy the code files (agent.py, memory.py, tools.py, main.py) into your project directory.

Running the Agent

# Interactive mode
uv run main.py
# or
python3 main.py

# Run tests
python3 test_gemini.py

💡 Usage Examples

Interactive Mode

$ python3 main.py

╔═════════════════════════════════════╗
║   AI Agent - Built from Scratch     ║
║   Type 'quit' to exit               ║
║   Type 'clear' to clear history     ║
║   Type 'history' to see history     ║
║   Type 'save' to save conversation  ║
╚═════════════════════════════════════╝

You: What is 25 * 47?
Assistant: Let me calculate that for you. 25 × 47 = 1,175

You: What time is it?
Assistant: The current time is 2024-10-10 14:32:15

You: Save a note titled "Meeting" with content "Discuss Q4 goals"
Assistant: I've saved your note titled "Meeting" successfully.

Programmatic Usage

from agent import AIAgent
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")

# Initialize agent
agent = AIAgent(api_key=api_key)

# Simple conversation
response = agent.chat("Hello! What can you help me with?")
print(response)

# Use tools
response = agent.chat("Calculate the square root of 144")
print(response)

# Multi-step reasoning
response = agent.chat(
    "Calculate 15 * 23, then take the square root, then tell me the time"
)
print(response)

Adding Custom Tools

# Define your custom tool
def reverse_string(text: str) -> str:
    """Reverse a string."""
    return text[::-1]

# Add it to the agent
agent.add_tool(
    name="reverse_string",
    function=reverse_string,
    description="Reverses the characters in a string",
    parameters={
        "type": "object",
        "properties": {
            "text": {
                "type": "string",
                "description": "The text to reverse"
            }
        },
        "required": ["text"]
    }
)

# Use the custom tool
response = agent.chat("Reverse the string 'Hello World'")
print(response)  # Will use your custom tool

🛠️ Built-in Tools

Calculator

Performs mathematical calculations with support for basic arithmetic and common math functions.

Example: "What is sqrt(144) + 10?"

Get Current Time

Returns the current date and time.

Example: "What time is it?"

Save Note

Saves notes to a file for later reference.

Example: "Save a note titled 'Ideas' with content 'Build a chatbot'"

🎮 CLI Commands

While in interactive mode:

  • quit - Exit the application
  • clear - Clear conversation history
  • history - Display conversation history
  • save - Save conversation to JSON file
  • verbose - Enable detailed logging for next message
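
Under the hood, main.py presumably dispatches these commands in a simple loop. A minimal sketch, assuming an agent instance created as in the programmatic example above and that history entries carry role/content keys (both assumptions):

# Illustrative command loop; the real main.py may differ
while True:
    user_input = input("You: ").strip()
    command = user_input.lower()

    if command == "quit":
        break
    elif command == "clear":
        agent.clear_memory()
        print("History cleared.")
    elif command == "history":
        for message in agent.get_history():
            print(f"{message['role']}: {message['content']}")
    elif command == "save":
        agent.save_conversation("conversation.json")
        print("Conversation saved to conversation.json")
    elif command == "verbose":
        # verbose applies to the next message only
        reply = agent.chat(input("You (verbose): "), verbose=True)
        print(f"Assistant: {reply}")
    else:
        print(f"Assistant: {agent.chat(user_input)}")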

🧪 Testing

Run the comprehensive test suite:

python3 test_gemini.py

Tests included:

  • Basic conversation
  • Calculator tool
  • Time tool
  • Note saving
  • Multi-step reasoning
  • Conversation memory
  • Custom tool addition
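
The tests themselves are not reproduced here, but a check in the same spirit as test_gemini.py might look like this (a loose assertion, since model output can vary):

# Illustrative test, not the actual contents of test_gemini.py
import os
from dotenv import load_dotenv
from agent import AIAgent

load_dotenv()
agent = AIAgent(api_key=os.getenv("GOOGLE_API_KEY"))

def test_calculator_tool():
    reply = agent.chat("What is 25 * 47?")
    assert "1175" in reply.replace(",", ""), f"Unexpected reply: {reply}"
    print("calculator tool: OK")

if __name__ == "__main__":
    test_calculator_tool()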

🔧 Configuration

Changing the Model

# Use different Gemini models
agent = AIAgent(
    api_key=api_key,
    model="gemini-2.0-flash-exp"  # Fast and efficient (default)
    # model="gemini-1.5-pro"      # More capable
    # model="gemini-1.5-flash"    # Faster, simpler tasks
)

Customizing System Prompt

agent = AIAgent(
    api_key=api_key,
    system_prompt="""You are a specialized assistant for data analysis.
    Focus on providing accurate statistical insights and visualizations."""
)

Memory Configuration

from memory import Memory

# Adjust memory limits
memory = Memory(max_history=100)  # Keep last 100 messages
agent.memory = memory

📚 API Reference

AIAgent Class

AIAgent(
    api_key: str,
    model: str = "gemini-2.0-flash-exp",
    system_prompt: str = None,
    max_tokens: int = 8192
)

Methods

  • chat(user_message: str, verbose: bool = False) -> str: Send a message and get response
  • add_tool(name, function, description, parameters): Register a custom tool
  • clear_memory(): Clear conversation history
  • get_history() -> List[Dict]: Retrieve conversation history
  • save_conversation(filename: str): Save conversation to file
  • load_conversation(filename: str): Load conversation from file
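
For example, a session can be persisted and restored across runs using the methods above (the filename is illustrative, and api_key is the key loaded in the earlier example):

agent.chat("Remember that my favorite color is blue.")
agent.save_conversation("session.json")      # write the conversation to disk

# later, in a new process
restored = AIAgent(api_key=api_key)
restored.load_conversation("session.json")   # reload the earlier context
print(restored.chat("What is my favorite color?"))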

Memory Class

Memory(max_history: int = 50)

Methods

  • add_message(role: str, content: str, metadata: Dict = None): Add message to history
  • get_history(last_n: int = None) -> List[Dict]: Get conversation history
  • store_fact(key: str, value: Any): Store long-term fact
  • get_fact(key: str) -> Any: Retrieve stored fact
  • clear_history(): Clear all history
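
A short usage example based on the methods listed above (behavior details, such as whether stored facts survive a history clear, are assumptions):

from memory import Memory

memory = Memory(max_history=100)
memory.add_message("user", "Hi, I'm Priya.")
memory.store_fact("user_name", "Priya")          # long-term fact, kept alongside the rolling history
print(memory.get_fact("user_name"))              # -> Priya
print(len(memory.get_history(last_n=10)))        # up to the 10 most recent messages
memory.clear_history()                           # clears the conversation history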

ToolRegistry Class

ToolRegistry()

Methods

  • register_tool(name, function, description, parameters): Register a new tool
  • execute_tool(tool_name: str, tool_input: Dict) -> str: Execute a tool
  • get_tool_definitions() -> List[Dict]: Get all tool definitions
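
Standalone usage following the signatures above (the "shout" tool is a made-up example):

from tools import ToolRegistry

registry = ToolRegistry()
registry.register_tool(
    name="shout",
    function=lambda text: text.upper(),
    description="Upper-cases a string",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to upper-case"}},
        "required": ["text"],
    },
)
print(registry.execute_tool("shout", {"text": "hello"}))   # -> HELLO
print(len(registry.get_tool_definitions()))                # built-ins plus the new tool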

🔍 How It Works

The Agentic Loop

graph TD
    A[User Input] --> B[Send to Gemini with Tools]
    B --> C{Response Type?}
    C -->|Text| D[Return Response]
    C -->|Function Call| E[Execute Tool]
    E --> F[Send Results Back]
    F --> B
    D --> G[Add to Memory]
  1. Input: User sends a message
  2. Processing: Agent sends message to Gemini with available tools
  3. Decision: Gemini decides whether to use tools or respond directly
  4. Tool Use: If tools are needed, execute them and send results back
  5. Iteration: Continue until Gemini provides final response
  6. Memory: Store conversation in memory for context
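
In code, the loop in agent.py might look roughly like the sketch below; call_gemini and extract_function_call are illustrative placeholders for the Gemini request/response handling, not the repository's actual helpers:

# Illustrative agentic loop -- helper names are placeholders, not the repo's real internals
def run_turn(user_message, memory, tools, max_steps=5):
    memory.add_message("user", user_message)                     # 1. input
    messages = memory.get_history()

    for _ in range(max_steps):
        # 2. send the conversation plus tool definitions to Gemini
        response = call_gemini(messages, tools.get_tool_definitions())

        # 3. Gemini either answers in text or requests a function call
        call = extract_function_call(response)
        if call is None:
            memory.add_message("assistant", response.text)       # 6. store the final answer
            return response.text

        # 4. execute the requested tool and feed the result back
        result = tools.execute_tool(call["name"], call["args"])
        messages.append({"role": "tool", "name": call["name"], "content": result})
        # 5. iterate so the model can use the result or request another tool

    return "Stopped after reaching the maximum number of reasoning steps."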

🚨 Troubleshooting

"Command 'python' not found"

Use python3 instead:

python3 main.py

Or install python-is-python3:

sudo apt install python-is-python3

"GOOGLE_API_KEY not found"

Make sure your .env file exists and contains:

GOOGLE_API_KEY=your_actual_key_here

"Module not found" errors

Install dependencies:

uv pip install google-generativeai python-dotenv

API Rate Limits

Gemini has rate limits. If you hit them:

  • Wait a few moments between requests
  • Consider upgrading your API plan
  • Implement retry logic with exponential backoff (see the sketch below)
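
A minimal backoff wrapper around agent.chat; the exception type to catch depends on how agent.py surfaces API errors, so a broad except is used here for illustration:

import time

def chat_with_retry(agent, message, max_retries=3, base_delay=2.0):
    """Retry agent.chat with exponential backoff on transient API errors."""
    for attempt in range(max_retries):
        try:
            return agent.chat(message)
        except Exception as exc:                       # narrow this to the real API error type
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt)        # 2s, 4s, 8s, ...
            print(f"Request failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)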

🎯 Future Enhancements

  • Add web search tool
  • Implement RAG (Retrieval Augmented Generation)
  • Add multi-agent collaboration
  • Create web interface with Gradio/Streamlit
  • Add voice input/output
  • Implement persistent storage (database)
  • Add logging and monitoring
  • Create Docker container

📄 License

This project is open source and available under the MIT License.

🤝 Contributing

Contributions are welcome! Feel free to:

  • Add new tools
  • Improve the agentic loop
  • Enhance memory management
  • Add new features
  • Fix bugs

🙏 Acknowledgments

  • Built with Google Gemini API
  • Package management by uv
  • Inspired by modern AI agent architectures

📞 Support

For issues, questions, or suggestions, please open an issue on the GitHub repository.


Built from scratch with ❤️ and Python
