AI Agent from Scratch

A fully functional AI agent built from scratch in Python, without using any agent frameworks. The agent features conversational memory, tool use, and an agentic reasoning loop powered by Google's Gemini 2.0 Flash.

🌟 Features

  • 💬 Conversational Memory: Maintains context across conversations with configurable history limits
  • 🔧 Tool Use: Extensible tool system with built-in tools (calculator, time, note-taking)
  • 🤖 Agentic Loop: Multi-step reasoning with autonomous tool selection and execution
  • 💾 Persistence: Save and load conversation history
  • 🎨 Interactive CLI: User-friendly command-line interface
  • 📦 Fast Setup: Uses uv for blazing-fast package management

๐Ÿ—๏ธ Architecture

ai-agent-scratch/
├── agent.py           # Core agent logic and Gemini integration
├── memory.py          # Conversation history and context management
├── tools.py           # Tool registry and execution
├── main.py            # Interactive CLI interface
├── test_gemini.py     # Test suite
├── .env               # Environment variables (API keys)
└── README.md          # This file

Components

  1. Agent Core (agent.py): Orchestrates the entire system, manages API calls to Gemini, and handles the agentic loop
  2. Memory System (memory.py): Stores conversation history, manages context, and provides long-term fact storage
  3. Tool Registry (tools.py): Defines available tools, handles tool execution, and manages tool schemas
  4. Main Interface (main.py): Provides interactive CLI and programmatic usage examples
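
As a rough sketch of how these components might be wired together inside AIAgent (illustrative only; the actual constructor in agent.py may differ):

# Illustrative sketch of the composition, not the repository's exact code
import google.generativeai as genai
from memory import Memory
from tools import ToolRegistry

class AIAgent:
    def __init__(self, api_key, model="gemini-2.0-flash-exp", system_prompt=None, max_tokens=8192):
        genai.configure(api_key=api_key)              # set up the Gemini client
        self.model = genai.GenerativeModel(model)     # model handle used for all calls
        self.system_prompt = system_prompt or "You are a helpful assistant."
        self.max_tokens = max_tokens
        self.memory = Memory(max_history=50)          # conversation context (memory.py)
        self.tools = ToolRegistry()                   # built-in and custom tools (tools.py)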

🚀 Quick Start

Prerequisites

  • Python 3 (the commands below use python3)
  • A Google API key for the Gemini API (stored as GOOGLE_API_KEY)

Installation

  1. Install uv (fast Python package manager):
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc  # or ~/.zshrc
  2. Clone or create the project:
mkdir ai-agent-scratch
cd ai-agent-scratch
  3. Set up the project:
# Initialize with uv
uv init

# Create virtual environment
uv venv

# Activate virtual environment
source .venv/bin/activate  # On Linux/macOS
# .venv\Scripts\activate   # On Windows

# Install dependencies
uv pip install google-generativeai python-dotenv
  4. Configure API key:

Create a .env file:

echo "GOOGLE_API_KEY=your_api_key_here" > .env

Replace your_api_key_here with your actual Google API key.

  5. Copy the code files (agent.py, memory.py, tools.py, main.py) into your project directory.

Running the Agent

# Interactive mode
uv run main.py
# or
python3 main.py

# Run tests
python3 test_gemini.py

💡 Usage Examples

Interactive Mode

$ python3 main.py

╔═════════════════════════════════════╗
║   AI Agent - Built from Scratch     ║
║   Type 'quit' to exit               ║
║   Type 'clear' to clear history     ║
║   Type 'history' to see history     ║
║   Type 'save' to save conversation  ║
╚═════════════════════════════════════╝

You: What is 25 * 47?
Assistant: Let me calculate that for you. 25 × 47 = 1,175

You: What time is it?
Assistant: The current time is 2024-10-10 14:32:15

You: Save a note titled "Meeting" with content "Discuss Q4 goals"
Assistant: I've saved your note titled "Meeting" successfully.

Programmatic Usage

from agent import AIAgent
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")

# Initialize agent
agent = AIAgent(api_key=api_key)

# Simple conversation
response = agent.chat("Hello! What can you help me with?")
print(response)

# Use tools
response = agent.chat("Calculate the square root of 144")
print(response)

# Multi-step reasoning
response = agent.chat(
    "Calculate 15 * 23, then take the square root, then tell me the time"
)
print(response)

Adding Custom Tools

# Define your custom tool
def reverse_string(text: str) -> str:
    """Reverse a string."""
    return text[::-1]

# Add it to the agent
agent.add_tool(
    name="reverse_string",
    function=reverse_string,
    description="Reverses the characters in a string",
    parameters={
        "type": "object",
        "properties": {
            "text": {
                "type": "string",
                "description": "The text to reverse"
            }
        },
        "required": ["text"]
    }
)

# Use the custom tool
response = agent.chat("Reverse the string 'Hello World'")
print(response)  # Will use your custom tool

🛠️ Built-in Tools

Calculator

Performs mathematical calculations with support for basic arithmetic and common math functions.

Example: "What is sqrt(144) + 10?"

Get Current Time

Returns the current date and time.

Example: "What time is it?"

Save Note

Saves notes to a file for later reference.

Example: "Save a note titled 'Ideas' with content 'Build a chatbot'"

🎮 CLI Commands

While in interactive mode:

  • quit - Exit the application
  • clear - Clear conversation history
  • history - Display conversation history
  • save - Save conversation to JSON file
  • verbose - Enable detailed logging for next message
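
Under the hood, main.py presumably dispatches these commands in a simple loop. A minimal sketch, assuming an agent instance created as in the programmatic example above and that history entries carry role/content keys (both assumptions):

# Illustrative command loop; the real main.py may differ
while True:
    user_input = input("You: ").strip()
    command = user_input.lower()

    if command == "quit":
        break
    elif command == "clear":
        agent.clear_memory()
        print("History cleared.")
    elif command == "history":
        for message in agent.get_history():
            print(f"{message['role']}: {message['content']}")
    elif command == "save":
        agent.save_conversation("conversation.json")
        print("Conversation saved to conversation.json")
    elif command == "verbose":
        # verbose applies to the next message only
        reply = agent.chat(input("You (verbose): "), verbose=True)
        print(f"Assistant: {reply}")
    else:
        print(f"Assistant: {agent.chat(user_input)}")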

🧪 Testing

Run the comprehensive test suite:

python3 test_gemini.py

Tests included:

  • Basic conversation
  • Calculator tool
  • Time tool
  • Note saving
  • Multi-step reasoning
  • Conversation memory
  • Custom tool addition
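
The tests themselves are not reproduced here, but a check in the same spirit as test_gemini.py might look like this (a loose assertion, since model output can vary):

# Illustrative test, not the actual contents of test_gemini.py
import os
from dotenv import load_dotenv
from agent import AIAgent

load_dotenv()
agent = AIAgent(api_key=os.getenv("GOOGLE_API_KEY"))

def test_calculator_tool():
    reply = agent.chat("What is 25 * 47?")
    assert "1175" in reply.replace(",", ""), f"Unexpected reply: {reply}"
    print("calculator tool: OK")

if __name__ == "__main__":
    test_calculator_tool()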

🔧 Configuration

Changing the Model

# Use different Gemini models
agent = AIAgent(
    api_key=api_key,
    model="gemini-2.0-flash-exp"  # Fast and efficient (default)
    # model="gemini-1.5-pro"      # More capable
    # model="gemini-1.5-flash"    # Faster, simpler tasks
)

Customizing System Prompt

agent = AIAgent(
    api_key=api_key,
    system_prompt="""You are a specialized assistant for data analysis.
    Focus on providing accurate statistical insights and visualizations."""
)

Memory Configuration

from memory import Memory

# Adjust memory limits
memory = Memory(max_history=100)  # Keep last 100 messages
agent.memory = memory

📚 API Reference

AIAgent Class

AIAgent(
    api_key: str,
    model: str = "gemini-2.0-flash-exp",
    system_prompt: str = None,
    max_tokens: int = 8192
)

Methods

  • chat(user_message: str, verbose: bool = False) -> str: Send a message and get response
  • add_tool(name, function, description, parameters): Register a custom tool
  • clear_memory(): Clear conversation history
  • get_history() -> List[Dict]: Retrieve conversation history
  • save_conversation(filename: str): Save conversation to file
  • load_conversation(filename: str): Load conversation from file
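
For example, a session can be persisted and restored across runs using the methods above (the filename is illustrative, and api_key is the key loaded in the earlier example):

agent.chat("Remember that my favorite color is blue.")
agent.save_conversation("session.json")      # write the conversation to disk

# later, in a new process
restored = AIAgent(api_key=api_key)
restored.load_conversation("session.json")   # reload the earlier context
print(restored.chat("What is my favorite color?"))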

Memory Class

Memory(max_history: int = 50)

Methods

  • add_message(role: str, content: str, metadata: Dict = None): Add message to history
  • get_history(last_n: int = None) -> List[Dict]: Get conversation history
  • store_fact(key: str, value: Any): Store long-term fact
  • get_fact(key: str) -> Any: Retrieve stored fact
  • clear_history(): Clear all history
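
A short usage example based on the methods listed above (behavior details, such as whether stored facts survive a history clear, are assumptions):

from memory import Memory

memory = Memory(max_history=100)
memory.add_message("user", "Hi, I'm Priya.")
memory.store_fact("user_name", "Priya")          # long-term fact, kept alongside the rolling history
print(memory.get_fact("user_name"))              # -> Priya
print(len(memory.get_history(last_n=10)))        # up to the 10 most recent messages
memory.clear_history()                           # clears the conversation history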

ToolRegistry Class

ToolRegistry()

Methods

  • register_tool(name, function, description, parameters): Register a new tool
  • execute_tool(tool_name: str, tool_input: Dict) -> str: Execute a tool
  • get_tool_definitions() -> List[Dict]: Get all tool definitions
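
Standalone usage following the signatures above (the "shout" tool is a made-up example):

from tools import ToolRegistry

registry = ToolRegistry()
registry.register_tool(
    name="shout",
    function=lambda text: text.upper(),
    description="Upper-cases a string",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to upper-case"}},
        "required": ["text"],
    },
)
print(registry.execute_tool("shout", {"text": "hello"}))   # -> HELLO
print(len(registry.get_tool_definitions()))                # built-ins plus the new tool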

🔍 How It Works

The Agentic Loop

graph TD
    A[User Input] --> B[Send to Gemini with Tools]
    B --> C{Response Type?}
    C -->|Text| D[Return Response]
    C -->|Function Call| E[Execute Tool]
    E --> F[Send Results Back]
    F --> B
    D --> G[Add to Memory]
  1. Input: User sends a message
  2. Processing: Agent sends message to Gemini with available tools
  3. Decision: Gemini decides whether to use tools or respond directly
  4. Tool Use: If tools are needed, execute them and send results back
  5. Iteration: Continue until Gemini provides final response
  6. Memory: Store conversation in memory for context
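
In code, the loop in agent.py might look roughly like the sketch below; call_gemini and extract_function_call are illustrative placeholders for the Gemini request/response handling, not the repository's actual helpers:

# Illustrative agentic loop -- helper names are placeholders, not the repo's real internals
def run_turn(user_message, memory, tools, max_steps=5):
    memory.add_message("user", user_message)                     # 1. input
    messages = memory.get_history()

    for _ in range(max_steps):
        # 2. send the conversation plus tool definitions to Gemini
        response = call_gemini(messages, tools.get_tool_definitions())

        # 3. Gemini either answers in text or requests a function call
        call = extract_function_call(response)
        if call is None:
            memory.add_message("assistant", response.text)       # 6. store the final answer
            return response.text

        # 4. execute the requested tool and feed the result back
        result = tools.execute_tool(call["name"], call["args"])
        messages.append({"role": "tool", "name": call["name"], "content": result})
        # 5. iterate so the model can use the result or request another tool

    return "Stopped after reaching the maximum number of reasoning steps."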

🚨 Troubleshooting

"Command 'python' not found"

Use python3 instead:

python3 main.py

Or install python-is-python3:

sudo apt install python-is-python3

"GOOGLE_API_KEY not found"

Make sure your .env file exists and contains:

GOOGLE_API_KEY=your_actual_key_here

"Module not found" errors

Install dependencies:

uv pip install google-generativeai python-dotenv

API Rate Limits

Gemini has rate limits. If you hit them:

  • Wait a few moments between requests
  • Consider upgrading your API plan
  • Implement retry logic with exponential backoff (see the sketch below)
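
A minimal backoff wrapper around agent.chat; the exception type to catch depends on how agent.py surfaces API errors, so a broad except is used here for illustration:

import time

def chat_with_retry(agent, message, max_retries=3, base_delay=2.0):
    """Retry agent.chat with exponential backoff on transient API errors."""
    for attempt in range(max_retries):
        try:
            return agent.chat(message)
        except Exception as exc:                       # narrow this to the real API error type
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt)        # 2s, 4s, 8s, ...
            print(f"Request failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)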

🎯 Future Enhancements

  • Add web search tool
  • Implement RAG (Retrieval Augmented Generation)
  • Add multi-agent collaboration
  • Create web interface with Gradio/Streamlit
  • Add voice input/output
  • Implement persistent storage (database)
  • Add logging and monitoring
  • Create Docker container

📄 License

This project is open source and available under the MIT License.

🤝 Contributing

Contributions are welcome! Feel free to:

  • Add new tools
  • Improve the agentic loop
  • Enhance memory management
  • Add new features
  • Fix bugs

🙏 Acknowledgments

  • Built with Google Gemini API
  • Package management by uv
  • Inspired by modern AI agent architectures

📞 Support

For issues, questions, or suggestions, please open an issue on the GitHub repository.


Built from scratch with ❤️ and Python
