BotSmith is a modular multi-agent automation framework that orchestrates autonomous workflows using LLM-powered agents. It emphasizes clean architecture, dependency injection, and provider-agnostic LLM integration, enabling reliable task execution, extensibility, and transparent agent collaboration.

blexyyyyy/botsmith


BotSmith

A Governed Multi-Agent System for Building Bots from Natural Language

BotSmith is a modular, production-oriented multi-agent framework that converts natural language requests into planned, validated, governed, and executable workflows. It is designed to demonstrate how autonomous agents can be built safely, with strong separation of concerns, deterministic execution, and full observability.


🎥 See It In Action

▶️ Watch Full Demo (YouTube): see BotSmith orchestrate a multi-agent workflow from natural language to executable code.

What's demonstrated:

  • Natural language parsing & intent extraction
  • Multi-agent orchestration (Router → Planner → Validator → Executor)
  • Governance gates (cost estimation, security scanning)
  • Real-time workflow visualization
  • Memory persistence & session management

Quick Start

git clone https://github.com/blexyyyyy/botsmith.git
cd botsmith

# Install dependencies
pip install -r requirements.txt

# Make the repository root importable
export PYTHONPATH="$(pwd)"

# Run integration tests
pytest tests/integration

# Run a sample workflow
python main.py

BotSmith is designed to be deterministic, observable, and safe by default.

Demo

Watch the BotSmith Demo Video to see the system in action.

Key Capabilities

Natural Language Interface

Parses user intent from plain English with confidence and ambiguity handling.
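
As a rough illustration of confidence and ambiguity handling, a parsed request can carry a confidence score and a list of alternative readings, and the system can ask for clarification instead of guessing. This is a minimal sketch: `ParsedIntent`, `resolve_intent`, and the 0.7 threshold are hypothetical names and values, not BotSmith's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedIntent:
    """Hypothetical result of parsing a natural-language request."""
    action: str
    confidence: float
    alternatives: tuple = ()  # other plausible readings of the same request


def resolve_intent(intent: ParsedIntent, threshold: float = 0.7) -> str:
    """Accept a confident, unambiguous intent; otherwise ask for clarification."""
    if intent.confidence < threshold:
        return "clarify: low confidence"
    if intent.alternatives:
        options = ", ".join((intent.action, *intent.alternatives))
        return "clarify: ambiguous between " + options
    return "accept: " + intent.action
```

Only the `accept` branch would continue to routing; both `clarify` branches stop before any plan is generated.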

Planner–Compiler–Executor Architecture

Deterministically translates intent into a plan, the plan into a workflow, and the workflow into execution.
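
The staged lifecycle can be pictured as a chain of pure functions, where identical input always produces identical output. The sketch below assumes hypothetical stage functions (`plan`, `compile_workflow`, `execute`) and invented step names; the real agents are more involved.

```python
from typing import Callable


def plan(intent: str) -> list[str]:
    """Hypothetical planner: a fixed, ordered list of step names per intent."""
    return [f"scaffold:{intent}", f"implement:{intent}", f"test:{intent}"]


def compile_workflow(steps: list[str]) -> list[Callable[[], str]]:
    """Turn step names into callables, binding each name at definition time."""
    return [lambda name=name: f"done {name}" for name in steps]


def execute(workflow: list[Callable[[], str]]) -> list[str]:
    """Run the compiled steps strictly in order."""
    return [step() for step in workflow]


def run(intent: str) -> list[str]:
    return execute(compile_workflow(plan(intent)))
```

Because no stage reads hidden state, `run("weather_bot")` returns the same result on every call, which is what makes each stage testable in isolation.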

Hybrid Agent System

  • Logic-first agents for planning, validation, routing, execution
  • LLM-assisted agents where language reasoning is useful
  • Model-agnostic design via LLM abstraction

Governance Built In

  • Validation gates
  • Cost estimation
  • Security scanning
  • Workflow optimization
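
A gate here can be read as a check that runs before execution and stops the workflow on failure. The following is a minimal sketch of cost and security gates under stated assumptions: `Step`, `GateError`, the cost units, and the token denylist are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    estimated_cost: float  # arbitrary units, assumed for this sketch
    command: str


class GateError(Exception):
    """Raised when a workflow fails a governance gate."""


BLOCKED_TOKENS = ("rm -rf", "sudo")  # illustrative denylist only


def cost_gate(steps, budget):
    """Reject the workflow if its total estimated cost exceeds the budget."""
    total = sum(s.estimated_cost for s in steps)
    if total > budget:
        raise GateError(f"estimated cost {total} exceeds budget {budget}")


def security_gate(steps):
    """Reject the workflow if any step contains a blocked token."""
    for s in steps:
        if any(tok in s.command for tok in BLOCKED_TOKENS):
            raise GateError(f"step '{s.name}' contains a blocked operation")


def run_gates(steps, budget):
    """Run every gate before execution; any failure raises GateError."""
    cost_gate(steps, budget)
    security_gate(steps)
    return True
```

Raising an exception rather than returning a flag means a caller cannot accidentally ignore a failed gate.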

[NEW] Dynamic Tool Generation Engine

Automatically plans, builds, and validates safe, immutable, and test-verified custom tools during the build phase.

End-to-End Tested

Full integration tests covering the complete lifecycle, including the dynamic tool pipeline.

3-Layer Memory System (State + Preferences + Persistence)

  • Execution Context: Ephemeral, step-level state for tools and reasoning.
  • Session Memory: Short-term, workflow-scoped coordination across agents.
  • Long-Term Memory: Persistent, policy-gated storage for user preferences and project knowledge.

High-Level Architecture

Note

For a deep dive into our design philosophy, compiler-style lifecycle, and safety invariants, see architecture.md.

graph TD
    A[Natural Language Input] --> B[NLP Interpreter]
    B --> C[Router Agent]
    C --> D[Planner Agent]
    D --> E[Validator Agent]
    
    subgraph "Dynamic Tool Pipeline (v1.1)"
        E --> T1[Tool Plan Agent]
        T1 --> T2[Tool Builder Agent]
        T2 --> T3[Tool Validator Agent]
    end
    
    T3 --> F[Workflow Compiler]
    F --> G[Optimizer / Cost / Security]
    G --> H[Workflow Executor]
    H --> I[Memory Manager]
    I --> J[(SQLite Persistence)]

Each stage is explicit, testable, and replaceable.

Core Design Principles

  • SOLID and Clean Architecture
  • Dependency Inversion (interfaces over implementations)
  • Deterministic execution
  • No blind trust in LLMs
  • Governed autonomy over raw autonomy

Agent Types

Core Logic Agents

  • RouterAgent: selects the appropriate workflow
  • PlannerAgent: generates a structured execution plan
  • ValidatorAgent: validates plans and invariants
  • WorkflowCompilerAgent: compiles plans into executable workflows
  • WorkflowExecutor: executes workflows step by step
  • CostEstimatorAgent: estimates and gates execution cost
  • SecurityAgent: blocks unsafe operations
  • WorkflowOptimizerAgent: reorders and deduplicates steps
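
To make "reorders and deduplicates steps" concrete, here is one possible approach: drop repeated steps, then sort the remainder so dependencies run first. The `optimize` function and its inputs are hypothetical; the sketch uses the standard-library `graphlib` module (Python 3.9+) and does not claim to mirror WorkflowOptimizerAgent's implementation.

```python
from graphlib import TopologicalSorter


def optimize(steps, depends_on):
    """Drop duplicate steps, then order the rest so dependencies run first.

    steps: list of step names, possibly with repeats.
    depends_on: dict mapping a step to the steps it requires.
    """
    unique = list(dict.fromkeys(steps))  # keeps the first occurrence of each step
    graph = {
        step: [dep for dep in depends_on.get(step, ()) if dep in unique]
        for step in unique
    }
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())
```

A circular dependency surfaces as an exception at optimization time instead of a hang at execution time.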

Tool Generation Agents (NEW)

  • ToolPlanAgent: plans tool specifications from requirements
  • ToolBuilderAgent: generates isolated Python tool code and unit tests
  • ToolValidatorAgent: validates generated tools via test execution and AST scanning
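
AST scanning means inspecting generated code as a syntax tree without running it. The sketch below shows the general technique with Python's standard `ast` module; the denylists and the `scan_tool_source` function are assumptions for illustration, not the rules ToolValidatorAgent actually enforces.

```python
import ast

FORBIDDEN_IMPORTS = {"os", "subprocess", "socket"}  # illustrative denylist
FORBIDDEN_CALLS = {"eval", "exec", "__import__"}    # illustrative denylist


def scan_tool_source(source: str) -> list[str]:
    """Return the violations found in generated tool code (empty if clean)."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_IMPORTS:
                    violations.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in FORBIDDEN_IMPORTS:
                violations.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"call {node.func.id}()")
    return violations
```

A static scan like this is a first filter only; it cannot catch every evasion, which is why the source pairs it with test execution.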

NLP Agent

  • NLPInterpreterAgent:
    • LLM-assisted semantic parsing
    • Schema validation
    • Confidence and ambiguity handling

LLM Support

  • Gated Memory: Multi-layer scoped storage with policy-enforced writes.
  • Local inference via Ollama
  • Cloud-ready design (Groq, Gemini, OpenAI supported via abstraction)
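
Provider-agnostic integration usually comes down to agents depending on a small interface instead of a vendor SDK. The following sketch assumes a hypothetical `LLMClient` protocol with a single `complete` method; `EchoClient` and `SummarizerAgent` are invented stand-ins, and BotSmith's real abstraction may look different.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Hypothetical provider-agnostic interface (an assumption of this sketch)."""

    def complete(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in provider for tests; a real one would call a local or cloud model."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class SummarizerAgent:
    """Depends only on the interface, so providers can be swapped freely."""

    def __init__(self, llm: LLMClient) -> None:
        self._llm = llm

    def run(self, text: str) -> str:
        return self._llm.complete(f"Summarize: {text}")
```

Swapping providers then means passing a different object that has a `complete` method; the agent code does not change.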

Example End-to-End Flow

User: "Build a Python weather bot"

  1. NLP extracts intent
  2. Router selects bot_creation_workflow
  3. Planner generates steps
  4. Validator enforces correctness
  5. Compiler builds workflow
  6. Cost and security checks pass
  7. Executor runs steps
  8. Execution persisted to database

Running Tests

Create and activate a virtual environment

python -m venv .venv
source .venv/bin/activate
# Windows: .venv\Scripts\activate

Install dependencies

pip install -r requirements.txt

Run integration tests

pytest tests/integration

Run full end-to-end test

pytest tests/integration/test_end_to_end_creation.py

Current Status

  • 3-Layer Memory System: Execution, Session, and Long-Term Persistent memory implemented.
  • Hybrid Multi-Agent Core: Fully implemented and end-to-end tested.
  • Workflow Governance: Cost estimation, security scanning, and optimization gates active.
  • [UPGRADE] Dynamic Tool Generation: Fully integrated Plan-Build-Validate engine for custom tools.
  • [UPGRADE] Dependency Injection Architecture: Refactored AgentFactory as a production-grade Composition Root for better testability and decoupled agent logic.
  • API & UI: FastAPI implementation and React/Vite visualization (botsmith-ui) in progress.
  • Workflow Persistence: SQLite-backed audit trails and session history functional.
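
The composition-root idea mentioned in the dependency injection item can be sketched as a single factory that owns all wiring, while agents simply receive what they need. `SketchAgentFactory` and `GreeterAgent` below are hypothetical and deliberately tiny; they are not the project's AgentFactory.

```python
class GreeterAgent:
    """Toy agent that receives its dependencies instead of creating them."""

    def __init__(self, llm, memory):
        self.llm = llm
        self.memory = memory

    def run(self, name):
        reply = self.llm(f"greet {name}")
        self.memory.append(reply)
        return reply


class SketchAgentFactory:
    """Hypothetical composition root: the only place that wires dependencies."""

    def __init__(self, llm, memory):
        self._llm = llm
        self._memory = memory
        self._builders = {}

    def register(self, name, builder):
        self._builders[name] = builder

    def create(self, name):
        try:
            builder = self._builders[name]
        except KeyError:
            raise KeyError(f"unknown agent: {name}") from None
        return builder(self._llm, self._memory)
```

In a test, the factory can be constructed with a fake `llm` callable and a plain list as `memory`, so no agent needs patching.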

Planned

  • Dynamic Agent Synthesis: Auto-generation of specialized agents based on task complexity.
  • Human-in-the-Loop: Interactive control gates for high-risk operations.
  • Adaptive Execution: Real-time workflow adjustment based on tool feedback.
  • Advanced visualization: Enhanced pipeline and agent state monitoring.

Architecture & Verification

BotSmith follows a strictly decoupled, local-first architecture designed for stability and auditability.

🛠 Refined Structure

  • Core Abstractions: Foundational interfaces remain in botsmith/core/.
  • Concrete Packages: Driver-level logic is promoted to botsmith/llm/, botsmith/memory/, and botsmith/utils/.
  • Standardized Imports: 100% absolute import paths ensure reliable module resolution.

Advanced Memory Contract

  • Policy-Gated: Agents only propose state changes; MemoryManager enforces MemoryPolicy.
  • Multi-Layer Persistence:
    • EXECUTION: Ephemeral step state.
    • SESSION: Workflow coordination.
    • PROJECT/USER: SQLite-backed long-term storage (verified to survive restarts).
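
The contract above, where agents propose and the manager decides, can be sketched as follows. The class names echo the ones in this README, but the method names (`permits`, `propose`, `read`), the policy shape, and the table schema are assumptions for illustration; only the PROJECT scope is shown for long-term storage.

```python
import sqlite3
from enum import Enum


class Scope(Enum):
    EXECUTION = "execution"
    SESSION = "session"
    PROJECT = "project"


class MemoryPolicy:
    """Hypothetical policy: which agents may write to which scope."""

    def __init__(self, allowed):
        self._allowed = allowed  # dict mapping Scope -> set of agent names

    def permits(self, agent, scope):
        return agent in self._allowed.get(scope, set())


class MemoryManager:
    def __init__(self, policy, db_path=":memory:"):
        self._policy = policy
        self._volatile = {Scope.EXECUTION: {}, Scope.SESSION: {}}
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)"
        )

    def propose(self, agent, scope, key, value):
        """Apply a proposed write only if the policy allows it."""
        if not self._policy.permits(agent, scope):
            return False
        if scope is Scope.PROJECT:
            with self._db:  # commits on success, rolls back on error
                self._db.execute(
                    "INSERT OR REPLACE INTO memory (key, value) VALUES (?, ?)",
                    (key, value),
                )
        else:
            self._volatile[scope][key] = value
        return True

    def read(self, scope, key):
        if scope is Scope.PROJECT:
            row = self._db.execute(
                "SELECT value FROM memory WHERE key = ?", (key,)
            ).fetchone()
            return row[0] if row else None
        return self._volatile[scope].get(key)
```

With a file path instead of `":memory:"`, PROJECT entries would outlive the process, while EXECUTION and SESSION entries stay in plain dictionaries and vanish with it.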

Verification Baseline

All core systems are verified via automated integration suites:

  • Persistence: test_agent_memory_persistence.py confirms interaction logs route to disk.
  • Workflow: test_workflow_execution.py validates the full Factory -> Executor pipeline.
  • Governance: test_governance_agents.py verifies cost/security gates.

Codebase frozen at v1.1.0-tool-generation-upgrade

Project Structure

botsmith/
├── core/               # Interfaces, base classes, utilities
├── agents/             # Specialized agents
├── workflows/          # Workflow compiler and executor
├── nlp/                # NLP parsing and intent normalization
├── persistence/        # SQLite persistence layer
├── factory/            # Agent and workflow factories
├── tests/              # Unit, integration, end-to-end tests
├── botsmith-ui/        # React + Vite + Tailwind CSS Frontend
└── main.py

Why This Project Exists

Most AI agent projects focus on prompting. BotSmith focuses on systems design.

The goal is to demonstrate how autonomous systems can be structured, governed, tested, and safely extended.

This repository is intended as a portfolio-grade systems project, not a product demo.

License

MIT
