A Governed Multi-Agent System for Building Bots from Natural Language
BotSmith is a modular, production-oriented multi-agent framework that converts natural language requests into planned, validated, governed, and executable workflows. It is designed to demonstrate how autonomous agents can be built safely, with strong separation of concerns, deterministic execution, and full observability.
What's demonstrated:
- Natural language parsing & intent extraction
- Multi-agent orchestration (Router → Planner → Validator → Executor)
- Governance gates (cost estimation, security scanning)
- Real-time workflow visualization
- Memory persistence & session management
git clone https://github.com/blexyyyyy/botsmith.git
cd botsmith
# Set environment
export PYTHONPATH=$(pwd)
# Run integration tests
pytest tests/integration
# Run a sample workflow
python main.py
BotSmith is designed to be deterministic, observable, and safe by default.
Watch the BotSmith Demo Video to see the system in action.
Parses user intent from plain English with confidence and ambiguity handling.
Deterministically translates intent into a plan, the plan into a workflow, and the workflow into execution.
- Logic-first agents for planning, validation, routing, execution
- LLM-assisted agents where language reasoning is useful
- Model-agnostic design via LLM abstraction
- Validation gates
- Cost estimation
- Security scanning
- Workflow optimization
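A governance gate of this kind can be pictured as a pure check that accepts or rejects a workflow before anything runs. The sketch below is illustrative only: `Step`, `est_tokens`, `estimate_cost`, `cost_gate`, and `GateRejected` are hypothetical names chosen for the example, not BotSmith's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Hypothetical workflow step carrying a pre-computed token estimate."""
    name: str
    est_tokens: int


class GateRejected(Exception):
    """Raised when a workflow fails a governance gate."""


def estimate_cost(steps: list[Step]) -> int:
    """Sum the estimated tokens across all steps."""
    return sum(step.est_tokens for step in steps)


def cost_gate(steps: list[Step], budget_tokens: int) -> int:
    """Return the estimated cost, or raise if it exceeds the budget."""
    cost = estimate_cost(steps)
    if cost > budget_tokens:
        raise GateRejected(
            f"estimated {cost} tokens exceeds budget {budget_tokens}"
        )
    return cost
```

Because the gate raises instead of returning a flag, a caller cannot accidentally proceed to execution after a rejection.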
Automatically plans, builds, and validates safe, immutable, and test-verified custom tools during the build phase.
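One plausible ingredient of validating generated tools is a static scan of the tool's source before it is ever imported. The following is a minimal sketch using Python's standard `ast` module; the denylists and the `scan_source` function are assumptions made for illustration and do not reflect the project's real rules.

```python
import ast

# Assumed denylists for illustration; a real policy would likely be broader.
BANNED_CALLS = {"eval", "exec", "compile", "__import__"}
BANNED_MODULES = {"os", "subprocess", "socket"}


def scan_source(source: str) -> list[str]:
    """Return the policy violations found in the given Python source."""
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BANNED_MODULES:
                    violations.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BANNED_MODULES:
                violations.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                violations.append(f"call to {node.func.id}()")
    return violations
```

A scan like this only inspects syntax, so it complements test execution and does not replace it: it can reject obviously unsafe code cheaply, but it cannot prove that code is safe.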
Full integration tests covering the complete lifecycle, including the dynamic tool pipeline.
- Execution Context: Ephemeral, step-level state for tools and reasoning.
- Session Memory: Short-term, workflow-scoped coordination across agents.
- Long-Term Memory: Persistent, policy-gated storage for user preferences and project knowledge.
Note: For a deep dive into our design philosophy, compiler-style lifecycle, and safety invariants, see architecture.md.
graph TD
A[Natural Language Input] --> B[NLP Interpreter]
B --> C[Router Agent]
C --> D[Planner Agent]
D --> E[Validator Agent]
subgraph "Dynamic Tool Pipeline (v1.1)"
E --> T1[Tool Plan Agent]
T1 --> T2[Tool Builder Agent]
T2 --> T3[Tool Validator Agent]
end
T3 --> F[Workflow Compiler]
F --> G[Optimizer / Cost / Security]
G --> H[Workflow Executor]
H --> I[Memory Manager]
I --> J[(SQLite Persistence)]
Each stage is explicit, testable, and replaceable.
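To make "explicit, testable, and replaceable" concrete, the stages can be modeled as plain functions sharing one signature, so any stage can be swapped for a stub in tests. This is a simplified sketch, not the project's implementation: the `Stage` alias, the stage bodies, and `fallback_workflow` are hypothetical, and the keyword match stands in for real NLP.

```python
from typing import Callable

# A stage takes the accumulated state and returns a new state (no mutation).
Stage = Callable[[dict], dict]


def interpret(state: dict) -> dict:
    # Placeholder for NLP interpretation: a trivial keyword match.
    intent = "create_bot" if "bot" in state["request"].lower() else "unknown"
    return {**state, "intent": intent}


def route(state: dict) -> dict:
    # Map the intent to a workflow name; the fallback name is hypothetical.
    workflows = {"create_bot": "bot_creation_workflow"}
    return {**state, "workflow": workflows.get(state["intent"], "fallback_workflow")}


def plan(state: dict) -> dict:
    # Fixed steps keep the example deterministic.
    return {**state, "steps": ["scaffold", "implement", "test"]}


def validate(state: dict) -> dict:
    # Enforce a simple invariant before anything is executed.
    if not state["steps"]:
        raise ValueError("plan has no steps")
    return state


def run_pipeline(request: str, stages: list[Stage]) -> dict:
    """Run the stages in order, threading the state through each one."""
    state: dict = {"request": request}
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("Build a Python weather bot", [interpret, route, plan, validate])
```

Replacing `plan` with a stub that returns an empty step list is enough to exercise the validator in isolation, which is the practical payoff of keeping every stage behind the same signature.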
- SOLID and Clean Architecture
- Dependency Inversion (interfaces over implementations)
- Deterministic execution
- No blind trust in LLMs
- Governed autonomy over raw autonomy
- RouterAgent: selects the appropriate workflow
- PlannerAgent: generates a structured execution plan
- ValidatorAgent: validates plans and invariants
- WorkflowCompilerAgent: compiles plans into executable workflows
- WorkflowExecutor: executes workflows step by step
- CostEstimatorAgent: estimates and gates execution cost
- SecurityAgent: blocks unsafe operations
- WorkflowOptimizerAgent: reorders and deduplicates steps
- ToolPlanAgent: plans tool specifications from requirements
- ToolBuilderAgent: generates isolated Python tool code and unit tests
- ToolValidatorAgent: validates generated tools via test execution and AST scanning
- NLPInterpreterAgent: LLM-assisted semantic parsing, schema validation, and confidence and ambiguity handling
- Gated Memory: Multi-layer scoped storage with policy-enforced writes.
- Local inference via Ollama
- Cloud-ready design (Groq, Gemini, OpenAI supported via abstraction)
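Supporting several providers "via abstraction" usually means agents depend on a small interface and never on a concrete client. Below is a minimal sketch of that idea using `typing.Protocol`. `LLMClient`, `complete`, `IntentAgent`, and the two stand-in backends are hypothetical names; no real Ollama, Groq, Gemini, or OpenAI calls are shown.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Hypothetical interface that every backend would satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in backend for tests; makes no network calls."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UppercaseClient:
    """Second stand-in backend, showing that backends are interchangeable."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class IntentAgent:
    """Depends only on the LLMClient interface, not on a provider."""

    def __init__(self, llm: LLMClient) -> None:
        self._llm = llm

    def run(self, text: str) -> str:
        return self._llm.complete(text)
```

Switching providers then amounts to passing a different object at construction time, for example `IntentAgent(EchoClient())` in tests and a real client in production.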
User: "Build a Python weather bot"
- NLP extracts intent
- Router selects bot_creation_workflow
- Planner generates steps
- Validator enforces correctness
- Compiler builds workflow
- Cost and security checks pass
- Executor runs steps
- Execution persisted to database
python -m venv .venv
source .venv/bin/activate
# Windows: .venv\Scripts\activate
pip install -r requirements.txt
pytest tests/integration
pytest tests/integration/test_end_to_end_creation.py
- 3-Layer Memory System: Execution, Session, and Long-Term Persistent memory implemented.
- Hybrid Multi-Agent Core: Fully implemented and end-to-end tested.
- Workflow Governance: Cost estimation, security scanning, and optimization gates active.
- [UPGRADE] Dynamic Tool Generation: Fully integrated Plan-Build-Validate engine for custom tools.
- [UPGRADE] Dependency Injection Architecture: Refactored AgentFactory as a production-grade Composition Root for better testability and decoupled agent logic.
- API & UI: FastAPI implementation and React/Vite visualization (botsmith-ui) in progress.
- Workflow Persistence: SQLite-backed audit trails and session history functional.
- Dynamic Agent Synthesis: Auto-generation of specialized agents based on task complexity.
- Human-in-the-Loop: Interactive control gates for high-risk operations.
- Adaptive Execution: Real-time workflow adjustment based on tool feedback.
- Advanced visualization: Enhanced pipeline and agent state monitoring.
BotSmith follows a strictly decoupled, local-first architecture designed for stability and auditability.
- Core Abstractions: Foundational interfaces remain in botsmith/core/.
- Concrete Packages: Driver-level logic is promoted to botsmith/llm/, botsmith/memory/, and botsmith/utils/.
- Standardized Imports: 100% absolute import paths ensure reliable module resolution.
- Policy-Gated: Agents only propose state changes; MemoryManager enforces MemoryPolicy.
- Multi-Layer Persistence:
  - EXECUTION: Ephemeral step state.
  - SESSION: Workflow coordination.
  - PROJECT/USER: SQLite-backed long-term storage (verified to survive restarts).
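The "agents propose, the manager decides" pattern can be sketched as follows. Only the names `MemoryManager`, `MemoryPolicy`, and the scope names come from this README. The methods (`allows`, `propose`, `get`), the allow-list rule, and the in-memory dictionaries standing in for SQLite are assumptions made for illustration.

```python
from enum import Enum


class Scope(Enum):
    EXECUTION = "execution"
    SESSION = "session"
    PROJECT = "project"
    USER = "user"


class MemoryPolicy:
    """Hypothetical rule: long-term scopes accept only allow-listed keys."""

    LONG_TERM = {Scope.PROJECT, Scope.USER}

    def __init__(self, allowed_long_term_keys: set[str]) -> None:
        self._allowed = allowed_long_term_keys

    def allows(self, scope: Scope, key: str) -> bool:
        return scope not in self.LONG_TERM or key in self._allowed


class MemoryManager:
    """Applies the policy to every proposed write; agents never write directly."""

    def __init__(self, policy: MemoryPolicy) -> None:
        self._policy = policy
        # In-memory dicts stand in for the SQLite-backed store.
        self._store: dict[Scope, dict[str, object]] = {s: {} for s in Scope}

    def propose(self, scope: Scope, key: str, value: object) -> bool:
        """Store the value if the policy allows it; report the decision."""
        if not self._policy.allows(scope, key):
            return False
        self._store[scope][key] = value
        return True

    def get(self, scope: Scope, key: str) -> object:
        return self._store[scope].get(key)
```

Keeping the decision in one place means a rejected write leaves no trace in the store, and the policy can be unit-tested without any agent in the loop.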
All core systems are verified via automated integration suites:
- Persistence: test_agent_memory_persistence.py confirms interaction logs route to disk.
- Workflow: test_workflow_execution.py validates the full Factory -> Executor pipeline.
- Governance: test_governance_agents.py verifies cost/security gates.
Codebase frozen at v1.1.0-tool-generation-upgrade
botsmith/
├── core/ # Interfaces, base classes, utilities
├── agents/ # Specialized agents
├── workflows/ # Workflow compiler and executor
├── nlp/ # NLP parsing and intent normalization
├── persistence/ # SQLite persistence layer
├── factory/ # Agent and workflow factories
├── tests/ # Unit, integration, end-to-end tests
├── botsmith-ui/ # React + Vite + Tailwind CSS Frontend
└── main.py
Most AI agent projects focus on prompting. BotSmith focuses on systems design.
The goal is to demonstrate how autonomous systems can be structured, governed, tested, and safely extended.
This repository is intended as a portfolio-grade systems project, not a product demo.
MIT
