Autonomous AI Coding System

A sophisticated multi-agent system that uses Claude AI to autonomously generate, verify, and improve code through intelligent orchestration and continuous verification loops.

Project Overview

The Autonomous AI Coding System is a development-phase platform (approaching production-ready status) that coordinates multiple AI agents to create complete software projects. The system employs a hierarchical architecture with specialized agents that work together to achieve 100% code quality through continuous verification and automated repair loops.

Key Features

Multi-Agent Architecture: Hierarchical system with Meta Agent orchestration and specialized Sub-Agents
Continuous Verification: Dual verification system (compilation + testing) ensures code quality
Intelligent Repair: Automated failure analysis and repair loops
Parallel Processing: Multiple agents work simultaneously for efficiency
Comprehensive Artifacts: Version-controlled storage of all generated code
Production-Ready Features: Async processing, error handling, monitoring, and observability (in development)

Architecture Overview

Core Components

Meta Agent (Task Manager)
- Orchestrates the entire system using OpenAI O3 for task decomposition
- Coordinates multiple sub-agents and manages global context
- Implements intelligent retry and repair mechanisms
Sub-Agents (Always use Claude Code)
- Core Logic Agent: Generates main application code
- Testing Agent: Creates comprehensive test suites
- Documentation Agent: Writes documentation and README files
- Optimization Agent: Improves code performance and quality
- Verification Agent: Validates code and runs quality checks
Verification System
- Compiler Verification: Syntax checking and compilation validation
- Test Verification: Automated test execution and coverage analysis
- Quality Gates: Configurable quality thresholds
Artifact Management
- Version-controlled storage with compression
- Dependency tracking and metadata management
- Hierarchical organization with UUID-based indexing
Communication Hub
- Inter-agent messaging and coordination
- Shared context management
- Event-driven architecture

Installation & Setup

Prerequisites

Python 3.9 or higher
Anthropic API key (for Claude)
OpenAI API key (for Meta Agent)
Claude Code CLI tool installed

Installation Steps

Clone the repository

git clone https://github.com/agentic-system/agentic-coding-system.git
cd agentic-coding-system

Set up virtual environment

python -m venv agentic-env
source agentic-env/bin/activate  # On Windows: agentic-env\Scripts\activate

Install dependencies

pip install -r requirements.txt
# Or using pyproject.toml
pip install -e .

Configure environment variables

cp .env.example .env
# Edit .env with your API keys:
# ANTHROPIC_API_KEY=your_claude_api_key
# OPENAI_API_KEY=your_openai_api_key

Run health check
```
python scripts/health_check.py
```

Development Setup

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run code formatting
black src/ tests/
ruff check src/ tests/

# Run type checking
mypy src/

Usage Guide

Basic Usage

from src.agents.meta_agent import MetaAgent
import asyncio

async def main():
    meta_agent = MetaAgent()
    
    # Simple code generation
    result = await meta_agent.process_request(
        "Create a Python calculator with tests"
    )
    
    print(f"Project created: {result.project_id}")
    print(f"Files generated: {len(result.artifacts)}")

asyncio.run(main())

Command Line Interface

# Run a simple example
python examples/basic_usage.py

# Run the main demo
python demo_agentic_system.py

# Health check
python scripts/health_check.py

Advanced Usage

from src.agents.meta_agent import MetaAgent
from src.core.interfaces import TaskContext
import asyncio

async def advanced_example():
    meta_agent = MetaAgent()
    
    # Create custom task context
    context = TaskContext(
        project_id="my-custom-project",
        base_path="/path/to/workspace",
        settings={
            "max_parallel_tasks": 4,
            "verification_enabled": True,
            "test_coverage_threshold": 90
        }
    )
    
    # Process complex request
    result = await meta_agent.process_request(
        request="Create a REST API with PostgreSQL backend, include authentication, CRUD operations, and comprehensive tests",
        context=context
    )
    
    # Access results
    for artifact in result.artifacts:
        print(f"Generated: {artifact.file_path} ({artifact.type})")

asyncio.run(advanced_example())

Components Documentation

Meta Agent

The Meta Agent serves as the central orchestrator:

Task Decomposition: Uses OpenAI O3 to break complex requests into manageable tasks
Agent Coordination: Manages sub-agent lifecycle and task assignment
Quality Assurance: Ensures 100% compilation and test success
Failure Recovery: Implements intelligent retry and repair mechanisms

Configuration Options:

max_parallel_tasks: Maximum concurrent sub-agents (default: 3)
retry_attempts: Number of retry attempts for failed tasks (default: 3)
verification_required: Enable/disable verification loops (default: True)

Sub-Agents

All sub-agents use Claude Code for enhanced capabilities:

Core Logic Agent

Generates main application code
Implements business logic and data structures
Follows coding best practices and patterns

Testing Agent

Creates comprehensive test suites
Generates unit, integration, and end-to-end tests
Ensures high code coverage

Documentation Agent

Writes technical documentation
Creates README files and code comments
Generates API documentation

Verification System

The verification system ensures code quality through multiple stages:

Syntax Verification: AST parsing and syntax checking
Compilation Verification: Full compilation testing
Test Execution: Automated test running with coverage analysis
Quality Gates: Configurable quality thresholds

Supported Languages:

Python (pytest, coverage)
JavaScript (Jest, Mocha)
TypeScript (tsc, Jest)

Artifact Management

All generated code is stored as versioned artifacts:

Compression: Gzip compression for efficient storage
Metadata: Full tracking of creation, modification, and dependencies
Versioning: Complete version history with rollback capabilities
Integrity: SHA256 checksums for corruption detection

Configuration

Environment Variables

# API Configuration
ANTHROPIC_API_KEY=your_claude_api_key
OPENAI_API_KEY=your_openai_api_key

# System Configuration
MAX_PARALLEL_TASKS=3
VERIFICATION_ENABLED=true
TEST_COVERAGE_THRESHOLD=90

# Storage Configuration
ARTIFACTS_BASE_PATH=./artifacts
PROJECTS_BASE_PATH=./projects

# Logging Configuration
LOG_LEVEL=INFO
LOG_FORMAT=json

Settings Files

Configuration is managed through Pydantic settings:

# config/settings.py
class Settings(BaseSettings):
    # API Settings
    anthropic_api_key: str
    openai_api_key: str
    
    # Agent Settings
    max_parallel_tasks: int = 3
    retry_attempts: int = 3
    
    # Verification Settings
    verification_enabled: bool = True
    test_coverage_threshold: int = 90
    
    class Config:
        env_file = ".env"

Testing

Test Structure

tests/
├── unit/                   # Unit tests for individual components
├── integration/           # Integration tests
├── test_agents.py        # Agent system tests
├── test_verification.py  # Verification system tests
└── test_artifacts.py     # Artifact management tests

Running Tests

# Run all tests
pytest

# Run specific test categories
pytest tests/unit/              # Unit tests only
pytest tests/integration/       # Integration tests only

# Run with coverage
pytest --cov=src --cov-report=html

# Run with verbose output
pytest -v

# Run specific test file
pytest tests/test_agents.py -v

Test Configuration

# pytest.ini
[tool:pytest]
testpaths = tests
asyncio_mode = auto
filterwarnings = 
    error
    ignore::UserWarning
    ignore::DeprecationWarning
addopts = 
    --strict-markers
    --tb=short

API Reference

MetaAgent API

class MetaAgent:
    async def process_request(
        self,
        request: str,
        context: Optional[TaskContext] = None
    ) -> TaskResult:
        """Process a complete user request."""
        
    async def decompose_task(
        self,
        request: str
    ) -> List[Task]:
        """Break down request into executable tasks."""
        
    async def coordinate_execution(
        self,
        tasks: List[Task]
    ) -> List[TaskResult]:
        """Coordinate task execution across agents."""

Sub-Agent API

class SubAgent(Agent):
    async def execute_task(
        self,
        task: Task,
        context: TaskContext
    ) -> TaskResult:
        """Execute a specific task."""
        
    async def verify_result(
        self,
        result: TaskResult
    ) -> VerificationResult:
        """Verify task execution result."""

Artifact Manager API

class ArtifactManager:
    async def store_artifact(
        self,
        artifact: Artifact
    ) -> str:
        """Store an artifact and return its ID."""
        
    async def retrieve_artifact(
        self,
        artifact_id: str
    ) -> Artifact:
        """Retrieve an artifact by ID."""
        
    async def list_artifacts(
        self,
        project_id: Optional[str] = None
    ) -> List[Artifact]:
        """List artifacts, optionally filtered by project."""

Development

Code Organization

src/
├── agents/          # Agent implementations
├── core/           # Core system components
├── clients/        # API clients (Claude, OpenAI)
├── verification/   # Verification system
├── utils/          # Utility functions
└── prompts/        # AI prompts and templates

Development Workflow

Code Style: Uses Black for formatting, Ruff for linting
Type Checking: MyPy for static type analysis
Testing: Pytest with async support and coverage
Documentation: Docstrings and type hints throughout
Version Control: Git with conventional commits

Contributing Guidelines

Fork the repository
Create feature branch: git checkout -b feature/new-feature
Follow code style: Run black and ruff before committing
Add tests: Ensure new code has test coverage
Update documentation: Update README and docstrings
Submit pull request: Include description of changes

Monitoring & Observability

Logging

The system uses structured logging with multiple output formats:

import structlog

logger = structlog.get_logger(__name__)

# Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
logger.info("Task completed", task_id=task_id, duration=duration)

Metrics

Key metrics tracked:

Task completion rates
Agent performance
Verification success rates
API usage and costs
System health indicators

Health Checks

# Run comprehensive health check
python scripts/health_check.py

# Check specific components
python scripts/health_check.py --component agents
python scripts/health_check.py --component verification

Troubleshooting

Common Issues

1. API Key Issues

# Check API key configuration
python -c "from config import get_settings; print(get_settings().anthropic_api_key[:10])"

# Test API connectivity
python scripts/health_check.py

2. Claude Code CLI Not Found

# Install Claude Code CLI
# Follow instructions at https://docs.anthropic.com/claude/docs/claude-code

# Verify installation
claude --version

3. Verification Failures

# Check verification configuration
python -c "from config import get_settings; print(get_settings().verification_enabled)"

# Run verification tests
pytest tests/test_verification.py -v

4. Memory Issues

# Check system resources
python scripts/health_check.py --resource-check

# Reduce parallel tasks
export MAX_PARALLEL_TASKS=1

Error Messages

CLINotAvailableError: Claude Code CLI not installed or not in PATH
APIKeyError: Invalid or missing API keys
VerificationError: Code failed verification checks
TaskTimeoutError: Task execution exceeded timeout
ArtifactCorruptionError: Artifact integrity check failed

Debug Mode

Enable debug logging:

export LOG_LEVEL=DEBUG
python your_script.py

Performance Considerations

Optimization Settings

# Recommended production settings
MAX_PARALLEL_TASKS=3          # Balance between speed and resource usage
VERIFICATION_ENABLED=true     # Always verify in production
TEST_COVERAGE_THRESHOLD=90    # High quality threshold

Resource Requirements

Memory: 4GB+ recommended for concurrent operations
CPU: 2+ cores for parallel processing
Disk: 1GB+ for artifact storage
Network: Stable connection for API calls

System Status

Current Implementation Status

✅ Core Architecture: Fully implemented
✅ Agent System: Complete with Meta Agent and Sub-Agents
✅ Verification System: Comprehensive verification pipeline
✅ Artifact Management: Full versioning and storage
✅ API Integration: Claude and OpenAI client support
✅ Testing Infrastructure: Extensive test coverage
✅ Documentation: Complete API and usage documentation

Generated Projects

The system has successfully generated 200+ projects including:

Hello World applications
Calculators and mathematical tools
Weather applications (CLI and desktop)
REST APIs with PostgreSQL backends
Todo list applications
Complex multi-component systems

Version Information

Version: 0.1.0
Python: 3.9+
License: MIT
Status: Alpha (Development Phase - Approaching Production Ready)

Acknowledgments

Built with:

Claude AI by Anthropic for code generation
OpenAI GPT for task decomposition
Python ecosystem for implementation
pytest for testing framework
Pydantic for configuration management
NetworkX for dependency management

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.github/workflows		.github/workflows
agents/test_core_logic_agent		agents/test_core_logic_agent
benchmark_data/workspace		benchmark_data/workspace
config		config
docs		docs
examples		examples
projects		projects
reports		reports
scripts		scripts
src		src
tests		tests
zero-engine		zero-engine
.env.production		.env.production
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CLAUDE.md		CLAUDE.md
DASHBOARD_INTEGRATION_GUIDE.md		DASHBOARD_INTEGRATION_GUIDE.md
MASTER_CONTROL_PROGRAM_COMPLETION_REPORT.md		MASTER_CONTROL_PROGRAM_COMPLETION_REPORT.md
README.md		README.md
TESTING_GUIDE.md		TESTING_GUIDE.md
Workshop_Final.ipynb		Workshop_Final.ipynb
baseline_results.json		baseline_results.json
batch_cleanup.py		batch_cleanup.py
bfg.jar		bfg.jar
cleanup_projects.py		cleanup_projects.py
cleanup_projects.sh		cleanup_projects.sh
config.py		config.py
conftest.py		conftest.py
debug_artifacts_error.py		debug_artifacts_error.py
debug_dashboard.py		debug_dashboard.py
demo_agentic_system.py		demo_agentic_system.py
demo_task_decomposition.py		demo_task_decomposition.py
demo_task_decomposition_clean.py		demo_task_decomposition_clean.py
direct_cleanup.py		direct_cleanup.py
exec_test.py		exec_test.py
execute_cleanup.py		execute_cleanup.py
execute_now.py		execute_now.py
final_cleanup.py		final_cleanup.py
final_direct_cleanup.py		final_direct_cleanup.py
immediate_cleanup.py		immediate_cleanup.py
inline_cleanup.py		inline_cleanup.py
instant_cleanup.py		instant_cleanup.py
manual_cleanup.py		manual_cleanup.py
mcp-config.json		mcp-config.json
monitoring_dashboard.py		monitoring_dashboard.py
performance_results.json		performance_results.json
performance_summary.txt		performance_summary.txt
projects_cleanup_direct.py		projects_cleanup_direct.py
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
quick_test.py		quick_test.py
requirements.txt		requirements.txt
run_agent_tests.py		run_agent_tests.py
run_all_tests.py		run_all_tests.py
run_cleanup.py		run_cleanup.py
run_exec.py		run_exec.py
run_tests.py		run_tests.py
run_tests.sh		run_tests.sh
run_tests_with_coverage.sh		run_tests_with_coverage.sh
simple_cleanup.py		simple_cleanup.py
simple_test.py		simple_test.py
subprocess_cleanup.py		subprocess_cleanup.py
system_plan.md		system_plan.md
test_basic_functionality.py		test_basic_functionality.py
test_evolutionary_agent_fix.py		test_evolutionary_agent_fix.py
test_full_execution.py		test_full_execution.py
test_json_parsing.py		test_json_parsing.py
test_modular_meta_agent.py		test_modular_meta_agent.py
test_zero_comprehensive.py		test_zero_comprehensive.py
validate_all_fixes.py		validate_all_fixes.py
validate_fix.py		validate_fix.py
verify_meta_agent_fix.py		verify_meta_agent_fix.py
working_cleanup.py		working_cleanup.py
zero-mcp-config.json		zero-mcp-config.json
zero.py		zero.py

Folders and files

Latest commit

History

Repository files navigation

Autonomous AI Coding System

Project Overview

Key Features

Architecture Overview

Core Components

Installation & Setup

Prerequisites

Installation Steps

Development Setup

Usage Guide

Basic Usage

Command Line Interface

Advanced Usage

Components Documentation

Meta Agent

Sub-Agents

Core Logic Agent

Testing Agent

Documentation Agent

Verification System

Artifact Management

Configuration

Environment Variables

Settings Files

Testing

Test Structure

Running Tests

Test Configuration

API Reference

MetaAgent API

Sub-Agent API

Artifact Manager API

Development

Code Organization

Development Workflow

Contributing Guidelines

Monitoring & Observability

Logging

Metrics

Health Checks

Troubleshooting

Common Issues

1. API Key Issues

2. Claude Code CLI Not Found

3. Verification Failures

4. Memory Issues

Error Messages

Debug Mode

Performance Considerations

Optimization Settings

Resource Requirements

System Status

Current Implementation Status

Generated Projects

Version Information

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages