Autonomous Coder

An autonomous AI system that discovers, solves, and validates GitHub issues using local LLMs via Ollama. The system analyzes issue complexity, extracts relevant context, generates solutions, validates them through static analysis, and learns from successes and failures.

Features

  • Automated Issue Discovery: Finds and ranks GitHub issues by complexity and priority
  • Context-Aware Solutions: Extracts relevant code context using keyword search and semantic analysis
  • Multi-Tier Intelligence: Selects an LLM by issue complexity (a 7B model for simpler issues, 14B/16B models for complex issues or as fallbacks)
  • Validation Pipeline: Validates solutions with Black, flake8, pylint, mypy, bandit, and optional Docker testing
  • Learning System: Stores successes and failures for continuous improvement
  • Refinement Loop: Iteratively improves solutions based on validation errors

Quick Start

Prerequisites

  • Python 3.11+
  • Ollama installed and running
  • GitHub API token
  • Docker (optional, for testing)

Installation

  1. Clone the repository
     git clone https://github.com/cormaclydon/autonomous-coder.git
     cd autonomous-coder
  2. Install dependencies
     pip install -r requirements.txt
  3. Install Ollama models
     # Install recommended models
     ollama pull qwen2.5-coder:7b      # 4.7GB - Fast, for simple issues
     ollama pull qwen2.5-coder:14b     # 9GB - For medium complexity
     ollama pull deepseek-coder-v2:16b # 10GB - For complex issues
  4. Configure the system
     cp config/config.yaml.example config/config.yaml
     # Edit config.yaml with your GitHub token and preferences

Running the System

# Process a specific issue
python src/main.py --issue "https://github.com/owner/repo/issues/123"

# Run in autonomous mode (discovers issues automatically)
python src/main.py --mode autonomous --duration 8h

How It Works

1. Issue Discovery

  • Searches GitHub for issues matching configured labels and complexity
  • Ranks issues by complexity (tier 1-3) and priority
  • Filters by repository stars, activity, and maintainer responsiveness
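
The filtering and ranking steps above can be sketched roughly as follows. This is only an illustration: the `Issue` fields, the star bounds, and the sort order (simplest first, then highest priority) are assumptions, not the project's actual data model or ranking rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    """Hypothetical issue record; field names are assumptions."""
    url: str
    complexity: float  # complexity score, 0-100
    priority: float    # higher means more important (assumed scale)
    repo_stars: int


def rank_issues(issues: List[Issue], min_stars: int = 50,
                max_stars: int = 50_000) -> List[Issue]:
    """Drop issues outside the star range, then sort by complexity
    (ascending) and priority (descending)."""
    eligible = [i for i in issues if min_stars <= i.repo_stars <= max_stars]
    return sorted(eligible, key=lambda i: (i.complexity, -i.priority))
```

Sorting simplest-first is one plausible policy for a system that starts with small local models; the real ranking may weigh these signals differently.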

2. Context Extraction

  • Extracts issue description, comments, and code snippets
  • Searches repository for relevant files using keyword matching
  • Builds context prompt with file contents and issue details
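
As a rough sketch of keyword-based file search, the snippet below scores each Python file in a cloned repository by how often words from the issue text appear in it. The function names, the minimum word length, and the scoring rule are all assumptions made for illustration; the project's own matching logic may differ.

```python
import re
from pathlib import Path
from typing import List, Set


def keywords(text: str, min_len: int = 4) -> Set[str]:
    """Lower-cased identifier-like words of at least min_len characters."""
    return {w.lower() for w in re.findall(r"[A-Za-z_]\w*", text)
            if len(w) >= min_len}


def rank_files(repo_root: str, issue_text: str, top_n: int = 5) -> List[str]:
    """Return up to top_n .py files, ordered by total keyword occurrences."""
    terms = keywords(issue_text)
    scored = []
    for path in Path(repo_root).rglob("*.py"):
        try:
            content = path.read_text(encoding="utf-8", errors="ignore").lower()
        except OSError:
            continue  # unreadable file: skip it
        score = sum(content.count(term) for term in terms)
        if score:
            scored.append((score, str(path)))
    # Highest score first; path as a tie-breaker for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:top_n]]
```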

3. Solution Generation

  • Selects appropriate LLM model based on complexity tier
  • Generates solution using structured XML format
  • Iteratively refines the solution based on validation errors (the attempt limit ranges from 5 to 15, depending on the complexity tier)
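
The refinement loop can be pictured as the sketch below. It assumes two hypothetical callables, `generate` (prompt in, solution text out) and `validate` (solution in, list of error messages out); the real system's interfaces and the way errors are fed back to the model are not specified here.

```python
from typing import Callable, List, Optional, Tuple


def refine(generate: Callable[[str], str],
           validate: Callable[[str], List[str]],
           prompt: str,
           max_attempts: int = 5) -> Tuple[Optional[str], int]:
    """Generate a solution, validate it, and retry with the errors
    appended to the prompt until it passes or attempts run out.

    Returns (solution, attempts_used); solution is None on failure.
    """
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        solution = generate(current_prompt)
        errors = validate(solution)
        if not errors:
            return solution, attempt
        # Feed the validation errors back so the next attempt can fix them.
        current_prompt = (prompt + "\n\nPrevious attempt failed validation:\n"
                          + "\n".join(errors))
    return None, max_attempts
```

Passing the model and validators in as callables keeps the loop testable without a running Ollama instance.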

4. Validation

  • Static Analysis: Black (auto-fix), flake8, pylint, mypy, bandit
  • Docker Testing: Runs repository test suite in isolated environment
  • Confidence Scoring: Calculates confidence based on validation results
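
One way to turn validation results into a confidence score is a weighted pass rate, sketched below. The weights and the formula are purely hypothetical; this README does not document how the project actually computes confidence.

```python
from typing import Dict

# Hypothetical relative weights per checker (assumption, not project values).
WEIGHTS: Dict[str, int] = {
    "black": 1,
    "flake8": 2,
    "pylint": 2,
    "mypy": 2,
    "bandit": 1,
    "docker_tests": 2,
}


def confidence(results: Dict[str, bool]) -> float:
    """Weighted fraction of checkers that passed, rounded to 2 decimals.

    `results` maps a checker name to True (passed) or False (failed);
    checkers that did not run are simply left out.
    """
    ran = {name: ok for name, ok in results.items() if name in WEIGHTS}
    total = sum(WEIGHTS[name] for name in ran)
    if total == 0:
        return 0.0  # nothing ran, so there is no evidence either way
    passed = sum(WEIGHTS[name] for name, ok in ran.items() if ok)
    return round(passed / total, 2)
```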

5. Learning

  • Stores successful solutions in data/training_data/successes/
  • Stores failures in data/training_data/failures/
  • Tracks metrics for continuous improvement
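
A minimal sketch of how an outcome could be written to the successes/ or failures/ directory is shown below. The JSON record layout, the file naming, and the `store_outcome` function itself are assumptions for illustration only.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable


def store_outcome(base_dir: str, issue_url: str, solution: str,
                  succeeded: bool, errors: Iterable[str] = ()) -> Path:
    """Write one JSON record under <base_dir>/successes or <base_dir>/failures."""
    folder = Path(base_dir) / ("successes" if succeeded else "failures")
    folder.mkdir(parents=True, exist_ok=True)
    record = {
        "issue_url": issue_url,
        "solution": solution,
        "succeeded": succeeded,
        "errors": list(errors),
        "stored_at": datetime.now(timezone.utc).isoformat(),
    }
    # Derive a filesystem-safe file name from the issue URL.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", issue_url).strip("_")
    path = folder / f"{slug}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

For example, `store_outcome("data/training_data", url, patch, succeeded=True)` would place the record under data/training_data/successes/.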

Configuration

Edit config/config.yaml to customize:

GitHub Settings

  • API token and rate limits
  • Target repositories and labels
  • Repository star range

Model Selection

models:
  tier_1:  # Simple issues (<30 complexity score)
    primary: "qwen2.5-coder:7b"
  tier_2:  # Medium issues (30-60 complexity score)
    primary: "qwen2.5-coder:7b"
    fallback: "qwen2.5-coder:14b"
  tier_3:  # Complex issues (60-100 complexity score)
    primary: "deepseek-coder-v2:16b"
    fallback: "qwen2.5-coder:14b"
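
To illustrate how a complexity score might map onto this configuration, here is a small sketch. The `MODELS` dictionary mirrors the YAML above; the helper names are hypothetical, and sending a score of exactly 60 to tier 3 is an assumption, since the documented ranges share that boundary.

```python
from typing import Dict, List

# Mirrors the models section of config.yaml shown above.
MODELS: Dict[str, Dict[str, str]] = {
    "tier_1": {"primary": "qwen2.5-coder:7b"},
    "tier_2": {"primary": "qwen2.5-coder:7b",
               "fallback": "qwen2.5-coder:14b"},
    "tier_3": {"primary": "deepseek-coder-v2:16b",
               "fallback": "qwen2.5-coder:14b"},
}


def tier_for(score: float) -> str:
    """Map a 0-100 complexity score to a tier name."""
    if score < 30:
        return "tier_1"
    if score < 60:
        return "tier_2"
    return "tier_3"  # assumption: a score of exactly 60 counts as complex


def models_for(score: float) -> List[str]:
    """Models to try in order: the primary first, then the fallback if any."""
    cfg = MODELS[tier_for(score)]
    return [cfg["primary"]] + ([cfg["fallback"]] if "fallback" in cfg else [])
```

Under these assumptions, a score of 58.9 falls in tier 2 and would try qwen2.5-coder:7b before qwen2.5-coder:14b.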

Validation Settings

static_analysis:
  black:
    enabled: true
    auto_fix: true  # Automatically fix formatting
  flake8:
    enabled: true
  pylint:
    enabled: true
    min_score: 7.0
  mypy:
    enabled: true
  bandit:
    enabled: true

Project Structure

autonomous-coder/
├── src/
│   ├── modules/
│   │   ├── issue_discovery/      # Issue search and ranking
│   │   ├── context_extraction/   # Code context extraction
│   │   ├── solution_generation/  # LLM solution generation
│   │   ├── validation/           # Static analysis & testing
│   │   └── learning/             # Training data storage
│   ├── utils/                    # Shared utilities
│   └── main.py                   # Main orchestration
├── config/
│   └── config.yaml.example       # Configuration template
├── data/
│   ├── repos/                    # Cloned repositories
│   ├── training_data/            # Successes and failures
│   └── embeddings/               # Vector embeddings
└── logs/                         # System logs

Examples

Successful Resolution

The system successfully resolved fastapi/typer#1159 - "help line length miscalculated when using stylized text":

  • Complexity: Tier 2 (score: 58.9)
  • Model Used: qwen2.5-coder:7b
  • Solution: Added format_rich_text() function to strip \b characters
  • Validation: Passed Black formatting, confidence 0.88
  • Attempts: 1 (first try success)

Troubleshooting

Model Timeout Errors

If you see Read timed out errors, increase the timeout in config.yaml:

ollama:
  timeout: 300  # Increase for larger models

Empty Solutions

If the model returns empty solutions, check:

  • Ollama is running: ollama list
  • Model is downloaded: ollama pull qwen2.5-coder:7b
  • Sufficient VRAM/RAM for the model

Validation Failures

If solutions repeatedly fail validation:

  • Temporarily disable strict checkers (mypy, pylint, bandit) by setting enabled: false under static_analysis
  • Enable Black auto-fix (auto_fix: true) to handle formatting
  • Increase refinement attempts for the complexity tier

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

License

MIT License - see LICENSE file for details

Acknowledgments

  • Built with Ollama for local LLM deployment
  • Models: Qwen2.5-Coder, DeepSeek-Coder-V2
  • Inspired by autonomous coding agents and AI-assisted development

Support

For issues or questions:

  • Open an issue on GitHub
  • Check existing issues for solutions
  • Review the configuration documentation
