An autonomous AI system that discovers, solves, and validates GitHub issues using local LLMs via Ollama. The system analyzes issue complexity, extracts relevant context, generates solutions, validates them through static analysis, and learns from successes and failures.
- Automated Issue Discovery: Finds and ranks GitHub issues by complexity and priority
- Context-Aware Solutions: Extracts relevant code context using keyword search and semantic analysis
- Multi-Tier Intelligence: Uses different LLM models based on issue complexity (7B for simple, 14B/16B for complex)
- Validation Pipeline: Validates solutions with Black, flake8, pylint, mypy, bandit, and optional Docker testing
- Learning System: Stores successes and failures for continuous improvement
- Refinement Loop: Iteratively improves solutions based on validation errors
- Python 3.11+
- Ollama installed and running
- GitHub API token
- Docker (optional, for testing)
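A quick way to confirm the prerequisites before installing is a small preflight script. The sketch below is not part of the project; `check_prerequisites` is a hypothetical helper that only checks the Python version and whether the `ollama` and `docker` executables are on the PATH. It does not verify that Ollama is actually running or that a GitHub token is configured.

```python
import shutil
import sys
from typing import Callable, Dict, Optional, Tuple

MIN_PYTHON = (3, 11)  # from the requirements list above


def check_prerequisites(
    python_version: Tuple[int, int] = tuple(sys.version_info[:2]),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which prerequisites appear to be installed on this machine."""
    return {
        "python": tuple(python_version) >= MIN_PYTHON,
        "ollama": which("ollama") is not None,
        # Docker is optional, so False here is informational only.
        "docker": which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

The `which` parameter is injectable so the check can be exercised without the tools installed.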
- Clone the repository
git clone https://github.com/cormaclydon/autonomous-coder.git
cd autonomous-coder
- Install dependencies
pip install -r requirements.txt
- Install Ollama models
# Install recommended models
ollama pull qwen2.5-coder:7b # 4.7GB - Fast, for simple issues
ollama pull qwen2.5-coder:14b # 9GB - For medium complexity
ollama pull deepseek-coder-v2:16b # 10GB - For complex issues
- Configure the system
cp config/config.yaml.example config/config.yaml
# Edit config.yaml with your GitHub token and preferences

# Process a specific issue
python src/main.py --issue "https://github.com/owner/repo/issues/123"
# Run in autonomous mode (discovers issues automatically)
python src/main.py --mode autonomous --duration 8h

- Searches GitHub for issues matching configured labels and complexity
- Ranks issues by complexity (tier 1-3) and priority
- Filters by repository stars, activity, and maintainer responsiveness
- Extracts issue description, comments, and code snippets
- Searches repository for relevant files using keyword matching
- Builds context prompt with file contents and issue details
- Selects appropriate LLM model based on complexity tier
- Generates solution using structured XML format
- Iteratively refines solution based on validation errors (the attempt limit ranges from 5 to 15)
- Static Analysis: Black (auto-fix), flake8, pylint, mypy, bandit
- Docker Testing: Runs repository test suite in isolated environment
- Confidence Scoring: Calculates confidence based on validation results
- Stores successful solutions in data/training_data/successes/
- Stores failures in data/training_data/failures/
- Tracks metrics for continuous improvement
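The generate, validate, refine, and store steps listed above can be sketched as a small loop. This is an illustration under stated assumptions, not the project's implementation: `refine`, `store`, and `Outcome` are hypothetical names, the generator and validator are passed in as plain callables, and the JSON file layout is invented for the example. Only the two directory names come from the list above.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class Outcome:
    solved: bool
    attempts: int
    solution: Optional[str]
    errors: List[str] = field(default_factory=list)


def refine(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], List[str]],
    max_attempts: int = 5,
) -> Outcome:
    """Generate a solution, validate it, and retry with the errors as feedback."""
    errors: List[str] = []
    solution: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        solution = generate(errors)  # previous errors guide the next attempt
        errors = validate(solution)  # an empty list means validation passed
        if not errors:
            return Outcome(True, attempt, solution)
    return Outcome(False, max_attempts, solution, errors)


def store(outcome: Outcome, issue_id: str,
          root: Path = Path("data/training_data")) -> Path:
    """Write the outcome as JSON under successes/ or failures/."""
    folder = root / ("successes" if outcome.solved else "failures")
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{issue_id}.json"  # hypothetical file naming
    path.write_text(json.dumps(asdict(outcome), indent=2))
    return path
```

Passing the previous validation errors back into `generate` is what makes each retry a refinement rather than a fresh attempt.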
Edit config/config.yaml to customize:
- API token and rate limits
- Target repositories and labels
- Repository star range
models:
  tier_1: # Simple issues (<30 complexity score)
    primary: "qwen2.5-coder:7b"
  tier_2: # Medium issues (30-60 complexity score)
    primary: "qwen2.5-coder:7b"
    fallback: "qwen2.5-coder:14b"
  tier_3: # Complex issues (60-100 complexity score)
    primary: "deepseek-coder-v2:16b"
    fallback: "qwen2.5-coder:14b"

static_analysis:
  black:
    enabled: true
    auto_fix: true # Automatically fix formatting
  flake8:
    enabled: true
  pylint:
    enabled: true
    min_score: 7.0
  mypy:
    enabled: true
  bandit:
    enabled: true

autonomous-coder/
├── src/
│   ├── modules/
│   │   ├── issue_discovery/      # Issue search and ranking
│   │   ├── context_extraction/   # Code context extraction
│   │   ├── solution_generation/  # LLM solution generation
│   │   ├── validation/           # Static analysis & testing
│   │   └── learning/             # Training data storage
│   ├── utils/                    # Shared utilities
│   └── main.py                   # Main orchestration
├── config/
│   └── config.yaml.example       # Configuration template
├── data/
│   ├── repos/                    # Cloned repositories
│   ├── training_data/            # Successes and failures
│   └── embeddings/               # Vector embeddings
└── logs/                         # System logs
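The static_analysis settings in config/config.yaml could be turned into checker invocations along the following lines. This is a sketch, not the code in src/modules/validation/: `build_commands` and `run_checks` are hypothetical names, and mapping `auto_fix: false` to Black's `--check` flag and `min_score` to pylint's `--fail-under` option is an assumption about how those settings are meant to be used.

```python
import shutil
import subprocess
from typing import Dict, List

# Mirrors the static_analysis section of config.yaml as a plain dict.
EXAMPLE_CONFIG = {
    "black": {"enabled": True, "auto_fix": True},
    "flake8": {"enabled": True},
    "pylint": {"enabled": True, "min_score": 7.0},
    "mypy": {"enabled": True},
    "bandit": {"enabled": True},
}


def build_commands(config: Dict[str, dict], target: str) -> List[List[str]]:
    """Build one command line per enabled checker."""
    commands = []
    for tool, options in config.items():
        if not options.get("enabled", False):
            continue
        cmd = [tool]
        if tool == "black" and not options.get("auto_fix", False):
            cmd.append("--check")  # report formatting issues without rewriting
        if tool == "pylint" and "min_score" in options:
            cmd.append(f"--fail-under={options['min_score']}")
        cmd.append(target)
        commands.append(cmd)
    return commands


def run_checks(config: Dict[str, dict], target: str) -> Dict[str, str]:
    """Run each enabled checker and collect output from the ones that fail."""
    failures = {}
    for cmd in build_commands(config, target):
        if shutil.which(cmd[0]) is None:
            failures[cmd[0]] = "not installed"
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures[cmd[0]] = result.stdout + result.stderr
    return failures
```

An empty dict from `run_checks(EXAMPLE_CONFIG, "path/to/file.py")` would mean every enabled checker exited cleanly.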
The system successfully resolved fastapi/typer#1159 - "help line length miscalculated when using stylized text":
- Complexity: Tier 2 (score: 58.9)
- Model Used: qwen2.5-coder:7b
- Solution: Added format_rich_text() function to strip \b characters
- Validation: Passed Black formatting, confidence 0.88
- Attempts: 1 (first try success)
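The model choice in this example matches the tier thresholds in the models configuration: a score of 58.9 falls in tier 2, whose primary model is qwen2.5-coder:7b. A minimal sketch of that lookup follows. `tier_for` and `models_for` are hypothetical names, and because the documented ranges overlap at the boundaries, assigning a score of exactly 30 to tier 2 and exactly 60 to tier 3 is an assumption.

```python
from typing import Optional, Tuple

# Copied from the models section of config.yaml.
MODELS = {
    "tier_1": {"primary": "qwen2.5-coder:7b"},
    "tier_2": {"primary": "qwen2.5-coder:7b", "fallback": "qwen2.5-coder:14b"},
    "tier_3": {"primary": "deepseek-coder-v2:16b", "fallback": "qwen2.5-coder:14b"},
}


def tier_for(score: float) -> str:
    """Map a complexity score to a tier name."""
    if score < 30:
        return "tier_1"
    if score < 60:  # assumption: exactly 60 is treated as tier 3
        return "tier_2"
    return "tier_3"


def models_for(score: float) -> Tuple[str, Optional[str]]:
    """Return (primary, fallback) for a score; fallback may be None."""
    entry = MODELS[tier_for(score)]
    return entry["primary"], entry.get("fallback")


print(models_for(58.9))
```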
If you see Read timed out errors, increase the timeout in config.yaml:
ollama:
  timeout: 300 # Increase for larger models

If the model returns empty solutions, check:
- Ollama is running: ollama list
- Model is downloaded: ollama pull qwen2.5-coder:7b
- Sufficient VRAM/RAM for the model
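The first two checks can also be scripted against Ollama's local REST API, which lists downloaded models at /api/tags. The sketch below assumes Ollama is listening on its default address, http://localhost:11434; `installed_models` and `diagnose` are hypothetical helper names, not part of this project.

```python
import json
import urllib.error
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def model_names(tags_payload: dict) -> List[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]


def installed_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> List[str]:
    """Ask the local Ollama server which models are downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))


def diagnose(model: str, base_url: str = OLLAMA_URL) -> str:
    """Return a short hint covering the first two checks above."""
    try:
        names = installed_models(base_url)
    except (urllib.error.URLError, OSError):
        return "Ollama is not reachable; is it running?"
    if model not in names:
        return f"{model} is not downloaded; run: ollama pull {model}"
    return "ok"


if __name__ == "__main__":
    print(diagnose("qwen2.5-coder:7b"))
```

Memory pressure is not covered here; that still has to be checked with the operating system's own tools.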
If solutions repeatedly fail validation:
- Temporarily disable strict checkers (mypy, pylint, bandit)
- Enable Black auto-fix to handle formatting
- Increase refinement attempts for the complexity tier
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
MIT License - see LICENSE file for details
- Built with Ollama for local LLM deployment
- Models: Qwen2.5-Coder, DeepSeek-Coder-V2
- Inspired by autonomous coding agents and AI-assisted development
For issues or questions:
- Open an issue on GitHub
- Check existing issues for solutions
- Review the configuration documentation