
CodeReviewer

Status: 🚧 Early Development - Not yet runnable

A multi-agent code review system that simulates real-world peer review discussions using heterogeneous LLM models, with support for both cloud-based and on-device (MacBook) inference.

Overview

CodeReviewer introduces a novel approach to automated code review by orchestrating multiple LLM agents in different reviewer roles (Senior Developer, Junior Developer) that engage in multi-round discussions, just like human reviewers would. This collaborative approach aims to produce more nuanced, thorough, and reliable code reviews than single-agent systems.

Why Multi-Agent?

Traditional automated code review tools use a single model to analyze code. CodeReviewer takes a different approach:

  • Role-Based Agents: Each LLM assumes a specific role (Senior/Junior) with distinct perspectives and system prompts
  • Multi-Round Discussions: Agents review each other's feedback, building consensus through iterative dialogue
  • Heterogeneous Models: Different agents can use different models (cloud or local) optimized for their role
  • Consensus Detection: The system identifies when agents reach agreement, mimicking real code review workflows

Key Features (Planned)

  • ✅ Multi-Agent Pipeline: Configurable conversation flow between Senior and Junior developer agents
  • ✅ Hybrid Inference: Seamlessly switch between cloud APIs (OpenAI, Anthropic) and on-device models
  • ✅ On-Device First: Optimized for Apple Silicon with quantized models (GGUF) and GPU acceleration
  • ✅ Multiple Input Sources: Support for Git diffs, local files, and GitHub Pull Requests
  • ✅ Conversation Memory: Intelligent context management for long discussions
  • ✅ Consensus Detection: Automatic detection of agreement between agents
  • ✅ Streaming Output: Real-time feedback as agents discuss code
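As one way the planned conversation memory could work, a sliding window keeps only the most recent messages so long discussions stay within the model's context. The sketch below is a hypothetical design (`ConversationMemory` is not a real type from this repo), and a real implementation would likely budget by tokens rather than message count.

```rust
use std::collections::VecDeque;

// Hypothetical sliding-window memory: bounds context by message count.
struct ConversationMemory {
    window: usize,
    messages: VecDeque<String>,
}

impl ConversationMemory {
    fn new(window: usize) -> Self {
        Self { window, messages: VecDeque::new() }
    }

    fn push(&mut self, msg: String) {
        self.messages.push_back(msg);
        // Evict the oldest messages once the window overflows.
        while self.messages.len() > self.window {
            self.messages.pop_front();
        }
    }

    /// The context that would be sent to the model on the next turn.
    fn context(&self) -> Vec<&str> {
        self.messages.iter().map(|s| s.as_str()).collect()
    }
}

fn main() {
    let mut mem = ConversationMemory::new(3);
    for i in 1..=5 {
        mem.push(format!("message {i}"));
    }
    // Only the 3 most recent messages remain.
    assert_eq!(mem.context().len(), 3);
    assert_eq!(mem.context()[0], "message 3");
    println!("window: {:?}", mem.context());
}
```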

Architecture

┌─────────────────────────────────────────────────────┐
│              UI Layer (Future: Tauri 2.0)           │
├─────────────────────────────────────────────────────┤
│          Multi-Agent Pipeline Coordinator           │
│   ┌──────────────────────────────────────────┐      │
│   │ Senior Agent ←→ Junior Agent             │      │
│   │ • Conversation State Machine             │      │
│   │ • Round Management (up to N rounds)      │      │
│   │ • Consensus Detection                    │      │
│   └──────────────────────────────────────────┘      │
├─────────────────────────────────────────────────────┤
│           LLM Backend Abstraction Layer             │
│   ┌──────────────┬───────────────────────────┐      │
│   │ Cloud APIs   │   On-Device Inference     │      │
│   │ - OpenAI     │   - Candle (GGUF)         │      │
│   │ - Anthropic  │   - Metal Acceleration    │      │
│   │              │   - CoreML (future)       │      │
│   └──────────────┴───────────────────────────┘      │
├─────────────────────────────────────────────────────┤
│              Code Input Layer                       │
│   • Git Diffs (git2) • Files • GitHub PRs           │
└─────────────────────────────────────────────────────┘
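The backend abstraction layer in the diagram could be modeled as a single trait with interchangeable cloud and local implementations, so each agent picks whichever backend suits its role. This is a hedged sketch: the trait and struct names are assumptions, the method bodies are stubs, and the real crate would presumably expose async methods via tokio rather than the blocking signature shown here.

```rust
// Hypothetical backend abstraction; names are illustrative, not the repo's API.
trait LlmProvider {
    fn name(&self) -> &str;
    fn complete(&self, system: &str, prompt: &str) -> Result<String, String>;
}

struct CloudBackend { model: String }
struct LocalBackend { gguf_path: String }

impl LlmProvider for CloudBackend {
    fn name(&self) -> &str { &self.model }
    fn complete(&self, _system: &str, prompt: &str) -> Result<String, String> {
        // A real implementation would call OpenAI/Anthropic via llm-connector.
        Ok(format!("[cloud stub] reviewed: {prompt}"))
    }
}

impl LlmProvider for LocalBackend {
    fn name(&self) -> &str { &self.gguf_path }
    fn complete(&self, _system: &str, prompt: &str) -> Result<String, String> {
        // A real implementation would run a GGUF model through Candle,
        // with Metal acceleration on Apple Silicon.
        Ok(format!("[local stub] reviewed: {prompt}"))
    }
}

fn main() {
    // Heterogeneous agents: each role can use a different backend.
    let backends: Vec<Box<dyn LlmProvider>> = vec![
        Box::new(CloudBackend { model: "cloud-model".into() }),
        Box::new(LocalBackend { gguf_path: "models/local-model.gguf".into() }),
    ];
    for b in &backends {
        println!("{}: {:?}", b.name(), b.complete("system", "fn main() {}"));
    }
}
```

Trait objects (`Box<dyn LlmProvider>`) let the pipeline coordinator stay agnostic about which backend each agent uses.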

Project Structure (Planned)

code-reviewer/
├── Cargo.toml
├── src/
│   ├── main.rs
│   ├── lib.rs
│   │
│   ├── agents/
│   │   ├── mod.rs
│   │   ├── agent.rs              // Agent trait & implementation
│   │   ├── roles.rs              // Senior/Junior role definitions
│   │   ├── pipeline.rs           // Multi-round discussion coordinator
│   │   └── consensus.rs          // Agreement detection logic
│   │
│   ├── llm/
│   │   ├── mod.rs
│   │   ├── provider.rs           // LLMProvider trait
│   │   ├── cloud.rs              // Cloud backend (llm-connector)
│   │   ├── candle_backend.rs     // Candle on-device
│   │   ├── coreml_backend.rs     // CoreML (future)
│   │   └── config.rs             // Backend configuration
│   │
│   ├── code_input/
│   │   ├── mod.rs
│   │   ├── source.rs             // CodeSource trait
│   │   ├── git.rs                // Git diff handling (git2)
│   │   ├── files.rs              // File reader
│   │   └── github.rs             // GitHub PR fetcher (octocrab)
│   │
│   ├── conversation/
│   │   ├── mod.rs
│   │   ├── memory.rs             // Conversation state management
│   │   ├── message.rs            // Message types
│   │   └── state_machine.rs      // Discussion state transitions
│   │
│   ├── prompts/
│   │   ├── mod.rs
│   │   ├── senior_prompts.rs     // Senior developer system prompts
│   │   └── junior_prompts.rs     // Junior developer system prompts
│   │
│   ├── cli/
│   │   ├── mod.rs
│   │   └── commands.rs           // CLI interface
│   │
│   └── config/
│       └── mod.rs                // Config loading, parsing, and management
│
├── models/                       // TBD: Quantized models directory (gitignored)
├── config/
│   └── config.toml               // Application configuration
└── tests/
    ├── integration/
    └── unit/

Tech Stack

Component         Technology                     Rationale
Language          Rust 2021                      Memory safety, performance, async support
Cloud APIs        llm-connector                  Unified interface for OpenAI/Anthropic
On-Device         Candle + Candle-Transformers   Pure Rust, GGUF support, Metal acceleration
Git Integration   git2                           Mature libgit2 bindings
GitHub API        octocrab                       Async PR/issue management
Async Runtime     tokio                          Industry standard
Future UI         Tauri 2.0                      Lightweight, native macOS integration

Recommended Models for On-Device

  • StarCoder2-7B (Q5_K_M): Specialized for code understanding
  • CodeLlama-7B-Instruct (Q5_K_M): Good instruction following
  • Qwen2.5-Coder-7B (Q5_K_M): Latest, excellent code comprehension
  • DeepSeek-Coder-6.7B (Q5_K_M): Strong coding capabilities

All models will use Q5_K_M quantization for an optimal size/quality balance (roughly 4.5-5 GB per 7B model).

Roadmap

Phase 1: Foundation

  • Architecture design
  • Core abstractions (LLMProvider, Agent, CodeSource traits)
  • Cloud API integration
  • Git diff parsing
  • Basic 2-agent conversation (1 round)
  • CLI interface

Phase 2: Multi-Round Pipeline

  • Conversation state machine
  • Memory management (sliding window)
  • Multi-round discussion logic (up to 5 rounds)
  • Consensus detection algorithm
  • GitHub PR input support

Phase 3: On-Device Inference

  • Candle integration for GGUF models
  • Model download/caching system
  • Metal GPU acceleration
  • Performance benchmarking

Phase 4: CoreML Optimization

  • Apple Neural Engine support
  • Model conversion pipeline
  • Performance tuning

Phase 5: UI & Polish

  • Tauri 2.0 desktop application
  • Real-time streaming output
  • Review history & export (Markdown/PDF)
  • Settings UI

Future Enhancements

  • Product Manager agent (3rd role)
  • Fine-tuned models on code review datasets
  • GitHub App for automatic PR reviews
  • Vector database integration for codebase context
  • Web interface

Performance Targets: TBD

Metric                  Cloud         Local (Q5_K_M)
First Token             200-500ms     1-3s
Throughput              50-100 tok/s  20-40 tok/s
Memory                  <500MB        2-3GB
Full Review (500 LOC)   30-60s        2-4min

Contributing

This project is in early development. Contributions, ideas, and feedback are welcome once the foundation is stable.

License

This project is licensed under a Custom Non-Commercial License (MIT-NC Variant); see the LICENSE file for details.


Author: ikhyunAn

Last Updated: October 2025
