This project demonstrates how to build a simple multi-agent system using AutoGen where:
- A Tutor Agent writes Python code based on user queries.
- A Reviewer Agent critiques the code for correctness, readability, and improvements.
- A Student Proxy Agent manages the conversation flow, executes code safely in a sandbox, and stores logs.
The system also:
- Saves the entire chat transcript into
chat_log.txt
. - Extracts the Tutor’s final code block and saves it into
final_code.txt
for testing or reuse.
- Tutor Agent: Generates clear, correct Python solutions.
- Reviewer Agent: Reviews Tutor’s solutions and suggests improvements.
- Student Proxy Agent: Automates interaction, prevents infinite loops, and runs code in a sandbox.
- Conversation Logging: Full transcript saved to
chat_log.txt
. - Code Extraction: Clean Python code saved to
final_code.txt
.
code-reviewer-autogen/
│── sandbox/ # Safe environment for code execution
│── chat\_log.txt # Full transcript of Tutor & Reviewer conversation
│── final\_code.txt # Extracted Python code from Tutor
│── main.py # Main script (agents + logic)
│── requirements.txt # Python dependencies
│── .gitignore
│── README.md
-
Clone the repository:
git clone https://github.com/YOUR-USERNAME/code-reviewer-autogen.git cd code-reviewer-autogen
-
Create and activate a virtual environment:
python -m venv venv source venv/bin/activate # macOS/Linux venv\Scripts\activate # Windows
-
Install dependencies:
pip install -r requirements.txt
This project requires an OpenAI API key. The script loads it from system environment variables using:
import os
config = {
"model": "gpt-4o-mini",
"api_key": os.getenv("OPENAI_API_KEY")
}
-
macOS/Linux (bash/zsh):
export OPENAI_API_KEY="your_api_key_here"
-
Windows (PowerShell):
setx OPENAI_API_KEY "your_api_key_here"
If you don’t want to use environment variables, you can hardcode it:
config = {
"model": "gpt-4o-mini",
"api_key": "your_api_key_here"
}
Run the main script:
python main.py
After execution:
chat_log.txt
will contain the full conversation (Tutor + Reviewer).final_code.txt
will contain the clean extracted Python code.
- Student asks Tutor: "Write a function that prints n factorial for n to be any number"
- Tutor generates the Python code.
- Student executes it inside the sandbox.
- Reviewer critiques the Tutor’s code and suggests improvements.
- Logs and final code are saved for review.
- Add a Critic Agent to review the Reviewer’s feedback.
- Integrate with linters (e.g., Pylint/Flake8) as tools for Reviewer.
- Extend pipeline to support multiple rounds of refinement.
- Store logs in a database instead of flat files.