An automated test readability improvement system that uses Model Context Protocol (MCP) and Large Language Models to systematically enhance test suite quality and maintainability.
This project extends Gay (2023), "Improving the Readability of Generated Tests Using GPT-4 and ChatGPT Code Interpreter", by providing an automated, context-aware system for improving test readability. The system combines a two-stage LLM process with MCP tools for code analysis.
- Automated Test Improvement: Transform test suites without manual intervention
- Context-Aware Analysis: Deep understanding of source code and test relationships
- Six Transformation Types: Implements all readability transformations from research
- Quality Validation: Ensures syntactic correctness and PEP 8 compliance
- Batch Processing: Handle multiple test files simultaneously
- Comprehensive Logging: Complete transparency into the improvement process
The six readability transformations:
- Descriptive Test Names: `test_case_0` → `test_queue_initialization_with_valid_size`
- Documentation: Comprehensive docstrings with scenario descriptions
- Clear Test Structure: Arrange-Act-Assert (AAA) pattern with explicit comments
- Meaningful Variable Names: `int_0` → `max_size`, `queue_0` → `queue`
- Informative Assertion Messages: Clear failure messages for better debugging
- Code Organization: Logical grouping and improved structure
Requirements:
- Python 3.13+
- OpenRouter API key (supports GPT-4o, Claude, etc.)
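A quick preflight check for these two requirements might look like the sketch below. It assumes the key is exposed as the `OPENROUTER_API_KEY` environment variable (the name used in the `.env` step of this README) and that it has already been loaded into the process environment; `check_prerequisites` is a hypothetical helper, not part of the project.

```python
import os
import sys

REQUIRED_PYTHON = (3, 13)           # minimum version listed in the requirements
API_KEY_VAR = "OPENROUTER_API_KEY"  # variable name taken from the .env step


def check_prerequisites(environ=None, version=None) -> list[str]:
    """Return a list of problems; an empty list means the setup looks ready."""
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = []
    if version < REQUIRED_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append(f"Python 3.13+ is required, found {found}")
    if not environ.get(API_KEY_VAR, "").strip():
        problems.append(f"{API_KEY_VAR} is not set")
    return problems


for problem in check_prerequisites():
    print(problem)
```

The optional `environ` and `version` parameters exist only so the check can be exercised without changing the real environment.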
Installation:
- Clone the repository:
git clone <repository-url>
cd mcp-enhanced-testextender
- Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install --upgrade pip
pip install -r requirements_mcp.txt
- Configure the environment:
# Create .env file and add your API key
echo "OPENROUTER_API_KEY=your_api_key_here" > .env

Run the complete demonstration:
python3 demo_real_mcp.py

Improve individual test files:
# Queue Example
python3 test_improvement_orchestrator.py \
--source examples/queue_example/queue_example.py \
--test examples/queue_example/test_queue_example.py \
--output improved_tests
# HTTPie Sessions Example
python3 test_improvement_orchestrator.py \
--source examples/httpie_sessions/sessions.py \
--test examples/httpie_sessions/test_httpie_sessions.py \
--output improved_tests
# String Utils Validation Example
python3 test_improvement_orchestrator.py \
--source examples/string_utils_validation/validation.py \
--test examples/string_utils_validation/test_string_utils_validation.py \
--output improved_tests

Programmatic usage:
import asyncio

from test_improvement_orchestrator import improve_single_test_file


async def main():
    # improve_single_test_file is awaited, so it has to run inside a coroutine.
    result = await improve_single_test_file(
        source_file="examples/queue_example/queue_example.py",
        test_file="examples/queue_example/test_queue_example.py",
        output_dir="improved_tests",
    )
    if result.success:
        print(f"✅ Improved test saved to: {result.improved_filepath}")
    else:
        print(f"❌ Error: {result.error}")


asyncio.run(main())

Project structure:
mcp-enhanced-testextender/
├── examples/                                    # Test case examples
│   ├── queue_example/
│   │   ├── queue_example.py                     # Source: Queue data structure
│   │   └── test_queue_example.py                # Original tests
│   ├── httpie_sessions/
│   │   ├── sessions.py                          # Source: HTTP session management
│   │   └── test_httpie_sessions.py              # Original tests
│   └── string_utils_validation/
│       ├── validation.py                        # Source: String validation utilities
│       └── test_string_utils_validation.py      # Original tests
├── improved_tests/                              # Generated improved test files
├── mcp_tools_server.py                          # MCP tools implementation
├── openrouter_llm_client.py                     # Two-stage LLM processing
├── test_improvement_orchestrator.py             # Main workflow coordinator
├── demo_real_mcp.py                             # Complete demonstration script
├── requirements_mcp.txt                         # Project dependencies
└── README.md                                    # This file
The MCP tools server (`mcp_tools_server.py`) provides six specialized tools for code analysis:
- Source Code Reader: Parse and extract class/method information
- Test File Analyzer: Analyze test structure and quality metrics
- Context Discovery: Map relationships between source and test code
- Code Quality Checker: Validate syntax and identify improvement areas
- Prompt Generator: Create dynamic, context-specific instructions
- Test Transformer: Apply systematic transformations
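As an illustration of what the Source Code Reader step involves, the sketch below extracts class and method names with Python's standard `ast` module. The function name `extract_classes_and_methods` and the sample `Queue` source are stand-ins for illustration; the project's own tool may collect more detail or work differently.

```python
import ast


def extract_classes_and_methods(source: str) -> dict[str, list[str]]:
    """Map each top-level class name to the names of its methods."""
    tree = ast.parse(source)
    summary: dict[str, list[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            summary[node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return summary


# Hypothetical stand-in for examples/queue_example/queue_example.py.
example_source = '''
class Queue:
    def __init__(self, size_max):
        self.max = size_max

    def enqueue(self, item):
        return True
'''
print(extract_classes_and_methods(example_source))
```

A summary like this gives the later prompt-generation step concrete names to refer to, such as which method a test exercises.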
The LLM client (`openrouter_llm_client.py`) works in two stages:
- Stage 1: Generate context-aware improvement prompts
- Stage 2: Transform tests using generated instructions
- OpenRouter Integration: Support for multiple LLM providers
The orchestrator (`test_improvement_orchestrator.py`) coordinates the run:
- Workflow Coordination: Manages the complete improvement process
- Quality Validation: Ensures correctness and compliance
- Output Generation: Creates improved tests, reports, and logs
| Metric | Original | Improved |
|---|---|---|
| Generic Test Names | 100% | 0% |
| Missing Docstrings | 100% | 0% |
| Quality Issues | 37 total | 0 |
| Syntactic Correctness | ✅ | ✅ |
- Average Processing Time: ~63 seconds per test file
- Token Efficiency: ~14,000 tokens per test file
- Success Rate: 100% (maintains all original functionality)
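The averages above allow a rough estimate of what a larger run would cost. The sketch assumes files are processed one after another and that the per-file averages hold for other test files, neither of which is guaranteed; `estimate_batch` is a hypothetical helper.

```python
AVG_SECONDS_PER_FILE = 63      # average processing time reported above
AVG_TOKENS_PER_FILE = 14_000   # average token usage reported above


def estimate_batch(file_count: int) -> tuple[float, int]:
    """Return (minutes, tokens) for processing files sequentially."""
    if file_count < 0:
        raise ValueError("file_count must be non-negative")
    minutes = file_count * AVG_SECONDS_PER_FILE / 60
    tokens = file_count * AVG_TOKENS_PER_FILE
    return minutes, tokens


minutes, tokens = estimate_batch(10)
print(f"10 files: about {minutes:.1f} min and {tokens:,} tokens")
```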
Each improvement run generates:
- Improved Test File: `test_[original]_[example]_[timestamp].py`
- Improvement Report: `improvement_report_[example]_[timestamp].md`
- Complete Prompt Log: `all_prompts_[example]_[timestamp].txt`
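For illustration, output paths following these three patterns could be built as shown below. Two details are assumptions, since the patterns do not specify them: the timestamp format (`%Y%m%d_%H%M%S`) and the reading of `[original]` as the original test file's name without its `test_` prefix. `build_output_paths` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path


def build_output_paths(output_dir, test_file, example, now=None) -> dict[str, Path]:
    """Build the three output paths for one improvement run."""
    # Assumed timestamp format; the README does not state the real one.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # Assumed meaning of [original]: the test file stem without "test_".
    original = Path(test_file).stem.removeprefix("test_")
    base = Path(output_dir)
    return {
        "test": base / f"test_{original}_{example}_{stamp}.py",
        "report": base / f"improvement_report_{example}_{stamp}.md",
        "prompts": base / f"all_prompts_{example}_{stamp}.txt",
    }


paths = build_output_paths(
    "improved_tests",
    "examples/queue_example/test_queue_example.py",
    "queue",
)
for kind, path in paths.items():
    print(kind, path)
```

Sharing one timestamp across all three files makes it easy to match a report and prompt log to the test file they describe.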
Use a different model:
python3 test_improvement_orchestrator.py \
--source examples/queue_example/queue_example.py \
--test examples/queue_example/test_queue_example.py \
--model "anthropic/claude-3-sonnet"

Focus on one improvement area:
python3 test_improvement_orchestrator.py \
--source examples/queue_example/queue_example.py \
--test examples/queue_example/test_queue_example.py \
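--focus documentation  # Options: naming, documentation, structure, assertions, setup, all

Process several test files in one run:
import asyncio

# Assumption: improve_multiple_test_files is exported by the same module
# as improve_single_test_file.
from test_improvement_orchestrator import improve_multiple_test_files

test_configurations = [
    {
        "source_file": "examples/queue_example/queue_example.py",
        "test_file": "examples/queue_example/test_queue_example.py",
    },
    {
        "source_file": "examples/httpie_sessions/sessions.py",
        "test_file": "examples/httpie_sessions/test_httpie_sessions.py",
    },
]

# `await` only works inside a coroutine, so a script drives the call with asyncio.run.
results = asyncio.run(improve_multiple_test_files(test_configurations))

Example test before improvement:
def test_case_0():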
    int_0 = 1256
    queue_0 = module_0.Queue(int_0)
    assert queue_0.max == 1256
    assert queue_0.head == 0
    assert queue_0.tail == 0
    assert queue_0.size == 0

The same test after improvement:
def test_queue_initialization_with_valid_size():
    """
    Test the __init__ method of the Queue class.
    Scenario: Initialize a queue with a valid size.
    Expected Outcome: The queue is initialized with the correct attributes.
    """
    # Arrange
    max_size = 1256
    # Act
    queue = module_0.Queue(max_size)
    # Assert
    assert queue.max == max_size, "The max size should be set correctly."
    assert queue.head == 0, "The head should be initialized to 0."
    assert queue.tail == 0, "The tail should be initialized to 0."
    assert queue.size == 0, "The size should be initialized to 0."

To contribute:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Gay, G. (2023): "Improving the Readability of Generated Tests Using GPT-4 and ChatGPT Code Interpreter" - Original research foundation
- Model Context Protocol (MCP): Standardized tool communication framework
- OpenRouter: Multi-provider LLM access platform
This project was developed as part of a Ph.D. application demonstration for the TestExtender project, focusing on automated test improvement and software engineering research. The implementation demonstrates practical applications of LLM-assisted software development tools with emphasis on code quality and maintainability.