
feat: Add commit message evaluation system #11

Merged
nick-galluzzo merged 11 commits into main from feature/evaluation on Aug 1, 2025


@nick-galluzzo
Owner

This PR introduces a comprehensive commit message evaluation system alongside architectural improvements to the generation workflow.

Key Features

  • New Evaluation Command: Added evaluate command to assess commit
    message quality with LLM-based Chain-of-Thought reasoning
  • Quality Rating System: Implemented human-readable quality levels
    with color-coded display formatting
  • Commit Message Generation Service: Refactored generation logic into
    dedicated service layer with proper models
  • Enhanced Git Parsing: Added support for parsing specific commit
    diffs with improved error handling

Major Changes

  • Added evaluation module with quality assessment models and display
    formatters
  • Restructured AI prompt management: prompt construction moved out of
    AIClient, and generation logic now lives in a dedicated service
    layer used by the CLI

Minor Changes

  • Reduced score display precision from 2 decimal places to 1
  • Enhanced type safety with additional type hints
  • Comprehensive test coverage for new functionality

Commands

  • diffmage evaluate - Evaluate existing commit message quality
  • diffmage generate - Generate commit messages (improved architecture)

New Pydantic models for evaluating commit message quality with WHAT/WHY dimensions, a scoring system, and quality thresholds. Includes the EvaluationDimension enum and the EvaluationResult model with validation rules and serialization methods.
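To make the shape of these models concrete, here is a minimal sketch. It uses standard-library dataclasses in place of Pydantic so it runs without dependencies. The field names, the averaged `overall_score`, and `to_dict` are assumptions for illustration; only the `EvaluationDimension` and `EvaluationResult` names and the 1-5 WHAT/WHY scale come from this PR.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class EvaluationDimension(str, Enum):
    WHAT = "what"  # does the message describe what changed?
    WHY = "why"    # does the message explain why it changed?


@dataclass(frozen=True)
class EvaluationResult:
    what_score: float  # hypothetical field name
    why_score: float   # hypothetical field name
    reasoning: str = ""

    def __post_init__(self) -> None:
        # Validation rule: both scores must sit on the 1-5 scale.
        for name in ("what_score", "why_score"):
            value = getattr(self, name)
            if not 1.0 <= value <= 5.0:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def overall_score(self) -> float:
        # Assumption: the overall score is the plain mean of both dimensions.
        return (self.what_score + self.why_score) / 2

    def to_dict(self) -> dict:
        # Serialization: stored fields plus the derived overall score.
        data = asdict(self)
        data["overall_score"] = self.overall_score
        return data
```

With Pydantic, the range check would more likely be expressed through field constraints than through `__post_init__`.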
…h Chain-of-Thought reasoning

- Introduce LLMEvaluator class for assessing commit message quality
- Add evaluation prompts with few-shot examples and structured JSON responses
- Implement EvaluationResult model with WHAT/WHY scoring (1-5 scale)
- Update CLI to display evaluation results alongside generated commit messages
- Refactor AI client to support both generation and evaluation workflows
- Add comprehensive tests for evaluation system components
- Update default model to qwen_3_coder (non-free version)
- Improve debugging workflow with interactive debug runner script
- Remove deprecated debugpy attachment configuration from CLI
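The structured JSON responses mentioned in this commit could be parsed along these lines. This is a sketch only: the key names (`what_score`, `why_score`, `reasoning`) and the `parse_evaluation_response` helper are hypothetical, and the real response schema is not shown in this PR.

```python
import json


def parse_evaluation_response(raw: str) -> dict:
    """Extract and validate WHAT/WHY scores from an LLM reply (hypothetical keys)."""
    # Models sometimes wrap the JSON in extra prose, so take the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    data = json.loads(raw[start : end + 1])  # JSONDecodeError is a ValueError

    scores: dict = {}
    for key in ("what_score", "why_score"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = float(data[key])
        if not 1.0 <= value <= 5.0:  # scores use the 1-5 scale
            raise ValueError(f"{key} out of range: {value}")
        scores[key] = value
    scores["reasoning"] = str(data.get("reasoning", ""))
    return scores
```

Raising `ValueError` for every failure mode lets the caller handle malformed replies with a single `except` clause.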
…ommand

- Added new evaluate CLI command for commit message evaluation
- Removed automatic evaluation from generate command
- Updated CLI module imports and registrations
- Fixed missing newline in evaluate.py file
…-readable levels

- Introduce ScoreThresholds constants for quality ratings
- Add QualityRater class with methods for quality level assessment
- Update EvaluationResult model with quality_level and is_high_quality properties
- Replace inline quality logic with centralized rating system
- Add comprehensive tests for new rating functionality
- Update existing tests to use new quality_level property
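A centralized rating system of this kind might look roughly like the sketch below. The threshold values, level names, and method names are assumptions for illustration; only the `ScoreThresholds` and `QualityRater` names appear in this PR.

```python
class ScoreThresholds:
    # Hypothetical cut-offs on the 1-5 scale; the real values are not shown here.
    EXCELLENT = 4.5
    GOOD = 3.5
    AVERAGE = 2.5
    POOR = 1.5


class QualityRater:
    @staticmethod
    def get_quality_level(score: float) -> str:
        """Map a numeric score to a human-readable level, highest band first."""
        if score >= ScoreThresholds.EXCELLENT:
            return "Excellent"
        if score >= ScoreThresholds.GOOD:
            return "Good"
        if score >= ScoreThresholds.AVERAGE:
            return "Average"
        if score >= ScoreThresholds.POOR:
            return "Poor"
        return "Very Poor"

    @staticmethod
    def is_high_quality(score: float) -> bool:
        # Assumption: "high quality" means at least the Good threshold.
        return score >= ScoreThresholds.GOOD
```

Keeping the thresholds in one place means display code and tests read the same constants instead of repeating magic numbers.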
…tests

- Introduce EvaluationService to encapsulate evaluation workflow
- Update CLI command to use new service and display formatter
- Fix typo in prompt template tag
- Reduce AI client temperature from 0.3 to 0.1 for more deterministic results
- Remove redundant model_used field from evaluation response parsing
- Move prompt construction from AIClient to CLI generate command
- Simplify AIClient to accept pre-built prompts
- Update tests to reflect new parameter structure
- Remove redundant validation logic from client layer
- Introduce GenerationService as high-level interface for commit message generation
- Add GenerationResult and GenerationRequest models for structured data handling
- Create CommitMessageGenerator for LLM-based message creation
- Refactor CLI generate command to use new service layer
- Update tests to reflect new architecture and component responsibilities
- Rename LLMEvaluator to CommitMessageEvaluator for clarity
- Remove direct AI client and diff parser usage from CLI layer

This change establishes a cleaner separation of concerns in the generation pipeline while maintaining existing functionality.
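The separation of concerns described above can be sketched as follows. The class names `GenerationService`, `GenerationRequest`, and `GenerationResult` come from this PR, but every field, the `build_prompt` helper, and the injected `generate_fn` callable are assumptions standing in for the real prompt manager and AI client.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GenerationRequest:
    diff: str                          # hypothetical field: the git diff text
    model_name: str = "example-model"  # hypothetical placeholder, not a real model id


@dataclass(frozen=True)
class GenerationResult:
    message: str
    model_name: str


def build_prompt(diff: str) -> str:
    # Hypothetical prompt builder; it lives outside the client, which only
    # ever receives a pre-built prompt string.
    return f"Write a commit message for this diff:\n{diff}"


class GenerationService:
    """High-level entry point so the CLI never touches the client directly."""

    def __init__(self, generate_fn: Callable[[str], str]) -> None:
        # generate_fn stands in for the LLM call (prompt in, text out).
        self._generate_fn = generate_fn

    def generate(self, request: GenerationRequest) -> GenerationResult:
        if not request.diff.strip():
            raise ValueError("diff is empty")
        prompt = build_prompt(request.diff)
        message = self._generate_fn(prompt).strip()
        return GenerationResult(message=message, model_name=request.model_name)


if __name__ == "__main__":
    # A fake generator makes the service testable without network access.
    service = GenerationService(lambda prompt: "feat: add evaluate command\n")
    print(service.generate(GenerationRequest(diff="+ def evaluate(): ...")).message)
```

Injecting the generator as a callable is one way to keep the CLI and tests decoupled from any particular AI client.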
@nick-galluzzo nick-galluzzo merged commit 8caec28 into main Aug 1, 2025
3 checks passed
@nick-galluzzo nick-galluzzo deleted the feature/evaluation branch August 1, 2025 09:49