
Agentic QA

AI-powered test automation quality evaluator - Assess your test automation projects with structured, repeatable evaluations

Agentic QA is a specialized evaluation tool for test automation projects. It provides comprehensive, structured assessments of test automation quality across multiple dimensions, helping teams identify issues and track improvements over time.

🎯 What is Agentic QA?

Agentic QA uses Claude Code's CLAUDE.md feature to provide structured, multi-step evaluations of test automation projects. Unlike generic AI assistance, it follows a systematic evaluation protocol with:

  • Structured workflows - Consistent 5-step evaluation process
  • Metric-specific analysis - Deep-dive into dependency health, performance, and code quality
  • Actionable recommendations - Specific commands and steps to improve
  • Historical tracking - Compare evaluations over time to measure progress
  • Framework-agnostic - Works with .NET, Node.js, Python, and Java projects

🤖 Available Evaluations

📊 Dependency Health ✅ (Available Now)

Evaluate package freshness and security vulnerabilities.

What it checks:

  • Outdated packages (major, minor, patch versions)
  • Security vulnerabilities (critical, high, moderate, low)
  • Framework-specific dependency commands
  • Generates actionable update commands
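The "framework-specific dependency commands" could look something like the mapping below. This is a minimal sketch: the commands are the standard ones for each ecosystem, but whether the evaluator runs exactly these is an assumption, and `DEPENDENCY_COMMANDS` and `commands_for` are hypothetical names used only for illustration.

```python
# Hypothetical mapping from project type to dependency CLI commands.
# The commands themselves are standard for each ecosystem; which ones
# the evaluator actually invokes is an assumption.
DEPENDENCY_COMMANDS = {
    "dotnet": {
        "outdated": ["dotnet", "list", "package", "--outdated"],
        "vulnerable": ["dotnet", "list", "package", "--vulnerable"],
    },
    "node": {
        "outdated": ["npm", "outdated"],
        "vulnerable": ["npm", "audit"],
    },
    "python": {
        "outdated": ["pip", "list", "--outdated"],
        # pip-audit is a separate tool that must be installed first.
        "vulnerable": ["pip-audit"],
    },
    "java": {
        "outdated": ["mvn", "versions:display-dependency-updates"],
        # Uses the OWASP dependency-check Maven plugin.
        "vulnerable": ["mvn", "org.owasp:dependency-check-maven:check"],
    },
}


def commands_for(project_type: str) -> dict[str, list[str]]:
    """Return the outdated/vulnerable commands for a project type."""
    try:
        return DEPENDENCY_COMMANDS[project_type]
    except KeyError:
        raise ValueError(f"Unsupported project type: {project_type!r}") from None
```

Each command is stored as an argument list so it can be passed directly to `subprocess.run` without shell quoting.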

How to use:

cd /path/to/agentic-qa
claude
# Then type:
"Evaluate this project /path/to/your-automation-project"

⚡ Test Performance 🚧 (Coming Soon)

Measure test execution speed and identify bottlenecks.

Planned checks:

  • Average test duration
  • Total suite execution time
  • Slow test identification (>30s threshold)
  • Parallelization opportunities
  • Setup/teardown overhead

🧹 Code Quality 🚧 (Coming Soon)

Evaluate test code maintainability using Clean Code principles.

Planned checks:

  • Long methods (>50 lines)
  • Large classes (>300 lines)
  • Code duplication
  • Magic numbers
  • Deep nesting (>3 levels)
  • Naming conventions

🚀 Getting Started

Prerequisites

  • Claude Code installed (CLI, Desktop, or Web)
  • A test automation project to evaluate
  • Project-specific CLI tools (dotnet, npm, pip, or mvn)

Installation

Option 1: Use as Standalone Tool (Recommended)

Clone this repository to use as a central evaluation workspace:

git clone https://github.com/lovelynfnt/agentic-qa.git
cd agentic-qa

Option 2: Copy into Your Project

Copy the .claude directory into your test automation project:

cp -r agentic-qa/.claude /path/to/your-test-project/
cd /path/to/your-test-project

Usage

From Standalone Workspace:

cd agentic-qa
claude
# Then ask Claude:
"Evaluate this project /path/to/your-automation-project"

From Your Project:

cd /path/to/your-test-project
claude
# Then ask Claude:
"Evaluate this project"

Evaluation Flow

  1. High-Level Pre-Check - Detects project type, test framework, test count
  2. Metric Selection - Choose which metric to evaluate
  3. Metric-Specific Pre-Check - Validates requirements for selected metric
  4. Run Evaluation - Executes analysis and calculates score
  5. Generate Report - Creates markdown report and JSON history
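As a rough illustration of step 1, project type could be inferred from well-known marker files. The sketch below is an assumption about how such detection might work, not the evaluator's actual logic (which lives in the markdown instructions); `MARKERS` and `detect_project_types` are hypothetical names.

```python
from pathlib import Path

# Hypothetical marker files per project type (an assumption for illustration).
MARKERS = {
    "dotnet": ["*.csproj", "*.sln"],
    "node": ["package.json"],
    "python": ["requirements.txt", "pyproject.toml"],
    "java": ["pom.xml"],
}


def _has_marker(root: Path, pattern: str) -> bool:
    # Look in the project root and one directory level below it, which
    # avoids walking large trees such as node_modules.
    return any(root.glob(pattern)) or any(root.glob(f"*/{pattern}"))


def detect_project_types(root: str) -> list[str]:
    """Return every project type whose marker files are present."""
    root_path = Path(root)
    return [
        project_type
        for project_type, patterns in MARKERS.items()
        if any(_has_marker(root_path, pattern) for pattern in patterns)
    ]
```

Returning a list rather than a single value allows for mixed repositories, for example a .NET test suite alongside a Node.js one.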

Sample Output

🤖 Test Automation Evaluator v1.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Analyzing project structure...

✅ Detected: .NET project with SpecFlow
📊 Found: 42 test scenarios across 8 files

Ready to evaluate!

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 Protocol: High-Level Pre-Check | 📄 Step 1 of 5

[After selecting Dependency Health...]

## Dependency Health Evaluation Complete - Score: 85/100 🟡

### 📦 Package Analysis
- Total Packages: 42 packages (15 direct, 27 transitive)
- Outdated: 8 packages (2 major, 3 minor, 3 patch)
- Vulnerabilities: 1 security issue (High: 1)

### 🚨 Critical Issues (2 found)
1. Selenium.WebDriver is 1 major version behind
   - Current: 3.141.0 → Latest: 4.15.0
   - Fix: `dotnet add package Selenium.WebDriver --version 4.15.0`

2. CVE-2023-12345: High vulnerability in Newtonsoft.Json
   - Fix: Update to version 13.0.2 or higher

📊 Detailed report: .claude/evaluations/2026-05-28_091234_dependency-health.md

🏗️ Architecture

Modular Design

.claude/
├── CLAUDE.md                              # Entry point (routes to evaluator)
├── personas/
│   └── evaluator/
│       ├── evaluator.md                   # Main orchestrator (Steps 1-2)
│       └── metrics/
│           ├── dependency-health.md       # Metric implementation
│           ├── dependency-health-scoring.json
│           ├── test-performance.md        # Coming soon
│           ├── test-performance-scoring.json
│           ├── code-quality.md            # Coming soon
│           └── code-quality-scoring.json
└── evaluations/                           # Generated reports
    └── history/                           # JSON data for trends

Framework Support

Works with any test automation stack:

  • ✅ .NET (SpecFlow, NUnit, xUnit, MSTest)
  • ✅ JavaScript/TypeScript (Jest, Playwright, Cypress, Mocha)
  • ✅ Python (pytest, unittest, Robot Framework)
  • ✅ Java (JUnit, TestNG, Cucumber)

Data Persistence

Evaluations are saved for trend analysis:

  • Markdown reports - Human-readable, shareable (.md files)
  • JSON history - Machine-readable for tracking trends (.json files)
  • Configurable scoring - Adjust thresholds per metric
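Report filenames such as `2026-05-28_091234_dependency-health.md` appear to encode a timestamp and a metric name. The sketch below parses that pattern to find the latest report for a metric. It assumes the middle segment is `HHMMSS`, which the README does not state; `parse_report_name` and `latest_report` are hypothetical helpers.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed filename layout: YYYY-MM-DD_HHMMSS_<metric>.md
REPORT_NAME = re.compile(r"^(\d{4}-\d{2}-\d{2}_\d{6})_(.+)\.md$")


def parse_report_name(filename: str) -> Optional[tuple[datetime, str]]:
    """Split a report filename into (timestamp, metric), or None if it does not match."""
    match = REPORT_NAME.match(filename)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d_%H%M%S")
    return timestamp, match.group(2)


def latest_report(filenames: list[str], metric: str) -> Optional[str]:
    """Return the newest report filename for the given metric, if any."""
    candidates = []
    for name in filenames:
        parsed = parse_report_name(name)
        if parsed is not None and parsed[1] == metric:
            candidates.append((parsed[0], name))
    if not candidates:
        return None
    return max(candidates)[1]
```

In practice the filename list could come from `os.listdir(".claude/evaluations")`.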

🎨 Use Cases

1. Sprint Health Checks

Run evaluations at the start of each sprint to identify technical debt and prioritize improvements.

2. Pre-Release Audits

Evaluate dependency security before major releases to ensure no critical vulnerabilities.

3. Onboarding New Team Members

Use evaluation reports to help new QA engineers understand project health and improvement areas.

4. Continuous Improvement Tracking

Regular evaluations show trends over time, helping teams measure the impact of quality initiatives.

5. Compliance & Security Audits

Generate reports showing dependency health and security posture for compliance reviews.

🛠️ Customization

Adjust Scoring Thresholds

Edit .claude/personas/evaluator/metrics/dependency-health-scoring.json:

{
  "criteria": {
    "outdated_major": {
      "threshold": 0,
      "severity": "critical",
      "points_deduction": 20
    },
    "security_vulnerabilities": {
      "points_deduction_per_vuln": {
        "critical": 25,
        "high": 15,
        "moderate": 8,
        "low": 3
      }
    }
  }
}
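To make the effect of these values concrete, here is one possible reading of the criteria as arithmetic. The README does not spell out the formula, so this sketch assumes the score starts at 100, applies a single flat deduction when the number of major-outdated packages exceeds the threshold, subtracts a per-vulnerability deduction by severity, and clamps at 0. The evaluator's real calculation may differ, and `dependency_health_score` is a hypothetical name.

```python
# Scoring values copied from the example configuration above.
SCORING = {
    "criteria": {
        "outdated_major": {
            "threshold": 0,
            "severity": "critical",
            "points_deduction": 20,
        },
        "security_vulnerabilities": {
            "points_deduction_per_vuln": {
                "critical": 25,
                "high": 15,
                "moderate": 8,
                "low": 3,
            }
        },
    }
}


def dependency_health_score(outdated_major: int, vulns: dict[str, int],
                            scoring: dict = SCORING) -> int:
    """Assumed formula: start at 100, subtract deductions, never go below 0."""
    criteria = scoring["criteria"]
    score = 100

    # Assumption: one flat deduction once the threshold is exceeded.
    major = criteria["outdated_major"]
    if outdated_major > major["threshold"]:
        score -= major["points_deduction"]

    # Assumption: each vulnerability deducts points according to its severity.
    per_vuln = criteria["security_vulnerabilities"]["points_deduction_per_vuln"]
    for severity, count in vulns.items():
        score -= per_vuln[severity] * count

    return max(score, 0)
```

Under these assumptions, raising `points_deduction` or any per-severity value makes the same findings cost more points.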

Change Grade Thresholds

{
  "grade_thresholds": {
    "A": 90,
    "B": 80,
    "C": 70,
    "D": 60,
    "F": 0
  }
}
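A score can be mapped to a letter grade by treating each threshold as an inclusive minimum. That interpretation is an assumption, and `grade_for` is a hypothetical helper rather than part of the tool.

```python
# Grade thresholds copied from the example configuration above.
GRADE_THRESHOLDS = {"A": 90, "B": 80, "C": 70, "D": 60, "F": 0}


def grade_for(score: int, thresholds: dict[str, int] = GRADE_THRESHOLDS) -> str:
    """Return the highest grade whose minimum score is met (assumed inclusive)."""
    # Check thresholds from highest to lowest so the best matching grade wins.
    for grade, minimum in sorted(thresholds.items(), key=lambda kv: kv[1], reverse=True):
        if score >= minimum:
            return grade
    raise ValueError(f"Score {score} is below every threshold")
```

For example, with the thresholds shown, a score of 85 falls in the "B" band, and raising "B" to 86 would move it to "C".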

Add Custom Metrics

  1. Create metrics/my-metric.md with evaluation instructions
  2. Create metrics/my-metric-scoring.json with scoring rules
  3. Update evaluator.md to include the new metric option

📖 Directory Structure

agentic-qa/
├── README.md                              # This file
├── .claude/
│   ├── CLAUDE.md                          # Entry point
│   ├── settings.json                      # Claude Code settings
│   ├── personas/
│   │   └── evaluator/
│   │       ├── evaluator.md               # Main persona
│   │       └── metrics/
│   │           ├── dependency-health.md
│   │           ├── dependency-health-scoring.json
│   │           ├── test-performance.md
│   │           ├── test-performance-scoring.json
│   │           ├── code-quality.md
│   │           └── code-quality-scoring.json
│   └── evaluations/                       # Generated reports
│       └── history/                       # JSON data
└── docs/                                  # Additional documentation (Coming Soon)

🤝 Contributing

We welcome contributions, including:

  • ✨ New metrics (test coverage, flakiness detection, etc.)
  • 📝 Documentation improvements
  • 🎨 Scoring templates for different project types
  • 💡 Feature suggestions
  • 🐛 Bug reports

📊 Roadmap

Version 1.0 (Current)

  • ✅ Modular architecture with per-metric files
  • ✅ Dependency Health metric (complete)
  • ✅ Framework-agnostic detection (.NET, Node.js, Python, Java)
  • ✅ Structured 5-step evaluation flow
  • ✅ Historical tracking with JSON data
  • ✅ Visual protocol indicators (banners, step tracking)

Version 1.1 (Planned)

  • 🚧 Test Performance metric
  • 🚧 Code Quality metric
  • 🚧 "All Metrics" evaluation option
  • 🚧 Trend comparison (current vs previous)
  • 🚧 Grade improvement suggestions

🙏 Acknowledgments

Built with Claude Code by Anthropic.


Note: Agentic QA requires Claude Code and uses natural language commands (not custom slash commands). Type phrases like "Evaluate this project [path]" to activate the evaluator.
