Vera is an extensible testing engine designed to bring software engineering rigor to AI feature development. It provides a standardized framework for evaluating Generative AI outputs using a Hybrid Evaluation approach that combines deterministic static checks with semantic "LLM-as-a-Judge" evaluation.
Building reliable AI features is hard. Traditional unit tests struggle with the probabilistic nature of LLMs, while manual testing is subjective and does not scale. Vera bridges this gap.
- 🤖 Hybrid Evaluation: Strict programmatic validation (e.g., "is valid JSON?") combined with nuanced AI grading (e.g., "is the tone professional?").
- 📏 Spec-Driven Quality: Define success using natural-language Rubrics, Safety Constraints, and Golden Datasets that the LLM Judge follows.
- ⚡ High Performance: Built on asyncio and anyio for parallel test execution.
- 🔌 Plugin Architecture: Deeply extensible via pluggy. Add new features, commands, or reporters easily.
- 📊 Standardized Reporting: Get detailed CSV reports with granular scores, reasoning, and pass/fail status.
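
To make the parallel execution point concrete, here is a minimal sketch of bounded concurrency using only the standard library's asyncio. It is not Vera's actual scheduler: `run_case`, `run_all`, and `max_parallel` are hypothetical names chosen for illustration, and the sleep stands in for a slow model call.

```python
import asyncio


async def run_case(case_id: int, limit: asyncio.Semaphore) -> tuple[int, str]:
    """Hypothetical stand-in for invoking the feature under test on one input."""
    async with limit:  # at most max_parallel cases hold the semaphore at once
        await asyncio.sleep(0.01)  # simulates a slow model or network call
        return case_id, f"output-{case_id}"


async def run_all(case_ids: list[int], max_parallel: int = 4) -> list[tuple[int, str]]:
    """Run every case concurrently, capped at max_parallel in flight."""
    limit = asyncio.Semaphore(max_parallel)
    # gather() schedules all cases and returns results in input order.
    return await asyncio.gather(*(run_case(c, limit) for c in case_ids))


results = asyncio.run(run_all([1, 2, 3, 4, 5]))
print(results)
```

The semaphore keeps the number of in-flight calls bounded, which matters when the judge or the feature under test sits behind a rate-limited API.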
- Installation and Usage: Setup guide and CLI reference.
- Testing Philosophy: Deep dive into "LLM-as-a-Judge" and Test Specs.
- Plugin Development: How to build custom evaluation plugins.
- Contributing: Guide for contributing to the Vera core engine.
Vera evaluates your AI feature in four steps:
- Execution: Runs your feature against a set of inputs (Test Cases).
- Static Analysis: Runs Python-based checks on the output (e.g., regex, syntax validation).
- LLM Evaluation: An "LLM Judge" (e.g., Gemini) grades the output based on your provided Specs (Rubrics, Safety Constraints, Style Guides).
- Reporting: Aggregates all scores and reasoning into a structured report.
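
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Vera's implementation: `feature`, `static_check`, `judge`, `evaluate`, the 0.7 pass threshold, and the CSV column names are all hypothetical, and the judge is a local stub where a real run would call an LLM.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Result:
    case_id: str
    static_pass: bool
    judge_score: float
    reasoning: str

    @property
    def passed(self) -> bool:
        # Hybrid verdict: the static check gates the result, then the judge score decides.
        return self.static_pass and self.judge_score >= 0.7  # assumed threshold


def feature(prompt: str) -> str:
    """Step 1 stand-in: the AI feature under test (hypothetical)."""
    return json.dumps({"sql": f"SELECT * FROM {prompt}"})


def static_check(output: str) -> bool:
    """Step 2: deterministic check, here 'is the output valid JSON?'."""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False


def judge(output: str, rubric: str) -> tuple[float, str]:
    """Step 3 stand-in: a real judge would send the output and rubric to an LLM."""
    score = 1.0 if "SELECT" in output else 0.0
    return score, f"Checked against rubric: {rubric}"


def evaluate(cases: dict[str, str], rubric: str) -> str:
    """Step 4: aggregate every result into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["case_id", "static_pass", "judge_score", "reasoning", "passed"])
    for case_id, prompt in cases.items():
        output = feature(prompt)
        score, reasoning = judge(output, rubric)
        result = Result(case_id, static_check(output), score, reasoning)
        writer.writerow([result.case_id, result.static_pass, result.judge_score,
                         result.reasoning, result.passed])
    return buffer.getvalue()


report = evaluate({"case-1": "users", "case-2": "orders"}, "Query must be read-only")
print(report)
```

Keeping the static check and the judge as separate functions means a cheap deterministic failure can be reported without depending on the judge's opinion.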
- Python 3.14+
- uv (Recommended for dependency management)
- Set up the environment:

      # Create a virtual environment with Python 3.14
      uv venv --python 3.14
      source .venv/bin/activate

- Install Vera Core:

      # Install the core engine from the local directory
      uv pip install vera

- Install an Example Plugin: Vera requires plugins to define what to test. Let's install the SQL Query Assistant example.

      uv pip install plugin_example/vera_sql_query_assistant

- Configure API Key: Vera uses Gemini as the default judge. You'll need an API key.

      vera config -k
      # Follow the prompt to enter your key securely

- Verify Setup: Check if the plugin is recognized.

      vera list
      # Should show: vera_sql_query_assistant

- Run the Evaluation:

      # Run tests and save results to ./out
      vera test --dst-dir ./out

You can now open the generated CSV file in the out/ directory to analyze the results.
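
If you prefer to analyze the report programmatically, a small script along these lines may help. It is only a sketch: the column names (`case_id`, `score`, `passed`) and the inline sample data are assumptions made for illustration, so check the header of the CSV that Vera actually writes and adjust `status_column` to match.

```python
import csv
import io


def pass_rate(csv_text: str, status_column: str = "passed") -> float:
    """Return the fraction of rows whose status column reads as a pass."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0.0
    passed = sum(
        row[status_column].strip().lower() in {"true", "pass", "passed"}
        for row in rows
    )
    return passed / len(rows)


# Hypothetical sample; a real report would be read from the out/ directory.
sample = "case_id,score,passed\ncase-1,0.9,True\ncase-2,0.4,False\n"
print(pass_rate(sample))
```

To run it against a real report, read the file's contents into a string first and pass that to `pass_rate`.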
Maintained by the Vera Team.
