Release ACP-Evals v1.0.0 - Built for BeeAI · jbarnes850/acp-evals

ACP-Evals v1.0.0

Production-grade evaluation framework for ACP agents

What's New

This is the initial release of ACP-Evals, built specifically for the BeeAI community. The framework provides three core evaluators for testing AI agents with real LLM-powered assessment.

Core Features

AccuracyEval: LLM-powered semantic evaluation of response quality
PerformanceEval: Latency and resource efficiency tracking
ReliabilityEval: Consistency and tool usage validation
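
To make the performance and reliability ideas above concrete, here is a minimal sketch in plain Python. It does not use the acp-evals API: `measure_latency`, `consistency`, and `echo_agent` are hypothetical names invented for illustration, and the real PerformanceEval and ReliabilityEval may measure things differently. The accuracy evaluator is left out because it depends on an LLM judge.

```python
import statistics
import time
from collections import Counter
from typing import Callable

# NOTE: hypothetical helpers for illustration only; not part of acp-evals.

def measure_latency(agent: Callable[[str], str], prompt: str, runs: int = 5) -> dict:
    """Call the agent `runs` times and summarize wall-clock latency in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        agent(prompt)
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
    }

def consistency(agent: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Return the fraction of runs that produced the most common response."""
    responses = [agent(prompt) for _ in range(runs)]
    most_common_count = Counter(responses).most_common(1)[0][1]
    return most_common_count / runs

def echo_agent(prompt: str) -> str:
    """A deterministic stand-in for a real agent."""
    return prompt.upper()

latency = measure_latency(echo_agent, "hello", runs=3)
score = consistency(echo_agent, "hello", runs=4)
print(f"mean latency: {latency['mean_s']:.6f}s, consistency: {score:.2f}")
```

Exact string matching is a deliberately strict notion of consistency; an agent backed by a sampling LLM would likely need a semantic comparison instead.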

Key Capabilities

Complete transparency with full LLM judge reasoning
Professional CLI interface with rich terminal output
No text truncation - see complete agent responses
Support for ACP agents, Python functions, and BeeAI integration
Multiple evaluation rubrics (factual, research_quality, code_quality)
CI/CD ready with JSON export and standard exit codes
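
As an illustration of how JSON export and exit codes can gate a CI pipeline, the sketch below turns an exported report into a process exit code. The report schema (`results`, `passed`, `score`), the `min_score` threshold, and the function names are all assumptions made for this example; check the actual export format of acp-evals before relying on any of them.

```python
import json

# NOTE: the JSON layout below is a hypothetical schema, not the documented
# acp-evals export format.

def exit_code_from_report(report: dict, min_score: float = 0.7) -> int:
    """Map a report to an exit code: 0 = all passed, 1 = failure, 2 = no results."""
    results = report.get("results", [])
    if not results:
        return 2  # assumed convention: an empty report counts as an error
    failed = [
        r for r in results
        if not r.get("passed", False) or r.get("score", 0.0) < min_score
    ]
    return 1 if failed else 0

def gate_file(path: str, min_score: float = 0.7) -> int:
    """Load a JSON report from disk and compute its exit code."""
    with open(path, encoding="utf-8") as fh:
        return exit_code_from_report(json.load(fh), min_score)

sample = {
    "results": [
        {"name": "factual-check", "passed": True, "score": 0.9},
        {"name": "code-quality", "passed": True, "score": 0.6},
    ]
}
print(exit_code_from_report(sample))  # second entry is below the 0.7 threshold
```

In a CI job, a wrapper script could pass the result of `gate_file` to `sys.exit` so that a failing evaluation fails the build.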

Getting Started

pip install acp-evals
acp-evals check

Documentation

Note on History

This release represents a fresh start for the project with a clean, focused codebase. Previous development history is preserved in the archive/pre-v1 branch.

Built with ❤️ for the BeeAI community.

Full Changelog: https://github.com/jbarnes850/acp-evals/commits/v1.0.0
