Industrial-grade testing framework for LLM prompts
TunePrompt is a comprehensive testing framework designed specifically for Large Language Model (LLM) prompts. It helps developers validate, test, and optimize their prompts with industrial-grade reliability and accuracy.
The first production-ready release of TunePrompt, the industrial-grade testing framework for the modern LLM stack.
- Multi-Provider Support: Seamlessly test across OpenAI, Anthropic, Gemini, and OpenRouter.
- Semantic Evaluation: Advanced vector-based scoring to detect logic drift and nuance shifts.
- Auto-Fix Engine (Premium): AI-powered prompt optimization for failing tests.
- Cloud Orchestration: Unified synchronization with the TunePrompt Dashboard.
- Industrial CLI: Built-in watch mode, CI/CD integration, and historical analytics.
- Multi-provider Support: Native integration with Google Gemini, OpenAI, Anthropic, and OpenRouter.
- Semantic Testing: Compare outputs using high-precision embedding similarity.
- JSON Validation: Validate structured outputs with schema-aware checks.
- LLM-based Judging: Utilize advanced providers as evaluators for qualitative metrics.
- Watch Mode: Immediate feedback loop with automatic re-runs on file changes.
- CI/CD Ready: Native integration patterns for industrial deployment pipelines.
- Cloud Sync: Global telemetry and result storage via the dashboard.
- Auto-fix Engine: Iterative refinement loop for intelligent prompt repair.
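Semantic testing scores an output against the expectation by embedding both and comparing the vectors. The scoring step can be pictured with plain cosine similarity — a minimal sketch, not TunePrompt's actual implementation, and the vectors below are hard-coded stand-ins for real embedding-model output:

```javascript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// In a real semantic test the vectors would come from an embedding model;
// here they are illustrative only.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A semantic test passes when similarity meets the configured threshold.
function semanticPass(outputVec, expectVec, threshold = 0.85) {
  return cosineSimilarity(outputVec, expectVec) >= threshold;
}
```

Identical vectors score 1.0 and orthogonal vectors score 0.0, which is why a threshold like 0.85 tolerates small wording changes while flagging real drift.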
```bash
npm install -g tuneprompt
```

- Initialize a new project:

  ```bash
  tuneprompt init
  ```

- Create test files in the `tests` directory with your prompts and expectations.
- Run tests:

  ```bash
  tuneprompt run
  ```

- Run tests with cloud sync (requires activation):

  ```bash
  tuneprompt run --cloud
  ```

Available commands:

- `tuneprompt init`: Initialize a new TunePrompt project
- `tuneprompt run`: Run prompt tests
- `tuneprompt run --watch`: Run tests in watch mode
- `tuneprompt run --cloud`: Run tests and upload results to the cloud
- `tuneprompt run --ci`: Run tests in CI mode
- `tuneprompt fix`: Auto-fix failing prompts (Premium feature)
- `tuneprompt history`: View test run history
- `tuneprompt activate [subscription-id]`: Activate your Premium license
- `tuneprompt status`: Check license status
TunePrompt uses a configuration file to define providers and settings. The default location is `tuneprompt.config.js` in your project root.
Example configuration:
```javascript
module.exports = {
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
      model: 'gpt-4o',
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
      model: 'claude-3-opus-20240229',
    },
    openrouter: {
      apiKey: process.env.OPENROUTER_API_KEY,
      model: 'openai/gpt-4o',
    }
  },
  threshold: 0.85,
  testDir: './tests',
  outputFormat: 'table'
};
```

Tests are defined in JSON files in the `tests` directory. Each test file contains an array of test cases:
```json
[
  {
    "description": "User onboarding welcome message",
    "prompt": "Generate a friendly welcome message for a user named {{name}}.",
    "variables": {
      "name": "Alice"
    },
    "expect": "Welcome, Alice! We are glad you are here.",
    "config": {
      "threshold": 0.85,
      "method": "semantic",
      "model": "gpt-4o",
      "provider": "openai"
    }
  }
]
```

Supported evaluation methods:

- `exact`: Exact string match
- `semantic`: Semantic similarity comparison
- `json`: JSON structure validation
- `llm-judge`: LLM-based evaluation
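The `{{name}}` placeholder in the prompt is filled from the test case's `variables` before the prompt is sent to the provider. A minimal substitution sketch (an assumption about the mechanics; TunePrompt's own templating may differ):

```javascript
// Replace each {{key}} placeholder with the matching value from `variables`.
// Unknown keys are left intact rather than dropped.
function renderPrompt(template, variables) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    key in variables ? String(variables[key]) : match
  );
}
```

Rendering the example test case above would produce "Generate a friendly welcome message for a user named Alice."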
TunePrompt offers cloud synchronization for storing test results and viewing them in a dashboard. To use cloud features:
- Purchase a subscription at https://www.tuneprompt.xyz
- Activate your license:

  ```bash
  tuneprompt activate [your-subscription-id]
  ```

- Run tests with cloud sync:

  ```bash
  tuneprompt run --cloud
  ```

Premium features:

- Auto-fix Engine: Automatically repair failing prompts using AI
- Cloud sync & team collaboration: Store results in the cloud and collaborate with your team
- Advanced diagnostics: Detailed insights and recommendations
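The Auto-fix Engine's iterative refinement loop can be pictured as: evaluate the prompt, and while it fails, ask a model for a revision and re-evaluate. A hedged sketch with stand-in functions — `evaluate` and `revise` below are placeholders, not TunePrompt APIs; in the real Premium feature the revision would come from an LLM call and the score from a test run:

```javascript
// Hypothetical refinement loop: keep revising until the score clears the
// threshold or the attempt budget runs out.
function autoFix(prompt, evaluate, revise, { threshold = 0.85, maxAttempts = 3 } = {}) {
  let current = prompt;
  let score = evaluate(current);
  let attempts = 0;
  while (score < threshold && attempts < maxAttempts) {
    current = revise(current, score); // placeholder for an LLM revision call
    score = evaluate(current);
    attempts++;
  }
  return { prompt: current, score, attempts, passed: score >= threshold };
}
```

Bounding the loop with `maxAttempts` matters: a prompt that never converges should fail fast rather than burn provider tokens indefinitely.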
Create a `.env` file in your project root with your API keys:

```
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
```

Contributions are welcome! Please feel free to submit a Pull Request.
MIT