Promptly v1

Black-box test harness for LLM-powered systems

Promptly is a comprehensive testing platform for LLM applications. It captures LLM interactions through HTTP endpoints, evaluates responses against expectations using both deterministic rules and LLM judges, and provides detailed test results with full trace visibility.

Features

Multi-Environment Testing: Test across development, staging, and production environments
Flexible Response Mapping: Define JSONPath-based mappings to extract canonical traces from any API response format
Deterministic & LLM-Based Evaluations: Combine regex, text matching, tool call verification with AI-powered judgement
Test Suite Management: Organize tests with YAML import/export, Git integration, and historical tracking
Comprehensive Reporting: View pass rates, failure reasons, latency, token usage, and cost metrics
Background Test Execution: Queue runs and process them asynchronously with configurable concurrency
SDK & CLI: Integrate Promptly into CI/CD pipelines with the .NET SDK and CLI tool

Architecture

┌─────────────────────┐     ┌──────────────────────┐     ┌────────────────────┐
│   React Frontend    │────▶│  ASP.NET Core API    │────▶│   PostgreSQL DB    │
│  (Material UI)      │     │  (.NET 10)           │     │                    │
└─────────────────────┘     └──────────────────────┘     └────────────────────┘
                                      │
                                      │ HTTP
                                      ▼
                            ┌──────────────────────┐
                            │  Python Worker       │
                            │  (FastAPI)           │
                            │  - LLM Judges        │
                            │  - Mapping Proposals │
                            └──────────────────────┘

C# Control Plane: Authentication, CRUD APIs, deterministic evaluation, test orchestration, background worker Python Worker: LLM judges, groundedness scoring, mapping proposals via OpenAI API React Frontend: Material UI, mapping wizard, test management, results visualization Database: PostgreSQL with EF Core

Quick Start

Prerequisites

Docker & Docker Compose
Azure OpenAI resource (or OpenAI API key)
.NET 10 SDK (for local development - optional)
Node.js 18+ (for local frontend development - optional)

Running with Docker Compose

Clone the repository
```
git clone <repository-url>
cd Promptly
```

Configure Azure OpenAI (or OpenAI)

Create a .env file in the docker directory:

For Azure OpenAI (Recommended):

# Azure OpenAI Configuration - REQUIRED
PROMPTLY_LLM_PROVIDER=azureopenai
PROMPTLY_LLM_API_KEY=your_azure_openai_api_key
PROMPTLY_LLM_AZURE_ENDPOINT=https://your-resource-name.openai.azure.com
PROMPTLY_LLM_API_VERSION=2024-08-01-preview
PROMPTLY_LLM_MODEL_DEFAULT=your-deployment-name

For standard OpenAI:

PROMPTLY_LLM_PROVIDER=openai
PROMPTLY_LLM_API_KEY=sk-your-openai-api-key
PROMPTLY_LLM_MODEL_DEFAULT=gpt-4o-mini

Database and other settings are pre-configured - see SETUP_CHECKLIST.md for details.

Start all services
```
cd docker
docker-compose up -d
```
Access the application
- Web UI: http://localhost:3000
- API: http://localhost:5000
- Swagger: http://localhost:5000/swagger
- Python Worker: http://localhost:8000
Initialize database

The database will be automatically migrated on first startup.

Configuration

C# Control Plane (Promptly.Server)

Configuration via appsettings.json or environment variables:

ConnectionStrings:Default: PostgreSQL connection string
JWT:Key: Secret key for JWT signing (min 32 characters)
JWT:Issuer: JWT issuer
JWT:Audience: JWT audience
JWT:ExpiryMinutes: Token expiry time (default: 60)
DATA_PROTECTION_PATH: Path for Data Protection keys persistence
PROMPTLY_EVAL_BASE_URL: Python worker base URL
TestRunner:PollingIntervalSeconds: Background worker polling interval (default: 5)
TestRunner:MaxConcurrentRuns: Max concurrent test runs (default: 2)

Python Worker (Promptly.Worker)

Configuration via environment variables:

For Azure OpenAI:

PROMPTLY_LLM_PROVIDER: azureopenai
PROMPTLY_LLM_API_KEY: Your Azure OpenAI API key
PROMPTLY_LLM_AZURE_ENDPOINT: Your Azure endpoint (e.g., https://your-resource.openai.azure.com)
PROMPTLY_LLM_API_VERSION: API version (default: 2024-08-01-preview)
PROMPTLY_LLM_MODEL_DEFAULT: Your deployment name (NOT model ID - use the name you gave the deployment in Azure AI Studio)

For standard OpenAI:

PROMPTLY_LLM_PROVIDER: openai
PROMPTLY_LLM_API_KEY: Your OpenAI API key (starts with sk-)
PROMPTLY_LLM_MODEL_DEFAULT: Model ID (e.g., gpt-4o-mini)

Important for Azure: Use your deployment name, not the model name. If you deployed GPT-4o and named it "my-gpt4-deployment", use my-gpt4-deployment as the model default.

React Frontend (Promptly.Web)

Configuration via environment variables:

VITE_API_BASE_URL: Base URL for C# API (default: http://localhost:5000)

Usage

1. Register & Login

Navigate to http://localhost:3000 and create an account.

2. Create a Project

Projects organize your testing efforts. Create a project for each application you're testing.

3. Set Up Environment

Define environments (dev, staging, prod) with base URLs and headers:

Base URL: http://localhost:5000/demo
Headers: Add authentication headers if needed (encrypted at rest)

4. Add Endpoint & Mapping

Create an endpoint pointing to your LLM application's API:

Path: /chat
Method: POST
Timeout: 30s

Use the Mapping Wizard to define how to extract canonical traces from responses:

Provide a sample response JSON
AI proposes a mapping spec with JSONPath expressions
Validate the mapping against the sample
Save as default

5. Create Test Suite

Organize tests into suites. Import tests from YAML:

- id: test-1
  name: Greeting Test
  description: Verify the assistant greets users properly
  input:
    messages:
      - role: user
        content: Hello!
  expectations:
    - type: contains_text
      text: "Hello"
      case_insensitive: true
    - type: banned_text
      text: "error"
      case_insensitive: true

Expectation types:

contains_text: Substring search
banned_text: Ensure text is NOT present
regex_match: Pattern matching
link_pattern: Verify URLs match pattern
tool_called: Check if specific tool was called
tool_sequence: Verify tool call order
llm_judge: AI-based scoring with custom rubric
groundedness: Check response is grounded in retrieved docs

6. Run Tests

Queue a test run:

Select environment, endpoint, and mapping spec
Optionally tag with Git commit hash
View live progress and results

7. Analyze Results

View detailed results:

Summary: Pass rate, latency, tokens, cost
Per-Test Results: Status, expectations breakdown, failure reasons
Canonical Trace: Messages, tool calls, usage, retrieved docs
Raw Response: Original JSON for debugging

MappingSpec Format

Define how to parse arbitrary JSON into canonical traces:

{
  "version": 1,
  "messages": {
    "itemsPath": "$.choices[*].message",
    "rolePath": "$.role",
    "contentPath": "$.content"
  },
  "toolCalls": {
    "itemsPath": "$.tool_calls[*]",
    "namePath": "$.function.name",
    "argumentsPath": "$.function.arguments"
  },
  "usage": {
    "objectPath": "$.usage",
    "promptTokensPath": "$.prompt_tokens",
    "completionTokensPath": "$.completion_tokens",
    "totalTokensPath": "$.total_tokens"
  },
  "retrievedDocs": {
    "itemsPath": "$.retrieved_docs[*]",
    "contentPath": "$.content",
    "titlePath": "$.title",
    "idPath": "$.id"
  },
  "fallback": {
    "singleAssistantContentPath": "$.response"
  }
}

CLI Tool

Install the CLI tool:

dotnet tool install --global Promptly.Cli

Trigger runs from CI/CD:

promptly trigger \
  --base-url http://localhost:5000 \
  --api-key <your-api-key> \
  --suite <suite-id> \
  --env <environment-id> \
  --endpoint <endpoint-id> \
  --mapping <mapping-id> \
  --commit $(git rev-parse HEAD)

promptly wait --base-url http://localhost:5000 --api-key <your-api-key> --run-id <run-id>

Exit code:

0: All tests passed
1: One or more tests failed

Development

Running Locally (Without Docker)

C# API:

cd Promptly.Server
dotnet ef database update
dotnet run

Python Worker:

cd Promptly.Worker
pip install -r requirements.txt
uvicorn main:app --reload --port 8000

React Frontend:

cd Promptly.Web/src/web
npm install
npm run dev

Database Migrations

Create a new migration:

cd Promptly.Server
dotnet ef migrations add <MigrationName>
dotnet ef database update

API Documentation

Swagger UI is available at http://localhost:5000/swagger with JWT bearer authentication.

Key endpoints:

POST /api/auth/register: Create account
POST /api/auth/login: Get JWT token
POST /api/projects: Create project
POST /api/environments: Create environment
POST /api/endpoints/{id}/mapping/propose: AI-powered mapping proposal
POST /api/suites: Create test suite
POST /api/suites/{id}/tests/import: Import tests from YAML
POST /api/runs: Queue test run
GET /api/runs/{id}/results: Get run results

Troubleshooting

Database Connection Issues

Ensure PostgreSQL is running:

docker-compose ps postgres

Check logs:

docker-compose logs postgres

Python Worker Errors

Check LLM API key is set:

docker-compose exec promptly-eval printenv PROMPTLY_LLM_API_KEY

View logs:

docker-compose logs promptly-eval

Frontend Connection Issues

Ensure VITE_API_BASE_URL points to the correct API URL. Check browser console for errors.

Background Worker Not Processing Runs

Check worker logs:

docker-compose logs promptly-server | grep TestRunWorkerService

Verify TestRunner configuration in appsettings.json.

Tech Stack

Backend: ASP.NET Core 10, Entity Framework Core, PostgreSQL
Worker: Python 3.11+, FastAPI, OpenAI SDK
Frontend: React 19, TypeScript, Material-UI, Vite
Auth: ASP.NET Identity, JWT, API Keys
Security: Data Protection API for encrypted storage
Mapping: JsonPath.Net for deterministic parsing
Evaluation: Regex, LLM judges (OpenAI/Azure)

License

[Specify your license here]

Contributing

[Specify contribution guidelines]

Support

For issues and questions, please open a GitHub issue.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.claude		.claude
Promptly.Application		Promptly.Application
Promptly.Cli		Promptly.Cli
Promptly.Domain		Promptly.Domain
Promptly.Infrastructure		Promptly.Infrastructure
Promptly.Sdk.DotNet		Promptly.Sdk.DotNet
Promptly.Server		Promptly.Server
Promptly.Web/src/web		Promptly.Web/src/web
Promptly.Worker		Promptly.Worker
docker		docker
.dockerignore		.dockerignore
.gitignore		.gitignore
CONFIGURATION.md		CONFIGURATION.md
DesignSpec.md		DesignSpec.md
PROVIDE_THESE_VALUES.md		PROVIDE_THESE_VALUES.md
Promptly.slnx		Promptly.slnx
README.md		README.md
SETUP_CHECKLIST.md		SETUP_CHECKLIST.md
test_mapping_proposal.json		test_mapping_proposal.json

Folders and files

Latest commit

History

Repository files navigation

Promptly v1

Features

Architecture

Quick Start

Prerequisites

Running with Docker Compose

Configuration

C# Control Plane (Promptly.Server)

Python Worker (Promptly.Worker)

React Frontend (Promptly.Web)

Usage

1. Register & Login

2. Create a Project

3. Set Up Environment

4. Add Endpoint & Mapping

5. Create Test Suite

6. Run Tests

7. Analyze Results

MappingSpec Format

CLI Tool

Development

Running Locally (Without Docker)

Database Migrations

API Documentation

Troubleshooting

Database Connection Issues

Python Worker Errors

Frontend Connection Issues

Background Worker Not Processing Runs

Tech Stack

License

Contributing

Support

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages