ES7/pguard-llm

pguard-llm

Prompt version control and A/B testing for LLM applications.

pip install pguard-llm

The Problem

You tweak a prompt. It gets better. You tweak it again. Now you don't remember what changed, which version was best, or how much each version costs to run.

pguard-llm fixes this: version your prompts, run them against an LLM, and compare the results.

Quick Start

from pguard import Prompt

p = Prompt("summarize")

# Save versions
p.save("v1", "Summarize this: {text}", description="Simple")
p.save("v2", "In 3 bullet points, summarize: {text}", description="Structured")

# Run against an LLM
result = p.run(
    "v1",
    provider="openai",
    model="gpt-4o",
    api_key="sk-...",
    input_vars={"text": "Your article text here..."}
)

print(result.output)
print(result.cost_usd)
print(result.latency_ms)

# Compare v1 vs v2
comparison = p.compare("v1", "v2")
print(comparison.summary())
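
The `{text}` placeholder in the templates above looks like a Python format field that is filled from `input_vars`. A minimal sketch of that substitution follows, assuming `str.format`-style templates. The helpers `placeholders` and `render` are hypothetical names used for illustration and are not part of pguard-llm.

```python
import string


def placeholders(template: str) -> set[str]:
    """Return the named {fields} found in a format-style template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def render(template: str, input_vars: dict[str, str]) -> str:
    """Fill a template, failing early if a placeholder has no value."""
    missing = placeholders(template) - set(input_vars)
    if missing:
        raise KeyError(f"missing input_vars: {sorted(missing)}")
    return template.format(**input_vars)


# The two templates saved in the Quick Start.
templates = {
    "v1": "Summarize this: {text}",
    "v2": "In 3 bullet points, summarize: {text}",
}

rendered = {
    version: render(template, {"text": "Your article text here..."})
    for version, template in templates.items()
}
print(rendered["v1"])
```

Checking for missing variables before formatting gives a clearer error than a bare `KeyError` from `str.format` when a template and its inputs drift apart between versions.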

Supported Providers

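| Provider  | Install                  | Models                           |
|-----------|--------------------------|----------------------------------|
| OpenAI    | pip install openai       | gpt-4o, gpt-4o-mini, ...         |
| Anthropic | pip install anthropic    | claude-sonnet-4, claude-haiku-4  |
| Gemini    | pip install google-genai | gemini-2.5-flash, gemini-1.5-pro |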

Install with Provider

pip install "pguard-llm[openai]"
pip install "pguard-llm[anthropic]"
pip install "pguard-llm[gemini]"
pip install "pguard-llm[all]"
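
To see which provider SDKs are importable in the current environment, a check along these lines may help. It relies only on the standard library and on the import names of the three SDKs (`openai`, `anthropic`, `google.genai`). The function name `available_providers` is hypothetical, and the dictionary keys simply reuse the extras names shown above.

```python
from importlib.util import find_spec

# Extras name -> module the corresponding SDK is imported as.
PROVIDER_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "gemini": "google.genai",
}


def available_providers() -> dict[str, bool]:
    """Report which provider SDKs can be imported, without importing them."""
    found = {}
    for provider, module in PROVIDER_MODULES.items():
        try:
            found[provider] = find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found[provider] = False
    return found


print(available_providers())
```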

Storage Backends

# File storage (default) — zero setup
p = Prompt("summarize", storage="file")

# SQLite — better for querying
p = Prompt("summarize", storage="sqlite")
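
To illustrate why a SQLite backend is convenient for querying, the sketch below aggregates run metrics per version with plain SQL. The `runs` table, its columns, and the sample values are all hypothetical. The actual schema pguard-llm uses is not documented here and may differ.

```python
import sqlite3

# In-memory database with a made-up schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (prompt TEXT, version TEXT, latency_ms REAL, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        ("summarize", "v1", 800.0, 0.002),
        ("summarize", "v1", 1000.0, 0.004),
        ("summarize", "v2", 1200.0, 0.003),
        ("summarize", "v2", 1400.0, 0.005),
    ],
)

# Average latency and cost per version of one prompt.
rows = conn.execute(
    """
    SELECT version, AVG(latency_ms), AVG(cost_usd)
    FROM runs
    WHERE prompt = ?
    GROUP BY version
    ORDER BY version
    """,
    ("summarize",),
).fetchall()

for version, avg_latency, avg_cost in rows:
    print(version, avg_latency, avg_cost)
conn.close()
```

With file storage, the same question would typically mean loading every run record and aggregating in Python.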

A/B Comparison

comparison = p.compare("v1", "v2")
summary = comparison.summary()

# summary contains:
# - latency_ms: avg latency per version + winner
# - cost_usd: avg cost per version + winner
# - quality_score: avg quality per version + winner
# - tokens_avg: avg tokens per version

CLI

pguard list                        # list all prompts
pguard versions summarize          # list versions
pguard show summarize v1           # show template
pguard runs summarize v1           # show run history
pguard compare summarize v1 v2     # compare versions

RunResult

result.output        # LLM response text
result.cost_usd      # Cost in USD
result.latency_ms    # Latency in milliseconds
result.tokens_in     # Input tokens
result.tokens_out    # Output tokens
result.quality_score # Quality score (0-1)
result.provider      # Provider used
result.model         # Model used
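
For readers who want to pass results around or mock them in tests, the fields above map naturally onto a small dataclass. This is a sketch that mirrors the documented attribute names, not the library's own class. `RunResultSketch`, `tokens_total`, and `estimate_cost_usd` are hypothetical, and the per-million-token prices in the example are placeholders rather than real provider pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResultSketch:
    """Stand-in with the same attribute names as the documented RunResult."""
    output: str
    cost_usd: float
    latency_ms: float
    tokens_in: int
    tokens_out: int
    quality_score: float  # documented range: 0-1
    provider: str
    model: str

    @property
    def tokens_total(self) -> int:
        return self.tokens_in + self.tokens_out


def estimate_cost_usd(tokens_in: int, tokens_out: int,
                      usd_per_1m_in: float, usd_per_1m_out: float) -> float:
    """Token-based cost estimate; the prices are supplied by the caller."""
    return (tokens_in * usd_per_1m_in + tokens_out * usd_per_1m_out) / 1_000_000


# Placeholder prices, chosen only to make the arithmetic easy to follow.
cost = estimate_cost_usd(1000, 500, usd_per_1m_in=2.0, usd_per_1m_out=8.0)
fake = RunResultSketch(
    output="A short summary.",
    cost_usd=cost,
    latency_ms=850.0,
    tokens_in=1000,
    tokens_out=500,
    quality_score=0.8,
    provider="openai",
    model="gpt-4o",
)
print(fake.tokens_total, fake.cost_usd)
```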

License

MIT
