Stop losing your best prompts.
Lightweight prompt versioning & evaluation tracker for LLM engineers.
One decorator. Automatic versioning. Local SQLite. Beautiful dashboard.
Quick Start • Features • Dashboard • API Reference • Configuration
You iterate on prompts 50 times a day. You had a great system prompt last Tuesday that got 92% accuracy, but you lost it. You changed one word and everything broke, but you can't remember which word.
Your eval scores live in scattered notebooks and print() statements.
PromptTrace fixes this. `pip install prompttrace` → done.
```bash
pip install prompttrace
```

Requirements: Python 3.9+ · Single dependency: `rich`
```python
import openai  # assumes OPENAI_API_KEY is set in the environment
from prompttrace import trace

@trace(experiment="my-chatbot", model="gpt-4o")
def generate(prompt, temperature=0.7):
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return response.choices[0].message.content

# Every call is now automatically tracked
generate("Explain quantum computing in one sentence.", temperature=0.3)
generate("Explain quantum computing in one sentence.", temperature=0.9)
```

Then launch the dashboard:

```python
from prompttrace import dashboard

dashboard()  # → http://127.0.0.1:8777
```

Or from the terminal:

```bash
prompttrace
```

That's it. Every prompt, output, latency, model, and generation parameter is logged and visualized.
| Feature | Description |
|---|---|
| `@trace` decorator | Wrap any LLM call; auto-logs prompt, output, latency, params |
| `log_call()` function | Manual logging for when you can't use a decorator |
| Auto eval | Pass an `eval_fn` to score outputs automatically |
| Prompt versioning | Every unique prompt gets a hash, so you can see how changes affect results |
| Side-by-side compare | Diff two prompts word by word, see outputs and metrics |
| Web dashboard | Modern UI with animated charts, tables, filters; zero JS deps |
| Local-only | Everything in SQLite. No cloud, no API keys, no telemetry (see the sketch after this table) |
| Rich terminal logs | Colorful, emoji-powered console output via `rich` |
| Real-time updates | Dashboard auto-refreshes every 2 s; no manual reload |
| Experiment management | Delete experiments, filter the dashboard by experiment |
| CSV export | One-click export of all traces for external analysis |
Launch with `prompttrace` or `from prompttrace import dashboard; dashboard()`.
Three views:
| View | What it does |
|---|---|
| Dashboard | Stats cards, latency chart, status donut, model usage; filterable by experiment |
| Traces | Full table of all logged calls with search, filter, delete, and CSV export |
| Compare | Select two prompts; word-level diff highlighting with outputs side-by-side (illustrated below) |
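To give a feel for what the Compare view's word-level diff shows, here is a rough stand-alone illustration using Python's `difflib`; it is not PromptTrace's actual implementation, just the same idea:

```python
import difflib

old = "Explain quantum computing in one sentence."
new = "Explain quantum computing in two sentences."

# Word-level diff: '-' marks words only in the old prompt, '+' only in the new one
for token in difflib.ndiff(old.split(), new.split()):
    if token[:1] in "+-":
        print(token)
```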
Full `@trace` usage:

```python
from prompttrace import trace

@trace(
    experiment="summarizer",        # Group related traces
    model="claude-3-sonnet",        # Model identifier
    tags=["prod", "v2"],            # Optional tags
    description="Q3 summary bot",   # Optional experiment description
)
def summarize(prompt, temperature=0.5, max_tokens=500):
    # Your LLM call here
    return llm_response
```

What gets logged automatically:

Prompt text · Output · Latency · Generation parameters (`temperature`, `top_p`, `max_tokens`, etc.) · Input variables · Status (success/error) · Error messages · Approximate token counts
Return a dict to include token counts:

```python
@trace(experiment="qa", model="gpt-4o")
def answer(prompt):
    resp = openai.chat.completions.create(...)
    return {
        "output": resp.choices[0].message.content,
        "token_count_input": resp.usage.prompt_tokens,
        "token_count_output": resp.usage.completion_tokens,
    }
```

Pass an `eval_fn` to score every output automatically:
```python
def my_eval(prompt, output):
    """Return a dict of metric_name: score."""
    return {
        "relevance": compute_relevance(prompt, output),  # your own scoring function
        "length_ok": 1.0 if 50 < len(output) < 500 else 0.0,
        "has_citation": 1.0 if "[source]" in output else 0.0,
    }

@trace(experiment="research-bot", model="gpt-4o", eval_fn=my_eval)
def research(prompt):
    return call_llm(prompt)
```

Metrics appear in the terminal and the dashboard.
For cases where a decorator doesn't fit, use `log_call()`:
```python
from prompttrace import log_call
import time

start = time.perf_counter()
output = my_llm_pipeline(prompt)
elapsed = (time.perf_counter() - start) * 1000

log_call(
    prompt="Translate to French: Hello world",
    output="Bonjour le monde",
    experiment="translation",
    model="gpt-4o-mini",
    generation_params={"temperature": 0.2},
    latency_ms=elapsed,
    token_count_input=8,
    token_count_output=5,
    tags=["translation", "french"],
    eval_metrics={"bleu": 0.95, "fluency": 0.88},
)
```

Dashboard CLI options:

```bash
# Default (localhost:8777)
prompttrace
# Custom port
prompttrace --port 9000
# Accessible from network
prompttrace --host 0.0.0.0 --port 8777
```

`@trace()` parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `experiment` | `str` | `"default"` | Experiment name for grouping |
| `model` | `str` | `"unknown"` | Model identifier |
| `tags` | `list[str]` | `None` | Optional tags |
| `eval_fn` | `callable` | `None` | `fn(prompt, output) -> dict[str, float]` |
| `description` | `str` | `""` | Experiment description |
`log_call()` parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | `str` | required | The prompt template |
| `output` | `str` | required | The LLM output |
| `experiment` | `str` | `"default"` | Experiment name |
| `model` | `str` | `"unknown"` | Model identifier |
| `generation_params` | `dict` | `None` | e.g. `{"temperature": 0.7}` |
| `input_variables` | `dict` | `None` | Template variables |
| `latency_ms` | `float` | `0` | Response time in ms |
| `token_count_input` | `int` | `0` | Input token count |
| `token_count_output` | `int` | `0` | Output token count |
| `status` | `str` | `"success"` | `"success"` or `"error"` |
| `error_message` | `str` | `""` | Error details |
| `tags` | `list[str]` | `None` | Optional tags |
| `eval_metrics` | `dict` | `None` | `{"metric": score}` |
`dashboard()` launches the web UI and blocks until Ctrl+C.
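If blocking is inconvenient (for example in a notebook), one possible workaround is to start it in a daemon thread. This is only a sketch and assumes `dashboard()` is safe to call from a background thread, which the docs don't state:

```python
import threading
from prompttrace import dashboard

# Assumption: dashboard() just serves HTTP, so a daemon thread keeps the
# notebook/REPL usable while the UI runs; the thread exits with the process.
threading.Thread(target=dashboard, daemon=True).start()
```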
By default, traces are stored in `.prompttrace/traces.db` in the current directory.

```bash
# Override via environment variable
export PROMPTTRACE_DB=/path/to/my/traces.db
```

```python
# Override programmatically
from prompttrace import set_db_path

set_db_path("/path/to/my/traces.db")
```

Project structure:

```
your-project/
├── pyproject.toml
├── README.md
├── example.py
└── prompttrace/
    ├── __init__.py       # Public API exports
    ├── core.py           # @trace decorator, log_call, dashboard launcher
    ├── db.py             # SQLite database layer
    ├── server.py         # Built-in HTTP server + JSON API
    ├── cli.py            # CLI entry point
    ├── dashboard.html    # Single-file web dashboard (zero JS deps)
    └── logo.png          # App logo
```
MIT. Use it however you want.