Deterministic, replayable LLM workflows with first-class testing.
Chronicon is a workflow engine for building LLM applications that can be:
- Replayed exactly from execution logs
- Tested against real past executions (no mocks)
- Debugged with full visibility into every step
Use Chronicon when:
- ✅ Testing LLM applications against real past executions
- ✅ Debugging complex LLM workflows by replaying them exactly
- ✅ Building regression tests without mocking LLM responses
- ✅ Developing locally with full execution visibility
- ✅ You need deterministic replay of any workflow execution
Don't use Chronicon for:
- ❌ Production job orchestration (use Trigger.dev, Temporal, Celery)
- ❌ Distributed systems or parallel execution
- ❌ Cloud deployment or scaling
- ❌ Real-time production workloads
- ❌ General background jobs unrelated to LLMs
TL;DR: Chronicon is a development and testing tool for LLM apps, not production infrastructure.
Chronicon is also not:
- ❌ An agent framework
- ❌ A prompt library
- ❌ A distributed system
- ❌ A flexible DSL
- ❌ A model hosting service
Design principles:
- Determinism first: if a workflow ran once, it must be replayable from logs
- Explicit state: No hidden globals or magic context
- Side effects are explicit: LLM calls, API calls, and IO are logged
- Minimal surface area: Opinionated over flexible
- Local-first: Single-process runtime
```python
from chronicon import workflow, step, execute, replay, anthropic_llm_call

# Effectful step: makes an HTTP call, logged for replay
@step
def fetch(url: str) -> str:
    import requests
    return requests.get(url).text

# Pure step: deterministic, still logged for tracing
@step
def extract_score(text: str) -> int:
    # Extract the first number from the text
    import re
    match = re.search(r'\d+', text)
    return int(match.group()) if match else 0

@workflow
def summarize(url: str) -> tuple[str, int]:
    from chronicon import llm_call

    # Step 1: Fetch text (effectful, logged)
    text = fetch(url)

    # Step 2: LLM summarization (effectful, logged)
    summary = llm_call(
        prompt=f"Summarize this text:\n\n{text}",
        model="claude-3-5-sonnet-20241022",
    )

    # Step 3: Score quality (effectful, logged)
    score_text = llm_call(
        prompt=f"Rate this summary 1-10:\n\n{summary}",
        model="claude-3-5-sonnet-20241022",
    )

    # Step 4: Parse score (pure, logged)
    score = extract_score(score_text)
    return summary, score

# Run it
llm_call = anthropic_llm_call()
result = execute(summarize, url="https://example.com", llm_call=llm_call)

# Replay it exactly: all steps replayed from logs, no HTTP or LLM calls
result = replay(execution_id=result.execution_id)

# Test against a past execution
assert replay(execution_id="abc123").output == expected_output
```

Use different LLM providers for different tasks in the same workflow:
```python
from chronicon import workflow, execute, anthropic_llm_call, ollama_llm_call

@workflow
def analyze(text: str) -> dict:
    from chronicon import llm_call

    # Use local Ollama for fast classification
    category = llm_call(
        prompt=f"Classify: {text}",
        model="llama2",
        provider="ollama",  # Route to local model
    )

    # Use Claude for detailed analysis
    analysis = llm_call(
        prompt=f"Analyze: {text}",
        model="claude-3-5-sonnet-20241022",
        provider="anthropic",  # Route to cloud API
    )

    return {"category": category, "analysis": analysis}

# Execute with multiple providers
providers = {
    "anthropic": anthropic_llm_call(),
    "ollama": ollama_llm_call(),
}
result = execute(analyze, "sample text", providers=providers)
```

A workflow is a Python function decorated with `@workflow`. Workflows are:
- Versioned by source code hash
- Explicit about inputs and outputs
- Replayable from execution logs
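Versioning by source code hash can be pictured as hashing the function's source text. A minimal sketch of the idea (illustrative only, not Chronicon's actual scheme, which would need to obtain and normalize the source, e.g. via `inspect.getsource`):

```python
import hashlib

def version_hash(source: str) -> str:
    """Derive a short version identifier from a function's source text."""
    return hashlib.sha256(source.encode()).hexdigest()[:12]

# Any edit to the function body yields a new version hash, which is
# how code changes are detected at replay time.
v1 = version_hash("def greet(name): return f'hello, {name}'")
v2 = version_hash("def greet(name): return f'hi, {name}'")
print(v1 != v2)  # → True
```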
A step is the atomic unit of execution. Two kinds:
- Pure: Deterministic functions
- Effectful: LLM calls, API calls, IO
All steps:
- are hashable and identifiable
- have serializable inputs and outputs
- are logged in the execution trace
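One way to picture what `@step` does: wrap the function so each call's serialized inputs and outputs are appended to a trace. A minimal sketch (the real decorator also records version hashes and writes to SQLite):

```python
import functools
import json

trace: list[dict] = []   # stand-in for the SQLite execution log

def step(fn):
    """Wrap a function so each call is recorded with serialized I/O."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        trace.append({
            "step": fn.__name__,
            "inputs": json.dumps({"args": args, "kwargs": kwargs}),
            "output": json.dumps(result),
        })
        return result
    return wrapper

@step
def add(a: int, b: int) -> int:
    return a + b

add(2, 3)
print(trace[0]["step"], trace[0]["output"])  # → add 5
```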
The execution log is an append-only SQLite database recording:
- Workflow version hash
- Step ID + version hash
- Inputs and outputs (serialized)
- LLM request parameters
- Timestamps and errors
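A plausible shape for that table (illustrative schema; Chronicon's actual schema may differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE execution_log (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- append-only: rows are never updated
        execution_id TEXT NOT NULL,
        workflow_hash TEXT NOT NULL,           -- workflow version hash
        step_id TEXT NOT NULL,
        step_hash TEXT NOT NULL,               -- step version hash
        inputs TEXT NOT NULL,                  -- serialized, e.g. JSON
        outputs TEXT,
        llm_params TEXT,                       -- LLM request parameters, if any
        error TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute(
    "INSERT INTO execution_log "
    "(execution_id, workflow_hash, step_id, step_hash, inputs, outputs) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("abc123", "wf01", "fetch", "st01", '{"url": "https://example.com"}', '"<html>"'),
)
row = conn.execute("SELECT step_id, outputs FROM execution_log").fetchone()
print(row)  # → ('fetch', '"<html>"')
```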
The replay engine can:
- Replay an execution deterministically
- Resume from step N
- Detect divergence (hash mismatch, input change, code change)
- Explain why a replay diverged
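Replay and divergence detection boil down to comparing recorded hashes and inputs against the current ones before serving a logged output. A minimal sketch with hypothetical record shapes (not Chronicon's internals):

```python
def replay_step(logged: dict, current: dict):
    """Replay one step from the log, or explain why it diverged.

    `logged` is a record from the execution log; `current` describes
    the step as it exists now (both shapes are hypothetical).
    """
    if logged["step_hash"] != current["step_hash"]:
        raise RuntimeError(f"divergence at {logged['step_id']}: code change")
    if logged["inputs"] != current["inputs"]:
        raise RuntimeError(f"divergence at {logged['step_id']}: input change")
    return logged["output"]   # no HTTP or LLM call: served from the log

logged = {"step_id": "fetch", "step_hash": "aaa",
          "inputs": '{"url": "https://example.com"}', "output": "<html>...</html>"}
current = {"step_hash": "aaa", "inputs": '{"url": "https://example.com"}'}
print(replay_step(logged, current))  # → <html>...</html>
```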
Documentation:
- Quick Start Guide - Get running in 5 minutes
- LLM Providers - Multi-provider configuration and usage
- Project Structure - Architecture and design
- Development Guide - Contributing and testing
- Project Summary - Complete implementation details
```bash
# Clone and install
git clone <repo-url>
cd chronicon
python3 -m venv venv
source venv/bin/activate
pip install -e ".[dev]"

# Set API key
export ANTHROPIC_API_KEY="your-key-here"
```

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black .
ruff check .
```

MVP in progress. Chronicon is v0: local-only, single-process, sequential execution.
What's working:
- ✅ Workflow definition and versioning
- ✅ Execution logging (SQLite)
- ✅ Deterministic replay
- ✅ Multiple LLM providers (Anthropic, OpenAI, Google, Ollama, custom)
- ✅ Provider routing (different models per task)
What's not (and won't be in v0):
- ❌ Parallelism
- ❌ Agents
- ❌ Cloud deployment
- ❌ UI
MIT