agent-traces

Parse multi-format agent session traces into Polars DataFrames and Parquet.

Reads JSONL files from Pi, Claude Code, Codex, and ATIF formats → normalized three-table layout (sessions, events, content) → Parquet. Designed for analytical workflows: behavioral analysis, cost tracking, error patterns, training data curation.

Install

pip install "agent-traces @ git+https://github.com/davanstrien/agent-traces.git"

Or with uv:

uv pip install "agent-traces @ git+https://github.com/davanstrien/agent-traces.git"

Quick Start

from agent_traces import TraceDataset

# Load from HuggingFace Hub
ds = TraceDataset.from_hub("badlogicgames/pi-mono")

# Three normalized tables
ds.sessions    # 1 row/session: model, counts, tokens, cost
ds.events      # 1 row/entry: type, role, tool_name, is_error
ds.content     # 1 row/entry with text

# Convenience views for common analyses
ds.user_messages       # turn, nTurns, model, msg + session metadata
ds.assistant_messages  # content_text, thinking, tool_calls
ds.tool_calls          # tool_name + session metadata

# Aggregates
ds.tool_counts(group_by="model")
ds.token_stats(group_by="model")
ds.error_rate(group_by="model")
ds.summary()

# Export
ds.to_parquet("output/")          # sessions.parquet + events.parquet + content.parquet
ds.to_flat_parquet("flat.parquet") # single 44-column table (backward compat)

Batch Loading

# Multiple datasets at once
ds = TraceDataset.from_hub_batch([
    "badlogicgames/pi-mono",
    "0xSero/pi-sessions",
    "moikapy/0xKobolds",
])

# Search Hub by tag
for repo_id, ds in TraceDataset.from_hub_search(limit=10):
    print(ds.summary())

# Merge datasets
combined = ds1 + ds2

Local Files

ds = TraceDataset.from_dir("path/to/sessions/")

Or use the parser directly for the raw 44-column flat table:

from agent_traces import parse_sessions

df = parse_sessions("path/to/*.jsonl")

CLI

# Convert a Hub dataset to Parquet
agent-traces convert badlogicgames/pi-mono -o output.parquet

# Convert local files
agent-traces convert ./sessions/*.jsonl -o output.parquet

# Batch convert from a file of repo IDs
agent-traces convert --from-file repos.txt --output-dir parsed/

# Inspect a dataset
agent-traces inspect badlogicgames/pi-mono

Supported Formats

| Format      | Example Dataset                | Detection                        |
|-------------|--------------------------------|----------------------------------|
| Pi          | badlogicgames/pi-mono          | `type` field in JSON             |
| Claude Code | ultralazr/claude-code-traces   | `type` + `uuid`/`sessionId`      |
| Codex       | cfahlgren1/agent-sessions-list | `type: "item"` + payload wrapper |
| ATIF        | vinhnx90/vtcode-sessions       | First-byte `                     |

Format is auto-detected per file. The agent column in the events table identifies the source format.
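Auto-detection along these lines can be sketched by probing the first record of each file; the field checks below mirror the table above, but the parser's exact rules may differ:

```python
import json

def detect_format(first_line: str) -> str:
    # Hypothetical sketch: each JSON-based format is distinguished
    # by which fields appear alongside "type" in its first record.
    record = json.loads(first_line)
    if record.get("type") == "item" and "payload" in record:
        return "codex"
    if "type" in record and ("uuid" in record or "sessionId" in record):
        return "claude_code"
    if "type" in record:
        return "pi"
    return "unknown"
```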

Architecture

JSONL lines → msgspec decode → Entry (dataclass)
                                     │
                         ┌───────────┼───────────┐
                         ▼           ▼           ▼
                    sessions     events      content
                   (1 row/     (1 row/     (1 row per
                    session)    entry)      entry with
                                           text only)
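The decode step above uses msgspec in the actual pipeline; a dependency-free approximation with the standard library, where `Entry` is a stand-in rather than the package's real dataclass:

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

# Stand-in for the package's Entry dataclass; field names are illustrative.
@dataclass
class Entry:
    type: str
    role: Optional[str] = None
    tool_name: Optional[str] = None
    text: Optional[str] = None

def decode_line(line: str) -> Entry:
    # msgspec does this decode-and-validate step in one fast pass; the
    # stdlib version simply keeps only the fields Entry declares.
    raw = json.loads(line)
    known = {f.name for f in fields(Entry)}
    return Entry(**{k: v for k, v in raw.items() if k in known})

e = decode_line('{"type": "assistant", "text": "done", "extra": 1}')
```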

The three-table layout avoids the 80%-null problem of a single wide table:

| Table         | Rows   | Columns | Size (pi-mono) |
|---------------|--------|---------|----------------|
| sessions      | 626    | 15      | 64 KB          |
| events        | 36,791 | 20      | 1.4 MB         |
| content       | 16,841 | 9       | 13 MB          |
| flat (legacy) | 36,791 | 44      | 16 MB          |

Example Analyses

Error patterns by model

ds = TraceDataset.from_hub("badlogicgames/pi-mono")
ds.error_rate(group_by="model")

Tool usage distribution

ds.tool_counts(group_by="model")

Cost of struggling sessions

import polars as pl

s = ds.sessions
struggling = s.filter(pl.col("n_errors") >= 5)
healthy = s.filter(pl.col("n_errors") == 0)
print(f"Struggling: ${struggling['cost_total'].mean():.2f}/session")
print(f"Healthy:    ${healthy['cost_total'].mean():.2f}/session")

Session health predictor

The behavioral features (tool choice ratios, verbosity, error rates) predict session outcomes with AUC 0.84 after just two turns, even without seeing error counts. See the blog post for details.

Development

# Install with dev dependencies
uv sync

# Run tests
uv run pytest

# Lint
uv run ruff check .

# Type check
uv run ty check agent_traces/

# Format
uv run black agent_traces/ tests/

License

MIT
