HILT is a privacy-first, open-source format for logging human–AI interactions. Drop in one line at startup and every LLM call is captured with prompts, completions, metrics, and error context—no refactors, no custom wrappers.
- One-line auto-instrumentation for the official OpenAI Python SDK (`client.chat.completions.create`)
- Deterministic conversation threading with prompt/completion links and reply metadata
- Rich telemetry: latency, token usage, cost estimates, HTTP-style status codes, and error surfaces
- Storage backends you control: append-only JSONL files or real-time Google Sheets dashboards
- Thread-safe context management so you can override sessions per request, per worker, or per tenant
- Manual event logging via `Session.append()` for tool calls, human feedback, or guardrail results
```bash
pip install hilt-python
```

Need Google Sheets streaming? Install the Sheets extra:

```bash
pip install "hilt-python[sheets]"
```

```python
from hilt import instrument, uninstrument
from openai import OpenAI

# Enable automatic logging (writes to logs/chat.jsonl by default)
instrument(backend="local", filepath="logs/chat.jsonl")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Give me three onboarding tips"}],
)
print(response.choices[0].message.content)

# Stop logging when your app shuts down
uninstrument()
```

After the single `instrument()` call:
- Prompts and completions are recorded as separate events
- Latency, tokens, cost, and status codes are populated automatically
- Conversation IDs remain stable so you can trace every exchange end to end
- Pick the columns you need for privacy or dashboards (see below)
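Because prompts and completions land as separate events linked by `reply_to`, you can reconstruct exchanges straight from the log. A minimal sketch using only the standard library; the two JSON lines are illustrative stand-ins for real log entries, with field names taken from the column reference in this README:

```python
import json

# Two illustrative events in the JSONL shape described by the column
# reference (values are made up, not real log output).
lines = [
    '{"event_id": "evt_1", "reply_to": null, "action": "prompt", "message": "Give me three onboarding tips"}',
    '{"event_id": "evt_2", "reply_to": "evt_1", "action": "completion", "message": "1. Start small..."}',
]

events = [json.loads(line) for line in lines]
by_id = {e["event_id"]: e for e in events}

# Pair each completion with the prompt it answers via reply_to.
for event in events:
    if event["action"] == "completion" and event["reply_to"] in by_id:
        prompt = by_id[event["reply_to"]]
        print(f"{prompt['message']!r} -> {event['message']!r}")
```

The same loop works over a real `logs/chat.jsonl` by iterating the file's lines instead of the inline list.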
`instrument(..., columns=[...])` works for both the JSONL and Google Sheets backends. If you omit the argument, JSONL writes the full event payload, while Sheets defaults to every column listed here.
| Column | Description |
|---|---|
| `timestamp` | ISO timestamp of the event (UTC) |
| `conversation_id` | Stable thread identifier (`conv_xxx`) |
| `event_id` | Unique event UUID (useful for reply linking) |
| `reply_to` | Event ID this message responds to |
| `status_code` | HTTP-like status from providers (e.g., 200, 429) |
| `session` | Human-readable session alias (first characters of the ID) |
| `speaker` | `human` / `agent` plus identifier |
| `action` | Event type (`prompt`, `completion`, `system`, …) |
| `message` | Normalised single-line content (truncated to 500 chars) |
| `tokens_in` | Prompt tokens (if provider reports usage) |
| `tokens_out` | Completion tokens (if provider reports usage) |
| `cost_usd` | Monetised cost for the call (six-decimal precision) |
| `latency_ms` | Wall-clock latency from request start to response |
| `model` | Provider/model label returned by the SDK |
| `relevance_score` | Generic score slot (e.g., retrieval relevance) |
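Put together, a single logged event might look like the following JSONL record (illustrative values only, not canonical output):

```json
{
  "timestamp": "2025-01-15T10:32:07Z",
  "conversation_id": "conv_4f2a",
  "event_id": "evt_9c81",
  "reply_to": "evt_9c80",
  "status_code": 200,
  "session": "4f2a",
  "speaker": "agent:gpt-4o-mini",
  "action": "completion",
  "message": "1. Start with a guided tour...",
  "tokens_in": 12,
  "tokens_out": 87,
  "cost_usd": 0.000054,
  "latency_ms": 812,
  "model": "gpt-4o-mini",
  "relevance_score": null
}
```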
```python
instrument(
    backend="local",
    filepath="logs/redacted.jsonl",
    columns=["timestamp", "speaker", "action", "tokens_out", "cost_usd"],
)
```

```python
instrument(backend="local", filepath="logs/app.jsonl")
```

- Privacy-first: data never leaves your environment
- Plays nicely with analytics tooling (Python, Pandas, Spark, etc.)
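Since each line is a standalone JSON object, quick analyses need nothing beyond the standard library. A sketch of a cost and latency rollup, with sample records inlined in place of reading a real `logs/app.jsonl` (values are illustrative):

```python
import json

# Sample log lines standing in for the contents of logs/app.jsonl.
log_lines = [
    '{"action": "completion", "cost_usd": 0.000054, "latency_ms": 812}',
    '{"action": "prompt", "cost_usd": 0.0, "latency_ms": 0}',
    '{"action": "completion", "cost_usd": 0.000102, "latency_ms": 1204}',
]

events = [json.loads(line) for line in log_lines]
completions = [e for e in events if e["action"] == "completion"]

total_cost = sum(e["cost_usd"] for e in completions)
avg_latency = sum(e["latency_ms"] for e in completions) / len(completions)

print(f"completions={len(completions)} cost=${total_cost:.6f} avg_latency={avg_latency:.0f}ms")
```

The same records load just as easily into Pandas with `pd.read_json(path, lines=True)` for heavier analysis.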
See Google Sheets setup guide for credential and sheet ID steps.
```python
instrument(
    backend="sheets",
    sheet_id="1abc...",
    credentials_path="credentials.json",
    worksheet_name="LLM Logs",
    columns=["timestamp", "message", "cost_usd", "status_code"],
)
```

- Great for support, QA, or cost-monitoring teams
- Columns control both ordering and visibility
- Works with `credentials_path` or in-memory `credentials_json`
```python
instrument(
    backend="local",
    filepath="logs/app.jsonl",
    providers=["openai"],  # Anthropic / Gemini planned
)
```

Passing an empty list opens the session without patching any providers, which is useful for manual logging scenarios.
- Nothing recorded? Ensure `instrument()` runs before importing the OpenAI client.
- Async apps? Use the same call; the instrumentation is thread-safe and works with `AsyncOpenAI`.
- Large logs? Rotate files daily (`logs/app-YYYY-MM-DD.jsonl`) and prune with a cron job.
- Sheets failing? Double-check that the service account has editor access and that `hilt-python[sheets]` is installed.
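The daily-rotation tip above can be as simple as computing the path once at startup. A sketch with a hypothetical helper (`daily_log_path` is not part of HILT; the date format matches `logs/app-YYYY-MM-DD.jsonl`):

```python
from datetime import datetime, timezone

def daily_log_path(prefix: str = "logs/app") -> str:
    """Return today's log file path, e.g. logs/app-2025-01-15.jsonl."""
    today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return f"{prefix}-{today}.jsonl"

# At startup, pass the rotated path to instrument():
# instrument(backend="local", filepath=daily_log_path())
print(daily_log_path())
```

A cron job (or a small script on the same schedule) can then delete or archive files older than your retention window.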
See docs/ for deeper guides on privacy, advanced contexts, and FAQ.
Contributions are welcome! Start with CONTRIBUTING.md. The test suite lives in tests/, and linting/type checking is configured via Ruff, Black, and MyPy.
- Add auto-instrumentation for Anthropic Claude
- Add auto-instrumentation for Google Gemini
- Add auto-instrumentation for LangGraph
Released under the Apache License 2.0.