Commit 54542e0

Merge pull request #60 from ClickHouse/pufit/langfuse-observability
Add optional Langfuse observability
2 parents 440d94e + 2e7218e commit 54542e0

14 files changed

Lines changed: 959 additions & 162 deletions

config.example.yaml

Lines changed: 10 additions & 0 deletions
@@ -63,3 +63,13 @@ memory:
 cron:
   system_file: ~/.nerve/cron/system.yaml # Managed by 'nerve init' - system crons
   jobs_file: ~/.nerve/cron/jobs.yaml # Your custom crons - never touched by Nerve
+
+# Observability (optional): Langfuse tracing for the agent loop and memU.
+# Set keys in config.local.yaml. Without keys, the integration is a no-op.
+# See docs/observability.md for setup details.
+# langfuse:
+#   public_key: pk-lf-... # from your Langfuse project (set in config.local.yaml)
+#   secret_key: sk-lf-... # from your Langfuse project (set in config.local.yaml)
+#   host: https://cloud.langfuse.com
+#   # redact_patterns: # optional - strip these regexes from spans
+#   #   - "sk-ant-[A-Za-z0-9_\\-]{20,}"

docs/observability.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# Observability: Langfuse

Nerve has an optional Langfuse integration for tracing the agent loop and
the memU memory pipeline. When configured, every Claude Agent SDK turn,
tool call, and direct Anthropic SDK call (memU embeddings/condensation)
becomes a span in your Langfuse project, tagged with `session_id`,
`source` (`web` / `cron` / `telegram` / `hook`), `model`, and `channel`.

When the keys aren't set, the integration is a complete no-op: Nerve
runs identically with zero observability overhead.

## What gets captured

| Surface                       | Source                                       | Tags                                              |
|-------------------------------|----------------------------------------------|---------------------------------------------------|
| Agent turns + tool calls      | `claude_agent_sdk` via LangSmith integration | `source:*`, `model:*`, `channel:*` (when present) |
| memU chat / summarize / embed | `anthropic` SDK via `AnthropicInstrumentor`  | `component:memu`, `purpose:summarize`             |

Trace-level attributes (`session_id`, `metadata.parent_session_id`,
`metadata.fork_from`) are propagated to every span emitted inside a turn
via OpenTelemetry Baggage.
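
The propagation idea can be illustrated without any tracing library. The sketch below is a stdlib-only analogy, not Nerve's implementation and not the OpenTelemetry API: it uses `contextvars` to show how attributes set once at the start of a turn end up on every span emitted inside it. The names `turn_attributes` and `emit_span` are hypothetical.

```python
import contextvars
from contextlib import contextmanager

# Holds the attributes that apply to the current turn (hypothetical stand-in
# for OpenTelemetry Baggage; the dict is never mutated in place).
_trace_attrs = contextvars.ContextVar("trace_attrs", default={})


@contextmanager
def turn_attributes(**attrs):
    """Attach attributes to everything emitted inside the with-block."""
    token = _trace_attrs.set({**_trace_attrs.get(), **attrs})
    try:
        yield
    finally:
        # Restore whatever was active before this turn started.
        _trace_attrs.reset(token)


def emit_span(name):
    """Build a span record that inherits the current turn's attributes."""
    return {"name": name, **_trace_attrs.get()}


with turn_attributes(session_id="s1"):
    outer = emit_span("agent_turn")
    with turn_attributes(fork_from="s0"):
        inner = emit_span("tool_call")
```

Here `inner` carries both `session_id` and `fork_from`, while a span emitted after the outer block exits carries neither, which is the scoping behaviour the paragraph above relies on.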

## Setup

### 1. Get a Langfuse project

Two options:

- **Langfuse Cloud**: sign up at <https://cloud.langfuse.com> and create
  a project. Regional endpoints: `https://cloud.langfuse.com` (EU, default),
  `https://us.cloud.langfuse.com` (US),
  `https://jp.cloud.langfuse.com` (JP).
- **Self-hosted**: follow the upstream deployment guide at
  <https://langfuse.com/self-hosting/deployment/docker-compose>, then
  point Nerve at the resulting host URL.

### 2. Get API keys

In the Langfuse UI: *Project Settings → API Keys → Create new API keys*.
Copy the public (`pk-lf-...`) and secret (`sk-lf-...`) keys.

### 3. Configure Nerve

Add to `config.local.yaml` (gitignored):

```yaml
langfuse:
  public_key: pk-lf-...
  secret_key: sk-lf-...
  host: https://cloud.langfuse.com
```

Restart Nerve. On startup you should see one of:

- `Langfuse: enabled (host=...)`: keys valid, tracing active.
- `Langfuse: disabled (no public_key/secret_key in config)`: keys absent.
- `Langfuse: auth_check failed against ...`: keys present but rejected.

Visit the diagnostics page (`/diagnostics`) to confirm the live status.
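
The three startup outcomes map onto a simple decision: no keys means disabled, keys that fail the auth check mean a failure message, and anything else means enabled. The following is a minimal sketch of that decision, assuming a hypothetical `langfuse_startup_status` helper and an `auth_check` callable supplied by the caller; it is not Nerve's actual startup code, and the way the host is interpolated into the messages is an assumption.

```python
from typing import Callable, Mapping

DEFAULT_HOST = "https://cloud.langfuse.com"


def langfuse_startup_status(
    cfg: Mapping[str, str], auth_check: Callable[[], bool]
) -> str:
    """Return the startup log line for a given langfuse config section."""
    public_key = cfg.get("public_key", "")
    secret_key = cfg.get("secret_key", "")
    host = cfg.get("host", DEFAULT_HOST)

    # Missing or empty keys are the off switch: never call the server.
    if not public_key or not secret_key:
        return "Langfuse: disabled (no public_key/secret_key in config)"

    # Keys are present, so ask the (caller-supplied) check whether they work.
    if not auth_check():
        return f"Langfuse: auth_check failed against {host}"

    return f"Langfuse: enabled (host={host})"
```

Passing the check in as a callable keeps the sketch runnable offline; in a real setup it would wrap whatever credential check the installed Langfuse SDK provides.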

## Configuration reference

| Field             | Default                      | Notes                                                               |
|-------------------|------------------------------|---------------------------------------------------------------------|
| `public_key`      | `""`                         | `pk-lf-...`; required to activate.                                  |
| `secret_key`      | `""`                         | `sk-lf-...`; required to activate.                                  |
| `host`            | `https://cloud.langfuse.com` | Region endpoint or self-hosted URL.                                 |
| `redact_patterns` | (built-in secret regexes)    | List of regexes; matched substrings are replaced with `[REDACTED]`. |

The default `redact_patterns` strip common secret formats: Anthropic API
keys, Langfuse keys, and bcrypt hashes. Add patterns for any
project-specific secret formats that must not leave the host.
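
To see what a pattern list does to span text, here is a small sketch of regex-based redaction. It assumes a hypothetical `redact` helper; the first pattern is the example from `config.example.yaml`, while the second is an assumed shape for Langfuse keys and not necessarily one of Nerve's built-in defaults.

```python
import re

# Illustrative patterns only. Note that the YAML example writes "\\-" inside
# a double-quoted string, which is a single backslash in the regex itself.
REDACT_PATTERNS = [
    r"sk-ant-[A-Za-z0-9_\-]{20,}",      # from config.example.yaml
    r"(?:pk|sk)-lf-[A-Za-z0-9\-]{8,}",  # assumed Langfuse key shape
]


def redact(text, patterns=REDACT_PATTERNS, placeholder="[REDACTED]"):
    """Replace every substring matching any pattern with the placeholder."""
    for pattern in patterns:
        text = re.sub(pattern, placeholder, text)
    return text


sample = "auth with sk-ant-" + "a" * 24 + " then continue"
cleaned = redact(sample)
```

Only the matched substring is replaced, so the surrounding prompt text stays readable in the trace.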

## Privacy note

When enabled, **prompt content, tool inputs, and model outputs leave the
host** and go to whichever Langfuse instance you point at. The `host`
field is the boundary, so make sure it points where you want the data to
go. For strict data residency, self-host Langfuse on infrastructure you
control.

`redact_patterns` is a defensive layer, useful even with trusted
endpoints in case a secret leaks into a prompt accidentally.

## Disabling

Remove or empty the `public_key` / `secret_key` fields. No restart-time
flags, no feature gates: the lack of keys is the off switch.

## Cost cross-check

Langfuse computes its own cost based on token counts and a price model
maintained by Langfuse. Nerve's `db/usage.py` computes cost in-process
via a hardcoded `MODEL_PRICING` dict and the SDK's
`ResultMessage.total_cost_usd`. Expect minor mismatches between the two,
since they're independent calculations. Treat Langfuse as a second,
independent figure for catching local cost-tracking bugs.
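
A cross-check of this kind amounts to computing a cost from token counts and comparing it to the other figure within a tolerance. The sketch below assumes a hypothetical pricing table (per million tokens, with made-up numbers) and hypothetical helper names; it does not reproduce the contents of Nerve's `MODEL_PRICING` or Langfuse's price model.

```python
import math

# Made-up prices in USD per million tokens, for illustration only.
EXAMPLE_PRICING = {
    "example-model": {"input": 3.00, "output": 15.00},
}


def local_cost_usd(model, input_tokens, output_tokens, pricing=EXAMPLE_PRICING):
    """Compute cost from token counts using a static price table."""
    price = pricing[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def costs_agree(local, remote, rel_tol=0.05):
    """True when the two figures differ by at most rel_tol (relative)."""
    return math.isclose(local, remote, rel_tol=rel_tol)


local = local_cost_usd("example-model", input_tokens=1_000, output_tokens=500)
```

A small relative tolerance absorbs rounding and price-table drift, while a large gap (for example a missing cache-token category) still stands out.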

## Troubleshooting

- **Spans aren't appearing.** Check `/api/observability/status`:
  if `auth_ok: false`, the keys are wrong. If `enabled: false` despite
  keys being set, look at startup logs for an `ImportError` on the
  `langfuse` package itself (run `uv pip install -e .` to refresh).
- **Spans are tagged but session_id is missing.** That can happen if the
  installed Langfuse SDK doesn't accept the `session_id=` keyword argument
  in `propagate_attributes`. Upgrade to a newer Langfuse Python SDK.
- **The host runs out of memory under heavy load.** The Langfuse SDK
  buffers spans and ships them asynchronously. If memory is tight you can
  drop the Anthropic instrumentation by editing `init_langfuse`, or deploy
  self-hosted Langfuse on a separate machine.
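
The first troubleshooting step can be turned into a small triage helper. This sketch assumes the status endpoint returns JSON containing the `enabled` and `auth_ok` fields mentioned above; any other fields, and the `diagnose` helper itself, are hypothetical.

```python
import json


def diagnose(status):
    """Map a status payload onto the troubleshooting advice above."""
    if not status.get("enabled", False):
        return (
            "disabled: confirm both keys are set, then check the startup "
            "logs for an ImportError on the langfuse package"
        )
    if not status.get("auth_ok", False):
        return "auth failed: the configured keys were rejected"
    return "ok: tracing should be active"


# Example payload as it might arrive from /api/observability/status.
payload = json.loads('{"enabled": true, "auth_ok": false}')
advice = diagnose(payload)
```

Checking `enabled` before `auth_ok` mirrors the order of the advice: an integration that never started cannot have a meaningful auth result.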
