Observability integrations (umbrella): W&B, MLflow, Langfuse, OpenTelemetry, Phoenix #52

### Why

ClawLoop already emits structured episodes, reward signals, and layer-state transitions. Teams running it in production or research usually already have an observability stack where they want those signals to land. Shipping first-class sinks for the common ones makes ClawLoop feel native to existing workflows instead of being yet another dashboard to check.

Each item below is a small, self-contained adapter with a clear contract — ideal entry points for first-time contributors.

### Integration stubs

- [ ] **Weights & Biases sink** — log per-iteration reward curves, playbook growth, layer state hashes. Pattern: `clawloop.integrations.wandb.WandbSink(run_id=...)` consuming the existing episode stream.
- [ ] **MLflow tracking** — iterations as runs, playbook entries as artifacts, reward signals as metrics. Same shape as W&B, different backend.
- [ ] **Langfuse trace export** — emit episode messages + tool calls as Langfuse traces so LLM-observability users can search/replay inside their existing UI.
- [ ] **OpenTelemetry spans** — one span per episode, nested spans per step/tool-call. Lets users pipe into any OTel-compatible backend (Honeycomb, Datadog, Tempo, etc.) without a bespoke adapter.
- [ ] **Arize Phoenix export** — trace + evaluation shape for teams already using Phoenix for LLM eval.
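
To illustrate the "same shape, different backend" idea behind the W&B and MLflow items, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: the `EpisodeSummary` fields, the metric names, and the `MetricSink` class are hypothetical and do not reflect ClawLoop's actual types. The backend is injected as a plain callable, so the same mapping could feed `wandb.log` or `mlflow.log_metrics`, both of which accept a metrics dict plus a step.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stand-in for ClawLoop's summary type; real field names may differ.
@dataclass
class EpisodeSummary:
    iteration: int
    mean_reward: float
    playbook_size: int
    layer_state_hash: str


# A backend is any callable taking (metrics, step).
MetricLogger = Callable[[Dict[str, float], int], None]


def to_metrics(summary: EpisodeSummary) -> Dict[str, float]:
    """Flatten a summary into numeric metrics (metric names are illustrative).

    The layer state hash is a string, so it is left out here; a real sink
    might record it as a tag or config value instead.
    """
    return {
        "reward/mean": summary.mean_reward,
        "playbook/size": float(summary.playbook_size),
    }


@dataclass
class MetricSink:
    """Backend-agnostic sink: only `log_fn` knows which tracker is in use."""

    log_fn: MetricLogger

    def on_summary(self, summary: EpisodeSummary) -> None:
        self.log_fn(to_metrics(summary), summary.iteration)


# Usage with an in-memory backend. With the real libraries installed, log_fn
# could be, for example:
#   lambda m, s: wandb.log(m, step=s)
#   lambda m, s: mlflow.log_metrics(m, step=s)
recorded = []
sink = MetricSink(log_fn=lambda metrics, step: recorded.append((step, metrics)))
sink.on_summary(EpisodeSummary(3, 0.75, 12, "abc123"))
print(recorded)
```

Keeping the summary-to-metrics mapping separate from the backend call means each adapter only has to supply its own `log_fn`, which is one way the two sinks could share most of their code.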

### Contract

Each sink should:
1. Consume the existing `Episode` / `EpisodeSummary` / iteration-level events — no core changes.
2. Be an **optional extra**: `uv sync --extra wandb` etc., so core stays dependency-light.
3. Ship with a minimal example under `examples/observability/` and a one-paragraph README section.
4. Fail soft — a broken sink never breaks a training run.
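
The fail-soft requirement in point 4 could be met with a thin wrapper around any sink. The sketch below is one possible shape, not the project's implementation: the `Sink` protocol, the `on_event` method name, the logger name, and the `max_failures` cutoff are all hypothetical.

```python
import logging
from dataclasses import dataclass
from typing import Any, Protocol

logger = logging.getLogger("clawloop.sinks")  # hypothetical logger name


class Sink(Protocol):
    """Hypothetical sink interface; the real event types live in ClawLoop core."""

    def on_event(self, event: Any) -> None: ...


@dataclass
class FailSoftSink:
    """Wraps a sink so its errors are logged instead of raised."""

    inner: Sink
    max_failures: int = 3  # assumed cutoff: disable the sink after this many errors
    failures: int = 0
    disabled: bool = False

    def on_event(self, event: Any) -> None:
        if self.disabled:
            return
        try:
            self.inner.on_event(event)
        except Exception:
            # Never let a sink error propagate into the training loop.
            self.failures += 1
            logger.warning(
                "sink %r failed (%d/%d)",
                self.inner, self.failures, self.max_failures, exc_info=True,
            )
            if self.failures >= self.max_failures:
                self.disabled = True


class BrokenSink:
    """Simulates a backend that is always unreachable."""

    def on_event(self, event: Any) -> None:
        raise RuntimeError("backend unreachable")


# The loop completes even though every call to the inner sink raises.
wrapped = FailSoftSink(BrokenSink(), max_failures=2)
for step in range(5):
    wrapped.on_event({"step": step})
print(wrapped.failures, wrapped.disabled)
```

Disabling the sink after repeated failures avoids flooding the logs when a backend stays down for a whole run; whether that cutoff is wanted at all is a design choice left to each adapter.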

### Why an umbrella?

Each integration is roughly a day of work and independent of the others. Tracking them together in one issue shows intent; landing each as its own PR keeps reviews manageable.
