
Add production-observability example #117

Merged
chris-colinsky merged 1 commit into main from feature/example-production-observability on Jun 2, 2026
@chris-colinsky
Member

Summary

Second of three examples picked from the audit (after PR #116's chat-with-multimodal; crash-and-resume on example 08 still to come).

New examples/12-production-observability/ demonstrates the production-grade observability stack end-to-end. Pairs the dual-observer pattern (the README pitch's "no SaaS lock-in" claim) with the caller-hook surface from proposal 0043 and the canonical TimingMiddleware so a reader sees what each piece does in one place.

What's wired:

  • Both observers on one graph: OTelObserver + LangfuseObserver attached simultaneously (proposal 0031). Each consumes the same NodeEvent stream independently; nothing in node code knows there are two.
  • Caller hooks for trace.input / trace.output (proposal 0043 §8.4.1). Hooks return domain dicts like {"question": ...} / {"answer": ..., "model": ...} shaped for the Langfuse UI viewer; raw State stays out of trace payloads.
  • Built-in TimingMiddleware wrapping the respond node. on_complete callback receives a TimingRecord(node_name, duration_ms, outcome, exception_category) and prints a one-line summary; production callbacks would queue to a metrics backend (StatsD / Prometheus / OTLP metrics) instead.
  • invoke(metadata={...}) carrying multi-tenant identifiers (tenantId / requestId / featureFlag). Both observers pick them up in one call: OTel as openarmature.user.* span attributes, Langfuse as top-level trace.metadata keys plus per-observation metadata.
  • In-memory captures: InMemoryLangfuseClient + InMemorySpanExporter capture in-process so the demo prints what both backends would have ingested without needing real cloud credentials. Walk-through doc shows the production swap (LangfuseSDKAdapter, BatchSpanProcessor + OTLPSpanExporter).
  • disable_llm_payload=False on BOTH observers so the captured LLM input messages and output content appear in both backends. The example's whole point would be undercut by leaving the payload capture asymmetric.
  • try / except NodeException at the invoke() boundary surfaces the underlying LlmProviderError category so a reader sees the production-shape error path. Both observer captures still print on failure so the dual-observer story extends to failure modes too.
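
To make the fan-out concrete, here is a minimal stdlib-only sketch of the pattern the bullets describe. It does not use the openarmature API: `NodeEvent`, `SpanStyleObserver`, `TraceStyleObserver`, and `dispatch` are hypothetical stand-ins, and the only details taken from this PR are the `openarmature.user.*` attribute prefix, the metadata key names, and the shape of the hook return values.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class NodeEvent:
    """Hypothetical stand-in for the event a graph emits per node."""
    node_name: str
    state: dict[str, Any]
    metadata: dict[str, str]


@dataclass
class SpanStyleObserver:
    """Stand-in for the OTel side: caller metadata becomes namespaced attributes."""
    spans: list[dict[str, Any]] = field(default_factory=list)

    def on_event(self, event: NodeEvent) -> None:
        attrs = {f"openarmature.user.{k}": v for k, v in event.metadata.items()}
        self.spans.append({"name": event.node_name, "attributes": attrs})


@dataclass
class TraceStyleObserver:
    """Stand-in for the Langfuse side: hooks shape input/output from State."""
    input_hook: Callable[[dict[str, Any]], dict[str, Any]]
    output_hook: Callable[[dict[str, Any]], dict[str, Any]]
    traces: list[dict[str, Any]] = field(default_factory=list)

    def on_event(self, event: NodeEvent) -> None:
        self.traces.append({
            "input": self.input_hook(event.state),
            "output": self.output_hook(event.state),
            "metadata": dict(event.metadata),  # top-level keys, copied as-is
        })


def dispatch(event: NodeEvent, observers: list[Any]) -> None:
    """Deliver one event to every observer; node code never sees this list."""
    for observer in observers:
        observer.on_event(event)


otel_like = SpanStyleObserver()
langfuse_like = TraceStyleObserver(
    input_hook=lambda s: {"question": s["question"]},
    output_hook=lambda s: {"answer": s["answer"], "model": s["model"]},
)
event = NodeEvent(
    node_name="respond",
    state={"question": "ping?", "answer": "pong", "model": "gpt-4o-mini", "scratch": "internal"},
    metadata={"tenantId": "t-1", "requestId": "r-1", "featureFlag": "beta"},
)
dispatch(event, [otel_like, langfuse_like])
print(otel_like.spans[0]["attributes"])
print(langfuse_like.traces[0])
```

In this sketch the `scratch` key in State never reaches the trace payload because only the hooks decide what is exported, which is the property the caller-hook bullet is after.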

Complementary to example 03 (observer hooks at finer granularity) and example 10 (Langfuse + LangfusePromptBackend prompt linkage). This example's headline is the production-shape wiring, not the hook surface or prompt management.

Verified end-to-end

Real run against gpt-4o-mini: TimingMiddleware fired with [timing] respond: 4217.6ms (success); both captures show the three caller-supplied metadata entries; the Langfuse trace shows input / output from the caller hooks; the LLM payload is captured on the Generation (input messages + output content); and three distinct identifiers appear (requestId from the caller, correlation_id from OA, Trace id = invocation_id).
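
For readers who want to see where a line like `[timing] respond: 4217.6ms (success)` could come from, here is a rough stdlib-only sketch of a timing wrapper. It is not the library's TimingMiddleware: `timed` and `format_record` are hypothetical helpers, the `TimingRecord` field names follow the PR description, and the `"error"` outcome value and the use of the exception class name as the category are assumptions.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class TimingRecord:
    # Field names mirror the PR description; the class itself is a stand-in.
    node_name: str
    duration_ms: float
    outcome: str
    exception_category: Optional[str] = None


def format_record(record: TimingRecord) -> str:
    """Render the one-line summary format quoted from the manual run."""
    return f"[timing] {record.node_name}: {record.duration_ms:.1f}ms ({record.outcome})"


def timed(
    node_name: str,
    node: Callable[[dict[str, Any]], dict[str, Any]],
    on_complete: Callable[[TimingRecord], None],
) -> Callable[[dict[str, Any]], dict[str, Any]]:
    """Wrap a node so on_complete fires once per call, on success or failure."""
    def wrapper(state: dict[str, Any]) -> dict[str, Any]:
        start = time.perf_counter()
        try:
            result = node(state)
        except Exception as exc:
            elapsed = (time.perf_counter() - start) * 1000.0
            # Assumption: the category is just the exception class name.
            on_complete(TimingRecord(node_name, elapsed, "error", type(exc).__name__))
            raise
        elapsed = (time.perf_counter() - start) * 1000.0
        on_complete(TimingRecord(node_name, elapsed, "success"))
        return result
    return wrapper


records: list[TimingRecord] = []
respond = timed("respond", lambda s: {**s, "answer": "pong"}, records.append)
output = respond({"question": "ping?"})
print(format_record(records[0]))
```

A production callback would replace `records.append` with something that enqueues the record for a metrics backend rather than printing it.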

Test plan

  • Pyright clean
  • ruff + format clean
  • No em dashes
  • mkdocs strict build clean
  • AGENTS.md drift test passes after regen
  • Full suite (1080 pass, no regressions)
  • Manual end-to-end run successful

Out of scope

  • Crash-and-resume drama on example 08 (next PR, third of the audit picks).
  • Example renumbering + topical catalog reorder (task #34, "docs: flip site_url to apex openarmature.ai", after all example work lands).
  • Real Langfuse SDK / OTLP exporter setup (walk-through doc covers the swap recipe; example 03 owns the OTel-only side at finer granularity; example 10 owns the LangfusePromptBackend prompt-linkage side).

Reviewer notes

One observation from the manual run: the OTel side captured two spans (respond + openarmature.llm.complete) but didn't include the openarmature.invocation root span. Worth investigating later but not in this PR's scope: the spans that DO appear carry the metadata correctly, and the headline wiring is demonstrated. May file a follow-on if it turns out the InMemorySpanExporter captures aren't reaching the invocation span specifically.
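
One way a follow-on could check for this is to look for captured spans whose parent was never captured. The sketch below is stdlib-only and purely illustrative: `CapturedSpan` and `missing_parents` are hypothetical, the span ids are made up, and the parent relationships shown are an assumption, not data from the manual run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapturedSpan:
    """Hypothetical flattened view of an exported span."""
    name: str
    span_id: str
    parent_id: Optional[str]


def missing_parents(spans: list[CapturedSpan]) -> list[str]:
    """Return parent ids referenced by captured spans but absent from the capture."""
    seen = {span.span_id for span in spans}
    return sorted({
        span.parent_id
        for span in spans
        if span.parent_id is not None and span.parent_id not in seen
    })


# Assumed shape: the respond span points at an invocation span that was not captured.
captured = [
    CapturedSpan("respond", "span-respond", "span-invocation"),
    CapturedSpan("openarmature.llm.complete", "span-llm", "span-respond"),
]
print(missing_parents(captured))
```

A non-empty result would indicate the root span exists in the trace but is not reaching the exporter; an empty result with no root-named span would instead suggest the root span is never created.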

New examples/12-production-observability/ demonstrates the
production-grade observability stack end-to-end. Pairs the dual-
observer pattern (the README pitch's "no SaaS lock-in" claim) with
the caller-hook surface from proposal 0043 and the canonical
TimingMiddleware so a reader sees what each piece does in one
place.

What's wired:

- Both OTelObserver and LangfuseObserver attached to the same
  graph (proposal 0031). Each consumes the same NodeEvent stream
  independently; nothing in node code knows there are two.
- trace_input_from_state / trace_output_from_state caller hooks on
  the LangfuseObserver (proposal 0043 §8.4.1). Hooks return domain
  dicts shaped for the Langfuse UI viewer; raw State stays out of
  trace payloads.
- Built-in TimingMiddleware wrapping the respond node. The
  on_complete callback receives a TimingRecord and prints a
  one-line summary; production callbacks would queue to a metrics
  backend (StatsD / Prometheus / OTLP metrics) instead.
- invoke(metadata={...}) carrying multi-tenant identifiers (tenantId
  / requestId / featureFlag). Both observers pick them up in one
  call: OTel as openarmature.user.* span attributes, Langfuse as
  top-level trace.metadata keys.
- InMemoryLangfuseClient + InMemorySpanExporter capture in-process
  so the demo prints what both backends would have ingested without
  needing real cloud credentials. Walk-through doc shows the
  production swap (LangfuseSDKAdapter, BatchSpanProcessor +
  OTLPSpanExporter).
- disable_llm_payload=False on BOTH observers so the captured LLM
  input messages and output content appear in both backends (the
  whole point of the example would be undercut by leaving the
  payload capture asymmetric).
- try/except NodeException at the invoke() boundary surfaces the
  underlying LlmProviderError category so a reader sees the
  production-shape error path. Both observer captures still print
  on failure so the dual-observer story extends to failure modes.

Complementary to example 03 (observer hooks at finer granularity)
and example 10 (Langfuse + LangfusePromptBackend prompt linkage).
This example's headline is the production-shape wiring, not the
hook surface or prompt management.
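
The error-path bullet above can be sketched as follows. This is a stdlib-only illustration, not the library's classes: `NodeException` and `LlmProviderError` here are local stand-ins, the `category` attribute, the `"rate_limit"` value, and the `"unknown"` fallback are assumptions, and `run_with_boundary` is a hypothetical helper.

```python
from typing import Callable, Optional


class LlmProviderError(Exception):
    """Stand-in provider error carrying a coarse category (assumed attribute)."""

    def __init__(self, message: str, category: str) -> None:
        super().__init__(message)
        self.category = category


class NodeException(Exception):
    """Stand-in wrapper raised at the invoke() boundary when a node fails."""

    def __init__(self, node_name: str, cause: Exception) -> None:
        super().__init__(f"node {node_name!r} failed: {cause}")
        self.node_name = node_name
        self.__cause__ = cause


def run_with_boundary(
    invoke: Callable[[], dict],
    print_captures: Callable[[], None],
) -> Optional[str]:
    """Return the provider error category on failure, or None on success."""
    try:
        invoke()
        return None
    except NodeException as exc:
        cause = exc.__cause__
        if isinstance(cause, LlmProviderError):
            return cause.category
        return "unknown"
    finally:
        # Captures print on both paths, so failures are observable too.
        print_captures()


def failing_invoke() -> dict:
    raise NodeException("respond", LlmProviderError("429 from provider", "rate_limit"))


printed: list[str] = []
category = run_with_boundary(failing_invoke, lambda: printed.append("captures"))
print(f"provider error category: {category}")
```

Putting the capture printing in `finally` is what keeps the dual-observer output visible when the run fails.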
Copilot AI review requested due to automatic review settings June 2, 2026 00:22

Copilot AI left a comment

Pull request overview

Adds a new example #12 that wires both the OTel and Langfuse observers on a single graph, with caller hooks shaping trace.input/trace.output, the canonical TimingMiddleware, and invoke(metadata=...) propagation to demonstrate a production-shape observability stack end-to-end. Includes companion documentation, catalog/nav entries, and a CHANGELOG note.

Changes:

  • New runnable example at examples/12-production-observability/main.py using in-memory captures for both backends.
  • New docs page docs/examples/12-production-observability.md plus catalog/nav entries.
  • AGENTS.md and CHANGELOG entries updated for the new example.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| examples/12-production-observability/main.py | New single-node graph wired with dual observers, timing middleware, and metadata propagation. |
| docs/examples/12-production-observability.md | New walk-through doc with sample output and production-swap recipe. |
| docs/examples/index.md | Adds catalog entry for example 12. |
| mkdocs.yml | Adds example 12 to the docs nav. |
| src/openarmature/AGENTS.md | Mentions example 12 in the bundled examples index. |
| CHANGELOG.md | Adds an "Added" bullet for the new example. |

@chris-colinsky chris-colinsky merged commit 260d1ec into main Jun 2, 2026
7 checks passed
@chris-colinsky chris-colinsky deleted the feature/example-production-observability branch June 2, 2026 00:25