Add production-observability example by chris-colinsky · Pull Request #117 · LunarCommand/openarmature-python

chris-colinsky · 2026-06-02T00:22:53Z

Summary

Second of three examples picked from the audit (after PR #116's chat-with-multimodal; crash-and-resume on example 08 still to come).

New examples/12-production-observability/ demonstrates the production-grade observability stack end-to-end. Pairs the dual-observer pattern (the README pitch's "no SaaS lock-in" claim) with the caller-hook surface from proposal 0043 and the canonical TimingMiddleware so a reader sees what each piece does in one place.

What's wired:

Both observers on one graph — OTelObserver + LangfuseObserver attached simultaneously (proposal 0031). Each consumes the same NodeEvent stream independently; nothing in node code knows there are two.
Caller hooks for trace.input / trace.output (proposal 0043 §8.4.1). Hooks return domain dicts like {"question": ...} / {"answer": ..., "model": ...} shaped for the Langfuse UI viewer; raw State stays out of trace payloads.
Built-in TimingMiddleware wrapping the respond node. The on_complete callback receives a TimingRecord(node_name, duration_ms, outcome, exception_category) and prints a one-line summary; production callbacks would queue to a metrics backend (StatsD / Prometheus / OTLP metrics) instead.
invoke(metadata={...}) carrying multi-tenant identifiers (tenantId / requestId / featureFlag). Both observers pick them up in one call: OTel as openarmature.user.* span attributes, Langfuse as top-level trace.metadata keys plus per-observation metadata.
In-memory captures — InMemoryLangfuseClient + InMemorySpanExporter capture in-process so the demo prints what both backends would have ingested without needing real cloud credentials. Walk-through doc shows the production swap (LangfuseSDKAdapter, BatchSpanProcessor + OTLPSpanExporter).
disable_llm_payload=False on BOTH observers so the captured LLM input messages and output content appear in both backends. The example's whole point would be undercut by leaving the payload capture asymmetric.
try / except NodeException at the invoke() boundary surfaces the underlying LlmProviderError category so a reader sees the production-shape error path. Both observer captures still print on failure so the dual-observer story extends to failure modes too.
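The dual-observer, caller-hook, and metadata items above can be sketched without the library. This is a minimal stdlib-only illustration, not the openarmature API: `NodeEvent` here is a stand-in dataclass, and `OTelStyleObserver`, `LangfuseStyleObserver`, `run_respond`, and the stub answer are hypothetical names invented for the sketch. Only the `openarmature.user.*` attribute prefix, the top-level trace metadata placement, the hook names, and the metadata keys come from this PR.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-ins for illustration; these are NOT openarmature classes.
Hook = Callable[[dict[str, Any]], dict[str, Any]]


@dataclass
class NodeEvent:
    node_name: str
    state: dict[str, Any]
    metadata: dict[str, str]


class OTelStyleObserver:
    """Records caller metadata as namespaced span attributes."""

    def __init__(self) -> None:
        self.spans: list[dict[str, Any]] = []

    def on_event(self, event: NodeEvent) -> None:
        attributes = {f"openarmature.user.{k}": v for k, v in event.metadata.items()}
        self.spans.append({"name": event.node_name, "attributes": attributes})


class LangfuseStyleObserver:
    """Records metadata at the top level of the trace; hooks shape input/output."""

    def __init__(self, trace_input_from_state: Hook, trace_output_from_state: Hook) -> None:
        self._input_hook = trace_input_from_state
        self._output_hook = trace_output_from_state
        self.traces: list[dict[str, Any]] = []

    def on_event(self, event: NodeEvent) -> None:
        self.traces.append({
            "metadata": dict(event.metadata),
            "input": self._input_hook(event.state),    # domain dict, not raw state
            "output": self._output_hook(event.state),
        })


def run_respond(state: dict[str, Any], observers: list[Any], metadata: dict[str, str]) -> dict[str, Any]:
    """Run a stub 'respond' node and fan one event out to every observer."""
    new_state = {**state, "answer": "42", "model": "stub-model"}
    event = NodeEvent("respond", new_state, metadata)
    for observer in observers:  # the node itself never knows how many there are
        observer.on_event(event)
    return new_state


otel = OTelStyleObserver()
langfuse = LangfuseStyleObserver(
    trace_input_from_state=lambda s: {"question": s["question"]},
    trace_output_from_state=lambda s: {"answer": s["answer"], "model": s["model"]},
)
final_state = run_respond(
    {"question": "What is 6 x 7?", "history": ["earlier turn"]},
    [otel, langfuse],
    {"tenantId": "acme", "requestId": "req-1", "featureFlag": "beta"},
)
```

In this sketch the `history` key stays in state but never reaches the trace payload, which is the effect the caller hooks are meant to have.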

Complementary to example 03 (observer hooks at finer granularity) and example 10 (Langfuse + LangfusePromptBackend prompt linkage). This example's headline is the production-shape wiring, not the hook surface or prompt management.
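The TimingMiddleware and error-boundary items in the list above can be illustrated the same way. The sketch below is an assumption-laden stand-in rather than the library's middleware: `with_timing`, `format_timing`, the `"failure"` outcome value, the `"rate_limit"` category, and the simplified `NodeException` (which carries a category string directly instead of wrapping an LlmProviderError) are all hypothetical. The `TimingRecord` field names and the one-line summary format are taken from this PR.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class TimingRecord:
    node_name: str
    duration_ms: float
    outcome: str
    exception_category: Optional[str]


class NodeException(Exception):
    """Simplified stand-in: carries the provider error category as a string."""

    def __init__(self, node_name: str, category: str) -> None:
        super().__init__(f"{node_name} failed: {category}")
        self.node_name = node_name
        self.category = category


def format_timing(record: TimingRecord) -> str:
    return f"[timing] {record.node_name}: {record.duration_ms:.1f}ms ({record.outcome})"


def with_timing(
    node_name: str,
    node: Callable[[dict[str, Any]], dict[str, Any]],
    on_complete: Callable[[TimingRecord], None],
) -> Callable[[dict[str, Any]], dict[str, Any]]:
    """Wrap a node so on_complete fires on both success and failure."""

    def wrapped(state: dict[str, Any]) -> dict[str, Any]:
        start = time.perf_counter()
        try:
            result = node(state)
        except NodeException as exc:
            elapsed = (time.perf_counter() - start) * 1000.0
            on_complete(TimingRecord(node_name, elapsed, "failure", exc.category))
            raise  # the caller's boundary still sees the original exception
        elapsed = (time.perf_counter() - start) * 1000.0
        on_complete(TimingRecord(node_name, elapsed, "success", None))
        return result

    return wrapped


def failing_node(state: dict[str, Any]) -> dict[str, Any]:
    raise NodeException("respond", "rate_limit")


records: list[TimingRecord] = []
ok = with_timing("respond", lambda s: {**s, "answer": "42"}, records.append)
bad = with_timing("respond", failing_node, records.append)

ok({"question": "What is 6 x 7?"})
try:
    bad({"question": "What is 6 x 7?"})
except NodeException as exc:
    surfaced_category = exc.category  # what the invoke() boundary would report

for record in records:
    print(format_timing(record))
```

A production callback would replace `records.append` with something that enqueues to a metrics backend, as the list item notes.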

Verified end-to-end

Real run against gpt-4o-mini: TimingMiddleware fired with [timing] respond: 4217.6ms (success); both captures show the three caller-supplied metadata entries; the Langfuse trace shows input / output from the caller hooks; the LLM payload is captured on the Generation (input messages + output content); and three distinct identifiers appear (requestId from the caller, correlation_id from OA, and the trace id, which equals invocation_id).

Test plan

Out of scope

Crash-and-resume drama on example 08 (next PR, third of the audit picks).
Example renumbering + topical catalog reorder (task #34, "docs: flip site_url to apex openarmature.ai"; after all example work lands).
Real Langfuse SDK / OTLP exporter setup (walk-through doc covers the swap recipe; example 03 owns the OTel-only side at finer granularity; example 10 owns the LangfusePromptBackend prompt-linkage side).

Reviewer notes

One observation from the manual run: the OTel side captured two spans (respond + openarmature.llm.complete) but did not include the openarmature.invocation root span. Worth investigating later, but not in this PR's scope: the spans that DO appear carry the metadata correctly, and the example's headline behavior is demonstrated. May file a follow-on if it turns out the InMemorySpanExporter captures aren't reaching the invocation span specifically.
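A small check could turn this observation into something a smoke test reports. The sketch below is hypothetical and uses stand-in objects; `missing_span_names` is a name invented here. It only assumes each captured span exposes a `name` attribute, which holds for the spans returned by the OpenTelemetry SDK's `InMemorySpanExporter.get_finished_spans()`.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def missing_span_names(spans: Iterable[Any], expected: Iterable[str]) -> list[str]:
    """Return the expected span names that were not captured, in expected order."""
    captured = {span.name for span in spans}
    return [name for name in expected if name not in captured]


# Stand-ins for the two spans seen in the manual run described above.
captured_spans = [
    SimpleNamespace(name="respond"),
    SimpleNamespace(name="openarmature.llm.complete"),
]
expected_names = ["openarmature.invocation", "respond", "openarmature.llm.complete"]

missing = missing_span_names(captured_spans, expected_names)
print(missing)
```

With the real exporter, `captured_spans` would be replaced by the exporter's finished spans after the run completes.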

New examples/12-production-observability/ demonstrates the production-grade observability stack end-to-end. Pairs the dual-observer pattern (the README pitch's "no SaaS lock-in" claim) with the caller-hook surface from proposal 0043 and the canonical TimingMiddleware so a reader sees what each piece does in one place.

What's wired:

- Both OTelObserver and LangfuseObserver attached to the same graph (proposal 0031). Each consumes the same NodeEvent stream independently; nothing in node code knows there are two.
- trace_input_from_state / trace_output_from_state caller hooks on the LangfuseObserver (proposal 0043 §8.4.1). Hooks return domain dicts shaped for the Langfuse UI viewer; raw State stays out of trace payloads.
- Built-in TimingMiddleware wrapping the respond node. The on_complete callback receives a TimingRecord and prints a one-line summary; production callbacks would queue to a metrics backend (StatsD / Prometheus / OTLP metrics) instead.
- invoke(metadata={...}) carrying multi-tenant identifiers (tenantId / requestId / featureFlag). Both observers pick them up in one call: OTel as openarmature.user.* span attributes, Langfuse as top-level trace.metadata keys.
- InMemoryLangfuseClient + InMemorySpanExporter capture in-process so the demo prints what both backends would have ingested without needing real cloud credentials. Walk-through doc shows the production swap (LangfuseSDKAdapter, BatchSpanProcessor + OTLPSpanExporter).
- disable_llm_payload=False on BOTH observers so the captured LLM input messages and output content appear in both backends (the whole point of the example would be undercut by leaving the payload capture asymmetric).
- try/except NodeException at the invoke() boundary surfaces the underlying LlmProviderError category so a reader sees the production-shape error path. Both observer captures still print on failure so the dual-observer story extends to failure modes.

Complementary to example 03 (observer hooks at finer granularity) and example 10 (Langfuse + LangfusePromptBackend prompt linkage). This example's headline is the production-shape wiring, not the hook surface or prompt management.

Copilot

Pull request overview

Adds a new example #12 that wires both the OTel and Langfuse observers on a single graph, with caller hooks shaping trace.input/trace.output, the canonical TimingMiddleware, and invoke(metadata=...) propagation to demonstrate a production-shape observability stack end-to-end. Includes companion documentation, catalog/nav entries, and a CHANGELOG note.

Changes:

New runnable example at examples/12-production-observability/main.py using in-memory captures for both backends.
New docs page docs/examples/12-production-observability.md plus catalog/nav entries.
AGENTS.md and CHANGELOG entries updated for the new example.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.


| File | Description |
| --- | --- |
| `examples/12-production-observability/main.py` | New single-node graph wired with dual observers, timing middleware, and metadata propagation. |
| `docs/examples/12-production-observability.md` | New walk-through doc with sample output and production-swap recipe. |
| `docs/examples/index.md` | Adds catalog entry for example 12. |
| `mkdocs.yml` | Adds example 12 to the docs nav. |
| `src/openarmature/AGENTS.md` | Mentions example 12 in the bundled examples index. |
| `CHANGELOG.md` | Adds an "Added" bullet for the new example. |


Copilot AI review requested due to automatic review settings June 2, 2026 00:22

Copilot started reviewing on behalf of chris-colinsky June 2, 2026 00:23

Copilot AI reviewed Jun 2, 2026


chris-colinsky merged commit 260d1ec into main Jun 2, 2026
7 checks passed

chris-colinsky deleted the feature/example-production-observability branch June 2, 2026 00:25

chris-colinsky mentioned this pull request Jun 2, 2026

Add examples 11 and 12 to smoke test #119

Merged

1 task
