feat(instrument): OpenTelemetry GenAI semantic conventions for all LLM-call adapters (spec 07)#125
Closed
mmercuri wants to merge 1 commit into
Conversation
…M-call adapters (spec 07)
Implements spec ``07-otel-genai-semantic-conventions.md``: every
LLM-call event emitted by every adapter (9 providers + 16 frameworks +
embedding) is additively stamped with the canonical OpenTelemetry
GenAI ``gen_ai.*`` attribute set, alongside the existing custom
LayerLens attribute set (CLAUDE.md "complete means complete" — no
attribute removed, no dashboard broken).
The wiring is two-layered:
1. Centralised hook in ``BaseAdapter.emit_dict_event`` — recognises
``model.invoke`` / ``embedding.create`` / ``model.{request,response}``
event types and stamps via the shared
``stamp_genai_attributes(...)`` helper. Covers every framework
adapter automatically (no per-site plumbing needed).
2. Per-adapter explicit stamping in
``LLMProviderAdapter._emit_model_invoke`` — gives the most
accurate ``gen_ai.system`` resolution by passing the concrete
``provider`` argument through ``detect_gen_ai_system``. Tool calls
additionally carry ``gen_ai.tool.name`` / ``gen_ai.tool.call.id``.
Stamping is **idempotent**, **additive**, and **safe** (never raises
on partial / malformed payloads; the centralised hook swallows
stamping errors at DEBUG so the circuit-breaker path keeps running).
New module
``src/layerlens/instrument/adapters/_base/genai_semconv.py`` exports:
* Module-level constants for every spec attribute key
(``GEN_AI_SYSTEM``, ``GEN_AI_REQUEST_MODEL``,
``GEN_AI_USAGE_INPUT_TOKENS``, ``GEN_AI_RESPONSE_FINISH_REASONS``,
``GEN_AI_TOOL_NAME``, ``AWS_BEDROCK_GUARDRAIL_ID``, …).
* ``detect_gen_ai_system(name)`` — maps adapter / framework /
provider strings to the canonical OTel ``gen_ai.system`` value
(``openai``, ``anthropic``, ``aws.bedrock``, ``gcp.vertex_ai``,
``gcp.gemini``, ``cohere``, ``mistral_ai``, ``ollama``,
``litellm``, ``azure.openai``, with documented ``_OTHER``
fallback).
* ``stamp_genai_attributes(payload, request_kwargs, response_obj,
*, system, operation)`` — additive in-place stamping, supports
flat and nested ``model`` layouts (langchain), reads usage from
payload OR ``token_usage`` dict OR response object, normalises
finish-reasons to ``list[str]`` per spec.
Provider-specific extensions:
* OpenAI / Azure OpenAI — ``system_fingerprint``, ``service_tier``,
``response_format``.
* Anthropic — cache creation / read input tokens.
* AWS Bedrock — guardrail / knowledge-base / agent IDs (per spec §4.3
``aws.bedrock.*`` namespace, not ``gen_ai.*``).
Tests:
* ``tests/instrument/adapters/_base/test_genai_semconv.py`` — 49
tests pinning every constant to the spec spelling, exercising
every helper code path, asserting the additive contract, and
verifying the centralised hook stamps automatically.
* ``tests/instrument/adapters/test_genai_semconv_per_adapter.py`` —
35 tests parameterised across all 9 providers and all 16
framework adapters, asserting ``gen_ai.*`` attributes appear on
emitted ``model.invoke`` events alongside legacy fields.
Documentation:
* ``docs/adapters/otel-genai-conventions.md`` — contract,
implementation, per-adapter coverage matrix.
No regression in existing 258 framework tests; full helper +
per-adapter suite (84 tests) green.
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements spec
07-otel-genai-semantic-conventions.md— every LLM-call event emitted by every adapter (9 providers + 16 framework + embedding) is additively stamped with the canonical OpenTelemetry GenAIgen_ai.*attribute set, alongside the existing custom LayerLens attributes.src/layerlens/instrument/adapters/_base/genai_semconv.py(constants +stamp_genai_attributes+detect_gen_ai_system)BaseAdapter.emit_dict_eventcovers every framework adapter automaticallyLLMProviderAdapter._emit_model_invokeand_emit_tool_callsfor accurategen_ai.system/gen_ai.tool.*resolutiondocs/adapters/otel-genai-conventions.md— contract + per-adapter coverage matrixgen_ai.* Attribute Set Stamped
Required (always present on
model.invoke):gen_ai.system,gen_ai.provider.name,gen_ai.operation.nameRequest:
gen_ai.request.{model, max_tokens, temperature, top_p, top_k, frequency_penalty, presence_penalty, stop_sequences, seed, choice.count, encoding_formats}Response:
gen_ai.response.{id, model, finish_reasons}Usage:
gen_ai.usage.{input_tokens, output_tokens}Tool:
gen_ai.tool.{name, call.id, description, type}Agent:
gen_ai.agent.{id, name, description}Provider-specific:
gen_ai.openai.{request.service_tier, request.response_format, response.service_tier, response.system_fingerprint}gen_ai.anthropic.{cache_creation_input_tokens, cache_read_input_tokens}aws.bedrock.{guardrail.id, knowledge_base.id, agent.id}gen_ai.google.safety_ratingsPer-Adapter Wiring Coverage
_emit_model_invoke+ centralised hookBaseAdapter.emit_dict_event)embedding.create→gen_ai.operation.name=embeddings)Test plan
uv run pytest tests/instrument/adapters/_base/test_genai_semconv.py -x— 49 passeduv run pytest tests/instrument/adapters/test_genai_semconv_per_adapter.py— 35 passeduv run pytest tests/instrument/adapters/frameworks/— 258 passed, 1 skipped (no regression)uv run mypy --strict src/layerlens/instrument/adapters/_base/genai_semconv.py— cleanuv run ruff check— cleanCLAUDE.md Compliance
gen_ai.*attribute names match official OTel spec verbatim (constant table is the source of truth;TestGenAiAttributeNamesMatchSpecpins each one)