Skip to content

v1.1.0 - Multimodal observability

Choose a tag to compare

@Mandark-droid Mandark-droid released this 28 Apr 10:32
· 18 commits to main since this release

v1.1.0 — Multimodal observability + new model pricing

Multimodal observability

First-class capture of image, audio, video, and document content parts on
OpenAI, Anthropic, Google Gemini, and Groq spans. Defines an additive,
OTel-compatible attribute namespace for multimodal content that is being
proposed upstream to OpenTelemetry semantic-conventions
(issue #3672).

New module: genai_otel.media — provider-agnostic ContentPart detection,
offload pipeline (redact → upload → URI), pluggable stores
(filesystem / s3 / minio / http) and built-in redactors
(exif_stripper, face_blur, pdf_pii_redact). All heavy deps lazy-imported.

New attributes (additive):

gen_ai.prompt.{n}.role
gen_ai.prompt.{n}.content.{m}.{type, text, media_uri, media_mime_type,
                                media_byte_size, media_source}
gen_ai.completion.* mirror namespace
gen_ai.media.stripped_reason

Defaults: GENAI_OTEL_MEDIA_CAPTURE_MODE=off. Text-only behaviour is
byte-identical to 1.0.x — multimodal capture is opt-in.

Quickstart:

pip install 'genai-otel-instrument[multimodal,openai,anthropic,google]'
export GENAI_OTEL_MEDIA_CAPTURE_MODE=full
export GENAI_OTEL_MEDIA_STORE=minio
export GENAI_OTEL_MEDIA_STORE_ENDPOINT=http://localhost:9000
export GENAI_OTEL_MEDIA_STORE_ACCESS_KEY=...
export GENAI_OTEL_MEDIA_STORE_SECRET_KEY=...

See docs/guides/multimodal.md
for the full guide and examples/multimodal/
for runnable scripts (vision, audio, video, document, face-blur, end-to-end validator).

New model pricing

  • OpenAI GPT-5.5 — input $5/1M, output $30/1M (Apr 2026)
  • DeepSeek V4 Flash — input $0.14/1M, output $0.28/1M
  • DeepSeek V4 Pro — input $0.435/1M, output $0.87/1M (75% promotional rate until 2026-05-31)

Validation

  • 41 new unit tests (provider × modality detection matrix, offload pipeline gating, store backends, redactor graceful-degrade, per-instrumentor wiring)
  • Live integration test against MinIO (skipped without env vars)
  • End-to-end validator: span → OTel collector → OpenSearch round-trip with bytes resolvable from MinIO via media_uri

New pyproject extras

multimodal-images (Pillow), multimodal-pdf (pypdf), multimodal-faces (opencv), multimodal-s3 (boto3), umbrella multimodal.

Backwards compatibility

Purely additive. No removals. Existing text-only deployments see no change.