feat(observability): trace span for embeddings api by bzp2010 · Pull Request #84 · api7/aisix

bzp2010 · 2026-05-04T07:29:29Z

Summary by CodeRabbit

New Features
- Distributed tracing for embedding operations with richer telemetry and error tagging
Improvements
- Unified telemetry attributes across chat, embedding, and message flows
- Better error tracking for gateway failures and timeouts
- Consistent naming and telemetry standards across handlers
Tests
- Added tests validating telemetry property keys and values for embedding telemetry

coderabbitai · 2026-05-04T07:29:41Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1794c894-40b3-4907-8d03-447cc2e85f88

📥 Commits

Reviewing files that changed from the base of the PR and between cb4b8ae and 2df03ac.

📒 Files selected for processing (1)

src/proxy/handlers/embeddings/span_attributes.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/proxy/handlers/embeddings/span_attributes.rs

📝 Walkthrough

Walkthrough

Standardizes telemetry attribute keys by replacing hardcoded strings with OpenTelemetry semantic convention constants, enables the semconv_experimental feature, and adds telemetry span construction and wiring for embeddings including error/timeouts and tests.

Changes

OpenTelemetry Semantic Convention Standardization & Embeddings Telemetry

Layer / File(s)	Summary
Dependency `Cargo.toml`	Enable `semconv_experimental` feature for `opentelemetry-semantic-conventions = "0.31.0"`.
Semantic constants import `src/proxy/.../span_attributes/telemetry.rs`, `src/proxy/utils/trace/span_attributes.rs`	Import `opentelemetry_semantic_conventions::attribute::{GEN_AI_, SERVER_ADDRESS, SERVER_PORT, USER_ID, ...}` and replace literal `"gen_ai."`, `"server.*"`, `"user.id"` keys with these constants.
Core telemetry helpers `src/proxy/utils/trace/span_attributes.rs`	Switch finish-reasons and usage token keys to `GEN_AI_RESPONSE_FINISH_REASONS`, `GEN_AI_USAGE_INPUT_TOKENS`, `GEN_AI_USAGE_OUTPUT_TOKENS`.
Chat completions telemetry `src/proxy/handlers/chat_completions/span_attributes/telemetry.rs`, `src/proxy/handlers/chat_completions/mod.rs`	Replace request/response/chunk/usage keys with semantic constants; update span name from `"aisix.llm.chat_completion"` to `"aisix.llm.chat_completions"` and use `GEN_AI_RESPONSE_FINISH_REASONS` for finish reasons matching.
Messages telemetry `src/proxy/handlers/messages/span_attributes/telemetry.rs`	Replace request/response/chunk/usage keys with semantic-convention constants (`GEN_AI_*`, `SERVER_ADDRESS`, `SERVER_PORT`, `USER_ID`).
Embeddings telemetry: data & wiring `src/proxy/handlers/embeddings/span_attributes.rs`, `src/proxy/handlers/embeddings/mod.rs`	Add `span_attributes` module with `request_span_properties(request, provider, base_url)` and `response_span_properties(response, usage)` that emit semantic-convention keys; create span in embeddings handler, apply properties, wrap gateway embed call with `WithSpan` + `maybe_timeout`, record response properties on success and distinct `error.type` properties for gateway errors vs timeouts.
Tests `src/proxy/handlers/embeddings/span_attributes.rs` (tests inside)	Add unit tests asserting emitted embedding request/response span property keys/values (encoding formats, invocation parameters, server fields, vectors, dimension and usage values).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feat(proxy): move embed api to new provider #41: Changes to embedding/gateway/embed types and provider embed support — related to embeddings handler and telemetry wiring.
feat(observability): add messages genai span attributes #68: Refactors trace utilities and span attribute helpers — related to the semantic-convention constants adoption.

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
E2e Test Quality Review	⚠️ Warning	PR's primary observability feature (tracing spans for embeddings API) lacks E2E test validation of span creation, recording, and telemetry properties.	Add E2E test cases that verify span creation, recorded traces, span names, and span attributes (gen_ai.operation.name, gen_ai.request.model) in telemetry output.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main objective of the pull request, which is to add trace span instrumentation for the embeddings API.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Check	✅ Passed	Security analysis reveals no critical, high, or medium-severity vulnerabilities. OpenTelemetry tracing additions properly handle legitimate observability data without exposing sensitive credentials.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bzp/feat-o11y-embed-span

_{Review rate limit: 3/5 reviews remaining, refill in 16 minutes and 23 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/proxy/handlers/chat_completions/mod.rs (1)

76-76: Span name rename needs downstream observability coordination.

aisix.llm.chat_completion → aisix.llm.chat_completions can break existing dashboards/alerts until queries are updated.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/proxy/handlers/chat_completions/mod.rs` at line 76, The span name was
changed to "aisix.llm.chat_completions" in the Span::enter_with_local_parent
call which will break downstream dashboards/alerts expecting
"aisix.llm.chat_completion"; either revert the literal back to the previous name
or, if the rename is required, emit the legacy name as well (e.g., create the
new span under the existing name or attach the old name as an attribute) and add
a const for both names so callers use a single source of truth; update any
observability mapping/docs and coordinate with downstream teams before landing.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/proxy/handlers/embeddings/span_attributes.rs`:
- Around line 89-99: The code currently serializes and pushes full embeddings
via the blocks that call serde_json::to_string(response) and
serde_json::to_string(&data.embedding) inside the if-let and the for (index,
data) in response.data.iter().enumerate() loop; replace those pushes with
compact metadata only: emit output-level metadata such as total item count
(e.g., "output.count") and a source/model id if available, and per-item metadata
such as embedding length/dimension (e.g.,
"embedding.embeddings.{index}.embedding.length" or "embedding.{index}.dim")
rather than the full vector; remove or guard against pushing "output.value" and
the long "embedding.embeddings.{index}.embedding.vector" entries to avoid huge
span attributes.

---

Nitpick comments:
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Line 76: The span name was changed to "aisix.llm.chat_completions" in the
Span::enter_with_local_parent call which will break downstream dashboards/alerts
expecting "aisix.llm.chat_completion"; either revert the literal back to the
previous name or, if the rename is required, emit the legacy name as well (e.g.,
create the new span under the existing name or attach the old name as an
attribute) and add a const for both names so callers use a single source of
truth; update any observability mapping/docs and coordinate with downstream
teams before landing.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e4dbf074-b2ed-4b5f-8cf1-1ac3e099ef3a

📥 Commits

Reviewing files that changed from the base of the PR and between 0429787 and cb4b8ae.

📒 Files selected for processing (7)

Cargo.toml
src/proxy/handlers/chat_completions/mod.rs
src/proxy/handlers/chat_completions/span_attributes/telemetry.rs
src/proxy/handlers/embeddings/mod.rs
src/proxy/handlers/embeddings/span_attributes.rs
src/proxy/handlers/messages/span_attributes/telemetry.rs
src/proxy/utils/trace/span_attributes.rs

coderabbitai · 2026-05-04T07:37:09Z

+    if let Some(value) = serde_json::to_string(response).ok() {
+        properties.push(("output.value".into(), value));
+    }
+
+    for (index, data) in response.data.iter().enumerate() {
+        if let Ok(value) = serde_json::to_string(&data.embedding) {
+            properties.push((
+                format!("embedding.embeddings.{index}.embedding.vector"),
+                value,
+            ));
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid emitting full embedding vectors into span attributes.

output.value plus per-vector properties can make spans extremely large for high-dimension/batched embeddings, which risks dropped traces and high telemetry overhead. Prefer compact metadata (counts/dimensions) instead.

💡 Suggested change (compact span payload)

- if let Some(value) = serde_json::to_string(response).ok() { - properties.push(("output.value".into(), value)); - } - - for (index, data) in response.data.iter().enumerate() { - if let Ok(value) = serde_json::to_string(&data.embedding) { - properties.push(( - format!("embedding.embeddings.{index}.embedding.vector"), - value, - )); - } - } + properties.push(( + "gen_ai.embeddings.count".into(), + response.data.len().to_string(), + ));

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if let Some(value) = serde_json::to_string(response).ok() {

properties.push(("output.value".into(), value));

}

for (index, data) in response.data.iter().enumerate() {

if let Ok(value) = serde_json::to_string(&data.embedding) {

properties.push((

format!("embedding.embeddings.{index}.embedding.vector"),

value,

));

}

properties.push((

"gen_ai.embeddings.count".into(),

response.data.len().to_string(),

));

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/embeddings/span_attributes.rs` around lines 89 - 99, The code currently serializes and pushes full embeddings via the blocks that call serde_json::to_string(response) and serde_json::to_string(&data.embedding) inside the if-let and the for (index, data) in response.data.iter().enumerate() loop; replace those pushes with compact metadata only: emit output-level metadata such as total item count (e.g., "output.count") and a source/model id if available, and per-item metadata such as embedding length/dimension (e.g., "embedding.embeddings.{index}.embedding.length" or "embedding.{index}.dim") rather than the full vector; remove or guard against pushing "output.value" and the long "embedding.embeddings.{index}.embedding.vector" entries to avoid huge span attributes.

This is known; it currently conforms to the specification but lacks redaction support.
I will add redaction support in the future; please create an issue for this upcoming task.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash gh issue create \ --repo api7/aisix \ --title "feat(observability): add redaction support for embedding span attributes" \ --body "## Summary Currently, the embeddings tracing span emits full embedding vectors and serialized response payloads as span attributes (e.g., \`output.value\`, \`embedding.embeddings.{index}.embedding.vector\`). While this conforms to the OpenInference/OpenTelemetry specification, it can result in very large span payloads for high-dimension or batched embeddings, risking dropped traces and high telemetry overhead. Redaction support should be added to allow users to optionally suppress or truncate these large attributes (e.g., replace full vectors with compact metadata such as count and dimension). ## Context - Introduced in: `#84` - Comment: https://github.com/api7/aisix/pull/84#discussion_r3180010194 ## Acceptance Criteria - [ ] Provide a redaction/filtering mechanism for span attributes that contain large payloads (e.g., \`output.value\`, \`embedding.embeddings.{index}.embedding.vector\`) - [ ] Ensure the default behavior still conforms to the specification - [ ] Allow users/operators to opt in to compact metadata output (e.g., counts/dimensions only) /cc `@bzp2010`" \ --assignee bzp2010

Length of output: 40

@bzp2010, understood! I've created a follow-up issue to track the redaction support work: #85

✏️ Learnings added

Learnt from: bzp2010 Repo: api7/aisix PR: 84 File: src/proxy/handlers/embeddings/span_attributes.rs:89-99 Timestamp: 2026-05-04T07:56:26.382Z Learning: In api7/aisix, the embeddings tracing span intentionally emits full embedding vectors (`embedding.embeddings.{index}.embedding.vector`) and serialized response payloads (`output.value`) as span attributes to conform to the OpenInference/OpenTelemetry specification. Redaction support is planned as a future enhancement (tracked in https://github.com/api7/aisix/issues/85). Do not flag these large span attributes as a bug during code review.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: bzp2010 Repo: api7/aisix PR: 34 File: src/proxy/hooks2/authorization/mod.rs:1-1 Timestamp: 2026-04-10T09:35:05.745Z Learning: In Rust code using the `anyhow` crate, treat `use anyhow::Result;` as potentially used even if the code uses non-default error types like `Result<T, SomeOtherError>` (i.e., `Result<T, E>` where `E` is not `anyhow::Error`). Because `anyhow::Result` is an alias for `std::result::Result<T, E=anyhow::Error>`, the import is referenced by the `Result<..., ...>` syntax. Do not flag `use anyhow::Result;` as unused solely due to the presence of `Result<T, SomeOtherError>` usages.

Learnt from: AlinsRan Repo: api7/aisix PR: 46 File: src/lib.rs:122-126 Timestamp: 2026-04-17T09:32:54.079Z Learning: Avoid adding `#[fastrace::trace]` to top-level process-lifetime entrypoints in `src/lib.rs`—specifically `run()`, `run_with_config()`, and `run_with_provider()`. These functions cover the entire server lifetime (startup → shutdown), while Fastrace spans should be used for request-scoped boundaries (e.g., HTTP/RPC handler functions) rather than global process entrypoints.

Learnt from: bzp2010 Repo: api7/aisix PR: 80 File: src/main.rs:9-9 Timestamp: 2026-05-03T12:45:26.883Z Learning: In this repository (api7/aisix), `aisix::utils::instance::init()` is intentionally **not** guaranteed to be called by the library entrypoints (`src/lib.rs` via `run()`, `run_with_config()`, `run_with_provider()`). Treat it as the embedder/binary’s responsibility to call `instance::init()` before handling any requests. Also note that `instance_id()` and `run_id()` in `src/utils/instance/mod.rs` intentionally panic (via `.expect()`) when the `OnceLock`s are uninitialized—this is by design. During code review, do **not** flag missing `instance::init()` wiring inside the library startup path as a bug and do **not** suggest moving `instance::init()` into the library automatically.

bzp2010 added 2 commits May 4, 2026 15:26

feat(observability): add trace span to embed api

85f94df

chore: use opentelemetry-semantic-conventions

cb4b8ae

fix lint

2df03ac

coderabbitai Bot reviewed May 4, 2026

View reviewed changes

coderabbitai Bot mentioned this pull request May 4, 2026

feat(observability): add redaction support for embedding span attributes #85

Open

3 tasks

bzp2010 merged commit ee53cd6 into main May 4, 2026
3 checks passed

bzp2010 deleted the bzp/feat-o11y-embed-span branch May 4, 2026 07:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(observability): trace span for embeddings api#84

feat(observability): trace span for embeddings api#84
bzp2010 merged 3 commits intomainfrom
bzp/feat-o11y-embed-span

bzp2010 commented May 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 4, 2026 •

edited

Loading

Uh oh!

bzp2010 May 4, 2026

Uh oh!

coderabbitai Bot May 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bzp2010 commented May 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bzp2010 May 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bzp2010 commented May 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 4, 2026 •

edited

Loading

coderabbitai Bot May 4, 2026 •

edited

Loading