feat(observability): add chat completion genai span attributes (#65)
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Expose primary-key reads for entity stores; add Model→Provider resolution and a ProvidersStore accessor; introduce Provider semantic metadata and provider-specific overrides; add extensive chat-completion span telemetry, a streaming output collector, and WithSpan propagation; change the provider instance API to accept provider entries; add tracing in middlewares.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ProxyHandler as Proxy
    participant Resources as ResourceRegistry
    participant ModelEntry as Model
    participant ProviderEntry as Provider
    participant Gateway
    participant Span as FastraceSpan
    Client->>ProxyHandler: HTTP chat-completion request
    ProxyHandler->>Resources: lookup model entry
    Resources-->>ProxyHandler: ResourceEntry<Model>
    ProxyHandler->>ModelEntry: model.provider(resources)
    ModelEntry-->>ProxyHandler: Option<ResourceEntry<Provider>>
    alt provider found
        ProxyHandler->>Span: create span (aisix.llm.chat_completion)
        ProxyHandler->>WithSpan: wrap gateway call with span
        ProxyHandler->>Gateway: invoke provider via provider entry (streaming)
        Gateway-->>ProxyHandler: stream chunks
        ProxyHandler->>ProxyHandler: record_chunk → StreamOutputCollector
        ProxyHandler-->>Span: apply request/response/chunk/usage properties
        ProxyHandler-->>Client: streamed SSE/response
        ProxyHandler->>Span: end span on EOS
    else provider missing
        ProxyHandler-->>Client: GatewayError::Internal (provider not found)
    end
```
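As a rough companion to the diagram, this is what creating and enriching the chat-completion span might look like with fastrace. The span name is taken from the diagram above; the specific GenAI attribute keys and the model value are illustrative assumptions, not the handler's actual property set.

```rust
// Illustrative only: mirrors the span lifecycle in the diagram, not the real
// handler code. Attribute keys follow the OpenTelemetry GenAI conventions;
// the model name is a made-up example value.
use fastrace::Span;
use fastrace::prelude::SpanContext;

fn record_chat_completion_span() {
    let mut span = Span::root("aisix.llm.chat_completion", SpanContext::random());
    span.add_properties(|| {
        [
            ("gen_ai.operation.name", "chat".to_string()),
            ("gen_ai.request.model", "example-model".to_string()),
        ]
    });
    // Per-chunk and usage properties would be appended while the stream is
    // consumed; dropping the span at end-of-stream is what ends it.
    drop(span);
}
```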
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/proxy/handlers/chat_completions/span_attributes/tests.rs (1)
1-252: 🛠️ Refactor suggestion | 🟠 Major

Switch this new test module to `pretty_assertions::assert_eq`.

Please import `pretty_assertions::assert_eq` so all equality checks in this new file follow the repo's test-output standard.

Suggested patch

```diff
 use serde_json::{Value, json};
+use pretty_assertions::assert_eq;
```

As per coding guidelines: `{tests,src}/**/*.rs`: "Use pretty_assertions::assert_eq and assert_matches::assert_matches for better test output".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/span_attributes/tests.rs` around lines 1 - 252, The tests use standard assert_eq which must be replaced by the repo standard pretty_assertions; import the macro by adding use pretty_assertions::assert_eq; at the top of this test module (near the existing use serde_json::{Value, json};) so all equality checks in request_span_properties_include_message_tool_and_user_attributes, response_span_properties_include_output_messages_and_reasoning_tokens, and stream_output_collector_accumulates_chunks_into_output_messages use pretty_assertions::assert_eq for improved test diffs.
🧹 Nitpick comments (5)
src/gateway/traits/chat_format.rs (1)
29-30: Use `#[allow(dead_code)]` instead of `#[allow(unused)]` on the trait method.

`#[allow(unused)]` is overly broad and can hide unrelated warnings. The `dead_code` lint is the appropriate suppression for an unused associated function declaration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gateway/traits/chat_format.rs` around lines 29 - 30, The trait method declaration for fn name() -> &'static str currently uses #[allow(unused)]; change this to #[allow(dead_code)] to narrowly suppress the unused-associated-function lint. Locate the associated function declaration named name in the trait in chat_format.rs and replace the attribute so the suppression targets dead_code rather than the broader unused lint.

src/proxy/middlewares/trace.rs (1)
53-66: Consider whether error handling triggers premature span closure.

When `Poll::Ready(Some(Err(_)))` occurs, `on_eos()` is called, which takes the span. However, this may not be the actual end of the stream from the framework's perspective; the error is returned to the caller. If there is any subsequent polling (unlikely, but possible depending on framework behavior), the span will already have been consumed.

This is likely fine in practice, since an error typically terminates the stream, but it is worth noting for robustness.
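For illustration, a non-consuming error hook could look roughly like the sketch below. `TracedStream` and `on_error` are hypothetical stand-ins for the middleware's actual body wrapper, shown here over a plain `futures::Stream` rather than the real HTTP body type.

```rust
// Sketch only: the real middleware wraps an HTTP response body, not this toy
// Stream, and `on_error` is a proposed new method rather than existing code.
use std::pin::Pin;
use std::task::{Context, Poll};

use fastrace::Span;
use futures::Stream;

struct TracedStream<S> {
    inner: Pin<Box<S>>,
    span: Option<Span>,
}

impl<S> TracedStream<S> {
    fn on_error(&mut self, err: &std::io::Error) {
        // Record the failure without consuming the span, so later polls (or the
        // final end-of-stream) can still close it exactly once.
        if let Some(span) = self.span.as_mut() {
            span.add_property(|| ("error.type", err.to_string()));
        }
    }

    fn on_eos(&mut self) {
        // Taking and dropping the span is what actually ends it.
        drop(self.span.take());
    }
}

impl<S, T> Stream for TracedStream<S>
where
    S: Stream<Item = Result<T, std::io::Error>>,
{
    type Item = Result<T, std::io::Error>;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let this = self.get_mut();
        match this.inner.as_mut().poll_next(cx) {
            Poll::Ready(Some(Err(err))) => {
                this.on_error(&err); // record, do not close
                Poll::Ready(Some(Err(err)))
            }
            Poll::Ready(None) => {
                this.on_eos(); // close exactly once at end of stream
                Poll::Ready(None)
            }
            other => other,
        }
    }
}
```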
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/middlewares/trace.rs` around lines 53 - 66, The code currently calls self.on_eos() for Poll::Ready(Some(Err(_))) which may prematurely consume the tracing span; change the error branch to avoid closing the span on a single Err result—either call a new non-consuming handler like self.on_error(err) (create on_error in the same impl) or simply log the error without calling on_eos, and only call self.on_eos() on Poll::Ready(None); update the match arm for Poll::Ready(Some(Err(e))) to forward the error to the caller or to on_error(e) instead of on_eos(), keeping on_body_chunk and on_eos behavior unchanged.

src/proxy/handlers/chat_completions/mod.rs (1)
159-179: Consider extracting stream state into a named struct for readability.

The 8-tuple state is functionally correct but harder to maintain. A named struct would improve clarity.
♻️ Optional: Extract to named struct
```rust
struct StreamState {
    stream: ChatResponseStream<OpenAIChatFormat>,
    span: Span,
    chunk_index: usize,
    request_ctx: RequestContext,
    done: bool,
    saw_chunk: bool,
    usage_rx: Option<oneshot::Receiver<Usage>>,
    output_collector: StreamOutputCollector,
}
```

This would make the unfold closure parameters and return values more self-documenting.
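As a generic illustration of the pattern (not the handler's real types), a named state struct keeps `futures::stream::unfold` readable compared to a positional tuple:

```rust
// Toy example: CountState stands in for the handler's StreamState; the point
// is only that named fields document the unfold state better than a tuple.
use futures::StreamExt;

struct CountState {
    remaining: usize,
    emitted: usize,
}

#[tokio::main]
async fn main() {
    let state = CountState { remaining: 3, emitted: 0 };
    let numbers = futures::stream::unfold(state, |mut st| async move {
        if st.remaining == 0 {
            return None;
        }
        st.remaining -= 1;
        st.emitted += 1;
        Some((st.emitted, st))
    });
    let collected: Vec<usize> = numbers.collect().await;
    assert_eq!(collected, vec![1, 2, 3]);
}
```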
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/mod.rs` around lines 159 - 179, The closure passed to futures::stream::unfold that builds sse_stream currently uses an 8-tuple state which is hard to read; create a named struct (e.g., StreamState) containing the fields stream (ChatResponseStream<OpenAIChatFormat>), span (Span), chunk_index (usize), request_ctx (RequestContext), done (bool), saw_chunk (bool), usage_rx (Option<oneshot::Receiver<Usage>>), and output_collector (StreamOutputCollector), replace the tuple in the unfold call and in the closure signature with this struct, and adjust all usages inside the closure and the initial state construction to use the struct fields for clarity.

src/proxy/handlers/chat_completions/span_attributes/telemetry.rs (1)
1-15: Import ordering checked against the coding guidelines.

```rust
use fastrace::prelude::Span;
use reqwest::Url;
use serde_json::{Map, Value};
```

The current order is already alphabetical within external crates (fastrace < reqwest < serde_json), so this is correct and no change is needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/span_attributes/telemetry.rs` around lines 1 - 15, The import ordering for the external crates is already correct (fastrace, reqwest, serde_json); leave the use statements as-is and ensure they remain alphabetically sorted (fastrace, reqwest, serde_json) in telemetry.rs so no change is required to the current imports.

src/proxy/handlers/chat_completions/span_attributes/message_attributes.rs (1)
225-235: Minor: Unnecessary intermediate vector allocation.

The intermediate `Vec` (lines 230-233) clones each message unnecessarily. Consider passing an iterator or borrowing directly.

♻️ Optional: Avoid intermediate allocation
```diff
 pub(super) fn append_openinference_output_message_properties(
     properties: &mut Vec<(String, String)>,
     prefix: &str,
     messages: &[OutputMessageView],
 ) {
-    let message_views: Vec<_> = messages
-        .iter()
-        .map(|message| message.message.clone())
-        .collect();
-    append_openinference_message_properties(properties, prefix, &message_views);
+    for (message_index, message) in messages.iter().enumerate() {
+        append_single_message_properties(properties, prefix, message_index, &message.message);
+    }
 }
```

Alternatively, refactor `append_openinference_message_properties` to accept `impl Iterator<Item = &MessageView>`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/span_attributes/message_attributes.rs` around lines 225 - 235, The helper append_openinference_output_message_properties currently allocates and clones each OutputMessageView.message into a temporary Vec, which is unnecessary; instead avoid cloning by passing borrowed MessageView references to append_openinference_message_properties — either change append_openinference_message_properties to accept impl Iterator<Item=&MessageView> (preferred) and forward messages.iter().map(|m| &m.message) directly, or build a Vec<&MessageView> (messages.iter().map(|m| &m.message).collect::<Vec<_>>()) and pass that slice to the existing function; update references to OutputMessageView and MessageView accordingly.
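A self-contained sketch of the iterator-accepting variant preferred above; `Message` and `OutputMessage` here are simplified stand-ins for the crate's `MessageView` and `OutputMessageView`, and the property keys are illustrative.

```rust
// Simplified stand-ins for the crate's view types; only the borrowing pattern
// (no temporary Vec of cloned messages) is the point here.
struct Message {
    role: String,
    content: String,
}

struct OutputMessage {
    message: Message,
}

fn append_message_properties<'a>(
    properties: &mut Vec<(String, String)>,
    prefix: &str,
    messages: impl IntoIterator<Item = &'a Message>,
) {
    for (index, message) in messages.into_iter().enumerate() {
        properties.push((format!("{prefix}.{index}.message.role"), message.role.clone()));
        properties.push((format!("{prefix}.{index}.message.content"), message.content.clone()));
    }
}

fn append_output_message_properties(
    properties: &mut Vec<(String, String)>,
    prefix: &str,
    messages: &[OutputMessage],
) {
    // Borrow the inner messages instead of cloning them into a temporary Vec.
    append_message_properties(properties, prefix, messages.iter().map(|m| &m.message));
}

fn main() {
    let outputs = vec![OutputMessage {
        message: Message { role: "assistant".into(), content: "hi".into() },
    }];
    let mut props = Vec::new();
    append_output_message_properties(&mut props, "llm.output_messages", &outputs);
    assert_eq!(props.len(), 2);
}
```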
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config/entities/models.rs`:
- Around line 38-42: The public helper Model::provider should have a doc comment
and tracing; add a one-line doc comment (/// Returns the provider resource for
this model, if present.) above the method and annotate the function with
#[fastrace::trace], keeping the signature fn provider(&self, resources:
&ResourceRegistry) -> Option<ResourceEntry<Provider>> and the existing body that
calls resources.providers.get_by_id(&self.provider_id).
In `@src/gateway/traits/provider.rs`:
- Around line 324-327: The new test added uses the standard assert_eq; replace
it with pretty_assertions::assert_eq by adding an import for
pretty_assertions::assert_eq at the top of the test module and update the
assertion calls in the new test path (the block that inspects ProviderMeta /
ProviderSemanticConventions / StreamReaderKind and uses
ChatTransform/EmbedTransform/CompatQuirks) to call pretty_assertions::assert_eq
instead of the std macro so test failures show improved diffs.
- Around line 48-53: Add doc comments (///) to the public struct
ProviderSemanticConventions and to the public method
ProviderMeta::semantic_conventions() describing their purpose and usage; include
short field-level descriptions for gen_ai_provider_name, llm_system, and
llm_provider on ProviderSemanticConventions, and document what
ProviderMeta::semantic_conventions() returns and when it may be None or Some, so
the public API is properly documented per Rust doc guidelines.
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Around line 99-102: The code incorrectly sets an error attribute on a
successful stream branch: when matching Ok(Ok(ChatResponse::Stream { stream,
usage_rx })), remove the LocalSpan::add_property(|| ("error.type", "no_stream"))
call so successful handling via handle_stream_request(stream, usage_rx, &mut
request_ctx, span).await does not record an error; keep any span handling but
delete that erroneous error.type assignment in the ChatResponse::Stream branch.
In `@src/proxy/provider.rs`:
- Around line 14-17: Add the missing #[fastrace::trace] attribute to the public
function create_provider_instance and update its doc comment to reference
"provider" rather than "model"; locate the function declaration for
create_provider_instance and prepend the #[fastrace::trace] decorator, and edit
the preceding /// doc comment text to replace any occurrences of "model" with
"provider" so docs and tracing follow project guidelines.
In `@src/utils/future.rs`:
- Around line 28-39: The poll implementation can panic because it calls unwrap()
on this.span which is an Option; update the WithSpan design to guarantee a Span
is present: either (preferred) change the field type from Option<Span> to Span
(update the WithSpan struct, all constructors/usages, and remove Option handling
in impl Future for WithSpan so poll uses this.span.set_local_parent() and
returns the owned Span directly), or if Option must remain, add a constructor
new(inner, span: Span) that stores Some(span) and replace unwrap() in poll with
a safe handling (e.g., expect with a clear message or return Poll::Ready((val,
this.span.take().expect("WithSpan must be constructed with Some(span)")))) and
document the invariant so external code cannot create WithSpan with span ==
None; adjust any call sites referencing WithSpan::span accordingly.
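A minimal sketch of the second option in the item above (keeping `Option<Span>` but enforcing the invariant through a `new(inner, span)` constructor), assuming fastrace's `Span` and the pin-project-lite crate; the real `src/utils/future.rs` may differ.

```rust
// Sketch only: names mirror the agent prompt above, not the actual file.
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

use fastrace::Span;
use pin_project_lite::pin_project;

pin_project! {
    // Runs the inner future with `span` as the local parent and yields the
    // span back alongside the output so the caller can keep enriching it.
    pub struct WithSpan<F> {
        #[pin]
        inner: F,
        // Invariant: always `Some` until completion; `new` is the only
        // constructor, so callers can never observe `None`.
        span: Option<Span>,
    }
}

impl<F> WithSpan<F> {
    pub fn new(inner: F, span: Span) -> Self {
        Self { inner, span: Some(span) }
    }
}

impl<F: Future> Future for WithSpan<F> {
    type Output = (F::Output, Span);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.project();
        let _guard = this
            .span
            .as_ref()
            .expect("WithSpan must be constructed via WithSpan::new")
            .set_local_parent();
        match this.inner.poll(cx) {
            Poll::Ready(value) => Poll::Ready((
                value,
                this.span.take().expect("WithSpan polled after completion"),
            )),
            Poll::Pending => Poll::Pending,
        }
    }
}
```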
---
Outside diff comments:
In `@src/proxy/handlers/chat_completions/span_attributes/tests.rs`:
- Around line 1-252: The tests use standard assert_eq which must be replaced by
the repo standard pretty_assertions; import the macro by adding use
pretty_assertions::assert_eq; at the top of this test module (near the existing
use serde_json::{Value, json};) so all equality checks in
request_span_properties_include_message_tool_and_user_attributes,
response_span_properties_include_output_messages_and_reasoning_tokens, and
stream_output_collector_accumulates_chunks_into_output_messages use
pretty_assertions::assert_eq for improved test diffs.
---
Nitpick comments:
In `@src/gateway/traits/chat_format.rs`:
- Around line 29-30: The trait method declaration for fn name() -> &'static str
currently uses #[allow(unused)]; change this to #[allow(dead_code)] to narrowly
suppress the unused-associated-function lint. Locate the associated function
declaration named name in the trait in chat_format.rs and replace the attribute
so the suppression targets dead_code rather than the broader unused lint.
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Around line 159-179: The closure passed to futures::stream::unfold that builds
sse_stream currently uses an 8-tuple state which is hard to read; create a named
struct (e.g., StreamState) containing the fields stream
(ChatResponseStream<OpenAIChatFormat>), span (Span), chunk_index (usize),
request_ctx (RequestContext), done (bool), saw_chunk (bool), usage_rx
(Option<oneshot::Receiver<Usage>>), and output_collector
(StreamOutputCollector), replace the tuple in the unfold call and in the closure
signature with this struct, and adjust all usages inside the closure and the
initial state construction to use the struct fields for clarity.
In `@src/proxy/handlers/chat_completions/span_attributes/message_attributes.rs`:
- Around line 225-235: The helper append_openinference_output_message_properties
currently allocates and clones each OutputMessageView.message into a temporary
Vec, which is unnecessary; instead avoid cloning by passing borrowed MessageView
references to append_openinference_message_properties — either change
append_openinference_message_properties to accept impl
Iterator<Item=&MessageView> (preferred) and forward messages.iter().map(|m|
&m.message) directly, or build a Vec<&MessageView> (messages.iter().map(|m|
&m.message).collect::<Vec<_>>()) and pass that slice to the existing function;
update references to OutputMessageView and MessageView accordingly.
In `@src/proxy/handlers/chat_completions/span_attributes/telemetry.rs`:
- Around line 1-15: The import ordering for the external crates is already
correct (fastrace, reqwest, serde_json); leave the use statements as-is and
ensure they remain alphabetically sorted (fastrace, reqwest, serde_json) in
telemetry.rs so no change is required to the current imports.
In `@src/proxy/middlewares/trace.rs`:
- Around line 53-66: The code currently calls self.on_eos() for
Poll::Ready(Some(Err(_))) which may prematurely consume the tracing span; change
the error branch to avoid closing the span on a single Err result—either call a
new non-consuming handler like self.on_error(err) (create on_error in the same
impl) or simply log the error without calling on_eos, and only call
self.on_eos() on Poll::Ready(None); update the match arm for
Poll::Ready(Some(Err(e))) to forward the error to the caller or to on_error(e)
instead of on_eos(), keeping on_body_chunk and on_eos behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 168b96de-aab2-4d8c-b21b-5f45bf9ebea8
📒 Files selected for processing (23)
src/config/entities/mod.rs
src/config/entities/models.rs
src/config/entities/providers.rs
src/gateway/providers/azure.rs
src/gateway/providers/bedrock.rs
src/gateway/providers/gemini.rs
src/gateway/traits/chat_format.rs
src/gateway/traits/mod.rs
src/gateway/traits/provider.rs
src/proxy/handlers/chat_completions/mod.rs
src/proxy/handlers/chat_completions/span_attributes/message_attributes.rs
src/proxy/handlers/chat_completions/span_attributes/mod.rs
src/proxy/handlers/chat_completions/span_attributes/stream_output.rs
src/proxy/handlers/chat_completions/span_attributes/telemetry.rs
src/proxy/handlers/chat_completions/span_attributes/tests.rs
src/proxy/handlers/embeddings/mod.rs
src/proxy/handlers/messages/mod.rs
src/proxy/hooks/authorization/mod.rs
src/proxy/middlewares/auth.rs
src/proxy/middlewares/trace.rs
src/proxy/provider.rs
src/utils/future.rs
src/utils/observability/mod.rs
🧹 Nitpick comments (1)
src/proxy/handlers/chat_completions/mod.rs (1)
158-178: Consider extracting the 8-tuple state into a named struct.

The unfold state has grown to an 8-element tuple, making it harder to reason about which element is which. A named struct would improve readability and reduce the risk of accidental reordering.
♻️ Example struct definition
```rust
struct StreamUnfoldState {
    stream: ChatResponseStream<OpenAIChatFormat>,
    span: Span,
    chunk_idx: usize,
    request_ctx: RequestContext,
    done: bool,
    saw_chunk: bool,
    usage_rx: Option<oneshot::Receiver<Usage>>,
    output_collector: StreamOutputCollector,
}
```

This would make the state transitions clearer and reduce the cognitive load when reading the destructuring patterns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/mod.rs` around lines 158 - 178, The 8-tuple state passed to futures::stream::unfold is hard to read; extract it into a named struct (e.g., StreamUnfoldState) that contains fields for stream (ChatResponseStream<OpenAIChatFormat>), span (Span), chunk_idx/idx (usize), request_ctx/stream_request_ctx (RequestContext), done (bool), saw_chunk (bool), usage_rx (Option<oneshot::Receiver<Usage>>), and output_collector (StreamOutputCollector), then update the unfold call to initialize that struct and adjust the async closure to destructure/consume StreamUnfoldState instead of the tuple so all state access is clear and ordering mistakes are avoided.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Around line 158-178: The 8-tuple state passed to futures::stream::unfold is
hard to read; extract it into a named struct (e.g., StreamUnfoldState) that
contains fields for stream (ChatResponseStream<OpenAIChatFormat>), span (Span),
chunk_idx/idx (usize), request_ctx/stream_request_ctx (RequestContext), done
(bool), saw_chunk (bool), usage_rx (Option<oneshot::Receiver<Usage>>), and
output_collector (StreamOutputCollector), then update the unfold call to
initialize that struct and adjust the async closure to destructure/consume
StreamUnfoldState instead of the tuple so all state access is clear and ordering
mistakes are avoided.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ae11bbc8-b350-4d63-a3a2-b4c929475e3a
📒 Files selected for processing (4)
src/config/entities/models.rs
src/gateway/traits/provider.rs
src/proxy/handlers/chat_completions/mod.rs
src/proxy/provider.rs
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/proxy/handlers/chat_completions/mod.rs (1)
43-47: Public handler is missing required trace decorator and doc comment.

Please add `#[fastrace::trace]` and a `///` doc comment on `chat_completions`.

📝 Suggested minimal change
```diff
+/// Handle OpenAI-compatible chat completion requests and emit request-scoped telemetry.
+#[fastrace::trace]
 pub async fn chat_completions(
```

As per coding guidelines:
`src/**/*.rs`: Use `#[fastrace::trace]` decorator for distributed tracing on public functions; and `**/*.rs`: Use `///` for doc comments on public items in Rust.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/proxy/handlers/chat_completions/mod.rs` around lines 43 - 47, Add a doc comment and the distributed-tracing attribute to the public handler function chat_completions: prepend a triple-slash doc comment (///) describing the handler's purpose and then add the #[fastrace::trace] attribute above the fn signature for chat_completions (the function taking State(state): State<AppState>, mut request_ctx: RequestContext, Json(mut request_data): Json<ChatCompletionRequest>) and returning Result<Response, ChatCompletionError> so it conforms to the project's tracing and documentation guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Around line 231-239: The code currently calls span.add_properties(||
output_collector.output_message_span_properties()) which exports full streamed
messages/tool-call arguments to trace attributes; change this so
span.add_properties is only called when a config flag (e.g.,
config.export_streamed_output or runtime_settings.export_streamed_output) is
true, and before attaching ensure you sanitize the data by calling a
redaction/truncation helper (e.g.,
output_collector.redact_and_truncate_message_properties() or a new method that
returns sanitized span properties) instead of output_message_span_properties;
apply the same gated/sanitized logic in both places where span.add_properties is
used so raw sensitive content is never exported by default.
- Around line 95-108: The match arms in chat_completions should attach terminal
attributes directly to the resolved `span` instead of calling LocalSpan::add_*
so change the Complete/Gateway-error/Timeout branches to call
span.add_properties(...) and span.add_property(...) (use the existing `span`
variable) and convert the Err(Err) timeout mapping to use span before returning;
also add #[fastrace::trace] and the required /// doc comment to the public
chat_completions handler and add #[fastrace::trace] to the internal
handle_regular_request and handle_stream_request functions so traces are
created; finally stop recording raw model outputs into spans by removing or
redacting calls to output_collector.output_message_span_properties() (lines that
add full text/tool calls/args) and replace them with non-sensitive
summary/redacted properties instead.
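To make the last two items concrete, a gating-and-sanitizing layer might look roughly like this; `TelemetrySettings`, `export_streamed_output`, and `redacted_output_properties` are hypothetical names, not existing configuration or APIs in the crate.

```rust
// Hypothetical sketch: names and the truncation policy are assumptions; only
// span.add_properties is the real fastrace call.
use fastrace::Span;

struct TelemetrySettings {
    // Off by default so raw streamed content is never exported unless opted in.
    export_streamed_output: bool,
    max_attribute_chars: usize,
}

fn redacted_output_properties(
    raw: Vec<(String, String)>,
    max_chars: usize,
) -> Vec<(String, String)> {
    raw.into_iter()
        .map(|(key, value)| {
            if value.chars().count() > max_chars {
                let truncated: String = value.chars().take(max_chars).collect();
                (key, format!("{truncated}…[truncated]"))
            } else {
                (key, value)
            }
        })
        .collect()
}

fn attach_output_properties(
    span: &mut Span,
    settings: &TelemetrySettings,
    raw: Vec<(String, String)>,
) {
    if !settings.export_streamed_output {
        return; // never export raw streamed content by default
    }
    let sanitized = redacted_output_properties(raw, settings.max_attribute_chars);
    span.add_properties(|| sanitized);
}
```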
---
Nitpick comments:
In `@src/proxy/handlers/chat_completions/mod.rs`:
- Around line 43-47: Add a doc comment and the distributed-tracing attribute to
the public handler function chat_completions: prepend a triple-slash doc comment
(///) describing the handler's purpose and then add the #[fastrace::trace]
attribute above the fn signature for chat_completions (the function taking
State(state): State<AppState>, mut request_ctx: RequestContext, Json(mut
request_data): Json<ChatCompletionRequest>) and returning Result<Response,
ChatCompletionError> so it conforms to the project's tracing and documentation
guidelines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 591b0a83-db58-419b-9046-a287640c580c
📒 Files selected for processing (1)
src/proxy/handlers/chat_completions/mod.rs
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests