-
Notifications
You must be signed in to change notification settings - Fork 104
feat(ai): Infer gen_ai_operation_type from span.op #5129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cb4caba to
2ceb394
Compare
2ceb394 to
129ad7a
Compare
| - Add gen_ai_cost_total_tokens attribute and double write total tokens cost. ([#5121](https://github.com/getsentry/relay/pull/5121)) | ||
| - Change mapping of incoming OTLP spans with `ERROR` status to Sentry's `internal_error` status. ([#5127](https://github.com/getsentry/relay/pull/5127)) | ||
| - Add `ai_operation_type_map` to global config ([#5125](https://github.com/getsentry/relay/pull/5125)) | ||
| - Add `ai_operation_type_map` to global config. ([#5125](https://github.com/getsentry/relay/pull/5125)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added dot at the end of the change log entry
| use crate::normalize::AiOperationTypeMap; | ||
| use crate::normalize::request; | ||
| use crate::span::ai::enrich_ai_span_data; | ||
| use crate::span::ai::enrich_ai_event_data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a function rename - it was called enrich_ai_span_data but it was operating on an event that had a lot of spans.
Also a new function enrich_ai_span_data is now introduced that operates only on span, and it is called inside of enrich_ai_event_data to prevent code duplication
| pub ai_model_costs: Option<&'a ModelCosts>, | ||
|
|
||
| /// Configuration for mapping AI operation types from span.op to gen_ai.operation.type | ||
| pub ai_operation_type_map: Option<&'a AiOperationTypeMap>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like with ai_model_costs we don't have access to global config object in a code path where we would need it, so we add this map to NormalizationConfig which is accessible where we need it
129ad7a to
a5ea007
Compare
a5ea007 to
7c18e64
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file is a bit refactored:
extract_ai_datais no longer apubfunction - there is no need for itmap_ai_measurements_to_datais no longer apubfunction - there is no need for itis_ai_spanis no longer apubfunction - there is no need for it- to prevent logic duplication, we now have only 2
pubfunctionsenrich_ai_event_dataandenrich_ai_span_data(also called withinenrich_ai_event_data) and only those two unctions are publicly exposed
| if let Some(model_costs_config) = ai_model_costs { | ||
| extract_ai_data(span, model_costs_config); | ||
| } | ||
| enrich_ai_span_data(span, ai_model_costs, ai_operation_type_map); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a benefit of the refactor. next time we want to extend AI enrichment, we only need to change enrich_ai_span_data function, and not worry about updating all of the other places where it could be called
jjbayer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pressing request changes because of the .unwrap().
| let exact_key = Pattern::new(span_op).ok()?; | ||
| if self.operation_types.contains_key(&exact_key) { | ||
| return self.operation_types.get(&exact_key).map(String::as_str); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this part. Why would we set the type to the key of the hash map instead of the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value of the hashmap is String, so if possible, we do only a hash lookup with Pattern value of span_op, if not we will iterate over all of the keys to search for a fallback.
I don't really understand the question: "Why would we set the type to the key of the hash map instead of the value?"
If it helps and adds more context why it is done like this, we do the similar thing for model cost hash map:
| pub fn cost_per_token(&self, model_id: &str) -> Option<ModelCostV2> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this without creating a Pattern for the key if you add a Borrow implementation to Pattern:
impl Borrow<str> for Pattern {
fn borrow(&self) -> &str {
&self.pattern
}
}From the docs:
The key may be any borrowed form of the map’s key type, but Hash and Eq on the borrowed form must match those for the key type.
| let exact_key = Pattern::new(span_op).ok()?; | |
| if self.operation_types.contains_key(&exact_key) { | |
| return self.operation_types.get(&exact_key).map(String::as_str); | |
| } | |
| if let Some(value) = self.operation_types.get(span_op) { | |
| return Some(value.as_str()); | |
| } |
Same for cost_per_token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did it for get_operation_type, for cost_per_token i'll do it as a follow up because i need to change a bit more code (the reference will be returned instead the owned object), and it will have nicer commit associated with it
| return; | ||
| }; | ||
|
|
||
| if let Some(operation_type) = operation_type_map.get_operation_type(span.op.value().unwrap()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could panic because of the .unwrap().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, while i was debugging this, i just put unwrap to quickly iterate on it 😬
Removed it now
jjbayer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can optimize a little (see comment) but overall this looks good!
| let exact_key = Pattern::new(span_op).ok()?; | ||
| if self.operation_types.contains_key(&exact_key) { | ||
| return self.operation_types.get(&exact_key).map(String::as_str); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this without creating a Pattern for the key if you add a Borrow implementation to Pattern:
impl Borrow<str> for Pattern {
fn borrow(&self) -> &str {
&self.pattern
}
}From the docs:
The key may be any borrowed form of the map’s key type, but Hash and Eq on the borrowed form must match those for the key type.
| let exact_key = Pattern::new(span_op).ok()?; | |
| if self.operation_types.contains_key(&exact_key) { | |
| return self.operation_types.get(&exact_key).map(String::as_str); | |
| } | |
| if let Some(value) = self.operation_types.get(span_op) { | |
| return Some(value.as_str()); | |
| } |
Same for cost_per_token.
jjbayer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check comments before merging.
Based on the comment (#5129 (comment)) this PR improves the code for AI cost calculation.
Description
In relay, we want to infer the
gen_ai.operation.typeattribute (it's a newly added attribute) fromgen_ai.operation.name.This will be used in UI to easier query and plot data based on operation type. Currently we have a long list of attributes that we query for one operation type, but as that list grows, it is becoming harder and harder to pass all the parameters as query param in a request to EAP.
Implementation plan
PRs will be separated to make it easier to review. The mapping will be defined in sentry, and it will be propagated to relays as a part of global config to make it easier for us to change and extend the mappings.
(in relay):
ai_operation_type_mapto global config (feat(ai): Add ai_operation_type_map to global config #5125)(in sentry):
ai_operation_type_mapfrom defined list of attributesPart of TET-1092: Introduce
gen_ai.operation.typeattribute