-
Notifications
You must be signed in to change notification settings - Fork 191
RUST-2138 Add optional support for OpenTelemetry #1495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The utility functions here just got moved out of tracing since they're needed for otel as well.
| let cmd_name = cmd.name.clone(); | ||
| let target_db = cmd.target_db.clone(); | ||
| #[cfg(feature = "opentelemetry")] | ||
| let cmd_attrs = crate::otel::CommandAttributes::new(&cmd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otel::CommandAttributes exists because the opentelemetry trace needs values from both the Command struct and the Message struct, and constructing the latter consumes the former so they can't just both be passed in directly.
| self.drop_span(); | ||
| } | ||
|
|
||
| pub(crate) fn drop_span(&mut self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm annoyed that this is needed but commit / abort are called before the relevant operation is executed and that op needs to be included in the span.
| self.transaction.start( | ||
| options, | ||
| #[cfg(feature = "opentelemetry")] | ||
| self.client.start_transaction_span(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to keep span construction internal to Transaction but it doesn't have access to the Client (specifically, the client's tracer) so it has to happen here and get passed in.
| #[cfg(feature = "opentelemetry")] | ||
| pub mod otel; | ||
| #[cfg(not(feature = "opentelemetry"))] | ||
| mod otel_stub; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the executor opentelemetry stuff (future method chaining mostly) would be really awkward and noisy to keep behind #[cfg(feature = "opentelemetry")]; the otel_stub module exists to sidestep that.
| #[derive_where(skip)] | ||
| #[builder( | ||
| setter( | ||
| fn transform<S, T, P>(provider: P) -> Option<Arc<dyn ObjectSafeTracerProvider + Send + Sync>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what necessitated the bump on typed_builder - older versions don't support setters with constraints.
|
|
||
| #[tokio::test(flavor = "multi_thread")] | ||
| async fn run_unified_operation() { | ||
| // TODO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These and the skip reasons are going to turn into a follow-up PR for the driver otel spec&tests.
| fn name(&self) -> &CStr; | ||
|
|
||
| #[cfg(feature = "opentelemetry")] | ||
| type Otel: crate::otel::OtelWitness<Op = Self>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Motivating problem: I didn't want to just splat the new data extraction methods needed for otel into Operation - that's already got a ton of methods, it would be very noisy with the compiler conditional on each one, and it would mean otel-specific updates would always be touching core operation code.
Ideally, those methods would (a) be in their own trait and (b) that trait would be a supertrait of Operation when otel is enabled. (a) is easy enough but (b) is ugly - it would mean either duplicating the definition of Operation based on whether the feature is enabled or not, or making a no-op OperationSuper trait that itself gets duplicated. Either way it's a lot of noise and wouldn't scale well if we end up needing other optional traits on Operation.
So, I took a trick from my old Haskell days and used a type witness. That lets us prove to the compiler that any particular &impl Operation value can be converted into a &impl OtelInfo value, and the casts themselves are no-ops so it all compiles away.
| fn otel(op: &Self::Op) -> &impl OtelInfo { | ||
| op | ||
| } | ||
| fn output_cursor_id(output: &<Self::Op as Operation>::O) -> Option<i64> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory this shouldn't be needed but the compiler would lose the associated type constraints somewhere, so this propagates them.
src/client/executor.rs
Outdated
| .with_context(ctx.clone()) | ||
| .await; | ||
| #[cfg(feature = "opentelemetry")] | ||
| self.record_error(&ctx, &result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this record a duplicate error if an error is also recorded from the call to record_command_result on line 673?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope - this will record an error on the operation span, line 673 is recording for the command span. I had to go back and look to confirm this, so I refactored the code to make the different types of spans in use have different concrete types to hopefully make this easier for future reading.
src/otel.rs
Outdated
| KeyValue::new("exception.message", error.to_string()), | ||
| KeyValue::new("exception.type", error.kind.name()), | ||
| #[cfg(test)] | ||
| KeyValue::new("exception.stacktrace", error.bt.to_string()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about adding a feature flag for this rather than keeping it behind cfg(test)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me; done.
src/otel.rs
Outdated
| #[builder(field_defaults(default, setter(into)))] | ||
| #[serde(rename_all = "camelCase")] | ||
| #[non_exhaustive] | ||
| pub struct Options { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
isabelatkinson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops, hit approve before looking at CI
592b76a to
dc984f2
Compare

RUST-2138
The bulk of the added code is in the new
crate::otelsubmodule; beyond that it's mostly some additional tracking in the executor and new fields for operations.