Skip to content

wrap timed streams and iterators in tracing::Spans; follow Opentelemetry conventions#3313

Closed
zvkemp wants to merge 2 commits into
transact-rs:mainfrom
zvkemp:logging-conventions
Closed

wrap timed streams and iterators in tracing::Spans; follow Opentelemetry conventions#3313
zvkemp wants to merge 2 commits into
transact-rs:mainfrom
zvkemp:logging-conventions

Conversation

@zvkemp
Copy link
Copy Markdown

@zvkemp zvkemp commented Jun 26, 2024

Does your PR solve an issue?

When exporting to instrumentation collectors, timed spans are generally more desirable than discrete events annotated with timing data. This PR:

  • wraps the existing logger and pool::acquire tracing events in tracing spans. Common metadata is moved to the span; generally the tracing subscriber can (and does by default in fmt::subscriber) write span-level metadata when events are logged, so the current output should be the same. The span level is set to the max level of the event log configuration, where appropriate.
  • Adds an associated NAME_LOWERCASE constant to Database, which follows the semantic conventions defined by Opentelemetry (https://opentelemetry.io/docs/specs/semconv/attributes-registry/db/). This value is attached to the span as db.system.

@zvkemp zvkemp force-pushed the logging-conventions branch from d4738ae to 6f96abe Compare July 17, 2024 13:14
Comment thread sqlx-core/src/logger.rs
// Human-friendly - includes units (usually ms). Also kept for backward compatibility
?elapsed,
// Search friendly - numeric
elapsed_secs = elapsed.as_secs_f64(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing elapsed and elapsed_secs is a breaking change to the instrumentation and reporting that others have built on top of existing sqlx logging.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good ole' Hyrum's Law https://www.hyrumslaw.com/

Strictly speaking, we never made the exact format of log messages part of our API contract, but I'd also be pretty annoyed if this changed out from under me.

It'd also make me a hypocrite, given how the query macros came about.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was hoping this would be rolled into the breaking-changes-allowed 8.x release, as it's a little superfluous to track timings in addition to the tracing span's own timing, but I certainly see how it's useful. Restored and rebased.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

A standards based approach would certainly be better in the long run.

I have my own logging breaking change pr open - but its much smaller in scope and at @abonander's suggestion we tagged a few users who were interested to see if that break will affect them, as an alternative to maintaining two outputs.

The break that was in this PR (removing the elapses) would affect people.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was hoping this would be rolled into the breaking-changes-allowed 8.x release

Yeah, to be perfectly honest, this PR required more thinking than I wanted to do while I was trying to wrap up 0.8.0.

At a certain point, I have to stop myself from looking at new PRs, or else I'll sit there forever trying to get every one of them merged before I kick out a release.

If we're happy calling this a breaking change, we can plan to land this for 0.9.0.

Copy link
Copy Markdown
Author

@zvkemp zvkemp Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a certain point, I have to stop myself from looking at new PRs, or else I'll sit there forever trying to get every one of them merged before I kick out a release.

Completely understand; hope you didn't take my comment as a criticism. Currently re-working this to not be a breaking change; the tracing span should be opt-in if it lands in 0.8, and whether or not that would change in 0.9 is an open question.

@zvkemp zvkemp force-pushed the logging-conventions branch 2 times, most recently from e632167 to a5a0a89 Compare September 6, 2024 14:58
Comment thread sqlx-core/src/logger.rs Outdated
// Human-friendly - includes units (usually ms). Also kept for backward compatibility
?elapsed,
// Search friendly - numeric
elapsed_secs = elapsed.as_secs_f64(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation looks like it might trip up the format linting, and in any case for git history it would be nice to avoid reordering the fields.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation addressed (sorry, the formatter is still not good at catching things in macro invocations). Can you clarify the issue with field ordering? The shared fields were moved into the span, but in the same order

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the fields need to also be kept in the existing logging after being added to the new span.

Moving them from the individual log statement to the span changes the json structure that loggers like tracing-subscriber with the json feature will output - with that crate the items move from .fields to .span - which is good for standardization, but breaks existing consumers. We need both to avoid a breaking change.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, there doesn't currently seem to be a way to configure the json formatter to flatten the span fields into the event output (though maybe there should be?).

Would it be acceptable to gate some of this behavior behind a feature flag? It sounds like the span isn't really necessary for the slow query logger, and the event isn't really necessary for opentelemetry tracing. It would also be nice for those two things to work well together with a minimum of duplication.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slow query logger can already be configured to never trigger.

For example, if span tracking was added, but no existing logging was diminished or removed, a crate user could set slow_statements_level to LevelFilter::Off and rely exclusively on metrics reported on the span for their monitoring and alerting.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that's true. I'm reusing this config to control span visibility as well, so I'll rework how that happens.

@zvkemp zvkemp force-pushed the logging-conventions branch from a5a0a89 to 83a4f71 Compare September 6, 2024 15:45
@zvkemp zvkemp force-pushed the logging-conventions branch from 83a4f71 to 21967bd Compare September 6, 2024 15:47
Comment thread sqlx-core/src/logger.rs Outdated

if was_slow {
self.span.record("summary", summary);
self.span.record("db.statement", sql);
Copy link
Copy Markdown
Contributor

@iamjpotts iamjpotts Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something else to keep in mind, and may need adjustment here, is that logging the query isn't free.

Its not uncommon for select statements to exceed 25KB or 50KB or more, and 100% logging of those will have a materially negative impact on both the running application outputting the logs, and the log observer attempting to capture them.

This is why the current implementation only logs the query when the query is slow.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm taking that into account. The revised implementation will allow the span to be leveled individually, and the query will only be recorded if any of the events or spans are within the configured level.

@zvkemp
Copy link
Copy Markdown
Author

zvkemp commented Sep 6, 2024

I've pushed a new commit to refactor these changes; notably it allows setting a span level independently from the log level (defaults to Off, which creates a no-op span). This complicates the level checking a bit but should avoid doing unnecessary work or recording large fields when they aren't explicitly required by the configuration. Note that I haven't had a chance to validate these changes yet, but should be able to test it out soon.

@iamjpotts
Copy link
Copy Markdown
Contributor

Great.

I'm interested in local validation also.

I think eventually my first iteration of the logging could be pared back in a 0.9 version of the crate. Spans are better conceptually and for standards, but we need to consider some of the conditional logging scenarios, and for now, some backwards compatibility.

I may get to it this weekend. Cheers.

@sj-robert
Copy link
Copy Markdown

Is this the only thing blocking 0.9.0? Because I am desperate for that release 😅 anything I can do to help?

@abonander
Copy link
Copy Markdown
Collaborator

I'm so sorry for letting this languish as long as I did, but unfortunately I don't think this is mergeable at this point:

  • Numerous merge conflicts.
  • The added span fields don't actually match the semantic conventions: https://opentelemetry.io/docs/specs/semconv/db/database-spans/
    • db.system -> db.system.name
    • db.statement -> db.query.text
    • These could have simply been changed since this PR was opened but either way at this point they're outdated.
  • tracing::enabled!() is fundamentally broken and we need to stop using it: Nested event generation bypasses level filtering tokio-rs/tracing#2448 (comment)
  • I think we need to phase out dynamic event and span levels entirely because it isn't really idiomatic to tracing. Libraries are meant to hardcode the verbosity level and then the user is supposed to add filters for what they do and don't want to see in traces.

@abonander abonander closed this Apr 14, 2026
Diggsey added a commit to platformed-com/sqlx that referenced this pull request May 21, 2026
QueryLogger now opens a hardcoded INFO-level `db.query` span in `new()`
that closes on drop, with field names following the OTel database span
semantic conventions (`db.system.name`, `db.query.text`,
`db.response.returned_rows`, `db.response.affected_rows`,
`otel.kind = "client"`). The existing close-time event is emitted inside
the span via `Span::in_scope` so callers get both — the span for OTel
correlation, the event for the existing `rows_affected`/`elapsed_secs`
fields under whatever level `LogSettings` configures.

`QueryLogger::new` keeps its old signature; drivers attach the system
name via the additive `with_db_system_name` builder, which records the
`db.system.name` attribute on the already-open span.

QueryLogger stores `Span`, not `EnteredSpan`, so it stays `Send` across
the postgres/mysql `try_stream!` paths that broke transact-rs#3176. A small
`InstrumentedStream` wrapper enters the span on each `poll_next` and
drops the guard before returning, so no guard is held across an await.
SQLite's synchronous `Iterator` path just relies on the close-event
in-scope emission.

Design constraints come from abonander's 2026-04-14 review on transact-rs#3313:
current OTel semconv field names, no `tracing::enabled!()` (broken per
tokio-rs/tracing#2448), hardcoded verbosity rather than runtime-
configurable span levels.
Diggsey added a commit to platformed-com/sqlx that referenced this pull request May 21, 2026
QueryLogger now opens a hardcoded INFO-level `db.query` span in `new()`
that closes on drop, with field names following the OTel database span
semantic conventions (`db.system.name`, `db.query.text`,
`db.response.returned_rows`, `db.response.affected_rows`,
`otel.kind = "client"`). The existing close-time event is emitted inside
the span via `Span::in_scope` so callers get both — the span for OTel
correlation, the event for the existing `rows_affected`/`elapsed_secs`
fields under whatever level `LogSettings` configures.

`QueryLogger::new` keeps its old signature; drivers attach the system
name via the additive `with_db_system_name` builder, which records the
`db.system.name` attribute on the already-open span.

QueryLogger stores `Span`, not `EnteredSpan`, so it stays `Send` across
the postgres/mysql `try_stream!` paths that broke transact-rs#3176. A small
`InstrumentedStream` wrapper (using `pin-project-lite`, already present
transitively via the async runtimes) enters the span on each `poll_next`
and drops the guard before returning, so no guard is held across an
await. SQLite's synchronous `Iterator` path just relies on the
close-event in-scope emission.

Design constraints come from abonander's 2026-04-14 review on transact-rs#3313:
current OTel semconv field names, no `tracing::enabled!()` (broken per
tokio-rs/tracing#2448), hardcoded verbosity rather than runtime-
configurable span levels.
Diggsey added a commit to platformed-com/sqlx that referenced this pull request May 21, 2026
QueryLogger now opens a hardcoded INFO-level `db.query` span in `new()`
that closes on drop, with field names following the OTel database span
semantic conventions (`db.system.name`, `db.query.text`,
`db.response.returned_rows`, `db.response.affected_rows`,
`otel.kind = "client"`). The existing close-time event is emitted inside
the span via `Span::in_scope` so callers get both — the span for OTel
correlation, the event for the existing `rows_affected`/`elapsed_secs`
fields under whatever level `LogSettings` configures.

`QueryLogger::new` keeps its old signature; drivers attach the system
name via the additive `with_db_system_name` builder, which records the
`db.system.name` attribute on the already-open span.

QueryLogger stores `Span`, not `EnteredSpan`, so it stays `Send` across
the postgres/mysql `try_stream!` paths that broke transact-rs#3176. A small
`InstrumentedStream` wrapper (using `pin-project-lite`, already present
transitively via the async runtimes) enters the span on each `poll_next`
and drops the guard before returning, so no guard is held across an
await. SQLite's synchronous `Iterator` path just relies on the
close-event in-scope emission.

Design constraints come from abonander's 2026-04-14 review on transact-rs#3313:
current OTel semconv field names, no `tracing::enabled!()` (broken per
tokio-rs/tracing#2448), hardcoded verbosity rather than runtime-
configurable span levels.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants