[Proposal] [Next] Query Logging Improvements #942

abonander · 2020-12-29T19:55:08Z

A lot of people will probably agree with me when I say there's room for improvement with how we currently log queries.

Logging is very noisy because we always print the whole query at INFO. I think at most we should be printing at INFO is the source file and line of the query (which we can do with #[track_caller] and std::panic::Location::caller()), maybe the first couple keywords in the query as context, and the time the query took. We should only be logging the full query (with parameters interpolated) at TRACE.
Logging is expensive because we pretty-print the SQL. I think we shouldn't be doing nearly that much to it. At most, we should be removing excess indentation so the query isn't weirdly left-padded in logs. I think sqlformat can help us there instead of rolling it ourselves. (cc @shssoichiro)
It doesn't make sense for ConnectOptions::{log_statements, log_slow_statements} to take a log::LevelFilter; instead, it should be Option<log::Level>. Then, if someone is using a crate like tracing that includes log but doesn't want to use log directly, they can still disable query logging by passing None (Add a way to disable query logging programmatically #939).
~~(Bug) we don't actually obey the log-level setting for sqlx::query, only the global log setting. This doesn't need to be in next, though; I'll be fixing it shortly.~~ Done: fix(logging): make query logging obey level setting for sqlx::query target #945

The text was updated successfully, but these errors were encountered:

abonander · 2020-12-29T20:32:14Z

I'm thinking we can change how much we log based on the log level set for sqlx::query:

At INFO, we only print a short summary of the query as mentioned in bullet point 1, so something like this:

[INFO sqlx::query] "SELECT foo, ... FROM bar ..." at src/db/my_query.rs:42 executed in 490.32ns

At DEBUG (and the slow-query logging at WARN or otherwise configured level), we print that plus query parameters:

[DEBUG sqlx::query] "SELECT foo, ... FROM bar ..." at src/db/my_query.rs:42 executed in 490.32ns; parameters: ["baz", 1234567, "2020-12-29T12:28:19-08:00"]

At TRACE, we print the whole query verbatim (only excess left-padding removed) with parameters interpolated:

[TRACE sqlx::query] query at src/db/my_query.rs:42 executed in 490.32ns:
SELECT 
    foo, 
    baz, 
    quux
FROM bar
WHERE 
    baz = 'baz' AND quux <> 1234567 AND created_at > `2020-12-29T12:28:19-08:00`
LIMIT 5

Thanks to #[track_caller], we can get source file and line info for both the query macros and functions.

SomeoneToIgnore · 2020-12-30T00:01:26Z

Looks great, thanks for coming with this up.
I have nothing to add to the issue message, but the first comment with the query format proposal looks worth discussing.

On DEBUG level, to avoid accidentally printing a couple of thousands parameters, might be good to limit them with some arbitrary number.
I like the idea to separate the verbosity levels and wonder if we should drag the separation further and split logging level and logging verbosity?

If I've understood the current proposal right, sqlx query verbosity changes depending on the log::Level we set for either.
With this model, would it ever be possible to WARN slow query with parameters in the log message? Maybe somebody would be insterested in logging the whole slow query with some non-TRACE level.

If you think that might make sense, consider keeping the log level just the log level and add some kind of verbosity parameter to the query settings builder, allowing to switch the formats.

==
Come to think of it, this separation might serve as a better compromise with non-log users due to being framework agnostic.

abonander · 2020-12-30T00:35:53Z

With this model, would it ever be possible to WARN slow query with parameters in the log message?

I didn't think of slow query logging when I wrote that out but yeah, it should definitely include the parameters so it's possible to repro the slowdown. So the slow query log at WARN should probably be the same format as DEBUG.

On DEBUG level, to avoid accidentally printing a couple of thousands parameters, might be good to limit them with some arbitrary number.

That's hard cause like, what arbitrary number is acceptable? What about for a slow-query log? Then you'd probably want all the parameters for reproducibility. Perhaps for slow-query logging we should have an option to dump the full query data with the complete SQL and all parameters to a file or some other sink.

shssoichiro · 2020-12-30T00:52:48Z

I like the ideas for INFO and TRACE, but I don't think printing parameters is a good idea. It has the potential for injecting PII into logs.

abonander · 2020-12-30T01:08:30Z

True, but it could come in handy for reproducing slow queries. Instead of changing output for DEBUG/WARN, maybe we have a setter on ConnectOptions?

/// Log query arguments (values substituted into value placeholders, i.e. `$N` / `?`).
///
/// ### Note: Beware of Personally Identifiable Information
/// This will likely expose personally identifiable information (PII), or otherwise infringe on privacy of users 
//// (which may be protected by law), if the logs are submitted to third-party providers like Sentry, Firebase, etc.
///
/// Take care when enabling this option.
pub fn log_arguments(&mut self, log_args: bool) -> &mut Self

As suggested in my previous comment, maybe we don't want to lean directly on log for saving reproducible queries but instead allow directly configuring a sink for query data for processing/storage.

jkoudys · 2020-12-30T03:03:06Z

@abonander big agree on the 2nd point. Personally I strongly dislike that the query is formatted at all -- we're losing valuable information in the logs! The original, unaltered string can have a lot of interesting hints at the cause of a problem that are simply being thrown away irrecoverably. I could always pipe the query through an sql formatter from vim if I really cared about it being pretty - taking that hit at runtime seems pointless. The query formatter it's running now is doing a full tokenization, which is so much more expensive than the remove-the-indent bit you suggest.

SomeoneToIgnore · 2020-12-30T15:10:14Z

So the slow query log at WARN should probably be the same format as DEBUG.

I presume, you have to do the same for Error level too then, keeping in mind that any user can change the log levels for both queries with the code like

let mut connection_options = MySqlConnectOptions::from_str(connection_url)?;
connection_options
    .log_statements(log::LevelFilter::Error)
    .log_slow_statements(log::LevelFilter::Error, Duration::from_secs(1));
let pool = MySqlPoolOptions::new().connect_with(connection_options).await?;

So, putting all formats and queries together, did I get you right:

	regular query format	slow query format
off	—	—
trace	full query, interpolated	full query, interpolated
debug	short summary + parameters	short summary + parameters
info	short summary	short summary + parameters
warn	short summary	short summary + parameters
error	short summary	short summary + parameters

?

For me, it's

a bit odd to see inconsistent query formats on different log levels
just odd to see a library behavior to change due to different log level (configurable via env variables!), I fail to recall any similar cases for libraries. Keeping in mind the performance issues, still looks like a footgun feature in general

Thus said, I still would have preferred something like enum QueryLogFormat { Full, WithParameters(Option<usize>), Summary, None } with the way to set it simiarly to the log::LevelFilter now + some reasonable defaults for both queries.

But the choice is yours after all, I won't insist on this point of view much 🙂

SomeoneToIgnore · 2020-12-30T15:18:51Z

what arbitrary number is acceptable? What about for a slow-query log?

I think you can either make it configurable, or just pick some arbitrary number (100?) and recommend to log the full log for all parameters.

Parameters' value inflate with their quantity drastically: 1-10 params can be valuable, but what about 1843, especially when they are displayed detached from the main query? :)

Then you'd probably want all the parameters for reproducibility. Perhaps for slow-query logging we should have an option to dump the full query data with the complete SQL and all parameters to a file or some other sink.

Great ideas in general, not sure if feasible to implement right away.
Slow sql detection + configurable query format + the pointer to the place where the query originates might be enough to cover the majority of the use cases.

I like the idea to try and be as failsafe as possible for the end user, but not sure it's possible in general to predict everything and log the query in the best possible format, keeping this whole shebang configurable might be just enough.

shssoichiro · 2020-12-30T15:56:02Z

I'm also +1 for making query logging configurable, perhaps as an option when building the sqlx Connection, with some sensible default. Probably we would want two parameters, one to configure the standard log level (defaults to either off or the short summary) and one to configure the slow query log level (defaults to the short summary).

mehcode · 2020-12-30T18:01:12Z

Let's take a step back here and go survey how other large SQL frameworks do this. Rails, Django, etc. There must be some inspiration on what works well to draw from.

Fishrock123 · 2021-03-22T19:30:47Z

It would be nice if tracing events could be emitted alongside the logging events (with a separate loglevel) at some granularity.

We'd prefer this stuff in our honeycomb traces. I turn all the default "info" stuff from SQLx down to "debug" because it's not useful enough in production.

Thomasdezeeuw · 2021-09-03T15:39:50Z

I just wanted to echo the request to change these logs from info to trace. We're actually forced to disable info logging because of SQLx, which normally includes useful stuff as well. Making the logging optional (perhaps by the setting the required LogLevel) and whether or not to log the parameters would also be a big plus for us.

jplatte · 2021-09-03T16:00:11Z

The log level can already be customized (and had its default changed) on master since #1371, so that will be part of the next release.

malobre · 2021-09-11T18:26:39Z

~~It would be nice to have query-logging / SQL pretty-printing feature-gated. sqlformat depends on regex which is quite long to compile.~~

shssoichiro · 2021-09-12T16:51:32Z

I just published a new patch version of sqlformat that removes the regex dependency. Does this help?

malobre · 2021-09-12T18:19:46Z

Yes it does! Thanks a lot.

bin · 2021-10-23T03:39:06Z

I'd suggest adding the ability to entirely disable logging of the actual query and print just a warning message with the line number when a query is slow, as this would be useful for very large queries. I use large batch insert queries that max out the number of placeholders, and formatting these to log causes my inserts to take about thirty times longer in debug builds and around three times longer in release. It chews up quite a lot of CPU in the process compared to the non-logging insertions (which are mainly I/O bound.)

abonander · 2021-10-23T03:49:09Z

Yeah, I think I tend to agree cause I personally don't really pay attention at all to the queries being logged, it just adds noise.

And if you're running in, say, Gcloud, you pay by the byte for log storage so noise isn't just annoying but can be costly too. Learned that the hard way recently.

With #[track_caller] we can report the file and line even when using the query functions and not the macros.

taladar · 2022-01-15T18:24:30Z

And if you're running in, say, Gcloud, you pay by the byte for log storage so noise isn't just annoying but can be costly too. Learned that the hard way recently.

I don't think libraries should tailor their log output to very specific use cases like that when there is the option to just set the log level per module in RUST_LOG. Primarily logging all queries should be something enabled in development and when trying to trace specific bugs only anyway.

MaxFangX · 2022-07-29T19:11:30Z

Just want to express a +1 for the original proposal which makes the INFO log much more concise, and moves the pretty-printed, multi-line log to TRACE.

My company is running a warp webserver which uses SeaORM which uses sqlx, and we log at INFO under normal circumstances. Our problem was that whenever anything happens on our server that triggers a DB hit, sqlx's extremely verbose INFO log ends up dominating about 70% of the vertical space in our log outputs, making it really hard to visually identify the high-level application logs which are really what we're interested in. As I understand it, many people are using sqlx relatively deep in their stack, and usually don't care to see full sqlx queries when everything is working fine - it's only when something is going wrong with the queries that people may want to see the full, verbose output - in which case they will set their log level to DEBUG or TRACE (and in which case seeing the full pretty-printed query is extremely helpful).

Of all the dependencies we have, sqlx is the only one where we had to adjust the log level because it was too verbose. I think this is probably the case for most people, and therefore that by default the INFO logs should print a relatively short, single-line log at most. My company ended up disabling sqlx logging entirely, and only turn it back on whenever we are debugging database queries.

penso · 2022-08-06T13:46:01Z

Agree and same situation than @MaxFangX at my company

arifd · 2022-08-21T23:18:17Z

Beyond a discussion of how to format the logs, I would like to propose that any logging of the happy and typical path be reduced to TRACE. For reasons already stated by other people in this thread, but also because this allows me to optimize away slqx logs entirely at compile time by setting a compile time flag.

INFO is valuable to me in production for other reasons, so I can't disable that level at compile time.

Because of this, even with filtering, every database access unnecessarily incurs an extra atomic load and a boolean equality check in release builds.

MahdiBaghbani · 2022-11-21T12:40:06Z

I recently developed an API with rust (Poem with Seaql which uses sqlx underneath),
I believe sqlx shouldn't emit anything at INFO level, it makes production environments totally unreadable, specially if the query is complex.
it is occurring for every single request on an API endpoint. it is useful while debugging but not otherwise.

shssoichiro · 2022-11-27T04:22:31Z

As the original author of the log implementation, and after having used this for a while in production... I am +1 for moving the normal, happy-path logs to DEBUG level. It does add quite a bit of noise, and the comments I've seen about performance are valid as well.

arifd · 2023-01-27T15:12:21Z

As the original author of the log implementation, and after having used this for a while in production... I am +1 for moving the normal, happy-path logs to DEBUG level. It does add quite a bit of noise, and the comments I've seen about performance are valid as well.

I don't filter DEBUG in production so I won't get any performance benefit from this change. Perhaps could set the level behind a feature flag or compile time env var, so that at least we have the choice for what works best for us?

on a separate note, I have slow query logging enabled, and the output looks like this:

  2023-01-27T14:53:24.137806Z  WARN sqlx::query: UPDATE entries SET processing_status …; rows affected: 18462, rows returned: 0, elapsed: 1.791s

UPDATE
  entries
SET
  processing_status = 'ready_to_process',
  processing_metadata_is_up_to_date = false,
  processing_error = NULL
WHERE
  id = ANY($1)

If the log event emits the SQL anyway, could we use the space where it prints a snippet of the SQL (UPDATE entries SET processing_status …;) to instead tell us the reason behind this WARN event (slow) - this will let me know i need to bring my attention to the elapsed field, and not otherwise be concerned that i can't understand why it's warning me (or anyone on the team)

felipelalli · 2023-02-27T06:10:14Z

As the original author of the log implementation, and after having used this for a while in production... I am +1 for moving the normal, happy-path logs to DEBUG level. It does add quite a bit of noise, and the comments I've seen about performance are valid as well.

I would like to adjust the log levels of full queries to TRACE. Is it possible to make this configurable?

Furthermore, I propose setting slow queries to WARN level instead. It might be useful to have the "slow" interval as a configurable parameter, with a default setting of around 100ms.

CosmicHorrorDev · 2023-02-27T06:17:35Z

I would like to adjust the log levels of full queries to TRACE. Is it possible to make this configurable?

https://docs.rs/sqlx/latest/sqlx/trait.ConnectOptions.html#tymethod.log_statements

Furthermore, I propose setting slow queries to WARN level instead. It might be useful to have the "slow" interval as a configurable parameter, with a default setting of around 100ms.

https://docs.rs/sqlx/latest/sqlx/trait.ConnectOptions.html#tymethod.log_slow_statements

vexx32 · 2023-06-30T22:16:48Z

So currently I am passing a connection string into PgPoolOptions::connect()... I can't find a way to disable statement logging without also having to go through the whole annoyance of splitting out the connection string into pieces in order to supply it to a PgConnectOptions instance. Why is there no way to set a full connection string on PgConnectOptions directly? 😩

abonander · 2023-06-30T22:19:56Z

There is; you can parse it from the string and modify it in-place:

// Using the standard `FromStr` trait and `str::parse()` method
let mut opts: PgConnectOptions = "<your database URL>".parse()?;
opts.log_statements(...);

let pool = pool_options.connect_with(&opts).await?;

vexx32 · 2023-06-30T22:21:13Z

That's what I was missing, thank you so much! 💖

temeddix · 2023-08-20T08:53:54Z

This needs to be fixed, as this pollutes the terminal significantly. Especially on bulk insertion, the console(or the log file) becomes a mess. The content of the whole query should be excluded or be written as TRACE level of logging IMHO.

Below is the Rust code that I'm using:

let mut mysql_connect_option: MySqlConnectOptions = database_address.parse().unwrap();
mysql_connect_option = mysql_connect_option.log_statements(log::LevelFilter::Trace);
let mysql_pool_option = MySqlPoolOptions::new().max_connections(32);
let mysql_pool = mysql_pool_option
    .connect_with(mysql_connect_option)
    .await
    .expect("Unable to connect to the database");
tracing::info!("Connected to database");

This is the console output:

(...goes on and on...)

EDIT: This seems to be the trick:

let mut mysql_connect_option: MySqlConnectOptions = database_address.parse().unwrap();
mysql_connect_option = mysql_connect_option.disable_statement_logging();
let mysql_pool_option = MySqlPoolOptions::new().max_connections(32);
let mysql_pool = mysql_pool_option
    .connect_with(mysql_connect_option)
    .await
    .expect("Unable to connect to the database");
tracing::info!("Connected to the database");

abonander added the proposal label Dec 29, 2020

abonander added this to the 0.5 milestone Dec 29, 2020

abonander mentioned this issue Dec 29, 2020

Add a way to disable query logging programmatically #939

Merged

mehcode removed this from the 0.5 milestone Mar 16, 2021

ThomWright mentioned this issue Jun 6, 2022

Observability improvements #1896

Open

wimax-grapl mentioned this issue Sep 14, 2022

Make our Rust logs a bit more readable by quieting noisy stuff grapl-security/grapl#1984

Merged

CosmicHorrorDev mentioned this issue Oct 26, 2022

Potentially switch from log to tracing #2177

Open

nc7s mentioned this issue Mar 10, 2023

Lower default logging level for statements to Debug #2393

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Proposal] [Next] Query Logging Improvements #942

[Proposal] [Next] Query Logging Improvements #942

abonander commented Dec 29, 2020 •

edited

Loading

abonander commented Dec 29, 2020 •

edited

Loading

SomeoneToIgnore commented Dec 30, 2020 •

edited

Loading

abonander commented Dec 30, 2020 •

edited

Loading

shssoichiro commented Dec 30, 2020

abonander commented Dec 30, 2020 •

edited

Loading

jkoudys commented Dec 30, 2020

SomeoneToIgnore commented Dec 30, 2020

SomeoneToIgnore commented Dec 30, 2020

shssoichiro commented Dec 30, 2020

mehcode commented Dec 30, 2020

Fishrock123 commented Mar 22, 2021

Thomasdezeeuw commented Sep 3, 2021

jplatte commented Sep 3, 2021

malobre commented Sep 11, 2021 •

edited

Loading

shssoichiro commented Sep 12, 2021

malobre commented Sep 12, 2021

bin commented Oct 23, 2021

abonander commented Oct 23, 2021

taladar commented Jan 15, 2022

MaxFangX commented Jul 29, 2022

penso commented Aug 6, 2022

arifd commented Aug 21, 2022

MahdiBaghbani commented Nov 21, 2022 •

edited

Loading

shssoichiro commented Nov 27, 2022

arifd commented Jan 27, 2023 •

edited

Loading

felipelalli commented Feb 27, 2023 •

edited

Loading

CosmicHorrorDev commented Feb 27, 2023

vexx32 commented Jun 30, 2023

abonander commented Jun 30, 2023 •

edited

Loading

vexx32 commented Jun 30, 2023

temeddix commented Aug 20, 2023 •

edited

Loading

[Proposal] [Next] Query Logging Improvements #942

[Proposal] [Next] Query Logging Improvements #942

Comments

abonander commented Dec 29, 2020 • edited Loading

abonander commented Dec 29, 2020 • edited Loading

SomeoneToIgnore commented Dec 30, 2020 • edited Loading

abonander commented Dec 30, 2020 • edited Loading

shssoichiro commented Dec 30, 2020

abonander commented Dec 30, 2020 • edited Loading

jkoudys commented Dec 30, 2020

SomeoneToIgnore commented Dec 30, 2020

SomeoneToIgnore commented Dec 30, 2020

shssoichiro commented Dec 30, 2020

mehcode commented Dec 30, 2020

Fishrock123 commented Mar 22, 2021

Thomasdezeeuw commented Sep 3, 2021

jplatte commented Sep 3, 2021

malobre commented Sep 11, 2021 • edited Loading

shssoichiro commented Sep 12, 2021

malobre commented Sep 12, 2021

bin commented Oct 23, 2021

abonander commented Oct 23, 2021

taladar commented Jan 15, 2022

MaxFangX commented Jul 29, 2022

penso commented Aug 6, 2022

arifd commented Aug 21, 2022

MahdiBaghbani commented Nov 21, 2022 • edited Loading

shssoichiro commented Nov 27, 2022

arifd commented Jan 27, 2023 • edited Loading

felipelalli commented Feb 27, 2023 • edited Loading

CosmicHorrorDev commented Feb 27, 2023

vexx32 commented Jun 30, 2023

abonander commented Jun 30, 2023 • edited Loading

vexx32 commented Jun 30, 2023

temeddix commented Aug 20, 2023 • edited Loading

abonander commented Dec 29, 2020 •

edited

Loading

abonander commented Dec 29, 2020 •

edited

Loading

SomeoneToIgnore commented Dec 30, 2020 •

edited

Loading

abonander commented Dec 30, 2020 •

edited

Loading

abonander commented Dec 30, 2020 •

edited

Loading

malobre commented Sep 11, 2021 •

edited

Loading

MahdiBaghbani commented Nov 21, 2022 •

edited

Loading

arifd commented Jan 27, 2023 •

edited

Loading

felipelalli commented Feb 27, 2023 •

edited

Loading

abonander commented Jun 30, 2023 •

edited

Loading

temeddix commented Aug 20, 2023 •

edited

Loading